For Stanford people: the canvas page will have the HWs and more references.

Correlation is not causality. You've probably heard that before in any number of regression classes. If you want to infer causality from data, then the best way is to use randomized experiments. Maybe it is the only way to be sure. This course anchors experimental design within causal inference. After a few lectures you will see that it is different from most causal inference courses. This one is about what to do when you can randomize. Most causal inference courses emphasize what to do with observational data where you could not randomize, though they will also mention randomization.

In experimental design we look at how to choose the data that we will gather. In addition to being able to make causal conclusions, we also look at how to maximize the statistical efficiency of the generated data set.
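The contrast between randomized and observational data can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the data-generating model, the coefficients, and the names `simulate` and `diff_in_means` are all hypothetical, and Python is used purely for illustration (the course itself uses R). A hidden covariate `u` raises the outcome. In the "observational" arm it also makes treatment more likely, while in the "randomized" arm treatment is a fair coin flip.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=1.0, seed=0):
    """Return (observational estimate, randomized estimate) of the effect."""
    rng = random.Random(seed)
    # Hidden covariate that raises the outcome (assumed model, not real data).
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Observational: units with larger u are more likely to be treated.
    obs_treat = [1 if ui + rng.gauss(0.0, 1.0) > 0 else 0 for ui in u]
    # Randomized: a fair coin flip that ignores u entirely.
    rand_treat = [1 if rng.random() < 0.5 else 0 for _ in range(n)]

    def diff_in_means(treat):
        # Outcome = treatment effect + effect of hidden covariate + noise.
        y = [true_effect * t + 2.0 * ui + e
             for t, ui, e in zip(treat, u, noise)]
        treated = [yi for yi, t in zip(y, treat) if t == 1]
        control = [yi for yi, t in zip(y, treat) if t == 0]
        return mean(treated) - mean(control)

    return diff_in_means(obs_treat), diff_in_means(rand_treat)


if __name__ == "__main__":
    obs_est, rand_est = simulate()
    print(f"observational estimate: {obs_est:.2f}")
    print(f"randomized estimate:    {rand_est:.2f}")
```

In this toy model the randomized difference in means should land near the true effect of 1.0, while the observational one is pushed well above it because treated units tend to have larger `u`.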

Experimental design as a subject is about 100 years old. The methods in this course date back to agricultural field trials. Since then the ideas have seen use in medicine, manufacturing, quality control, computer aided design and electronic commerce. Each new field takes the previous methods and then starts adapting them. Possibly the first clinical trial was that of James Lind in 1747 showing that citrus is effective against scurvy. (It was not immediately adopted and maybe even forgotten for a while.)

There will be some problem sets, a midterm on Wednesday October 27 and a project. The project will involve designing, carrying out and analyzing a real experiment. This can be from your everyday life: cooking, hobbies, exercise routines, etc. There are ordinarily about 4-6 problem sets.

This Mark Rober video (might serve an ad) describes an experiment to study which animals (snake vs turtle vs tarantula) are more likely to be run over by vehicles. The results are interesting. It is also funny.

- Learn the main/classical methods of experimental design so that when it comes time to gather data you can work out the right choice.
- See some of the research frontier in DoE: A/B testing, computer experiments, design for high dimensional regression.
- Do a designed statistical experiment from conception to execution to analysis.

The reason for R is to enable everybody to use some packages instead of coding by hand. Also, last time most people used R.

- One hundred percent comfort with basic probability (e.g., stat 116): you could explain it to your friends if they're stuck.
- Knowledge of linear regression, t tests and ANOVA: how and why they work, how to do them, what p values and confidence intervals really are.
- Programming skill, including R as a second language.

Missing the prerequisites does not necessarily mean that you'd have trouble passing the course. It is more that you would not enjoy it or get out of it everything that you should. Some things would go over your head and you could feel lost.

See page 2 of the course announcement. I'm expecting and hoping for two guest lectures to displace two of the post-midterm topics.

Here is the full set of notes from last year. The chapters are also given below with this year's lecture dates.

- Sep 20. Introduction. History of design. Potential outcomes.
- Sep 22. A/B testing. Applications to web companies.
- Sep 27. Bandits. Especially Thompson sampling.
- Sep 29. Pairing and blocking. Prior Stat 305A ANOVA notes. One way analysis.
- Oct 04. ANOVA. Prior MC notes ANOVA. Includes functional ANOVA.
- Oct 06. \(2^k\) factorials. Motivations and notation.
- Oct 11. \(2^{k-p}_R\) fractional factorials. Aliasing and data analysis.
- Oct 13. ANCOVA and crossovers. Before after comparisons.
- Oct 18. Split-plots and nesting. Also cluster randomized trials.
- Oct 20. Taguchi methods. Robust design.
- Oct 25. Catchup review. And DOE analyses.
- Oct 27. No class. Midterm.
- Nov 01. Response surfaces. And optimal design.
- Nov 03. Supersaturated designs. Hadamard and random balance.
- Nov 08,10. Computer experiments. Design and analysis.
- Nov 15,17,29. Guest lectures and hybrids. Networks | Complex clinical trials | Partially randomized data.
- Dec 01. Overview. Final comments. Optional student presentations.
- Dec 09. Project due date. There's no exam, this would have been our exam date.

200-030 (History corner) Mon & Wed 1:30 to 3:00

Lectures at PhD level, homework at MS level.

3 units and letter grade or CR/NC.

- Art Owen
- Sequoia Hall 130
- My userid is owen at the address stanford.edu
- Office hour: Tues 11:00 - 11:59, Sequoia Hall 130

- Dan Kluger
**Friday 1:30-3:30**, Sequoia Hall 207 (Bowker room)
- Kangjie Zhou
**Thursday 9:30-11:30**, on Zoom (link sent in canvas message)

Some links below. More may be in canvas.

HW 50%. Midterm 25%. Final project 25%.

- Stefan Wager's causal inference notes (especially lecture 1) and Paul Holland's article on Neyman-Rubin causality.
- Michael Nielsen's explanation of Pearl's do calculus.
- The Lanarkshire milk experiment. Less nourished children were more likely to be in the treatment groups. (And other issues.) Student gives a careful analysis.
- Kohavi, Tang and Xu's Trustworthy online controlled experiments book. May also be online in SU library.
- Kohavi, Henne and Sommerfield's article on A/B tests for e-commerce.
- Johari, Koomen, Pekelis and Walsh's always valid p values
- Agrawal and Goyal's article on Thompson sampling
- Bubeck and Cesa-Bianchi's monograph on bandit problems.
- Wikipedia entry on Hadamard matrices for Plackett-Burman designs
- S. Georgiou's review of supersaturated designs.
- Krahmer and Ward's review of design for compressed sensing
- Dean Eckles' thesis including a network experiment using Facebook data

I expect to send a small number of important emails about problem sets and the homework there. Most other announcements will be made in class. If you email me about the class, be sure to have stat 363 or stat 263 in your subject line. Otherwise, your email won't show when I search for course related emails.

We will count days late on each problem set. Each day late is penalized by 10% of the homework value. Homework more than 3 days late will ordinarily get 0. Upload to gradescope within canvas. For sickness, interviews and other events, up to 3 late days total are forgiven at the end of the quarter. (Work late enough to get zero does not get redeemed though.)
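The per-assignment arithmetic above can be sketched as follows. This is an illustration only, not the grading code: it assumes "homework value" means the maximum points for that problem set and that lateness is counted in whole days. The function name `late_adjusted_score` is hypothetical. It does not model the end-of-quarter forgiveness of up to 3 late days.

```python
def late_adjusted_score(raw_score, max_score, days_late):
    """Score on one problem set after the late penalty, before any forgiveness.

    Assumes the penalty is 10% of max_score per whole day late, and that
    work more than 3 days late scores 0.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 3:
        return 0.0
    penalty = max_score * days_late * 10 / 100  # 10% of the value per day
    return max(0.0, raw_score - penalty)


if __name__ == "__main__":
    for days in range(5):
        print(days, late_adjusted_score(90, 100, days))
```

For example, under these assumptions a problem set worth 100 points that earns 90 and arrives 2 days late would count as 70.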