Stat 204: Sampling

Once class starts, most new information will be posted on Canvas.

Overview

This course is about sampling methods. Where does data come from? Often it is a byproduct of some other process, such as a billing system that logs transactions. Every once in a while it comes from a purposeful experiment. (Hurray for Stat 263/363!) The other source is sampling. Somebody sets out to purposely gather the needed data. They sample from some population of people or events or items. There is then a tradeoff between accuracy and cost, and clever tactics can help you get the best outcome.

The classic use case for sampling is surveying for opinions. That subject is in a severe crisis right now. Response rates have fallen below 10%. There is now a lot of interest in methods to mitigate non-response bias, including weighting of survey data, and even moving away from probability sampling.

Sampling happens in lots of other places:


Topics

  1. Simple random sampling
  2. Stratified sampling
  3. Cluster sampling
  4. Ratio and regression estimators
  5. Capture-recapture
  6. Adaptive sampling
  7. Small area estimation
  8. Sampling program input spaces

Goals

  1. Learn basic strategies and their consequences for sampling from finite populations.
  2. Learn about methods on the frontier of sampling; it goes way beyond surveys.

Classes

Monday and Wednesday 1:30 to 2:50
Green Earth Sciences Room 150

Instructor

Art Owen
Sequoia Hall 130
username: owen on stanford.edu
Office: Tuesday 11:00

TAs


Notes and texts

The class text is "Sampling: Design and Analysis", third edition, by Sharon L. Lohr.
I will add some enrichment topics.

Midterm

There will be a midterm in class on May 3. This midterm is closed book and closed notes. Just you and your blue book and a pen or pencil. The midterm counts for 30% of the grade. Homework counts for the other 70%. There is no final.

Problems

Problem sets will be posted on Canvas.
Some R data sets
Students are expected to use R to do the problem sets.

Be sure that Canvas has a working email address:
I expect to send a small number of important emails about problem sets and the homework via Canvas. Most other announcements will be made in class. If you email me about the class, be sure to have stat 204 in your subject line. Otherwise, your email won't show up when I search for course-related emails.
Late penalties apply:
We will count days late on each problem set. Each day late is penalized by 10% of the homework value. Homework more than 3 days late will ordinarily get 0. If you're travelling, you can email a pdf file. For sickness, interviews and other events, up to 3 late days total are forgiven at the end of the quarter. (Work late enough to get zero does not get redeemed though.)
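As a rough illustration of the arithmetic above, here is a minimal sketch of the per-homework penalty before any forgiveness is applied. It is written in Python only for illustration. The function names are hypothetical, and it assumes that "homework value" means the maximum possible score on that problem set and that days late are counted as whole days. It does not model the end-of-quarter forgiveness of late days.

```python
def late_penalty_fraction(days_late: int) -> float:
    """Fraction of the homework's value lost for lateness, before forgiveness."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 3:
        # More than 3 days late ordinarily gets 0.
        return 1.0
    # Each day late costs 10% of the homework value.
    return days_late / 10


def penalized_score(raw_score: float, max_score: float, days_late: int) -> float:
    """Score after the late penalty (assumes the penalty is taken from max_score)."""
    penalty = late_penalty_fraction(days_late) * max_score
    # A score cannot go below zero.
    return max(0.0, raw_score - penalty)
```

Under these assumptions, a problem set worth 100 points that earns 90 and arrives 2 days late would lose 20 points to the penalty.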