Introduction: Discussion of syllabus, course website, hypothetical data examples
Scales of measurement (nominal, ordinal, interval)
Random variable (r.v.): a variable whose value is a numerical outcome of a random phenomenon
Discrete random variable: a random variable that only assumes a finite (or countably infinite) number of values
Probability distribution: the set of values that a r.v. can assume, along with the associated probabilities
Cumulative distribution function (CDF): F_X(x) = P(X <= x)
Binomial distribution: applies to a process with a) only two possible outcomes per trial, b) constant success probability, c) independent trials
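As a small illustration of the probability distribution and CDF definitions above, the following sketch computes the binomial probabilities and F_X(x) directly from the definition. The function names and the values n = 10, p = 0.5 are hypothetical and chosen only for this example.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(x: int, n: int, p: float) -> float:
    """F_X(x) = P(X <= x), obtained by summing the pmf over 0..x."""
    return sum(binom_pmf(k, n, p) for k in range(0, min(x, n) + 1))


# Hypothetical example: 10 independent trials, success probability 0.5.
n, p = 10, 0.5
pmf_total = sum(binom_pmf(k, n, p) for k in range(n + 1))  # should be 1
cdf_at_5 = binom_cdf(5, n, p)                              # P(X <= 5)
print(pmf_total, cdf_at_5)
```

The pmf values over 0..n make up the probability distribution of X; the CDF is their running sum.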
Continuous random variable: a random variable that can assume a continuous range of values
Probability density function; cumulative distribution function
Normal distribution, standard normal distribution, z_p percentile values
Central limit theorem: For large n, the sampling distribution of the sample mean is approximately normal with mean mu and std. dev. sigma/sqrt(n)
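The central limit theorem can be checked by simulation. The sketch below assumes an Exponential(1) population (so mu = 1 and sigma = 1); the population, the sample size n = 50, the number of replications, and the seed are all arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import fmean, stdev


def simulate_sample_means(n, reps, rng):
    """Return `reps` sample means, each computed from n Exponential(1) draws."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


rng = random.Random(0)   # fixed seed so the run is reproducible
n, reps = 50, 4000       # hypothetical sample size and replication count
means = simulate_sample_means(n, reps, rng)

mu, sigma = 1.0, 1.0     # mean and std. dev. of the Exponential(1) population
print("mean of sample means:", fmean(means), "theory:", mu)
print("std dev of sample means:", stdev(means), "theory:", sigma / sqrt(n))
```

Even though the exponential population is strongly skewed, the simulated means should center near mu with spread near sigma/sqrt(n).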
Location-scale distributions: f(x) = (1/b) h((x - a)/b)
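A minimal sketch of the location-scale construction, assuming h is a standard density, a is the location, and b > 0 is the scale. The helper names are hypothetical. With h taken as the standard normal density, the result should match the N(a, b^2) density, which is checked against the standard library's NormalDist.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def std_normal_pdf(z: float) -> float:
    """Standard normal density, used as the base density h."""
    return exp(-z * z / 2) / sqrt(2 * pi)


def std_cauchy_pdf(z: float) -> float:
    """Standard Cauchy density, another possible choice of h."""
    return 1 / (pi * (1 + z * z))


def location_scale_pdf(h, a: float, b: float):
    """Return f with f(x) = (1/b) * h((x - a) / b), for scale b > 0."""
    if b <= 0:
        raise ValueError("scale b must be positive")
    return lambda x: h((x - a) / b) / b


f_normal = location_scale_pdf(std_normal_pdf, a=2.0, b=3.0)
f_cauchy = location_scale_pdf(std_cauchy_pdf, a=2.0, b=3.0)

# The normal case should agree with the N(2, 3^2) density.
print(f_normal(1.3), NormalDist(2.0, 3.0).pdf(1.3))
# The Cauchy case peaks at the location a with height 1/(pi*b).
print(f_cauchy(2.0), 1 / (pi * 3.0))
```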
Other continuous distributions: Uniform, exponential, double exponential (Laplace), Cauchy (note book misprint)
Characteristics of a distribution: skewness, tail weight (kurtosis)
Population; sample; parameter; statistic; point estimate
Interval estimate: Example for normal data
Interpretation of an interval estimate
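One way to make the interpretation concrete is to simulate repeated sampling and count how often the interval covers the true mean. The sketch below assumes normal data with known sigma (a z interval), which may differ from the example used in class; the parameter values and function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(data, sigma, conf=0.95):
    """Interval for the mean of normal data, assuming sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    xbar = fmean(data)
    half_width = z * sigma / sqrt(len(data))
    return xbar - half_width, xbar + half_width


def coverage(mu, sigma, n, reps, rng, conf=0.95):
    """Fraction of simulated samples whose interval contains the true mu."""
    hits = 0
    for _ in range(reps):
        data = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = z_interval(data, sigma, conf)
        hits += lo <= mu <= hi
    return hits / reps


rng = random.Random(1)  # fixed seed for reproducibility
cov = coverage(mu=10.0, sigma=2.0, n=25, reps=2000, rng=rng)
print("empirical coverage:", cov)
```

The confidence level describes the procedure: over many samples, about 95% of the intervals constructed this way contain mu. Any single computed interval either contains mu or does not.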
Hypothesis tests; null hypothesis; alternative hypothesis; test statistic
Significance level = P(reject H0 when it is true); Power = P(reject H0 when it is false)
Example for normal data
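The significance level and power definitions can be worked out in closed form for a simple case. This sketch assumes a one-sided z test of H0: mu = mu0 against H1: mu > mu0 with known sigma, which is an illustrative choice and not necessarily the class example; all numerical values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu, mu0, sigma, n, alpha=0.05):
    """P(reject H0) for a one-sided z test when the true mean is mu.

    H0 is rejected when sqrt(n) * (xbar - mu0) / sigma exceeds the
    upper-alpha standard normal critical value.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    shift = (mu - mu0) * sqrt(n) / sigma
    return 1 - NormalDist().cdf(z_crit - shift)


mu0, sigma, n = 100.0, 15.0, 25  # hypothetical values
print("rejection prob. under H0 (significance level):", z_test_power(mu0, mu0, sigma, n))
for mu in (102.0, 105.0, 110.0):
    print("power at mu =", mu, ":", z_test_power(mu, mu0, sigma, n))
```

At mu = mu0 the rejection probability equals the significance level; as the true mean moves further into the alternative, the power increases toward 1.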
Parametric vs. nonparametric methods
i) Binomial
ii) Permutation methods: Under H0, all permutations of the observations between groups are equally likely. Generally used along with ranks or scores.
iii) Bootstrap methods: Mimic sampling from the population by taking samples with replacement from the data.
iv) Smoothing methods: Using local averaging, for example.
v) Non-least squares methods: L1 estimation, M estimation, etc.
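Items ii) and iii) in the list above can be sketched in a few lines. The permutation test below uses the difference in group means on the raw observations and random shuffles instead of full enumeration; both are simplifications (ranks or scores could be substituted for the raw values). The bootstrap function estimates a standard error by resampling with replacement. The data and function names are hypothetical.

```python
import random
from statistics import fmean, stdev


def permutation_test(x, y, reps, rng):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    observed = fmean(x) - fmean(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # under H0 every regrouping is equally likely
        diff = fmean(pooled[:n_x]) - fmean(pooled[n_x:])
        if abs(diff) >= abs(observed) - 1e-12:  # tolerance for float ties
            extreme += 1
    return (extreme + 1) / (reps + 1)  # count the observed arrangement too


def bootstrap_se(data, stat, reps, rng):
    """Bootstrap standard error of stat(data), resampling with replacement."""
    n = len(data)
    estimates = [stat(rng.choices(data, k=n)) for _ in range(reps)]
    return stdev(estimates)


# Hypothetical two-group data.
x = [12.1, 14.3, 13.8, 15.2, 14.9]
y = [9.8, 10.4, 11.1, 10.9, 9.5]
rng = random.Random(2)  # fixed seed for reproducibility
p_value = permutation_test(x, y, reps=2000, rng=rng)
se_mean_x = bootstrap_se(x, fmean, reps=2000, rng=rng)
print("permutation p-value:", p_value)
print("bootstrap SE of mean(x):", se_mean_x)
```

Neither procedure assumes a parametric form for the population: the permutation test relies only on exchangeability under H0, and the bootstrap treats the observed data as a stand-in for the population.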