Lecture #1; Definitions

I will not be so interested in teaching a catalogue of statistical tests, but prefer to discuss questions like:
Another site, funnelweb.utcc.utk.edu (defunct), turned up something that looks more sensible.
"Statistics is [the theory and method of analyzing quantitative data obtained from samples of observations in order to study and compare sources of variation of phenomena, to help make decisions to accept or reject hypothesized relations between phenomena, and to aid in] making [reliable] inferences from empirical observations" (Kerlinger, 1986, p. 175).
Let's condense that to
making inferences about populations from samples
If the mean height of people in the sample is 2 m, the mean height of people in the population is close to 2 m.
Population of people, of haploid cells, of 100-item samples, of measurements of a person
50 people, 60 haploids, 70 100-person samples, 80 repeated measurements
Robbins example: An experiment has the possible outcomes E_{1}, E_{2}, ... with unknown probabilities p_{1}, p_{2}, ... . In n independent trials suppose that E_{i} occurs x_{i} times. How can we "estimate" u, the total probability of unobserved outcomes? (The quotation marks appear because u is not a parameter in the usual statistical sense.)
Comment (and homework): What does Robbins' parenthetical statement mean?
Answer – Perform an (n+1)^{st} trial. Count the outcomes that occurred exactly once in the n+1 trials and divide that count by n+1.
The total probability (in the population) of the outcomes unobserved in the n-sample has the same expected value as the proportion of once-observed outcomes in the (n+1)-sample.
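A small simulation can make the Robbins answer concrete. The sketch below is only an illustration under assumed inputs: the outcome probabilities, the number of trials, and the function name `robbins_demo` are all hypothetical choices, not part of the original example. It averages, over many repetitions, the true unobserved probability u after n trials and the proportion of once-observed outcomes after n+1 trials, so the two averages can be compared.

```python
import random
from collections import Counter


def robbins_demo(probs, n, reps=2000, seed=0):
    """Compare the unobserved mass u (after n trials) with the
    singleton proportion (after n+1 trials), averaged over reps runs.

    probs, n, reps and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    outcomes = list(range(len(probs)))
    total_u = 0.0
    total_v = 0.0
    for _ in range(reps):
        # Draw n+1 independent trials; the first n form the n-sample.
        draws = rng.choices(outcomes, weights=probs, k=n + 1)
        seen = set(draws[:n])
        # u: total probability of outcomes not seen in the n-sample.
        u = sum(p for i, p in enumerate(probs) if i not in seen)
        # v: outcomes seen exactly once in the (n+1)-sample, over n+1.
        counts = Counter(draws)
        singletons = sum(1 for c in counts.values() if c == 1)
        v = singletons / (n + 1)
        total_u += u
        total_v += v
    return total_u / reps, total_v / reps


if __name__ == "__main__":
    # Hypothetical population: 50 outcomes with weights proportional to 1/(i+1).
    weights = [1 / (i + 1) for i in range(50)]
    probs = [w / sum(weights) for w in weights]
    mean_u, mean_v = robbins_demo(probs, n=20)
    print(f"mean unobserved probability u:      {mean_u:.3f}")
    print(f"mean once-observed proportion (n+1): {mean_v:.3f}")
```

The two averages should agree up to simulation noise, since both have expectation equal to the sum over i of p_{i}(1 - p_{i})^{n}.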
Comment: a declarative sentence!
Hypothesis: The universe is half male, half female.
Sample: 10000 individuals, of whom 5100 are female.
Test statistic: chi^{2} = 4
p-value = 0.046. (Two-tailed test)
Comment: if 5200 female, p = 0.00006. If 60/100 female, p = 0.046
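The arithmetic behind this example can be recomputed in a few lines. The sketch below assumes the usual one-degree-of-freedom chi-square test of a 50/50 split; the helper name `chi2_equal_split` is hypothetical. For one degree of freedom, the upper tail probability of chi^{2} equals erfc(sqrt(chi^{2}/2)), so only the standard library is needed.

```python
import math


def chi2_equal_split(n_female, n_total):
    """Chi-square statistic and p-value for H0: half female, half male.

    Uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)), valid for
    one degree of freedom.
    """
    expected = n_total / 2
    n_male = n_total - n_female
    chi2 = ((n_female - expected) ** 2 + (n_male - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # The three cases mentioned in the notes.
    for n_female, n_total in [(5100, 10000), (5200, 10000), (60, 100)]:
        chi2, p = chi2_equal_split(n_female, n_total)
        print(f"{n_female}/{n_total} female: chi^2 = {chi2:.1f}, p = {p:.2g}")
```

Note that 5100 of 10000 and 60 of 100 give the same test statistic, even though the observed proportions (51% and 60%) are quite different.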
Discussion: accept/reject paradigm
Example: DNA forensics analysts are happy if the population is in Hardy-Weinberg equilibrium (HWE). A test statistic is calculated on a population sample, and converted to a p-value. If the p-value is small, e.g. < 0.05, that tends to indicate that the population may not be in HWE.
An analyst proudly testified that out of a large number of such population studies, in only 1% was p < 0.05. What's wrong with that?
I said that there must be publication bias. He said, no, the lack of low p-values was perhaps due to the samples being rather small.
What's wrong with that?
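One way to probe both questions is to simulate many population studies in which Hardy-Weinberg equilibrium really holds and see how often p < 0.05 turns up. The sketch below is an illustration under stated assumptions, not the analyst's actual procedure: it assumes a single locus with two alleles, an allele frequency of 0.5, a one-degree-of-freedom chi-square test, and hypothetical function names. If the test behaves as intended, roughly 5% of studies should show p < 0.05 whether the samples are small or large.

```python
import math
import random


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) p-value for Hardy-Weinberg proportions,
    with the allele frequency estimated from the sample."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    if min(expected) == 0:
        return 1.0  # only one allele observed; nothing to test
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(chi2 / 2))


def fraction_significant(sample_size, allele_freq=0.5, studies=2000,
                         alpha=0.05, seed=1):
    """Fraction of simulated studies with p < alpha when HWE is true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        counts = [0, 0, 0]  # AA, AB, BB
        for _ in range(sample_size):
            # Each individual gets two independent alleles, which is HWE.
            a_copies = (rng.random() < allele_freq) + (rng.random() < allele_freq)
            counts[2 - a_copies] += 1
        if hwe_p_value(*counts) < alpha:
            hits += 1
    return hits / studies


if __name__ == "__main__":
    for size in (50, 500):
        frac = fraction_significant(size)
        print(f"sample size {size}: fraction with p < 0.05 = {frac:.3f}")
```

Smaller samples reduce the power to detect a real departure from HWE, but under the null hypothesis the p-value is approximately uniform at any sample size, so small samples alone would not be expected to push the rate of low p-values down to 1%.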
We must remember, that the probability of an event is not a property of the event itself, but a mere name for the degree of ground which we, or someone else, have for expecting it. ... Every event is in itself certain, not probable: if we knew all, we should either know positively that it will happen, or positively that it will not. But its probability to us means the degree of expectation of its occurrence, which we are warranted in entertaining by our present evidence. — J.S. Mill 