# Forensic mathematics glossary

Not really a glossary; just a collection of miscellaneous words, not even including the most important ones (just use a search for those). In fact the intent isn't even definitions, often just comments or links.

baseline prior probability

consanguineous mating — see incest

disaster identification

exclude, exclusion
frequently misused word. Consider "We excluded the man of paternity because of three exclusions." See What's wrong with the "exclusion probability" for a start.
The concept, while appealing, is dubious as well. Don't get me started.

frequency spectrum
The probability distribution of (allele or haplotype) frequencies that occur. For example, for YFiler haplotypes the spectrum shows a very high probability of haplotype frequencies between 0.0001 and 0.0002 (many haplotypes have population frequencies in this range) and a much lower probability of haplotype frequencies between 0.0005 and 0.0006.

Theory or data may suggest an expected frequency spectrum, which can then be regarded as prior probabilities for the frequencies of allelic types or haplotypes. Such a prior can be used, along with Bayes' theorem and a sample reference database of allelic types, to infer allele probabilities.
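As a rough illustration of that last step, here is a minimal sketch that treats the spectrum as a prior on a small grid of candidate frequencies and updates it with a database count. The grid, the prior weights, the binomial sampling model and the function name `posterior_mean_frequency` are all assumptions made up for this example; they are not taken from any real spectrum or from any particular software.

```python
def posterior_mean_frequency(grid, prior_weights, count, sample_size):
    """Posterior mean of a type's frequency over a discrete grid.

    grid          -- candidate population frequencies (hypothetical values)
    prior_weights -- prior probability of each candidate (the assumed spectrum)
    count         -- times the type was seen in the reference database
    sample_size   -- number of entries in the reference database
    """
    # Binomial likelihood, omitting the constant factor that cancels
    # when the posterior is normalized.
    likelihoods = [f ** count * (1 - f) ** (sample_size - count) for f in grid]
    unnormalized = [w * l for w, l in zip(prior_weights, likelihoods)]
    total = sum(unnormalized)
    return sum(f * u for f, u in zip(grid, unnormalized)) / total


# Made-up spectrum in which rare frequencies are a priori far more probable.
grid = [0.0001, 0.0005, 0.001, 0.005, 0.01]
prior = [0.60, 0.25, 0.10, 0.04, 0.01]

# A type seen once in a database of 1000.
estimate = posterior_mean_frequency(grid, prior, count=1, sample_size=1000)
naive = 1 / 1000  # the plain count proportion, for comparison
print(estimate, naive)
```

With a prior weighted toward rare types, as assumed here, the inferred probability is pulled below the plain count proportion. A different assumed spectrum would pull it differently.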

Brenner's Law is a statement about the frequency spectrum for forensic STR loci.

fallacy
The terms prosecutor's fallacy and defense fallacy were, I think, invented by UC Irvine law Prof. William Thompson.

incest
Ebenezer, the father of Judy, is alleged to be the father of Judy's child. What modification to the normal paternity calculation is appropriate?
Answer — None, normally. So long as the testing uses unlinked co-dominant markers (like standard forensic STRs) and both the mother and the alleged father are tested (normal trio paternity), the fact of the adults' relationship is irrelevant.

mass identification; mass disaster identification
a DNA•VIEW speciality.

likelihood ratio (LR)
The central concept of Forensic Mathematics. The way to quantify forensic — or any — evidence. Any other method is either equivalent to a likelihood ratio or is nonsense.

The framework we consider is trying to judge between two possible hypotheses. A typical pair of hypotheses would be:

Evidence, i.e. information or data such as DNA profiles, may be better explained by one hypothesis than by the other.
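In symbols, the usual way to write this is as follows. The labels E for the evidence and H_1, H_2 for the two hypotheses are introduced here for illustration only.

```latex
\[
  \mathrm{LR} \;=\; \frac{P(E \mid H_1)}{P(E \mid H_2)},
  \qquad
  \underbrace{\frac{P(H_1 \mid E)}{P(H_2 \mid E)}}_{\text{posterior odds}}
  \;=\;
  \mathrm{LR} \times
  \underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}}
\]
```

An LR greater than 1 means the evidence is better explained by H_1, and an LR less than 1 means it is better explained by H_2.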

paternity index
a quaint synonym for likelihood ratio used in the context of disputed paternity

probability
JS Mill explained it well.
• vs "frequency"
  Allele probability and allele frequency are two different things. Usually people say allele "frequency" when they mean allele probability. See Why the quotes on "frequency".
• prior probability
  A probability summarizing the value of the evidence prior to inclusion of (for our purposes) the DNA (i.e. scientific) evidence. The prior probability is therefore — at least in principle — a subjective assessment of anecdotal and other evidence that cannot be comfortably quantified. It's worth distinguishing several particular situations:
1. criminal case
   The prior probability is the evidential value of evidence (such as testimony, documents, demeanor) that the DNA analyst doesn't even hear, and which in any case is the responsibility of the judge or jury to assess. Therefore it is clearly wrong for an expert witness to intrude on the prerogative of the court by making any prior probability assumption. I don't see anything wrong though with advising the court about how mathematics works, such as by a picture or a chart of examples.
2. (civil) paternity case
   In principle the above applies to any court action. But this essay tries to take a realistic view.
3. disaster identification
   Links: WTC and tsunami identifications
   When the object of the identification is humanitarian, decision making is typically deferred, in practice, to the scientists. In that case, two useful concepts are:
1. baseline prior probability
   If n people are equally missing and all bodies are indistinguishable from one another, then the prior probability for any particular identity is 1/n. This baseline prior can be a reasonable starting point in realistic scenarios as well.
2. requisite prior
   Suppose a DNA likelihood ratio has been determined, and a posterior probability threshold for identification has been agreed. Then I define the requisite prior as that prior which is just sufficient for declaring identity.

Example: Suppose n=1000 missing, LR=80000 supporting corpse V to be missing person Jim Jones, and that the agreed policy is to declare identification when the probability is at least 99.9%. Using the baseline prior probability of 1/1000, the posterior probability is only 98.8%, so the threshold is not achieved. A prior of about 1/81, twelvefold larger, is the requisite prior to obtain 99.9%.

Maybe there is some quite useful non-DNA evidence supporting the ID: for example, body V has a surgical scar much like one Jim Jones is known to have had, and the stature is about right as well. It would be hard to estimate the exact evidential value of those coincidences, but we don't have to. It is sufficient to judge that they are worth at least LR=12, the shortfall from the DNA evidence.
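The arithmetic of this example can be checked with a short sketch. The function names `posterior_probability` and `requisite_prior` are invented here for illustration; the calculation is just the odds form of Bayes' theorem applied to the numbers above.

```python
from fractions import Fraction


def posterior_probability(prior, lr):
    """Posterior probability from a prior probability and a likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1 + posterior_odds)


def requisite_prior(lr, threshold):
    """Prior probability that just reaches `threshold` given likelihood ratio `lr`."""
    required_prior_odds = (threshold / (1 - threshold)) / lr
    return required_prior_odds / (1 + required_prior_odds)


n = 1000                          # number of missing people
lr = 80000                        # DNA likelihood ratio
threshold = Fraction(999, 1000)   # agreed posterior threshold, 99.9%

baseline = Fraction(1, n)         # baseline prior probability 1/n
post = posterior_probability(baseline, lr)
req = requisite_prior(lr, threshold)

# Extra likelihood ratio the non-DNA evidence must supply: the ratio of
# requisite prior odds to baseline prior odds.
shortfall = (req / (1 - req)) / (baseline / (1 - baseline))

print(f"posterior with baseline prior: {float(post):.4f}")  # 0.9877
print(f"requisite prior: about 1/{float(1 / req):.1f}")     # about 1/81.1
print(f"extra LR needed: {float(shortfall):.2f}")           # 12.48
```

Worked exactly, the shortfall is a little under 12.5, which agrees with the round "twelvefold" figure in the text.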

unrelated
Everyone is related, so why do we say for example "The suspect is either the father, or is unrelated to the child" to describe the alternative hypotheses in a paternity question? Two answers:
1. It's hard to find an accurate wording that is simple;
2. The fiction of literally being unrelated is sometimes a premise of our computational model. (The paternity formula 1/2q rests on a premise of unrelatedness.)
All that said, when I or anyone says "unrelated" there is a good chance that what we really mean is "randomly selected, with no more and no less than a random chance to have any particular close or distant relationship".