Forensic mathematics glossary
Not really a glossary; just a collection of miscellaneous words,
not even including the most important ones (just use a search for those).
In fact the intent isn't even definitions, often just comments or links.
- baseline prior probability
- consanguineous mating (see incest)
- disaster identification
- exclude, exclusion
- frequently misused word. Consider "We excluded the man of paternity because of three exclusions."
See What's wrong with the "exclusion probability" for a start.
- The concept, while appealing, is dubious as well. Don't get me started.
- frequency spectrum
- The probability distribution of (allele or haplotype) frequencies that occur.
For example, for YFiler haplotypes
the spectrum shows a very high probability of haplotype frequencies between 0.0001 and 0.0002
(many haplotypes have population frequencies in this range), and a much lower probability of
haplotype frequencies between 0.0005 and 0.0006.
Theory or data may suggest an expected frequency spectrum, which can then be regarded as
prior probabilities for the frequencies of allelic types or haplotypes. Such a prior can be used
along with Bayes' theorem and a sample reference database of allelic types,
to infer allele probabilities.
Brenner's Law is a statement about the frequency spectrum for
forensic STR loci.
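The inference described above (spectrum as prior, reference database as data, Bayes' theorem to combine them) can be sketched roughly as follows. This is only an illustration under stated assumptions: the spectrum is reduced to a few discrete candidate frequencies, the numbers in it are invented, the database count is treated as binomial, and the function name `posterior_mean_frequency` is hypothetical. It is not a description of how any particular program does it.

```python
from math import comb

def posterior_mean_frequency(spectrum, count, sample_size):
    """Posterior mean frequency of one type, given a discrete prior spectrum.

    spectrum    -- list of (frequency, prior_probability) pairs (illustrative values)
    count       -- times the type was observed in the reference database
    sample_size -- number of profiles in the reference database
    """
    # Binomial likelihood of the observed count at each candidate frequency,
    # weighted by the prior probability the spectrum gives that frequency.
    weights = [
        prior * comb(sample_size, count) * f**count * (1 - f)**(sample_size - count)
        for f, prior in spectrum
    ]
    total = sum(weights)
    # Bayes' theorem: posterior is proportional to prior times likelihood.
    return sum(f * w for (f, _), w in zip(spectrum, weights)) / total

# Invented three-point spectrum: most types are rare, a few are more common.
example_spectrum = [(0.0001, 0.7), (0.0005, 0.2), (0.002, 0.1)]
unseen = posterior_mean_frequency(example_spectrum, count=0, sample_size=1000)
seen_five_times = posterior_mean_frequency(example_spectrum, count=5, sample_size=1000)
```

With this made-up spectrum, a type never seen in the database gets an estimate pulled toward the rare end, and a type seen five times is pulled toward the common end.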
- The terms prosecutor's fallacy
and defense fallacy were, I think, invented by
UC Irvine law Prof. William Thompson.
- Ebenezer, the father of Judy, is alleged to be the father of Judy's child.
What modification to the normal paternity calculation is appropriate?
Answer — None, normally. So long as the testing
uses unlinked co-dominant markers (like standard forensic STRs) and both the mother
and the alleged father are tested (normal trio paternity), the fact of the adults'
relationship is irrelevant.
- mass identification; mass disaster identification
- a DNAVIEW speciality.
- likelihood ratio (LR)
- The central concept of Forensic Mathematics. The way
to quantify forensic — or any — evidence. Any other method is either equivalent to
a likelihood ratio or is nonsense.
The framework we consider is trying to judge between two possible hypotheses. A typical pair of hypotheses would be:
- The suspect is the donor of the rape kit semen, versus the suspect is a random man;
- The tested body B is the missing relative of family F, versus B and F are unrelated.
Evidence, i.e. information or data such as DNA profiles, may be better explained by
- paternity index
- a quaint synonym for likelihood ratio used in the context of
- JS Mill explained it well.
- vs "frequency"
- Allele probability and allele frequency are two different things.
Usually people say allele "frequency" when they mean allele probability.
See Why the quotes on "frequency".
- prior probability
- A probability summarizing the value of the evidence prior to inclusion of (for our purposes) the DNA (i.e. scientific) evidence.
The prior probability is therefore — at least in principle — a subjective assessment of anecdotal and other
evidence that cannot be comfortably quantified. It's worth distinguishing several particular situations:
- criminal case
- The prior probability is the evidential value of evidence (such as testimony, documents, demeanor) that the DNA analyst doesn't even hear,
and which in any case is the responsibility
of the judge or jury to assess. Therefore it is clearly wrong for an expert witness to intrude on the prerogative of the court by
making any prior probability assumption. I don't see anything wrong though with advising the court about how mathematics works,
such as by a picture or a chart of examples.
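A chart of examples of the kind mentioned above might be generated along these lines. This is a minimal sketch: the likelihood ratio of 1000 and the list of priors are arbitrary illustrative choices, and the function names are hypothetical. The point is only that the analyst supplies the likelihood ratio and leaves the choice of row to the court.

```python
def posterior_probability(prior, lr):
    """Bayes' theorem in odds form: posterior odds = prior odds * LR."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * lr
    return posterior_odds / (1 + posterior_odds)

def posterior_chart(lr, priors):
    """Return (prior, posterior) pairs for one fixed likelihood ratio."""
    return [(p, posterior_probability(p, lr)) for p in priors]

# Illustrative values only; no prior is being recommended.
for prior, posterior in posterior_chart(1000, [0.001, 0.01, 0.1, 0.5]):
    print(f"prior {prior:>6.3f}  ->  posterior {posterior:.4f}")
```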
- (civil) paternity case
- In principle the above applies to any court action. But this essay tries to take a realistic view.
- disaster identification
- Links: WTC and tsunami identifications
- When the object of the identification is humanitarian, decision making is typically deferred, in practice, to the
scientists. In that case, two useful concepts are:
- baseline prior probability
- If n people are equally missing and all bodies are indistinguishable from one another, then the prior probability
for any particular identity is 1/n. This baseline prior can be a reasonable starting point in realistic
scenarios as well.
- requisite prior
- Suppose a DNA likelihood ratio has been determined, and a posterior probability threshold for identification has been
agreed. Then I define the requisite prior as that prior which is just sufficient for declaring identity.
Example: Suppose n=1000 missing, LR=80000 supporting corpse V to be missing person Jim Jones, and that the
agreed policy is to declare identification when the probability is at least 99.9%. Using the
baseline prior probability of 1/1000, the posterior probability is only 98.8% and the threshold is not achieved.
A prior of about 1/81, twelvefold larger, is the requisite prior to obtain 99.9%.
Maybe there is some quite useful non-DNA evidence supporting the ID, for example the body V has a surgical scar
approximately like one Jim Jones is known to have had, and the stature is about right as well. It would be hard to estimate the
exact evidential value of those
coincidences, but we don't have to. It is sufficient to judge that they are worth at least
LR=12, the shortfall from the DNA evidence.
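The arithmetic of the Jim Jones example can be checked with a short sketch. The function names are hypothetical; the numbers (n=1000, LR=80000, threshold 99.9%) are the ones from the example, and the calculation is just Bayes' theorem in odds form.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def probability(o):
    """Convert odds back to a probability."""
    return o / (1 + o)

def requisite_prior(lr, threshold):
    """Smallest prior probability whose posterior reaches the threshold."""
    return probability(odds(threshold) / lr)

n, lr, threshold = 1000, 80000, 0.999

baseline = 1 / n                                # baseline prior, 1/1000
posterior = probability(odds(baseline) * lr)    # about 0.988, short of 0.999
needed = requisite_prior(lr, threshold)         # about 1/81
shortfall = odds(threshold) / (odds(baseline) * lr)  # about 12.5

print(f"posterior with baseline prior: {posterior:.4f}")
print(f"requisite prior: 1/{1 / needed:.1f}")
print(f"extra LR needed from non-DNA evidence: {shortfall:.1f}")
```

The shortfall is computed on the odds scale, which is why it comes out slightly above the twelvefold ratio between the two priors.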
another example and discussion
- Everyone is related, so why do we say for example "The suspect is either the father, or is unrelated to the child" to describe the
alternative hypotheses in a paternity question? Two answers:
- It's hard to find an accurate wording that is simple;
- The fiction of literally being unrelated is sometimes a premise of our computational model.
(The paternity formula 1/2q rests on a premise of unrelatedness.)
All that said, when I or anyone says "unrelated" there is a good chance that what we really mean is "randomly
selected, with no more and no less than a random chance to have any particular close or distant relationship".