- Y-haplotype in identification and paternity
Here I discuss the basic principles of using Y-haplotype information to establish identity or paternity.
- Identity
Suppose suspect and crime stain have the same Y-chromosome haplotype. That result is normal
and expected (i.e. 100%) if the suspect is the donor; if the suspect is instead a random man,
its probability is the probability of seeing the haplotype among random men.
The strength of the evidence is therefore simply expressed as matching odds (or equivalently
as a likelihood ratio) of
matching odds = 1 / Pr(haplotype).
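As a minimal sketch of the ratio just described, the following Python computes the matching odds as the probability of a match if the suspect is the donor (taken as 1) divided by the probability of a match if he is a random man. The function name and the example probability of 1/500 are illustrative assumptions, not values from this discussion.

```python
def matching_odds(haplotype_probability: float) -> float:
    """Likelihood ratio for a Y-haplotype match between suspect and stain."""
    if not 0.0 < haplotype_probability <= 1.0:
        raise ValueError("haplotype probability must be in (0, 1]")
    p_match_if_donor = 1.0                     # a match is expected if he is the donor
    p_match_if_random = haplotype_probability  # chance of the type in a random man
    return p_match_if_donor / p_match_if_random


# Hypothetical haplotype probability, chosen only for illustration.
print(matching_odds(1 / 500))
```

The rarer the haplotype, the larger the matching odds; all of the difficulty lies in choosing a defensible value for Pr(haplotype).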
- Paternity: ordinary case
Typically father and son share a Y-haplotype, just as if the son were a crime stain. Therefore
in the typical case the equation above also gives the paternity index:
PI = 1 / Pr(haplotype).
- Paternity: mutation
Of course that is not always true; there are mutations. Available data suggest that the
mutation rates and behavior of STR loci on the Y-chromosome are typical of the genome: around
μ = 1/400 per locus per generation for single-step mutations, though with a lot of variation
from locus to locus.
Suppose a man M has a Y-haplotype that we call YM, and a boy C has the type YC, which
differs from YM by a single step at just one locus.
Obviously, mutation cannot be ignored in this case. μ is the probability of any mutation,
but nearly all (90-95%) STR mutations are one-step, and expansion and contraction are about
equally common. To a reasonable approximation, then, the probability of mutating in a given
direction, from YC to YM or from YM to YC, is μ/2.
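To show how rough the μ/2 approximation is, here is a small Python sketch, assuming the typical rate μ = 1/400 quoted above. It compares μ/2 with the value obtained when only 90% or 95% of mutations are counted as one-step. The function name and the `one_step_fraction` parameter are illustrative assumptions.

```python
def directional_step_probability(mu: float, one_step_fraction: float = 1.0) -> float:
    """Probability of a one-step mutation in one given direction.

    mu is the probability of any mutation at the locus per generation;
    half of the one-step mutations are assumed to go each way.
    """
    return mu * one_step_fraction / 2


mu = 1 / 400  # typical rate quoted in the text; varies by locus
approx = directional_step_probability(mu)        # the mu/2 approximation
low = directional_step_probability(mu, 0.90)     # 90% of mutations one-step
high = directional_step_probability(mu, 0.95)    # 95% of mutations one-step
print(approx, low, high)
```

The refinement changes the value by at most 10%, which is small compared with the locus-to-locus variation in μ itself.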
There are several possible approaches. We write PI for the paternity index, with
PI = X/Y, where
X = Prob(observed haplotypes | M is the father of C) and
Y = Prob(observed haplotypes | M is unrelated to C).
To evaluate Y, we can write
Y = mc where
m=Prob(YM) and
c=Prob(YC).
X is a little more problematic.
- child-centric approach
The child has YC, inherited from his father. A mutation between YC and YM may
have occurred, with probability μ/2.
Therefore, given that a child is type YC, the probability is
approximately μ/2 that his father is type YM.
Hence
X = cμ/2 and
LR = X/Y = X/cu = 3μ/2u.
It remains to estimate u.
- father-centric approach
In a symmetrical way we could begin with the alleged father, and
obtain instead the formula
LR = 3μ/2c.
- Which approach is right? How to estimate c and/or u?
Deep questions. What is right depends on such things as what you think
the population database represents: the grandfather's generation?
The child's? If the population were in drift and mutation equilibrium,
then I suppose all methods would give the same answer.
- Pragmatic estimate of the Y-haplotype evidence
Note that all formulas are equivalent if c = u.
Therefore to be conservative let's take the uncle-centric view and
take c=2/171.
Hence LR = 3(0.009) / [2(2/171)] = 1.15.
The meaning of this neutral result is that the chance to see so
rare a haplotype by mutation is about the same as the chance to see it
at random in an unrelated individual.
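The arithmetic behind this estimate can be checked with a short Python sketch. It assumes μ = 0.009 and c = 2/171, the values that reproduce the figure of 1.15, and takes the leading factor 3 directly from the formula LR = 3μ/2c as written. The function and parameter names are illustrative.

```python
def mutation_lr(mu: float, haplotype_probability: float, factor: int = 3) -> float:
    """LR = factor * mu / (2 * haplotype_probability).

    factor defaults to 3 only because that is the multiplier appearing
    in the formula as written in the text; it is not derived here.
    """
    return factor * mu / (2 * haplotype_probability)


mu = 0.009   # mutation rate implied by the worked figure (assumption)
c = 2 / 171  # haplotype probability used in the pragmatic estimate
lr = mutation_lr(mu, c)
print(round(lr, 2))  # 1.15
```

An LR this close to 1 favors neither hypothesis, which is the neutral result described above.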
- Approach to "frequencies"
The frequency of an unobserved trait is impossible to know. Fortunately frequency isn't the question.
Probability is.
My papers on rare haplotypes offer several approaches.
Bottom line: simple counting (but add 1) is very conservative.
A pretty accurate method that is not complicated is also given. June 2009
- Mitochondrial ID of a sister
Feb2014
Body B and reference sister S have mitotypes mTB
and mTS which differ only by one base, perhaps a mutation.
What is the LR supporting the hypothesis that B is the lost sister of S?
- Formulate mitochondria LR for identification
LR = X/Y where
X = Pr(sisters have types mTB and mTS),
Y = Pr(randomly chosen people have types mTB and mTS).
The difficulty in evaluating X is that we don't know which sister's mitotype is the product of a mutation.
We can take a mother-centric approach, considering that their common mother had one type or the other and assuming
that one sister's mitotype arose by mutation. Then there are two possibilities for the mother's type, so
X = Pr(mother=mTB and S is a product of mutation)
+Pr(mother=mTS and B is a product of mutation)
= Pr(mTB)μ + Pr(mTS)μ
= μ[Pr(mTB) + Pr(mTS)].
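The mother-centric sum can be written out as a small Python sketch. The function name and all numeric inputs are hypothetical, chosen only to show the two terms being added. Here μ stands for the probability of the specific one-base change in one transmission, as in the expression above.

```python
def sisters_numerator(mu: float, p_b: float, p_s: float) -> float:
    """X = Pr(sisters have types mTB and mTS), mother-centric approach."""
    mother_b_then_s_mutated = p_b * mu  # mother was mTB, S carries the mutation
    mother_s_then_b_mutated = p_s * mu  # mother was mTS, B carries the mutation
    return mother_b_then_s_mutated + mother_s_then_b_mutated


# Hypothetical values, for illustration only.
x_value = sisters_numerator(mu=0.01, p_b=0.004, p_s=0.002)
print(x_value)
```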
- Probability of type mTB
As part of the answer we will need mitochondrial random matching probabilities.
As an example evaluate
b = Pr(random person = mTB | mTB observed in body B).
Let's say mTB occurs x times in our reference database of N
mitotypes. Per AlleleProbability.htm,
b < (x+1)/(N+1)
(where the occurrences of +1 represent conditioning on the casework observation
of mTB).
We can improve on that by accepting the logic of the "κ method" according to
which b = λ(x+1)/(N+1).
Here λ is a factor less than one and independent of x.
(Specifically λ = 1 – κ, i.e. λ is the proportion of the N database mitotype
observations that are not singletons.)
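A minimal Python sketch of the two estimates follows, reading κ as the proportion of the N database observations that are singletons, per the definition of λ above. The toy database and the function name are assumptions made for illustration; a real database would hold full mitotypes rather than single letters.

```python
from collections import Counter


def mitotype_probability(database: list[str], target: str) -> tuple[float, float]:
    """Return (kappa-method estimate, conservative bound) for the target type.

    bound    = (x + 1) / (N + 1)
    estimate = lambda * bound, where lambda = 1 - kappa and kappa is the
               proportion of database observations that are singletons.
    """
    counts = Counter(database)
    n = len(database)
    x = counts[target]  # 0 if the type was never seen in the database
    singletons = sum(1 for k in counts.values() if k == 1)
    kappa = singletons / n
    lam = 1 - kappa
    bound = (x + 1) / (n + 1)
    return lam * bound, bound


# Toy database of 7 observations; "B" and "D" are singletons.
toy_db = ["A", "A", "B", "C", "C", "C", "D"]
estimate, bound = mitotype_probability(toy_db, "B")
print(estimate, bound)
```

In this toy case κ = 2/7, so the κ-method estimate is 5/7 of the simple counting bound.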
- Solving the LR expression
What Y is
Evaluating Y is easy. Under the assumption that B and S
are just unconnected random observations,
Y = Pr( ... )
ugh! Can NOT condition on the B & S types!