## Table of contents

Haplotype DNA evidence

- Y-chromosome analysis
- Approach to "frequencies"
- Mitochondrial ID of a sister Feb2014
Analysis of Y-haplotype information in a kinship case

Forensic mathematics home page. Comments are welcome (see home page for email).

- The genetic rules are simpler – the trait is either known to be passed or known
not to be passed (depending on the sexes involved) to each offspring; there are no
choices or 50% probabilities of transmission as with nuclear DNA.
May2018, note added 5-10 years after initially writing this web page: This page discusses only how to handle the complication of relatives and possible mutation. It doesn't deal with the difficult questions of evaluating what I loosely call "Pr(haplotype)" below, or we'd have many further reasons to list:

- Several markers are linked, i.e. physically chained and inherited together, so they must be considered as a unit. No recombination. The product rule doesn't apply at all.
- Autosomal STR allele matching probabilities or frequencies can reasonably be estimated as sample frequencies; for Y haplotypes that is far from true.
- When using population data to estimate the significance of an autosomal STR allele match, it's close to adequate to consider only the allele in question. For Y by contrast, it's important to look at the entire database.
- Putting confidence intervals on a matching probability is mathematically ignorant. However, while it's a fairly harmless error in the autosomal domain, it's a crippling mistake when dealing with Y haplotype matching probabilities.
- Obviously all men are related. Forgetting that when dealing with Y misses the fundamental point that nearly all Y matching is from identity by descent. That's not so with a single STR locus as in the autosomal situation.
- "Theta", the chance of two alleles or haplotypes being IBD (identical by descent), in autosomal practice is a minor adjustment to the matching chance. With Y haplotypes it's the main thing; it's a good approximation to the matching chance and anything *else* is a minor adjustment.
- Explicit models (a careful mathematical approach laying out premises and deriving results from them) can be overlooked in autosomal practice without tragedy. In Y haplotype practice they are vital, and their persistent absence from all early papers and many recent ones results in nonsensical recommendations and practice.
- Geographical clustering is a vital concern with Y haplotypes; in autosomal work not so much.

- Y-haplotype in identification and paternity I discuss here basic principles in using Y-haplotype information for identity or paternity.
- Identity Suppose suspect and crime stain have the same Y-chromosome haplotype. That result is normal and expected (i.e. 100%) if the suspect is the donor; it is the probability of seeing the haplotype among random men if the suspect is a random man.
- Paternity – ordinary case Typically father and son share a Y-haplotype just as if the son were a crime stain. Therefore in the typical case the equation above also gives the paternity index:
- Paternity – mutation Of course that's not 100% true; there are mutations. Available data supports that the mutation rates and behavior for STR loci on the Y-chromosome are typical for the genome; so around
- child-centric approach The child has
- father-centric approach In a symmetrical way we could begin with the alleged father, and obtain instead the formula
- Which approach is right? How to estimate c and/or u?
Deep questions. What is right depends on such things as what you think
the population database represents – grandfather's generation?
the child's? If the population were in drift and mutation equilibrium,
then I suppose all methods would give the same answer.
- Pragmatic estimate of the Y-haplotype evidence
- Approach to "frequencies" The frequency of an unobserved trait is impossible to know. Fortunately frequency isn't the question. Probability is.
- Mitochondrial ID of a sister Feb2014
- Formulate mitochondria LR for identification
- Probability of type mT_{B}
As part of the answer we will need mitochondrial random matching probabilities.
As an example evaluate
- Solving the LR expression

The strength of the evidence is therefore simply expressed as matching odds (or equivalently as a likelihood ratio) of

matching odds = 1 / Pr(haplotype).
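
As a minimal sketch of the relation above, the following Python snippet converts a haplotype probability into matching odds. The function name `matching_odds` and the example probability are hypothetical, chosen only for illustration; how Pr(haplotype) itself should be evaluated is a separate question that this sketch does not address.

```python
def matching_odds(pr_haplotype: float) -> float:
    """Return the matching odds (likelihood ratio) 1 / Pr(haplotype)."""
    if not 0.0 < pr_haplotype <= 1.0:
        raise ValueError("Pr(haplotype) must be in the interval (0, 1]")
    return 1.0 / pr_haplotype


# Hypothetical probability, for illustration only (not from any database).
example_pr = 0.001
example_odds = matching_odds(example_pr)
print(f"matching odds = {example_odds:.0f} to 1")
```

The rarer the haplotype is judged to be among random men, the larger the odds in favor of the suspect being the donor.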

Suppose a man M has Y-haplotype which we call _{M}_{C}_{M}

Obviously, mutation cannot be ignored in this case. Since _{C}_{M}**/2**.

There are several possible approaches. We use the notation PI for the
paternity index, and

PI = X/Y, where

X = Prob(observed haplotypes | F father of C) and

Y = Prob(observed haplotypes | F unrelated to C).

To evaluate Y, we can write

Y = _{M}_{C}

X is a little more problematic.

Hence

X = **• μ/2** and

LR = X/Y = X/

LR =

Note that all formulas are equivalent if

Hence LR = (3 • 0.009 / 2) / (2/171) = 1.15.

The meaning of this neutral result is that the chance to see so rare a haplotype by mutation is about the same as the chance to see it at random in an unrelated individual.
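
The arithmetic behind the LR of 1.15 can be checked with a few lines of Python. This is only a numerical check: the three factors are copied from the text as given, and the grouping (3 × 0.009 / 2 divided by 2/171) is an assumption, chosen because it is the reading that reproduces the stated value.

```python
from fractions import Fraction

# Factors copied from the text; the grouping is an assumption (see lead-in).
numerator = 3 * Fraction(9, 1000) / 2   # 3 x 0.009 / 2
denominator = Fraction(2, 171)          # 2/171

lr = numerator / denominator
print(f"LR = {float(lr):.2f}")
```

An LR this close to 1 favors neither hypothesis, which is the "neutral result" described above.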

My papers on rare haplotypes offer several approaches. Bottom line: simple counting (but add 1) is very conservative. A pretty accurate method that is not complicated is also given. June 2009


LR = X/Y where

X = Pr(sisters have types mT_{B} and mT_{S})

Y = Pr(randomly chosen people have types mT_{B} and mT_{S})

The difficulty in evaluating X is that we don't know which sister's mitochondrial type represents a mutation. We can take a mother-centric approach, considering that their common mother had one type or the other and assuming that one of the sisters' types is a mutation. Then there are two possibilities for the mother's type, so

X = Pr(mother=mT_{B}, mutation to mT_{S}) + Pr(mother=mT_{S}, mutation to mT_{B})

= Pr(mT_{B})μ + Pr(mT_{S})μ

= μ[Pr(mT_{B}) + Pr(mT_{S})]

_{B}_{B}

Let's say mT_{B} occurs *x* times in our reference database of *N* mitotypes. Per AlleleProbability.htm, Pr(mT_{B}) = (*x*+1)/(*N*+1)
(where the occurrences of mT_{B}

We can improve on that by accepting the logic of the "κ method" according to
which Pr(mT_{B}) = λ(*x*+1)/(*N*+1).
Here λ is a factor less than one and independent of *x*.
(Specifically λ = 1 – κ, i.e. λ is the proportion of the *N* database mitotype
observations that are *not* singletons.)
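
The two estimates can be compared in a short Python sketch. It assumes the κ-method estimate reads λ(*x*+1)/(*N*+1), that κ is the fraction of the *N* database observations whose type appears exactly once, and that *x* counts database occurrences only. The function names and the toy database are hypothetical and serve only as illustration.

```python
from collections import Counter


def kappa(database):
    """Fraction of the N observations whose type occurs exactly once."""
    if not database:
        raise ValueError("database must not be empty")
    counts = Counter(database)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(database)


def counting_estimate(target, database):
    """Simple counting estimate (x + 1) / (N + 1)."""
    x = Counter(database)[target]
    return (x + 1) / (len(database) + 1)


def kappa_estimate(target, database):
    """Counting estimate scaled by lambda = 1 - kappa (assumed form)."""
    lam = 1.0 - kappa(database)
    return lam * counting_estimate(target, database)


# Toy database of mitotype labels, invented for illustration.
toy_db = ["A", "A", "B", "C", "D", "D", "D", "E"]
print(kappa(toy_db))
print(counting_estimate("B", toy_db))
print(kappa_estimate("B", toy_db))
```

Because λ is less than one whenever the database contains singletons, the κ-method value is never larger than the simple counting value, consistent with simple counting being the more conservative of the two.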

ugh! Can NOT condition on the B & S types!
