The calculations and formulas here can be obtained from the DNA•VIEW stain calculator


This problem comes courtesy of Prof. Nikita Khromov-Borisov of the Department of Medical Informatics in St. Petersburg's IP Pavlov Medical University. The case came to Prof. Khromov-Borisov from Dr. Andrew Smolyanitsky of the Forensic Medicine Bureau in St. Petersburg. (from Tomsk to Omsk to Pinsk to Minsk to me)

It's a nice case because unlike the handful of other cases I've encountered that combine the complications of mixture and kinship, it's not too hard. Therefore it's a good learning example — illustrating the principles of incorporating the extra complication of kinship into a mixture analysis, yet with few enough steps and simple enough algebra that the conceptual points are easy to keep in mind.

Man B, who disappeared, is suspected of being a contributor to the unknown mixture U.

Reference types are available for the man's daughter and his wife.

Suppose the mixture is assumed to be from two people. Then the competing hypotheses are
Hp: U is B + one random person
Hd: U is two random people


I consider here just the combinatorial aspect of the problem: nothing about peak height or dropout, and nothing about mutation. Each locus is considered independently.
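Treating the loci independently means the per-locus likelihood ratios combine by multiplication. A minimal sketch in Python; the locus names and LR values are made-up placeholders, not figures from this case, and `combined_lr` is a hypothetical helper name:

```python
from math import prod

# Hypothetical per-locus likelihood ratios (illustrative values only).
locus_lrs = {"locus_1": 2.5, "locus_2": 4.0, "locus_3": 0.8}

def combined_lr(per_locus):
    """Multiply per-locus LRs; assumes the loci are independent."""
    return prod(per_locus.values())

print(combined_lr(locus_lrs))
```

A locus with LR below 1 (here the third) pulls the combined figure down, while the others raise it.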

Consider one locus. If the daughter has an identifiable paternal allele (that is, an allele not found in her mother, or her single allele if she is homozygous), call it Q; if the man contributed, Q must be seen in U. If instead she shares both alleles with her mother, call them P and Q, where Q, at least, is in U. Designate any remaining alleles of U with the letters R, S, T. The lower case letters p, q, r, s, t denote the corresponding allele probabilities.

I will use lower case letters from the beginning of the alphabet, namely b, c, d, e, f, for allele-valued variables, i.e. variables whose value might be any of P, Q, R, S, or T.

When the paternal type is known we can call the man's genotype Qb. The random contributors are cd (one random person, under Hp) or cd and ef (two random people, under Hd).

For each pattern, we need the likelihood ratio LR = X/Y where
X = Pr(U | Hp)
Y = Pr(U | Hd).
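Under the purely combinatorial model, X and Y can be checked by brute force: enumerate every ordered assignment of alleles to the unknown gametes and add up the probabilities of the assignments that show exactly the alleles of U. The Python sketch below does this for a four-allele mixture QRST in which the man's paternal allele Q is known. The function name `prob_shows` and the allele frequencies are illustrative assumptions, not part of the DNA•VIEW calculator.

```python
from fractions import Fraction
from itertools import product

def prob_shows(mixture, freqs, n_random, fixed=()):
    """Probability that the `fixed` alleles plus `n_random` random gametes
    show exactly the allele set `mixture` (no dropout, no mutation)."""
    target = set(mixture)
    total = Fraction(0)
    # A gamete carrying an allele outside the mixture can never match,
    # so only the alleles of the mixture need to be enumerated.
    for gametes in product(sorted(target), repeat=n_random):
        if set(gametes) | set(fixed) == target:
            pr = Fraction(1)
            for allele in gametes:
                pr *= freqs[allele]
            total += pr
    return total

# Hypothetical allele frequencies at one locus.
freqs = {"Q": Fraction(1, 10), "R": Fraction(1, 5),
         "S": Fraction(3, 10), "T": Fraction(2, 5)}

# Hp: the man (Q plus an unknown allele b) and one random person (c, d).
X = prob_shows("QRST", freqs, n_random=3, fixed=("Q",))
# Hd: two random people, i.e. four random gametes.
Y = prob_shows("QRST", freqs, n_random=4)
LR = X / Y
print(X, Y, LR)  # 18/125 36/625 5/2
```

Exact fractions are used so that the result can be compared with an algebraic formula without rounding error.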

Various genetic patterns

  1. U is QRST. This is probably the typical situation.

    For Hp, the mixture QRST is a combination of the man Qb and an unknown cd. So RST are bcd in some order.

    X = Pr(QRST | man B contributed)
    = Pr(bcd = RST in some order)
    = Pr([b=R & cd=ST] or [b=S & cd=RT] or [b=T & cd=RS])
    = Pr(b=R)Pr(cd=ST) + Pr(b=S)Pr(cd=RT) + Pr(b=T)Pr(cd=RS)
    = r·2st + s·2rt + t·2rs
    = 6rst.

    (So it would have been simpler to notice in the first place that the alleles R, S, T come ultimately from three gametes in the parents of the two contributors. Any particular assignment of R, S, T to those three gametes has probability rst, and there are 6 ways to permute the three alleles among the gametes.)

    Y = Pr(QRST | two random people)
    = 24qrst

    using the parenthetical reasoning above – this time there are 4 progenitor gametes and hence 24 ways to permute them.

    LR = 6rst / (24qrst)
    = 1 / (4q).

  2. U is QRS.

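    X = Pr(QRS | man contributed)
    = Pr(bcd include both R and S, and the third allele is any of Q, R, S)
    = 6qrs + 3r²s + 3rs²
    = 3rs(2q+r+s).

    Y = Pr(QRS | two random people)
    = 12q²rs + 12qr²s + 12qrs²
    = 12qrs(q+r+s),

    counting, as before, the orderings of the progenitor gametes; one of the three alleles must now appear twice.

    LR = 3rs(2q+r+s) / [12qrs(q+r+s)]
    = (2q+r+s) / [4q(q+r+s)].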

  3. U is PQRS

    The "P" means the paternal type is not known; the man can be either Qb or Pb.

    X = Pr([man is Qb and Qbcd=PQRS] or [man is Pb and Pbcd=PQRS])
    = Pr(man is Qb and Qbcd=PQRS) + Pr(man is Pb and Pbcd=PQRS)
    and each term can be evaluated similarly to the work above; the first, for example, is Pr(man is Qb) × Pr(bcd = PRS in some order).

  4. Other cases – U is Q, or QR, or PQ, or PQR – similar to above.
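The patterns above can be checked numerically with a short enumeration. The Python sketch below is one way to do it, under assumptions that go beyond the text: the helper names and allele frequencies are hypothetical, and for a PQ daughter with a PQ mother the man's transmitted allele is taken to be Q with probability q/(p+q) and P with probability p/(p+q). Those weights follow if the mother passes either allele with equal chance and paternal alleles are drawn at population frequencies.

```python
from fractions import Fraction
from itertools import product

def prob_shows(mixture, freqs, n_random, fixed=()):
    """Probability that `fixed` alleles plus `n_random` random gametes
    show exactly the allele set `mixture` (no dropout, no mutation)."""
    target = set(mixture)
    total = Fraction(0)
    for gametes in product(sorted(target), repeat=n_random):
        if set(gametes) | set(fixed) == target:
            pr = Fraction(1)
            for allele in gametes:
                pr *= freqs[allele]
            total += pr
    return total

def likelihood_ratio(mixture, freqs, paternal):
    """LR for Hp (man + one random person) against Hd (two random people).
    `paternal` maps each candidate transmitted allele to its weight."""
    x = sum(w * prob_shows(mixture, freqs, 3, fixed=(a,))
            for a, w in paternal.items())
    y = prob_shows(mixture, freqs, 4)
    return x / y

# Hypothetical allele frequencies at one locus.
freqs = {"P": Fraction(1, 20), "Q": Fraction(1, 10), "R": Fraction(1, 5),
         "S": Fraction(3, 10), "T": Fraction(7, 20)}
p, q = freqs["P"], freqs["Q"]

known_q = {"Q": Fraction(1)}                      # paternal allele known to be Q
pq_weights = {"Q": q / (p + q), "P": p / (p + q)} # assumed weights, see lead-in

print(likelihood_ratio("QRST", freqs, known_q))     # pattern 1: 5/2
print(likelihood_ratio("QRS", freqs, known_q))      # pattern 2: 35/12
print(likelihood_ratio("PQRS", freqs, pq_weights))  # pattern 3: 10/3
```

The remaining patterns (Q, QR, PQ, PQR) need only a different mixture string and, where the paternal allele is ambiguous, the weighted `paternal` mapping.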