Published in the 1999 ISFG proceedings
Eventually I hit upon a useful heuristic for comparing among multiple possibilities. It consists of arranging the possibilities in a diagram known mathematically as a lattice, after which a small amount of work usually eliminates with near certainty all incorrect assignments of identity to body.
Applications where the method discussed may be useful include mass
disasters, multiple graves (as in recent Balkan wars), and some
complicated kinship, immigration, or
inheritance problems.
Discussion
1. Likelihood ratios
In using genetic typing results to decide between two possible
ways that a set of people may be related, using a likelihood ratio is
natural. Suppose, for example, Mother=PS, Child=PQ, Man=RQ are the
genotypes, and suppose that the man either is the father or is
unrelated to the child. Then
X=(2ps)(2qr)(1/4) is the probability of
observing such types if the man is the father and
Y=(2ps)(2qr)(q/2) is the
probability of observing such types if the man is unrelated. The
ratio, PI=X/Y=1/(2q) is the
likelihood ratio favoring paternity.
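The arithmetic can be checked with a minimal sketch. It assumes that genotype frequencies follow from allele frequencies p, q, r, s in the usual way (a heterozygote PS has frequency 2ps, and the man's genotype RQ has frequency 2qr). The function name and the numeric allele frequencies are illustrative assumptions, not values from the casework.

```python
from fractions import Fraction

def paternity_index(p, q, r, s):
    """Likelihood ratio X/Y for Mother=PS, Child=PQ, Man=RQ."""
    mother = 2 * p * s  # frequency of the mother's genotype PS
    man = 2 * q * r     # frequency of the man's genotype RQ
    # Man is the father: mother transmits P (1/2) and man transmits Q (1/2).
    x = mother * man * Fraction(1, 4)
    # Man is unrelated: mother transmits P (1/2); a random paternal allele is Q (q).
    y = mother * man * q / 2
    return x / y

# Illustrative allele frequencies (assumed, not from the paper).
q = Fraction(1, 10)
pi = paternity_index(Fraction(1, 5), q, Fraction(3, 10), Fraction(2, 5))
print(pi)  # 5, which equals 1/(2q)
```

The genotype frequencies of the mother and the man cancel in the ratio, so only q affects the result.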
2. Likelihoods
assumption | father | uncle | unrelated |
---|---|---|---|
relative likelihood of evidence | X/Y | (X/Y + 1)/2 | 1 |
This is quite a feasible approach when there are only a
handful of possibilities, few enough that one is willing to
make a separate calculation for each of them.
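As a small illustration of the table, the sketch below computes the three relative likelihoods from a given value of X/Y and then compares the best explanation with the runner-up. The function names and the input value 5.0 are assumptions made for the example.

```python
def relative_likelihoods(pi):
    """Likelihood of the evidence under each assumption, relative to 'unrelated'."""
    return {"father": pi, "uncle": (pi + 1) / 2, "unrelated": 1.0}

def best_vs_second(likelihoods):
    """Return the best and second-best assumptions and the ratio between them."""
    ranked = sorted(likelihoods.items(), key=lambda kv: kv[1], reverse=True)
    (best, l1), (second, l2) = ranked[0], ranked[1]
    return best, second, l1 / l2

rel = relative_likelihoods(5.0)  # 5.0 is an illustrative value of X/Y
best, second, ratio = best_vs_second(rel)
print(rel)
print(best, second, ratio)
```

Each assumption needs its own entry in the dictionary, which is exactly why this linear approach stops being practical as the number of possibilities grows.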
3. Myriad possibilities
But for a really complicated problem, some further simplification and
systematization is desirable. While trying to confirm body part
identifications from the September 1998 Swissair crash near Halifax,
Nova Scotia, I developed an approach that I call the lattice
method.
The method is illustrated by considering one of the complicated family identification problems that arose. Five members of the X__ family perished in the crash. The child Albon was not on the plane and was the one living reference. Among the DNA profiles from body parts recovered at the crash site, there were five that appeared to form a cluster of relationships including Albon. Further, based on the particular patterns, including amelogenin types, a tentative assignment was made of body parts/profiles to names. In the figure, the letter E represents Albon, and the other letters represent DNA profiles that are tentatively ascribed to people, as suggested by the position that the letter occupies in the family tree.
The favored set of tentative identifications, abbreviated GFDCM, is the most likely possibility but not the only one. We set as a goal a likelihood ratio of at least 10^6 when the best explanation is compared with the second best. Many potential alternative explanations are conceivable. At a minimum, those combinations like ?FDCM, meaning G is not Sylvie but is instead another, unrelated person, are consistent with the DNA evidence. The number of combinations obtained by keeping or omitting each of the letters G, F, D, C, and M in the diagram is 2^5 = 32, counting the favored set itself (31 if at least one letter must be omitted). Besides that, it might be possible to exchange some pairs of letters, or to shuffle them around. There are hundreds of combinations to consider.
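The count of omission-type alternatives can be reproduced with a short sketch. It assumes the five-character notation used above, where each position is either the assigned profile letter or "?" for an unrelated person; the function name is hypothetical.

```python
from itertools import product

FAVORED = "GFDCM"

def omission_variants(assignment):
    """All strings obtained by replacing any subset of letters with '?'."""
    variants = []
    for mask in product([False, True], repeat=len(assignment)):
        variants.append(
            "".join("?" if drop else ch for ch, drop in zip(assignment, mask))
        )
    return variants

variants = omission_variants(FAVORED)
print(len(variants))  # 32, including GFDCM itself
print(variants[:3])
```

The list includes the favored set itself, so 31 of the 32 entries are genuine alternatives; explanations that exchange or shuffle letters are not generated here.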
Therefore, the linear method used above for father-uncle-unrelated is not attractive.
In the example shown for the X__ family, one of the top-level likelihood ratios is only 300. In practice we could improve this number by taking into account the "closed system" nature of the crash: if G is not Sylvie, who else could G be? (Additionally, if Sylvie is not G, where is she?) However, considering the X__ family in isolation, ?FDCM is mildly plausible and the 10^6 goal would be missed.
Further likelihood ratio calculations were then made along the arrows leading down from ?FDCM, to ensure that there were no other competing explanations. There is no need (in this case) to calculate arrows more than two levels down the lattice, as shown in the diagram. To see why, consider the likelihood ratio comparing GFDCM at the top with ??D?? four levels down. It is obtained by multiplying together the labels (representing likelihood ratios) of each arrow along the path. That product is 3·10^9, already greater than 10^6, after two terms, and since by the heuristic assumption the remaining terms must be greater than 1, a ratio exceeding 10^6 is assured.
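The early-stopping argument can be sketched as follows. The function name is hypothetical, and the two arrow labels (300 and 10^7) are placeholder values chosen only so that their product is 3·10^9; they are not read off the actual diagram.

```python
def path_ratio_exceeds(arrow_ratios, goal=1e6):
    """Multiply likelihood ratios along a downward path, stopping at the goal.

    Stopping early is justified only under the heuristic assumption that
    every remaining arrow on the path has a likelihood ratio greater than 1.
    """
    running = 1.0
    for steps, ratio in enumerate(arrow_ratios, start=1):
        running *= ratio
        if running > goal:
            return True, steps, running
    return False, len(arrow_ratios), running

# Placeholder labels for the first two arrows of a four-arrow path.
ok, steps, running = path_ratio_exceeds([300.0, 1e7])
print(ok, steps, running)
```

Because the goal is reached after two arrows, the labels of the two remaining arrows never need to be calculated.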
In graphical terms, "trading places" explanations are those that can't be reached from the top of the lattice by following arrows downward. They are connected to the lattice, though, because, for example, G?M?D and GFDCM have a common descendant G????. The lattice structure is helpful for proving that no "trading places" explanation is a good explanation. However, I shall not include examples, and I have omitted them from the example lattice diagram to avoid clutter.
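The common-descendant relation can be illustrated with a minimal sketch. It assumes the positional notation used above, where a node lies below another if it is obtained by replacing letters with "?"; both function names are hypothetical.

```python
def is_descendant(node, ancestor):
    """True if node comes from ancestor by replacing zero or more letters with '?'."""
    return len(node) == len(ancestor) and all(
        n == "?" or n == a for n, a in zip(node, ancestor)
    )

def common_descendant(a, b):
    """Most specific assignment lying below both a and b.

    A position keeps its letter only where the two assignments agree.
    """
    return "".join(x if x == y else "?" for x, y in zip(a, b))

meet = common_descendant("G?M?D", "GFDCM")
print(meet)  # G????
print(is_descendant("G?M?D", "GFDCM"))  # False: not reachable from the top
```

Here G?M?D is not a descendant of GFDCM because M and D have traded places, yet both assignments lie above G????.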
Acknowledgments
Swissair identification team members Ron Fourney, George Carmody, Benoit Leclair, and Chantal Frégeau were particularly helpful during the development of these ideas.