Brenner's Law — rare alleles are common

  1. Popularities
  2. Allele size   Count or popularity
     8             18
     9             16
     10            28
     11            21
     12            14
     13            2
     14            1

    The table above is a typical reference sample for some STR locus. It shows each allele size and the number (count) of observations of that allele. Let's just focus on the count, or popularity, column and suppose we examine many such tables. What will be the most popular number to appear as a count?
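    To make "the popularity of a count" concrete, here is a minimal sketch that tallies, for one reference sample, how many allelic types were seen exactly p times. The dictionary transcribes the sample table above as it is read here (sizes 8 to 14), and the names `allele_counts` and `popularity_spectrum` are hypothetical, chosen only for this illustration.

```python
from collections import Counter

# Allele size -> observed count, as read from the sample table above.
allele_counts = {8: 18, 9: 16, 10: 28, 11: 21, 12: 14, 13: 2, 14: 1}

def popularity_spectrum(counts):
    """Return {p: number of allelic types observed exactly p times}."""
    return Counter(counts.values())

spectrum = popularity_spectrum(allele_counts)
total_chromosomes = sum(allele_counts.values())

# Every chromosome belongs to exactly one allelic type, so summing
# p * (number of types of popularity p) recovers the sample size.
assert sum(p * a for p, a in spectrum.items()) == total_chromosomes

print(spectrum[1], spectrum[2], total_chromosomes)
```

    In this single sample there is one singleton (size 14) and one doubleton (size 13). The question in the text is about what such a tally looks like when it is summed over many samples.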

    m = multiplicity of popularity p within a database
    p = 1 (singletons), p = 2 (doubletons), p = 3 (tripletons)
    αp = total count of types of popularity p across the 801 databases
    p·αp = total chromosomes accounted for

            Number of databases with      Fraction of databases with
            m types of popularity p       m types of popularity p
    m       p=1     p=2     p=3           p=1     p=2     p=3
    0       296     467     547           0.37    0.58    0.68
    1       282     237     213           0.35    0.3     0.27
    2       121     75      35            0.15    0.09    0.04
    3       48      14      4             0.06    0.02    0.005
    4       31      6       2             0.04    0.007   0.002
    5       11      1                     0.01    0.001
    6       4       1                     0.005   0.001
    7       5                             0.006
    8       3                             0.004
    αp      930     464     303           1.0     1.0     1.0
    p·αp    930     928     909           ← equal under the Law

    I examined a large collection of mostly published STR reference "databases", or population samples, of moderate size. I tabulated 801 of them, each having from 100 to 1000 chromosomes (observations). Singletons (allelic types with a count of one) are by a large margin the most popular, occurring 930 times in total among the 801 databases. 63% of the databases had one or more once-observed allelic types, or "singletons", and on average there were 1.16 singletons per database. I suggest the word popularity for the number of times something has occurred. A singleton means an allelic type of count or popularity p=1 in a database. If we denote by αp the number of allelic types of popularity p found across the whole collection, then we can say that α1=930 is the popularity of singletons, and that singletons are very popular. Obviously these 930 singletons represent 930 (fragments of) chromosomes.

    Doubletons — types of count p=2 — had a total popularity of α2=464 among the 801 databases. Since each doubleton represents two observations, in total they account for 2·α2=928 chromosomes, nearly the same as the singletons. And the 303 tripletons represent a similar number, 3·α3=909, of total observations.
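    The totals quoted above can be re-derived from the per-database tallies. The following is a minimal sketch, assuming the count columns of the table are read as transcribed in the lists below (index m holds the number of databases with exactly m types of popularity p). The names `databases_with_m_types` and `summarize` are hypothetical.

```python
# databases_with_m_types[p][m] = number of databases (out of 801) that
# contain exactly m allelic types of popularity p. Values as read from
# the table above; missing cells are treated as zero.
databases_with_m_types = {
    1: [296, 282, 121, 48, 31, 11, 4, 5, 3],
    2: [467, 237, 75, 14, 6, 1, 1],
    3: [547, 213, 35, 4, 2],
}

def summarize(column):
    """Return (number of databases, alpha_p) for one popularity column."""
    n_databases = sum(column)
    alpha = sum(m * n for m, n in enumerate(column))
    return n_databases, alpha

for p, column in databases_with_m_types.items():
    n_databases, alpha = summarize(column)
    # p * alpha is the number of chromosomes carried by types of popularity p.
    print(p, n_databases, alpha, p * alpha, round(alpha / n_databases, 2))

# Share of databases with at least one singleton (the m = 0 row excluded).
share_with_singleton = 1 - databases_with_m_types[1][0] / 801
```

    Each column sums to 801 databases, the α values come out as 930, 464 and 303, and the products p·αp as 930, 928 and 909, which is the near-equality the text points to.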

  3. Brenner's Law
  4. All of which suggests Brenner's Law, the rule that

    The number p·αp of alleles represented by database popularity p is constant over p.

    How well does it hold up? Look at the dotted line in the image at right. It's not highly accurate; let's call it a rule of thumb. It's moderately supported by the data shown, but it is also suggested by more than the data presented here. An earlier study I did, based on RFLP markers, conforms more closely. Most importantly, there is a theoretical underpinning. In fact I first investigated this distribution to compare STR markers with Ewens's sampling distribution for the ideal situation of "infinite alleles." Brenner's Law follows from Ewens's formula in the limit as the mutation rate goes to zero. Of course STRs violate all of the assumptions of the infinite alleles model with zero mutation:

    1. "Infinite alleles" means that mutation is always to a new type; STR mutations are nearly always to an already existing type.
    2. STR mutation rates are about 1/350 per meiosis, not zero.
    3. Real populations grow and have immigration.

    so we cannot expect accuracy. Of the three points above, the main reason the data don't conform to the Law is #1: convergent mutation in STRs is an influence towards common types. Point #2 compensates somewhat: a very high mutation rate, such as exists for Y-haplotypes, discourages common types.
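    The limiting claim can be checked numerically. The sketch below uses the standard expectation under the Ewens sampling formula, E[αp] = (θ/p) · n!/(n-p)! · Γ(n-p+θ)/Γ(n+θ), where θ is the mutation parameter of that formula and n is the sample size. The values n = 400 and θ = 0.01 are illustrative assumptions, not estimates from the 801 databases, and the function name `expected_alpha` is hypothetical.

```python
from math import exp, lgamma

def expected_alpha(p, n, theta):
    """E[alpha_p] under the Ewens sampling formula for a sample of size n.

    E[alpha_p] = (theta / p) * n!/(n-p)! * Gamma(n-p+theta)/Gamma(n+theta)
    Computed in log space so that large n does not overflow.
    """
    log_ratio = (lgamma(n + 1) - lgamma(n - p + 1)
                 + lgamma(n - p + theta) - lgamma(n + theta))
    return (theta / p) * exp(log_ratio)

# Hypothetical sample size and a small mutation parameter (assumptions).
n, theta = 400, 0.01

# Under the Law, p * E[alpha_p] should be nearly the same for every small p.
products = [p * expected_alpha(p, n, theta) for p in (1, 2, 3)]
print(products)
```

    For small θ the ratio of factorials and gamma functions tends to n/(n-p), so p·E[αp] is approximately θ·n/(n-p), which is nearly constant as long as p is much smaller than n. That is the sense in which the Law follows from the formula; it says nothing about how well real STR data, which break the model's assumptions, will match it.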

    The general point is that nature strongly favors rare alleles.

  5. Consequences of Brenner's Law
  6. Brenner's Law is an observation about the comparative prevalence of rare and of common forensic STR allelic variants.


Return to home page of Charles H. Brenner