On DNA Origins

Many relatives have asked me about the ethnic origins estimates from their DNA.
This page answers those questions. -- Wesley Johnston

The Bottom Line

The bottom line is that the science of estimating ethnic origins from DNA still has much to learn. Do not take these estimates as statements of fact; they are only estimates. They will change as more is learned and as better estimating tools are developed from that new knowledge.

In particular, completely disregard any origin assigned a small percentage (less than 10%), as these are extremely likely to be wrong.

And even the large percentage origins will probably look different 3 years from now. Mine certainly looked different 3 years ago than they do now.

Here is an excellent answer posted 20 Jun 2016 by Jim Bartlett on the Genealogy-DNA e-mail list. Someone had noted that while there are several ethnic origin admixture estimators on GEDmatch, "none of them agree with each other." This is Jim Bartlett's explanation of how that can be so.

Oh - that's just an indication of how imprecise admixture prediction is. Think about it: at 7 generations back (200 yrs ago?) you had 128 ancestors.

  1. You got roughly 1% of your DNA from each one. So that 1% of you is translated into your "admixture" for that ancestor.
  2. Which 1% of each ancestor did you get? Was it a representative 1%, or might it be some very small % of that ancestor's admixture?
  3. Do you really know the admixture of each of your 128 ancestors 7 generations back?
  4. If you're thinking of your ancestry 500 years back, the number of different ancestors increases a lot.
  5. Do each of the admixture tools really have accurate reference populations (some are based on where our 4 grandparents were born)?
  6. Do the admixture tools have reference populations that would cover each part of the genome, so that each 1% of your DNA is accurately read? Some report admixture to a hundredth of a % - pretty silly to my way of thinking.
  7. Looking at admixture for my brother and me, you'd never guess we were full siblings with exactly the same ancestry - we each got a quarter of each parent's DNA that the other one didn't.

Take admixture predictions with a grain of salt! Fun to look at, dangerous to rely on too heavily.
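
The arithmetic behind the first points in the list above can be checked with a short sketch. It assumes a simple doubling of ancestors each generation with no pedigree collapse (no ancestor appearing in the tree more than once), and it reports only the average share per ancestor. The actual share inherited from any one ancestor varies, which is part of the point being made. The function names are just for illustration.

```python
# Illustrative arithmetic only: ancestor counts and the AVERAGE share of
# autosomal DNA per ancestor at each generation back.
# Assumption: no pedigree collapse, so the count doubles every generation.

def ancestors_at(generation: int) -> int:
    """Number of ancestors exactly `generation` generations back."""
    return 2 ** generation


def average_share_percent(generation: int) -> float:
    """Average percentage of your DNA that came from each of those ancestors."""
    return 100.0 / ancestors_at(generation)


if __name__ == "__main__":
    for g in range(1, 11):
        print(f"{g:2d} generations back: {ancestors_at(g):5d} ancestors, "
              f"about {average_share_percent(g):.2f}% from each on average")
```

At 7 generations this gives 128 ancestors and an average of about 0.78% from each, which is the "roughly 1%" in the quote. Going back only three more generations pushes the count past 1,000 and the average share below 0.1%.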

Examples of How Estimates Vary or Go Completely Wrong

Example 1: Variations Among Testing Company Estimates

The chart at the top of this page compares the 2 Jan 2016 ethnic origin estimates for the same person, based on that person's autosomal DNA tests at Ancestry, Family Tree DNA, and 23andMe. The matrix shows the specific estimates made by each company's estimating method. The chart on the right shows visually how the companies compare on the person's top four origin locations.

Clearly, the 23andMe categories make it a bit of an apples to oranges comparison. But the East Europe, Scandinavia and West Europe (which 23andMe calls French and German) categories are apples to apples.

Nevertheless, while Ancestry and FTDNA are similar in their Great Britain estimates (34% and 39%, respectively), 23andMe is drastically lower (7.2% as British & Irish). If you add the 32.1% that 23andMe calls Broadly Northwestern Europe, you have 39.3%, which is very much in line with the other two companies.
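
The comparison in the paragraph above is simple addition, sketched here with the Great Britain figures quoted in the text. Folding 23andMe's Broadly Northwestern Europe into British & Irish is the grouping used on this page for the sake of comparison, not a rule published by any of the companies, and the variable names are only illustrative.

```python
# Great Britain estimates quoted in the text (2 Jan 2016 results).
great_britain = {"Ancestry": 34.0, "FTDNA": 39.0}

# 23andMe splits the same ground into two categories.
# Assumption: adding them together is this page's grouping, not 23andMe's.
twenty_three_and_me = {
    "British & Irish": 7.2,
    "Broadly Northwestern Europe": 32.1,
}

combined = round(sum(twenty_three_and_me.values()), 1)
great_britain["23andMe (combined)"] = combined

# Difference between the highest and lowest company estimate.
spread = round(max(great_britain.values()) - min(great_britain.values()), 1)

for company, percent in great_britain.items():
    print(f"{company}: {percent}%")
print(f"Spread across companies: {spread} percentage points")
```

With the two 23andMe categories combined, the three estimates sit within about 5 percentage points of each other, compared with a gap of more than 30 points when British & Irish is taken alone.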

The other three main groups fall roughly within the same range across the three companies, with 23andMe's Broadly European as the wild card.

So on the main four groups, the three companies' estimators are in the same ballpark. But for the small percentages, they vary greatly. Just what is that last 2 or 3 percent? Southern European? Asia Minor? East Asia? At one point, at least one of the companies was also including Neanderthal, but that is gone in the more recent estimates: the DNA did not change, but the estimator changed.


Example 2: The Jewish Child of Non-Jewish Parents

A husband and wife and their two children all tested on Family Tree DNA. FTDNA's estimator showed that neither parent had Jewish ethnic origin, and one of the children had no Jewish ethnic origin. But the other child did have Jewish ethnic origin at about 14%. The FTDNA estimator has since improved and no longer makes this error, and that is a good thing. I did not capture the ethnic origin results when the estimator erred, so I cannot provide the specifics. This case illustrates how new knowledge leads to improved estimators and the elimination of errors created by earlier estimators. But this case also illustrates how dangerous it can be to draw conclusions based on the estimates alone.

In this case, another member of the family questioned the paternity of the child that the estimator showed with Jewish origin. If the estimator had shown the Jewish ancestry correctly, then that paternity question would have been a reasonable one to ask. But the question was easily answered by the autosomal DNA results, which clearly show that the husband was indeed the father of both children. So the estimator was wrong: it is impossible for a father and mother who have no Jewish ancestry in their origins to have a DNA-verified child who has 14% Jewish ancestry. And FTDNA has corrected whatever caused this error, even though they never received word of this individual case.

The lesson in this case is to always go back to the raw DNA results. When an origin estimate raises a question, see what the actual DNA shows. It very well may be that the estimator is wrong. Do not jump to conclusions based on the ESTIMATES (keep that word foremost in your mind - estimates are not facts; they are only estimates) of ethnic origins.


Last updated June 23, 2016
Copyright © 2016 by Wesley Johnston - all rights reserved