Science of Admixture

Garrett Hellenthal, statistical geneticist at the Genetics Institute, University College London, gave a very interesting talk at the recent Who Do You Think You Are convention in England on the science of how DNA companies predict ethnicity. Given that ethnicity is the main reason the majority of clients get their DNA tested at AncestryDNA, it is a very good overview of how the different DNA testing companies arrive at their ethnicity conclusions.

Gedmatch and WikiTree

Gedmatch is now accepting links to WikiTree, which show up in the new Ged/Wiki column. This is a welcome alternative to uploading a GEDCOM, which has to be deleted and re-uploaded each time the file changes: a tree on WikiTree can be updated continuously in the WikiTree community, and the link will always point to the latest version. To learn more about this new feature click here.

Gedmatch Generations

One of the tools on Gedmatch that is very useful but which causes some confusion is the column headed “Gen”, which means “Generations”. The number in this column is the estimated number of generations back to the common ancestor shared by you and your match. We easily understand that 2 generations back to a common ancestor means we are first cousins, 3 generations back means we are second cousins, and so on. What confuses many people is how you can have in-between numbers like 2.6, 3.9 or 4.1.

This number should be understood as a guesstimate or guideline that shows roughly how far back you might start looking for your common ancestor. To quote genealogist Kerry Scott, the generation estimates are “not etched in stone – they are etched in sand at best!” This is due to the random way that DNA is inherited between each generation – it is very common, once you get beyond a couple of generations, for a segment of DNA to remain intact and be passed down over several generations without changing. This means that the number of generations given for your match might be exactly the same as that shown for your match’s parent or grandparent!

Another problem comes when you might be three generations back to a common ancestor and your match is from a closer generation and they are only two generations back to the same common ancestor. How do you work that out?!

So I had a look at the known connections I have with my genetic cousins who’ve uploaded to Gedmatch. Below, for each generation distance that Gedmatch calculated to the most recent common ancestors, I give the actual relationships I found. As you can see, the more distant the common ancestor, the more varied the possible relationships.

Gen 1
Well, that’s easy – it’s always going to be a parent-child relationship

Gen 1.2
Oddly, this is always a sibling relationship

Gen 1.4
Half-sibling
Uncle ~ niece
Grandparent

Gen 1.5
Uncle ~ niece
This makes sense: the common ancestors for my uncle are his parents, which is 1 generation, but for me, his niece, it is my grandparents, 2 generations. Therefore, the Gedmatch Generation is calculated as being between 1 and 2 = 1.5

Gen 1.6
Uncle/aunt ~ niece/nephew
Grandparent

Gen 1.9
1C (first cousins), whose common ancestors are their grandparents, which is 2 generations

Gen 2.2
1C (first cousins)
1C1R (first cousins once removed)

Gen 2.3
1C1R (first cousins once removed)

Gen 2.5
1C1R (first cousins once removed)
Again, this makes sense: my cousin is a generation older than me, so his grandparents, who are 2 generations back for him, are my great-grandparents, who are 3 generations back for me. Therefore, the Gedmatch Generation is calculated as being between 2 and 3 = 2.5

Gen 2.6
1C1R (first cousins once removed)
2C (second cousins)

Gen 2.9
2C (second cousins)

Gen 3.0
2C (second cousins)
This is the ideal scenario, with the common shared ancestors for me and my match both being 3 generations back.

Gen 3.3
2C1R (second cousins once removed)

Gen 3.5
2C1R (second cousins once removed)
Again, this makes sense: my second cousin is a generation older than me, so her G-grandparents, who are 3 generations back for her, are my GG-grandparents, who are 4 generations back for me. Therefore, the Gedmatch Generation is calculated as being between 3 and 4 = 3.5
3C (third cousins)
2C2R (second cousins twice removed)
Here we have a case of our common ancestors being my G-grandparents, 3 generations, but these ancestors are my match’s GGG-grandparents, 5 generations: a difference of two generations between me and my match.

Gen 3.6
2C1R (second cousins once removed)
3C (third cousins)

Gen 3.7
2C1R (second cousins once removed)
3C (third cousins)
3C1R (third cousins once removed)

Gen 3.8
2C2R (second cousins twice removed)
3C (third cousins)

Gen 3.9
3C1R (third cousins once removed)

Gen 4.0
3C (third cousins)
This is the ideal scenario, with the common shared ancestors for me and my match both being 4 generations back.

Gen 4.1
2C1R (second cousins once removed)
2C2R (second cousins twice removed)
2C3R (second cousins three times removed)
This is quite an unusual case because our shared common ancestors are my GGGG-grandparents, 6 generations back, but for my match they are only his G-grandparents, 3 generations back. That’s a difference of three generations, even though my match is just 10 years older than me! This is because I am descended from the eldest child of our common ancestor, but my match is descended from the youngest child, who was 25 years younger. Likewise, my ancestors were the eldest children of young parents, but my cousin’s parents and grandparents had children late in life. The result is this apparent shift of three generations even though my match and I are in the same present-day generation! Yes, just think about it for a moment 😀
3C1R (third cousins once removed)

Gen 4.2
2C1R (second cousins once removed)

Gen 4.3
3C (third cousins)

Gen 4.4
2C2R (second cousins twice removed)
3C (third cousins)
3C1R (third cousins once removed)
3C2R (third cousins twice removed)
4C (fourth cousins)
4C1R (fourth cousins once removed)
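
The arithmetic behind the worked examples above (1.5, 2.5 and 3.5) and the cousin labels used in the list can be sketched in a few lines of Python. This is only an illustration of the reasoning in this post, not Gedmatch's actual calculation, which estimates the figure from shared DNA and so will often differ from the simple midpoint. The function names are made up for this example, and the inputs are the number of generations from each person back to the shared ancestors.

```python
def midpoint_generations(gen_a: int, gen_b: int) -> float:
    """Average of the two generation counts, as in the 1.5 / 2.5 / 3.5 examples.

    Assumption: this mirrors the explanation given in the post, not
    Gedmatch's own algorithm.
    """
    return (gen_a + gen_b) / 2


def cousin_label(gen_a: int, gen_b: int) -> str:
    """Build a label such as '2C1R' from two generation counts.

    The cousin degree comes from the person closer to the shared ancestors
    (2 generations back = first cousins), and the number of 'removes' is the
    difference between the two generation counts.
    """
    if min(gen_a, gen_b) < 2:
        # Parents, siblings, aunts and uncles are not cousin relationships.
        raise ValueError("both people must be at least 2 generations from the ancestors")
    degree = min(gen_a, gen_b) - 1
    removed = abs(gen_a - gen_b)
    return f"{degree}C" + (f"{removed}R" if removed else "")


if __name__ == "__main__":
    # (my generations back, my match's generations back)
    for mine, theirs in [(2, 2), (3, 2), (4, 3), (3, 5), (6, 3)]:
        print(mine, theirs, cousin_label(mine, theirs), midpoint_generations(mine, theirs))
```

For instance, the unusual case described under Gen 4.1 (6 generations back for me, 3 for my match) comes out as 2C3R, matching the relationship given there.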

AncestryDNA article in Nature Communications

If you do not opt out of the research component when you first sign up for AncestryDNA, your DNA results as well as your public family tree are used as data for scientific research. Although this can include the use of your results by third-party companies associated with AncestryDNA, the data is also used for projects like the new article that has been published in Nature Communications. This article examines recent migrations within the United States and reveals ethnic clusters that are supported by DNA data.

You can read about the research on the Ancestry Blog:
Nature Communications publishes AncestryDNA breakthrough on genetic communities

You can read the open access article in Nature Communications:
Clustering of 700,000 genomes reveals post-colonial population structure of North America

What is a genetic genealogist

Blaine Bettinger poses the tricky question, “What is a genetic genealogist?” in his latest blog post. Has the time come when DNA is such an important source of evidence that we no longer need to define it separately? The post also drew a lot of comment in the ISOGG Facebook group, with many people supporting this point of view while others still think that DNA is a sideline to the main event of ‘genealogy’.

What is a genetic genealogist?