Wesley Johnston's
Full (Lower and Upper Bound) DNA Coverage Calculation
Excel Visual Basic Program Source Code and Instructions

Oh, what a tangled web we weave, when first we practice to conceive.


I have submitted my May 2023 paper Fully Calculating Autosomal DNA Coverage Recursively for publication in The Journal of Genetic Genealogy. The paper includes the source code for the Excel Visual Basic program to calculate the lower and upper bound estimates of the range of DNA coverage of anyone in a GEDCOM from the DNA test results of their descendants.

This program is a proof-of-concept. I wanted to prove that a simple program to calculate DNA coverage bounds is possible so that commercial vendors of genealogical database software would see how easy it is to implement a DNA coverage report as a standard feature if they want to be considered as quality products from this time forward. Tens of millions of people now have an autosomal DNA test as a life event so that the products are long overdue in providing this feature.

I want to be clear. I want every genealogical software product to make DNA coverage estimation a standard feature.

So, the code here is neither copyrighted nor patented. It is free to use for anyone who wants to use it.


Instructions for Using the Program
    Setting Things Up
  1. The program works on a GEDCOM export of your family database. But it requires that autosomal DNA testers have a life event to indicate that they have done an autosomal DNA test. So, the first thing to do is to use your family database software to create a user-defined life event called "atDNA" for anyone in your tree who has done an autosomal DNA test. I used Legacy Family Tree to do this. But I think most software allows users to create user-defined live events.
  2. Then export your database to a GEDCOM file. You'll also need to note down the RIN (Record Indentification Number) of the person whose autosomal DNA coverage estimate that you want to calculate.
  3. Open a new Microsoft Excel file and worksheet.
  4. Create a worksheet in the Excel file, and name it "Results Report".
  5. In the Excel spreadsheet, if you do not see a "Developer" tab, right click on the ribbon at the top of the spreadsheet and click on "Customize the Ribbon". Then click the checkbox on the right side for "Developer" and then click "OK".
  6. Click on "Developer" in the top ribbon. Then click on "Visual Basic" at the left.
  7. On the left side of the Visual Basic Editor, you should see "Sheet1(Results Report)". Click on that to enable the right-side panel (a big blank area). Copy the program code from this web page (see below), and paste it into that blank window. Then do a FILE/SAVE from the menu at the top of the Visual Basic Editor. This will save the file as an XLSM file which is the file.

    Entering Your Instructions

  8. To tell the program what you want to do, you must make two changes to the program. These are both right at the top, and there is a comment box there to tell you exactly what to change. You put the RIN of the target person whose DNA coverage you want to calculate in as the value on the right side of the equal sign (=) for XPerson. Then inside the quotes on the GEDCOMFile line, you must enter the complete path name where to find the GEDCOM file.

    Running the Program

  9. Now you are ready to run the program. Click on the "Run" at the top and then on "Run Sub/UserForm" in the pulldown menu. This will popup a box where you have to click "Run" yet again.
  10. The program will run and then display popups as it completes the steps through the program. Just click OK on each pop up to close it and continue the processing of the program.
  11. When it says "DONE", click okay and view the "Results Report" worksheet to see the complete report, which will include every family group needed to do the calculation. The lower bound progress is first, with the target person being the last one. Then the upper bound report follows the same progress.

The Program Code

The diagnostic program code included in my paper prints the large internal Table 1 for each call level (each generation), so that the exact calculations can be traced. This version is fixed as it is in the paper. It has limits. Most significantly it cannot handle families with more than 12 children.

The production code can handle families with 18 children, and it prints the results report without the massive detail of the internal Table 1. In work on actual families, it has handled a family of 17 children. It is set up to handle families of up to 18 children of a parent, but I have not yet found an actual family to test that and have not configured a test case either.

If you want to understand the program, use the diagnostic code that prints out Table 1 for each generation. But if you want to use the code for your real-life ancestors in your GEDCOM, use the production code.

The diagnostic code (same as in the paper) is in this text file.

The production code is in this text file.


Text - Copyright © 2023 by Wesley Johnston
The source code for the program is not copyrighted, and you are free to copy it and use it.
There are no guarantees nor warrantees on its use, and it will not be maintained or upgraded, and I will not answer e-mail questions about it.


First posted May 3, 2023
Last updated June 28, 2023