Royal Society Publishing

Ancient human mtDNA genotypes from England reveal lost variation over the last millennium

A.L Töpf, M.T.P Gilbert, R.C Fleischer, A.R Hoelzel


We analysed the historical genetic diversity of human populations in Europe at the mtDNA control region for 48 ancient Britons who lived between ca AD 300 and 1000, and compared these with 6320 modern mtDNA genotypes from England and across Europe and the Middle East. We found that the historical sample shows greater genetic diversity than for modern England and other modern populations, indicating the loss of diversity over the last millennium. The pattern of haplotypic diversity was clearly European in the ancient sample, representing each of the modern haplogroups. There was also increased representation of one of the ancient haplotypes in modern populations. We consider these results in the context of possible selection or stochastic processes.

1. Introduction

Since the appearance of anatomically modern humans in Europe in the Late Upper Palaeolithic (approx. 40 000 years BP), there have been demographic changes, which can be interpreted from patterns and levels of genetic diversity. This includes an apparent loss of diversity in Europe during the Holocene (e.g. Marth et al. 2004), signatures of population expansions (e.g. Comas et al. 1997) and Neolithic data suggesting primarily Palaeolithic ancestry in modern Europe (Haak et al. 2005). More recent historical events, especially widespread epidemics during the Middle Ages, provided the potential for the stochastic loss of diversity (or selection). For example, it is known that while some entire families and even villages were wiped out by the plague, others were immune or resistant to it (Clifford 1989). In this study (after Töpf et al. 2006), we use archaeological material to assess historical matrilineal diversity in Britain around the time of the Roman and Saxon invasions, and find direct evidence for the loss of diversity over the last millennium.

2. Material and methods

(a) Samples

Human skeletal remains exhumed from five archaeological sites dated from the fourth to eleventh centuries were studied (see Töpf et al. (2006) for details and mtDNA HVS-I sequences; table 1). The comparative dataset from modern European and Middle Eastern populations was taken from GenBank, the HVR database (corrected version) and MitoMap (N=6320). In particular, modern sequences from England were from Piercy et al. (1993), Miller et al. (1996), Richards et al. (1996) and Helgason et al. (2001).

Table 1

Mitochondrial DNA diversity values. (Combined Modern: England, Wales, northern Germany, Denmark and Norway. Approximate dates: Leicester, fourth century; Norton, fifth to sixth century; Buckland, sixth century; Lavington, sixth century; and Norwich, tenth century. Number of original samples is given parenthetically, among which ‘N’ amplifications were successful. Number of haplotypes (k), gene diversity±s.d. (h±s.d.), Shannon diversity index (Hs), standardized Shannon index, haplotypic richness (r(g), based on a sample g=18), and % nucleotide diversity±s.d. (π±s.d.). Data in plain text are for subsets of ancient Britain and modern England.)

Thorough precautions against the inclusion of contaminant or false sequences were undertaken (see Töpf et al. 2006); however, the possible generation of false diversity from chimeras during amplification is a special consideration for this study. No evidence was detected from sequencing a subset of clones (Töpf et al. 2006), but as an extra precaution we used split decomposition analysis (implemented in SplitsTree v. 4; Huson 1998) to identify networking within a phylogenetic tree as a way to look for possible chimeras (cf. Holmes et al. 1999). Chimeric sequences or post-mortem changes are unlikely to replicate full haplotypes already identified in the modern database (though individual sites may correspond, see Gilbert et al. (2003)); so as a first test, diversity levels were recalculated omitting individuals whose haplotypes were both unique to the ancient sample and involved in any level of networking within the split decomposition tree (test 1 in table 1). More stringent tests omitted all haplotypes involved in networking (test 2) or all haplotypes unique to the ancient sample (test 3).

(b) Summary statistics

Haplotype and nucleotide diversity were estimated taking into account the differences in sample size among the populations. Haplotype diversity was assessed as Nei's unbiased gene diversity (h), the Shannon index and its standardized version (where the index is divided by the maximum possible value for a given sample size; Magurran 1988), and ‘haplotype richness’, denoted as r(g) (based on Petit et al. 1998; implemented in Rarefactor). The proportion of pairwise differences (π) between all pairs of sequences in the sample was calculated, assuming the Tamura–Nei model and a gamma distribution with α=0.26. The level of unbiased gene diversity (h) was statistically compared among populations using the conservative method proposed by Thomas et al. (2002), whereby both bootstrap resampling (10 000 iterations) and standard z-tests are implemented, and the higher of the two p-values is used.
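As an illustration of the haplotype-based statistics described above, the following Python sketch computes Nei's unbiased gene diversity, the Shannon index with its standardized form, and a rarefaction-based haplotype richness from a list of haplotype counts. The function names and the example counts are hypothetical and are not taken from the study. The richness function uses the standard rarefaction formula, which is assumed here to correspond to r(g), and the maximum Shannon value is assumed to be ln(N) for N individuals; the exact conventions used by Rarefactor are not stated in the text.

```python
import math

def nei_gene_diversity(counts):
    """Nei's unbiased gene diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = sum(counts)
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)

def shannon_index(counts):
    """Shannon index: Hs = -sum(p_i * ln(p_i))."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def standardized_shannon(counts):
    """Shannon index divided by its maximum for the sample size.

    Assumption: the maximum is ln(N), reached when every individual
    carries a different haplotype.
    """
    return shannon_index(counts) / math.log(sum(counts))

def haplotype_richness(counts, g):
    """Expected number of haplotypes in a random subsample of g individuals.

    Standard rarefaction formula (assumed to match r(g)):
    r(g) = sum_i [1 - C(N - N_i, g) / C(N, g)]
    """
    n = sum(counts)
    total = math.comb(n, g)
    return sum(1.0 - math.comb(n - c, g) / total for c in counts)

# Hypothetical counts only: 36 haplotypes among 48 individuals,
# NOT the observed haplotype frequencies of the ancient sample.
example_counts = [5, 3, 3, 2, 2, 2, 2] + [1] * 29

print("h     =", round(nei_gene_diversity(example_counts), 3))
print("Hs    =", round(shannon_index(example_counts), 3))
print("Hs/ln(N) =", round(standardized_shannon(example_counts), 3))
print("r(18) =", round(haplotype_richness(example_counts, 18), 2))
```

Because h, the standardized Shannon index and r(g) are each scaled or rarefied to sample size, they can be compared between a sample of 48 and a database of several thousand sequences, which raw haplotype counts cannot.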

(c) Phylogenetic analysis

Sequences were aligned using Sequencher v. 3.0 (Gene Codes Corporation, USA) and polymorphic positions were identified using Mega v. 2.1. Reduced median networks (Bandelt et al. 1995) were constructed using Network v. 2.0. To maximize the resolution, all of the segregating sites found along the 207 bp (including deletions) were used for the analyses. A mismatch distribution was computed using Arlequin, and τ was calculated from the distribution using the same program.

3. Results and discussion

From 48 individuals, 36 haplotypes were authenticated and included (accession numbers DQ191964–DQ192011; Töpf et al. 2006). The reduced median network (RMN) for the ancient populations (figure 1) shows no indication of strong lineage sorting, as most of the major Eurasian haplogroups and sub-haplogroups can be identified in the network (except for H, preHV, J, U3 and U4).

Figure 1

(a) RMN for the 48 HVS-I sequences from ancient Britain. Early ancient Britain (settlements from fourth to seventh century) with dark circles; Late ancient Britain (ninth to eleventh century) with open circles. Node sizes are proportional to frequency. (b) Split decomposition analysis for ancient samples (samples involved in networking are labelled).

Figure 2 shows the mismatch distributions for the ancient samples (all combined) and the samples from modern England. The crest of the distribution for the ancient population (τ=4.78) is shifted to the right compared with modern England (τ=2.44), suggesting an earlier population expansion for the ancient sample. This effect would be expected following a loss of diversity.

Figure 2

Mismatch distribution for the ancient DNA sample compared with that for modern England. Full line, Modern England; dashed line, ancient Britain.
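The mismatch distribution compared above is the frequency distribution of the number of differences between all pairs of sequences in a sample. A minimal Python sketch of that tabulation follows; the function names and the short example sequences are hypothetical, and the sketch does not attempt the model fitting that Arlequin uses to estimate τ.

```python
from collections import Counter
from itertools import combinations

def pairwise_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)

def mismatch_distribution(sequences):
    """Return {number of differences: proportion of sequence pairs}."""
    counts = Counter(
        pairwise_differences(a, b) for a, b in combinations(sequences, 2)
    )
    total = sum(counts.values())
    return {k: counts[k] / total for k in range(max(counts) + 1)}

# Hypothetical aligned fragments, for illustration only.
toy_sequences = ["ACGT", "ACGA", "TCGA"]
for differences, proportion in mismatch_distribution(toy_sequences).items():
    print(differences, round(proportion, 3))
```

A distribution whose crest lies at a larger number of differences corresponds to a larger τ, which is the sense in which the ancient curve is shifted to the right of the modern one.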

Estimates of genetic diversity (corrected for differences in sample size) were computed for historical and modern samples. Counter-intuitively, the genetic diversity of the ancient sample was consistently higher (across all measures) than the diversity observed in modern England (all Caucasians; for h, p=0.046), or for a range of probable source populations, including England, Denmark, Germany and Norway (for h, p=0.009; table 1). The relationship held after dividing modern England by region (table 1), even when the sample from Cornwall (which is known to be relatively depauperate of variation; see Richards et al. 1996) is excluded from the analysis. A comparable diversity level to our ancient sample was seen for seventh to third century BC Etruscan samples (h=0.98), but modern populations in southern Europe also have relatively higher levels of diversity than those in the north (Vernesi et al. 2004).

All archaeological sites were relatively short lived, spanning just over a century, indicating that the high level of diversity has not been inflated by extended temporal sampling. The difference in diversity level in our sample could be an artefact of small sample size, since most human haplotypes are rare, and therefore a small sample may not include many copies of the more common haplotypes (Helgason et al. 2000). To test this, we randomly resampled the full database for the modern England population for 10 subsamples of 48 individuals (where possible reflecting the geographical distribution of the ancient sample). They showed marginally inflated diversity as expected (h=0.897–0.965, mean=0.942), but all remained lower than h for the ancient sample. From a survey of 35 population samples from Europe and the Middle East ranging in size from 35 to 1199 (total N=6320), only three populations, Turkey (N=72), Palestine (N=117) and Belarus (N=55), showed levels of diversity as high as that seen in the ancient sample (data not shown).
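The subsampling check described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the modern dataset is replaced by a hypothetical list of haplotype labels, subsamples are drawn uniformly at random without replacement, and no attempt is made to reflect the geographical distribution of the ancient sample. The function names, the seed and the toy data are not from the study.

```python
import random
from collections import Counter

def nei_gene_diversity(haplotypes):
    """Nei's unbiased gene diversity h from a list of haplotype labels."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

def subsample_diversities(population, size=48, replicates=10, seed=0):
    """Draw random subsamples without replacement and return h for each."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        nei_gene_diversity(rng.sample(population, size))
        for _ in range(replicates)
    ]

# Hypothetical stand-in for a modern database: one common haplotype
# plus many rare ones. These are NOT the study's data.
modern = ["common"] * 56 + [f"rare_{i}" for i in range(202)]

values = subsample_diversities(modern)
print("range of h:", round(min(values), 3), "to", round(max(values), 3))
print("mean h:", round(sum(values) / len(values), 3))
```

Comparing the range of h across subsamples of 48 with h for the ancient sample separates the effect of sample size from a real difference in diversity.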

Little networking was observed in the split decomposition tree (same structure found using Kimura 2P, HKY85 and uncorrected p genetic distance models; figure 1), and none showed parallel edges that might indicate conflicting phylogenetic signals (see Holmes et al. 1999). The networking seen is most probably due to stochastic processes. All precautionary tests (see §2) resulted in measures of diversity higher than that seen for the modern populations (table 1). These very stringent tests show that diversity generated post-mortem or during amplification cannot fully explain the greater variation in the ancient sample.

Based on the assumption that modern England is more cosmopolitan, higher genetic diversity in the ancient sample was unexpected. One possible explanation is the stochastic loss of diversity. The Black Death in 1347–1351 resulted in the loss of a maximum of 50% of the population of Europe, although the impact was not uniform. Outbreaks reoccurred frequently until the fifteenth century (Ziegler 1969). During the second major impact, the Great Plague of 1665–1666, a fifth of the population in London died (Ziegler 1969). Because most human haplotypes are rare, lineages could have been forced to extinction even without a severe population bottleneck (millions remained). This is especially true if related families were lost when whole villages perished (either through shared susceptibility or environment). Furthermore, during the plagues some families were apparently more resistant than others (Clifford 1989), which could have led to high variance in the number of daughters surviving among different families and a consequent loss of mtDNA diversity at the population level. The fact that variance around the mean for h is greater in the ancient population compared with modern England is consistent with the loss of rare lineages (table 1; Bartlett's test, χ²₁=464, p<0.0001; χ²=46.8, p<0.0001 when a random subset of N=48 from England is used).

Another possibility is strong selection for a mtDNA haplotype, suggested in our study by the observed frequency of the CRS (Cambridge reference sequence) haplotype: 6.30% in the ancient sample and 19.53% (N=5529) in modern Europe (21.70% in modern England, N=258; average of 18.90% for eight modern European populations; N=32–54). All of these showed significantly higher CRS frequencies compared with the ancient sample (χ²=4.72–6.21; p=0.01–0.03). Table 1 in Vernesi et al. (2004) also reveals a low frequency (7%) for the CRS in the ancient Etruscan sample. Hypothetically for our study, if the effect had been uniform over ca 300 years of the epidemics (approx. 15 generations), then a selection coefficient of 9% would be sufficient to increase the frequency of this haplotype from 0.063 to 0.217 (newborn frequency among daughters times the viability, assuming no migration or mutation). This alone could explain the observed reduction in gene diversity (by the most conservative estimate it would reduce h from 0.98 to 0.95, based on a proportional reduction of frequency equally distributed among the remaining haplotypes). Other studies have shown selection for human mtDNA lineages (e.g. Ruiz-Pesini et al. 2004), but our only evidence in this case is the differential frequency of the CRS haplotype, and this could also occur by drift (though all modern European populations have the high CRS frequency, not just England, and strong lineage sorting was not seen; see above). Helgason et al. (2003) provide evidence for drift distorting haplotype frequencies in the Icelandic population where the contribution of ancestral lines was apparently highly skewed. In reality, either drift or selection (or both) could have been involved, but further research will be required to estimate their relative importance (with modelling complicated by unavailable data, especially on the differential impact of disease on different families, and the extent to which Saxon villages were isolated).
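The arithmetic behind the selection scenario above can be checked with a short Python sketch. It rests on several assumptions that are not confirmed by the text: a deterministic haploid viability model in which the CRS haplotype has relative viability 1 and all other haplotypes 1 − s; a sample size of 48 when converting h back to a sum of squared frequencies; and counts of 3 of 48 (ancient) and 56 of 258 (modern England) reconstructed from the quoted percentages, tested with an uncorrected Pearson χ² on a 2×2 table. All function names are hypothetical.

```python
def crs_frequency_after_selection(p0, s, generations):
    """Haploid viability selection, no migration or mutation.

    Assumption: CRS carriers have viability 1, all others 1 - s, so each
    generation p' = p / (p + (1 - p) * (1 - s)).
    """
    p = p0
    for _ in range(generations):
        p = p / (p + (1.0 - p) * (1.0 - s))
    return p

def gene_diversity_after_shift(h0, n, p0, p1):
    """Recompute Nei's h after one haplotype rises from p0 to p1.

    All other haplotype frequencies are scaled down proportionally.
    """
    sum_sq = 1.0 - h0 * (n - 1) / n      # sum of p_i^2 implied by h0
    other_sq = sum_sq - p0 ** 2          # contribution of non-CRS haplotypes
    scale = (1.0 - p1) / (1.0 - p0)      # proportional reduction
    new_sum_sq = p1 ** 2 + other_sq * scale ** 2
    return n / (n - 1) * (1.0 - new_sum_sq)

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

p_final = crs_frequency_after_selection(p0=0.063, s=0.09, generations=15)
h_final = gene_diversity_after_shift(h0=0.98, n=48, p0=0.063, p1=0.217)
# Reconstructed counts (assumption): CRS vs non-CRS, ancient vs modern England.
chi2 = chi_square_2x2(3, 45, 56, 202)

print("CRS frequency after 15 generations:", round(p_final, 3))
print("h after the frequency shift:", round(h_final, 2))
print("chi-square, ancient vs modern England:", round(chi2, 2))
```

Under these assumptions the recursion ends close to 0.217, the recomputed h is close to 0.95, and the χ² for the England comparison falls at the upper end of the quoted range, so the figures in the text are mutually consistent under this simple model.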


We thank Jack Dumbacher, Alan Cooper, Trevor Anderson, Mandy Marlow, Lorraine Mepham, John Lucas, Liz Popescu, Peter Forster and Shane Richards.


    • Received May 21, 2007.
    • Accepted July 11, 2007.

