## Abstract

Ancient DNA sequences are able to offer valuable insights into molecular evolutionary processes, which are not directly accessible via modern DNA. They are particularly suitable for the estimation of substitution rates because their ages provide calibrating information in phylogenetic analyses, circumventing the difficult task of choosing independent calibration points. The substitution rates obtained from such datasets have typically been high, falling between the rates estimated from pedigrees and species phylogenies. Many of these estimates have been made using a Bayesian phylogenetic method that explicitly accommodates heterochronous data. Stimulated by recent criticism of this method, we present a comprehensive simulation study that validates its performance. For datasets of moderate size, it produces accurate estimates of rates, while appearing robust to assumptions about demographic history. We then analyse a large collection of 749 ancient and 727 modern DNA sequences from 19 species of animals, plants and bacteria. Our new estimates confirm that the substitution rates estimated from ancient DNA sequences are elevated above long-term phylogenetic levels.

## 1. Introduction

Ancient DNA (aDNA) sequences have been proliferating rapidly over the past two decades, with datasets now ranging from population-scale genetic profiles (Lambert *et al*. 2002; Shapiro *et al*. 2004) to palaeogenomic libraries (e.g. Noonan *et al*. 2005). In many cases, the ages of aDNA sequences are known from radiocarbon dating or stratigraphic context, allowing detailed investigations of molecular evolutionary processes through time (Drummond *et al*. 2002). These known ages can be used as calibrations for estimating substitution rates, divergence times and various demographic parameters in the absence of any other calibration points (Rambaut 2000; Drummond *et al*. 2002).

One of the notable characteristics of aDNA datasets is that they have generally yielded elevated estimates of substitution rates (for a summary, see Ho *et al*. (2007)). Typically, such estimates have been intermediate between the mutation rates inferred from pedigree studies (e.g. Denver *et al*. 2000; Howell *et al*. 2003) and the long-term substitution rates obtained from phylogenetic studies (e.g. Brown *et al*. 1979). These results suggest that rate estimates depend on the length of the observational period, with higher rates being measured over shorter periods of time (Ho *et al*. 2005). Hypothesized causes for this pattern include the persistence of slightly deleterious mutations, sequence damage, calibration error or substitutional saturation (Ho *et al*. 2005, 2007; Woodhams 2006). Although the relative contributions of these factors are difficult to quantify, it is clear that the time dependency of rate estimates represents a considerable problem for molecular studies of recent evolutionary events.

Recently, the ability of Bayesian phylogenetic analysis to infer substitution rates from dated aDNA sequences was questioned (Emerson 2007). This criticism was based on anomalous results obtained from a sequence analysis of four dated Neanderthals and four modern humans, in which Emerson (2007) estimated the substitution rate to be 1.98 substitutions per site Myr^{−1}. A subsequent re-analysis was unable to replicate these findings (Ho *et al*. 2007). Validation of various aspects of the method has been performed previously (Drummond *et al*. 2002, 2006), but not in a simulation study using parameter settings that are biologically relevant to aDNA data. Comprehensive testing of the method is desirable due to its pivotal role in many aDNA studies (e.g. Lambert *et al*. 2002; Shapiro *et al*. 2004).

In order to assess the accuracy and precision of the substitution rates estimated using Bayesian analysis, we present a study based on datasets generated by simulation under known conditions. The robustness of the estimates to the assumed demographic model is also assessed. We then estimate intraspecific substitution rates from a collection of nearly 1500 sequences obtained from a diverse range of animals, plants and bacteria. Demographic model selection is performed using Bayes factors.

## 2. Material and methods

### (a) Simulations

Simulations of sequence evolution were performed on 4000 random coalescent trees with 31 dated tips. For the first 2000 trees, the ages of the tips were 0, 1000, 2000, …, 30 000 years, reflecting a population that has been uniformly sampled over a period of time (e.g. Shapiro *et al*. 2004). For the second 2000 trees, there was one modern sequence (age 0) and 10 sequences each of ages 10 000, 20 000 and 30 000 years, reflecting samples from a limited number of sites or layers (e.g. Coolen & Overmann 2007). These two sampling regimes are hereafter referred to as ‘uniform’ and ‘layered’, respectively; examples of simulation trees from these two regimes are given in figure 1.

For each of the two sets of 2000 trees, the first 1000 were generated assuming a constant population size of 10^{5} and the second 1000 assuming an exponentially growing population, with a growth rate of 10^{−5} per year and a final population size of 10^{5}. Simulations were performed on all 4000 trees using Seq-Gen v. 1.3.2 (Rambaut & Grassly 1997), using the Jukes–Cantor model of nucleotide substitution with a rate of 5×10^{−7} substitutions per site yr^{−1}, to produce alignments of length 1000 bp.

Substitution rates and root ages were estimated from these alignments using the Bayesian phylogenetic software Beast v. 1.4.3 (Drummond & Rambaut 2006), with sequence ages used as calibrations. For each alignment, analyses were performed using three demographic models: (i) constant population size, (ii) exponential growth and (iii) the Bayesian skyline plot with five groups (Drummond *et al*. 2005). Posterior distributions of genealogies and parameters were obtained by Markov chain Monte Carlo (MCMC) sampling, with samples from the posterior drawn every 5000 steps over a total of 5 000 000 steps.

### (b) Ancient and modern DNA

Nucleotide sequences were obtained from published sources for 19 species, with the total dataset comprising 749 ancient and 727 modern sequences (table 1; see also electronic supplementary material). Most sequences were from the mitochondrial D-loop. All of the ancient sequences were of known age. Following manual alignment, several population genetic statistics were computed using DnaSP v. 10.4.9 (Rozas *et al*. 2003) for the modern sequences. First, the number of segregating sites (*S*) and the average number of pairwise nucleotide differences (*k*; Tajima 1983) were calculated. High *S*/*k* ratios (‘expansion coefficient’; Peck & Congdon 2004) imply an increase in population size over time, while lower ratios should be indicative of a relatively constant long-term population size. We also used a test statistic, *R*_{2}, that draws information from the polymorphism frequency. *R*_{2} contrasts the number of singletons and the mean number of differences; it approaches low positive values in the case of a recent population growth event (Ramos-Onsins & Rozas 2002). Several other statistics were calculated and are presented in the electronic supplementary material.

Nucleotide substitution models were selected for each dataset by comparison of Akaike's information criterion scores. Substitution rates were estimated from each of the 19 alignments using Beast. For each alignment, analyses were performed using the three different demographic models: (i) constant population size, (ii) exponential growth and (iii) the Bayesian skyline plot with five groups. Samples from the posterior were drawn every 20 000 MCMC steps over a total of 20 000 000 steps. Acceptable mixing and convergence to the stationary distribution were checked by inspection of posterior samples. Demographic model selection was performed by Bayes factors calculated using the harmonic means of sampled marginal likelihoods and support was assessed following Jeffreys (1961).

## 3. Results

### (a) Simulations

The simulation study revealed that estimates of substitution rates made using Bayesian analysis were robust to the assumed demographic model (table 2). Analyses using the exponential growth model and Bayesian skyline plot yielded slightly higher rate estimates than those assuming a constant population size, but mean rate estimates were strikingly consistent across different demographic models. The estimation accuracy, defined here as the percentage of datasets for which the true rate (5×10^{−7} substitutions per site yr^{−1}) was contained in the 95% highest posterior density (HPD) of the estimated rate, remained at approximately 95% regardless of the demographic model assumed during the analysis.

The precision of the estimates, defined here as the average size of the 95% HPD on the estimated rate, was also remarkably consistent across the different conditions. Results were similar between the uniform and layered sampling regimes, although the rates estimated from the former were slightly less accurate but more precise. This final result differs from that obtained in a similar study of viral sampling regimes (Seo *et al*. 2002), but this could be the result of differences in simulation conditions and the analytical method. Nevertheless, the performance of the Bayesian method using layered sampling data is reassuring, because layered sampling is more economical and sometimes unavoidable in practice.

### (b) DNA data

Two of the nine modern DNA datasets yielded significant values for *R*_{2} (table 1; see also electronic supplementary material), reflecting a departure from the null model of constant population size and presenting evidence of population expansion. There were some disagreements with the demographic priors selected using Bayes factors (table 1), but this could be due to the exclusive use of modern sequences for calculating the population genetic statistics.

Estimated substitution rates from animal D-loop datasets were generally high (figure 2), with mean estimates ranging from 1.3×10^{−7} (95% HPD: 1.3×10^{−8}–2.6×10^{−7}) in cave bear to 2.9×10^{−6} (95% HPD: 1.6×10^{−6}–4.4×10^{−6}) in boar. The rate estimated from *Chlorobium* was lower than all other rates by two orders of magnitude, suggesting that either the species evolves extremely slowly or the ancient sequences are spurious.

## 4. Discussion

The simulation study validates the Bayesian phylogenetic estimation of substitution rates from heterochronous data using Beast. It also demonstrates that estimates of substitution rates are relatively robust to the assumed demographic model, at least for biologically realistic simulation conditions. It is probable that estimates made from alignments with lower information content will be less robust to the demographic prior, although the extent of this influence requires further investigation using a wider set of simulation parameters.

The analyses of aDNA data have yielded a collection of substitution rate estimates that are higher than those obtained on phylogenetic time scales. In some cases, they confirm previous estimates made using similar methods (e.g. Shapiro *et al*. 2004).

Our analyses demonstrate that elevated rates estimated from aDNA are not artefacts of the Bayesian estimation method, as recently claimed (Emerson 2007). With developments in sequencing technology and the growing availability of data, such methods will become increasingly important in extracting the molecular evolutionary information contained in ancient sequences.

## Acknowledgments

We thank Rob DeSalle for comments on the paper, Andrew Rambaut and Alexei Drummond for their advice, and Ross Barnett for providing unpublished data. S.Y.W.H. was supported by the Leverhulme Trust. S.O.K. was supported by a Conservation Genetics Research Fellowship from the AMNH Sackler Institute for Comparative Genomics and the Cullman Program in Molecular Systematics.

## Footnotes

- Received July 16, 2007.
- Accepted August 14, 2007.

- © 2007 The Royal Society