Host species switches by bacterial pathogens leading to new endemic infections are important evolutionary events that are difficult to reconstruct over the long term. We investigated the host switching of Staphylococcus aureus over a long evolutionary timeframe by developing Bayesian phylogenetic methods to account for uncertainty about past host associations and using estimates of evolutionary rates from serially sampled whole-genome data. Results suggest multiple jumps back and forth between humans and bovids, with the first switch from humans to bovids taking place around 5500 BP, coinciding with the expansion of cattle domestication throughout the Old World. The first switch to poultry is estimated at around 275 BP, long after domestication but still preceding large-scale commercial farming. These results are consistent with a central role for anthropogenic change in the emergence of new endemic diseases.
Staphylococcus aureus is a leading cause of hospital- and community-associated human infections. It has also adapted to other hosts, including wild birds and livestock, causing substantial losses to the dairy and poultry industries [1,2]. Genetic analyses have demonstrated that livestock-associated strains evolved adaptively following human-to-animal host jumps, leading to endemic clones that are largely host-restricted [2–4]. Some of these strains may also be a source of zoonotic infection in humans. A better understanding of these host jumps, and their ecological context, is necessary in order to design ways to prevent the emergence of new pathogens.
Here, we examine the long-term evolutionary history of S. aureus, using a panel of strains that represents the breadth of the species’ diversity. To deal with the challenges of these data, we introduce two methodological innovations. First, to add a temporal scale to our phylogeny, we estimate rates of substitution from whole-genome sequences of S. aureus with known dates of isolation in the recent past [6–8]. We use these estimates to constrain rates at sites least likely to be affected by weak purifying selection (which can decrease rates over the longer term), while also allowing for rate variation across the tree. Second, we adapt a method from Bayesian phylogeography to infer uncertain associations with ancestral hosts. Our method thus estimates topology, substitution rates, divergence dates, ancestral host states and number of host switches in a single joint analysis.
2. Material and methods
To represent the global diversity of S. aureus, we selected 123 multilocus sequence typing genotypes (electronic supplementary material, table S1). Human strains comprised nasal carriage, common epidemic hospital and community-associated strains, as well as four divergent sequence types (STs) of clonal complex (CC) 75. We also included the major clones specific to bovid hosts (cattle, sheep and goats) and avian hosts (poultry, reared and wild birds). Despite no formal evidence of recombination (electronic supplementary material, methods), we conservatively excluded two genes with unusual patterns of conservation (electronic supplementary material, figure S1), leaving an alignment of 2265 bp from five genes. Although our data represent the global diversity of S. aureus, standard tests do not provide evidence of sequence saturation (electronic supplementary material, methods).
Bayesian phylogenetic analyses were performed in BEAST v. 1.7.1 using BEAGLE. We used the HKY85 + Γ substitution model and the uncorrelated lognormal model of changes in substitution rate, and partitioned our alignment by codon position.
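As an illustration of the codon-position partitioning described above, the following is a minimal sketch and not the pipeline used in this study. It assumes each gene fragment is in frame, so that every third base starting from a known offset is a third codon position. The function name, the `offset` parameter and the toy sequences are hypothetical.

```python
def partition_by_codon_position(sequence, offset=0):
    """Split an in-frame coding sequence into its three codon positions.

    `offset` is the index of the first base of the first complete codon
    (hypothetical parameter; 0 assumes the sequence starts in frame).
    """
    coding = sequence[offset:]
    # Slicing with a step of 3 picks out every base at a given codon position.
    return coding[0::3], coding[1::3], coding[2::3]


# Toy alignment (made-up sequences, not taken from the study).
toy_alignment = {
    "strain_A": "ATGGCCAAA",
    "strain_B": "ATGGCTAAG",
}

# Keep only third codon positions, the partition to which a rate prior
# could then be applied.
third_positions = {
    name: partition_by_codon_position(seq)[2]
    for name, seq in toy_alignment.items()
}
print(third_positions)
```

For a concatenated multi-gene alignment, the same function would need to be applied per gene, because the reading frame generally restarts at each gene boundary.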
To infer a temporal scale, we estimated the rate of nucleotide substitution using published whole-genome data from ST239, sampled over a 20-year period. A rate estimate obtained from these data was used to inform a prior applying only to third codon positions. Such sites are largely synonymous, and so less affected by weak purifying selection that is effective over longer timescales (see electronic supplementary material, methods). To reconstruct host switching, we applied a phylogeographical model, replacing geographical locations with host types (electronic supplementary material, methods). Each global strain was classified as human, avian or bovid, and hosts of ancestral strains were inferred jointly with the other parameters. We did not distinguish between strains isolated from the bovid genera Bos, Ovis and Capra because they group closely within the tree and some strains are found in all three genera. For any given tree in the Markov chain Monte Carlo (MCMC) sample, the minimum date of the earliest jump from human to livestock corresponds to the oldest node with a livestock host state.
For our data, prior information strongly suggests that the ancestral host state was human: genes encoding proteins involved in human colonization or pathogenesis are pseudogenized in animal strains [3,4], and related outgroup strains are present in indigenous human communities and New World monkeys (Staphylococcus simiae). Accordingly, we extended published methods to allow us to constrain the root state in our analyses (electronic supplementary material, methods).
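The per-tree summaries described above can be sketched as follows. This is a simplified illustration, not the BEAST implementation: it assumes that each tree in the MCMC sample has already been annotated with a host state and an age at every node. The `Node` class, the function names and the toy tree are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    age: float              # years before present
    host: str               # "human", "bovid" or "avian"
    parent: Optional[str]   # name of the parent node; None for the root


def count_switches(nodes, source, target):
    """Count branches whose parent has host `source` and child has host `target`."""
    by_name = {n.name: n for n in nodes}
    return sum(
        1
        for n in nodes
        if n.parent is not None
        and by_name[n.parent].host == source
        and n.host == target
    )


def minimum_age_of_first_jump(nodes, target):
    """Age of the oldest node with host `target`, or None if there is none."""
    ages = [n.age for n in nodes if n.host == target]
    return max(ages) if ages else None


# Toy annotated tree (made-up ages and hosts, for illustration only).
toy_tree = [
    Node("root", 1000.0, "human", None),
    Node("A", 600.0, "human", "root"),
    Node("B", 400.0, "bovid", "A"),
    Node("C", 0.0, "bovid", "B"),
    Node("D", 0.0, "bovid", "B"),
    Node("E", 0.0, "human", "A"),
    Node("F", 200.0, "bovid", "root"),
    Node("G", 0.0, "bovid", "F"),
    Node("H", 0.0, "human", "F"),
]

print(count_switches(toy_tree, "human", "bovid"))    # human-to-bovid jumps
print(count_switches(toy_tree, "bovid", "human"))    # back-jumps into humans
print(minimum_age_of_first_jump(toy_tree, "bovid"))  # oldest bovid node
```

Applying these functions to every tree in the posterior sample would give one value per tree, and the resulting distribution carries the uncertainty in topology, dates and ancestral hosts.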
3. Results and discussion
Figure 1 shows the maximum clade credibility tree of our global sample of S. aureus. The four strains from CC75 form a well-supported outgroup to the other strains, with an estimated root age of 69 738 BP (24 678–142 433).
For the livestock-associated strains, figure 1 indicates five human-to-bovid jumps (table 1), which agrees well with estimates from the complete posterior sample (median 5.12 jumps; 95% Bayesian credible interval (BCI): 4.98–7.08), and from reconstructions using alternative methods implemented in Mesquite (electronic supplementary material, methods). Each of the groups contains Bos, Ovis and Capra hosts with the exception of CC97 and ST126/694 identified only in cattle (Bos). The bounds on the ages of the jumps range from 0 to 9000 BP, but all occurred post-domestication [17,18], suggesting intimate contact between humans and animals as a principal driver of transmission and subsequent spread of S. aureus in domesticated animals. Incorporating uncertainty in ancient host associations, and the relative ages of the different jumps, we estimate the first transmission of S. aureus from human to bovids at 5512 years ago (BCI: 3656–9007), which corresponds well to the period of expansion of agriculture throughout the Old World, the Neolithic revolution (figure 2a). The influence of domestication on human diseases in the agricultural age is well established, but this is the first dating study to imply its role in the emergence of animal diseases.
We also estimate that two bovid strains have subsequently jumped back into humans (median 2.06; BCI: 1.98–2.16). The first putative back-jump, ST25, presumably switched host very recently; ST25 is unique in lacking mecA-mediated resistance to methicillin but still exhibiting borderline oxacillin resistance. It has been suggested that the administration of antibiotics during non-lactating periods as a prophylaxis for mastitis in cattle could have selected for this borderline resistance phenotype. The other bovid-to-human jump, involving ST59, occurred around 500 BP (table 1). The ST59 clone is a major epidemic clone in Southeast Asia and many strains demonstrate mecA-mediated resistance. However, the date of the host switch predates the introduction of methicillin and so the resistance phenotype cannot be due to farm management practices. These instances demonstrate the benefit of understanding historical host associations.
In contrast to bovids, the two strain types from poultry cluster together with approximately 100 per cent probability (figure 1), and so our data indicate a single human-to-avian jump that led to the establishment of an endemic clone (median 1.03; BCI: 0.97–1.21). We estimate that the first transmission occurred around 274 years ago (BCI: 75–785), well after the domestication of the modern-day chicken (about 8000 years ago; figure 2b) [21,25]. The recent age for this host jump is unexpected given the wide range of domesticated and wild avian hosts that are infected with these strain types. This may imply more stringent host species-dependent barriers for transmission from humans to birds than to cattle (although our data preclude examination of very recent host jumps that have not led to endemic strains, such as the jump of an ST5 strain around 40 years ago). With current data, we cannot predict if the initial host was a wild or domesticated bird, but we speculate that the higher frequency of interactions between humans and domesticated birds may have provided the opportunities for the initial host jump. During the period of the inferred jump, poultry was still farmed on smallholdings, but the poultry industry started to expand considerably about 140 years ago [21,22], presumably contributing to the dissemination of S. aureus among flocks.
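Posterior summaries of the kind reported above (a median with a 95% interval) can be computed from per-sample values as in the following minimal sketch. It uses an equal-tailed interval, which is an assumption: the text does not state whether the reported BCIs are equal-tailed or highest posterior density intervals. The function names and the toy sample are hypothetical.

```python
import statistics


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def summarize_posterior(samples, mass=0.95):
    """Return (median, lower, upper) for an equal-tailed credible interval."""
    ordered = sorted(samples)
    tail = (1 - mass) / 2
    return (
        statistics.median(ordered),
        quantile(ordered, tail),
        quantile(ordered, 1 - tail),
    )


# Toy posterior sample (made-up values, e.g. one jump date per sampled tree).
toy_samples = list(range(101))
median, lower, upper = summarize_posterior(toy_samples)
print(median, lower, upper)
```

In practice, samples from the burn-in phase of the chain would be discarded before computing such summaries.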
We have employed novel Bayesian methods to investigate the frequency and timing of S. aureus host switching events that led to the emergence of global livestock pathogens. A correlation of the first human-to-bovid host jump with the spread of domestication in the Old World is consistent with increased opportunities for human-to-animal transmission. However, it is currently unclear whether our results are general, i.e. whether other pathogens associated with both humans and bovids (e.g. Streptococcus agalactiae) underwent host jumps around the time of the spread of domestication. This is due to the absence, for most bacterial pathogens, of datasets that yield reliable molecular timescales. However, with the increased use of next-generation sequencing of serially sampled and ancient data [7,26], this is certain to change in the near future, and our approach could be used to infer the ecological context of host shifts in a comparative framework.
We thank Michael Stanhope for S. simiae sequences, Francois Balloux and Andrew Leigh Brown for support, two anonymous reviewers for comments and the MRC and BBSRC (L.W. and J.R.F.), the NSF and NIH (M.A.S.) and the NESCent (M.A.S., A.R. and P.L.) for funding. Initial research received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 278433 and ERC grant agreement no. 260864.
- Received April 2, 2012.
- Accepted April 30, 2012.
- This journal is © 2012 The Royal Society