Hippidions were equids with very distinctive anatomical features. They lived in South America 2.5 million years ago (Ma) until their extinction approximately 10 000 years ago. The evolutionary origin of the three known Hippidion morphospecies is still disputed. Based on palaeontological data, Hippidion could have diverged from the lineage leading to modern equids before 10 Ma. In contrast, a much later divergence date, with Hippidion nesting within modern equids, was indicated by partial ancient mitochondrial DNA sequences. Here, we characterized eight Hippidion complete mitochondrial genomes at 3.4–386.3-fold coverage using target-enrichment capture and next-generation sequencing. Our dataset reveals that the two morphospecies sequenced (H. saldiasi and H. principale) formed a monophyletic clade, basal to extant and extinct Equus lineages. This contrasts with previous genetic analyses and supports Hippidion as a distinct genus, in agreement with palaeontological models. We date the Hippidion split from Equus at 5.6–6.5 Ma, suggesting an early divergence in North America prior to the colonization of South America, after the formation of the Panamanian Isthmus 3.5 Ma and the Great American Biotic Interchange.
All contemporary equids belong to a single genus, Equus . Their most recent common ancestor (MRCA) lived in North America approximately 4.0–4.5 million years ago (Ma) , before dispersing into other continents and radiating into a wide range of extant and now-extinct forms. The ancestors of South American equids are believed to have left North America through intercontinental migrations after the formation of the Panamanian Isthmus 3.0–3.7 Ma [3,4], which promoted the Great American Biotic Interchange. No American equid survived the megafaunal extinction approximately ten thousand years ago (kya) .
Two main equine lineages are found in the South American Pleistocene fossil record . The first defines a subgenus of its own, Equus (Amerhippus; but see ), which appeared approximately 1 Ma and showed horse-like anatomical features and genetic affinities . Hippidiforms represent the second lineage, which appeared approximately 2.5 Ma and includes two genetically similar generalists (Hippidion principale, H. saldiasi)  and one genetically distinct high-altitude specialist (H. devillei) [7–9].
Like Amerhippus, hippidiforms were stocky, but displayed very long nasal bones and extremely deep nasoincisival notches . Palaeontological data support affinities with the northern American Pliohippus [3,8,10], which lived 6–14.5 Ma. This considerably predates the time when hippidiforms first entered South America , implying an early divergence from the Equus lineage. However, recent molecular studies based on short mitochondrial sequences from 20 fossil specimens have suggested clustering of hippidiforms within Equus [6,7,11,12], or possible affinities with Dinohippus , a North American lineage that lived 5–8 Ma . Therefore, the evolutionary origin of Hippidion remains unclear.
Mitochondrial genomes (mitogenomes) have recently resolved the evolutionary tree underlying the main radiation of equids , which was confirmed by whole-exomes . This established mitogenomes as reliable phylogenetic markers for equids. We therefore used target-enrichment capture and next-generation sequencing to characterize mitogenomes of Hippidion and resolve its evolutionary origin.
We analysed nine Hippidion samples excavated in northern Chile, southern Chile and Argentina (table 1 and the electronic supplementary material, table S1). The specimens represent two morphospecies from the Late Pleistocene (H. principale and H. saldiasi) with six samples radiocarbon-dated to 12.6–17.4 calibrated kya.
(b) Molecular analyses
We drilled samples (94–485 mg powder), built DNA libraries and set up PCR amplification in ancient DNA facilities following published methods (electronic supplementary material, table S1). We first shotgun-sequenced one/two amplified libraries per sample on Illumina HiSeq 2000 platforms, except for libraries from EQB and EQF, which were directly processed for target-enrichment capture. As sequencing revealed a limited number of mitochondrial sequences, we performed target-enrichment capture for equine mitogenomes for all but JW251 and OH samples (electronic supplementary material, table S2). We followed  and built probes from mitochondrial PCR amplicons from living equids (electronic supplementary material). Enriched libraries were amplified for an additional 12 cycles post-capture before being quantified on an Agilent 2100 Bioanalyser, pooled and sequenced on an Illumina HiSeq 2000 platform. DNA contamination was monitored using mock controls during DNA extraction, library construction, PCR amplification and target enrichment. Contamination levels were estimated from sequence reads using the maximum-likelihood (ML) method from Fu et al.  and a database of 245 equine mitogenomes (electronic supplementary material, table S3).
(c) Sequence analyses
DNA read processing (adapter trimming), mapping (read alignment, PCR duplicate removal, indel realignment) and damage analyses were performed using the PALEOMIX pipeline . Seeding was disabled for mapping, a relaxed edit distance (−n 0.03) was used, and reads showing mapping qualities less than 25 were excluded. To ensure that all equine-like reads could be identified in the absence of a mitochondrial reference for Hippidion, we first recovered, for each sample, high-quality hits mapping uniquely against at least one equine mitogenome and/or partial Hippidion sequence (electronic supplementary material, table S4). We then used a majority rule and a minimum depth-of-coverage of 2 to call a preliminary consensus sequence for each sample, which was used for read mapping with default parameters. A new and final consensus was then called using a majority rule, a minimum depth-of-coverage of 3 and filtering for base scores less than 30. Alignment statistics and DNA damage parameters were calculated based on a final mapping against each individual mitochondrial sequence.
(d) Phylogenetic analyses
We reconstructed the phylogenetic tree of equid complete mitogenomes using both ML (PhyML v. 3.0; ) and Bayesian algorithms (Beast v. 1.8.0; ). We constructed a 40-taxon mitochondrial dataset by aligning Hippidion mitochondrial sequences against equine orthologous sequences, adding the white and black rhinoceroses as outgroups (electronic supplementary material, table S5). As Beast allows the application of different models of substitution to partitions in datasets, we divided the alignment into six partitions (one per codon position, rRNA, tRNA, control region CR). For PhyML analyses, we merged these partitions into one alignment, as PhyML does not handle multiple partitions. The best substitution model for each partition/merged dataset (with and without outgroups) was determined with Modelgenerator v. 0.85  on the basis of its Bayesian information criterion, and applied to the corresponding partition/dataset in Beast (unlinked substitution models) and PhyML analyses (electronic supplementary material, tables S6–S7). Beast analyses were performed assuming a birth–death serially sampled tree model, and divergence dates were estimated using a log-uncorrelated molecular clock model together with tip-sampling (electronic supplementary material, table S8) and assuming a time for the MRCA at 4.0–4.5 Ma for Equus .
To further support tree topology, we performed approximate unbiased (AU) and Shimodeira–Hasegawa (SH) topological tests in CONSEL  using five trees differing in their Hippidion placement and site-wise-likelihood values determined by PhyML.
Finally, in order to compare our Hippidion CR sequences with those characterized previously, we ran ML analyses under a GTR + I + Γ8 model (GTR, general time reversible model; I, invariant sites; Γ8, 8-class gamma distribution; electronic supplementary material, table S9).
3. Results and discussion
We characterized eight Hippidion mitogenomes at 3.4–386.3-fold coverage (table 1 and electronic supplementary material, table S2). Compared with shotgun sequencing, target-enriched libraries showed a 3.0–147.8-fold increase (average 42.4-fold; electronic supplementary material, table S2) in the number of high-quality hits mapping uniquely against the Hippidion mitogenome. An additional sample (EQB) showed poor DNA preservation and resulted in limited mitogenome coverage (0.6-fold). Several lines of evidence support our data as authentic: (i) presence of typical nucleotide mis-incorporation and ancient DNA fragmentation patterns , i.e. increasing C → T (G → A) substitutions toward read starts (ends), preferential read starts (ends) at genomic positions following purines (pyrimidines; electronic supplementary material, figures S1–S9), (ii) 6.9–32.5-fold higher cytosine deamination rates at overhangs than in double-stranded DNA (electronic supplementary material, table S10), (iii) contamination levels less than or equal to 4.5% (per library) and 1.4% (per sample; electronic supplementary material, table S11), (iv) ML phylogenetic clustering with all H. saldiasi and H. principale CR sequences previously characterized (figure 1a). In addition, albeit only partially characterized owing to limited coverage, sample JW251 CR sequence clustered with the sequence previously published for this specimen .
ML analyses (figure 1b) and Bayesian phylogenetic inference rooted on rhinoceros outgroups revealed Hippidion as a monophyletic assemblage basal to all Equus. Similar Bayesian phylogenetic placement was found when excluding rhinoceros outgroups (data not shown). We furthermore used AU and SH topological tests to show that this topology provided a significantly better alternative than four other topologies where Hippidion was forced to nest within different Equus clades (p-values ≤ 0.001 and ≤ 0.044; electronic supplementary material, figure S10), following all possible alternative topologies highlighted in reference . Our analyses therefore contrast with previous results based on partial mitochondrial sequences [6,7,11,12] and support palaeontological models, which place Hippidion outside the range of variation of all extinct and extant Equus species. Additionally, we found that H. devillei clusters outside a paraphyletic assemblage consisting of H. principale and H. saldiasi, as shown in reference . Whether the latter should still be described as two distinct morphospecies remains to be determined with nuclear data.
We estimated the divergence between Hippidion and Equus at approximately 5.6–6.5 (confidence range = 4.6–7.6 Ma; table 2), before Hippidion entered South America following the formation of the Panamanian Isthmus 3.0–3.7 Ma . This is supported by fossils found in southern North America, which exhibit a more primitive nasal notch than South American hippidions, and could represent earlier hippidiforms .
4. Conclusion and perspectives
In this study, we sequenced for the first time complete mitogenomes of Hippidion saldiasi and H. principale. We reject previous claims that Hippidion is nested within Equus, and estimate an early split between Hippidion and Equus 5.6–6.5 Ma. Although the Equidae family flourished during the Miocene, all present-day equids originate from an MRCA that lived 4.0–4.5 Ma . With closest living relatives diverging 55 Ma, the early evolutionary stages of the genus Equus are difficult to reconstruct based on present-day genetic information alone. The survival of Hippidion until 10 kya and the possibility to sequence whole genomes from extinct species  offer a unique opportunity to access a closer phylogenetic outgroup for equids, and thereby reveal the genetic foundation of Equus.
Hippidion mitogenome sequences can be accessed from GenBank (KM881671–KM881679).
L.O. conceived and designed the study. C.D.S., J.T.V., A.S.O. and L.O. performed ancient DNA extraction, DNA library construction and amplification. C.D.S. performed target-enrichment capture experiments. M.S. and L.O. performed sequence analyses. D.E., J.W., M.T.A., F.M., P.M.L., J.L.P., A.P., C.J.D., T.W.S., E.W. and L.O. contributed samples, reagents and methods. All authors participated to the interpretation of the data. C.D.S., J.T.V. and L.O. wrote the article.
This research was supported by the Marie Curie Career-Integration Grant (293845), the Danish Council for Independent Research (Natural Sciences), the Danish National Research Foundation (DNRF94), the “Chaires d'Attractivité 2014” IDEX programme (University of Toulouse), and the Lundbeck Foundation (R52-A5062). F.M. was supported by the FONDECYT 1100822 and CD MAG0901 grants.
We have no competing interests.
We thank the Danish National High-throughput DNA Sequencing Centre; Tina Brand and Pernille Selmer-Olsen for laboratory assistance; Philip Johnson for sharing R-scripts estimating contamination levels.
- Received December 15, 2014.
- Accepted February 11, 2015.
- © 2015 The Author(s) Published by the Royal Society. All rights reserved.