Resolving the phylogeny of lizards and snakes (Squamata) with extensive sampling of genes and species

John J. Wiens, Carl R. Hutter, Daniel G. Mulcahy, Brice P. Noonan, Ted M. Townsend, Jack W. Sites, Tod W. Reeder

Abstract

Squamate reptiles (lizards and snakes) are one of the most diverse groups of terrestrial vertebrates. Recent molecular analyses have suggested a very different squamate phylogeny relative to morphological hypotheses, but many aspects remain uncertain from molecular data. Here, we analyse higher-level squamate phylogeny with a molecular dataset of unprecedented size, including 161 squamate species for up to 44 nuclear genes each (33 717 base pairs), using both concatenated and species-tree methods for the first time. Our results strongly resolve most squamate relationships and reveal some surprising results. In contrast to most other recent studies, we find that dibamids and gekkotans are together the sister group to all other squamates. Remarkably, we find that the distinctive scolecophidians (blind snakes) are paraphyletic with respect to other snakes, suggesting that snakes were primitively burrowers and subsequently re-invaded surface habitats. Finally, we find that some clades remain poorly supported, despite our extensive data. Our analyses show that weakly supported clades are associated with relatively short branches for which individual genes often show conflicting relationships. These latter results have important implications for all studies that attempt to resolve phylogenies with large-scale phylogenomic datasets.

1. Introduction

Squamate reptiles are one of the most diverse and well-known vertebrate groups, with approximately 9000 species among 61 families [1]. Squamates offer outstanding model systems in ecology and evolution, especially for studying origins of asexuality, viviparity, body form and venom [1]. Some squamates are also an important cause of human mortality, with tens of thousands of snakebite deaths every year [2].

Understanding these diverse aspects of squamate biology requires a well-resolved phylogeny. Recent molecular analyses have suggested a phylogeny that differs dramatically from morphological hypotheses, especially in placing iguanians with snakes and anguimorphans [3–5]. Although these molecular studies are generally concordant, several issues remain unclear [3–9], including: (i) the sister group to other squamates, (ii) the sister group to snakes, (iii) interrelationships of major snake clades and (iv) relationships of iguanian families. Many of these questions have resisted resolution even with datasets of 20 genes or more [5,7,9].

Here, we analyse squamate phylogeny using extensive sampling of taxa (161) and characters (44 loci), the largest dataset yet assembled. We also present the first analysis of higher squamate relationships using species-tree methods [10,11]. We generate a strongly supported hypothesis and reveal some surprising results. However, some branches remain weakly supported, and our analyses shed light on this unexpected pattern.

2. Material and methods

We sampled 161 squamate species and 10 outgroup taxa, including mammals (Homo, Mus, Tachyglossus), crocodilians (Alligator, Crocodylus), birds (Dromaius, Gallus), turtles (Chelydra, Podocnemis) and a rhynchocephalian (Sphenodon). We included all extant squamate families (excepting a few recently recognized groups, such as Cadeidae, Blanidae, Phyllodactylidae and Xenophiidae [1]), with two or more representatives from most families. We sequenced portions of 44 nuclear genes (exons approx. 500–1500 base pairs in length) carefully selected based on comparisons of vertebrate genomes [12], targeting single-copy genes evolving at appropriate rates. Standard methods of DNA extraction, amplification and sequencing were used. Nucleotide sequences were translated to amino acids to aid alignment, and alignment was straightforward. The total alignment consisted of 33 717 base pairs (available through the Dryad data depository (doi:10.5061/dryad.g1gd8)). Voucher and GenBank numbers and the names, lengths and sampling of genes are provided in the electronic supplementary material, appendices S1–S3.

On average, each gene had data for 143 species (84% complete). Simulations and empirical analyses suggest that missing data need not be problematic for concatenated phylogenetic analyses, particularly when sampling many characters [13].
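The completeness figure can be checked with simple arithmetic. The sketch below assumes the percentage is taken over all 171 sampled taxa (161 squamates plus 10 outgroups); the text does not state the denominator, so this is an inference, and the function name is purely illustrative.

```python
def percent_complete(mean_taxa_per_gene, total_taxa):
    """Mean taxon coverage per gene, as a percentage of all sampled taxa."""
    return 100.0 * mean_taxa_per_gene / total_taxa

# Values from the text; the denominator (ingroup + outgroup) is an assumption.
squamates = 161
outgroups = 10
total_taxa = squamates + outgroups

pct = round(percent_complete(143, total_taxa))
print(f"{pct}% complete")
```

Under that assumption the result rounds to the reported 84%; using only the 161 squamates as the denominator would give a higher value.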

Two general approaches to data analysis were used (concatenated and species tree). First, we performed a concatenated analysis of all taxa using likelihood (RAxML, v. 7.2.0; [14]), using 1000 bootstrap replicates integrated with 200 searches for the optimal tree. We used the GTR + Γ model and partitioned the data by genes and codon positions (see the electronic supplementary material, appendix S4). We also performed a Bayesian concatenated analysis using MrBayes v. 3.1.2 [15] (see the electronic supplementary material, appendix S4), with branch support based on posterior probabilities (Pp).
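As an illustration of what partitioning by genes and codon positions involves, the following minimal sketch writes partition definitions in the RAxML partition-file syntax (`DNA, name = start-end\3`). The gene names and lengths are hypothetical, and the sketch assumes every gene starts at codon position 1 and has a length divisible by three. The scheme actually used is described in appendix S4 and is not reproduced here.

```python
def codon_partitions(genes):
    """Build RAxML-style partition lines, one per gene and codon position.

    `genes` is a list of (name, length_in_bp) tuples in alignment order.
    Assumes each gene begins at codon position 1 (an illustrative simplification).
    """
    lines = []
    start = 1
    for name, length in genes:
        end = start + length - 1
        for pos in (1, 2, 3):
            # Every third site, beginning at the offset for this codon position.
            first_site = start + pos - 1
            lines.append(f"DNA, {name}_pos{pos} = {first_site}-{end}\\3")
        start = end + 1
    return lines

# Hypothetical genes, not the loci used in the study.
example = codon_partitions([("geneA", 600), ("geneB", 900)])
print("\n".join(example))
```

With 44 genes this layout would yield 132 candidate partitions; whether partitions are merged afterwards is a separate modelling decision that this sketch does not address.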

Second, we performed Bayesian species-tree analyses (using *BEAST v. 1.6.2 [11]). We used 31 species (29 squamates, two outgroups, Sphenodon and Gallus) for computational feasibility. The selected species represented all major squamate clades, and had relatively complete sampling of genes (mean = 41.5 genes), given that the impact of missing data on species-tree methods remains poorly known. Details of methods are provided in the electronic supplementary material, appendix S4.

To address why some clades are strongly versus weakly supported, we used 49 interfamilial clades from the concatenated-data likelihood tree and analysed relationships between bootstrap support (bs), branch lengths and congruence among genes [7]. Branch lengths from the concatenated-data tree were used, and these lengths were strongly correlated with mean lengths (for comparable clades) from separately analysed genes (Rho = 0.840; p = 0.0001; see the electronic supplementary material, appendix S5). We evaluated the proportion of separately analysed genes supporting each node in the concatenated tree, and the bootstrap support for supporting and conflicting clades. Relationships were tested using non-parametric Spearman's rank correlation (data in the electronic supplementary material, appendix S5).
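The rank correlation described above can be sketched in a few lines. This is a minimal pure-Python version of Spearman's rho (Pearson correlation of average ranks, with ties sharing their mean rank). The branch lengths and bootstrap values are invented for illustration and are not taken from appendix S5, and the sketch does not compute a p-value.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up example: five clades with branch lengths and bootstrap support (%).
branch_lengths = [0.002, 0.010, 0.004, 0.030, 0.015]
bootstrap = [55, 100, 62, 100, 98]
rho = spearman_rho(branch_lengths, bootstrap)
print(f"rho = {rho:.3f}")
```

A rank-based measure suits this comparison because bootstrap values are bounded at 100% and often tied, so a linear relationship with branch length is not expected.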

3. Results

Analyses of the concatenated data using likelihood (figure 1) and Bayesian (see the electronic supplementary material, figure S1) methods yield similar phylogenies and support values and provide strong support for most higher-level squamate relationships. Many aspects of the tree are consistent with other recent molecular analyses [e.g. 3–5], such as the clade of snakes, anguimorphs and iguanians (Toxicofera).

Figure 1.

Phylogeny of squamate reptiles from concatenated likelihood analysis of 44 nuclear genes (see the electronic supplementary material, figure S1 for Bayesian tree). Uncircled numbers at nodes indicate bootstrap values >50% (branches too short to depict here have clades indicated with an open circle); circled numbers correspond to clades in electronic supplementary material, appendix S5. Branch lengths are estimated by likelihood (length for root arbitrarily shortened to facilitate showing ingroup branch lengths).

However, we also find some surprising relationships (figure 1; electronic supplementary material, figure S1). First, we find that dibamids and gekkotans are together the sister group to other squamates (bs = 76%; Pp = 1.00), whereas previous studies have generally placed either dibamids or gekkotans as sister to other squamates [4,6,16], but not both (but see [5]). Second, we find strong support for non-monophyly of Scolecophidia, the blind snakes (Anomalepididae, Leptotyphlopidae and Typhlopidae). Specifically, leptotyphlopids + typhlopids are strongly supported as sister to all other snakes, whereas anomalepidids are sister to all non-scolecophidians (bs = 100%; Pp = 1.00). We also find a strongly supported clade (bs = 97%; Pp = 1.00) within pleurodont iguanians that is inconsistent with relationships from 29 nuclear loci [9]. This clade includes oplurids, leiosaurids, polychrotids, dactyloids, liolaemids and phrynosomatids, whereas the earlier study found phrynosomatids as sister to other pleurodonts (bs = 88%; Pp = 1.00).

Unexpectedly, some aspects of squamate phylogeny still remain weakly supported. These include placement of uropeltids among snakes and relationships among many pleurodont iguanian families (figure 1; see the electronic supplementary material, figure S1). We find a strong relationship between bootstrap support and branch lengths, and between congruence and branch lengths, such that shorter branches tend to be weakly supported and have greater incongruence among genes (table 1).

Table 1.

Relationships between likelihood branch lengths, support and congruence for 49 clades of squamate reptiles.

The phylogenetic estimate from the species-tree analysis (see the electronic supplementary material, figure S2) is generally similar to the concatenated-data trees (but with some differences), and strongly supports some of the more surprising relationships. Specifically, there is very strong support for placing dibamids and gekkotans as sister taxa, and for non-monophyly of scolecophidians (but with a different arrangement of taxa). Interestingly, this analysis strongly supports iguanians and anguimorphs as sister taxa, a particularly controversial aspect of squamate phylogeny. This clade is also supported by the concatenated analyses (figure 1; see the electronic supplementary material, figure S1), in which it is sister to snakes.

4. Discussion

We present the most extensive analysis of higher squamate phylogeny to date (in terms of characters and taxa), and the first to apply species-tree methods to these relationships. Our results support some aspects of previous hypotheses, but also show some surprising findings. First, both approaches support dibamids and gekkotans as the sister group to all other squamates, in contrast to most previous studies. Second, we find strong support for paraphyly of scolecophidian snakes. This result has appeared in some previous studies [e.g. 7], and we are unaware of molecular studies that have both included all three families and strongly supported scolecophidian monophyly.

Paraphyly of scolecophidians is surprising in that these families share many morphological and ecological traits, including highly reduced eyes [17]. All scolecophidians are specialized burrowers. Thus, paraphyly of scolecophidians at the base of snake phylogeny suggests that snakes may have been burrowers ancestrally, and that most snake species evolved from an ancestor that subsequently returned to surface dwelling. This hypothesis is also supported by the morphology of snakes relative to other limb-reduced lizards: for example, snakes have short tails and elongate trunks (as do burrowing snake-like lizards), whereas surface-dwelling snake-like lizards have elongate tails relative to the trunk [16]. However, some scolecophidian features may have evolved convergently in the two clades, rather than being ancestral for snakes.

Our results also show that some aspects of squamate phylogeny remain weakly supported, especially relationships among some snake and iguanian families. Many of these relationships were also weakly supported in analyses of 20–29 loci [5,7,9]. Thus, roughly doubling the number of loci fails to lead to strong support. In contrast, many relationships supported here were also found in analyses of only one or two nuclear loci [3].

Our analyses of branch lengths, support and congruence suggest that these patterns are related to branch lengths [7]. Specifically, we find weaker support and greater incongruence among genes (and more strongly supported conflicts) on shorter branches, suggesting that these nodes may continue to be problematic as more loci are added. This incongruence seems to arise from incomplete lineage sorting on short branches [7] and may plague many other phylogenomic studies [18]. Importantly, many shorter branches are strongly resolved by the coalescent-based species-tree approach, which incorporates incomplete lineage sorting [10,11]. Our results highlight the potential value of this general approach for resolving short branches with phylogenomic data, although further advances may be needed to make these methods practical for large datasets with incomplete sampling of genes.

Acknowledgements

This project was supported by a US National Science Foundation-AToL grant, with awards to T.W.R. (EF 0334967), J.W.S. (EF 0334966) and J.J.W. (EF 0334923). We thank the many institutions and individuals who provided tissue samples (see the electronic supplementary material, appendix S1), Jyoti Aggrawal and Arianna Klein for assistance with congruence analyses, and two anonymous reviewers for helpful comments on the manuscript.

  • Received July 27, 2012.
  • Accepted August 29, 2012.

References
