Major analytical and conceptual shortcomings in a recent taxonomic revision of the Procellariiformes – a reply to Penhallurick and Wink (2004)

Frank E. Rheindt; Jeremy J. Austin

doi:10.1071/MU04039

COMMENT AND RESPONSE

Frank E. Rheindt ^A ^B ^C and Jeremy J. Austin ^A

^A Sciences Department, Museum Victoria, Carlton, Vic. 3053, Australia.

^B Department of Genetics, University of Melbourne, Parkville, Vic. 3010, Australia.

^C Corresponding author. Email: frheindt@museum.vic.gov.au

Emu 105(2) 181-186 https://doi.org/10.1071/MU04039
Submitted: 28 July 2004 Accepted: 3 March 2005 Published: 30 June 2005

Abstract

A recent taxonomic revision of Procellariiformes by Penhallurick and Wink (2004) based on cytochrome b sequence data contains analytical and conceptual flaws that compromise the validity of the taxonomic recommendations. We identify two major shortcomings in the work. First, we question the practice of basing taxonomic recommendations on tree clades that receive no statistical support, and also highlight inconsistencies in the tree-searching methods used. Second, we question Penhallurick and Wink’s claim to be following the multidimensional biological species concept, because they have put forward taxonomic proposals that violate this species concept (as well as the phylogenetic species concept). We discuss these analytical and conceptual shortcomings and make recommendations against the taxonomic rearrangements proposed by Penhallurick and Wink.

Introduction

In a recent paper, Penhallurick and Wink (2004) collated previously published cytochrome b sequences of almost 90 species of the order Procellariiformes. From these, they derived phylogenetic tree topologies through distance, parsimony and likelihood methods and compared these against the current taxonomy of the Procellariiformes. Additionally, they calculated genetic divergences for many of the pairwise comparisons within their dataset, and used these divergences to date splits between lineages and to make inferences about the species status of many seabirds.

Penhallurick and Wink (2004) affirm that they consider genetic characters superior to ‘traditional’ data, since ‘… molecular data have the great advantage that convergence does not impair an analysis to the same degree as morphological data do’. They point out that molecular data provide a rough time frame of the evolutionary processes under investigation through the use of a molecular clock. They emphasise that they embrace Mayr’s (1996) multidimensional biological species concept and point out theoretical shortcomings of other species concepts when applied to seabird systematics.

We find Penhallurick and Wink’s (2004) general approach valid. Theirs is the first broad-scale taxonomic review of an entire avian order that incorporates molecular data, apart from Sibley and Ahlquist’s (1990) monumental opus on DNA–DNA hybridisation. Penhallurick and Wink’s (2004) general assertion that DNA divergences can be used as a yardstick for inferences on the taxonomic status of allopatric taxa in the absence of good biological knowledge is sound and has been applied in avian systematics previously.

However, much of Penhallurick and Wink’s (2004) methodology is seriously flawed. Consequently, so are many of their taxonomic conclusions. Below, we list several serious methodological shortcomings in Penhallurick and Wink’s (2004) work and point out some of their taxonomic recommendations that should be discarded.

Analytical shortcomings

Failure to provide a consistent best-fit evolutionary model

When analysing sequence data, systematists are forced to make a number of assumptions about their dataset (e.g. rates of nucleotide substitution). Thus, every phylogenetic tree search is based on an evolutionary model that reflects assumptions about the parameters specified by the systematist. By failing to specify model parameters, the systematist implies that the simplest assumptions hold true for the respective parameters. In the earliest days of molecular systematics, subjecting a phylogenetic analysis to a complex evolutionary model was difficult, but these days there are a number of tools that make it easy to search for the best fit among a number of well-founded models and to incorporate its assumptions into likelihood and distance-based searches (e.g. Posada and Crandall 1998).

Penhallurick and Wink (2004) only mention in passing (in the caption of their fig. 3) which model they employed for their maximum likelihood search, but they fail to let the reader know whether the evolutionary model they employed provided the best fit from among a number of models that are nowadays used in molecular phylogenetics. The same procedural shortcoming is revealed in their neighbour joining analysis, which is based on Jukes–Cantor distances that reflect the simple one-parameter model of rate substitution (Jukes and Cantor 1969). We suspect that HKY-based distances would have provided a better estimate of genetic distance, since that is the model Penhallurick and Wink (2004) chose for their likelihood search. Yet no-one knows, since they do not inform us whether they tested if the HKY85 model best represented their dataset, and – if so – why their neighbour joining distances are based on the Jukes–Cantor model. A further inconsistency relates to the pairwise genetic distances provided in their tables 2–9 that were used to split or lump taxa. These are uncorrected ‘p’ distances, which will provide an underestimate of true genetic divergence and will rapidly become meaningless as genetic divergence reaches asymptotes among more and more distantly related taxa.
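
The underestimate inherent in uncorrected ‘p’ distances can be illustrated with a minimal sketch. It uses the standard Jukes–Cantor correction, d = −(3/4) ln(1 − 4p/3), purely as the simplest available correction; it is not a reconstruction of Penhallurick and Wink’s (2004) analysis, and a better-fitting model such as HKY85 might well give different values. The function names and the example values of p are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected 'p' distance: proportion of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def jc69_distance(p: float) -> float:
    """Jukes-Cantor (1969) corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    if p >= 0.75:
        # Beyond this point the correction is undefined: the sequences are
        # as different as random sequences would be (full saturation).
        raise ValueError("p >= 0.75: distance is undefined under JC69")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Illustrative values only (not taken from any dataset): the gap between
# the corrected and the uncorrected distance widens as p grows.
for p in (0.01, 0.05, 0.10, 0.20, 0.40):
    d = jc69_distance(p)
    print(f"p = {p:.2f}   JC69 d = {d:.4f}   underestimate = {d - p:.4f}")
```

At small divergences the two measures are nearly identical, which is why the choice matters most for the deeper comparisons in a dataset spanning a whole order.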

Penhallurick and Wink’s (2004) failure to provide a consistent best-fit model in their analysis potentially has a significant impact on the outcome of their trees. Therefore we have little trust in a large proportion of their nomenclatural recommendations. Additionally, they present a bootstrap neighbour joining cladogram in fig. 2 that includes branch lengths (i.e. the tree is actually a phylogram), but purport that branch lengths in their fig. 3 are meaningful when in fact the tree in fig. 3 is a consensus cladogram (i.e. topology only). These errors call into question the rigour of the analyses and therefore the conclusions drawn from the resulting trees.

Failure to provide branch support measures

Methodological inconsistencies are revealed by the fact that Penhallurick and Wink (2004) base many of their taxonomic recommendations on clades for which they have not gathered any measure of branch support in their three analyses. In fact, the only analysis for which they do acquire bootstrap values as an actual measure of branch support is their neighbour joining tree, which – as a distance method of obtaining a tree topology – has fallen into disfavour among many systematists (e.g. Huson and Steel 2004). However, for both their maximum likelihood tree and their maximum parsimony tree, they only provide the best topology estimate without putting us into a position to assess the reliability of their nodes with a measure of branch support. With respect to maximum likelihood and – considering their large dataset – possibly also with respect to maximum parsimony, we acknowledge that branch support is very time-intensive to compute, but this should not mean that taxonomic recommendations can be put forth on the basis of best-estimate topologies alone. Additional, and commonly used, topological tests exist using both likelihood (Goldman et al. 2000) and parsimony (Templeton 1983) criteria and these should have been applied to critical phylogenetic arrangements to test whether the data supported the topologies presented before major taxonomic revisions were made. Penhallurick and Wink (2004) are too ambitious about the level of analytical certainty they wish to obtain with their dataset. Ironically, in most cases, partitioning their dataset into taxonomic subsets would have been sufficient to make their data digestible for today’s computer generation to provide bootstrap estimates. As easy as that sounds, their failure to do so greatly compromises the validity of the results they actually present.
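
The logic of non-parametric bootstrapping can be sketched briefly. The example below resamples alignment columns with replacement and records how often a given pair of taxa remains each other’s closest relative by uncorrected distance. This is a deliberately simplified stand-in for a full tree search on each replicate, which is what a real bootstrap analysis would perform; the taxon names, sequences and function names are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of aligned sites that differ between two sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def bootstrap_pair_support(alignment, pair, replicates=1000, seed=0):
    """Fraction of bootstrap replicates in which `pair` is the closest pair.

    Simplification (an assumption of this sketch): 'closest pair by
    p-distance' replaces a full tree search on each replicate.
    Ties are broken by taxon name order, which a real analysis would avoid.
    """
    rng = random.Random(seed)
    names = sorted(alignment)
    n_sites = len(alignment[names[0]])
    target = tuple(sorted(pair))
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns (sites) with replacement.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {n: [alignment[n][c] for c in cols] for n in names}
        # Find the pair of taxa with the smallest distance in this replicate.
        _, closest = min(
            (p_distance(resampled[a], resampled[b]), (a, b))
            for i, a in enumerate(names)
            for b in names[i + 1:]
        )
        if closest == target:
            hits += 1
    return hits / replicates


# Hypothetical toy alignment, for illustration only.
toy = {
    "taxon_A": "ACGTACGTACGTACGTACGT",
    "taxon_B": "ACGTACGTACGAACGTACGT",
    "taxon_C": "ACGAACTTACGTTCGTACGA",
    "taxon_D": "TCGAACTTACGTTCGAACGA",
}
print(bootstrap_pair_support(toy, ("taxon_A", "taxon_B"), replicates=200))
```

A grouping that is recovered in only a small fraction of replicates rests on a handful of sites and may disappear as taxa or characters are added.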

Examples of taxonomic conclusions drawn by Penhallurick and Wink (2004) that may be affected by this methodological weakness are manifold.

Their internal rearrangement of storm-petrel species into genera is unfounded. In their neighbour joining tree, they provide high bootstrap support for the split of storm-petrels into two lineages, each consisting of a number of genera, but within those lineages only one clade has high bootstrap support. Yet despite the lack of statistical support, Penhallurick and Wink (2004) conduct a complete revision of storm-petrel taxonomy and re-assign species into different genera solely based on best-estimate topologies. This is not standard systematic procedure and should not be done without any additional corroboration from other data sources.
Regarding the placement of the storm-petrels within Procellariiformes, Penhallurick and Wink (2004) strongly recommend placing these with the albatrosses rather than in their traditional position, even though their own neighbour joining bootstrap consensus (their fig. 2) provides conflicting evidence. Thankfully, they stop short of making formal taxonomic recommendations, because their ‘… study is only based on one gene …’.
Penhallurick and Wink (2004) place Lugensa with the shearwaters contra previous authors (see their references) who have associated Lugensa with gadfly petrels and fulmars on the basis of morphology and feather lice. Penhallurick and Wink’s (2004) rationale behind this formal taxonomic rearrangement is the fact that all three trees show Lugensa next to shearwaters, and that divergences between Lugensa and the shearwaters are slightly smaller than those between Lugensa and the gadfly petrels and fulmars. However, we strongly doubt that the difference in divergence is significant, especially considering the fact that they did not correct their divergences for saturation and multiple substitutions. Additionally, we note that the new position of Lugensa receives no bootstrap support whatsoever in their neighbour joining tree, the only tree for which support values are given. Knowing how easily those unsupported basal clades collapse or change in composition as taxa are added or deleted from the dataset, we question Penhallurick and Wink’s (2004) conclusions.

Subjective discarding of conflicting tree evidence

Another analytical weakness that aggravates methodological shortcomings is the practice of giving more credence to one tree over the other simply on the basis of the shallowness or depth of the branch lengths involved. This unacceptable practice of interpreting data leaves readers with the impression that Penhallurick and Wink (2004) subjectively selected trees and discarded uncomfortable results in favour of those that fitted their taxonomic reasoning, though we are certain that was not the authors’ intention.

The best example of this practice is their suggestion that Procellariiformes should be radically re-grouped into two families (Diomedeidae and Procellariidae), which is only supported by their maximum likelihood and maximum parsimony trees, but not by neighbour joining. Curiously, they state that their reason for not making a formal recommendation for this new arrangement is because ‘… the relevant branches in fig. 3 [their maximum likelihood tree] are unfortunately shallow, and hence less reliable …’. It should strike readers to find out that shallow branches served two purposes, namely to discard their neighbour joining results and to declare their remaining results less reliable. As to the basal family arrangement of Procellariiformes, we think that none of their results merits consideration, because there is no statistical branch support for any of their conclusions.

Conceptual shortcomings

Penhallurick and Wink (2004) assert they embrace Mayr’s (1996) multidimensional biological species concept. Yet, as we shall see below, they use their own divergence estimates to override morphological, behavioural and genetic studies that have already established the species status of a number of taxa in question – sometimes even beyond doubt.

Should cytochrome b divergence be used to overturn taxonomic conclusions derived from detailed knowledge of genetic population structure? The case of the albatrosses

A phenomenon inherent to seabirds that seems largely responsible for a great part of Penhallurick and Wink’s (2004) biased judgement is the fact that allopatry and sympatry are difficult to define in pelagic species. Most seabirds breed on isolated islands, yet their foraging grounds may range over thousands of kilometres of open ocean. It is therefore easy to claim that two populations, which are characterised by low genetic divergence, are mere subspecies simply because their breeding grounds are allopatric, even though in many cases it has been shown that they are entirely capable of hybridising with each other but refrain from doing so owing to significant isolating mechanisms.

In their revision of albatross systematics, Penhallurick and Wink (2004) make reference to Robertson and Nunn’s (1998) taxonomic review and reverse the latter authors’ taxonomic decisions (based on the phylogenetic species concept) that involved the assignment of species status to all existing subspecies. Like many other large birds, albatrosses are characterised by low genetic divergence, and this seems reason enough for Penhallurick and Wink (2004) to re-lump all the subspecies. Yet they fail to address more recent studies that have uncovered new evidence for the species status of at least some of those forms (Burg and Croxall 2001, 2004; Abbott and Double 2003a, 2003b).

For instance, based on low genetic divergence Penhallurick and Wink (2004) advocate the lumping of Shy and White-capped Albatrosses (Thalassarche cauta–steadi), and likewise of the four constituent taxa of the wandering albatross complex (Diomedea exulans), and of the two black-browed albatrosses (Thalassarche melanophris–impavida). They overturn taxonomic recommendations based on painstaking research into the mitochondrial DNA and microsatellite loci of these taxa (Burg and Croxall 2001, 2004; Abbott and Double 2003a, 2003b), which – taken together – strongly suggest an absence of contemporary intermigration and a demographic independence that warrants recognition as biological species of most of the taxa involved. The recent surge in studies on albatross genetics puts us into a position to know that albatrosses are capable of maintaining a panmictic population structure across the entire Southern Ocean (Burg and Croxall 2001, 2004). In the Wandering Albatross, for instance, there is high gene flow among D. [e.] exulans colonies from around the southern hemisphere, but apparently no gene flow at all between some of those same D. [e.] exulans populations and geographically adjacent colonies of D. [e.] dabbenena or D. [e.] antipodensis (Burg and Croxall 2004). By arguing that all wandering albatrosses are one species because of low genetic divergence, Penhallurick and Wink (2004) confuse pattern and cause. To allude to Simpson’s (1961) widely cited analogy: twins are not twins because they are alike, but they are alike because they are twins. On an equal footing, Penhallurick and Wink (2004) should not argue that those taxa are conspecific because they are similar, but that they are similar because they are conspecific. Yet if they are conspecific, why don’t they interbreed despite their perfect capability to do so (Burg and Croxall 2001, 2004)?

The ‘temporal allopatry’ fallacy – the case of Macronectes

One of the central themes of Mayr’s (1996) multidimensional biological species concept is the notion that isolating mechanisms do not have to be acquired through a teleological process, but can arise as ‘… a by-product of the process of divergence …’ (Mayr 1996, p. 265). Therefore, biological species are prevented from interbreeding not only through post-zygotic isolating mechanisms (which are likely to be reflected in patterns of genetic divergence), but also through pre-zygotic isolating mechanisms, which often go along with high genetic divergences but sometimes don’t. Mayr (1996, p. 266) himself asserts that isolating mechanisms ‘… include not only purely genetic mechanisms, but also the use of ecological and life history factors and … a number of behavioural devices …’.

The breeding of morphologically distinct populations of seabirds on the same islets at different times of year has been interpreted by the vast majority of ornithologists as precisely such a behavioural isolating device and has consequently been used to distinguish these populations as members of different biological species. For instance, after finding that two morphologically divergent but sympatric populations of the Herald Petrel (Pterodroma heraldica) on Henderson Island mated assortatively and at different times of year, Brooke and Rowe (1996) used this newly acquired ecological knowledge to split into species what had originally been assumed to be colour morphs of one species.

However, Penhallurick and Wink (2004) reinterpret this ecological isolating mechanism as ‘temporal allopatry’ and set out to lump forms that an overwhelming majority of proponents of the multidimensional biological species concept have recognised as good biological species based on sound ecological, behavioural and morphological knowledge.

The most outstanding case in which Penhallurick and Wink (2004) use this fallacious argument to justify taxonomic rearrangements is their treatment of Macronectes. Two Macronectes taxa that occur sympatrically without regular interbreeding were shown by Bourne and Warham (1966) to be good biological species, yet Penhallurick and Wink (2004) argue that ‘… it would seem consistent on the basis of cytochrome b distances to treat halli as a subspecies of giganteus …’ because they had used similar distances to lump albatrosses previously. In their own defence, they point out that on South Georgia hybridisation (albeit at low rates, at an incidence of 2.46%) has been recorded, yet they fail to recognise that the father of their very own multidimensional biological species concept accords species status to taxa that regularly hybridise at far higher frequencies as long as their gene pools do not fuse (Mayr 1996, p. 265). The lack of hybridisation between the two Macronectes taxa at other sympatric sites is belittled by Penhallurick and Wink (2004) as ‘temporal allopatry’, since the timing of their breeding is not synchronous. In summary, they seemingly wrestle for explanations to salvage their cause of genetic divergence as a taxonomic ‘tell-all’ for the sake of consistent interpretation of cytochrome b distances. To revert to Simpson’s (1961) analogy, they declare two sisters twins because they are so similar. By doing so, they violate the very heart of the biological species concept: two taxa whose gene pools fail to amalgamate upon secondary contact are full biological species per definitionem, no matter how similar they are genetically. Asynchronous breeding is not a haphazard phenomenon that happens to be in the way of proving their cross-breeding potential, but it is the very isolating mechanism that helps them retain their species integrity. And last but not least, a low but stable incidence of interbreeding does not make a case for lumping, but demonstrates that effective pre-zygotic barriers are at work even though post-zygotic compatibility is still upheld.

Should we trust genetic divergences more than life history data in our assignment of species status? The case of the prions

The role of vocalisations in the maintenance of species integrity has long been recognised in songbirds, but their importance has only recently been shown in seabirds (see Bretagnolle 1990, 1995; Bretagnolle et al. 1990; Zonfrillo 2004; and references therein). In an exemplary study on four taxa of prion (Pachyptila), Bretagnolle et al. (1990) showed that although some species investigated displayed a slight overlap in morphological measurements, all of them were characterised by some level of genetic differentiation and – more importantly – by considerable differences in vocalisations. In areas of sympatry they are additionally segregated in terms of ecological requirements and timing of breeding. Moreover, Bretagnolle et al. (1990) demonstrated that previous claims of hybridisation are questionable. All in all, Bretagnolle et al. (1990) presented sound biological data from areas as varied as behavioural ecology, morphology and genetics to make a strong case for the recognition of four species-level taxa.

Penhallurick and Wink (2004) included three of those four prions in their study and found low divergences between them. It is curious to what lengths they go to defend their case for lumping these prions against the solid biological evidence that had been put forth by Bretagnolle et al. (1990). The very fact that there is sympatry among most of these prion taxa was again shrugged aside as ‘temporal allopatry’ (see above). Moreover, Penhallurick and Wink (2004) completely ignored Bretagnolle et al.’s (1990) morphological, acoustic and ecological data that support species status of the prions. Then, in an unconventional act of taxonomic procedure, Penhallurick and Wink (2004) use Bretagnolle et al.’s (1990) electrophoretic distance measures to calibrate them against their own cytochrome b divergences, which – they believe – puts them in a position to make inferences about the taxonomic status of Pachyptila belcheri, the one species that was missing in their own dataset.

Some of the caveats to heed when applying genetic divergence to taxonomy

The notion that sequence divergence can serve as a means of making inferences about the time scale of evolutionary processes is straightforward and intuitive. This approach has been used in quite a number of recent influential bird studies (Klicka and Zink 1997; Avise and Walker 1998; Johnson and Cicero 2004) though few ornithologists before Penhallurick and Wink (2004) have been so rigid in its application to taxonomic decision-making.

Some of these previous studies have helped uncover valuable insights, such as mitochondrial divergences of >2% between most Nearctic passerine sister species (though see Johnson and Cicero 2004) or an average of 2% divergence per million years among a number of bird families (Lovette 2004). However, many systematists have become complacent in assuming that we can simply generalise some of the findings of those studies and apply them to any avian taxonomic grouping.

Yet the use of molecular calibrations is far more complicated than that. First, saturation and multiple substitutions are a serious problem as one goes deeper in the phylogeny. To avert this problem, Penhallurick and Wink (2004) opted to use a molecular clock for their amino acid translations rather than for their nucleotide data.

A second problem is the difference in rates of molecular evolution among bird lineages. For avian mitochondrial DNA, calibrations are available for five different orders, and they do not coincide with one another as well as they could (see Lovette 2004 for an excellent review). Nunn and Stanley (1998) derived three divergence calibrations for procellariiform birds (ranging from mitochondrial sequence divergences of 0.62% per million years through 0.92% per million years), one for larger families, one for medium-sized families, and one for the tiny storm-petrels (in ascending order). Other studies (Johnson and Sorenson 1998; Johnson and Cicero 2004) make it abundantly clear that some of the basal avian lineages, such as ducks, can have as little as 0.1–0.5% mitochondrial divergence between such distinct species as Northern Pintail (Anas acuta) and Mallard (Anas platyrhynchos) or Blue-winged (Anas discors) and Cinnamon (Anas cyanoptera) Teals. If Penhallurick and Wink (2004) had conducted a taxonomic revision of ducks, they might have ended up lumping the 35–40 currently recognised species of dabbling ducks (Anas) into a handful of species. Yet another problem with mitochondrial genetic divergences is the potential for a brief spell of past hybridisation of contemporary unequivocal species. Weckstein et al. (2001) found zero genetic divergence between two uncontested species of Zonotrichia sparrow, which belong to a family that is usually characterised by rather high divergences between sister taxa (Klicka and Zink 1997). They hypothesised that this pattern arose from past hybridisation among two species that are ecologically well segregated nowadays. The very same pattern has been found among skuas (Andersson 1999), a bird group with a life history that is very similar to Procellariiformes.

If Nunn and Stanley (1998) found pronounced differences in rates of genetic evolution within the Procellariiformes, how can Penhallurick and Wink (2004) do that order justice by using one calibration? Considering that their amino acid calibration was not even derived from seabird fossils, but from the taxonomically remote ratites, we have serious doubts as to the correctness of their datings.
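
The sensitivity of divergence dates to the choice of calibration can be shown with simple arithmetic. The sketch below assumes a strictly linear clock (time = divergence / rate), ignores saturation, and uses only rates quoted in the text above; the 2% divergence is a hypothetical example value, and the function name is ours.

```python
def divergence_time_my(divergence_pct: float, rate_pct_per_my: float) -> float:
    """Estimated time since divergence in million years under a linear clock.

    Both arguments are pairwise sequence divergences expressed in percent;
    the rate is percent divergence per million years.
    """
    if rate_pct_per_my <= 0:
        raise ValueError("calibration rate must be positive")
    return divergence_pct / rate_pct_per_my


# Calibrations quoted in the text; labels are descriptive only.
calibrations = {
    "Procellariiformes, slowest (Nunn and Stanley 1998)": 0.62,
    "Procellariiformes, fastest (Nunn and Stanley 1998)": 0.92,
    "average across several bird families (Lovette 2004)": 2.0,
}

observed_divergence = 2.0  # hypothetical pairwise divergence, in percent

for label, rate in calibrations.items():
    t = divergence_time_my(observed_divergence, rate)
    print(f"{label}: {t:.2f} My")
```

Under these assumptions the same 2% divergence corresponds to roughly 1 million years with the general avian rate but to between about 2.2 and 3.2 million years with the procellariiform calibrations, so a single rate applied across the order can shift dates severalfold.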

Throughout their paper, Penhallurick and Wink (2004) tend to reject separate species status for any taxon pair with a divergence of <2%. Yet not once do they give a justification for this arbitrary cut-off. They do assert that such cut-off divergences have to be extrapolated by calculating distances of closely related but unequivocal species and using those as a yardstick. Yet as we have discussed above, wherever mitochondrial distances between good species pairs are lower than ~2%, they prefer to lump those taxa by overriding previous findings, rather than to use that low divergence as a new yardstick.

References

Abbott, C. L., and Double, M. C. (2003a). Phylogeography of shy and white-capped albatrosses inferred from mitochondrial DNA sequences: implications for population history and taxonomy. Molecular Ecology 12, 2747–2758.

Jukes, T. H., and Cantor, C. H. (1969). Evolution of protein molecules. In ‘Mammalian Protein Metabolism’. (Ed. H. N. Munro.) pp. 21–32. (Academic Press: New York.)

Klicka, J., and Zink, R. M. (1997). The importance of Recent Ice Ages in speciation: a failed paradigm. Science 277(5332), 1666–1669.

Robertson, C. J. R., and Nunn, G. B. (1998). Towards a new taxonomy for albatrosses. In ‘Albatross Biology and Conservation’. (Eds G. Robertson and R. Gales.) pp. 13–19. (Surrey Beatty: Sydney.)

Sibley, C. G., and Ahlquist, J. (1990). ‘Phylogeny and Classification of Birds.’ (Yale University Press: New Haven, CT.)

Simpson, G. G. (1961). ‘Principles of Animal Taxonomy.’ (Columbia University Press: New York.)

Templeton, A. R. (1983). Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and apes. Evolution; International Journal of Organic Evolution 37, 221–244.

Weckstein, J. D., Zink, R. M., Blackwell-Rago, R. C., and Nelson, D. A. (2001). Anomalous variation in mitochondrial genomes of White-crowned (Zonotrichia leucophrys) and Golden-crowned (Z. atricapilla) Sparrows: pseudogenes, hybridization, or incomplete lineage sorting? Auk 118, 231–236.

Zonfrillo, B. (2004). Obituary: Paul Alexander Zino 1916–2004. Ibis 146, 575–576.