Volume 63, Issue 7 p. 528-537
Critical Review

How a neutral evolutionary ratchet can build cellular complexity

Julius Lukeš

Biology Centre, Institute of Parasitology, Czech Academy of Sciences, and Faculty of Sciences, University of South Bohemia, České Budějovice (Budweis), Czech Republic

John M. Archibald

Centre for Comparative Genomics and Evolutionary Bioinformatics, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada

Patrick J. Keeling

Department of Botany, University of British Columbia, Vancouver, BC, Canada

W. Ford Doolittle

Centre for Comparative Genomics and Evolutionary Bioinformatics, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada

Michael W. Gray (Corresponding Author)

Centre for Comparative Genomics and Evolutionary Bioinformatics, Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada

Tel: +1-902-494-2521. Fax: +1-902-494-1355

Department of Biochemistry and Molecular Biology, Sir Charles Tupper Medical Building, Dalhousie University, 5850 College Street, Halifax, Nova Scotia B3H 4R2, Canada
First published: 22 June 2011

Abstract

Complex cellular machines and processes are commonly believed to be products of selection, and it is typically understood to be the job of evolutionary biologists to show how selective advantage can account for each step in their origin and subsequent growth in complexity. Here, we describe how complex machines might instead evolve in the absence of positive selection through a process of “presuppression,” first termed constructive neutral evolution (CNE) more than a decade ago. If an autonomously functioning cellular component acquires mutations that make it dependent for function on another, pre-existing component or process, and if there are multiple ways in which such dependence may arise, then dependence inevitably will arise and reversal to independence is unlikely. Thus, CNE is a unidirectional evolutionary ratchet leading to complexity, if complexity is equated with the number of components or steps necessary to carry out a cellular process. CNE can explain “functions” that seem to make little sense in terms of cellular economy, like RNA editing or splicing, but it may also contribute to the complexity of machines with clear benefit to the cell, like the ribosome, and to organismal complexity overall. We suggest that CNE-based evolutionary scenarios are in these and other cases less forced than the selectionist or adaptationist narratives that are generally told. © 2011 IUBMB IUBMB Life, 63(7): 528–537, 2011

INTRODUCTION

In a recent Science Perspective (1), we highlighted a neutral evolutionary theory, called constructive neutral evolution (CNE) by Stoltzfus (2), emphasizing how such a process could lead to what we term “irremediable complexity”: the seemingly gratuitous, indeed bewildering, complexity that typifies many cellular subsystems and molecular machines, particularly in eukaryotes. We offered (in fact reoffered) the CNE paradigm as a counterpoint to purely adaptationist/selectionist schemes that are often favored by biologists, and molecular biologists in particular, to explain the evolution of structural and biochemical complexity. We argued that continued failure to consider CNE alternatives impoverishes evolutionary discourse and, by oversimplification, actually makes us more vulnerable to critiques by antievolutionists, who like to see such complexity as “irreducible.” Here, we expand on this idea by presenting in more detail “case histories” that illustrate how CNE might have operated in the emergence of several complex systems, including RNA editing, the spliceosome, and the ribosome, and how it might be invoked more broadly as an evolutionary paradigm underlying cellular complexity in general.

The ability to explain complex adaptations was seen by Darwin as a crucial test of his theory, though his focus of course was at the supramolecular level. He wrote in The Origin of Species (3), “If it could be demonstrated that any complex organ existed, which could not possibly have been formed by numerous, successive, slight modifications, my theory would absolutely break down.” Darwin knew of no such organ, nor do we. But evolutionists are often conflicted about what might be the driving force behind successive modifications in specific complex cases, and importantly about whether there might be a tendency for Life to get more complex in general. Indeed, endorsing such a general tendency seems dangerously close to discredited beliefs in “evolutionary progress” (4, 5).

Although evolutionary biologists now widely accept that much molecular sequence evolution is, as long ago suggested by Kimura (6) and by Jukes and King (7), effectively neutral, they generally view complex macromolecular or organismal structures and processes—especially those with many necessary subunits or steps—as products of selection. This selection might be seen necessarily to entail general complexification in at least five ways. First, complexity might increase overall for the obvious reason that the evolution of many organs or cellular organelles with incrementally improvable function—such as the vertebrate eye or the bacterial flagellum—might often require the accretion of many parts (8, 9). Second, regulation of increasingly numerous, individually selected activities might require more elaborate hierarchical control systems. For instance, the greater complexity of eukaryotic genome structures compared to prokaryotic, in size, chromosomal organization, number of repeats, and nongenic transcription, has long been interpreted as serving such a global function, possibly via RNA–RNA interactions (10). A third complexifying factor might have to do with the external relations of an organism more than its internal machinery. For instance, interactions with parasites are thought by some to have been a driving force behind redundancy in signaling networks and the origins and maintenance of sexual reproduction in their hosts (11, 12). The latter habit itself makes more complex the physiology and behavior of a species, and sexual selection might be seen as a fourth general kind of selected complexification. Finally, many complex genomic features, especially of eukaryotes, may promote “evolvability,” which itself might be selected for at a species or higher clade level (13).

In Full House, Gould (5) noted that there is a trivial sense in which we need not invoke selection of any kind to explain an increase in overall complexity since Life's origins, when by definition it had zero complexity. A selectively neutral random walk (Gould called it a “drunkard's walk”) through complexity space will likely not return Life to this starting point, even if there is no directional force. But, most living things are still at the prokaryotic level of organization, Gould noted. Only in some lineages—on which we have focused our attention in part because one of them leads to us—has there been substantial further complexification. Even among these lineages there are instances of secondary simplification: extreme reduction in genome size and phenomic capacity are known in parasitic derivatives of many phyla.
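Gould's argument can be illustrated with a minimal simulation. The sketch below is not taken from Gould or from the works cited here. It assumes that complexity can be represented as an integer level, that each step moves a lineage up or down one level with equal probability, and that level 0 is a floor below which no lineage can fall. The function names and parameter values are hypothetical.

```python
import random

def drunkards_walk(steps, rng):
    """Unbiased walk on complexity levels 0, 1, 2, ... with a wall at 0."""
    level = 0
    for _ in range(steps):
        level += rng.choice((-1, 1))   # up and down are equally likely
        if level < 0:                  # assumed floor: minimal complexity
            level = 0
    return level

def summarize(lineages=2000, steps=400, seed=1):
    """Return (mean level, maximum level, fraction of lineages at the floor)."""
    rng = random.Random(seed)
    finals = [drunkards_walk(steps, rng) for _ in range(lineages)]
    mean = sum(finals) / lineages
    at_wall = sum(1 for f in finals if f == 0) / lineages
    return mean, max(finals), at_wall

if __name__ == "__main__":
    mean, top, at_wall = summarize()
    print(f"mean level {mean:.1f}, maximum level {top}, at floor {at_wall:.1%}")
```

Under these assumptions, the mean and the maximum across lineages move away from zero although no individual step favors complexity, and the most common outcomes should remain close to the floor. The wall, not a directional force, produces the asymmetry.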

In his recent critiques of the adaptationist bias in evolutionary explanation, Lynch (14) offers a more reasoned and nontrivial, but still neutralist, explanation—small population size—for why complexity will have increased in some lineages. Many aspects of molecular and cellular biology, including some very bizarre genomic structures, are best explained as “products of nonadaptive processes” (fixation by drift of neutral or mildly deleterious mutations), most effective in small populations. Thus, eukaryotes, with smaller population sizes than prokaryotes, have more genomic features such as introns, editing, and vast excesses of noncoding DNA whose origins challenge the skills of even the most imaginative panadaptationist (15).
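The dependence on population size can be made quantitative with the standard diffusion approximation for the fixation probability of a new mutation, usually attributed to Kimura. The formula does not appear in this review and is used here as an assumption, with N taken as the effective size of a diploid population and s as the selection coefficient (negative for a deleterious mutation). The function names and the example values of N and s are hypothetical.

```python
import math

def fixation_probability(N, s):
    """Diffusion approximation for a new mutation at initial frequency 1/(2N)."""
    if s == 0:
        return 1.0 / (2 * N)          # strictly neutral: fixation by drift alone
    exponent = -4.0 * N * s
    if exponent > 700:                # exp() would overflow; probability is ~0
        return 0.0
    # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy near 0
    return math.expm1(-2.0 * s) / math.expm1(exponent)

def relative_to_neutral(N, s):
    """Fixation probability divided by the neutral expectation 1/(2N)."""
    return fixation_probability(N, s) * 2 * N

if __name__ == "__main__":
    s = -1e-4                          # assumed mildly deleterious mutation
    for N in (100, 100_000):
        print(N, relative_to_neutral(N, s))
```

With these assumed values, the same mildly deleterious mutation behaves almost like a neutral one in the small population, whereas in the large population its chance of fixation is negligible compared with the neutral expectation.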

Moreover, directionality can be imposed on neutral processes, in the form of evolutionary ratchets. Maynard Smith and Szathmary in their book The Major Transitions in Evolution (16) proposed that one such ratchet, which they called contingent irreversibility, might have pushed Life inexorably toward greater complexity. As a result of several major transitions, such as the assembly of independent replicators into chromosomes, the acquisition of (and loss of autonomy by) the endosymbiotic bacteria that were to become mitochondria and chloroplasts, and the origins of sexual reproduction or multicellularity, previously independent units became interdependent for replication. Reversion to independence might have subsequently been selected against or the potential for reversion simply lost through disuse. Thus, contingent irreversibility serves as a neutral evolutionary ratchet, a directional force that might drive complexification within some lineages, without positive selection.

Abbreviations

CNE, constructive neutral evolution; gRNA, guide RNA; RNP, ribonucleoprotein.

Described in detail more than a decade ago by Stoltzfus (2), CNE can be understood as a similar ratchet process, combining mutation, drift, epistasis, and negative selection. Such processes or forces are still most often ignored in explanations by molecular biologists of the intricacies of the molecular machinery of cells, their explanations being generally selectionist or panadaptationist in character. The one exception to this neglect is the “neutral subfunctionalization” model for the retention of gene duplicates proposed at about the same time and since convincingly elaborated by Lynch (14). Subfunctionalization is a special case of CNE. Moreover, Lynch's perspective on population size expands the explanatory potential of CNE, because the presuppressive interactions (Fig. 1) that become fixed via CNE need not be assumed to be completely neutral. We must emphasize, however, that CNE is not simply the neutral theory of evolution nor is it simply a necessary consequence of small population sizes. Instead, it posits more specifically that cellular functions will inevitably come to depend on the interactions of more and more components (that is, function will “diffuse”) as a consequence of the inevitable gratuitous pre-existence of potentially suppressive molecular interactions.

Figure 1. Constructive neutral evolution of biochemical complexity. Schematic depicts (i) a generic enzymatic reaction carried out by cellular component A, (ii) fortuitous (and presuppressing) neutral interactions (yellow dots) with component B, (iii) mutation in A (red dot) that inactivates its activity but that is suppressed by existing interaction with B, (iv) additional mutation in A that is also presuppressed by interaction with B, and (v) coevolving A:B interaction arising later. At stage (ii), A is able to function whether or not B is present and interacting with it, but at stages (iii) and beyond, A is not able to function in the absence of B.

COMPLEXITY THROUGH PRESUPPRESSION

A generalized representation of CNE as an inevitable and possibly widespread evolutionary tendency is cartooned in Fig. 1. A biochemical reaction under selection (green arrow) is catalyzed by a cellular component A (nucleic acid or protein) that fortuitously interacts with component B either directly, by binding, or indirectly, through the products of B's own selected activity. (Here, we define “fortuitous interaction” as a chance but nevertheless thermodynamically specific interaction between the binding partners.) The interaction, though not under selection, permits (suppresses) mutations in A that would otherwise inactivate it. Under these conditions, mutations will unavoidably occur, making A dependent on B. Reversion to independence might also happen, but if there are multiple sites at which the first and further dependencies between A and B can arise (or multiple ways in which A can become dependent on B's activities, when the interaction is mediated indirectly) a random walk through dependency space—just like Gould's “drunkard's walk” through complexity space—is unlikely to restore A to its original state of independence from B. In simplest terms, if there are more ways for intermolecular or interprocess dependence to increase than decrease, then an increase is unavoidable. CNE, thus, comprises an evolutionary ratchet, like Maynard Smith and Szathmary's contingent irreversibility, but one relating to any function, not only replication. Because of CNE, we expect molecular machines to accumulate more and more subunits even when there is no improvement in their function, which is to say that we expect “function” to “diffuse” among cellular components over evolutionary time. Organisms, like human institutions, will become ever more “bureaucratic,” in the sense of needlessly onerous and complex, if we see complexity as related to the number of necessarily interacting parts required to perform a function, as did Darwin. 
Once established, such complexity can be maintained by negative selection: the point of CNE is that complexity was not created by positive selection.
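The ratchet argument can be sketched as a toy model. The model is an assumption introduced here for illustration (it is essentially the Ehrenfest urn) and is not the formulation given by Stoltzfus. Component A is assumed to have n_sites positions at which a mutation presuppressed by B can occur. At each step one randomly chosen position flips between its independent and dependent states, and every flip is treated as neutral because B is bound. A is independent of B only when no position is in the dependent state. All names and parameter values are hypothetical.

```python
import random

def fraction_of_time_independent(n_sites, steps, seed=0):
    """Simulate the toy model and return the fraction of steps with k == 0."""
    rng = random.Random(seed)
    k = 0                                   # number of dependent sites in A
    independent_steps = 0
    for _ in range(steps):
        # A dependent site is hit with probability k / n_sites and reverts;
        # otherwise an independent site is hit and becomes dependent on B.
        if rng.random() < k / n_sites:
            k -= 1
        else:
            k += 1
        if k == 0:
            independent_steps += 1
    return independent_steps / steps

def stationary_independent(n_sites):
    """Long-run probability of k == 0 in this model (binomial with p = 1/2)."""
    return 0.5 ** n_sites

if __name__ == "__main__":
    for n in (2, 5, 20):
        print(n, stationary_independent(n),
              fraction_of_time_independent(n, 100_000))
```

In this sketch, reversion at any single site is exactly as likely as the forward mutation, yet the long-run chance of finding A fully independent halves with every additional site at which dependence can arise. With only a few tens of such sites, independence is effectively never regained, which is the sense in which there are more ways for dependence to increase than to decrease.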

In our earlier Perspective (1), we highlighted a simple biological example first described by Atkins and Lambowitz (17). Neurospora mitochondrial group I introns that are dependent for splicing on a nucleus-encoded mitochondrial tyrosyl tRNA synthetase (mtTyrRS) could have arisen as in Fig. 1, as follows. Self-splicing introns (as component A) fortuitously bound mtTyrRS (component B). This binding allowed the accumulation in the intron of several mutations that destroyed its ability to self-splice, because by binding to and stabilizing the intron RNA the mtTyrRS “presuppressed” them. Once several such mutations have happened, it is no longer likely that a random mutational (drunkard's) walk through “dependency space” will restore the intron's initial independence. Thus, the acquisition of protein dependence by the intron could be seen as simultaneously “accidental,” selectively neutral, and inevitable.

Such an explanation inverts the order proposed by Paukstelis and Lambowitz (18), in which the mtTyrRS binding was described as having arisen to compensate for “structural defects” acquired by the intron sequence. We maintain that such an ordering of events puts the cart before the horse: introns bearing such defects would be at serious selective disadvantage. It is unlikely that the mutations in question would be fixed in populations before the binding suppressed their deleterious effects.

This chain of events is relatively easy to envision in a two-component, autocatalytic, selfish element found only in an organelle genome of a single fungus. However, below, we outline how CNE might have played a major role in the emergence of a number of complex cellular machines, some narrowly distributed phylogenetically, others universal within a particular domain or among all domains of Life.

RNA EDITING

A bewildering array of RNA editing systems, mostly involved in retailoring transcripts of protein-coding genes, has been described in eukaryotic organelles, particularly mitochondria (19). The patchy distribution of these systems argues forcefully that they are derived traits. Various benefits (many cited below) have been ascribed to RNA editing. All comprise “Just So Stories” in the rich adaptationist tradition (14), but the origin and evolution of editing were also the subject of an early CNE model (20). Below, we expand this argument in light of new insights into the biochemistry of two radically different RNA editing systems, illustrating how each could have evolved without positive selection.

Editing in Kinetoplastid Protozoa

Kinetoplastid RNA editing (Fig. 2A) involves the post-transcriptional insertion and/or deletion of uridine (U) residues in mitochondrial mRNAs. Sequence information for editing is provided by a multiplicity of guide RNA (gRNA) molecules, typically encoded in separate minicircles organized as a concatenated network (kinetoplast DNA, kDNA) that also contains larger maxicircles, bearing the genes whose transcripts undergo editing. In Trypanosoma brucei, more than 1,000 different gRNAs mediate 2,965 U insertions and 318 U deletions. The processes of matching cognate mRNAs and gRNAs, deleting and inserting U residues, and the addition of polyU tails are executed by at least five protein complexes (MRP1/2, MRB1, and three similar core editing complexes), plus accessory factors and other interacting complexes—altogether more than 70 proteins (21).

Figure 2. RNA editing in kinetoplastid and plant mitochondria. (A) The Trypanosoma brucei mitochondrion depicting the kinetoplast DNA (kDNA) disk and simplified scheme of mitochondrial RNA metabolism. Arrows denote major processing pathways for rRNAs, (pre-)mRNAs, and guide RNAs (gRNAs). The numbers of the various RNA species are also indicated. Known protein complexes involved in editing and processing of mRNA and its translation are shown along with the number of their protein components. (B) U insertion/deletion editing of a region of atp6 mRNA, encoding subunit 6 of electron transport Complex V (ATP synthase). (C) The mitochondrion of the land plant Arabidopsis thaliana. Each site to be edited is selected by a different nucleus-encoded, mitochondrion-targeted specificity factor (PPR protein). (D) C-to-U substitution editing of a region of nad5 mRNA, encoding subunit 5 of electron transport Complex I (NADH:ubiquinone oxidoreductase).

Since the discovery of U insertion/deletion editing in 1986 (22), numerous explanations for its evolutionary origin have been proposed, almost always based on some selective advantage. For some, editing is a relic of a proposed (23) RNA World, possibly involved in primordial error correction (24); however, the narrow phylogenetic distribution of this editing system [only in all kinetoplastid protozoa (25) and perhaps also in diplonemids (26)] makes this notion remarkably nonparsimonious. A case for contemporary error correction has also been argued, given that the mitochondrial editing system is dispensable during part of the trypanosome life cycle (27). However, in this case, the editing system is expected to engender additional mutations, and such scenarios cannot explain the origin of editing, which predates the parasitic lifestyle (25, 26). Moreover, although 12 mRNAs are edited in the procyclic stage (in the tsetse fly) of T. brucei (21), editing remains essential for at least one transcript in the bloodstream stage (in mammals) (28). The opposite argument, that editing generates variability over evolutionary time, has also been proposed (29), and although this may rationalize maintenance, it cannot easily explain origins. Expansion of the mitochondrial proteome through translation of occasional partial edits or misedits has been reported (30), but function of multiple products has not been demonstrated and seems unlikely. Defense against viruses and transposons was suggested (31), but none is known in kinetoplastids. Regulatory roles are also possible (32), but even mRNAs unneeded at a given stage continue to be edited (21). Indeed, the entire editing apparatus remains operational not only in all life cycle stages (33) but remarkably also in the so-called petite mutants of T. brucei that have lost all their mitochondrial DNA and hence RNA (34).

U insertion/deletion editing also illustrates, in a unique way, the “cart-before-horse” problem in error-correction scenarios. Such scenarios envision U deletions/insertions in coding genes arising first, then gRNAs to correct them. But, as Stoltzfus (2) remarked, because gRNAs specify the original sequence of the gene, they must have appeared before the errors they now correct were fixed in the population. Because gRNAs interact with their target by base pairing, they reveal the order of events more clearly than RNA–protein or protein–protein interactions.

Overall, adaptationist explanations for U insertion/deletion editing may be relevant to its maintenance, but they fail to address its origin. A CNE origin of this type of RNA editing (2) assumes pre-existing endonuclease, exonuclease, uridylyl transferase, RNA ligase, and other activities, capable of templated insertions/deletions of U residues in duplex RNA structures. Such activities are presumed to have arisen from cellular enzymes serving other functions, through a process of gene duplication and divergence (20). Infrequent and unselected insertions of cDNA gene segments into the genome would have generated potential templates for antisense transcripts (ancestral gRNAs), setting up a system permissive for the accumulation of otherwise lethal insertions or deletions in kDNA (20). Furthermore, to explain the existence of noncanonical G:U base pairing between gRNAs and mRNAs, the activity of cytidine deaminase was invoked, which converted some C residues to U in the antisense RNAs (35). Alternatively, dispersed gene fragments flanked by common repeats, recently found in the mtDNA of Euglena gracilis, a distant relative of kinetoplastids, may be viewed as a preadaptation from which minicircle-like molecules specifying primordial guide-like RNAs might have arisen (36). In either case, the first correctable insertion/deletion mutation in an essential gene would make the nascent editing apparatus at least temporarily essential. This first step might be readily reversed while only a single mutation is present. However, because each further correctable mutation adds another editing site, and many more mutations can add sites than can revert the existing one, an increase in editing is more likely than a return to none. Such first steps will keep occurring until one is effectively fixed by a deep random walk into editing space.

RNA Editing in Land Plant Organelles

A very different but similarly rampant type of editing occurs in land plant organelles, particularly mitochondria (37). In flowering plants, mitochondrial mRNAs undergo C-to-U substitutions at some 300 to 500 different positions, almost always resulting in an amino acid change (38). C-to-U editing involves base modification by a cytidine deaminase or transaminase rather than base or nucleotide exchange (39, 40).

No guide-type RNAs have been found in land plant organelles, despite extensive searches; instead, evidence points to cis-acting sequence elements working in concert with trans-acting proteins to facilitate editing. The protein editing factors appear to be encoded by a large gene family (>200 members in Arabidopsis thaliana) recently recognized in the genomes of land plants (41, 42). This family is characterized by tandem arrays of a degenerate 35-amino acid element termed the pentatricopeptide repeat or PPR (41). Genetic and biochemical studies implicate PPR proteins in organellar RNA metabolism in general (43), including editing (44). The current view is that a given PPR protein directly interacts with one or a few specific sites in a target transcript and recruits generic enzymes responsible for RNA maturation, such as C deaminase (editing) or RNA endonuclease (processing) (37).

Once again, several adaptationist models have been advanced to explain coevolution of complex organellar RNA metabolism and the PPR protein family in land plants. Maier et al. (45) have argued that “several chloroplast-specific mechanisms evolved in land plants to remedy point mutations that occurred after the water-to-land transition,” and that chloroplast PPR proteins exist for “the transgenomic suppression of point mutations, fixation of which occurred due to an enhanced genetic drift exhibited by chloroplast genomes.” Shikanai (37) accepted the CNE model (20) as an explanation for the emergence of C-to-U RNA editing per se, and goes on to say, “As evolution progressed, PPR proteins may have allowed the number of editing sites to be increased. By multiplying the family members with variations, plants may have easily managed the newly occurring mutations.” Others (44) have mused, “... do the huge numbers of PPR proteins provide terrestrial plants with unparalleled regulatory control over organellar gene expression, or are they merely a curious historical accident?” As with U insertion/deletion editing, C-to-U editing is either seen to have some ephemeral benefit or to have emerged to correct a deleterious intermediate. The alternative view (46) is that “… RNA editing systems, far from evolving in response to a need to “correct” a problem, actually allow the problem to emerge in the first place, i.e., they permit DNA-encoded genetic information to degenerate progressively. Viewed in this light, RNA editing systems are part and parcel of both the problem and its solution.”

Expansion of the PPR protein family, and especially a class (PLS) correlated with C-to-U editing (44), evidently occurred at the base of the land plant lineage. This expansion produced a diverse collection of RNA-binding proteins almost exclusively targeted to mitochondria and chloroplasts and binding in a site-specific but not functionally predetermined fashion to particular regions of various organellar transcripts. These specifically bound PPR proteins could interact with other proteins having various catalytic activities, effectively recruiting these enzymes and “preadapting” them to participate in a range of functions having ultimately to do with various aspects of organellar RNA metabolism (including C-to-U RNA editing). Occasionally, a PPR protein having a binding site in the vicinity of a potentially editable site will have recruited an activity (e.g., a C deaminase) able to reverse the deleterious effect of mutation at that particular position, allowing such a mutation to become fixed in the mitochondrial genome. Each individual edited site is potentially revertible, which would render its cognate PPR protein nonessential. However, when the number of such sites becomes large, a ratchet-type effect ensures that there is a vanishingly small likelihood of a return to a state in which there are no edited sites. At this stage, the editing system is “locked in.”

The CNE scenarios for kinetoplastid U insertion/deletion editing and plant C-to-U editing share two important principles: (i) nascent RNA editing machinery must emerge (probably via duplication and divergence of genes for pre-existing activities) before there is any need for editing, and (ii) from an evolutionary perspective, RNA editing is itself mutagenic.

SPLICING AND THE SPLICEOSOME

The examples so far sketched represent processes that evolved relatively recently in the organelles of individual species or lineages, but there is no reason why CNE might not have operated earlier in evolution. Indeed, the same series of events that led to the need for a splicing factor in the Neurospora group I intron can progressively build complexity to a much greater extreme: for example, in the best-studied splicing machine, the eukaryotic spliceosome. Spliceosomes comprise five small RNAs (snRNAs) and >300 proteins (47), which must be assembled de novo and then disassembled at each of the many introns interrupting the typical nascent mRNA (48). The current consensus concerning the origin and evolution of spliceosomal snRNAs sees them derived from group II introns and is grounded in Sharp's 1991 “Five Easy Pieces” scenario (49), which seems ever more appealing on mechanistic (50), comparative genomic (51), and experimental evolutionary (52) grounds. Most plausibly, some group II introns fragmented early on to yield primordial snRNAs, which then allowed the subsequent disintegration of other introns because the primordial snRNAs could facilitate splicing in trans. Such fragmentation would be ratchet-like, because reversal by correct reassembly at the DNA level from fragmented intron pieces would be extraordinarily rare. No positive selection need be invoked: Sharp's scheme was quintessentially CNE.

An even more extraordinary part of this transformation was the addition of the hundreds of proteins that serve now to make the spliceosome “the most complicated macromolecular machine in the cell” (53). Even Darwin might be reluctant to advance a claim that eukaryotic spliceosomal introns remove themselves more efficiently or accurately from mRNAs than did their self-splicing group II antecedents, or that they achieved this by “numerous, successive, slight modifications” each driven by selection to this end. Although eukaryotic splice site recognition does require many proteins, it is the greatly expanded length and poorly defined structure of eukaryotic introns vis-à-vis their group II ancestors that makes this necessary, and conversely the presence of so many proteins that has allowed intron expansion and loss of definition—a coevolutionary walk of many drunkards. And, in one sense, eukaryotic splicing is arguably quite inaccurate: most or all of our own multiintronic genes are alternatively spliced. Of course, some alternative splicing events are regulated and under selection, and alternative splicing may have been essential for the “expansion of the eukaryotic proteome,” but few would argue that most mRNA or encoded protein isoforms are functionally differentiated.

Nor is it reasonable to suppose that introns were already numerous and spliceosomes complex in early eukaryotic evolution [which comparative genomics reveals to be the case (51)] just so that multicellular animals and plants might much later enjoy greater phenotypic plasticity and evolvability. Other rationalizations of the spliceosome's wealth of protein components—that they provide a platform to facilitate regulated export and expression of mRNA or linkage to other nuclear processes for instance (53)—explain why we might not successfully reduce spliceosomal complexity now, but do not explain how it originally came to be.

In a neutralist evolutionary narrative, chance interactions with pre-existing RNA-binding proteins—from which many spliceosomal factors are indeed clearly derived—presuppressed and therefore made inevitable further increases in the size and decreases in the structural definition of eukaryotic introns, building up the contemporary spliceosome, step by unselected step. If early eukaryotes had small populations, then slightly deleterious steps will also have played a role in this CNE complexifying ratchet.

THE RIBOSOME

The processes discussed above share two important features that affect how we view their origin: not only are they phylogenetically restricted, but also there is no undeniable benefit to the organisms that bear them. Both features make it easier to accept the possibility of complexity growing by neutral means; however, if CNE is able to generate nonadaptive complexity in such machines, there is no reason it could not also have operated on machines of more ancient and central importance to cellular function. One such example is the ribosome, a structurally and functionally complex cellular machine composed of separate large and small ribonucleoprotein (RNP) subunits and common to all cellular life. The ribosome not only provides the scaffold on which the translation machinery operates but also directly mediates the fundamental chemical reaction of this process: peptide bond formation, an intrinsic property of the large ribosomal subunit (54). Moreover, X-ray crystallographic structures of the ribosome (e.g., ref. 55) in combination with functional assessment have strongly supported what had long been suspected: that the ribosome is a ribozyme, at least insofar as peptide bond formation is concerned (56-61).

The view that the ribosome is fundamentally an RNA machine is consistent with early suggestions that the primordial ribosome consisted solely of RNA (62-67). How the ribosome evolved has been the subject of much discussion and speculation (e.g., 68-73), but there is general consensus that ribosome evolution occurred in a modular fashion, with suggestions that the primordial ribosome comprised a collection of small, noncovalently interacting RNAs (74, 75). In this scenario, ribosomal proteins that now contribute to ribosome structure and dynamics, as well as to the accuracy and precision of translation, are considered to be later additions.

In the evolutionary transition from RNA to RNP, typically envisaged as the progressive addition of proteins to the rRNA core, it is generally assumed that each new interaction was selected to have a positive effect on protein synthesis. Addition of new ribosomal proteins (69, 76) would increasingly depend on protein–protein interactions with ribosomal proteins acquired earlier. Today, the stepwise formation of ribosomal subunits in vitro from their constituent rRNAs and proteins recapitulates one aspect of this evolutionary pathway: assembly is initiated by the binding of several ribosomal proteins directly to the rRNA, with subsequent maturation involving addition of the remaining proteins in a stepwise fashion, dependent on the prior binding of partner ribosomal proteins (73). What is less clear, however, is that the addition of all or even most new proteins was favored by selection, or that the core function of protein synthesis was improved by such a great increase in complexity. Indeed, the same principles that we argue to explain the origin of phylogenetically restricted and ostensibly selfish processes such as splicing and editing can just as easily be applied to the conversion of the ribosome from an RNA to a large RNP complex.

A CNE origin of the ribosome would progress much as described above, except that the initial interaction between the rRNA and some or many of the RNA-binding proteins is fortuitous; but, once bound, these proteins presuppress subsequent mutations in the rRNA that ultimately make the binding essential for function. A clear difference between the ribosome and examples such as splicing or editing is that the process took place much earlier in evolution. Nevertheless, we may ask whether such an ancient and critical machine can still be affected by CNE.
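The ratchet-like character of this scenario can be illustrated with a toy simulation. The sketch below is not a model taken from the cited literature. It assumes, purely for illustration, an rRNA with a fixed number of hypothetical sites (`n_sites`), each of which can carry a mutation that is tolerated only because the fortuitously bound protein is present. One randomly chosen site changes state per step, every change is treated as neutral, and the function name `simulate_dependence` and all parameter values are illustrative assumptions.

```python
import random


def simulate_dependence(n_sites=20, steps=10_000, seed=0):
    """Toy model of presuppression (an Ehrenfest-style urn, assumed here).

    Each site is either unmutated (False) or carries a mutation that is
    tolerated only while the partner protein is bound (True). At every
    step one random site flips state. The rRNA is independent of the
    protein only when no site carries a dependency-creating mutation.

    Returns the final number of dependent sites and the fraction of
    steps at which the rRNA was fully independent.
    """
    rng = random.Random(seed)
    sites = [False] * n_sites
    n_dependent = 0
    independent_steps = 0
    for _ in range(steps):
        i = rng.randrange(n_sites)      # pick the site that mutates
        sites[i] = not sites[i]         # gain or revert a dependency
        n_dependent += 1 if sites[i] else -1
        if n_dependent == 0:            # all dependencies reverted
            independent_steps += 1
    return n_dependent, independent_steps / steps


if __name__ == "__main__":
    for n in (1, 5, 20):
        final, frac = simulate_dependence(n_sites=n)
        print(f"sites={n:2d}  dependent at end={final:2d}  "
              f"fraction of steps independent={frac:.4f}")
```

With a single site, independence is regained at every second step. As the number of ways in which dependence can arise grows, a return to full independence requires every dependency to be reverted at once, so the simulated rRNA spends almost no time in the independent state even though no step is favored by selection.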

Not surprisingly given the ribosome's essential role in cellular metabolism, its structure and components are tightly conserved in evolution. In bacteria, the prototypical (e.g., E. coli) ribosome contains three RNA species of ∼2,900, ∼120, and ∼1,540 nucleotides, plus 55 different ribosomal proteins (Fig. 3). In contrast, the human cytoplasmic ribosome contains four rRNA species of ∼4,800, ∼160, ∼120, and ∼1,900 nucleotides, plus 79 proteins (Fig. 3). Although there are variations on this general theme both within and between domains, the degree of conservation of the various components, both rRNA and protein, is striking. Overall, the impression one gets is that of an exquisitely tuned, evolutionarily static machine whose interacting RNA and protein components are locked into place by rigid functional constraints underlying the ribosome's fundamental role in translation, as well as the numerous extraribosomal functions performed by ribosomal proteins (76, 77). If CNE once played a role in the evolution of eukaryotic cytoplasmic ribosomes, it would not appear to be doing so any longer.

Figure 3. Ribosome complexity in bacteria and eukaryotes. The cartoon on the left summarizes the complexity of the ribosome of Escherichia coli; the cartoons on the right summarize that of the human cytoplasmic and mitochondrial ribosomes. In each case, the number of proteins comprising the small and large ribosomal subunits is provided, as is the approximate size and number of ribosomal RNA (rRNA) species and the number of messenger RNAs (mRNAs) translated.

When one looks at the ribosomes of mitochondria, however, what emerges is an entirely different picture, one of extraordinary evolutionary plasticity. In keeping with their endosymbiotic origin, mitochondrial ribosomes in some species have strikingly bacteria-like compositions. However, in other lineages, drastic changes to rRNA size and structure, as well as protein composition, have occurred (75). Most relevant here are cases where a marked reduction in the size of rRNA components has occurred concomitantly with a substantial increase in ribosomal protein complexity. For example, the human mitochondrial ribosome contains rRNA species that are about half the size of their bacterial counterparts, but the number of proteins has increased in both subunits to a complexity closer to that of cytoplasmic ribosomes (Fig. 3) (78). Clearly, the human mitochondrial ribosome has lost substantial RNA and gained substantial protein in the course of its evolution from a bacterial progenitor, reversing the usual protein:RNA ratio (33:67) to become protein-rich (69:31) (79). An even more extreme situation is seen in the kinetoplastids (80, 81). Here, rRNA shrinkage has resulted in Trypanosoma mitochondrial rRNAs of only 610 and 1,150 nucleotides, with additional proteins among a total of 133 (vs. 55 in E. coli) evidently compensating for this loss. Notably, the novel mitoribosomal proteins do not have detectable homologs outside of the kinetoplastids, and show only a low degree of conservation and/or divergent function within this lineage.

This process appears to have been accompanied by a substantial remodeling of ribosome structure. In the human mitochondrial ribosome, many proteins occupy new positions, and intersubunit bridges consist mainly of protein rather than RNA (82). Especially notable is the absence of 5S rRNA in the large subunit of the mammalian mitoribosome; instead, proteins occupy the site where this RNA species normally sits, suggesting that a protein element may assume some of the roles of 5S rRNA (82). An even more extreme situation developed in the RNA-poor mitoribosome of kinetoplastid flagellates, which is more porous than other known ribosomes and where functionally conserved sites, such as the mRNA channel, the transfer RNA passage, and the exit site for nascent polypeptides are occupied by newly acquired ribosomal proteins rather than familiar ones (80).

In short, a CNE scenario can be used to rationalize not only the emergence of the ribosome as an RNP per se but also its peculiar “degeneration” in certain systems, notably mitochondrial, where constraints on ribosome function are presumably limited only to synthesizing a very small number of proteins. Additional aspects of the translation system may also have emerged via a CNE pathway. In considering the transition from nonencoded to encoded protein synthesis, Bernhardt and Tate (83) saw proto-mRNAs “as appearing first simply as serendipitous binding partners, forming complementary base-pair interactions with the anticodon loops of tRNA pairs” (see also ref. 73). This scenario fits perfectly within the CNE rubric.

CONCLUDING REMARKS

We have described how CNE could underlie the neutral origin of complex machines in early or recent evolution, and affect selfish systems or those central to cell function. In doing so, however, we have restricted our application of CNE to RNP assemblies and more specifically to illustrating how proteins may progressively assume the role of RNAs, resulting in ever more complex machines. This focus is not because we imagine that CNE is restricted to RNP machines, but rather because the distribution of function between RNA and protein serves as a particularly clear illustration of how such a process might work. Indeed, the same process could have produced not only other RNP assemblies (e.g., RNase P or snoRNPs) but also any other kind of macromolecular complexity. In particular, consideration of protein–protein interactions exposes a vast array of intricate cellular processes to a new way of thinking about how they might have originated as a multitude of drunkards walking through complexity space. As we pointed out previously (1), machines of marvelous complexity such as light-harvesting antennae in photosynthesis, RNA and DNA polymerases and their attending initiation, elongation, and termination complexes, apparatuses for import, folding, and degradation of proteins, or the cytoskeleton and its motors, all might have grown to their current form through a process of CNE accretion. The same argument could apply to large and complex regulatory networks, which are often described as being “finely tuned” but might be better interpreted as “runaway bureaucracy” or biological Rube Goldberg machines (84) where what could be a relatively simple task is performed through many steps by an unnecessarily complex machine.
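The image of drunkards walking through complexity space can be made concrete with a small simulation. The sketch below is an illustrative assumption rather than an analysis from this review: each hypothetical lineage starts at a minimal complexity (`floor`), gains or loses one component with equal probability at each step, and cannot fall below the floor. The function name `drunkards_walk` and all parameter values are invented for the example.

```python
import random


def drunkards_walk(n_lineages=1000, steps=400, floor=1, seed=1):
    """Unbiased random walk in 'complexity space' with a lower wall.

    Every lineage starts at the minimal complexity `floor`. At each
    step it gains or loses one component with equal probability, but
    a loss that would take it below `floor` leaves it at the floor.
    Returns the final complexity of every lineage.
    """
    rng = random.Random(seed)
    complexity = [floor] * n_lineages
    for _ in range(steps):
        for i in range(n_lineages):
            step = rng.choice((-1, 1))                  # no bias toward gain
            complexity[i] = max(floor, complexity[i] + step)
    return complexity


if __name__ == "__main__":
    final = drunkards_walk()
    mean = sum(final) / len(final)
    print(f"minimum={min(final)}  mean={mean:.1f}  maximum={max(final)}")
```

Although gains and losses are equally likely in this toy model, the wall at minimal complexity means that the average and maximum complexity across lineages rise over time while the minimum stays put, with no selection for complexity required.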

It is also worth noting that it is often difficult to distinguish definitively between CNE and adaptation with regard to the origin of any given example of complexity. However, there are three ways in which we can immediately simplify the problem. First, it is of critical importance to distinguish between the origin of a process and its current role. A system that originated by neutral means through CNE could later acquire an additional beneficial activity, even though it did not evolve to perform that function. For example, some spliceosomal introns are known to play a role in gene regulation, but this current function in no way implies that introns in general evolved for that specific purpose. Second, it is important to consider the order of events more carefully when reconstructing the origin of a molecular machine: commonly articulated scenarios with a “problem” leading to the evolution of the “solution” require deleterious intermediates, whereas the fortuitous existence of the “solution” allowing the “problem” to originate and spread does not require such improbable conditions. Third, CNE and adaptation are not mutually exclusive for a given process once that process reaches a substantial level of complexity—it is likely that both play a role in the origin of the most complex systems. Some proteins likely were added to the ribosome as a result of selection, but this does not mean complexity is itself adaptive. Indeed, because CNE is a ratchet-like process that does not require positive selection, it will inevitably occur in self-replicating, error-prone systems exhibiting sufficient diversity, unless some factor prevents it. 
Development of in vitro experimental systems with which to test CNE will be an important step forward in distinguishing complex biology that arose due to adaptation from nonadaptive complexity. Such systems would form part of a larger effort to understand the interplay between neutral and adaptive evolution, an effort exemplified by the intriguing long-term evolution experiments of Lenski and coworkers (85). At present, molecular biologists look to adaptation almost exclusively to explain even the most irremediable complexity. We submit that this view is too narrow; in fact, the onus should be on us to first exclude the inevitable, nonadaptive drunkard's walk.

Acknowledgements

The authors are members of the Program in Integrated Microbial Biodiversity of the Canadian Institute for Advanced Research (CIFAR), whose financial support is gratefully acknowledged. This work was supported by operating grants from the Canadian Institutes of Health Research (CIHR) to MWG (MOP-4124), JMA (ROP-85016), PJK (MOP-42517), and WFD (MOP-4467), and awards from the Ministry of Education of the Czech Republic (LC07032, 2B06129, and 6007665801) and the Praemium Academiae to JL. JMA holds a CIHR New Investigator Award and PJK is Senior Scholar of the Michael Smith Foundation for Health Research. They thank Arlin Stoltzfus for comments on an earlier version of this manuscript.