2. Evolvability of Directed Cycles through Junk Sequences
The analysis suggests that junk DNA extends the Kolmogorov complexity of programs that can be generated by the human genome through its effects on the flow of information from DNA to RNA. In humans, much of the junk arises from endogenous retroelements (EREs) that have contributed sequences to over 50% of the genome. The rewriting of genomic information by EREs occurs through the reverse transcription of the RNAs they transcribe into DNA. Although EREs were initially dismissed as genomic fluff, it is now appreciated that they are essential regulatory components of genes. Sequences derived from Alu retroelements, of which there are over a million copies in the human genome, can change both the splicing and polyadenylation of nascent RNA (for recent reviews, see [1][2]). The EREs involved alter the transcript produced, depending on the position of their insertion, by implementing simple programming rules to change how RNA is processed. Rather than all being hard-wired into the genome, the outcomes are soft-wired and conditional on context. The splicing cascade that programs the sex of flies indicates the potential complexity of these events [3].
Flipons also offer the opportunity to modulate their conformations to alter downstream events. The single-stranded regions they form in the DNA duplex as they flip from one conformation to another expose binding sites that allow the sequence-specific docking of small RNAs, especially those derived from the same family of repeats as the flipon. The binding of these small RNAs to flipons allows the targeting of the cellular machinery to these genomic locations to edit and modify the transcripts produced. The proteins involved can be generic and bind in a structure-specific manner. They need not specifically recognize any particular nucleic acid sequence. The assembly of these complexes is directed only by the sequence-specificity of the RNA. This design has a number of evolutionary advantages. Importantly, the RNA sequence space available to target flipons in a sequence-specific manner is much larger than for developing sequence-specific proteins, where problems with folding and loss of function constrain the span of possible protein variations.
Changes in the RNA space are also not all or nothing, so there is no loss of any adaptations that proved successful in the past. The altered processing just increases the number of isoforms produced. In contrast, protein variants abandon the previous versions. Similarly, the spread of flipons through the genome creates many possible ways to alter the local DNA conformation to generate variant transcripts. The digital nature of flipons enables many different combinations. It is unlikely that the genomes of any two cells are set identically. As a consequence, the selection of cells at the tissue level can enable the responses that are most adaptable to local stressors (recently reviewed [4]). Small RNAs that are transmitted through germ cells also have the potential to bootstrap embryonic development by modulating flipon conformation during the earliest embryonic stages [5]. These effects are likely modulated through the extraembryonic endoderm, which induces highly conserved programs within the rapidly dividing embryo and could possibly involve the reverse transcription of these small RNAs into the extraembryonic genome.
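The combinatorial point about digital flipons can be illustrated with a small calculation. The sketch below is only an illustration under simplifying assumptions that are not taken from the text: each flipon is treated as an independent two-state switch, and every flipon has the same probability of being flipped. The function names and parameter values are hypothetical.

```python
def flipon_states(n_flipons: int) -> int:
    """Number of distinct on/off settings for n two-state flipons."""
    return 2 ** n_flipons


def prob_identical(n_flipons: int, p_flipped: float = 0.5) -> float:
    """Chance that two cells share the same setting at every flipon.

    Assumes (for illustration only) that flipons switch independently
    and that each is flipped with the same probability p_flipped.
    """
    # Two cells agree at one site if both are flipped or both are not.
    per_site_match = p_flipped ** 2 + (1.0 - p_flipped) ** 2
    return per_site_match ** n_flipons


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, flipon_states(n), prob_identical(n))
```

Even under these crude assumptions, the number of settings doubles with each added flipon, so the chance that two cells match at every site falls off geometrically.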
2.1. Evolvability of Directed Cycles through Peptide Patches
There are other ways to evolve a directed cycle through junk DNA. The peptide patches I discussed earlier as part of the cell’s wetware can act as Velcro to pull proteins together to create new assemblies (Figure 2A). The output from one of the sequestered proteins then potentially acts as an input to another. Eventually, a self-sustaining cycle arises through a set of protein interactions that positively reinforce each other’s output. This strategy assumes that proteins are more multifunctional than is currently presented in textbooks. In reality, the patched-together proteins often contain multiple different domains. Though many domains have well-studied functions, others remain uncharacterized. With the patchwork design just described, peptides with no enzymatic function can create new opportunities to unmask proteins that have multiple personalities and are able to perform in unexpected ways. Frequently, experimentalists find the newly discovered properties of a well-characterized protein surprising. They then write papers entitled “Hidden protein functions and what they may teach us” [6] and “Protein moonlighting: what is it, and why is it important?” [7].
Figure 2. DCs as cellular building blocks. (A) Assembly of DCs into larger complexes through peptide patches, with the multiple connections between the input A and the output B increasing system robustness. (B) Interactions between directed cycles to produce hypercycles that are autocatalytic.
The new cycles established by patching proteins together may initially depend on inputs from the milieu to bridge any missing links. The Krebs cycle that cells depend upon to extract energy from sugars likely developed in such a way. At an early stage, the reactions depended on environmentally derived metals for catalysis. More efficient reactions arose when binding sites for metals were incorporated into genetically encoded proteins. Many of these strategies based on autocatalytic chemistries, along with the history of this field, have been reviewed [8]. Even today, some DCs still rely on environmentally derived factors to function. The dependency on these essential nutrients is so complete that, without them, certain DCs fail to regenerate. Humans, for example, do not synthesize vitamin C, even though other organisms solved this biochemical challenge long ago.
2.2. Evolvability of Directed Cycles through Hypercycles
The evolution of DCs can proceed through the organization of self-replicating molecules connected in a cyclic, autocatalytic manner, as originally proposed by Manfred Eigen [9] (Figure 2B). Due to the way they interact, the cycles are self-propagating, with each cycle forming a node coupled to a larger cycle (Figure 2). The interactions between different cycles allow them to amplify themselves, each other, and the hypercycle. The hypercycles further favor systems that store the information necessary to continuously regenerate themselves (Figure 2B). In the simplest form, the earliest steps in a pathway did all that was required to produce a particular output. Steps were added that closed the circuit, leading to the self-amplification of that particular cycle. The cycle underwent further elaboration by connecting to other cycles that further assured their mutual perpetuation (Figure 1, dΣc). The creation of genetic systems to transmit this information to subsequent generations was a natural consequence of hypercycle evolution.
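The mutual amplification described above can be sketched numerically. The example below is a minimal illustration, not the model used in this article: it assumes the textbook form of an elementary hypercycle in which the growth of member i is catalyzed by member i-1, with a dilution term that keeps the total concentration constant. The rate constants, starting shares, step size, and function names are all hypothetical choices made for the illustration.

```python
def hypercycle_step(x, k, dt):
    """Advance an elementary hypercycle by one explicit Euler step.

    x[i] is the share of replicator i and k[i] its catalytic rate.
    Replicator i is amplified by its predecessor x[i - 1]; the index
    -1 wraps around to the last member, which closes the cycle.
    """
    n = len(x)
    growth = [k[i] * x[i] * x[i - 1] for i in range(n)]
    phi = sum(growth)  # dilution flux that holds the total at 1
    new = [x[i] + dt * (growth[i] - x[i] * phi) for i in range(n)]
    total = sum(new)
    return [v / total for v in new]  # renormalize against numerical drift


def simulate(x0, k, dt=0.01, steps=20000):
    """Iterate the hypercycle from starting shares x0 and return the result."""
    x = list(x0)
    for _ in range(steps):
        x = hypercycle_step(x, k, dt)
    return x


if __name__ == "__main__":
    final = simulate([0.6, 0.3, 0.1], [1.0, 1.0, 1.0])
    print([round(v, 3) for v in final])
```

In this three-member sketch with equal rate constants, no member is lost: the initially rare replicator is pulled up by its catalyst, and the shares settle toward a state of coexistence rather than one member outcompeting the others.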
2.3. Evolvability of Directed Cycles through Genome Duplication
The rewriting of directed cycles in DNA during evolution can occur in many ways that differ from those Eigen imagined. There may be more complex processes involved. On occasion, genes may undergo duplication in ways that Susumu Ohno demonstrated were important during evolution [10]. Fortuitous mutations affecting the level of gene expression, the processing of transcripts, and the non-templated modification of proteins then alter the character of each duplicated gene. At some point, changes to one paralog or the other provide a selective advantage, leading to the creation of new DC variants.
Occasionally, whole genomes undergo duplication. Many plants have a history of expanding their genomes in this manner and are consequently highly polyploid. As a result, they have multiple copies of each gene [11]. The process allows DCs to be reconstituted in different ways or with different combinations to generate new elaborations. The process of genome duplication has also been observed in yeast following a sudden and adverse change in the environment [12]. The high mutation rates that accompany this process drive additional genomic diversity and the elaboration of DCs that enable their regeneration and the survival of progeny in the new environment.
2.4. Evolvability of Directed Cycles through Endosymbiosis
Another way to acquire all of the components necessary to make a new DC is simply by obtaining all of them in one step from another organism. With bacteria, this means gaining an entire operon where all of the genes required for the regulation, expression, and scripting of a cycle are organized into one DNA segment. These outcomes are enabled by bacterial conjugation, the prokaryotic version of sex first observed by Joshua Lederberg and Edward Tatum [13]. To do the same in eukaryotes would require a genomic organization similar to the operons of bacteria and a truly giant virus to transmit the much larger eukaryotic genes that embed all of the required information. It is now possible, using a variety of technologies, to introduce into cells large genomic assemblies with all of the genes required. The most extreme transplant of genes so far performed is the transfer of entire normal mitochondria to replace the defective ones transmitted to an embryo from a parent. Of course, the only reason that eukaryotes have mitochondria in the first place is that at one point in time, the whole set of DCs that another free-living organism had successfully evolved was subsumed to generate energy with available substrates. The most recent proponent of this idea was Lynn Margulis, who also noted that chloroplasts are endosymbiont cyanobacteria [14]. Even today, osteoclasts can source replacement mitochondria from osteomorphs to remain functional [15].
2.5. Evolvability of Directed Cycles through Bioengineering
Experimental approaches aimed at modifying DCs depend on first identifying the minimal set of components required for a DC to regenerate itself. Such studies can be performed in vitro by purifying each element and reconstituting a DC from these parts. These approaches helped elucidate many of the DCs, such as the Krebs cycle, involved in cell metabolism. These studies can also be performed using genetic approaches to identify the different DC components. Over the years, bacteria and yeast have proven particularly powerful in establishing many of the factors that modulate DCs in single cells.
Collectively, these approaches identify the proteins essential for regenerating DCs. The methods also uncover redundancies and scaffolds that enhance the performance and robustness of DCs (Figure 1, dΣb). Further, the results inform on which DC steps can be modulated therapeutically. Drugs to break DCs are part of the pharmacopeia positioned to kill cancer cells. The targeting approach yields valuable insights into the differences between normal and diseased cells. This work identifies multiple pathways between nodes in normal tissue and those that are no longer present in cancer cells. The vulnerability of tumors arises due to mutations that inactivate one or more of the redundant connections between nodes. The tumors are then susceptible to drugs that target the residual pathway. The drugs and mutations synergize to selectively kill the tumor while sparing normal cells.
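The logic of synthetic lethality described above can be made concrete with a toy network. The sketch below is a hypothetical illustration only: the node names (A, P1, P2, B), the choice of which link is lost to mutation, and the drug target are all invented for the example. It treats a DC segment as a directed graph with two redundant routes from A to B and asks whether B can still be reached after a mutation, a drug, or both remove links.

```python
from collections import deque


def reachable(edges, start, goal):
    """Breadth-first search: is there a directed path from start to goal?"""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical normal cell: two redundant routes from A to B.
NORMAL = frozenset({("A", "P1"), ("P1", "B"), ("A", "P2"), ("P2", "B")})
# Hypothetical tumor: a mutation has inactivated the route through P1.
TUMOR = NORMAL - {("A", "P1")}
# Hypothetical drug: blocks the last step of the route through P2.
DRUG_TARGET = ("P2", "B")


def survives(edges, drug_target=None):
    """A cell 'survives' in this toy model if A can still reach B."""
    remaining = edges - {drug_target} if drug_target else edges
    return reachable(remaining, "A", "B")


if __name__ == "__main__":
    print("normal + drug:", survives(NORMAL, DRUG_TARGET))
    print("tumor, no drug:", survives(TUMOR))
    print("tumor + drug:", survives(TUMOR, DRUG_TARGET))
```

In this toy model, neither the mutation nor the drug alone disconnects A from B; only their combination does, which is the selectivity the paragraph describes.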
Drugs that induce synthetic lethality in tumors are important in the clinic. In many cases, tumors are able to mutate and become resistant to most drugs that are used as single agents. The tumors then continue growing [16]. A drug cocktail that targets multiple DCs to induce synthetic lethality through different pathways is often needed to thwart the escape of cancer cells from eradication. The challenges to curing cancers despite the high-precision targeting of molecules underscore the overall resilience of DCs in cells. The intransitive programming based on DCs enhances their adaptability. Winning strategies just require the rewiring of the path between two nodes (Figure 1, dΣb).
Programming flipons with small RNAs offers therapeutic potential to alter DC function. The interventions can be used to prevent the expression of an essential DC component, to regulate its processing, or to recode the amino acids in key functional domains. There is also the possibility of rewiring connections in DCs to improve their design or to engineer new functions. The nature of DCs allows researchers to drive their evolution in cells and to bulk manufacture their outputs by cell culture.
The patchwork approach to generating new DCs also offers opportunities. Researchers do not know how far this strategy can be pushed to engineer new DCs. Experimentally, researchers could ask whether they can tag well-folded functional domains with interacting peptide patches to Velcro together new protein assemblies with defined properties. Can researchers then evolve a DC with a desired output (Figure 2A)? Or can researchers expose existing DCs to alternative chemistries to create completely new reaction schemes that have never before existed in nature? Already, DCs have been adapted to use synthetic chemicals in preference to their natural substrates. For example, Madeleine Bouzon and Philippe Marlière substituted 4-hydroxy-2-oxobutanoic acid for the amino acids serine and glycine as a carbon source for one particular metabolic pathway [17].
An underexplored area is the use of repeat-derived RNAs to build scaffolds. As shown by the assembly of the spliceosome, many proteins exist that bind to simple sequence motifs exposed on single-stranded RNAs. In principle, these motifs could be used in a combinatorial fashion to create novel RNA scaffolds on which to assemble existing proteins in a cell into new assemblies and then select for a phenotype of interest. The targeting of the cellular machinery to triplex-forming flipons by noncoding RNAs through this mechanism has been previously reviewed [1].