In 1958, Francis Crick formulated the sequence hypothesis and the central dogma of molecular biology.1 The sequence hypothesis was formulated as follows: “The specificity of a piece of nucleic acid is expressed solely by the sequence of its bases, and this sequence is a (simple) code for the amino acid sequence of a particular protein.” The central dogma was formulated as follows: “This states that once ‘information’ has passed into protein, it cannot get out again. In more detail, the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but the transfer from protein to protein, or from protein to nucleic acid is impossible. Information means here the precise determination of sequence, either of bases in nucleic acids or amino acids in the protein.”1
In 1970, F. Crick suggested a diagram that described the central dogma of molecular biology (Fig. 1).2 According to F. Crick,2 the arrows in the diagram represent “the directional flow of detailed, residue-by-residue, sequence information from one polymer molecule to another”. “Solid arrows show general transfers; dotted arrows show special transfers.” However, as stated by F. Crick, “it says nothing about what machinery of transfers is made of,” and “it says nothing about control mechanisms.” Since then, the deciphering of the machinery of information transfer and control mechanisms has become the agenda of molecular biology.
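The classification of transfers that the diagram encodes can be written down as a small lookup table. The following is a minimal Python sketch and not part of the original work. Since Figure 1 is not reproduced here, the assignment of each transfer to a class is an assumption that follows the usual reading of Crick's 1970 diagram, and the names `TRANSFERS` and `transfer_class` are hypothetical.

```python
from itertools import product

POLYMERS = ("DNA", "RNA", "protein")

# Solid arrows are "general" transfers, dotted arrows are "special" transfers.
# Assumption: this assignment follows the usual reading of Crick's 1970 diagram.
TRANSFERS = {
    ("DNA", "DNA"): "general",
    ("DNA", "RNA"): "general",
    ("RNA", "protein"): "general",
    ("RNA", "RNA"): "special",
    ("RNA", "DNA"): "special",
    ("DNA", "protein"): "special",
}


def transfer_class(source: str, target: str) -> str:
    """Return 'general', 'special', or 'prohibited' for a sequence-information transfer."""
    return TRANSFERS.get((source, target), "prohibited")


# Every ordered pair of polymers without an arrow is a transfer the diagram omits.
prohibited = sorted(
    pair for pair in product(POLYMERS, repeat=2)
    if transfer_class(*pair) == "prohibited"
)

if __name__ == "__main__":
    for source, target in prohibited:
        print(f"{source} -> {target}: prohibited")
```

In this sketch, the omitted arrows are exactly the three transfers that start from protein, which matches the statement that information cannot get out of protein once it has passed into it.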
Important results of this work include the theory of the regulatory genome and gene regulatory networks,3 the realization that regulatory complexity increased during the evolution of multicellular organisms,4,5 and the understanding of the self-organizing principles of genome organization and function.6 However, how genome structure correlates with function is still considered an open question.7 Furthermore, the validity of the sequence hypothesis has recently been challenged based on an assessment of the peptide-to-protein folding process and on the public health concerns raised by the genomic approach to health and disease.8 The author of the present study adds that we are even further from understanding how new functions and morphological novelties originate during progressive evolution, and what role gene expression plays in this process. The present study is devoted to this problem. It appears that the fundamentals of molecular biology need to be revised, and that new paradigms are needed to address unresolved biological and medical problems.
The author of the present study is working on a new theory of carcino-evo-devo, or the theory of the evolutionary role of hereditary tumors.9 According to the main hypothesis of the carcino-evo-devo theory, hereditary tumors contribute to the emergence of new cell types, tissues, and organs by providing extra cell masses for the expression of evolutionarily novel genes and gene combinations. A special chapter of the carcino-evo-devo theory is devoted to carcino-evo-devo diagrams that describe the main postulates of the theory.
Three diagrams from the author's previous articles are used in the present study: the multilevel reciprocity diagram (Fig. 2)10 and the two carcino-evo-devo diagrams (Fig. 3).11 In Figure 2, the processes of evolution at different levels of structural organization are interconnected by feedback arrows; that is, they are mutually interdependent. The multilevel reciprocity diagram was devised to explain the neutralization of competitive interrelations between genes that emerged during genome evolution at the cellular and multicellular levels. The diagrams that describe the evolution of development (evo-devo) according to the carcino-evo-devo theory are presented in Figure 3.
According to the carcino-evo-devo theory, tumor-bearing organisms (Carcino) represent the transitional forms in the evolution of development. Normal ontogenies cannot directly participate in progressive evolution. This is prohibited by the carcino-evo-devo diagram (the lack of Devo→Evo arrows). The unfolded carcino-evo-devo diagram describes the four successive steps in the progressive evolution of development (marked by colored arrows). A more detailed description of the carcino-evo-devo diagrams is given by Kozlov.11
In the present study, the author suggests a modification of the central dogma diagram to describe the evolution of gene expression, draws a cellular diagram to describe the origin of new cell types, and uses these together with Figures 2 and 3 to construct a formula that describes the evolution of gene expression, the origin of new cell types, evo-devo, and the complexity growth during progressive evolution.
The diagram construction was based on the ideas of the central dogma and the carcino-evo-devo theory, and on the analysis of empirical data and essential biological connections at and among the macromolecular, cellular and multicellular levels of structural organization.
Modification of the central dogma diagram: the diagram that describes the evolution of gene expression
The striking similarity between Figures 1 and 3a suggested that the central dogma diagram can be unfolded upwards along the evolutionary scale, in a manner similar to that presented in Figure 3b. The author notes that the dotted arrow RNA→DNA in Figure 1 may describe the origin of evolutionarily new genes by retroposition. Thus, the unfolded central dogma diagram would indeed describe the genome, transcriptome and proteome evolution, and may be constructed in the following manner (Fig. 4).
Similar to Figure 3b, which describes the four successive steps in the evolution of development (evo-devo), Figure 4 describes the four successive steps (marked by colored arrows) in genome evolution, and the evolution of gene expression. Each successive step in the genome evolution in Figure 4 may involve a number of evolutionarily novel genes that participate in the origin of new differentiated cell types and morphological novelties.
Construction of the formula for the coevolution of gene expression and evo-devo
Since Figures 3b and 4 describe the four successive steps in the evolution of gene expression and evo-devo, the author attempted to construct a single formula that describes the coevolution of these processes. The processes described in Figure 4 belong to the macromolecular level of organization (MML). The processes described in Figure 3b belong to the multicellular level of organization (MCL). Figures 3b and 4 can be connected to Figure 2 using the mathematical subset symbol ⊂ to obtain the formula for the coevolution of gene expression and evo-devo at two levels (MML and MCL) in the course of progressive evolution (Fig. 5).
Thus, Figure 5 describes the four consecutive steps in the coevolution of gene expression and evo-devo. The evolution of development (evo-devo) resulted in evolutionary innovations and morphological novelties, which constitute the complexity growth.
Cellular diagrams
Figure 5 suggests that one diagram is missing, namely the cellular diagram. The author constructed similar triangle diagrams for the cellular level of organization (Fig. 6). Stem/embryonal cells are cells that participate in normal development, differentiated cells are terminally differentiated cells, and tumor cells include cancer stem cells and the hierarchy of parenchymal cells at different stages of (abnormal) differentiation. The arrows in Figure 6 represent the possible transitions among stem/embryonal, tumor and differentiated cells. The embryonal → differentiated arrow represents the main route of development towards terminally differentiated cells. The embryonal → tumor arrow designates the possibility of tumor origin from embryonal cells.12 The tumor → differentiated arrow represents the capability of tumor cells to differentiate with the loss of malignancy.13 The tumor → embryonal arrow represents the main postulate of the carcino-evo-devo theory, that is, the possibility that hereditary tumor cells become an integral part of normal development after the expression of evolutionarily novel genes and gene combinations and the acquisition of new functions.11,13 The curved arrows represent the capability of stem cells and cancer stem cells to replicate. The absence of arrows from terminally differentiated cells means that these have very little (or no) capacity to replicate, generate tumors, or de-differentiate (although the author is aware of the reprogramming and trans-differentiation experiments, and that a few examples exist in nature).14–17 Thus, the omitted arrows represent the prohibited transitions.
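The transitions listed above can be read as a small directed graph in which the prohibited transitions are the missing edges. The following Python sketch illustrates this reading only. It encodes the arrows as they are described in the text rather than from Figure 6a itself, and the names `ALLOWED`, `is_allowed` and `prohibited` are hypothetical.

```python
CELL_STATES = ("embryonal", "tumor", "differentiated")

# Arrows of the cellular diagram, taken from the textual description of Fig. 6a.
ALLOWED = {
    ("embryonal", "embryonal"),       # curved arrow: stem cells replicate
    ("embryonal", "differentiated"),  # main route of development
    ("embryonal", "tumor"),           # tumor origin from embryonal cells
    ("tumor", "tumor"),               # curved arrow: cancer stem cells replicate
    ("tumor", "differentiated"),      # differentiation with loss of malignancy
    ("tumor", "embryonal"),           # main postulate of the carcino-evo-devo theory
}


def is_allowed(source: str, target: str) -> bool:
    """True if the diagram contains an arrow from source to target."""
    return (source, target) in ALLOWED


# Prohibited transitions are all ordered pairs for which no arrow is drawn.
prohibited = [
    (source, target)
    for source in CELL_STATES
    for target in CELL_STATES
    if not is_allowed(source, target)
]

if __name__ == "__main__":
    for source, target in prohibited:
        print(f"{source} -> {target}: prohibited")
```

Under these assumptions, every missing edge starts from the differentiated state, which corresponds to the statement that terminally differentiated cells do not replicate, de-differentiate, or generate tumors.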
In summary, Figure 6a describes the possible transitions among stem/embryonal, tumor and differentiated cells, and Figure 6b describes the origin of new differentiated cell types, and the incorporation of hereditary tumor cells into normal development in the course of progressive evolution. The number of differentiated cell types in multicellular organisms is considered a measure of complexity.13
The complete unfolded formula for the evolution of gene expression, the origin of new cell types, evo-devo, and the multilevel complexity growth in progressive evolution
Using Figure 5 and the unfolded Figure 6b, the complete unfolded formula for the evolution of gene expression, the origin of new differentiated cell types, evo-devo, and the multilevel complexity growth in progressive evolution was constructed (Fig. 7). Figure 7 describes the four consecutive steps in the evolution of gene expression, the origin of new differentiated cell types, evo-devo, and the multilevel complexity growth in progressive evolution. For simplicity, Figure 7 may be drawn in a regular folded form (Fig. 8).
Evolution of gene expression as part of a multilevel evolutionary process
In previous publications, the author investigated the evolution of gene expression against the background of genome evolution and the increase in gene number in the genomes of evolving organisms.10,18 A detailed analysis of competitive interrelations between genes suggested that, as the gene number increased in evolving genomes, gene competition intensified and antagonistic relations between genes appeared. The evolutionary process would result in the neutralization of incompatibility between genes through the disconnection of incompatible gene products at both the cellular and multicellular levels.10,18 The functional organization of the genome into groups of compatible genes, with antagonistic relations between at least some genes of different compatibility groups, would follow from such coevolution.10 Each transcriptome or proteome in Figure 4 corresponds to the result of the expression of compatible genes. Different transcriptomes (transcriptome 1, transcriptome 2, etc.) or proteomes (proteome 1, proteome 2, etc.) in Figure 4 may include the products of mutually incompatible genes.
Figure 2 presents the relationship between evolutionary processes at different levels, and the reciprocity between them. In the figure, the processes of evolution at different levels of structural organization are interconnected by feedback arrows; that is, they are mutually interdependent.10
The role of tumors in the evolution of gene expression
In a series of publications that followed, the author formulated the hypothesis of evolution by tumor neofunctionalization.13 According to this hypothesis, hereditary tumors provide additional cell masses for the expression of evolutionarily novel genes (which originate in the DNA of germ cells, but not in tumor cells) and gene combinations that may lead to the origin of new cell types, tissues, and organs in evolution.
The hypothesis of evolution by tumor neofunctionalization was the first to consider genome evolution and the evolution of gene expression in interaction with the processes of evolution at the cellular and multicellular levels, and the role of tumors in this interaction. This hypothesis explains what previously formulated hypotheses (evolution by gene duplication and the genetic theory of morphological evolution)19,20 could not: the source of cells in which evolutionarily novel genes or gene combinations are expressed, and which thereby participate in the origin of new cell types, tissues and organs.
The role of tumors in the evolution of gene expression was confirmed through the discovery of a new class of tumor-specifically expressed, evolutionarily novel (TSEEN) genes in the laboratory of the author. TSEEN genes are expressed in a broad range of tumors, but are not expressed or weakly expressed in normal cells, including embryonic and stem cells.21,22 The author proposed that TSEEN genes can be considered as a new superclass of novel and evolving genes that are only expressed in tumors. There are several classes and families of TSEEN genes that include TSEEN genes from different phyla of organisms.22
Carcino-evo-devo theory
With the accumulation of supporting evidence and experimental data from the author’s laboratory and other laboratories, the hypothesis of evolution by tumor neofunctionalization grew into the theory of the evolutionary role of hereditary tumors. The content of the theory was published in a series of theoretical articles by the author,9 and as a monograph in English,13 Russian,23 and Chinese.24
The major postulates of the carcino-evo-devo theory are as follows: (1) tumors participate in the evolution of development; (2) hereditary tumors provide evolving multicellular organisms with extra cell masses for the expression of evolutionarily novel genes and gene combinations, and thereby participate in the origin of new cell types, tissues and organs; (3) populations of tumor-bearing organisms serve as transitional forms in progressive evolution; (4) tumors may be considered as search engines for new gene combinations and morphological novelties in the space of biological possibilities.9
Several non-trivial predictions of the carcino-evo-devo theory have been confirmed in the author’s laboratory.22 Furthermore, several non-trivial explanations provided by the carcino-evo-devo theory have been discussed, and the relationship between the carcino-evo-devo theory and other biological theories was examined.9 The demonstration that human orthologs of fish TSEEN genes acquire progressive functions not encountered in fish was a direct confirmation of the carcino-evo-devo theory.25
The carcino-evo-devo theory has its own structure, which consists of several chapters that describe the different aspects of the theory. One of the chapters is devoted to carcino-evo-devo diagrams.
Carcino-evo-devo diagrams
The carcino-evo-devo diagram (Fig. 3a) and its unfolded form (Fig. 3b) were suggested in an earlier study to describe the major postulates of the carcino-evo-devo theory.11 The carcino-evo-devo diagram (Fig. 3a) originated independently of the central dogma diagram (Fig. 1), and describes different processes. However, the diagrams have striking similarities. Both diagrams are triangular, and both contain important prohibitions, expressed as arrows omitted from the diagrams (refer to the discussion on prohibitions below). The unfolded carcino-evo-devo diagram (Fig. 3b) suggested the idea of unfolding the central dogma diagram (Fig. 1) into Figure 4. Other carcino-evo-devo diagrams that describe evolutionarily novel tumor-like organs have been published.26
Formula for the evolution of gene expression, the origin of new cell types, evo-devo, and complexity growth in progressive evolution
The formula in Figure 7 describes the evolution of gene expression, the origin of new cell types, evo-devo, and multilevel complexity growth in the course of progressive biological evolution. It consists of four diagrams (Figs. 2, 3b, 4, and 6b), which are connected using the subset symbol ⊂ oriented in accordance with the diagram structure. Figure 4 describes the evolution of gene expression as related to genome evolution. Figure 6b describes the origin of new differentiated cell types in progressive evolution. Figure 3b describes the evolution of development (evo-devo) with hereditary tumor-bearing organisms as transitory forms (carcino-evo-devo). Figure 2 connects Figures 3b, 4 and 6b, and describes the relative independence and reciprocity of the processes of progressive evolutionary development at different levels of structural organization. To the author's knowledge, the formula in Figure 7 is the first to present a compact description of the progressive evolution of living organisms at three structural levels of organization.
The initial regular diagrams (Figs. 1, 3a, and 6a) are triangular diagrams. Figure 2 may also be drawn as a triangle. All diagrams discussed in the present study contain biologically important prohibitions. The similar features of the diagrams that Figure 7 comprises may open new insights into the fundamental properties of life.
Figure 7 describes the four steps of progressive evolution, which may be extended to include the subsequent steps of complexity growth. The last arrow (Devo 5 →) in the figure points to the future steps in the evolution of development. The formula in Figure 7 condenses a considerable amount of knowledge. It describes the basic statements of the carcino-evo-devo theory, and the biologically significant coincidences of relatively independent events at various levels (proteome 2, differentiated 2, and Devo 2; proteome 3, differentiated 3, and Devo 3; proteome 4, differentiated 4, and Devo 4) frozen in progressive evolution (Evo 1, Evo 2, Evo 3, and Evo 4) by functional feedbacks and natural selection. The frozen coincidences constitute the appearance of new progressive forms in evolution, with the expression of evolutionarily novel genes and gene combinations, new differentiated cell types, and morphological innovations. The populations of tumor-bearing organisms (Carcino 1, Carcino 2, Carcino 3, and Carcino 4) serve as the intermediate transitory forms.
The relationship with Darwinism and other evolutionary theories
The hopeful coincidences of the steps in the diagrams of Figure 7 are subject to natural selection. That is, the carcino-evo-devo theory and Figure 7 do not contradict Darwinism. The carcino-evo-devo theory is complementary to Darwinism, because it introduces a mechanism of complexity growth in progressive evolution, namely tumor neofunctionalization.9
Furthermore, the coincidences of the steps in the diagrams of Figure 7 may correspond to the “punctuated equilibrium” model of evolution.27
The relationship between the carcino-evo-devo theory and other evolutionary theories was discussed in more detail in another article.9
Biocomputational processes, compatibility search, and biologically significant multilevel coincidences
The author considered biological computation and compatibility search in the possibility space as mechanisms of complexity growth during progressive evolution.28 The search for biologically significant multilevel coincidences may be realized through biological computational processes, with tumors as the search engines.28 The starting point of the formula in Figure 7 is genome 1, which corresponds to the central role of DNA computation in the space of unrealized biological possibilities.28
The complexity growth in progressive evolution is a multilevel process
The consecutive steps of evolution (proteome 1, proteome 2, proteome 3, and proteome 4; differentiated 1, differentiated 2, differentiated 3, and differentiated 4; Devo 1, Devo 2, Devo 3, Devo 4, and Devo 5) describe the increase in complexity at each level. That is, the complexity growth in the progressive evolution of organisms is a multilevel process. Some of the quantitative characteristics of this process are known: 411 differentiated cell types have been described in humans,29 and 479 morphological characters have been scored as an index of vertebrate morphological complexity.30 These quantitative characteristics may be used in computer science to model the processes of macroevolution and its interaction with the evolution of gene expression.
Prohibitions
As Karl Popper stated,31 “Every good theory is a prohibition: it forbids certain things to happen. The more a theory forbids, the better it is.” Information flow from proteins is forbidden in the diagrams in Figures 1 and 4. In Figure 2, MML cannot directly interact with MCL, but only through CL. In Figure 3, normal development (Devo) cannot directly participate in progressive evolution (Evo), but only with the aid of an intermediary transitory form (Carcino). In Figure 6, terminally differentiated cells cannot replicate, de-differentiate, or generate tumors.
Figure 7 contains further prohibitions. It shows the prohibition of the simultaneous expression of all genes that constitute the genome, especially the simultaneous expression of incompatible genes that originate during genome evolution, in a differentiated cell type. Thus, Figure 7 describes the rise and evolution of differential gene expression, which is connected with genome evolution, the origin of new differentiated cell types, and the evolution of development. The prohibitions described in Figure 7 may be ranked among the fundamental properties of life.
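As an illustration of how such a prohibition constrains interactions, the Figure 2 statement that MML interacts with MCL only through CL can be sketched as a three-node graph. This is a minimal Python sketch based solely on the sentence above, not on the figure itself. The names `LINKS` and `shortest_path` are hypothetical.

```python
from collections import deque
from typing import List, Optional

# Feedback arrows between levels as stated in the text: MML <-> CL <-> MCL.
# The direct MML <-> MCL link is deliberately absent (the prohibition).
LINKS = {
    "MML": {"CL"},
    "CL": {"MML", "MCL"},
    "MCL": {"CL"},
}


def shortest_path(start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search over the level graph; returns the levels traversed."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(LINKS[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no route exists between the two levels


if __name__ == "__main__":
    print(shortest_path("MML", "MCL"))
```

In this sketch, any route from the macromolecular to the multicellular level necessarily passes through the cellular level, which is what the omitted arrow expresses.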
The use of diagrams in biology and in other branches of science
Diagrams are widely used in biology, such as diagrams of metabolic pathways, signaling pathways, immunological pathways, and other biological pathways and networks. In these diagrams, arrows are usually used to connect different molecular structures. The central dogma describes the information flow between different types of macromolecules within the same level of organization.
The arrows in Figures 2 and 3 connect entities that belong to different levels of organization, or describe qualitatively different biological processes. In the author's earlier work, arrows were used to describe the interactions between different levels of organization,10 the evolution of development,11 and the origin of evolutionarily novel tumor-like organs.26 These descriptions differ in kind from previous diagrammatic biological descriptions.
Unfolding the central dogma diagram, which has been unaltered for more than 50 years, along the evolutionary scale yields a description of the evolution of the genome, transcriptome and proteome (Fig. 4), which demonstrates the viability of the present diagrammatic approach.
Arrows (“morphisms”) are used in mathematical category theory, where they can connect “objects” of different natures. Category theory is presently being used to build new theoretical physics, and it also seeks examples from other sciences that may be formalized within its framework. The author considers that the present study suggests that category theory may participate in building new chapters of theoretical biology.
The diagrams and formulas obtained in the present study may be used in computer science to model the processes of macroevolution.
Diagrams and empirical data
The diagram of the central dogma of molecular biology, published in 1970,2 was based on the previous analysis of a vast amount of empirical data on protein synthesis published by F. Crick in 1958.1 The same is true for the carcino-evo-devo diagrams. First introduced in 2019,11 the carcino-evo-devo diagrams were based on the theory that summarized a vast amount of empirical evidence obtained from several fields of biology. More than one thousand references were cited in the monograph published by the author in 2014.13 The cellular diagrams (Fig. 6) in the present study are based on the empirical data described in the references.11–17 Thus, the diagrams in the present study represent the generalization of empirical data, and reflect the actual biological processes. Category theory uses a similar approach.
The formula in Figure 7 possesses predictive power. It predicts that at least 411 transcriptomes are necessary to determine the development of existing mammalian differentiated cell types. It also emphasizes the importance of studying the newly discovered class of TSEEN genes in different groups of organisms, including studies of epigenetic modifications, in order to understand the evolution of gene expression. The non-trivial predictions of the carcino-evo-devo theory and their experimental confirmation are discussed in more detail elsewhere.22
Although the central dogma says nothing about the machinery of information transfer and control mechanisms, it stimulated studies of the corresponding molecular mechanisms. Similarly, the carcino-evo-devo theory and its diagrams may stimulate studies on the molecular, cellular and multicellular mechanisms of the evolution of gene expression and complexity growth.
Where does the information unrelated to gene sequence come from?
The author intends to respond to an important question: where does the information that is unrelated to gene sequence come from?8 The answer suggested by Figure 7 is that it comes from higher levels of structural organization. A second significant issue, raised elsewhere, concerns the upward vs. downward transmission of information between genotypes and phenotypes.32 Based on the findings of the present study, the response is as follows: in Figure 7, there is reciprocity between the processes of evolutionary development at different levels of structural organization; that is, information is transferred in both directions.
In conclusion, the diagram of the central dogma unfolded upwards along the evolutionary scale describes the evolution of gene expression (Fig. 4). The formula that describes the evolution of gene expression, the origin of new cell types, evo-devo, and the multilevel complexity growth in the course of progressive biological evolution was obtained (Fig. 7). This formula addresses several fundamental questions about progressive evolution. Similar features of the diagrams in Figure 7 may open new insights into the fundamental characteristics of life. The coincidences of events at different levels described in Figure 7 may correspond to the punctuated equilibrium model of progressive evolution.
The diagrams and formulas obtained in the present study may stimulate new experimental studies of gene expression, particularly the newly discovered class of TSEEN genes. These diagrams and formulas may be formalized within the framework of the category theory, and used in computer science to model the interrelationship of gene expression with macroevolutionary processes.
The theory of carcino-evo-devo and the formula of complexity growth consider evolutionary, individual and neoplastic development at three levels of structural organization within one theoretical framework. This is the reason why the carcino-evo-devo theory has the potential to become a unifying biological theory.
Abbreviations
- Carcino: ontogenies with neoplastic development
- carcino-evo-devo: coevolution of hereditary tumors with normal development, the theory of the evolutionary role of hereditary tumors
- CL: cellular level of structural organization
- Devo: normal ontogenies
- evo-devo: evolution of development
- Evo: progressive evolution of ontogenies
- MML: macromolecular level of structural organization
- MCL: multicellular level of structural organization
- TSEEN genes: tumor-specifically expressed, evolutionarily novel genes
Declarations
Funding
This study was funded by the Ministry of Science and Higher Education of the Russian Federation, under the Strategic Academic Leadership Program “Priority 2030” (Agreement 075-15-2023-380 dated 20 February 2023).
Conflict of interest
APK has been an editorial board member of Gene Expression since June 2023. The author has no other conflict of interest related to this publication.