What is Code Biology?
Codes and conventions are the basis of our social life and from time immemorial have divided the world of culture from the world of nature. The rules of grammar, the laws of government, the precepts of religion, the value of money, the rules of chess etc., are all human conventions that are profoundly different from the laws of physics and chemistry, and this has led to the conclusion that there is an unbridgeable gap between nature and culture. Nature is governed by objective immutable laws, whereas culture is produced by the mutable conventions of the human mind.
In this millennia-old framework, the discovery of the genetic code, in the early 1960s, came as a bolt from the blue, but strangely enough it did not bring down the barrier between nature and culture. On the contrary, a protective belt was quickly built around the old divide with an argument that effectively emptied the discovery of all its revolutionary potential. The argument is that the genetic code is not a real code because its rules are the result of chemical affinities between codons and amino acids and are therefore determined by chemistry. This is the ‘Stereochemical theory’, an idea first proposed by George Gamow in 1954, and re-proposed ever since in many different forms (Pelc and Welton 1966; Dunnill 1966; Melcher 1974; Shimizu 1982; Yarus 1988, 1998; Yarus, Caporaso and Knight 2005). More than fifty years of research have not produced any evidence in favour of this theory, and yet the idea is still circulating, apparently because of the possibility that stereochemical interactions might have been important at some early stages of evolution (Koonin and Novozhilov 2009). The deeper reason is probably the persistent belief that the genetic code must have been a product of chemistry and cannot possibly be a real code. But what is a real code?
The starting point is the idea that a code is a set of rules that establish a correspondence, or a mapping, between the objects of two independent worlds (Barbieri 2003). The Morse code, for example, is a mapping between the letters of the alphabet and groups of dots and dashes. The highway code is a correspondence between street signals and driving behaviours (a red light means ‘stop’, a green light means ‘go’, and so on).
What is essential in all codes is that the coding rules, although completely compatible with the laws of physics and chemistry, are not dictated by these laws. In this sense they are arbitrary, and the number of arbitrary relationships between two independent worlds is potentially unlimited. In the Morse code, for example, any letter of the alphabet could be associated with countless combinations of dots and dashes, which means that a specific link between them can be realized only by selecting a small number of rules. And this is precisely what a code is: a small set of arbitrary rules selected from a potentially unlimited number in order to ensure a specific correspondence between two independent worlds.
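The point about arbitrariness can be made concrete with a minimal sketch. It assumes a four-letter excerpt of the Morse table and a hypothetical alternative convention in which S and O exchange their signals; the names `encode`, `ALTERNATIVE` and `possible_tables` are illustrative only. The final count rests on a simplifying assumption, namely that letters may use any dot-dash signal of one to four elements.

```python
import math

# Four entries of International Morse code, enough for a demonstration.
MORSE = {"S": "...", "O": "---", "E": ".", "T": "-"}

# A hypothetical alternative convention: S and O exchange their signals.
# Nothing in physics forbids it; it is simply not the rule that was selected.
ALTERNATIVE = {**MORSE, "S": "---", "O": "..."}

def encode(text, table):
    """Translate each letter into its signal according to the given rule table."""
    return " ".join(table[letter] for letter in text)

sos_standard = encode("SOS", MORSE)
sos_alternative = encode("SOS", ALTERNATIVE)

# Dot-dash signals of length 1 to 4: 2 + 4 + 8 + 16 = 30 (assumed signal pool).
signals = sum(2 ** n for n in range(1, 5))

# Ways to give 26 letters distinct signals drawn from that pool.
possible_tables = math.perm(signals, 26)

print(sos_standard)
print(sos_alternative)
print(f"about {possible_tables:.1e} possible one-to-one tables")
```

Both tables work equally well as long as sender and receiver share the same one, which is what selecting a small set of rules from a very large number of possibilities amounts to.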
This definition allows us to make experimental tests because organic codes are relationships between two worlds of organic molecules and are necessarily implemented by a third type of molecules, called adaptors, that build a bridge between them. The adaptors are required because there is no necessary link between the two worlds, and a fixed set of adaptors is required in order to guarantee the specificity of the correspondence. The adaptors, in short, are the molecular fingerprints of the codes, and their presence in a biological process is a sure sign that that process is based on a code.
This gives us an objective criterion for discovering organic codes, and their existence is no longer a matter of speculation. It is, first and foremost, an experimental problem. More precisely, we can prove that an organic code exists if we find three things: (1) two independent worlds of molecules, (2) a set of adaptors that create a mapping between them, and (3) the demonstration that the mapping is arbitrary because its rules can be changed, at least in principle, in countless different ways.
Two outstanding examples
The genetic code
In protein synthesis, a sequence of nucleotides is translated into a sequence of amino acids, and the bridge between them is realized by a third type of molecules, called transfer-RNAs, that act as adaptors and perform two distinct operations: at one site they recognize groups of three nucleotides, called codons, and at another site they receive amino acids from enzymes called aminoacyl-tRNA-synthetases. The key point is that there is no deterministic link between codons and amino acids since it has been shown that any codon can be associated with any amino acid (Schimmel 1987; Schimmel et al. 1993). Hou and Schimmel (1988), for example, introduced two extra nucleotides in a tRNA and found that the resulting tRNA carried a different amino acid. This proved that the number of possible connections between codons and amino acids is potentially unlimited, and only the selection of a small set of adaptors can ensure a specific mapping. This is the genetic code: a fixed set of rules between nucleic acids and amino acids that are implemented by adaptors. In protein synthesis, in conclusion, we find all three essential components of a code: (1) two independent worlds of molecules (nucleotides and amino acids), (2) a set of adaptors that create a mapping between them, and (3) the proof that the mapping is arbitrary because its rules can be changed.
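The role of the adaptors can be sketched in a few lines. This is a toy model, not a description of the real translation machinery: each tRNA is reduced to a codon paired with an amino acid, only four codons of the standard genetic code are included, and the names `translate`, `ALTERED_ADAPTORS` and `possible_tables` are illustrative. The reassignment of GGC is hypothetical, in the spirit of the experiment cited above.

```python
# Toy adaptor set: each entry stands for a tRNA that recognizes a codon at one
# site and carries the amino acid loaded by its synthetase at another.
# These four assignments are taken from the standard genetic code.
STANDARD_ADAPTORS = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
}

def translate(mrna, adaptors):
    """Read the message three nucleotides at a time and apply the adaptors."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete trailing codon
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return [adaptors[codon] for codon in codons]

mrna = "AUGUUUGGCGCU"
standard = translate(mrna, STANDARD_ADAPTORS)

# Hypothetical altered adaptor: the tRNA that reads GGC now carries alanine.
ALTERED_ADAPTORS = {**STANDARD_ADAPTORS, "GGC": "Ala"}
altered = translate(mrna, ALTERED_ADAPTORS)

# Rough count of conceivable tables: each of the 64 codons could be given any
# of 21 meanings (20 amino acids plus "stop").
possible_tables = 21 ** 64

print(standard)  # ['Met', 'Phe', 'Gly', 'Ala']
print(altered)   # ['Met', 'Phe', 'Ala', 'Ala']
print(len(str(possible_tables)), "digits in the number of possible tables")
```

The same message yields a different protein as soon as one adaptor is changed, while the chemistry of codons and amino acids stays exactly the same. In this sense the mapping is carried by the adaptors and not by the two worlds they connect.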
The signal transduction codes
Signal transduction is the process by which cells transform the signals from the environment, called first messengers, into internal signals, called second messengers. First and second messengers belong to two independent worlds because there are literally hundreds of first messengers (hormones, growth factors, neurotransmitters, etc.) but only four great families of second messengers (cyclic AMP, calcium ions, diacylglycerol and inositol trisphosphate) (Alberts et al. 2007). The crucial point is that the molecules that perform signal transduction are true adaptors. They consist of three subunits: a receptor for the first messengers, an amplifier for the second messengers, and a mediator in between (Berridge 1985). This allows the transduction complex to perform two independent recognition processes, one for the first messenger and the other for the second messenger. Laboratory experiments have proved that any first messenger can be associated with any second messenger, which means that there is a potentially unlimited number of arbitrary connections between them. In signal transduction, in short, we find all three essential components of a code: (1) two independent worlds of molecules (first messengers and second messengers), (2) a set of adaptors that create a mapping between them, and (3) the proof that the mapping is arbitrary because its rules can be changed (Barbieri 2003).
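The three-part structure of a transduction adaptor can be illustrated with a simplified sketch. The `Transducer` class and the `transduce` function are hypothetical names introduced here, and each pathway is reduced to a single second messenger. The four receptor couplings listed are standard textbook examples chosen for illustration; they are not taken from the works cited above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transducer:
    """A transduction adaptor reduced to its three functional parts."""
    name: str
    first_messenger: str    # what the receptor subunit recognizes
    mediator: str           # the link in between (here a G protein class)
    second_messenger: str   # what the amplifier subunit produces

# Textbook couplings, heavily simplified (Gq pathways also release
# diacylglycerol and calcium; only one messenger is listed per adaptor).
BETA_ADRENERGIC = Transducer("beta-adrenergic receptor", "epinephrine", "Gs", "cyclic AMP")
ALPHA1_ADRENERGIC = Transducer("alpha-1 adrenergic receptor", "epinephrine", "Gq", "inositol trisphosphate")
VASOPRESSIN_V2 = Transducer("V2 vasopressin receptor", "vasopressin", "Gs", "cyclic AMP")
VASOPRESSIN_V1 = Transducer("V1 vasopressin receptor", "vasopressin", "Gq", "inositol trisphosphate")

def transduce(first_messenger, expressed):
    """Return the second messengers a cell produces, given the adaptors it expresses."""
    return sorted({t.second_messenger for t in expressed
                   if t.first_messenger == first_messenger})

cell_a = [BETA_ADRENERGIC, VASOPRESSIN_V1]
cell_b = [ALPHA1_ADRENERGIC, VASOPRESSIN_V2]

print(transduce("epinephrine", cell_a))  # ['cyclic AMP']
print(transduce("epinephrine", cell_b))  # ['inositol trisphosphate']
print(transduce("vasopressin", cell_a))  # ['inositol trisphosphate']
print(transduce("vasopressin", cell_b))  # ['cyclic AMP']
```

In this sketch the same first messenger produces different internal signals in the two cells, and different first messengers produce the same one. What the signal means to the cell depends on which adaptors it expresses, not on the chemistry of the messenger itself.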
A world of organic codes
In addition to the genetic code and the signal transduction codes, a wide variety of new organic codes have come to light in recent years. Among them: the sequence codes (Trifonov 1987, 1989, 1999), the Hox code (Hunt et al. 1991; Kessel and Gruss 1991), the adhesive code (Redies and Takeichi 1996; Shapiro and Colman 1999), the splicing codes (Barbieri 2003; Fu 2004; Matlin et al. 2005; Pertea et al. 2007; Wang and Burge 2008; Barash et al. 2010; Dhir et al. 2010), the histone code (Strahl and Allis 2000; Jenuwein and Allis 2001; Turner 2000, 2002, 2007; Kühn and Hofmeyr 2014), the sugar code (Gabius 2000, 2009), the compartment codes (Barbieri 2003), the cytoskeleton codes (Barbieri 2003; Gimona 2008), the transcriptional code (Jessell 2000; Marquardt and Pfaff 2001; Ruiz i Altaba et al. 2003; Flames et al. 2007), the neural code (Nicolelis and Ribeiro 2006; Nicolelis 2011), a neural code for taste (Di Lorenzo 2000; Hallock and Di Lorenzo 2006), an odorant receptor code (Dudai 1999; Ray et al. 2006), a space code in the hippocampus (O’Keefe and Burgess 1996, 2005; Hafting et al. 2005; Brandon and Hasselmo 2009; Papoutsi et al. 2009), the apoptosis code (Basañez and Hardwick 2008; Füllgrabe et al. 2010), the tubulin code (Verhey and Gaertig 2007), the nuclear signalling code (Maraldi 2008), the injective organic codes (De Beule et al. 2011), the molecular codes (Görlich et al. 2011; Görlich and Dittrich 2013), the ubiquitin code (Komander and Rape 2012), the bioelectric code (Tseng and Levin 2013; Levin 2014), the acoustic codes (Farina and Pieretti 2014), the glycomic code (Buckeridge and De Souza 2014; Tavares and Buckeridge 2015) and the Redox code (Jones and Sies 2015).
The living world, in short, is literally teeming with organic codes, and yet so far their discoveries have only circulated in small circles and have not attracted the attention of the scientific community at large.
Code Biology is the study of all codes of life with the standard methods of science. The genetic code and the codes of culture have been known for a long time and represent the historical foundation of Code Biology. What is really new in this field is the study of all codes that came after the genetic code and before the codes of culture. The existence of these codes is an experimental fact – let us never forget this – but also more than that. It is one of those facts that have extraordinary theoretical implications.
The first is the role that the organic codes had in the history of life. The genetic code was a precondition for the origin of the first cells, the signal transduction codes divided the descendants of the common ancestor into the primary kingdoms of Archaea, Bacteria and Eukarya, the splicing codes were instrumental to the origin of the nucleus, the histone code provided the rules of chromatin, and the cytoskeleton codes allowed the Eukarya to perform internal movements, including those of mitosis and meiosis (Barbieri 2003, 2015). The greatest events of macroevolution, in other words, were associated with the appearance of new organic codes, and this gives us a completely new understanding of the history of life.
The second great implication is the fact that the organic codes have been highly conserved in evolution, which means that they are the great invariants of life, the sole entities that have been perpetuated while everything else has been changed. Code Biology, in short, is uncovering a new history of life and bringing to light new fundamental concepts. It truly is a new science, the exploration of a vast and still largely unexplored dimension of the living world, the real new frontier of biology.
References
Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P (2007) Molecular Biology of the Cell. 5th Ed. Garland, New York.
Barash Y, Calarco JA, Gao W, Pan Q, Wang X, Shai O, Blencowe BJ and Frey BJ (2010) Deciphering the splicing code. Nature, 465, 53-59.
Barbieri M (2003) The Organic Codes. An Introduction to Semantic Biology. Cambridge University Press, Cambridge, UK.
Barbieri M (2015) Code Biology. A New Science of Life. Springer, Dordrecht.
Basañez G and Hardwick JM (2008) Unravelling the Bcl-2 Apoptosis Code with a Simple Model System. PLoS Biol 6(6): e154. doi: 10.1371/journal.pbio.0060154.
Berridge M (1985) The molecular basis of communication within the cell. Scientific American, 253, 142-152.
Brandon MP and Hasselmo ME (2009) Sources of the spatial code within the hippocampus. Biology Reports, 1, 3-7.
Buckeridge MS and De Souza AP (2014) Breaking the “Glycomic Code” of cell wall polysaccharides may improve second-generation bioenergy production from biomass. BioEnergy Research, 7, 1065-1073.
De Beule J, Hovig E and Benson M (2011) Introducing Dynamics into the Field of Biosemiotics. Biosemiotics, 4(1), 5-24.
Dhir A, Buratti E, van Santen MA, Lührmann R and Baralle FE (2010) The intronic splicing code: multiple factors involved in ATM pseudoexon definition. The EMBO Journal, 29, 749–760.
Di Lorenzo PM (2000) The neural code for taste in the brain stem: Response profiles. Physiology and Behaviour, 69, 87-96.
Dudai Y (1999) The Smell of Representations. Neuron 23: 633-635.
Dunnill P (1966) Triplet nucleotide-amino-acid pairing; a stereochemical basis for the division between protein and non-protein amino-acids. Nature, 210, 1267-1268.
Farina A and Pieretti N (2014) Acoustic Codes in Action in a Soundscape Context. Biosemiotics, 7(2), 321–328.
Flames N, Pla R, Gelman DM, Rubenstein JLR, Puelles L and Marín O (2007) Delineation of Multiple Subpallial Progenitor Domains by the Combinatorial Expression of Transcriptional Codes. The Journal of Neuroscience, 27, 9682–9695.
Fu XD (2004) Towards a splicing code. Cell, 119, 736–738.
Füllgrabe J, Hajji N and Joseph B (2010) Cracking the death code: apoptosis-related histone modifications. Cell Death and Differentiation, 17, 1238-1243.
Gabius H-J (2000) Biological Information Transfer Beyond the Genetic Code: The Sugar Code. Naturwissenschaften, 87, 108-121.
Gabius H-J (2009) The Sugar Code. Fundamentals of Glycosciences. Wiley-Blackwell.
Gamow G (1954) Possible relation between deoxyribonucleic acid and protein structures. Nature, 173, 318.
Gimona M (2008) Protein linguistics and the modular code of the cytoskeleton. In: Barbieri M (ed) The Codes of Life: The Rules of Macroevolution. Springer, Dordrecht, pp 189-206.
Görlich D, Artmann S, Dittrich P (2011) Cells as semantic systems. Biochim Biophys Acta, 1810 (10), 914-923.
Görlich D and Dittrich P (2013) Molecular codes in biological and chemical reaction networks. PLoS ONE 8(1): e54694. doi: 10.1371/journal.pone.0054694.
Hafting T, Fyhn M, Molden S, Moser MB, Moser EI (2005) Microstructure of a spatial map in the entorhinal cortex. Nature, 436, 801-806.
Hallock RM and Di Lorenzo PM (2006) Temporal coding in the gustatory system. Neuroscience and Biobehavioral Reviews, 30, 1145-1160.
Hou Y-M and Schimmel P (1988) A simple structural feature is a major determinant of the identity of a transfer RNA. Nature, 333, 140-145.
Hunt P, Whiting J, Nonchev S, Sham M-H, Marshall H, Graham A, Cook M, Alleman R, Rigby PW and Gulisano M (1991) The branchial Hox code and its implications for gene regulation, patterning of the nervous system and head evolution. Development, 2, 63-77.
Jenuwein T and Allis CD (2001) Translating the histone code. Science, 293, 1074-1080.
Jessell TM (2000) Neuronal Specification in the Spinal Cord: Inductive Signals and Transcriptional Codes. Nature Reviews Genetics, 1, 20-29.
Jones DP and Sies H (2015) The Redox Code. Antioxidants and Redox Signaling, 23 (9), 734-746.
Kessel M and Gruss P (1991) Homeotic Transformation of Murine Vertebrae and Concomitant Alteration of Hox Codes induced by Retinoic Acid. Cell, 67, 89-104.
Komander D and Rape M (2012) The Ubiquitin Code. Annual Review of Biochemistry, 81, 203–229.
Koonin EV and Novozhilov AS (2009) Origin and evolution of the genetic code: the universal enigma. IUBMB Life. 61(2), 99-111.
Kühn S and Hofmeyr J-H S (2014) Is the “Histone Code” an organic code? Biosemiotics, 7(2), 203–222.
Levin M (2014) Endogenous bioelectrical networks store non-genetic patterning information during development and regeneration. Journal of Physiology, 592.11, 2295–2305.
Maraldi NM (2008) A Lipid-based Code in Nuclear Signalling. In: Barbieri M (ed) The Codes of Life: The Rules of Macroevolution. Springer, Dordrecht, pp 207-221.
Marquardt T and Pfaff SL (2001) Cracking the Transcriptional Code for Cell Specification in the Neural Tube. Cell, 106, 651–654.
Matlin A, Clark F and Smith C (2005) Understanding alternative splicing: towards a cellular code. Nat. Rev. Mol. Cell Biol., 6, 386-398.
Melcher G (1974) Stereospecificity and the genetic code. J. Mol. Evol., 3, 121-141.
Nicolelis M (2011) Beyond Boundaries: The New Neuroscience of Connecting Brains with Machines and How It Will Change Our Lives. Times Books, New York.
Nicolelis M and Ribeiro S (2006) Seeking the Neural Code. Scientific American, 295, 70-77.
O'Keefe J, Burgess N (1996) Geometric determinants of the place fields of hippocampal neurons. Nature, 381, 425-428.
O’Keefe J, Burgess N (2005) Dual phase and rate coding in hippocampal place cells: theoretical significance and relationship to entorhinal grid cells. Hippocampus, 15, 853-866.
Papoutsi M, de Zwart JA, Jansma JM, Pickering MJ, Bednar JA and Horwitz B (2009) From Phonemes to Articulatory Codes: An fMRI Study of the Role of Broca’s Area in Speech Production. Cerebral Cortex, 19, 2156-2165.
Pelc SR and Welton MGE (1966) Stereochemical relationship between coding triplets and amino-acids. Nature, 209, 868-870.
Pertea M, Mount SM, Salzberg SL (2007) A computational survey of candidate exonic splicing enhancer motifs in the model plant Arabidopsis thaliana. BMC Bioinformatics, 8, 159.
Ray A, van der Goes van Naters W, Shiraiwa T and Carlson JR (2006) Mechanisms of Odor Receptor Gene Choice in Drosophila. Neuron, 53, 353-369.
Redies C and Takeichi M (1996) Cadherins in the developing central nervous system: an adhesive code for segmental and functional subdivisions. Developmental Biology, 180, 413-423.
Ruiz i Altaba A, Nguyên V and Palma V (2003) The emergent design of the neural tube: prepattern, SHH morphogen and GLI code. Current Opinion in Genetics & Development, 13, 513–521.
Schimmel P (1987) Aminoacyl tRNA synthetases: General scheme of structure-function relationship in the polypeptides and recognition of tRNAs. Ann. Rev. Biochem., 56, 125-158.
Schimmel P, Giegé R, Moras D and Yokoyama S (1993) An operational RNA code for amino acids and possible relationship to genetic code. Proceedings of the National Academy of Sciences USA, 90, 8763-8768.
Shapiro L and Colman DR (1999) The Diversity of Cadherins and Implications for a Synaptic Adhesive Code in the CNS. Neuron, 23, 427-430.
Shimizu M (1982) Molecular basis for the genetic code. J. Mol. Evol., 18, 297-303.
Strahl BD and Allis CD (2000) The language of covalent histone modifications. Nature, 403, 41-45.
Tavares EQP and Buckeridge MS (2015) Do plant cells have a code? Plant Science, 241, 286-294.
Trifonov EN (1987) Translation framing code and frame-monitoring mechanism as suggested by the analysis of mRNA and 16s rRNA nucleotide sequence. Journal of Molecular Biology, 194, 643-652.
Trifonov EN (1989) The multiple codes of nucleotide sequences. Bulletin of Mathematical Biology, 51: 417-432.
Trifonov EN (1999) Elucidating Sequence Codes: Three Codes for Evolution. Annals of the New York Academy of Sciences, 870, 330-338.
Tseng AS and Levin M (2013) Cracking the bioelectric code. Probing endogenous ionic controls of pattern formation. Communicative & Integrative Biology, 6(1), 1–8.
Turner BM (2000) Histone acetylation and an epigenetic code. BioEssays, 22, 836–845.
Turner BM (2002) Cellular memory and the Histone Code. Cell, 111, 285-291.
Turner BM (2007) Defining an epigenetic code. Nature Cell Biology, 9, 2-6.
Verhey KJ and Gaertig J (2007) The Tubulin Code. Cell Cycle, 6 (17), 2152-2160.
Wang Z and Burge C (2008) Splicing regulation: from a part list of regulatory elements to an integrated splicing code. RNA, 14, 802-813.
Yarus M (1988) A specific amino acid binding site composed of RNA. Science, 240, 1751-1758.
Yarus M (1998) Amino acids as RNA ligands: a direct-RNA-template theory for the code's origin. J. Mol. Evol., 47(1), 109–117.
Yarus M, Caporaso JG and Knight R (2005) Origins of the Genetic Code: The Escaped Triplet Theory. Annual Review of Biochemistry, 74, 179-198.