Creating New Forms of Life: Redefining DNA's Functionality
By Carl Zimmer
Originally published at Nautilus on October 24, 2013.
In the 1970s, opening a cereal box often revealed a cardboard disk—a code wheel for budding cryptographers. It featured two disks of different sizes, allowing users to align them and decode messages with a hidden letter. This playful tool reminded me of a biological concept I frequently encounter in textbooks.
That concept is the genetic code, which textbooks often draw as a wheel of its own. Present in every one of our 30 trillion cells, it converts the information in our DNA into the very substances that constitute us. Remarkably, this genetic code is nearly identical across all species on Earth, representing the essence of life itself.
The genetic code differs from an organism's specific genetic sequence—a more commonly understood notion. Take, for instance, the gorilla's genome, which is composed of 3.04 billion bases, forming approximately 21,000 genes. To transform these genes into the proteins that perform various functions within the gorilla's body, cells rely on a set of rules defined by the genetic code. Without this code, the genetic sequence is akin to hieroglyphics lacking a Rosetta Stone.
Scientists unraveled the genetic code in the 1960s, marking a significant achievement in modern biology, comparable to uncovering the double-helix structure of DNA. This breakthrough enabled scientists to manipulate organisms by introducing new genes, ushering in the biotechnology era.
Decades later, the allure of the genetic code persists. Researchers engage in discussions about its evolutionary origins and ponder why diverse codes do not exist. As they deepen their understanding of the genetic code's history, they are also reshaping its future. Scientists are reprogramming cells to create entirely new proteins that have never existed in nature, potentially giving rise to innovative medical therapies.
This research goes beyond the routine advancements in biotechnology, such as gene sequencing or protein optimization. It fundamentally alters how we perceive DNA. By reprogramming life, scientists might eventually create organisms that diverge significantly from anything seen on Earth in the past four billion years—a form of synthetic life developed in laboratories.
A Complex Puzzle
When Francis Crick and James Watson unveiled the structure of DNA in 1953, they resolved many longstanding mysteries of life. Previous scientists had grappled with the chemistry necessary for heredity, and DNA provided a straightforward solution. The molecule comprises two backbones adorned with bases, and its four bases (A, C, G, and T) are sufficient to generate the diversity of life. One combination yields a gorilla; another, a sunflower.
Yet, despite their success, Crick and Watson were puzzled over how cells utilized DNA to synthesize proteins. This enigma was particularly perplexing since proteins are formed from 20 distinct amino acids, unlike the four bases of DNA.
Russian scientist George Gamow recognized this conundrum as a cryptographic challenge. DNA contained messages made up of a four-letter alphabet, while proteins utilized a different 20-letter alphabet. Gamow proposed, without evidence, that proteins formed when amino acids fit into specific "holes" in DNA molecules, a notion he likened to deciphering an encrypted message.
Gamow theorized that a unique amino acid could occupy each hole, hypothesizing 20 distinct holes corresponding to the 20 amino acids. His elegant solution was incorrect; the actual process proved to be more intricate. Cells first create a single-stranded copy of a gene, known as messenger RNA. Ribosomes then read this RNA sequence, selecting amino acids to assemble the appropriate protein.
The ribosome reads the messenger RNA three bases at a time, and each triplet is called a codon.
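A little arithmetic shows why a triplet is the shortest word that can do the job. The sketch below simply counts how many distinct sequences a four-letter alphabet can spell at each length and compares the result with the 20 amino acids. The function name codon_capacity is made up for this illustration.

```python
# Count how many distinct "words" a four-letter alphabet can spell
# at each word length, and compare that with the 20 amino acids.
BASES = "ACGU"        # the four bases as they appear in messenger RNA
AMINO_ACIDS = 20      # amino acids the cell must be able to specify


def codon_capacity(length: int) -> int:
    """Number of distinct base sequences of the given length (hypothetical helper)."""
    return len(BASES) ** length


for length in range(1, 4):
    capacity = codon_capacity(length)
    verdict = "enough" if capacity >= AMINO_ACIDS else "too few"
    print(f"{length} base(s): {capacity} combinations ({verdict})")
```

One base gives 4 possibilities and two give 16, both short of 20. Three give 64, which is more than enough and leaves room for several codons to share an amino acid.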
The genetic code wheel illustrates all codons, radiating outward from the center, with GUA encoding valine, for instance. An intriguing aspect of the genetic code is that multiple codons can specify the same amino acid. Thus, GUA, GUC, GUG, and GUU all encode valine. This complexity contrasts sharply with Gamow's envisioned straightforward code.
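For readers who think in code, the wheel is a lookup table. The following is a minimal sketch, not a piece of real bioinformatics software: it builds the standard table from a 64-letter string of one-letter amino acid abbreviations (V is valine, * is a stop), then reads a message three bases at a time. The names translate and codons_for, and the sample message, are invented for this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code as one letter per codon, with codons ordered
# UUU, UUC, UUA, UUG, UCU, ... GGG. "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Read codons from the first base until a stop codon or the end of the message."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def codons_for(amino_acid: str) -> list[str]:
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


print(codons_for("V"))                      # the four valine codons
print(translate("AUGGUAGUCGUGGUUUAA"))      # made-up message: start, four valines, stop
```

The made-up message uses four different codons for valine, yet the resulting protein contains the same amino acid four times in a row.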
To decode the genetic code, scientists initially focused on the bacterium E. coli. Researchers chose this microbe due to the wealth of tools developed through prior studies. Once they deciphered E. coli's genetic code, they found the same peculiarities in other species.
Since the discovery of the genetic code, scientists have speculated on the evolution of this universal, seemingly chaotic system. Some researchers argue that what appears to be disorder is, in fact, an advantageous feature—natural selection may have favored the genetic code for its robustness. Multiple codons for an amino acid allow organisms to mitigate the impact of mutations.
For instance, if GUC mutates to GUU, the cell continues to produce valine, preventing the formation of defective proteins. Researchers who evaluated numerous random genetic codes found that the actual genetic code ranked in the top 0.000001 percent for mutation resilience.
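A crude way to see this buffering is to count how many single-base changes leave the amino acid untouched. The sketch below does only that for the standard code. It is a toy measure under stated assumptions, far simpler than the comparisons against random codes described above, and the helper names are invented for the illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, ... GGG; "*" is a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_mutants(codon: str):
    """Yield every codon that differs from `codon` at exactly one position."""
    for pos, original in enumerate(codon):
        for base in BASES:
            if base != original:
                yield codon[:pos] + base + codon[pos + 1:]


def is_synonymous(before: str, after: str) -> bool:
    """True when both codons specify the same amino acid."""
    return CODON_TABLE[before] == CODON_TABLE[after]


def count_silent_changes() -> tuple[int, int]:
    """Return (silent, total) single-base changes over all non-stop codons."""
    silent = total = 0
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # stop codons are not counted as starting points
        for mutant in single_base_mutants(codon):
            total += 1
            silent += is_synonymous(codon, mutant)
    return silent, total


print(is_synonymous("GUC", "GUU"))   # the valine example from the text
silent, total = count_silent_changes()
print(f"{silent} of {total} single-base changes are silent")
```

By this simple count, roughly a quarter of all single-base changes are silent. The count ignores how chemically similar the swapped amino acids are, which a serious robustness analysis would need to weigh.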
Other scientists dispute the idea that the genetic code is special at all. Crick proposed a concept he referred to as a "frozen accident," contending that early life forms possessed imprecise genetic codes that frequently led to mistakes in codon interpretation. Over time, microbes evolved more accurate codes, reducing misreading rates and enabling the development of more sophisticated proteins.
As cells became increasingly complex, any further change to the code would have yielded defective versions of many proteins at once, with disastrous outcomes. At that point, Crick argued, the evolution of the genetic code reached a standstill.
Others, such as Nigel Goldenfeld from the University of Illinois, view the code as a universal language that facilitates the sharing of genes across species. Microbes can acquire genes from other organisms, sometimes gaining significant advantages. For example, antibiotic-resistant bacteria can pass their resistance genes to more susceptible species, but only if the recipient cell can decode them.
Goldenfeld suggests that, over millions of years, this sharing pushed the various genetic codes to converge on a single, common code.
A Universal Yet Variable Code
Years after the genetic code's discovery, researchers uncovered exceptions to its universality. In 1979, they identified an anomaly within human cells. While most human DNA resides in the nucleus, a small amount exists within mitochondria, tiny structures responsible for energy production. Mitochondria possess their own ribosomes and decode their genes independently; they are believed to have originated from free-living bacteria that integrated into the cells of our ancestors over two billion years ago.
While studying mitochondria, scientists noticed a surprising fact: their genetic code differed from that of nuclear DNA. For instance, in human mitochondria, UGA, typically a "stop codon," encodes tryptophan instead.
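In code, a dialect like this amounts to a copy of the standard table with an entry overridden. The sketch below changes only the one codon mentioned here, UGA. A real mitochondrial table differs from the standard code in other ways that are deliberately left out, so treat it as an illustration rather than a usable translation table. The names and the sample message are made up.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, ... GGG; "*" is a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative dialect: only the UGA reassignment described in the text.
# W is the one-letter abbreviation for tryptophan.
MITO_SKETCH = {**STANDARD, "UGA": "W"}


def translate(mrna: str, table: dict[str, str]) -> str:
    """Read codons with the given table until a stop codon or the end of the message."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


message = "AUGUGGUGAGCUUAA"  # made-up message containing an in-frame UGA
print(translate(message, STANDARD))     # stops at UGA
print(translate(message, MITO_SKETCH))  # reads through UGA as tryptophan
```

The same message yields a short, truncated protein under the standard code and a longer one under the modified table, which is why a gene written in one dialect cannot simply be read in the other.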
Since that initial finding, researchers have identified 34 instances of alternative genetic codes, each arising from evolutionary adjustments to the ancestral code. Ken Miller, a cell biologist at Brown University, likens these variations to dialects in language, emphasizing that the differences between them stem from a shared origin.
In most known alternative genetic codes, one codon has been reassigned to a different amino acid, while some species have introduced entirely new amino acids. Certain microbes incorporate selenocysteine, others have added pyrrolysine, and some use both.
These genetic dialects present a puzzle for biologists, as species with alternative codes are often distantly related, suggesting that evolution has repeatedly altered the genetic code.
In 2009, evolutionary biologist Edward Holmes and his colleagues noted a commonality among species with alternative genetic codes: none was known to be infected by viruses. They proposed that evading viral infection drives some species to modify their genetic codes. Viruses, which require host cells for replication, must match their code to that of their host. If there is a mismatch, the host produces faulty viral proteins, jeopardizing the virus's survival.
When a new viral outbreak occurs, the majority of hosts may perish, but a mutant host with an alternative genetic code may endure, as the virus cannot exploit it. These survivors can then repopulate, becoming immune to viruses due to their unique genetic code.
However, researchers at the University at Buffalo recently identified a virus capable of infecting a species with an alternative genetic code, specifically a yeast species where CUG changed from encoding leucine to serine. Upon examining the virus's DNA, they noticed that the CUG codon was almost entirely absent. It appears that once the yeast modified its code, the virus adapted its genetic information to prevent errors, thus avoiding the production of faulty viruses. While evolving an alternative genetic code may provide a means of evading viruses, it does not guarantee immunity.
Innovators of the Genetic Code
The revelation of the genetic code in the 1960s continues to influence our daily lives. Once scientists recognized that humans and E. coli shared the same code for interpreting genes, they explored whether this microbe could synthesize proteins from human DNA. Herbert Boyer and his team successfully isolated the gene for insulin from human cells and inserted it into E. coli. The results were promising, as these bacteria began to produce insulin. Today, millions of diabetes patients receive insulin derived from bacteria.
Researchers are increasingly adept at harnessing the genetic code to create valuable substances. They have engineered goats to produce spider silk in their milk and modified genes to generate customized proteins, such as tailored antibodies targeting specific pathogens. All these achievements are possible due to the shared language of life.
Nonetheless, the genetic code constrains the creativity of biotechnology. It encodes only 20 amino acids, while nature contains hundreds of additional amino acids—some even found in space—that are absent from life as we know it. Moreover, scientists have the ability to synthesize a virtually limitless variety of unnatural amino acids. Reprogramming the genetic code to encompass these alternatives could unlock a realm of possibilities for manipulating life.
The knowledge that nature has already adapted the genetic code has encouraged researchers to attempt further modifications. Initial efforts began in the early 2000s, with one notable study in 2002, led by chemist Peter Schultz at the Scripps Research Institute, resulting in the creation of light-sensitive proteins.
Schultz and his team achieved this by linking a standard amino acid (phenylalanine) with a photosensitive compound (benzophenone). When exposed to UV light, benzophenones gain energy, facilitating bonding with adjacent proteins.
They then altered the cellular machinery so that UAG, instead of acting as a stop codon, coded for the novel amino acid. By introducing these modified genes into E. coli, the bacteria produced proteins that could form new connections upon UV exposure, resulting in unprecedented molecular formations.
Schultz later co-founded a company called Ambrx, which capitalized on these experiments, leading to a substantial partnership with Merck in 2012 to explore innovative drug development by modifying the genetic code.
One of their typical projects involves creating anti-cancer molecules that target tumors with precision. Researchers aim to enhance existing monoclonal antibody therapies that can be engineered to attack only cancerous cells. Standard monoclonal antibodies mark cancer cells for immune system recognition, while Ambrx seeks to engineer these antibodies to deliver toxins directly to cancer cells, leading to their destruction.
Currently, expanding the genetic code remains a promising avenue rather than a proven solution. Merck does not yet have E. coli producing cancer drugs at scale, and it remains uncertain how efficiently bacteria can generate these unnatural proteins.
More radical modifications to the genetic code could prove more effective. Farren Isaacs, a biochemist at Yale, is pursuing such an ambitious project, aiming to alter numerous codons simultaneously. If successful, this endeavor could yield organisms capable of producing entirely new proteins, unlike anything currently alive on Earth.
Isaacs plans to exploit the redundancy within the genetic code. Rather than using multiple codons for the amino acid arginine, he intends to reprogram an organism's DNA to rely solely on one. This adjustment would free up other codons for encoding unnatural amino acids, potentially creating vast new biological opportunities.
In a recent study published in Science, Isaacs and his team took initial steps toward this goal. They employed advanced gene-editing tools to identify every instance of the stop codon UAG in the E. coli genome, discovering 314 occurrences. They surgically replaced these UAGs with UAA, another stop codon, demonstrating that the bacteria thrived without the redundant version.
This experiment represents a milestone as the first instance of altering a codon throughout an organism's entire genome, paving the way for UAG to encode a new amino acid. If successful, researchers could replicate this process with other redundant codons as well.
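The bookkeeping behind such a swap can be sketched on text alone. The toy below walks through made-up coding sequences codon by codon and replaces each in-frame UAG with UAA. It says nothing about the laboratory methods involved in editing a living genome, and the gene names, sequences, and the function recode_stops are all hypothetical.

```python
def recode_stops(coding_seq: str, old: str = "UAG", new: str = "UAA") -> tuple[str, int]:
    """Swap in-frame `old` codons for `new`; return (recoded sequence, codons changed)."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    recoded = []
    changed = 0
    for i in range(0, len(coding_seq), 3):
        codon = coding_seq[i:i + 3]
        if codon == old:
            recoded.append(new)
            changed += 1
        else:
            recoded.append(codon)
    return "".join(recoded), changed


# Made-up genes: geneA ends in UAG and also contains an out-of-frame "UAG"
# that straddles two codons (CUA|GGA); geneB already ends in UAA.
toy_genome = {"geneA": "AUGCUAGGAUAG", "geneB": "AUGGUUUAA"}

total_changed = 0
for name, seq in toy_genome.items():
    new_seq, changed = recode_stops(seq)
    total_changed += changed
    print(name, seq, "->", new_seq)
print("codons replaced:", total_changed)
```

Reading in frame matters. A blind search-and-replace on geneA would also hit the letters UAG that straddle two codons and turn GGA into AGA, changing an amino acid in the protein instead of merely swapping one stop signal for another.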
Altering the genetic code in this manner could enable scientists to achieve more than just the creation of new molecules. Biotechnology is currently hindered by viruses that threaten the microbes employed for molecule production. Isaacs’s re-coded microbes might become resistant to viruses that would otherwise hijack their ribosomes.
A new genetic code might also eliminate the risk of engineered organisms escaping labs and causing harm. Researchers could design microbes that rely on unnatural amino acids for survival; should they escape, they would find only natural amino acids and perish. In essence, these modified species would become dependent on our engineered code, fundamentally separating them from the natural code used by all other living organisms.
Contemporary debates surrounding genetically modified food stem from the perception that we have begun to tamper with DNA in hazardous ways. However, humanity has been modifying DNA for millennia, ever since agriculture began. The genetic makeup of a sweet corn cob significantly differs from that of its wild ancestor, teosinte. Biotechnology has allowed for more precise gene transfers across species in recent decades, and scientists are now even fine-tuning individual DNA bases.
While the concept of a microbe harboring a human insulin gene may seem strange, it still operates under the ancient genetic code that has governed life for billions of years. We may be on the cusp of a new epoch—one in which we, rather than natural evolution, dictate the code of life.
Carl Zimmer is a columnist for The New York Times and the author of 12 books, including A Planet of Viruses.