Humanity is faced with an enormous challenge in the coming decades. The world’s population is rapidly growing and we need to produce enough food, fuel, medicine and goods to support this growth in an environmentally sustainable and restorative way. Plants will inevitably provide many solutions to the problems we face, but we need to build environmentally sustainable, carbon-negative industries as soon as possible. Applying protein engineering to accelerate the development of improved crop varieties that can produce more while using less is a promising approach. Here we provide an introduction to the approaches, tools and philosophy of protein engineering, as well as several examples of problems in plant breeding and engineering that protein engineers are currently working to solve.

It is predicted that we need to increase agricultural production by ~50% before 2050 in order to feed the world’s growing population. Achieving this increase in a sustainable, climate-resilient manner will require many different, innovative solutions for improving crop yields while simultaneously decreasing the need for land, fertilizers, pesticides and irrigation. In essence, we need to develop sustainable agricultural systems that produce more while using less. The input side of this calculation must take into account the limited reserves of phosphorus fertilizer, which is required for cells to convert solar energy into plant growth and development. Additionally, the overuse of fertilizers leads to water pollution, since plants can only capture a certain amount of nutrients at a time. We can also think about sustainable agriculture through the lenses of climate change and disease. We not only need to dramatically increase agricultural production by 2050, but must also learn how to guide crop evolution in order to create resilient crops that can survive adverse climatic conditions and increase yields in the face of increasingly frequent, extreme and disastrous climate events. Developing a holistic understanding of plant biology, agronomy, ecology and environmental science and translating this understanding into applications will be critically important to achieving sustainable agriculture.

Plant breeding has been making progress towards increased agricultural sustainability, but at a rate limited by the generation time of crop plants, which typically allows only two to three generations per year. Improvements in agricultural practices have fortunately made up the gap until now. We now have the ability to rapidly measure many aspects of crop plants using drones, as well as to sequence plant genomes rapidly and accurately. However, many limitations, or frankly dead-ends, have arisen during the evolution of plants on Earth that constrain our ability to further improve the sustainability of agriculture. Numerous inefficiencies exist across all scales of biology simply because the inefficient way worked well enough for the previous generation. Domestication frequently reduces or constrains variation or diversity in traits, which then limits the rate of future improvements in crops. These evolutionary constraints on breeding, paired with the short time available to match growing global demand for agricultural products, pose a challenge that is particularly difficult, if not impossible, to overcome with traditional plant breeding. Fortunately, we are entering an era where we can edit plant genomes with relative ease and select breeding populations based on their whole genome sequences. However, the questions remain: Which gene edits will yield the greatest improvements, and which genome sequences are optimal for specific conditions? Protein engineering has the potential to dramatically increase the rate of improvement, at least by narrowing down the set of potential answers to these questions and at best by identifying a near-optimal solution.

Protein engineering aims to harness the existing molecular-level knowledge of how biology functions and to alter these functions to improve the useful value of biological systems. The methods used to alter biological functions in protein engineering are generally akin to natural evolution, where each new generation brings new diversity that is expressed and subjected to various challenging and competitive conditions (Figure 1). These conditions select for the fittest protein function, which gives rise to improved traits and overall fitter organisms in these conditions. However, this laboratory process, known as directed evolution, ideally occurs at a much faster pace and with a clear purpose and direction, as opposed to the meandering path of natural evolution. François Jacob argues in his article ‘Evolution and tinkering’ that natural evolution is a tinkerer, not an engineer.

Figure 1

A schematic of the cycle of evolution: diversification, expression and selection
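
The cycle in Figure 1 can be sketched as a short Python loop. This is a toy illustration under stated assumptions, not a laboratory protocol: the function names, the ten-residue sequences and the scoring rule (counting matches to a hypothetical optimal sequence) are all invented for the example. In a real campaign the optimum is unknown and the score comes from an experimental assay.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 natural amino acids


def diversify(parent, library_size, rng):
    """Diversification: build a library of variants, each with one random substitution."""
    variants = []
    for _ in range(library_size):
        pos = rng.randrange(len(parent))
        new_residue = rng.choice(AMINO_ACIDS)
        variants.append(parent[:pos] + new_residue + parent[pos + 1:])
    return variants


def express_and_score(variant, target):
    """Stand-in for expression plus assay: count positions matching a hypothetical optimum."""
    return sum(a == b for a, b in zip(variant, target))


def directed_evolution(start, target, rounds=50, library_size=200, seed=0):
    """Repeat the diversify, express, select cycle and record the parent's score each round."""
    rng = random.Random(seed)
    parent = start
    history = [express_and_score(parent, target)]
    for _ in range(rounds):
        library = diversify(parent, library_size, rng)
        # Selection: take the best variant in the library, but only if it is
        # at least as good as the current parent.
        best = max(library, key=lambda v: express_and_score(v, target))
        if express_and_score(best, target) >= history[-1]:
            parent = best
        history.append(express_and_score(parent, target))
    return parent, history


if __name__ == "__main__":
    # Both sequences are made up for illustration.
    final, history = directed_evolution("AAAAAAAAAA", "MKTAYIAKQR")
    print(final, history[0], history[-1])
```

Each pass through the loop corresponds to one turn of the cycle. The throughput of the scoring step, which here is a trivial function call, is what limits the pace of real experiments.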

To harness evolution, a protein engineer must navigate through the astronomically sized biological sequence space on an infinite-dimensional path. This task is well described through analogy in two short stories by the Argentinian author and librarian Jorge Luis Borges. In the ‘Library of Babel’, Borges describes a nearly infinite library of books, both sensical and not, that together contain all possible combinations of a set of characters filling 410 pages. In his 1995 book, Darwin’s Dangerous Idea, the philosopher of science Daniel Dennett recognized the similarity between Borges’ ‘Library of Babel’ and protein sequence space, which comprises all possible combinations of the 20 natural amino acids. This idea was further expanded upon by Nobel Laureate Frances Arnold, one of the founders of the field of protein engineering, in the Preface to the 55th volume of Advances in Protein Chemistry in 2001, and by Marc Ostermeier, a pioneering and inspiring protein engineer, in a 2007 review article. Fortunately, as Ostermeier notes, while the entirety of protein sequence space is on the same scale as the library of Babel, a protein engineer’s task is not as difficult as that of the librarians, who are faced with cataloguing the library of Babel. Unlike the library of Babel, where books are randomly arranged, proteins of related sequences are likely to have similar functions, and protein engineers can therefore navigate through sequence space on paths of similar function.
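
To give a feel for the numbers, the short Python sketch below compares the size of the whole sequence space with the size of the immediate neighbourhood of a single protein. The protein length of 300 residues is an assumption chosen only for illustration, and the function names are hypothetical.

```python
import math

NUM_AMINO_ACIDS = 20  # the 20 natural amino acids


def log10_sequence_space(length):
    """log10 of the number of possible sequences of a given length (20**length)."""
    return length * math.log10(NUM_AMINO_ACIDS)


def single_mutant_neighbours(length):
    """Sequences that differ from a given protein at exactly one position."""
    return (NUM_AMINO_ACIDS - 1) * length


def double_mutant_neighbours(length):
    """Sequences that differ from a given protein at exactly two positions."""
    return math.comb(length, 2) * (NUM_AMINO_ACIDS - 1) ** 2


if __name__ == "__main__":
    length = 300  # assumed protein length, for illustration only
    print(f"about 10^{log10_sequence_space(length):.0f} possible sequences")
    print(f"{single_mutant_neighbours(length)} single mutants")   # 5700
    print(f"{double_mutant_neighbours(length)} double mutants")   # 16190850
```

The whole space is far too large to enumerate, but the set of sequences one or two mutations away from a working protein is small enough to build and test, which is why walking between neighbouring sequences is a practical strategy.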

This idea of paths through sequence space brings up another of Borges’ short stories, ‘The Garden of Forking Paths’. In this story, Borges describes life’s choices as an infinite garden of forking paths, or a labyrinth in time, where each decision changes the shape of the paths and decisions that lie ahead. This garden of forking paths, similar to evolution, is linear but branching, with each fork or decision equating to a mutation or generation. A visual parallel can be seen in the tree of life, e.g., Darwin’s original sketch of an evolutionary tree (Figure 2). From one starting point, a series of forking paths represents the evolution of a variety of different species, or of genes and the proteins they encode.

Figure 2

A sketch from Charles Darwin’s notebook of what is perhaps the first diagram of an evolutionary tree

An important implication of the forking path of evolution is that one mutation along an evolutionary path might close paths to other related protein sequences, in a somewhat random and unpredictable way. For example, one mutation may conflict with another in the protein structure, causing misfolding and preventing the two mutations from co-existing. Because of this, evolution is full of contingencies: chance events that constrain which future events are possible. These historical contingencies in a gene, in multiple genes or in the whole genome of a species can thus constrain future evolution. Such contingencies impose numerous constraints on plant growth and productivity, as well as on the future natural evolution and traditional breeding of plants.
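
The example of two conflicting mutations can be made concrete with a toy fitness landscape in Python. The fitness values below are invented for illustration, and the labels A and B are hypothetical mutations. Each mutation is beneficial on its own, but the double mutant is strongly deleterious, as it would be if the pair caused misfolding.

```python
from itertools import permutations

# Hypothetical fitness values: wild type, each single mutant and the double mutant.
FITNESS = {
    frozenset(): 1.0,
    frozenset({"A"}): 1.3,
    frozenset({"B"}): 1.2,
    frozenset({"A", "B"}): 0.1,  # the two mutations clash
}


def uphill_steps(genotype, mutations, fitness):
    """Mutations not yet present that would increase fitness from this genotype."""
    return sorted(
        m for m in mutations
        if m not in genotype and fitness[genotype | {m}] > fitness[genotype]
    )


def accessible_paths(mutations, fitness):
    """Orders of acquiring all mutations in which every step increases fitness."""
    paths = []
    for order in permutations(mutations):
        genotype = frozenset()
        for m in order:
            next_genotype = genotype | {m}
            if fitness[next_genotype] <= fitness[genotype]:
                break  # selection would reject this step
            genotype = next_genotype
        else:
            paths.append(order)  # every step in this order was uphill
    return paths


if __name__ == "__main__":
    print(uphill_steps(frozenset(), ["A", "B"], FITNESS))       # ['A', 'B']
    print(uphill_steps(frozenset({"A"}), ["A", "B"], FITNESS))  # []
    print(accessible_paths(["A", "B"], FITNESS))                # []
```

From the wild type both forks are open, but whichever mutation happens to be fixed first closes the path to the other. Real landscapes involve many more sites, so the number of paths closed by each step grows quickly.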

How can we use protein engineering to identify and overcome these contingencies in evolution and use genome engineering tools and techniques to translate this knowledge into more efficient plants? In terms of Borges’ allegories, how can we expedite the cataloguing of the interesting parts of the library of Babel and map new territories within, and efficient routes through, the garden of forking paths? Here, we will introduce some examples of these questions and how scientists are currently seeking to answer them.

The evolution of plants on earth is only one possible path/solution to a group of organisms that fix carbon from the atmosphere using sunlight and water and minerals from the soil. Several other carbon fixation pathways exist in prokaryotes which may possess advantages over the typical pathway plants use to fix carbon. Additionally, certain plants and algae have evolved means of improving the efficiency of this carbon fixation pathway. Here we present a few examples of evolutionary contingencies that are limiting agricultural productivity and progress in plant breeding that protein engineers are currently trying to overcome.

Rubisco: constraining photosynthesis

Ribulose-1,5-bisphosphate carboxylase/oxygenase (rubisco) is in all likelihood the most abundant protein on our planet. It allows plants to fix carbon dioxide from the atmosphere that is then converted into many other important molecules that are necessary for all life on earth. Rubisco initially evolved in a much higher CO2 and much lower O2 environment. Because there were few plants and algae, early Earth’s atmosphere was very different from today’s. Unfortunately, contingencies in rubisco’s evolutionary history have constrained its adaptation, leading rubisco to be less efficient today in (relatively) low CO2 concentrations. Rubisco appears to be stuck in a dead-end in the garden of forking paths, requiring a long trek back through millions of years of evolutionary history, to begin adaptation to current conditions. Plants have instead evolved means of concentrating CO2 inside their cells to optimize the conditions for the constrained rubisco. Currently, many scientists and protein engineers are studying rubisco’s evolutionary origins and trying to engineer it to be more efficient. This is especially important as increasing global CO2 fixation is critical to decreasing the impacts of anthropogenic climate change.

Chemistry can be a drag sometimes

Another example of an inefficiency that has been retained throughout the evolution of plants is suicide enzymes: enzymes that are more reactants than catalysts, because their active sites are irreversibly altered in the reactions they catalyse. Additionally, in plants, many enzymes and other cellular machinery are damaged by the numerous reactive oxygen species produced during photosynthesis. Producing a new enzyme requires about 5 ATP molecules per amino acid, and degrading a damaged enzyme requires ~100–200 ATP molecules. Together this adds up to an enormous amount of cellular energy for each suicide enzyme-catalysed reaction or damaged enzyme. Thus engineering non-suicidal, long-lasting, truly catalytic enzymes, as well as protections against reactive oxygen species, could save plants and other organisms a great deal of energy that could be converted to yield and useful value.
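
The arithmetic behind this argument can be worked through in a few lines of Python using the figures quoted above (about 5 ATP per amino acid for synthesis and ~100–200 ATP for degradation). The enzyme length of 350 residues and the figure of 10,000 reactions per lifetime are assumptions chosen only for illustration, and the function names are hypothetical.

```python
ATP_PER_RESIDUE_SYNTHESIS = 5      # approximate cost quoted in the text
ATP_DEGRADATION_RANGE = (100, 200)  # approximate range quoted in the text


def replacement_cost_atp(n_residues):
    """(low, high) ATP cost of degrading one enzyme molecule and making a new one."""
    synthesis = ATP_PER_RESIDUE_SYNTHESIS * n_residues
    low, high = ATP_DEGRADATION_RANGE
    return synthesis + low, synthesis + high


def cost_per_reaction_atp(n_residues, reactions_per_lifetime):
    """Replacement cost spread over the reactions an enzyme catalyses before it is replaced."""
    low, high = replacement_cost_atp(n_residues)
    return low / reactions_per_lifetime, high / reactions_per_lifetime


if __name__ == "__main__":
    n = 350  # assumed enzyme length in amino acids
    print(replacement_cost_atp(n))           # (1850, 1950)
    # A suicide enzyme performs a single reaction before it must be replaced.
    print(cost_per_reaction_atp(n, 1))       # (1850.0, 1950.0)
    # An assumed long-lived enzyme performing 10,000 reactions.
    print(cost_per_reaction_atp(n, 10_000))  # (0.185, 0.195)
```

Under these assumptions, the replacement cost per reaction differs by four orders of magnitude between the suicide enzyme and the long-lived one, which is the saving that engineering truly catalytic enzymes aims to capture.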

Genomic complexity and contingencies of evolution

Protein engineering can allow us to explore additional solutions and perhaps overcome some of the limitations and contingencies resulting from evolutionary context, in terms of both single genes adapting to changing environments as above and genes co-evolving with one another. No gene or species is ever evolving alone, in isolation. It may seem nearly impossible to deconvolute the numerous connections that have evolved between the different genetically encoded signalling pathways plants use to co-ordinate their growth, development and behaviour. However, many scientists are studying plant evolution in both natural and domestication contexts and building and testing models of our current understanding of how plants perceive their environment and co-ordinate their growth and behaviour accordingly. Through this work, we will hopefully be able to identify constraints as well as engineerable target genes and transgenic solutions to overcome these constraints.

The domestication of maize (corn) is a very well-studied example of how domestication can introduce evolutionary constraints on future breeding, while also providing an amazing success story. Modern maize was bred from the wild teosinte more than 6000 years ago in the Oaxaca region of modern Mexico. Teosinte is a bushy, branched grass that has a very small ear containing a single-file stack of extremely hard kernels, whereas modern maize generally has a single unbranched stalk and very large ears with around one dozen long rows of soft kernels (Figure 3). These changes in plant form have dramatically improved productivity of this plant by increasing the ease of harvesting and preparation for food and feed. However, the overall per plant yield has not really improved in a dramatic way.

Figure 3

Comparing the form of modern corn (maize) and its wild relative teosinte. Credit: Nicolle Rager Fuller, National Science Foundation.

Despite the dramatic difference in appearance between maize and teosinte, the underlying genetic changes involved only about five genes. Unfortunately, these differences not only contributed to an easier, more plentiful harvest and to adaptation across multiple environments, but also introduced constraints that have limited improvement in grain yield. In terms of the garden of forking paths, when a trait (and its associated genome) is selected, a path is set that potentially limits future exploration of the garden. The forking path of evolution for maize that allowed for development of many improved traits also reduced variation in grain yield, introducing a dead-end in terms of yield improvement. Engineering specific aspects of plant development is a very difficult challenge that often involves many rounds of the trial-and-error design-build-test-learn engineering cycle. Because plants grow and develop in accordance with environmental conditions, many signals conveying information about a plant’s environment as well as its current state of development must be integrated at each decision point. As such, plant developmental signalling is highly connected, meaning that many potentially wide-ranging aspects of plant growth and development can be affected by single genetic mutations. At the same time, developmental signalling is also highly robust, meaning that most possible mutations will not lead to negative developmental outcomes.

To overcome evolutionary constraints, new breeding and genome engineering strategies are needed to introduce novel genetic variation and access new forks in the evolutionary path. We need to develop the tools of protein engineering, directed evolution and synthetic biology, both in plants and in systems that rapidly and efficiently translate to plants. These tools will allow us to expand and expedite our efforts to catalogue the library of plant genomic space and explore the garden of plant evolution, and ultimately to develop crops that are more resilient and adaptable to known and future adversities and that produce more using less.

Overcoming the challenges we have exemplified above, and others facing agriculture, will require new approaches, technologies and knowledge, as well as massive scaling and acceleration of the cycles of evolution for plant breeding and engineering (Figure 1). Scaling of genomic sequencing, which is critical for studying natural or directed evolution, is becoming a non-issue, as sequencing technology has seen one of the steepest price declines of any technology over the past two decades. DNA synthesis and techniques for generating genetic diversity have followed a similar trend, particularly with the advent of CRISPR/Cas9. These advances in generating and characterizing diversity have left expression and selection as the primary rate-limiting steps in the cycle of evolution (Figure 1). While numerous highly accurate tools for genome engineering have been developed and implemented in plants, expression of engineered genomes is still limited by the generation time of plants, or by the slow, difficult and improbable (or, for some plants, currently impossible) process of regenerating plants from individual engineered cells. Rapid advances in trait measurement are being made thanks to high-content imaging and computer vision using drones and satellites, as well as machine learning; however, trait measurements (commonly referred to as phenotyping in the literature) are still limited by the availability of space and by environmental fluctuations, in addition to generation time. Because of this limitation, it is often advantageous to pursue other ways to assess how genetic changes can affect function and plant traits, namely computational and synthetic biology approaches. These approaches aim to recapitulate behaviours of plants and their genes and genomes in computers and test tubes, to allow for enormous experimental throughput and to narrow down the rate-limiting genome engineering and field experiments to the most promising candidates. For this reason, most directed evolution and protein engineering for plants actually occurs in the workhorse bacterium Escherichia coli or the yeast Saccharomyces cerevisiae. While this does accelerate the cycle of evolution, not all engineered proteins will function in plants as they did in bacteria or fungi, due to the many differences between these kingdoms of life.

Despite the excitement about gene editing for improving agricultural sustainability, we only know of six types of traits approved for use in crops. This certainly leaves a wide open field for new improvements to be discovered and implemented. However, we have not touched here on the fact that plants have evolved in concert with microbial consortia that are also the focus of current and future engineering efforts. Currently, both Pivot Bio and Joyn Bio (now a part of Ginkgo Bioworks’ Ag Biologics division) are developing microbes, partially using genome engineering and directed evolution, that can help plants fix nitrogen from the atmosphere, improve carbon fixation and sequestration and protect plant health. Pivot’s Proven 40 has increased corn biomass by 12% and nitrogen content by 14% in a multi-year field study compared to standard application of nitrogen fertilizers.

These efforts and others to engineer sustainable agricultural ecosystems along with numerous gene-edited and transgenic crop development projects all aim to decrease fertilizer and pesticide requirements, while simultaneously increasing yield and useful value of crops to increase the efficiency of agriculture. However, engineering genomes and making seed accessible remains a slow process. We need to harvest all of the information and experimental power we can from microbial and rapid-generation-time model plants to guide our engineering of next-generation crops, if we plan to keep up with growing demands and population. While there are many promising results, very few have been proven in the field.

There is still a great deal of work to be done and many additional solutions to explore beyond engineering crops. As a great proportion of plant agricultural yields in western nations goes towards animal feed, it is certain that Earth cannot support a growing population that adopts the current western, animal-protein-rich diet, so the need to adopt more sustainable diets is inevitable. Strategies for reducing food waste will also play an essential part in solving the looming increase in demand. We are faced with an enormous global challenge, but the solutions to this challenge are visible on the horizon of translational, transdisciplinary science. It will take a global coalition of science, industry and agriculture as well as community leaders and environmental advocates to provide sustainable food, feed, pharmaceuticals and fuels to our growing population.

  • Milo, R., Jorgensen, P., Moran, U., et al. (2010) BioNumbers—the database of key numbers in molecular and cell biology. Nucleic Acids Res., 38, D750–D753. DOI: 10.1093/nar/gkp889

  • Erb, T.J. and Zarzycki, J. (2018) A short history of RubisCO: The rise and fall (?) of Nature’s predominant CO2 fixing enzyme. Curr. Opin. Biotechnol., 49, 100–107. DOI: 10.1016/j.copbio.2017.07.017

  • van Dijk, M., Morley, T., Rau, M.L. and Saghai, Y. (2021) A meta-analysis of projected global food demand and population at risk of hunger for the period 2010–2050. Nat. Food, 2, 494–501. DOI: 10.1038/s43016-021-00322-9

  • Blount, Z.D., Lenski, R.E. and Losos, J.B. (2018) Contingency and determinism in evolution: Replaying life’s tape. Science, 362, eaam5979. DOI: 10.1126/science.aam5979

  • Eshed, Y. and Lippman, Z.B. (2019) Revolutions in agriculture chart a course for targeted breeding of old and new crops. Science, 366, eaax0025. DOI: 10.1126/science.aax0025

  • Rao, G.S., Jiang, W. and Mahfouz, M. (2021) Synthetic directed evolution in plants: unlocking trait engineering and improvement. Synth. Biol., 6, ysab025. DOI: 10.1093/synbio/ysab025

  • Rizzo, G., Monzon, J.P., Tenorio, F.A., et al. (2022) Climate and agronomy, not genetics, underpin recent maize yield gains in favorable environments. Proc. Natl Acad. Sci., 119, e2113629119. DOI: 10.1073/pnas.2113629119

  • Lin, M.T., Salihovic, H., Clark, F.K. and Hanson, M.R. (2022) Improving the efficiency of Rubisco by resurrecting its ancestors in the family Solanaceae. Sci. Adv., 8, eabm6871. DOI: 10.1126/sciadv.abm6871

  • Bar-Even, A., Noor, E. and Milo, R. (2012) A survey of carbon fixation pathways through a quantitative lens. J. Exp. Bot., 63, 2325–2342. DOI: 10.1093/jxb/err417

  • Yang, C.J., Samayoa, L.F., Bradbury, P.J., et al. (2019) The genetic architecture of teosinte catalyzed and constrained maize domestication. Proc. Natl Acad. Sci., 116, 5643–5652. DOI: 10.1073/pnas.1820997116

  • Hanson, A.D., McCarty, D.R., Henry, C.S., et al. (2021) The number of catalytic cycles in an enzyme’s lifetime and why it matters to metabolic engineering. Proc. Natl Acad. Sci., 118, e2023348118. DOI: 10.1073/pnas.2023348118

  • Arnold, F.H. (2001) Preface. Adv. Prot. Chem., 55, ix–xi. DOI: 10.1016/S0065-3233(01)55000-7

  • Ostermeier, M. (2007) Beyond cataloging the Library of Babel. Chem. Biol., 14, 237–238. DOI: 10.1016/j.chembiol.2007.03.002

  • García-García, J.D., Van Gelder, K., Joshi, J., et al. (2022) Using continuous directed evolution to improve enzymes for plant applications. Plant Physiol., 188, 971–983. DOI: 10.1093/plphys/kiab500

  • Brophy, J.A.N. (2022) Toward synthetic plant development. Plant Physiol., 188, 738–748. DOI: 10.1093/plphys/kiab568

  • Wright, R.C. and Nemhauser, J. (2019) Plant Synthetic Biology: Quantifying the “Known Unknowns” and Discovering the “Unknown Unknowns”. Plant Physiol., 179, 885–893. DOI: 10.1104/pp.18.01222

Clay Wright completed his bachelor’s degree in chemical and biomolecular engineering at North Carolina State University, prior to studying protein engineering, directed evolution and synthetic biology as a PhD student in chemical and biomolecular engineering at Johns Hopkins University. Clay was a postdoctoral fellow in biology and electrical engineering at University of Washington, prior to starting his own lab at Virginia Tech in 2019. The Wright Plant Synthetic Biology Lab studies and engineers how complex, dynamic gene networks determine cell fate and how robustness, plasticity and evolution of these networks are linked across cells, tissues and organisms.

Deisiany Ferreira Neres received her BSc in forest engineering from Federal University of Goias in Goiania, GO, Brazil. She joined the Wright Lab in March 2021, and her research objectives are to predict which hormone signalling genes regulate agriculturally important traits in crops and to engineer resistances against plant pathogens’ effector proteins.

Patarasuda Chaisupa received a bachelor's degree in pharmacy in Thailand. She is now a graduate student in biological systems engineering at Virginia Tech. Her curiosity surrounding genome editing tools like CRISPR-Cas9, synthetic biology and diverse scientific disciplines brought her to this PhD program. She is seeking knowledge and skills to use and develop new biological tools to enhance the production of pharmaceuticals.

John A. Bryant is a PhD student in biological systems engineering at Virginia Tech. He received his BSc in biology from Samford University in Birmingham, AL. John joined the Wright Lab in August 2019, and his research objective is to develop predictive models and new methods for engineering plant hormone pathways.

Published by Portland Press Limited under the Creative Commons Attribution License 4.0 (CC BY-NC-ND)