The genetic, physiological and metabolic diversity of microalgae has driven fundamental research into photosynthesis, flagella structure and function, and eukaryotic evolution. Within the last 10 years these organisms have also been investigated as potential biotechnology platforms, for example to produce high value compounds such as long chain polyunsaturated fatty acids, pigments and antioxidants, and for biodiesel precursors, in particular triacylglycerols (TAGs). Transformation protocols, molecular tools and genome sequences are available for a number of model species including the green alga Chlamydomonas reinhardtii and the diatom Phaeodactylum tricornutum, although for both species there are bottlenecks to be overcome to allow rapid and predictable genetic manipulation. One approach to do this would be to apply the principles of synthetic biology to microalgae, namely the cycle of Design-Build-Test, which requires more robust, predictable and high throughput methods. In this mini-review we highlight recent progress in the areas of improving transgene expression, genome editing, identification and design of standard genetic elements (parts), and the use of microfluidics to increase throughput. We suggest that combining these approaches will provide the means to establish algal synthetic biology, and that application of standard parts and workflows will avoid parallel development and capitalize on lessons learned from other systems.
Microalgae are an immensely diverse group of eukaryotic microorganisms that are nonetheless united by the fact that most are aquatic, unicellular organisms that photosynthesize. For decades they have been used as models to study fundamental biological processes. Perhaps the best known is the green alga Chlamydomonas reinhardtii (Figure 1A), which is an ideal simple system for the study of photosynthesis, especially since it can grow heterotrophically on acetate, so mutations that impair photosynthesis are not lethal. Somewhat paradoxically, C. reinhardtii is also a model organism for understanding the structure, assembly and function of eukaryotic flagella and cilia, which have been retained from basal eukaryotes into the animal lineages, although lost in land plants. The fundamental knowledge generated through this early work resulted in the establishment of techniques for genetic manipulation including routine cultivation, transformation and mutagenesis, not just for C. reinhardtii [2,4], but also for several other green algae, red algae such as Cyanidioschyzon merolae, and more distantly related species such as the diatoms Phaeodactylum tricornutum (Figure 1B) and Thalassiosira pseudonana, and species of the eustigmatophyte Nannochloropsis. Moreover, there are currently more than 60 sequenced algal genomes completed or in progress, and the increasing number of transcriptomic and proteomic studies is providing a rich resource for gene discovery and understanding of metabolism [6,7]. As a consequence, the development of algal biotechnology has been stimulated in recent years, not least because several species have the natural capacity to synthesize and store compounds of commercial interest, including high value products (pigments, vitamins, antioxidants), and those compounds with potential as bulk chemicals, such as glycerol or triacylglycerols (TAGs) that could be used as biofuels.
In addition, microalgae are suitable industrial biotechnology platforms, since like bacteria and yeast they can be grown at scale in controlled and contained environments, but with the added advantage that their photosynthetic lifestyle means that their cultivation may be more sustainable. However, this field is still in its infancy, not least because the ability to carry out genetic manipulation of algae in a predictable and efficient manner is still limited.
Building algal synthetic biology
With the advent of synthetic biology approaches, there is the potential to make a complete step-change in this field. Synthetic biology shares the objectives of metabolic engineering but, significantly, also applies the principles of engineering in a manner that can improve efficiency. This is achieved through a combination of platform development to improve process robustness, the application of standardized protocols and parts to generate predictable outcomes, and a design process that is informed by the results, employing an iterative Design-Build-Test-Learn cycle (Figure 1C). The aim is a (re)design of biological systems, which can then have tangible outputs. In the first instance these outputs are likely to benefit metabolic engineering efforts focused on increasing the productivity of high value natural compounds. In the longer term the aspiration of synthetic biology is towards achieving higher levels of design complexity, going beyond single gene expression cassettes to the concept of abstraction, depicted by the outer circular arrows in Figure 1C. This is where combinations of genes (or devices), for example a complete metabolic pathway, or even entire biological networks, can be used in a predictable fashion because the behaviour of each individual component is known. In parallel, synthetic biology increases the rate of progress by the generation of insulated and transferable tools that operate in a predictable manner not only in different synthetic circuits in an organism, but also in different organisms. We are some way off this for microalgae, but in this mini-review we outline several new sophisticated molecular approaches that are being developed for these organisms, focusing on C. reinhardtii and P. tricornutum. We show how, combined, these techniques will pave the way for algal synthetic biology to develop, not just for commercial exploitation of algae, but also for deeper understanding of fundamental biological processes.
Since the first reports of transformation of C. reinhardtii in the late 1980s, introduction of genes into the nucleus has been achieved for many microalgae [5,11], although it is routine only for a few species. Homologous recombination has been reported for Ostreococcus and Nannochloropsis spp., but not for C. reinhardtii or diatoms, and instead the transgene is integrated at random in the genome, leading to variation in expression between transformants. For C. reinhardtii a particular issue is poor transgene expression and stability, due at least in part to efficient gene silencing activity. Serendipitously it was discovered that use of the promoter for heat shock protein 70A (HSP70A) to drive transgene expression improved the frequency of transformants, either when used alone or fused with other promoters such as RUBISCO small subunit 2 (RBCS2), and indeed these chimaeric promoters operate in other unrelated algae such as Nannochloropsis. When used to drive transgene expression the HSP70A promoter had higher levels of H3/4 acetylation and reduced nucleosome occupancy, both indicative of open chromatin, and reduced levels of methylation of H3 at Lys9, characteristic of repressed chromatin, compared with the RBCS2 promoter, and this state could be transferred to the latter in the chimaeric promoter. Further evidence for the important role of the chromatin state for effective transgene expression came from an alternative approach. Neupert et al. employed UV mutagenesis to generate two strains with improved expression of a reporter, YFP. Not only was the frequency of transgene integration enhanced from 10% to 50%, but the level of expression was increased so that fluorescence of the protein could be detected in all transformants. Subsequently, Barahimipour et al. worked with one of the resulting strains, UVM11, and found that the improvement was due to the presence of fewer nucleosomes in analysed promoters.
The authors also found higher levels of acetylation at histone H4, and lower levels of mono-methylation of lysine 9 in histone H3, again indicating a more open chromatin state.
In P. tricornutum, although there appears to be less of an issue with transgene silencing, transformation protocols via biolistics or electroporation are time-consuming, requiring several weeks to obtain transformants, and much less efficient (∼1 in 10⁶ cells transformed) than for C. reinhardtii (∼1 in 10⁴ transformed). Karas et al. explored the possibility of using extrachromosomal vectors, or episomes, for the expression of transgenes, both to increase efficiency and to avoid complications caused by biolistics and electroporation, which include multiple and fragmented insertions, and random chromosomal integration leading to position-effects on expression and potential disruption of endogenous genes. Although circular DNA molecules had been isolated from diatoms previously they could not be successfully reintroduced and recovered, so instead, the authors screened for sequences that stabilized circular DNA molecules, enabling them to be recovered from transformed P. tricornutum cells after multiple rounds of cell division. They were also able to develop a conjugation protocol that allowed transfer of the construct assembled in Escherichia coli directly to diatom cells, increasing the frequency of transformation to more than 1 in 10⁴ cells. Intriguingly, the greatest stability of the episomes was conferred not by P. tricornutum sequences, but rather by a sequence used to allow episomal maintenance in yeast, CEN6-ARSH4-HIS3, indicating conservation of function between organisms. Karas et al. further demonstrated that the episomes could be used effectively to transform another diatom, T. pseudonana.
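To put these frequencies in perspective, the following back-of-the-envelope sketch compares the expected yield of transformants for the two delivery routes in P. tricornutum. The frequencies are the approximate values quoted above; the number of cells plated is a hypothetical figure chosen purely for illustration and is not taken from the studies discussed.

```python
from fractions import Fraction

# Approximate transformation frequencies quoted in the text (transformants per cell).
BIOLISTIC_FREQUENCY = Fraction(1, 10**6)    # biolistics or electroporation
CONJUGATION_FREQUENCY = Fraction(1, 10**4)  # episome delivered by conjugation (lower bound)


def expected_transformants(cells_plated: int, frequency: Fraction) -> Fraction:
    """Expected number of transformants for a given number of cells plated."""
    return cells_plated * frequency


# Hypothetical experiment size, used only for illustration.
CELLS_PLATED = 10**8

biolistic = expected_transformants(CELLS_PLATED, BIOLISTIC_FREQUENCY)
conjugation = expected_transformants(CELLS_PLATED, CONJUGATION_FREQUENCY)
fold_improvement = CONJUGATION_FREQUENCY / BIOLISTIC_FREQUENCY

print(f"biolistics: about {int(biolistic)} transformants")
print(f"conjugation: at least about {int(conjugation)} transformants")
print(f"fold improvement: at least {int(fold_improvement)}x")
```

On these assumptions the conjugation route yields at least two orders of magnitude more transformants from the same number of cells, which is what makes library-scale screening plausible.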
As well as nuclear transformation, in C. reinhardtii it is also possible to transform the chloroplast, integrating the foreign DNA via homologous recombination. A major application of C. reinhardtii to date has been the knockout of chloroplast-encoded genes, which has been seminal in identifying their function both in photosynthesis and the chloroplast genetic machinery [2,20]. Subsequently, the ability to introduce novel genes into the chloroplast has been exploited for the expression of recombinant proteins such as antigens, with production levels of up to 5% of cellular protein. As a separate compartment from the nucleus, the chloroplast offers the opportunity for more sophisticated genetic manipulation. Young and Purton observed that in the C. reinhardtii chloroplast the codon UGA, which is a stop codon in the majority of genomes, is not used. The authors subsequently engineered C. reinhardtii so that the redundant UGA codon encoded the amino acid tryptophan (normally encoded by UGG). Modification of the primary sequence of a transgene to incorporate an internal UGA would mean that translation of a functional protein would only be possible in the engineered C. reinhardtii strain, because in any other cell type, where UGA is a stop codon, a truncated protein would result (Figure 2). The fidelity of the system was demonstrated through the biosynthesis of two toxic recombinant proteins, the SPN9CC endolysin from a Salmonella typhimurium phage, and a toxic protein from Shewanella denitrificans of unknown function but identified from the PanDaTox database as unclonable in E. coli. This approach also provides a mechanism to limit concerns over transgene escape into other organisms.
Figure 2. Schematic of codon reassignment in C. reinhardtii chloroplasts
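The logic of the codon reassignment strategy can be illustrated with a minimal sketch. It translates the same toy transcript with two codon tables, one in which UGA is a stop codon and one in which UGA has been reassigned to tryptophan. The transcript and the reduced codon tables are hypothetical and contain only the codons needed for the example; the sketch does not model how the reassignment is achieved in the engineered strain.

```python
# Reduced codon table containing only the codons used in the toy transcript.
STANDARD_TABLE = {
    "AUG": "M", "GCU": "A", "UGG": "W", "AAA": "K", "GAA": "E",
    "UGA": "*", "UAA": "*",  # "*" marks a stop codon
}

# Engineered chloroplast: UGA is reassigned from stop to tryptophan (W).
REASSIGNED_TABLE = {**STANDARD_TABLE, "UGA": "W"}


def translate(mrna: str, table: dict) -> str:
    """Translate an mRNA from its first codon up to the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        residue = table[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical transgene with an internal UGA where a tryptophan is intended.
TRANSGENE = "AUG" "GCU" "UGA" "AAA" "GAA" "UAA"

in_other_host = translate(TRANSGENE, STANDARD_TABLE)
in_engineered_strain = translate(TRANSGENE, REASSIGNED_TABLE)

print(in_other_host)         # product truncated at the internal UGA
print(in_engineered_strain)  # full-length product
```

Only the table with the reassigned codon gives the full-length product, which is the basis both for expressing toxic proteins that cannot be cloned intact elsewhere and for the biocontainment argument.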
Manipulation of endogenous genes
The difficulties that result from the efficient silencing machinery in C. reinhardtii nonetheless offered the means for down-regulation of endogenous genes, and both antisense and inverted-repeat containing RNA constructs were shown to reduce gene expression. The identification of microRNAs (miRNAs), previously only known in multicellular eukaryotes, led to development of efficient miRNA methods, and these have been used to good effect to study the function of biologically important genes. For example, knockdowns of the C. reinhardtii homologue of HYDIN, a gene which in mammals is linked to hydrocephalus, demonstrated that this protein played an essential role in correct flagellar function, indicating that defects in cilia activity might be responsible for alterations in cerebrospinal fluid transport in patients with hydrocephalus. RNA interference has also been used widely to alter enzyme activity, particularly in the field of production of compounds of value such as TAGs, which are potential biodiesel precursors. As well as demonstrating which are the key enzymes of the biosynthetic pathway for TAGs, studies include the observation that down-regulation of a lipid droplet protein reduced levels of neutral lipid, indicating the importance of the sink for TAGs to accumulate. Moreover, more distant links were revealed between TAG production and pathways such as the TCA cycle, providing leads for further metabolic engineering strategies. Although the components of the silencing machinery characterized in plants and C. reinhardtii appear to be poorly conserved in diatoms, RNAi technology using similar inverted repeat (hairpin) constructs works effectively to reduce gene expression in diatoms including P. tricornutum, and has been used to study cell biology of the organism, as well as investigating potential routes to improve TAG production.
Refinements to this technology include online tools for the design of suitable miRNAs [33,34], and improved vectors that make it easier to assemble constructs containing the hairpin sequences. In one example, inclusion of the gene for luciferase between the promoter and the artificial miRNA precursor allowed screening of suitable knockdown strains by luciferase luminescence.
To increase the precision of genetic manipulation it would be desirable to have targeted gene insertion, avoiding potential pleiotropic effects caused by disruption of endogenous genes. In the absence of homologous recombination for the introduction of transgenes into the nucleus, genome editing, using site-specific nucleases such as transcription activator-like effector nucleases (TALENs) and CRISPR/Cas9 to cause cleavage and non-homologous end joining repair, is an alternative approach, and indeed is revolutionizing many areas of science. However, there has been limited success with these methods in C. reinhardtii. Although genome editing with CRISPR/Cas9 was demonstrated in one report, it was only after transient expression of Cas9, suggesting that this protein may be lethal in C. reinhardtii. In contrast, TALENs have been used effectively in P. tricornutum, for example to edit the gene for UDP-glucose pyrophosphorylase by introduction of a premature stop codon into the coding region. This disruption was hypothesized to impact fatty acid accumulation in P. tricornutum, and this was indeed confirmed by a high throughput flow cytometry screen and HPLC/MS. Separately, Weyman et al. used TALENs to insert the zeocin resistance gene (BLE), concomitant with deletion of the urease gene of P. tricornutum, resulting in site-specific insertion into the P. tricornutum nuclear genome. Several laboratories around the world have been investigating the potential to use CRISPR/Cas9 in diatoms. In the first publication, Nymark and coworkers report how the use of high resolution melting based PCR assays facilitated identification of lines with altered sequences, resulting in a mutation frequency of 31%.
Standardization of the design process
In parallel with improvement in transformation methods, molecular toolkits of different genetic elements have been developed, for example to ensure efficient expression, to allow regulation of transgene expression or to target the gene product to the appropriate cellular location. Lauersen et al. generated a versatile set of vectors for C. reinhardtii, called pOptimized, with combinations of different reporters (fluorescent proteins clover, mRuby2, mCerulean3 and mVenus, and luciferase), and selectable markers, paromomycin and hygromycin, as well as a series of targeting peptides. The different parts could be combined with each other in various permutations using unique flanking restriction sites, and the constructs generated were used to demonstrate the fidelity of targeting sequences for secretion (N-terminal carbonic anhydrase 1), localization to the nucleus (N-terminal SV40), chloroplast (N-terminal PSAD peptide), mitochondria (N-terminal ATPA peptide), microbody (C-terminal peptide derived from malate synthase PTS1 of pumpkin and C. reinhardtii) and the pyrenoid (C-terminal RBCS1 peptide).
Barahimipour et al. investigated the impact of GC content and codon use on transgene expression in C. reinhardtii strain UVM11, generated through UV mutagenesis and selection for strains demonstrating enhanced nuclear transgene expression. Through careful experimental design the authors were able to test the impact of GC content of the DNA sequence and codon use independently on transgene expression in this strain. They demonstrated that while both factors are important, codon usage was the key factor determining translation efficiency and mRNA stability, potentially a by-product of increased ribosome occupancy and its stabilizing effect, whereas GC content affected transgene expression at the level of chromatin structure, for reasons discussed above.
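Because GC content and codon usage are both properties of the same coding sequence, it helps to see how they are measured separately. The sketch below is a minimal illustration: it takes two synonymous coding sequences, confirms that they encode the same peptide, and reports their GC content and codon counts. The sequences and the reduced codon table are hypothetical and were not taken from the study; separating the two variables experimentally requires choosing constructs in which one property varies while the other is held approximately constant.

```python
from collections import Counter

# Reduced DNA codon table containing only the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "GCC": "A", "GCT": "A",
    "AAG": "K", "AAA": "K",
    "CTG": "L", "TTA": "L",
    "GGC": "G", "GGA": "G",
    "TAA": "*",
}


def codons(cds: str) -> list:
    """Split a coding sequence into consecutive triplets."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence, keeping '*' for the stop codon."""
    return "".join(CODON_TABLE[codon] for codon in codons(cds))


def gc_content(cds: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    return sum(base in "GC" for base in cds) / len(cds)


# Two hypothetical synonymous variants of the same short open reading frame.
HIGH_GC = "ATGGCCAAGCTGGGCTAA"
LOW_GC = "ATGGCTAAATTAGGATAA"

for name, cds in (("high GC", HIGH_GC), ("low GC", LOW_GC)):
    print(name, translate(cds), f"GC={gc_content(cds):.2f}", dict(Counter(codons(cds))))
```

Both variants give the same peptide, so any difference in expression between such constructs would have to arise from the nucleotide sequence itself rather than from the protein product.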
The resources available for P. tricornutum, although more limited, are beginning to be developed. To complement promoters of well-characterized genes, a cis-acting regulatory element in the 5′-flanking region of the LHCF2 gene encoding light-harvesting complex containing fucoxanthin 2 protein has been shown to act as an enhancer of transcription. It is functionally similar to the now ubiquitously employed RBCS2 intron of C. reinhardtii in that it works independently of orientation and can be separated from the start of transcription, as demonstrated by the insertion of a 360 base pair spacer sequence between the LHCF2 element and the reporter β-glucuronidase (GUS). A series of transit peptides have also been described from P. tricornutum allowing proteins to be targeted to the nucleus, mitochondria and chloroplast, using knowledge from previously characterized genes. With the increasing amount of RNAseq data available, it is likely that a more systematic way of identifying new elements that can improve levels and regulation of transgene expression will be established.
Coupled with identification of the appropriate sequences has been the realization that standardization of assembly of the different components, together with standard workflows, would allow much more reproducible data to be collected. Although such approaches have not yet been widely adopted, methods and tools already exist to achieve this in the form of standardized and scalable methods to generate transgene expression constructs quickly, such as Gibson assembly and Golden Gate cloning (and variations thereof). This will be greatly enhanced by the adoption of a common syntax for standard biological parts, as proposed for plant synthetic biology. In parallel, protocols for analysis must be standardized, from the way cells are processed and transformed, through to how specific values or reporters are tested and analysed and how data are presented. These analyses can be supported by existing technologies such as colony picking and liquid handling robots, both of which allow much higher throughput of samples, providing statistical power in comparing, for example, the efficacy of two different promoters or different codon usage. A particularly exciting prospect is the fact that microalgae are amenable to analysis using microfluidics, an emerging field that allows analysis at the single cell level, such as the dynamics of the C. reinhardtii flagellar cycle. A refinement of the approach is microdroplets, where cells are encapsulated in aqueous droplets that are carried in an oil phase and can be handled in a variety of devices. For example, droplets containing single cells of C. reinhardtii, as well as Chlorella vulgaris and Dunaliella salina, have been maintained in reservoirs for several days allowing cell division to be observed and quantified.
More recently, compounds produced by the cells have been quantified by fluorescence measurements, such as ethanol produced by genetically modified cyanobacteria, and sorting devices can be used to separate droplets with high levels of product from the population. In principle this approach could be used to identify highly-expressing transformants from a population of cells, and coupled with generation of mixed libraries of constructs with different parts, this might offer the means to screen for optimal sequences without the need to prepare individual transformants first, thus increasing throughput of analysis substantially.
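The idea of a common syntax for standard parts, mentioned above in the context of Golden Gate cloning, can be sketched in a few lines. In the sketch each part carries a four-base fusion site at either end, and an assembly is accepted only if every part's left-hand site matches the right-hand site of the part before it. The part names, part sequences and fusion sites used here are illustrative assumptions, not a specification of any published standard, and the function models only the ordering logic, not the enzymatic reaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """A standard part flanked by four-base fusion sites (overhangs)."""
    name: str
    left: str      # fusion site at the 5' end of the part
    right: str     # fusion site at the 3' end of the part
    sequence: str  # part sequence between the fusion sites


def assemble(parts: list, backbone_left: str, backbone_right: str) -> str:
    """Join parts in order; raise ValueError if adjacent fusion sites differ."""
    expected = backbone_left
    product = []
    for part in parts:
        if part.left != expected:
            raise ValueError(
                f"{part.name}: expected fusion site {expected}, found {part.left}"
            )
        product.append(part.left + part.sequence)
        expected = part.right
    if expected != backbone_right:
        raise ValueError(
            f"last part ends with {expected}, backbone requires {backbone_right}"
        )
    return "".join(product) + backbone_right


# Hypothetical parts; sequences are placeholders, fusion sites are illustrative.
promoter = Part("promoter", "GGAG", "AATG", "ttgacagcta")
reporter = Part("reporter_cds", "AATG", "GCTT", "gctaaagaa")
terminator = Part("terminator", "GCTT", "CGCT", "ttaacgcgt")

construct = assemble([promoter, reporter, terminator], "GGAG", "CGCT")
print(construct)
```

Because the order of the parts is fixed by their fusion sites, any promoter that follows the same syntax can be exchanged for another without redesigning the rest of the construct, which is what makes mixed libraries of parts straightforward to build.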
Due to their photosynthetic lifestyle, diverse range of valuable natural products and amenability to high throughput analytical techniques, microalgae present a unique set of opportunities in academic research and industrial biotechnology. C. reinhardtii remains the model microalga, with the largest knowledge foundation, standardized protocols and most developed resources that fit the synthetic biology remit, but alternative microalgal platforms such as P. tricornutum are developing fast. The many different approaches aimed at improving genetic manipulation of the two organisms described here offer considerable potential for improving our understanding of algal biology and the exploitation of microalgae for industrial biotechnology. At present the approaches are often carried out independently, but, since they fulfil many of the requirements of the Design-Build-Test-Learn methodology, there is now the opportunity to combine them and allow the establishment of the field of algal synthetic biology.
We are grateful to Dr Louiza Norman for images of C. reinhardtii and P. tricornutum, and to Prof Saul Purton and Dr Rosanna Young (UCL) for helpful and stimulating discussion of ideas presented here.
This work was supported by the Biotechnology and Biological Sciences Research Council of the UK [grant number BB/I00680X/1]; and the European Commission 7th Framework Program (FP7) project SPLASH (Sustainable PoLymers from Algae Sugars and Hydrocarbons) [grant number 311956].
Synthetic Biology UK 2015: Held at Kingsway Hall Hotel, London, U.K., 1–3 September 2015