A synthetic biology workflow is composed of data repositories that provide information about genetic parts, sequence-level design tools to compose these parts into circuits, visualization tools to depict these designs, genetic design tools to select parts to create systems, and modeling and simulation tools to evaluate alternative design choices. Data standards enable the ready exchange of information within such a workflow, allowing repositories and tools from a diversity of sources to be connected. The present paper describes one such workflow that utilizes, among others, the Synthetic Biology Open Language (SBOL) to describe genetic designs, the Systems Biology Markup Language (SBML) to model these designs, and SBOL Visual to visualize these designs. We describe how a standard-enabled workflow can be used to produce many types of design information, with multiple repositories and software tools exchanging information using a variety of data standards. Recently, the ACS Synthetic Biology journal has recommended the use of SBOL in its publications.

Introduction

Reproducibility is a critical and growing issue in synthetic biology [1]. Substantial effort is often required to design a new biological system, with input from many researchers with different backgrounds, including biology, mathematics, computer science, physics, and chemistry. Extracting information in order to reuse or build upon the contributions made by these researchers, however, is often extremely challenging. At present, information about genetic circuit design is often incomplete or buried in textual descriptions. Even scientific publications often fail to fully convey this information: designs are often available only as visual depictions that provide abstract representations or as unannotated sequences, and frequently some of the genes or gene products are not even captured, making it nearly impossible to reuse these designs. Capturing DNA sequences is key as a first step, but this information may also not be available, may require lengthy and error-prone manual lookups based on gene identifiers, or may only be derivable by search and extraction of the partial sequences given in forward and reverse primers. Even then, deriving exact sequences of designs may be impossible when full information about the final design, such as its exact assembly process, cloning strategy, or the spacer sequences between constituent genes and their components, is not clearly specified.

Further complicating matters, experimental measurements may vary between laboratories because of differences in sequences or chassis organisms, or because information about experimental conditions is lacking. Even single-nucleotide differences in the design itself or in the host chassis can significantly change the functionality of genetic circuits. Notably, modifications in non-coding sequences can strongly affect the rates of transcription and translation, resulting in unexpected behaviors [2–5]. These problems become more acute as the scale and complexity of designs increase.

As synthetic biology continues to develop as an engineering discipline, practitioners are grappling with these problems and adopting the same sort of strategies that enable management of complexity in every mature engineering discipline, such as standardization, abstraction, modularity, and automation. Applications are created through design-build-test cycles and automation is key to achieving faster cycles for commercialization. There are already a wide variety of available computational tools that can be used in different stages of design, manufacturing, testing, and analysis. Often, tools are specialized in performing specific functions, and synthetic biology engineers need to flexibly co-ordinate the operation of these tools in complex design workflows. As a result, computational access to and exchange of information without any loss is crucial. Finally, the use of standards to capture design information also enhances reproducibility and reusability [6], effectively allowing the products of one workflow to be consumed by other workflows. Computational access particularly facilitates the storage and retrieval of these designs, making them ever more accessible. Practitioners can therefore more easily find designs that are created by other practitioners in a timely manner, make modifications or reuse them, and electronically share their new designs and data.

Synthetic Biology Open Language

The Synthetic Biology Open Language (SBOL) is one of the key technologies that can support the emerging standards-driven approach to synthetic biology engineering workflows. SBOL is a free and open community standard for the description and exchange of biological designs, supported by a diverse international community of researchers. This standard provides a ‘common core’ set of relatively abstract representations of biological structure, function, and sequence, with a focus on abstraction and composition, and is broadly applicable across a wide range of workflow elements. Critically, SBOL also supports machine-interpretable links between this shared core and more specialized representations, such as numerical models, protocol automation scripts, LIMS tracking, and measurement data, allowing SBOL to serve as a ‘hub’ for linking together a wide range of more specialized tools and processes without loss of information as shown in Figure 1.

Figure 1.
Central role of SBOL in the synthetic biology design, build, test cycle.

SBOL provides a shared representation for flexibly constructing workflows that may involve many different types of biological engineering resources, tools, and processes.

The development of SBOL was motivated by the shortcomings of prior standards, such as FASTA [7] and GenBank [8], with respect to describing the engineering of biological systems. These prior standards focus on the recording and annotation of natural nucleic acid or protein sequence data, which poses different challenges and requirements from the engineering of novel human-designed biological constructs. For example, the description of engineered systems requires the representation of the abstraction and composition of (at least partially) modular components. To serve these needs, in 2008 the SBOL community first developed an initial draft standard called PoBoL [9], which evolved into the SBOL 1 standard [10,11], focusing on the genetic structure of engineered DNA sequences. SBOL 1 recently evolved into the SBOL 2 standard [12,13], which represents both the structure and function of genetic designs as depicted in Figure 2.

Figure 2.
SBOL extends beyond prior sequence-centric formats like FASTA and GenBank to enable modular, hierarchical representations of both structure and function of a genetic design.

SBOL 1 enables hierarchical composition of DNA components, some perhaps without sequences assigned, while SBOL 2 allows for more types of components and their interactions to be expressed (Reprinted with permission from Zundel et al. [29]. © 2016 American Chemical Society.).

Complementary to this data model, the SBOL Visual standard provides a common visual language for communication about engineered biological constructs, much as diagram languages for electrical engineering [14,15] and architecture [16,17] do in those fields. SBOL Visual (SBOLv) [18,19] enables diagrams for SBOL 1 constructs and is in the process of being extended and integrated with the Systems Biology Graphical Notation (SBGN) [20] to support the functional representations of SBOL 2 as well. SBOLv is formally related to the SBOL data representation by means of the Sequence Ontology (SO) [21], which is used by the SBOL data model to designate the roles of components as shown in Figure 3. Specifically, each glyph in SBOLv is mapped to one or more ontology terms. This enables automatic computational mapping from SBOL data models to diagrams: for each component, the most specific glyph whose term covers the component's role or roles is selected, and the glyphs are then arranged according to the sequence and order relationships specified in the data model.
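
The glyph-selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parent/child hierarchy is a toy stand-in for the Sequence Ontology (term names are used rather than accession numbers), and the glyph names and the functions `ancestors`, `select_glyph`, and `render` are hypothetical rather than part of any SBOL library.

```python
# Toy "is-a" hierarchy standing in for the Sequence Ontology (child -> parent).
# Hypothetical and heavily simplified; the real SO graph is far larger.
TOY_PARENT = {
    "sequence_feature": None,
    "regulatory_region": "sequence_feature",
    "promoter": "regulatory_region",
    "inducible_promoter": "promoter",
    "terminator": "regulatory_region",
    "coding_sequence": "sequence_feature",
}

# Hypothetical glyph table: ontology term -> glyph name.
GLYPHS = {
    "sequence_feature": "generic-box",
    "promoter": "bent-arrow",
    "coding_sequence": "block-arrow",
    "terminator": "T-shape",
}


def ancestors(term):
    """Return the term followed by its ancestors, most specific first."""
    chain = []
    while term is not None:
        chain.append(term)
        term = TOY_PARENT.get(term)
    return chain


def select_glyph(roles):
    """Pick the most specific glyph whose term covers any of the roles."""
    best_depth, best_glyph = 0, None
    for role in roles:
        for term in ancestors(role):  # walk upwards from the role itself
            if term in GLYPHS:
                depth = len(ancestors(term))  # deeper term = more specific
                if depth > best_depth:
                    best_depth, best_glyph = depth, GLYPHS[term]
                break  # the first hit is the most specific for this role
    return best_glyph


def render(components):
    """Arrange glyphs by start position, mimicking sequence order."""
    ordered = sorted(components, key=lambda c: c["start"])
    return [select_glyph(c["roles"]) for c in ordered]


design = [
    {"roles": ["terminator"], "start": 900},
    {"roles": ["inducible_promoter"], "start": 1},
    {"roles": ["coding_sequence"], "start": 60},
]
print(render(design))
```

A role with no dedicated glyph (here `inducible_promoter`) falls back to the glyph of its nearest ancestor, which is the behavior the mapping through ontology terms is meant to provide.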

Figure 3.
Link between SBOLv and SBOL data.

SBOLv is linked to the SBOL data model by shared use of SO terms (figure courtesy of Zhang et al. [36]).

Supporting reproduction and reuse with SBOL

To support effective reproduction and reuse, practitioners must not only have the capability to represent information about engineered biological organisms, but must also use those capabilities to encode enough information of the right types to enable others to reproduce or build upon their results. In mature engineering fields, this typically takes the form of formalized datasheets, such as the component datasheets used in electronics or CAD components used in mechanical engineering. Although biological organism engineering aspires to this level of rigor (e.g. [22]), in practice the field has not yet attained that level of maturity [23]. In other areas of biology, the challenges of reproduction and reuse are addressed with a variety of minimum information standards [24], which aim to at least ensure that enough information on protocol and context is included that a practitioner can determine whether an attempt to reproduce or reuse works as expected. For example, Minimum Information About a Microarray Experiment establishes minimum information standards for reporting on microarray experiments [25], and MIFlowCyt establishes minimum information standards for reporting on flow cytometry experiments [26]. By making it easier to compare the products of different efforts, such minimum information standards have significantly improved data quality and accelerated discovery in the areas in which they have been established.

Similarly, reproduction and reuse of genetic constructs could be accelerated by establishing a reporting standard for the minimum information about a genetic construct. Such minimum information about a genetic construct or collection of constructs needs to include at least the following:

  • The full sequence of all of the ‘base’ components used in a genetic construct. For example, a library made by combining pairs of promoters and coding sequences would need to include the full sequence of every promoter and every coding sequence.

  • Information sufficient to unambiguously determine the sequence of every complete construct. For example, the promoter/coding-sequence library would record all combinations made, but not necessarily the sequence of each combination, if that can be determined from the combination and the sequences of the individual components.

  • Identification of the role played by each significant designed feature. For example, explicitly recording that each promoter is, in fact, a promoter.

  • Identification of identities between construct components, such as by the composition of subcomponents. For example, it should be easy to tell if two promoter/coding-sequence constructs share the same promoter.

  • The assembly method used, if any, for composing smaller components into larger components, and any effects this is expected to have on the resulting sequence.

  • Any required additional modifications of the base sequence, such as methylation.

  • The vector or integration point used for transformation of the host organism. For example, a plasmid used to deliver a construct to bacteria, or the location targeted for CRISPR-based integration into a chromosome.

  • An unambiguous identification of the host organism for the construct, sufficient for determining genome and other relevant features.

The core representations of SBOL readily support most of this information, while the remainder can be linked to SBOL representations via the annotation mechanisms provided by SBOL, and an effort is ongoing within the SBOL community to formalize these recommendations.
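
As a rough illustration of how the checklist above might be captured in software, the following Python sketch defines a record type and a completeness check. It is not an SBOL data structure: the class names, field names, and the `junction` field (standing in for whatever sequence an assembly method leaves between parts) are assumptions made for this example, and the sequences are short placeholders rather than real parts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Part:
    name: str
    role: str       # e.g. "promoter" or "CDS"
    sequence: str   # full sequence of the base component


@dataclass
class ConstructRecord:
    name: str
    parts: List[Part]                 # ordered base components
    assembly_method: str = ""
    junction: str = ""                # hypothetical: sequence left between parts
    modifications: List[str] = field(default_factory=list)  # e.g. methylation
    vector_or_locus: str = ""
    host: str = ""

    def full_sequence(self) -> str:
        """Derive the construct sequence from its parts and the junction."""
        return self.junction.join(p.sequence for p in self.parts)

    def missing(self) -> List[str]:
        """List checklist items that this record does not yet satisfy."""
        gaps = []
        for p in self.parts:
            if not p.sequence:
                gaps.append(f"sequence of {p.name}")
            if not p.role:
                gaps.append(f"role of {p.name}")
        if len(self.parts) > 1 and not self.assembly_method:
            gaps.append("assembly method")
        if not self.vector_or_locus:
            gaps.append("vector or integration point")
        if not self.host:
            gaps.append("host organism")
        return gaps


def shared_parts(a: ConstructRecord, b: ConstructRecord) -> List[str]:
    """Names of parts that two constructs have in common."""
    return sorted(p.name for p in set(a.parts) & set(b.parts))


# Placeholder parts and constructs (not real biological sequences).
p_a = Part("pA", "promoter", "TTGACA")
cds_x = Part("cdsX", "CDS", "ATGAAATAA")
cds_y = Part("cdsY", "CDS", "ATGCCCTAA")

complete = ConstructRecord(
    "pA_cdsX", [p_a, cds_x],
    assembly_method="hypothetical scarred assembly", junction="GG",
    vector_or_locus="plasmid pExample", host="host strain identifier",
)
incomplete = ConstructRecord("pA_cdsY", [p_a, cds_y])

print(complete.full_sequence(), complete.missing())
print(incomplete.missing(), shared_parts(complete, incomplete))
```

Because the library members share the same `Part` objects, the sequence of each combination need not be stored separately, and shared promoters can be detected directly.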

Already, journals have shown interest in using SBOL to improve the ability of readers to reproduce and reuse elements of the papers they publish. In 2016, ACS Synthetic Biology became the first journal to formally embrace SBOL as a means of enhancing reproduction and reuse of synthetic biology research [27], with a workflow including validation and review of submitted designs and their deposit into a design repository linked with the paper and with interfaces for access by both humans and genetic design automation tooling as shown in Figure 4. As minimum information standards are established and adopted, they can integrate with such workflows in order to improve the ability of the research community to reproduce and to build upon one another's results. In parallel, we may expect such standards to provide a basis for the development of a wide variety of new capabilities, services, and business models in the industrial community, much as shared standards have already done in other communities, such as software, electronics, and mechanical systems.

Figure 4.
ACS Synthetic Biology workflow for integration of published articles with machine-readable SBOL representations of the biological constructs described by those articles.

The author constructs their design using the genetic design automation (GDA) tool(s) of their choice producing a description in FASTA, GenBank, or preferably SBOL. Their design is then converted into a valid SBOL 2 document that is deposited in an SBOL repository. A link to these data and potentially an SBOLv diagram are published with the article (Reprinted with permission from Zundel et al. [29]. © 2016 American Chemical Society.).

Software support for SBOL

Crucial to the success of SBOL is software infrastructure to support developers' integration of this standard within their tools. In particular, several libraries have been developed that provide access to the data model through an application programming interface (API). These libraries also permit both the serialization of data objects into RDF (resource description framework)/XML and the parsing of SBOL files into SBOL data objects for ease of interaction and manipulation within software tools. There are currently four main libraries maintained in loose federation by members of the SBOL community: libSBOLj (Java) [28], libSBOL (C/C++), pySBOL (Python), and sboljs (JavaScript). The Java library provides methods for converting into/from FASTA, GenBank, SBOL 1, and SBOL 2, as well as methods to check an SBOL document against the validation rules outlined in the SBOL specification [29]. The SBOL Validator/Converter provides a web service that can be leveraged by non-Java software to access these functionalities.
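
To give a feel for what serialization into RDF/XML and parsing back into data objects involves, the following sketch round-trips a single component description using only the Python standard library. It does not use or reproduce the API of any of the SBOL libraries named above. The namespace and element names reflect our reading of the SBOL 2 data model and should be checked against the specification; the component URI is a made-up example, and the output is a simplified fragment that has not been run through the SBOL validator.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SBOL_NS = "http://sbols.org/v2#"  # assumed SBOL 2 namespace
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("sbol", SBOL_NS)


def q(ns, tag):
    """Build a namespace-qualified name in ElementTree's {ns}tag notation."""
    return f"{{{ns}}}{tag}"


def serialize(components):
    """Write a list of component dictionaries as an RDF/XML string."""
    root = ET.Element(q(RDF_NS, "RDF"))
    for c in components:
        cd = ET.SubElement(root, q(SBOL_NS, "ComponentDefinition"),
                           {q(RDF_NS, "about"): c["uri"]})
        ET.SubElement(cd, q(SBOL_NS, "displayId")).text = c["display_id"]
        ET.SubElement(cd, q(SBOL_NS, "type"), {q(RDF_NS, "resource"): c["type"]})
        ET.SubElement(cd, q(SBOL_NS, "role"), {q(RDF_NS, "resource"): c["role"]})
    return ET.tostring(root, encoding="unicode")


def parse(xml_text):
    """Read the RDF/XML string back into a list of component dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "uri": cd.get(q(RDF_NS, "about")),
            "display_id": cd.findtext(q(SBOL_NS, "displayId")),
            "type": cd.find(q(SBOL_NS, "type")).get(q(RDF_NS, "resource")),
            "role": cd.find(q(SBOL_NS, "role")).get(q(RDF_NS, "resource")),
        }
        for cd in root.findall(q(SBOL_NS, "ComponentDefinition"))
    ]


promoter = {
    "uri": "http://example.org/pExample",  # hypothetical identifier
    "display_id": "pExample",
    "type": "http://www.biopax.org/release/biopax-level3.owl#DnaRegion",
    "role": "http://identifiers.org/so/SO:0000167",  # SO term for promoter
}

document = serialize([promoter])
print(document)
print(parse(document) == [promoter])
```

In practice the community libraries handle many more object types as well as validation, so a hand-written serializer like this one is useful only for understanding the file format.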

Leveraging these libraries, many software applications that support the SBOL standard have been developed, as illustrated in Table 1. These tools can be loosely divided into data repositories for storing genetic design information, sequence editors, visualization tools, genetic design compilers, and modeling and simulation tools. Many of these applications actually cover more than one of these functions. While most of these tools support either SBOL 1 or SBOLv, an increasing number of tools supporting SBOL 2 are being released. The rest of this section provides a brief description of some of these software tools. More detailed descriptions can be found in Supplementary Material.

Table 1
A partial list of software supporting SBOL

                          Function    SBOL
Name               Ref.   R S V G M   1 2 v   URL
Benchling          [35]   - ● - - -   ● - -   benchling.com
BOOST              [37]   - ● - - -   ● ● -   boost.jgi.doe.gov
Cello              [43]   - - - ● -   - ● -   cellocad.org
DeviceEditor       [35]   - ● ● - -   ● - ●   j5.jbei.org
DNAPlotLib         [39]   - - ● - -   ● - ●   dnaplotlib.org
Eugene             [34]   - ● - - -   ● - ●   http://www.eugenecad.org
Finch                     - ● ● ● -   - ● ●   synbiotools.org
GenoCAD            [60]   - ● ● - -   - - ●   www.genocad.com
GeneGenie                 - ● - - -   ● - -   gene-genie.org
Graphviz                  - - ● - -   - - ●   www.graphviz.org
ICE                [30]   ● - ● - -   ● ● ●   public-registry.jbei.org
iBioSim            [50]   - ● ● ● ●   ● ● ●   www.async.ece.utah.edu/ibiosim
j5                 [61]   - ● - - -   - - -   j5.jbei.org
MoSeC              [62]   - ● - - ●   ● - -   ico2s.org/software/mosec.html
Pigeon             [38]   - - ● - -   - - ●   pigeoncad.org
Pinecone                  - ● - - -   - - ●   serotiny.bio
Pool Designer      [63]   - ● - - -   ● ● -   github.com/CIDARLAB/poolDesigner
Proto BioCompiler  [41]   - - ● ● -   ● - ●   synbiotools.bbn.com
SBOLDesigner       [36]   - ● ● - -   ● ● ●   www.async.ece.utah.edu/SBOLDesigner
SBOLme             [32]   ● - - - -   - ● -   www.cbrc.kaust.edu.sa/sbolme
ShortBol           [64]   - ● - ● -   - ● -   shortbol.ico2s.org/sandbox.html
SynBioHub          [31]   ● - ● - -   ● ● ●   synbiohub.org
Tellurium          [52]   - - - - ●   - ● -   tellurium.analogmachine.org
TeselaGen                 - ● ● - -   ● - ●   www.teselagen.com
TinkerCell         [53]   - - ● ● ●   ● - ●   www.tinkercell.com
VisBOL             [40]   - - ● - -   - ● ●   visbol.org
VirtualParts       [33]   ● - - - ●   - ● -   www.virtualparts.org

An up-to-date list is maintained at http://sbolstandard.org. The Function columns indicate whether the tool is a (R)epository, (S)equence design tool, (G)enetic circuit design tool, (M)odeling and simulation tool, or a (V)isualization tool. The SBOL columns indicate whether it supports SBOL (1), SBOL (2), or SBOL (v)isual.

Several data repositories have been developed that can store genetic design information using the SBOL data standard. ICE [30] is an open-source software tool that provides a web-based platform to register and manage DNA parts, and an instance of this platform is used as the ACS Synthetic Biology Registry [27]. SynBioHub is an open-source repository built upon the SBOL Stack [31] RDF database back-end, and it provides both a user-friendly web-based front-end and programmatic access via either libSBOLj or a RESTful API. SBOLme is a web-based open-access repository that has recently been developed to promote the use of SBOL for metabolic engineering applications [32]. The first release of SBOLme contains annotated SBOL parts of 28 437 chemical compounds, 6883 enzyme classes, 9909 metabolic reactions, and 3 173 238 proteins from 3908 different organisms. Finally, the Virtual Parts Repository supports CAD tools by providing readily accessible modular and reusable models of biological components that can be individually joined together for simulation [33].

Sequence editors are software tools for the design of DNA, RNA, and/or protein sequences. Designing sequences involves their manipulation, composition, and annotation. Many tools with these functions exist or are under development; we highlight here a few with the best SBOL support, while more are described in Supplementary Material. Eugene enables the specification of rules in order to automatically enumerate composite designs based on biological knowledge [34]. The Joint BioEnergy Institute (JBEI) develops DeviceEditor [35] to visually design combinatorial DNA constructs based on part types (e.g. promoter, CDS, and terminator), VectorEditor for a graphical preview of the design, and j5 for DNA assembly design automation. SBOLDesigner is a modular sequence design tool that combines the SBOL 2 data model with SBOLv symbols to construct genetic designs hierarchically using parts fetched from SBOL data repositories [36]. The Build-Optimization Software Tools (BOOST) [37] enable the design of DNA sequences in order to maximize the success rate of their synthesis via codon optimization, verification of sequence constraints, and decomposition into synthesizable blocks.

SBOLv defines a set of agreed symbols to denote commonly used genetic elements and best practices for how biological designs should be visualized. Many point-and-click genetic design tools have adopted these symbols (see Table 1), and several dedicated pieces of software are now available to simplify the process of generating compliant diagrams. One of the first tools to help automate the production of standardized SBOLv diagrams was Pigeon [38], which converts a textual input description of a genetic construct into a diagram where each part is represented by its associated SBOLv symbol. Highly customized SBOLv diagrams can be created by using the DNAplotlib computational toolkit [39]. VisBOL is a web-based tool that in addition to supporting the Pigeon syntax can also convert directly from an SBOL 2 document into SBOLv symbols [40]. Finally, SBOL visual symbols have been adopted into the widely used general graph visualization toolkit, Graphviz.

Genetic circuit design involves constructing biological systems that implement logical functions similar to those found in electronic circuits. Circuit designers usually build circuits by connecting parts or modules found in a library to form larger and more complex constructs. Many tools have been developed that attempt to assist engineers in genetic circuit design. Proto BioCompiler takes in specifications of computations, transforms them into a data-flow representation of the computation to be carried out by the biological organism, then selects parts from a genetic library, and finally optimizes the circuit design [41]. iBioSim adapts a graph-based technology mapping procedure from digital electronic circuit design to map a specified genetic regulatory model into a network of genetic gates specified using SBOL [42]. Finally, Cello provides a platform where users can describe the desired function of their genetic circuit using Verilog, a hardware description language commonly used to specify electronic circuits, and then translate it into a directed acyclic graph of connected 2-input NOR and NOT gates implementing the logic [43].
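
The idea of reducing a logic specification to NOR and NOT gates can be illustrated with a short Python sketch. This is not Cello's algorithm or input language: the tuple-based expression format and the function names are assumptions made for this example, the result is an expression tree rather than an optimized directed acyclic graph with shared gates, and no attempt is made to minimize the gate count.

```python
from itertools import product

# Expressions are nested tuples such as ("AND", x, y), ("OR", x, y) or
# ("NOT", x); a bare string is the name of an input signal.


def to_nor_not(expr):
    """Rewrite an AND/OR/NOT expression using only NOR and NOT gates."""
    if isinstance(expr, str):
        return expr
    op, *args = expr
    args = [to_nor_not(a) for a in args]
    if op == "NOT":
        return ("NOT", args[0])
    if op == "NOR":
        return ("NOR", args[0], args[1])
    if op == "OR":   # x OR y  ==  NOT(NOR(x, y))
        return ("NOT", ("NOR", args[0], args[1]))
    if op == "AND":  # x AND y ==  NOR(NOT x, NOT y)
        return ("NOR", ("NOT", args[0]), ("NOT", args[1]))
    raise ValueError(f"unsupported gate: {op}")


def evaluate(expr, env):
    """Evaluate an expression for one assignment of input values."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    vals = [evaluate(a, env) for a in args]
    if op == "NOT":
        return not vals[0]
    if op == "AND":
        return vals[0] and vals[1]
    if op == "OR":
        return vals[0] or vals[1]
    if op == "NOR":
        return not (vals[0] or vals[1])
    raise ValueError(f"unsupported gate: {op}")


def gates_used(expr):
    """Collect the set of gate types that appear in an expression."""
    if isinstance(expr, str):
        return set()
    op, *args = expr
    return {op}.union(*(gates_used(a) for a in args))


def equivalent(e1, e2, inputs):
    """Compare two expressions over the full truth table."""
    for values in product([False, True], repeat=len(inputs)):
        env = dict(zip(inputs, values))
        if evaluate(e1, env) != evaluate(e2, env):
            return False
    return True


spec = ("OR", ("AND", "a", "b"), ("NOT", "c"))
network = to_nor_not(spec)
print(network)
print(gates_used(network), equivalent(spec, network, ["a", "b", "c"]))
```

Checking the rewritten network against the original specification over the whole truth table is feasible here only because the example has three inputs.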

Finally, SBOL allows for the association of genetic circuit designs with computational models. The most commonly used data standard for models of biological systems is the Systems Biology Markup Language (SBML) [44]. SBML models can be analyzed using a large selection of different analysis methods including deterministic and stochastic simulation [45], flux balance analysis [46], and stochastic model checking [47]. To facilitate the construction of SBML models, a converter from SBOL into SBML has been developed [48]. It is also possible to begin with an SBML model annotated with SBOL [49] and produce an SBOL description for the genetic design [48]. Given an SBML model for a genetic design, it is then possible to analyze this model using a variety of SBML modeling tools including those optimized for genetic circuit design, such as iBioSim [50,51], Tellurium [52], and TinkerCell [53].
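
As a minimal example of what deterministic simulation of such a model means, the sketch below integrates a one-species model of constitutive protein expression, dP/dt = k_prod − k_deg·P, with the forward Euler method. The model, the parameter values, and the function name are assumptions chosen for illustration; the tools named above would instead read the reactions and rate laws from an SBML file and typically use more robust solvers.

```python
def simulate(k_prod, k_deg, p0=0.0, t_end=100.0, dt=0.01):
    """Forward-Euler integration of dP/dt = k_prod - k_deg * P.

    Returns a list of (time, protein level) pairs.
    """
    steps = int(round(t_end / dt))
    p = p0
    trace = [(0.0, p)]
    for i in range(1, steps + 1):
        p += dt * (k_prod - k_deg * p)  # one Euler step
        trace.append((i * dt, p))
    return trace


# Hypothetical parameters: production 2.0 units/time, degradation 0.1 /time.
trace = simulate(k_prod=2.0, k_deg=0.1)
final_time, final_level = trace[-1]
print(final_time, final_level)
```

For this model the protein level approaches the analytical steady state k_prod/k_deg, which gives a simple check on the numerical result.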

A standard-enabled workflow for synthetic biology

A key design principle in the development of SBOL is that it should not attempt to cover all aspects of genetic design, but should instead leverage existing standards whenever possible. A key example of this is the use of SBML for modeling. To pursue this goal, SBOL recently joined the COMBINE (COmputational Modeling in BIology NEtwork) community of standards [6]. COMBINE is an open community initiative to co-ordinate the development of standards and formats for systems and synthetic biology. Figure 5 depicts a complete synthetic biology computational design workflow that leverages COMBINE standards. This workflow assumes that the data required for design come from a variety of data repositories. While some are SBOL repositories, others store their information in other formats such as GenBank or BioPAX [54], another COMBINE standard. Converters can translate this knowledge into SBOL for use during sequence design with any of the sequence editors and visualization tools described earlier. Next, genetic modeling, analysis, and design tools can be used to construct and evaluate complete genetic designs. The resulting models would be constructed using a COMBINE modeling language such as SBML or CellML [55], and their analyses should be encoded using the Simulation Experiment Description Markup Language (SED-ML) [56]. Because SBOLv represents only DNA constructs, a visualization standard such as SBGN could be leveraged to represent the biochemical aspects of the design. Finally, all of these files can be packaged together, shared, and distributed using a COMBINE Archive [57]. Throughout, the data conversions required by this standard-enabled workflow are enabled by the use of common ontologies, such as the BioPAX Ontology [54], the SO [21], and the Systems Biology Ontology (SBO) [58], with URIs taken from identifiers.org [59] whenever possible.
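
The final packaging step can be sketched with the Python standard library. The sketch assumes that a COMBINE Archive is a ZIP file containing a `manifest.xml` that lists each file with a format identifier; the manifest namespace and the format URIs below follow our understanding of that convention and should be verified against the COMBINE Archive specification [57]. The function name, file names, and file contents are placeholders, not valid SBOL or SBML documents.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed identifiers; verify against the COMBINE Archive specification.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
SPEC = "http://identifiers.org/combine.specifications/"
FORMATS = {name: SPEC + name for name in ("sbol", "sbml", "sed-ml", "sbgn")}


def build_archive(files):
    """Package files into an in-memory archive.

    files maps a location (file name) to a (format key, bytes) pair.
    Returns the archive as bytes.
    """
    ET.register_namespace("", OMEX_NS)
    manifest = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    content = f"{{{OMEX_NS}}}content"
    # Entries for the archive itself and for the manifest.
    ET.SubElement(manifest, content, {"location": ".", "format": SPEC + "omex"})
    ET.SubElement(manifest, content,
                  {"location": "./manifest.xml", "format": OMEX_NS})
    for location, (fmt, _) in files.items():
        ET.SubElement(manifest, content,
                      {"location": "./" + location, "format": FORMATS[fmt]})

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", ET.tostring(manifest, encoding="utf-8"))
        for location, (_, data) in files.items():
            archive.writestr(location, data)
    return buffer.getvalue()


# Placeholder payloads standing in for real design and model files.
archive_bytes = build_archive({
    "design.xml": ("sbol", b"<placeholder/>"),
    "model.xml": ("sbml", b"<placeholder/>"),
})

with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = sorted(archive.namelist())
    manifest = ET.fromstring(archive.read("manifest.xml"))
entries = manifest.findall(f"{{{OMEX_NS}}}content")
print(names, len(entries))
```

Recording a format identifier for every file is what would let a receiving tool decide which parts of the archive it can consume.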

Figure 5.
A standard-enabled workflow for synthetic biology using COMBINE standards.

The use of standards provides rich data repositories, consistent annotations, lossless data conversions, intuitive visualizations, and seamless connections of tools.

Conclusion

Standards are an important enabler for data sharing and reproducibility in synthetic biology. Collaborations within the COMBINE community are essential to create new workflows enabled by these standards. The ultimate goal of these collaborations is a complete standard-enabled workflow for synthetic biology. For more information about SBOL, please see our website (http://www.sbolstandard.org/) and our YouTube channel, which includes several demonstrations of the standard-enabled workflow that we are developing.

Abbreviations

     
  • API: application programming interface
  • BOOST: Build-Optimization Software Tools
  • CDS: coding sequence
  • COMBINE: COmputational Modeling in BIology NEtwork
  • CRISPR: clustered regularly interspaced short palindromic repeats
  • JBEI: Joint BioEnergy Institute
  • LIMS: Laboratory Information Management System
  • RDF: resource description framework
  • SBGN: Systems Biology Graphical Notation
  • SBML: Systems Biology Markup Language
  • SBOL: Synthetic Biology Open Language
  • SBOLv: SBOL Visual
  • SED-ML: Simulation Experiment Description Markup Language
  • SO: Sequence Ontology

Funding

This material is based on work supported by the National Science Foundation under grant nos CCF-1218095 and DBI-135604. T.E.G. is supported by BrisSynBio, a Biotechnology and Biological Sciences Research Council and Engineering and Physical Sciences Research Council Synthetic Biology Research Centre [BB/L01386X/1]. G.M. and A.W. have been supported by the Engineering and Physical Sciences Research Council (EPSRC) [grant EP/J02175X/1]. J.A.M. is supported by FUJIFILM DioSynth Biotechnologies. J.B. is supported, in part, by the National Science Foundation Expeditions in Computing Program Award #1522074 as part of the Living Computing Project. E.O. is supported under Contract No. DE-AC02-05CH11231 by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility. This document does not contain technology or technical data controlled under either the U.S. International Traffic in Arms Regulations or the U.S. Export Administration Regulations.

Competing Interests

The Authors declare that there are no competing interests associated with the manuscript.

References

1 Peccoud, J., Anderson, J.C., Chandran, D., Densmore, D., Galdzicki, M., Lux, M.W. et al. (2011) Essential information for synthetic DNA sequences. Nat. Biotechnol. 29, 22
2 Cardinale, S. and Arkin, A.P. (2012) Contextualizing context for synthetic biology – identifying causes of failure of synthetic biological systems. Biotechnol. J. 7, 856–866
3 Davis, J.H., Rubin, A.J. and Sauer, R.T. (2011) Design, construction and characterization of a set of insulated bacterial promoters. Nucleic Acids Res. 39, 1131–1141
4 Meng, W., Belyaeva, T., Savery, N.J., Busby, S.J., Ross, W.E., Gaal, T. et al. (2001) UP element-dependent transcription at the Escherichia coli rrnB P1 promoter: positional requirements and role of the RNA polymerase alpha subunit linker. Nucleic Acids Res. 29, 4166–4178
5 Iverson, S.V., Haddock, T.L., Beal, J. and Densmore, D.M. (2016) CIDAR MoClo: improved MoClo assembly standard and new E. coli part library enable rapid combinatorial design for synthetic and traditional biology. ACS Synth. Biol. 5, 99–103
6 Hucka, M., Nickerson, D.P., Bader, G.D., Bergmann, F.T., Cooper, J., Demir, E. et al. (2015) Promoting coordinated development of community-based information standards for modeling in biology: the COMBINE Initiative. Front. Bioeng. Biotechnol. 3, 19
7 Pearson, W.R. and Lipman, D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. U.S.A. 85, 2444–2448
8 Bilofsky, H.S. and Christian, B. (1988) The GenBank® genetic sequence data bank. Nucleic Acids Res. 16, 1861–1863
9 Galdzicki, M., Chandran, D., Nielsen, A., Morrison, J., Cowell, M., Grünberg, R. et al. (2009) BBF RFC 31: Provisional BioBrick Language (PoBoL). https://dspace.mit.edu/handle/1721.1/45537
10 Galdzicki, M., Wilson, M., Rodriguez, C., Pocock, M., Oberortner, E. and Adam, L. Synthetic Biology Open Language (SBOL). Version 1.1.0. BBF RFC 87, doi:1721.1/73909
11 Galdzicki, M., Clancy, K.P., Oberortner, E., Pocock, M., Quinn, J.Y., Rodriguez, C.A. et al. (2014) The Synthetic Biology Open Language (SBOL) provides a community standard for communicating designs in synthetic biology. Nat. Biotechnol. 32, 545–550
12 Bartley, B., Beal, J., Clancy, K., Misirli, G., Roehner, N., Oberortner, E. et al. (2015) Synthetic Biology Open Language (SBOL) Version 2.0.0. J. Integr. Bioinform. 12, 272
13 Roehner, N., Beal, J., Clancy, K., Bartley, B., Misirli, G., Grünberg, R. et al. (2016) Sharing structure and function in biological design with SBOL 2.0. ACS Synth. Biol. 5, 498–506
14 IEEE (1991) IEEE Graphic Symbols for Logic Functions (Includes IEEE Std 91A-1991 Supplement, and IEEE Std 91-1984). IEEE Std. 91a-1991
15 IEEE (1993) IEEE Standard American National Standard Canadian Standard Graphic Symbols for Electrical and Electronics Diagrams (Including Reference Designation Letters). IEEE Std. 315-1975 (Reaffirmed 1993)
16 Schley, M., Buday, R., Sanders, K. and Smith, D. (1997) AIA CAD Layer Guidelines, The American Institute of Architects Press, Washington, DC
17 British Standards Institution (2007) Collaborative production of architectural, engineering and construction information. BS 1192:2007
18 Quinn, J., Beal, J., Bhatia, S., Cai, P., Chen, J., Clancy, K. et al. (2013) Synthetic biology open language visual (SBOL visual), version 1.0.0. BioBricks Foundation Request for Comments (BBF RFC) #93
19
Quinn
,
J.Y.
,
Cox
,
R.S.
,
Adler
,
A.
,
Beal
,
J.
,
Bhatia
,
S.
,
Cai
,
Y.
et al. 
(
2015
)
SBOL visual: a graphical language for genetic designs
.
PLoS Biol.
13
,
e1002310
doi:
20
Le Novere
,
N.
,
Hucka
,
M.
,
Mi
,
H.
,
Moodie
,
S.
,
Schreiber
,
F.
,
Sorokin
,
A.
et al. 
(
2009
)
The systems biology graphical notation
.
Nat. Biotechnol.
27
,
735
741
doi:
21
Eilbeck
,
K.
,
Lewis
,
S.E.
,
Mungall
,
C.J.
,
Yandell
,
M.
,
Stein
,
L.
,
Durbin
,
R.
et al. 
(
2005
)
The sequence ontology: a tool for the unification of genome annotations
.
Genome Biol.
6
,
R44
doi:
22
Canton
,
B.
,
Labno
,
A.
and
Endy
,
D.
(
2008
)
Refinement and standardization of synthetic biological parts and devices
.
Nat. Biotechnol.
26
,
787
793
doi:
23
Kwok
,
R.
(
2010
)
Five hard truths for synthetic biology
.
Nature
463
,
288
290
doi:
24
Taylor
,
C.F.
,
Field
,
D.
,
Sansone
,
S.-A.
,
Aerts
,
J.
,
Apweiler
,
R.
,
Ashburner
,
M.
et al. 
(
2008
)
Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project
.
Nat. Biotechnol.
26
,
889
896
doi:
25
Brazma
,
A.
,
Hingamp
,
P.
,
Quackenbush
,
J.
,
Sherlock
,
G.
,
Spellman
,
P.
,
Stoeckert
,
C.
et al. 
(
2001
)
Minimum information about a microarray experiment (MIAME)-toward standards for microarray data
.
Nat. Genet.
29
,
365
371
doi:
26
Lee
,
J.A.
,
Spidlen
,
J.
,
Boyce
,
K.
,
Cai
,
J.
,
Crosbie
,
N.
,
Dalphin
,
M
. et al.  (
2008
)
MIFlowCyt: the minimum information about a flow cytometry experiment
.
Cytometry Part A
73A
,
926
930
doi:
27
Hillson
,
N.
,
Plahar
,
H.
,
Beal
,
J.
and
Prithviraj
,
R.
(
2016
)
Improving synthetic biology communication: recommended practices for visual depiction and digital submission of genetic designs
.
ACS Synth. Biol.
5
,
449
451
doi:
28
Zhang
,
Z.
,
Nguyen
,
T.
,
Roehner
,
N.
,
Misirli
,
G.
,
Pocock
,
M.
,
Oberortner
,
E.
et al. 
(
2015
)
libSBOLj 2.0: a Java library to support SBOL 2.0
.
IEEE Life Sci. Lett.
1
,
34
37
doi:
29
Zundel
,
Z.
,
Samineni
,
M.
,
Zhang
,
Z.
and
Myers
,
C.J.
(
2016
)
A validator and converter for the synthetic biology open language
.
ACS Synth. Biol.
PMID:
[PubMed]
30
Ham
,
T.S.
,
Dmytriv
,
Z.
,
Plahar
,
H.
,
Chen
,
J.
,
Hillson
,
N.J.
and
Keasling
,
J.D
. (
2012
)
Design, implementation and practice of JBEI-ICE: an open source biological part registry platform and tools
.
Nucleic Acids Res.
40
,
e141
doi:
31
Madsen
,
C.
,
McLaughlin
,
J.A.
,
Mısırlı
,
G.
,
Pocock
,
M.
,
Flanagan
,
K.
,
Hallinan
,
J.
et al. 
(
2016
)
The SBOL Stack: a platform for storing, publishing, and sharing synthetic biology designs
.
ACS Synth. Biol.
5
,
487
497
doi:
32
Kuwahara
,
H.
,
Cui
,
X.
,
Umarov
,
R.
,
Grünberg
,
R.
,
Myers
,
C.J.
and
Gao
,
X.
(
2017
)
SBOLme: a repository of SBOL parts for metabolic engineering
.
ACS Synth. Biol.
PMID:
[PubMed]
33
Misirli
,
G.
,
Hallinan
,
J.
and
Wipat
,
A.
(
2014
)
Composable modular models for synthetic biology
.
ACM J. Emerg. Technol. Comput. Syst
.
11
,
22
doi:
34
Bilitchenko
,
L.
,
Liu
,
A.
,
Cheung
,
S.
,
Weeding
,
E.
,
Xia
,
B.
,
Leguia
,
M.
et al. 
(
2011
)
Eugene – a domain specific language for specifying and constraining synthetic biological parts, devices, and systems
.
PLoS ONE
6
,
e18882
doi:
35
Chen
,
J.
,
Densmore
,
D.
,
Ham
,
T.S.
,
Keasling
,
J.D.
and
Hillson
,
N.J.
(
2012
)
DeviceEditor visual biological CAD canvas
.
J. Biol. Eng.
6
,
1
doi:
36
Zhang
,
M.
,
McLaughlin
,
J.
,
Wipat
,
A.
and
Myers
,
C
. (
2017
)
in press
37
Oberortner
,
E.
,
Cheng
,
J.-F.
,
Hillson
,
N.J.
and
Deutsch
,
S.
(
2017
)
Streamlining the design-to-build transition with build-optimization software tools
.
ACS Synth. Biol.
6
,
485
496
PMID:
[PubMed]
38
Bhatia
,
S.
and
Densmore
,
D.
(
2013
)
Pigeon: a design visualizer for synthetic biology
.
ACS Synth. Biol.
2
,
348
350
doi:
39
Der
,
B.S.
,
Glassey
,
E.
,
Bartley
,
B.A.
,
Enghuus
,
C.
,
Goodman
,
D.B.
,
Gordon
,
D.B.
et al. 
(
2016
)
DNAplotlib: programmable visualization of genetic designs and associated data
.
ACS Synth. Biol.
PMID:
[PubMed]
40
McLaughlin
,
J.A.
,
Pocock
,
M.
,
Mısırlı
,
G.
,
Madsen
,
C.
and
Wipat
,
A.
(
2016
)
VisBOL: web-based tools for synthetic biology design visualization
.
ACS Synth. Biol.
5
,
874
876
doi:
41
Beal
,
J.
,
Lu
,
T.
and
Weiss
,
R.
(
2011
)
Automatic compilation from high-level biologically-oriented programming language to genetic regulatory networks
.
PLoS ONE
6
,
e22490
doi:
42
Roehner
,
N.
and
Myers
,
C.J.
(
2014
)
Directed acyclic graph-based technology mapping of genetic circuit models
.
ACS Synth. Biol.
3
,
543
555
doi:
43
Nielsen
,
A.A.K.
,
Der
,
B.S.
,
Shin
,
J.
,
Vaidyanathan
,
P.
,
Paralanov
,
V.
,
Strychalski
,
E.A.
et al. 
(
2016
)
Genetic circuit design automation
.
Science
352
,
aac7341
doi:
44
Hucka
,
M.
,
Finney
,
A.
,
Sauro
,
H.M.
,
Bolouri
,
H.
,
Doyle
,
J.C.
,
Kitano
,
H.
et al. 
(
2003
)
The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models
.
Bioinformatics
19
,
524
531
doi:
45
Gillespie
,
D.T.
(
1977
)
Exact stochastic simulation of coupled chemical reactions
.
J. Phys. Chem.
81
,
2340
2361
doi:
46
Edwards
,
J.S.
,
Covert
,
M.
and
Palsson
,
B.
(
2002
)
Metabolic modelling of microbes: the flux-balance approach
.
Environ. Microbiol.
4
,
133
140
doi:
47
Madsen
,
C.
,
Zhang
,
Z.
,
Roehner
,
N.
,
Winstead
,
C.
and
Myers
,
C.
(
2014
)
Stochastic model checking of genetic circuits
.
ACM J. Emerg. Technol. Comput. Syst.
11
,
23
doi:
48
Nguyen
,
T.
,
Roehner
,
N.
,
Zundel
,
Z.
and
Myers
,
C.J.
(
2016
)
A converter from the systems biology markup language to the synthetic biology open language
.
ACS Synth. Biol.
5
,
479
486
doi:
49
Roehner
,
N.
and
Myers
,
C.J.
(
2014
)
A methodology to annotate systems biology markup language models with the synthetic biology open language
.
ACS Synth. Biol.
3
,
57
66
doi:
50
Myers
,
C.J.
,
Barker
,
N.
,
Jones
,
K.
,
Kuwahara
,
H.
,
Madsen
,
C.
and
Nguyen
,
N.-P.D.
(
2009
)
iBioSim: a tool for the analysis and design of genetic circuits
.
Bioinformatics
25
,
2848
2849
doi:
51
Madsen
,
C.
,
Myers
,
C.
,
Patterson
,
T.
,
Roehner
,
N.
,
Stevens
,
J.
and
Winstead
,
C.
(
2012
)
Design and test of genetic circuits using iBioSim
.
Des. Test Comput. IEEE
29
,
32
39
doi:
52
Sauro
,
H.M.
,
Choi
,
K.
,
Medley
,
J.K.
,
Cannistra
,
C.
,
Konig
,
M.
,
Smith
,
L.
et al. 
(
2016
)
Tellurium: a python based modeling and reproducibility platform for systems biology
.
bioRxiv
054601
doi:
53
Chandran
,
D.
and
Sauro
,
H.M.
(
2012
)
Hierarchical modeling for synthetic biology
.
ACS Synth. Biol.
1
,
353
364
doi:
54
Demir
,
E.
,
Cary
,
M.P.
,
Paley
,
S.
,
Fukuda
,
K.
,
Lemer
,
C.
,
Vastrik
,
I.
et al. 
(
2010
)
The BioPAX community standard for pathway data sharing
.
Nat. Biotechnol.
28
,
935
942
doi:
55
Hedley
,
W.J.
,
Nelson
,
M.R.
,
Bellivant
,
D.P.
and
Nielsen
,
P.F.
(
2001
)
A short introduction to CellML
.
Philos. Trans. Roy. Soc. A
359
,
1073
1089
doi:
56
Waltemath
,
D.
,
Adams
,
R.
,
Bergmann
,
F.T.
,
Hucka
,
M.
,
Kolpakov
,
F.
,
Miller
,
A.K.
et al. 
(
2011
)
Reproducible computational biology experiments with SED-ML — the Simulation Experiment Description Markup Language
.
BMC Syst. Biol.
5
,
198
doi:
57
Bergmann
,
F.T.
,
Rodriguez
,
N.
and
Le Novère
,
N.
(
2015
)
COMBINE Archive Specification Version 1
.
J. Integr. Bioinform.
12
,
261
doi:
58
Courtot
,
M.
,
Juty
,
N.
,
Knupfer
,
C.
,
Waltemath
,
D.
,
Zhukova
,
A.
,
Drager
,
A.
et al. 
(
2011
)
Controlled vocabularies and semantics in systems biology
.
Mol. Syst. Biol.
7
,
543
doi:
59
Juty
,
N.
,
Le Novere
,
N.
and
Laibe
,
C.
(
2012
)
Identifiers.org and MIRIAM Registry: community resources to provide persistent identification
.
Nucleic Acids Res.
40
,
D580
D586
doi:
60
Czar
,
M.J.
,
Cai
,
Y.
and
Peccoud
,
J.
(
2009
)
Writing DNA with GenoCADTM
.
Nucleic Acids Res.
37
,
W40
W47
doi:
61
Hillson
,
N.J.
,
Rosengarten
,
R.D.
and
Keasling
,
J.D.
(
2012
)
j5 DNA assembly design automation software
.
ACS Synth. Biol.
1
,
14
21
doi:
62
Misirli
,
G.
,
Hallinan
,
J.S.
,
Yu
,
T.
,
Lawson
,
J.R.
,
Wimalaratne
,
S.M.
,
Cooling
,
M.T.
et al. 
(
2011
)
Model annotation for synthetic biology: automating model to nucleotide sequence conversion
.
Bioinformatics
27
,
973
979
doi:
63
Woodruff
,
L.B.A.
,
Gorochowski
,
T.E.
,
Roehner
,
N.
,
Mikkelsen
,
T.S.
,
Densmore
,
D.
,
Gordon
,
D.B.
et al. 
(
2017
)
Registry in a tube: multiplexed pools of retrievable parts for genetic design space exploration
.
Nucleic Acids Res.
45
,
1553
1565
doi:
64
Pocock
,
M.
,
Taylor
,
C.
,
McLaughlin
,
J.
,
Misirli
,
G.
and
Wipat
,
A
. (
2016
)
An environment for augmented biodesign using integrated data resources
.
8th International Workshop on Bio-Design Automation
,
Newcastle University, Newcastle upon Tyne, UK, 16–18 August 2016

Supplementary data