The mechanisms used by bacterial pathogens to regulate the expression of their genes, especially their virulence genes, have been the subject of intense investigation for several decades. Whole genome sequencing projects, together with more targeted studies, have identified hundreds of DNA-binding proteins that contribute to the patterns of gene expression observed during infection as well as providing important insights into the nature of the gene products whose expression is being controlled by these proteins. Themes that have emerged include the importance of horizontal gene transfer to the evolution of pathogens, the need to impose regulatory discipline upon these imported genes and the important roles played by factors normally associated with the organization of genome architecture as regulatory principles in the control of virulence gene expression. Among these architectural elements is the structure of DNA itself, its variable nature at a topological rather than just at a base-sequence level and its ability to play an active (as well as a passive) part in the gene regulation process.
In addition to their importance to clinical science due to their ability to cause infectious disease, bacterial pathogens are of great interest to those trying to understand bacterial physiology, speciation and evolution. This is because many pathogens seem to contain specialist ‘virulence’ genes that are required for infection that sit side-by-side with so-called housekeeping genes with which they share many regulatory features. How did these pathogen genomes evolve at a regulatory level and how might they evolve further in the future? Are there rules governing the process and, if so, can we learn them and use the information to respond more effectively to emerging pathogens or even prevent their emergence?
As antibiotic therapy loses its effectiveness due to the rise of antibiotic resistance, strategies that target virulence genes and/or their regulators begin to look more attractive compared with those that rely on disabling the whole microbe [1,2]. If more refined targeting strategies are to succeed, detailed knowledge about the nature of virulence gene regulation is required. It can also be used to engineer mutant derivatives of pathogens that have lost the ability to cause disease but can still elicit a full immune response: live-attenuated vaccines [3,4].
The story of the virulence gene regulation research field has many parallels with the history of bacterial gene regulation in general. Both narratives have been dominated by the important roles of regulatory proteins in controlling gene expression in response to signals from both the intra- and the extra-cellular environments [5,6]. More recently, there has been a growing appreciation of the roles of small regulatory RNA molecules in controlling the expression of genes, including virulence genes, usually at a posttranscriptional level [7–10]. Each of these fields has generated an impressive literature and this process is likely to accelerate as the new ‘comparative sciences’ based on bioinformatics and computational biology gain traction. Throughout this history, an underappreciated regulatory factor has been the genetic material itself: DNA. In addition to storing the genetic information required to build, run and reproduce the cell, DNA is also an important contributor to the control of gene expression. In part this arises from the role played by cis-acting DNA elements that serve as binding sites for regulatory proteins. In the older literature these sites were called ‘operators’ and they are certainly very important [11–14]. What is less appreciated is that the DNA landscape upon which these DNA-binding proteins function is itself dynamic and environmentally responsive [15–28]. Because DNA shape is in flux, the target of a DNA-binding protein can be presented to that protein in a variety of forms even though the nucleotide sequence of each of those DNA forms is completely identical. This adds an important variable to the conventional picture of gene regulation in which the key questions concern whether or not a particular DNA-binding protein is present and whether or not its activating signal (chemical or physical) has been detected. 
This additional variable contributes to the creation of physiological diversity across a bacterial population, producing a spectrum of competitive fitness among the genetically identical members of the population upon which the environment imposes selective pressure.
GENE REGULATION MODELS–HISTORICAL INFLUENCES
The operon theory provided a powerful explanation for the collective control of a group of genes contributing to the same biological process [13,29]. The operon is characterized by the transcription of all of its genes as a common messenger RNA from which the open reading frames are translated individually. Because the expression of all of the constituent cistrons of the operon is placed under the control of a single set of transcription signals, these genes are collectively controlled by the same sets of environmental signals. The paradigmatic example of the operon is the lac system of Escherichia coli K-12 (not a pathogen). The DNA-binding protein LacI, the lac repressor, controls transcription of the lac operon negatively by binding to cis-acting sites called operators, preventing transcription of the structural genes lacZYA. LacI, a tetrameric protein, ceases to bind to DNA in the presence of allolactose, a lactose-related carbohydrate that induces lac transcription by diminishing LacI affinity for lac operator sites [30,31]. Lactose is a less attractive carbohydrate for the bacterium than glucose, because metabolism of glucose yields more energy in the form of ATP [32–34]. When glucose is available, its uptake interferes with lac operon induction at two levels. Firstly, it deprives lac of the stimulatory influence of Crp, a DNA-binding protein that recruits RNA polymerase and enhances lac transcription. The Crp protein (sometimes also called CAP) requires cyclic AMP (cAMP) to bind to DNA and a key cAMP-Crp site is located at the lac transcription promoter. Glucose uptake by the cell eliminates cAMP-Crp stimulation of transcription by inhibiting the activity of adenylate cyclase, the enzyme that synthesizes cAMP. Secondly, glucose uptake also inhibits lactose import via the lactose permease, LacY.
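The two levels of glucose interference described above can be summarized as a coarse Boolean sketch. This is an illustration only, not a quantitative model: the function name, the three output labels and the `inducer_exclusion` switch are assumptions introduced here, and real lac expression is graded and leaky rather than strictly on or off.

```python
def lac_operon_state(glucose: bool, lactose: bool,
                     inducer_exclusion: bool = True) -> str:
    """Coarse Boolean sketch of lac control; returns 'off', 'low' or 'high'.

    Hypothetical helper: the labels and the inducer_exclusion switch are
    illustrative assumptions, not measured expression levels.
    """
    # Level 2 of glucose interference: glucose uptake inhibits lactose
    # import through LacY, so the inducer (allolactose) cannot accumulate.
    lactose_imported = lactose and not (glucose and inducer_exclusion)
    # LacI stays bound to the operators unless allolactose lowers its affinity.
    laci_bound = not lactose_imported
    # Level 1 of glucose interference: glucose uptake inhibits adenylate
    # cyclase, so cAMP-Crp is unavailable to stimulate the promoter.
    camp_crp_active = not glucose
    if laci_bound:
        return "off"
    return "high" if camp_crp_active else "low"


for glucose in (False, True):
    for lactose in (False, True):
        print(glucose, lactose, lac_operon_state(glucose, lactose))
```

Setting `inducer_exclusion=False` separates the two effects: the operon is then derepressed by lactose even when glucose is present, but without cAMP-Crp stimulation it reaches only the 'low' state.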
The discovery that the Crp protein regulates many genes other than lac gave rise to the concept of the Crp regulon. In a regulon, a variety of genes and operons at different locations in the genome are governed collectively by copies of the same regulatory protein. In order to belong to a given regulon, a gene typically must have a DNA sequence close to its promoter that matches the consensus for the binding site of the eponymous regulatory protein. In the case of Crp, a match to the sequence 5′-TGTGANNNNNNTCACA-3′ is required, where ‘N’ is any base. The protein ‘reads’ the sequence using a special feature that can enter the major groove of the DNA. In the case of Crp, this is a helix–turn–helix motif. The Crp binding site consists of an inverted repeat to which the protein binds as a dimer with each monomer interacting with one half of the inverted repeat. Possession of this sequence at an appropriate location in the promoter region admits a gene to membership of the Crp regulon, a group that includes both core genome and horizontally acquired genes [42,43]. This sequence-directed binding of a protein to a particular target in DNA involves a ‘direct readout’ mechanism. However, it is now clear that several DNA-binding proteins recognize a DNA shape rather than just a specific sequence, binding through an ‘indirect readout’ mechanism [44–46]. Many DNA-binding proteins combine aspects of both direct and indirect readout in their DNA binding mechanisms. Since particular DNA shapes can be specified by many different DNA sequences, indirect readout binding mechanisms lend themselves to a more promiscuous relationship between the protein and the genome [44,47]. Historically, transcription regulation has been considered chiefly from the perspective of proteins that bind to DNA and influence the transcription process. Indications that DNA itself plays more than a passive role in transcription control are changing our models of collective gene control.
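Direct readout of a consensus with ‘N’ wildcards can be illustrated with a short scan of a sequence. The sketch below is a minimal example under stated assumptions: the function names, the mismatch tolerance and the toy sequence are hypothetical, and the consensus is supplied as an argument (the example uses the Crp half-sites TGTGA and TCACA separated by a six-base spacer). Real binding-site prediction normally uses position weight matrices rather than exact string matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(site: str, consensus: str) -> int:
    """Count positions where site differs from consensus; 'N' matches any base."""
    return sum(1 for s, c in zip(site, consensus) if c != "N" and s != c)


def find_sites(sequence: str, consensus: str, max_mismatches: int = 0):
    """Return (start, site, mismatches) for every window within the tolerance."""
    sequence, consensus = sequence.upper(), consensus.upper()
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = count_mismatches(window, consensus)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Hypothetical inputs: a toy sequence, not a real promoter region.
CRP_CORE = "TGTGA" + "N" * 6 + "TCACA"
toy_sequence = "GGCC" + "TGTGATCTAGATCACA" + "TTAA"

print(find_sites(toy_sequence, CRP_CORE))
# The two half-sites form an inverted repeat, one half per Crp monomer.
print(reverse_complement("TGTGA") == "TCACA")
```

Raising `max_mismatches` gives a crude feel for how relaxed sequence requirements enlarge the set of candidate sites, which is the direction in which indirect (shape-based) readout pushes.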
DNA TOPOLOGY IS VARIABLE
Although it may seem obvious, it is worth recalling that since all of the genes in the genome are made of DNA, if DNA structure can vary in response to changes in bacterial physiology, these changes have the potential to influence the transcription pattern of the cell in highly pervasive ways. DNA in most bacteria is negatively supercoiled because it has a deficit of helical turns compared with its fully relaxed state [48–50]. Negative supercoils are introduced to DNA by the type II topoisomerase DNA gyrase, an ATP-dependent enzyme [51–54]. They are also introduced locally by the tracking activities of DNA and RNA polymerase. The type I enzyme DNA topoisomerase I (Topo I) removes negative supercoils from DNA, using the energy stored in the supercoiled DNA to drive the reaction [51,54,56]. E. coli has two other topoisomerases: Topo III, which is a type I enzyme (related to Topo I) that has a decatenase activity [57,58], and Topo IV, an ATP-dependent type II enzyme that is related to gyrase but lacks the ability to supercoil DNA negatively. Topo IV is essential for the decatenation of the daughter chromosomes at the end of the cell cycle.
About half of the supercoils in the E. coli chromosome are constrained by proteins, with the unconstrained portion of the genome having the potential to use the pent-up torsional energy of supercoiling to drive transactions such as transcription [61,62]. Measurements of superhelical density (sigma) give values that are averages for the whole chromosome; it is very difficult to discuss the superhelicity of discrete portions of the chromosome although these have the potential to be supercoiled to different values of sigma in different places.
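For readers who want the quantity made concrete, superhelical density is conventionally defined as the linking number difference normalized to the relaxed linking number, sigma = (Lk − Lk0)/Lk0, with Lk0 equal to the length in base pairs divided by the helical repeat. The sketch below assumes a helical repeat of 10.5 bp per turn for relaxed B-form DNA; the function names and the 4200 bp circle are hypothetical illustrations, not values taken from the studies cited.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of relaxed B-form DNA


def relaxed_linking_number(length_bp: int, bp_per_turn: float = BP_PER_TURN) -> float:
    """Lk0: linking number of the fully relaxed closed circle."""
    return length_bp / bp_per_turn


def superhelical_density(lk: float, length_bp: int,
                         bp_per_turn: float = BP_PER_TURN) -> float:
    """sigma = (Lk - Lk0) / Lk0; negative values mean a deficit of turns."""
    lk0 = relaxed_linking_number(length_bp, bp_per_turn)
    return (lk - lk0) / lk0


# Hypothetical 4200 bp circle: Lk0 is 400 turns; removing 24 links
# leaves the molecule with a deficit of helical turns.
sigma = superhelical_density(lk=376, length_bp=4200)
print(round(sigma, 3))
```

Because sigma is a ratio over the whole molecule, a single value for the chromosome says nothing about how the deficit is distributed between individual domains, which is the limitation noted in the text.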
The activity of DNA gyrase is sensitive to the ratio of ATP to ADP in the cell. High metabolic fluxes result in higher ATP/ADP ratios, more gyrase activity and hence DNA that is more negatively supercoiled [64,65]. The exponential phase of the bacterial growth cycle is characterized by negative supercoiling of DNA, with DNA becoming more relaxed in the lag and stationary phases. When bacteria experience stress they undergo a shift in the ATP/ADP ratio that alters DNA supercoiling [67,68]. Examples of such stress include changes to temperature, osmotic pressure, oxygen concentration, pH or growth phase [15–17,19,21,22,26,28,66]. Since the tracking of DNA and RNA polymerase creates local domains of supercoiling (both positive and negative), any change to these rates of tracking will affect DNA superhelicity at those sites in the genome [55,69,70].
BACTERIAL NUCLEOID STRUCTURE
Prokaryotes do not possess a nucleus and the single circular chromosome of E. coli exhibits organization at the nanoscale and microscale within a structure known as the nucleoid. The 4.64 Mbp chromosome is subdivided into four macrodomains and two unstructured regions [74–76]. It is further subdivided into approximately 400 microloops of approximately 10 kbp each, with each loop being compacted by negative supercoiling of the DNA [72,77–79]. The chromosome is replicated bi-directionally from oriC to ter, creating two replichores, one Left and one Right (Figure 1). Bioinformatic analysis suggests that the placement of the genes along the replichores is non-random, perhaps reflecting the contributions of genes at corresponding locations in each replichore to the same biological process. There is also a rough correlation between the distance a gene is located from oriC and the stage in the growth cycle at which it is expressed: genes concerned with rapid growth are closer to oriC than to Ter. Highly expressed genes (e.g. those encoding ribosome components) are usually aligned with the direction of DNA replication, perhaps to eliminate head-to-head collisions by DNA and RNA polymerase [80,81], although any causal association is still unclear [82–84]. It has also been suggested that those parts of the chromosome lying closest to oriC are more negatively supercoiled than those in the Ter macrodomain (Figure 1). Experimental data from trimethylpsoralen binding assays support the existence of a supercoiling gradient. During exponential growth the oriC-proximal regions appear more negatively supercoiled, an observation that correlates with more transcriptional activity there and a higher density of binding sites for DNA gyrase. In stationary phase, transcriptional activity in the oriC-proximal part of the chromosome is diminished and the Ter macrodomain becomes the more negatively supercoiled region (Figure 1).
These structural features may impose limits on where new genes can be placed in the chromosome and in which orientation they can be inserted. Experimental data support the notion that gene position is a factor that determines gene expression patterns [71,86–88].
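The relationship between gene orientation and the direction of replication fork movement can be worked out from coordinates alone. The following sketch assumes a circular chromosome with a single origin and terminus; the function names, the 'clockwise'/'counterclockwise' labels and the toy coordinates are hypothetical and are not E. coli map positions.

```python
def replichore(position: int, ori: int, ter: int, genome_length: int) -> str:
    """Return which arm of a circular chromosome a position lies on.

    'clockwise' means the arc running from ori to ter in the direction of
    increasing coordinate; the fork on that arm moves towards higher
    coordinates, and the fork on the other arm moves towards lower ones.
    """
    from_ori = (position - ori) % genome_length
    arc = (ter - ori) % genome_length
    return "clockwise" if from_ori < arc else "counterclockwise"


def is_codirectional(position: int, strand: str, ori: int, ter: int,
                     genome_length: int) -> bool:
    """True if transcription and the replication fork travel the same way.

    strand '+' is taken to mean transcription towards increasing coordinate.
    """
    on_clockwise_arm = replichore(position, ori, ter, genome_length) == "clockwise"
    return (strand == "+") == on_clockwise_arm


# Toy chromosome of 1000 units with ori at 0 and ter at 500 (assumed values).
print(is_codirectional(100, "+", ori=0, ter=500, genome_length=1000))  # co-directional
print(is_codirectional(100, "-", ori=0, ter=500, genome_length=1000))  # head-on
```

A check of this kind is one simple way to ask whether a proposed insertion site and orientation for a new gene would place it head-on to the replication fork.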
Figure 1. The macrodomain structure of the E. coli chromosome
Nucleoid-associated proteins (NAPs) contribute to nucleoid structure, as their name suggests, but also influence gene expression. NAPs like Fis form intimate connections between transcription and growth stage because Fis is abundant in rapidly growing bacteria in the exponential phase of the growth cycle [90,91]. Fis also links site-specific recombination to growth phase, allowing it to influence mechanisms that create diversity in the bacterial population [92–94]. The Fis monomer consists of just four alpha helices, with two of these being devoted to DNA binding. It binds as a homodimer using an induced fit binding mechanism that distorts the DNA helical axis, creating a bend of up to 90°. Fis has a dependency on DNA shape for binding; the A+T-rich DNA regions to which Fis binds show only a weak base sequence consensus [97,98]. This allows Fis to form relationships readily with horizontally acquired DNA molecules that have an appropriate base composition, bringing their genes into the Fis regulon [99,100].
Fis is a very versatile DNA-binding protein that can influence transcription positively or negatively, depending on the position of its binding site relative to the promoter of the target gene. It can behave as a conventional transcription factor, contacting RNA polymerase to modify its behaviour, or it can act indirectly [102–105]. When acting indirectly, Fis may bind upstream of the promoter and constrain a small domain of negatively supercoiled DNA, or it may suppress DNA strand unpairing, transferring this tendency downstream to a target promoter. In this way, Fis works closely with the superhelical state of the DNA to influence transactions such as transcription.
Among the targets of the Fis protein are the promoters of the dusB-fis operon (which encodes Fis) and the promoter of hns, the gene that encodes the H-NS nucleoid-associated protein. Fis autorepresses the transcription of its own gene and stimulates the transcription of hns [17,91,108,109]. Fis and H-NS often encounter one another as opponents in gene regulation across the genome, with Fis attempting to stimulate transcription of genes that are silenced by H-NS [110–112]. H-NS invariably acts as a transcription silencer, although it can act positively on gene expression at a posttranscriptional level. The hns gene, at least in E. coli, is expressed during the passage of chromosome replication forks, leading to the hypothesis that H-NS levels are matched to the number of binding sites for the protein in the genome. Certainly the cell must control the dosage of H-NS carefully because departures from its normal concentration result in significant changes to the physiological state of the cell and its competitive fitness [87,115]. In addition to its role in suppressing transcription initiation, H-NS is involved in modulating transcription termination. It also minimizes the extent of transcriptional noise (or ‘pervasive transcription’) arising from transcription initiation events occurring antisense within open reading frames and in non-coding regions of the genome [117,118]. For all of these reasons, matching the number of H-NS molecules to the number of its targets is likely to be an important challenge for the cell to meet.
H-NS shares with Fis a preference for binding to A+T-rich portions of the genome. In pathogenic bacteria such as Salmonella, Shigella, Yersinia, Vibrio cholerae, the virulent strains of E. coli and many other Gram-negatives, these include the major virulence genes that have been acquired by horizontal gene transfer [110,119–126]. Horizontal transfer may be a historic event, as in the case of the pathogenicity islands of Salmonella, or it may be current, as in the case of the phage-mediated transmission of the cholera toxin gene in V. cholerae.
H-NS interacts with the minor groove of DNA (Fis binds in the major groove, although its binding mode compresses the minor groove) and has a preference for DNA that is intrinsically flexible [129,130]. H-NS forms homodimers that have the capacity to build bridges between different parts of the same DNA molecule or between different DNA molecules [131–133]. It can also polymerize along DNA with or without bridging and this creates a barrier to the binding of other proteins or complexes of proteins, including RNA polymerase. Atomic force microscopy data show that the plectonemically interwound nature of negatively supercoiled DNA, where the writhing of the DNA helical axis brings different parts of the molecule close together, facilitates H-NS mediated bridging. In vitro, H-NS can be ‘toggled’ between its DNA bridging and polymerizing (or ‘DNA stiffening’) modes of binding by changes in the concentration of magnesium ion. It is not known if similar signals act in vivo or if such switching between binding modes is physiologically relevant.
The march of H-NS along the DNA can be arrested by a ‘counter-bridging’ mechanism in which another DNA-binding protein can erect a barrier to H-NS polymerization by bridging two DNA segments in the path of H-NS. The LysR-like DNA-binding protein LeuO can perform this task and so can LacI. The LeuO protein is tetrameric and, like LacI, its tetramers can establish DNA loops. Unlike LacI, which is restricted to the operator sites in the vicinity of lac, LeuO has the potential to erect bridges among sites found throughout the high A+T zones of the genome.
PATHOGENS, NAPs AND DNA SUPERCOILING
Genome sequencing data suggest that many bacterial pathogens have evolved through the acquisition of specialist virulence genes via horizontal gene transfer [127,141] and the acquisition of non-coding DNA regulatory regions via horizontal regulatory transfer. Some common commensals actually possess virulence genes but maintain them in a cryptic state. For example, E. coli K-12 has a haemolysin gene (clyA/hlyE) that is silenced by the H-NS protein [143,144]. For a pathogen to evolve successfully it must integrate the virulence genes that it acquires from both a physical and a regulatory perspective in ways that do not compromise the competitive fitness of the new gene–cell combination. H-NS-mediated transcription silencing has been described as a means by which a pathogen can buy evolutionary time by acquiring genes that are immediately shut down before their expression can compromise fitness, allowing the organism to evolve a regulatory process that permits expression to occur under circumstances that benefit the bacterium [145–147]. However, even this silencing strategy brings risks because it causes the bacterium to devote H-NS to newly arrived targets, possibly resulting in existing target genes escaping silencing and being expressed inappropriately. Gram-negative bacteria have evolved a number of strategies to avoid or evade this problem. One involves the use of a very closely related protein, StpA, that can substitute for H-NS in gene silencing, especially in rapidly growing bacteria where the H-NS population is maximally ‘stretched’ as new binding sites are created through chromosome replication [148–152]. Another strategy involves the import of an H-NS-like protein together with the new H-NS target gene(s) by horizontal transfer, as in the case of certain large plasmids that enhance bacterial virulence [115,153,154].
In addition, H-NS can be ‘directed’ away from core genome targets towards those in horizontally acquired genes by a family of small H-NS-binding proteins related to Hha/YdgT that have dimerization domains in common with H-NS but lack DNA binding domains of their own. These proteins modify the affinity of H-NS for curved DNA in ways that favour the A+T-rich imported genes [155–157].
Once loss of fitness has been avoided by transcriptional silencing, the newly evolved pathogen must develop the means to awaken its silenced new genes. This is achieved by an impressively diverse range of mechanisms, the most basic of which relies on adjustments to the local structure of the DNA itself. In Shigella flexneri, the virF gene that encodes the master regulator of virulence is silenced by H-NS at temperatures below 37°C but, with increasing temperature, a change to the angle of a DNA bend at the virF promoter accompanied by a shift in the position of the bend centre causes the H-NS-DNA bridge at virF to collapse, allowing this gene to be transcribed. This example shows how a fundamental property of DNA, the sensitivity of its structure to temperature (an environmental signal that is also a key to the expression of a virulence phenotype), can be exploited together with the DNA binding requirement of the H-NS protein to create a simple genetic switch that governs a process leading potentially to dysentery in a human host [126,159].
The role of DNA gyrase in the introduction of negative supercoiling into DNA is well established. It is also important to recognize that RNA and DNA polymerases introduce DNA supercoiling at a local level as these molecular machines track along the template strand of duplex DNA. Each polymerase overwinds the DNA ahead of it and underwinds the DNA behind, creating substrates for gyrase and Topo I, which eliminate the overwound and the underwound domains respectively [160,161] (Figure 2). Changes to local supercoiling created in this way allow promoters along the chromosome to influence one another in cis through an under-researched DNA-telegraph-like mechanism [70,162]. In the example given above where local changes to DNA structure activate the promoter of the virF gene in S. flexneri, the VirF protein next activates transcription of the intermediate regulatory gene virB that also has a DNA supercoiling sensitive promoter [163,164]. Normally, a rise in temperature to 37°C adjusts the topology of the virB regulatory region to permit VirF to activate transcription [165,166]. However, the thermal signal can be dispensed with if a second promoter is introduced upstream and oriented divergently from the virB promoter. Activation of this upstream promoter creates the local underwinding of the DNA at virB that is required for its activation by VirF (Figure 3). This case from Shigella involves virulence gene activation by a local increase in DNA negative supercoiling. DNA relaxation, something that accompanies the transition from the exponential to the stationary phase of growth, is also exploited as a trigger for virulence gene activation. S. Typhimurium exploits DNA relaxation as part of the repertoire of signals for the activation of the SPI2 pathogenicity island genes that allow it to survive in the macrophage vacuole, an environment where bacterial growth is slow [24,167]. Furthermore, S. Typhimurium DNA undergoes differential DNA relaxation in epithelial cells and macrophages, allowing it to activate distinct sets of genes in these two intracellular environments. These and many other examples show that beneath the protein-mediated regulatory mechanisms that are so familiar from a reading of the literature there is an often-underappreciated role for dynamic DNA topology in the control of promoter activity. How does this regulatory influence fit into the story of gene regulation evolution?
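The sign convention of the twin-supercoiling domain model, and the way it couples a divergent upstream promoter to virB, can be captured in a deliberately minimal sketch. Everything here is a toy: the function name and all coordinates are hypothetical, and the model ignores topoisomerase activity, barriers to supercoil diffusion and the distance over which the effect decays.

```python
def induced_supercoiling_sign(site: int, polymerase_pos: int, direction: int) -> int:
    """Sign of transcription-induced supercoiling at a site (toy model).

    direction is +1 for a polymerase tracking towards higher coordinates
    and -1 for one tracking towards lower coordinates. DNA ahead of the
    polymerase is overwound (+1); DNA behind it is underwound (-1).
    """
    offset = (site - polymerase_pos) * direction
    if offset > 0:
        return +1
    if offset < 0:
        return -1
    return 0


# Hypothetical coordinates: the target promoter sits at 1000.
TARGET_PROMOTER = 1000

# A divergent upstream promoter at 800 fires leftwards; its polymerase has
# reached 700, so the target promoter lies behind it and is underwound.
print(induced_supercoiling_sign(TARGET_PROMOTER, polymerase_pos=700, direction=-1))

# A tandem upstream promoter at 800 firing rightwards places the target
# promoter ahead of the polymerase (at 900), where the DNA is overwound.
print(induced_supercoiling_sign(TARGET_PROMOTER, polymerase_pos=900, direction=+1))
```

The contrast between the two cases is the point of the sketch: only the divergent arrangement delivers the local underwinding that, in the Shigella example, substitutes for the thermal signal.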
Figure 2. The twin-supercoiling domain model
Figure 3. Transcription-induced DNA negative supercoiling and promoter coupling
GENE REGULATION IN SIMPLE GENOMES
When considering how gene regulatory complexity has evolved, it is useful to look for clues in the simplest organisms. Mycoplasma genitalium is the smallest known self-replicating prokaryotic cell; its genome is only 580 kbp in size and its limited gene content makes the organism dependent on its host for many of its nutritional requirements. The capacity of this organism for protein-mediated transcription regulation is very limited and few (perhaps five) of its protein-encoding genes seem to specify conventional transcription factors. Like M. pneumoniae, M. genitalium has just one sigma factor for RNA polymerase [168,169]. Therefore, Mycoplasma lacks the capacity to re-programme its RNA polymerase to read different promoter types through the interchange of different sigma factors that appear in the cell in response to specific environmental challenges. Our understanding of promoter architecture in Mycoplasma lags behind that in model organisms such as E. coli, yet it is clear that transcription regulation does occur in response to environmental signals [170–175].
Mycoplasmas, including M. genitalium, possess topoisomerases, among them enzymes that resemble DNA gyrase and DNA topoisomerase IV at the amino acid sequence level. A link has been established between transcriptional up-regulation of gene expression in M. genitalium and novobiocin-sensitive topoisomerase activity. These data implicate DNA supercoiling as playing a regulatory role in M. genitalium [176,177]. However, not all Mycoplasmas may have a full complement of topoisomerase activities. For example, a biochemical analysis of topoisomerase activity in M. fermentans and M. pirum failed to detect DNA negative supercoiling activity. Instead, ATP-dependent DNA relaxing activity was found, something that is shared by E. coli Topo IV [59,178]. Topoisomerase activity in M. fermentans and M. pirum is inhibited by novobiocin, the antibiotic that inhibits ATP-dependent negative supercoiling in E. coli DNA gyrase and DNA relaxation in E. coli Topo IV. Thus some Mycoplasmas may resemble eukaryotes in having type II topoisomerase activities that do not include the ability to negatively supercoil relaxed DNA. Taken together with the data from the M. fermentans and M. pirum study, the mechanism may involve an ATP-dependent DNA relaxation of negatively supercoiled DNA by a Topo IV-like activity. Certainly, DNA relaxation has been shown to be a contributing factor to the collective control of virulence gene expression in other intracellular pathogens such as Salmonella [26,167] and Topo IV DNA relaxing activity can modulate virulence gene transcription.
If some Mycoplasmas lack a gyrase-like activity to underwind DNA, how can negative supercoils be introduced? This is likely to be achieved in the same way that DNA in the eukaryotic nucleus becomes negatively supercoiled. There, DNA is wound around a histone core to create a constrained solenoidal negative supercoil that is compensated by the introduction of a positive supercoil in the free, unconstrained DNA (Figure 4). Next, a topoisomerase relaxes the positive supercoil, leaving the constrained negative one intact. In this way, the DNA has gained a negative supercoil that will be revealed if the constraining protein is removed. Mycoplasmas have a protein candidate for the role of core histone in this scenario: it is the HU protein, their only nucleoid-associated protein. Thus, these simple genomes possess the basic elements of a primitive gene regulation apparatus that is linked to the metabolic status of the cell. It consists of the machinery to transcribe genes (RNA polymerase with its single sigma factor), topoisomerases to manage the DNA topological consequences of transcription (and DNA replication), a dependency on ATP for topoisomerase activity that allows this activity to be modulated by shifts in the cellular ATP/ADP balance, and a NAP (HU) that can collaborate with topoisomerases in the introduction of negative supercoils. The same NAP and topoisomerases can also organize the genomic DNA within the nucleoid in a state that is appropriate to support both gene expression and chromosome replication/segregation. Such a basic regulatory set can then evolve through the acquisition of more sophisticated, and specific, regulatory factors, such as transcription factors.
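The three-step route to a negative supercoil without gyrase is a piece of topological bookkeeping, and it can be followed with a small state tracker. This is a sketch under stated assumptions: the class and method names are hypothetical, supercoils are counted in whole units, and each protein wrap is taken to constrain exactly one negative supercoil.

```python
from dataclasses import dataclass


@dataclass
class DomainTopology:
    """Toy bookkeeping for one closed DNA domain (hypothetical model)."""
    constrained: int = 0    # supercoils held in protein wraps
    unconstrained: int = 0  # free supercoils in the rest of the domain

    @property
    def linking_difference(self) -> int:
        """Net linking number change relative to the relaxed domain."""
        return self.constrained + self.unconstrained

    def wrap_on_protein(self) -> None:
        # No strand is broken, so the net value cannot change: the
        # constrained negative wrap is balanced by a free positive supercoil.
        self.constrained -= 1
        self.unconstrained += 1

    def relax_positive(self) -> None:
        # A topoisomerase removes one free positive supercoil (strand
        # passage), which is the only step that alters the linking number.
        if self.unconstrained > 0:
            self.unconstrained -= 1

    def release_protein(self) -> None:
        # Removing the protein reveals the constrained supercoils as free ones.
        self.unconstrained += self.constrained
        self.constrained = 0


domain = DomainTopology()
domain.wrap_on_protein()   # constrained -1, unconstrained +1, net 0
domain.relax_positive()    # constrained -1, unconstrained 0, net -1
domain.release_protein()   # constrained 0, unconstrained -1, net -1
print(domain)
```

The tracker makes explicit that wrapping and release only redistribute supercoils between the constrained and unconstrained pools; the relaxing topoisomerase is what actually lowers the linking number.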
Figure 4. Introduction of a negative supercoil without DNA gyrase
GENE REGULATION IN SPACE AND TIME
The regulatory elements summarized in the previous section can provide the basis of a simple, environment-responsive, genome-wide gene regulation regime that has the capacity to evolve in scale and sophistication. Sophistication can develop from the recruitment of protein regulators that respond to specific environmental signals and transmit those signals to particular genes with which the proteins form relationships. These relationships typically reflect the presence at the target genes of DNA sequences or structures, or both, that attract the protein to bind there. The location of the protein relative to the binding site of RNA polymerase determines whether the protein will act as an activator or a repressor of transcription [181,182]. The protein can also act as an anti-repressor if its binding removes a transcription silencer such as H-NS, as in the case of the VirB intermediate regulator of virulence gene transcription in S. flexneri [183–185]. This regulatory mode can also be combined with that of a transcription activator, as in the case of the ToxT master regulator of virulence gene expression in V. cholerae. Alternatively, the protein may cooperate with another positive regulator to make the target promoter respond to multiple environmental signals. Further regulatory sophistication can be provided through the contributions of trans-acting small RNA or cis-acting RNA elements [7–9].
Local concentration is an important consideration in the determination of the efficiency with which a given regulatory protein can influence the expression of its target gene [187,188]. This can be both effected and affected by the looping of DNA segments to which the regulatory proteins are bound. It can also be influenced by the distance between the gene that encodes the regulatory protein and the target gene(s) in the folded chromosome within the nucleoid [72,189,190]. An important consideration is the distance over which a regulatory factor can diffuse once it has been expressed in the cell [191,192]. Thus, the need for efficient gene-to-gene communication may play a guiding role in the evolution of the architecture of the nucleoid. The requirements of efficient communication may place restrictions on where and in which orientation newly arrived genes acquired by horizontal transfer can be placed in the chromosome. Perhaps it will determine which genes can become chromosomal and which will be maintained in self-replicating plasmids. Clues to the rules that govern genome evolution at a regulatory level can be discerned by inspecting the ‘wiring’ of living bacteria–the evolutionary success stories. These clues can also be obtained by conducting experiments that ‘re-wire’ existing genomes or synthesize them from scratch. In all cases it is probably wise to consider the regulatory wiring arrangements from a DNA-centric as well as a protein-focused perspective.
This work was supported by the Science Foundation Ireland Principal Investigator Award [grant number 13 IA 1875].
1Present address: Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, U.K.