It has been discovered recently, via structural and biophysical analyses, that proteins can mimic DNA structures in order to inhibit proteins that would normally bind to DNA. Mimicry of the phosphate backbone of DNA, the hydrogen-bonding properties of the nucleotide bases and the bending and twisting of the DNA double helix are all present in the mimics discovered to date. These mimics target a range of proteins and enzymes such as DNA restriction enzymes, DNA repair enzymes, DNA gyrase and nucleosomal and nucleoid-associated proteins. The unusual properties of these protein DNA mimics may provide a foundation for the design of targeted inhibitors of DNA-binding proteins.
The interactions of proteins, and particularly of enzymes, with DNA are tightly controlled spatially and temporally to ensure appropriate treatment of the genome. A classical method of control is the use of sequence-specific DNA-binding proteins which block access of another protein, such as RNA polymerase, to the same sequence. The binding affinity of these repressor proteins for their DNA target depends on the cellular environment. Where two such repressor proteins act upon the same or similar sequences, the systems can be linked to form complex switches to regulate, for example, transcription. The best studied of such switches is the classical bacteriophage lambda protein switch controlling the choice of the lytic or lysogenic pathways. Many other repressor proteins and activator proteins are known to either block access of an enzyme to DNA or to enhance the binding of an enzyme to DNA.
The paradigm of the lambda switch is featured in all molecular biology textbooks and it therefore often seems surprising that DNA binding by proteins can also be blocked by direct interaction between an inhibitor protein and the DNA-binding protein. In other words, instead of two proteins competing for binding to the same piece of DNA, we have a competition between an inhibitor protein and the DNA target sequence for binding to the DNA-binding protein. A successful competitive inhibitor usually resembles, in shape, charge and possible interactions, the molecule with which it competes. The proteins that compete with DNA for binding to specific DNA-binding proteins are no exception. DNA sequences recognized by proteins are usually of the order of 10 bp in length, display a range of charged phosphate groups and the edges of the nucleotide bases (which possess different hydrophobicity and hydrogen-bonding potential) and can be distorted away from the typical double helical conformation by twisting and bending.
All of these structural features can be copied by a well-designed inhibitor and increasing numbers of such ‘DNA mimics’ are being discovered.
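The competition described above, between a mimic and the DNA target for the same binding site, can be illustrated with a simple equilibrium calculation. The sketch below is only a toy model: it assumes a single binding site, that DNA and mimic are both in excess over the protein (so free and total concentrations are roughly equal), and it uses hypothetical concentrations and dissociation constants that are not taken from any of the systems reviewed here.

```python
def fraction_bound_to_dna(dna, kd_dna, mimic, kd_mimic):
    """Fraction of a DNA-binding protein occupied by DNA at equilibrium.

    Single-site competition model. Assumes free DNA and free mimic are
    approximately equal to their totals (both in excess over the protein).
    All concentrations and Kd values must share one unit.
    """
    dna_term = dna / kd_dna
    mimic_term = mimic / kd_mimic
    return dna_term / (1.0 + dna_term + mimic_term)


# Hypothetical values (nM), chosen only to illustrate the trend:
# the mimic binds 100-fold more tightly than the DNA target.
without_mimic = fraction_bound_to_dna(dna=10.0, kd_dna=1.0, mimic=0.0, kd_mimic=0.01)
with_mimic = fraction_bound_to_dna(dna=10.0, kd_dna=1.0, mimic=10.0, kd_mimic=0.01)

print(f"DNA-bound fraction without mimic: {without_mimic:.3f}")
print(f"DNA-bound fraction with mimic:    {with_mimic:.3f}")
```

In this toy case, a mimic present at the same concentration as the DNA but binding more tightly displaces almost all of the protein from the DNA, which is the qualitative behaviour expected of an effective competitive inhibitor.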
ocr: an inhibitor of type I DNA R/M (restriction and modification) enzymes
The oldest studied example of a DNA mimic protein is the gene 0.3 protein, also known as ocr for ‘overcome classical restriction’, expressed immediately by bacteriophage T7 upon infection of Escherichia coli. The ocr protein drastically reduces the effectiveness of all type I DNA restriction systems within the host cell and enables successful infection of the bacterium by the phage. Mark and Studier noticed upon sequencing of the gene that the ocr protein is highly acidic and contains a vast excess of aspartate and glutamate residues over arginine and lysine residues. They also demonstrated a direct interaction of the ocr protein with the type I R/M enzyme, EcoKI, which prevented the enzyme from binding to DNA, and suggested that ocr acts as a polyanionic inhibitor of DNA binding which might have great potential for further engineering. ocr therefore became classified as an antirestriction protein [3–7]. At about the same time, it was noticed that ocr also appeared to bind tightly to E. coli RNA polymerase, perhaps suggesting that ocr might interfere with DNA binding by other proteins in addition to the type I R/M enzymes, although this observation has not been followed up in the intervening years.
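The highly acidic composition noted from the gene sequence is the kind of feature that can be checked directly from any protein sequence. The sketch below counts acidic (aspartate, glutamate) and basic (lysine, arginine) residues in one-letter code and gives a crude net-charge estimate. The function name and the toy sequence are hypothetical and used only for illustration; the sequence is not that of ocr, and the estimate ignores histidine, the chain termini and local pKa shifts.

```python
ACIDIC = set("DE")  # aspartate, glutamate
BASIC = set("KR")   # lysine, arginine


def acidic_basic_balance(sequence):
    """Count acidic and basic residues in a one-letter protein sequence.

    Returns the two counts and a crude net-charge estimate near neutral pH
    (basic minus acidic); histidine and the termini are ignored.
    """
    seq = sequence.upper()
    acidic = sum(1 for aa in seq if aa in ACIDIC)
    basic = sum(1 for aa in seq if aa in BASIC)
    return {"acidic": acidic, "basic": basic, "net_charge_estimate": basic - acidic}


# Hypothetical toy sequence, NOT the ocr sequence.
toy_sequence = "MDEELDAEDLKEEDYDSEAR"
balance = acidic_basic_balance(toy_sequence)
print(balance)
```

A strongly negative estimate of this kind is only a first hint of possible DNA mimicry; as the structural work on ocr shows, it is the spatial arrangement of the charges that matters.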
Biophysical and crystallographic studies of ocr eventually revealed that it structurally mimics 24 bp of B-form DNA containing a central bend of 34° [9–11]. The binding of the type I R/M enzyme EcoKI to ocr was far stronger than its binding to the enzyme's DNA target sequence. This effect is apparently due to the fact that the enzyme bends DNA upon binding, an energy-consuming operation not required with the pre-bent ocr molecule. ocr has a range of carboxy groups positioned across its surface to closely match the location of phosphate groups on the DNA target; thus the electrostatic interactions between ocr and EcoKI should be similar to those between EcoKI and DNA. The ocr protein uses amphipathic α-helices to arrange the carboxy groups appropriately and the periodicity of the helix would seem ideal for the construction of DNA mimics. Although the ocr protein has only been found in T7 and a few close relatives [14,15], the use of DNA mimicry by other antirestriction proteins with some degree of similarity to ocr, such as the Ard family of proteins and the phage T3 SAMase protein, seems highly probable [7,16–19].
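The point about helical periodicity can be made concrete with a helical-wheel calculation. The sketch below assumes ideal textbook α-helix geometry (about 3.6 residues per turn, i.e. 100° of rotation and roughly 1.5 Å of rise per residue) and lists which positions fall on the same face as the first residue. The function name and the ±60° definition of a 'face' are assumptions made for illustration and are not derived from the ocr structure.

```python
DEGREES_PER_RESIDUE = 100    # ideal alpha-helix: 360 degrees / 3.6 residues
RISE_PER_RESIDUE = 1.5       # approximate axial rise per residue, in angstroms


def same_face_positions(length, half_width_deg=60):
    """Return (index, angular offset, axial position) for residues lying
    within half_width_deg of residue 0 on an ideal alpha-helical wheel."""
    positions = []
    for i in range(length):
        angle = (i * DEGREES_PER_RESIDUE) % 360
        offset = min(angle, 360 - angle)  # angular distance from residue 0
        if offset <= half_width_deg:
            positions.append((i, offset, i * RISE_PER_RESIDUE))
    return positions


face = same_face_positions(18)
for index, offset, axial in face:
    print(f"residue {index:2d}: {offset:3d} deg from reference, {axial:4.1f} A along axis")
```

Placing acidic residues at positions that recur every three to four residues therefore lines their carboxy groups up along one side of the helix, which is the sense in which an amphipathic α-helix offers a regular scaffold for imitating a row of phosphate groups.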
UGI (uracil glycosylase inhibitor): a phage-encoded inhibitor of uracil glycosylase
A second clear example of DNA mimicry with some biological similarity to antirestriction was found by studying the uracil-DNA glycosylase inhibitor protein from bacteriophage PBS2 and its interaction with uracil-DNA glycosylase [20–23]. The phage genome unusually contains uracil instead of thymine, a feature rendering it resistant to bacterial DNA R/M systems and thus seemingly a good method of ensuring propagation of the phage through the bacterial population. However, bacteria have uracil glycosylase, which, as part of the DNA repair machinery of the cell, removes uracil bases from damaged DNA. Such an action on the phage DNA would lead to its inactivation, so the phage encodes a small protein, UGI, which binds strongly to uracil glycosylase and prevents the enzyme from binding to DNA. The UGI protein, like ocr, is made immediately upon infection of the cell and knocks out a host system that could act as a defence mechanism for the bacterium. As uracil glycosylase only needs to recognize uracil within DNA, it does not need to recognize a long DNA target sequence. Therefore the UGI protein only needs to mimic a very short section of DNA, just a few bases, and structures revealed that it would mimic phosphate groups and uracil glycosylase–DNA interactions (H bonds and hydrophobic contacts).
dTAFII230: a regulator of the TBP (TATA-box-binding protein) in eukaryotes
Neither of the above examples of DNA mimics emulates specific DNA target sequences since they must inhibit DNA-binding enzymes that recognize a range of DNA sequences; type I R/M enzymes from different bacteria recognize different DNA targets but ocr inhibits them all, and uracil can be flanked by any other nucleotide but UGI will still function.
As an example of sequence-specific DNA mimicry, one has the dTAFII230 protein of Drosophila [2,24]. This protein mimics the DNA sequence and structure recognized by TBP. TBP has several features enabling it to recognize and bind to DNA. It interacts with the phosphates using two rows of lysine and arginine residues, two asparagines sit in the middle of the binding surface and hydrogen-bond with the DNA bases and four phenylalanine residues partially force their way into the DNA minor groove to create a major distortion of the DNA. dTAFII230 arranges its amino acids to mimic all of these interactions. It has two rows of acidic amino acids to interact with the arginine and lysine residues, a methionine residue and several main chain peptides to hydrogen-bond to the two asparagines and two large concave hydrophobic surfaces to interact with the phenylalanine residues on TBP. Therefore dTAFII230 is a very precise mimic of the grossly distorted DNA structure bound by TBP.
DinI: a down-regulator of the SOS response in bacteria
The DinI protein of E. coli is a down-regulator of the SOS response that appears to interfere with the RecA-DNA filament, a key structure of the SOS response. The DinI protein structure reveals the presence of a single amphipathic α-helix exposing a run of acidic residues along one face of the helix. It was suggested that this helix could mimic the single-stranded DNA substrate recognized and bound by RecA [25,26]. In other words, DinI was proposed to be a mimic of the phosphate backbone of single-stranded DNA. Given the use of two side-by-side α-helices by ocr to mimic double-stranded DNA, this suggestion seems plausible. However, more recent experiments have called the DNA mimicry into question, as the ability of DinI to displace RecA from single-stranded DNA seems to be a minor effect [27,28]. DinI instead appears to interact with the RecA-DNA filament and alter its structure without causing displacement of the RecA from the DNA. Nevertheless, the suggestion that single-strand DNA mimicry can also exist is a useful one and such mimicry may well be identified conclusively in the future.
HI1450: a DNA mimic in Haemophilus influenzae
The structure of the HI1450 protein expressed by H. influenzae was solved before any evidence of biological function was forthcoming. The structure revealed a great abundance of aspartate and glutamate residues arranged on α-helices and the spacing of the residues was such as to suggest a possible overlap with the phosphates of DNA. Recently, it has been demonstrated that this protein binds strongly to HU-α. HU-α is an abundant protein in the nucleoid of H. influenzae and binds to the DNA to form nucleosome-like structures as found for the histone-like proteins in other bacteria such as E. coli. Whether the interaction between HI1450 and HU-α is the correct physiological interaction remains to be seen, but clearly protein mimics of DNA could play a major, and so far unrecognized, role in the control of the nucleoid structure in bacteria.
MfpA: a Mycobacterium tuberculosis protein that binds to DNA gyrase
The most recent example of DNA mimicry to be discovered is the extraordinary MfpA protein expressed by M. tuberculosis. This protein plays an important role in conferring resistance to fluoroquinolone antibiotics. Fluoroquinolones inhibit DNA gyrase by binding strongly to the covalently linked intermediate in the enzyme–DNA complex and preventing the completion of the reaction, with fatal consequences for the bacterium. The MfpA protein binds to gyrase and apparently protects it from fluoroquinolone until the enzyme has carried out its reaction. MfpA is a member of the pentapeptide repeat family of proteins and it seems likely that other members of this family may be DNA mimics as well. MfpA is a dimer, with each monomer forming a β-strand helical structure that allows acidic side chains to project out from the surface of the protein to mimic phosphate groups and other groups to potentially mimic base interactions. Although the exact structural form of the interaction between MfpA and DNA gyrase is not known, the β-strand helical dimer is long enough to mimic approx. 30 bp of DNA and it seems highly probable that MfpA is a dramatic example of DNA mimicry.
To quote the authors of the MfpA paper, “the core of the right-handed quadrilateral β-helix structure appears robust enough to allow for surface amino acid substitutions that could tailor specificity and could provide a platform for the rational design of proteins that specifically target DNA-binding proteins of known structure”. This statement could clearly be expanded to encompass all of the known DNA mimics discussed above and such a use should be investigated in parallel with a search for further examples in nature.
Recombinant DNA Technology for the 21st Century: Focused Meeting held at AstraZeneca, Loughborough, U.K., 21–22 November 2005. Organized by M. Dyson (Wellcome Trust Sanger Institute), J. Sayers (Sheffield, U.K.) and A. Wallace (AstraZeneca, U.K.). Edited by J. Sayers.
We thank the Biotechnology and Biological Sciences Research Council and The Leverhulme Trust for grants to support our work on ocr.
Present address: The Sheffield Bioincubator, University of Sheffield, Sheffield S3 7QB, U.K.