In humans, expression of the FMO1 (flavin-containing mono-oxygenase 1) gene is silenced postnatally in liver, but not kidney. In adult mouse, however, the gene is active in both tissues. We investigated the basis of this species-dependent tissue-specific transcription of FMO1. Our results indicate the use of three alternative promoters. Transcription of the gene in fetal human and adult mouse liver is exclusively from the P0 promoter, whereas in extra-hepatic tissues of both species, P1 and P2 are active. Reporter gene assays showed that the proximal P0 promoters of human (hFMO1) and mouse (mFmo1) genes are equally effective. However, sequences upstream (−2955 to −506) of the proximal P0 of mFmo1 increased reporter gene activity 3-fold, whereas hFMO1 upstream sequences (−3027 to −541) decreased reporter gene activity by 75%. Replacement of the upstream sequence of human P0 with the upstream sequence of mouse P0 increased activity of the human proximal P0 8-fold. Species-specific repetitive elements are present immediately upstream of the proximal P0 promoters. The human gene contains five LINE (long-interspersed nuclear element)-1-like elements, whereas the mouse gene contains a poly A region, an 80-bp direct repeat, an LTR (long terminal repeat), a SINE (short-interspersed nuclear element) and a poly T tract. The rat and rabbit FMO1 genes, which are expressed in adult liver, lack some (rat) or all (rabbit) of the elements upstream of mouse P0. Thus silencing of FMO1 in adult human liver is due apparently to the presence upstream of the proximal P0 of L1 (LINE-1) elements rather than the absence of retrotransposons similar to those found in the mouse gene.
Mammalian FMOs (flavin-containing mono-oxygenases; EC 1.14.13.8) are microsomal enzymes that catalyse the NADPH-dependent mono-oxygenation of numerous foreign chemicals, including therapeutic drugs and environmental pollutants. These enzymes thus constitute an important interface between the organism and its chemical environment. In human and in mouse, the FMO1, 2, 3, 4 and 6 genes are clustered on chromosome 1 [2,3]. The FMO5 gene lies outside the cluster; in humans it is located on chromosome 1 and in mouse it is on chromosome 3, in a region of synteny between mouse and human. A second FMO gene cluster, also located on chromosome 1 of both species, encodes five pseudogenes in human and, in the mouse, three genes that are not known to be expressed. Although functional FMO genes show evolutionary conservation with respect to both sequence and organization, changes have occurred in individual genes, which markedly influence the species-specific expression of the FMO1, 2 and 3 genes [3–6].
The reasons for these species-specific differences in FMO gene expression differ. For instance, in most humans the FMO2 gene encodes a non-functional protein, because of a C>T mutation that converts a glutamine residue codon at position 472 into a stop codon. In mouse and all other species examined to date, including the chimpanzee, the FMO2 gene encodes a glutamine residue at position 472 and hence a full-length FMO2 protein [5,7]. In humans, with the exception of individuals who suffer from trimethylaminuria, the FMO3 gene is most highly expressed in the liver [8–10], whereas in male mice expression of the Fmo3 gene in liver is switched off 5 weeks after birth [6,11]. The hepatic silencing of the Fmo3 gene in male mice has been shown to be mediated by hormonal factors.
In humans, expression of the FMO1 gene in liver is switched off shortly after birth, but the gene continues to be expressed in adult kidney [2,4,9]. Silencing of FMO1 gene expression in adult liver is specific to humans. In all other mammals studied, e.g. pig, rat, rabbit, mouse [11,15] and dog, the gene continues to be expressed in the liver after birth.
A consequence of this species difference in FMO1 expression is that, in adult humans, the contribution of this protein to detoxification is extra-hepatic. In contrast, in laboratory animals used in drug metabolism studies, FMO1 is a major form of the enzyme present in adult liver. An increasing number of therapeutic drugs, including tamoxifen, itopride, benzydamine, olopatadine and xanomeline, have been shown to be substrates for human FMO1. In such cases, extrapolation of drug metabolism data derived from experimental animals and in vitro systems requires careful consideration.
In the present paper, we report the use of functional transcription assays and comparative gene analyses to identify DNA sequences that play a role in the species-specific extinction of FMO1 gene expression in human liver. We show that the presence of species-specific repetitive DNA elements and the use of alternative tissue-specific promoters in liver and kidney can account for the differential developmental and tissue-specific expression of the FMO1 gene in human and mouse.
Identification of transcriptional start sites
RNA was isolated from liver and kidney of 8-week-old female C57BL/6 mice. The Fmo1 exon 2 primer 5′-gggtgttaacggtgagcgaa-3′ (Eurogentec, Hampshire, U.K.) was end-labelled with [γ-32P]ATP and mixed with 12 μg of total RNA. A primer extension reaction was carried out for 1 h with AMV RT (reverse transcriptase) (Roche, Lewes, East Sussex, U.K.). For RT–PCR reactions, RNA from liver and kidney was reverse-transcribed by using the Fmo1 exon 5 primer 5′-cccttaaaagttagtatacct-3′, and amplified with Taq DNA polymerase (Roche Molecular Biochemicals) using the same primer and one located in intron 1 of the gene (5′-gcacaccacacagatagtct-3′). In addition, RNA was reverse-transcribed by using the exon 2 primer described above and amplified with this primer and one located in exon 0 (5′-gctctgggatcctaattgtgt-3′). The amplified products were cloned and sequenced.
The transcriptional start sites of the human FMO1 gene were determined by TAP-RLPCR [TAP (tobacco acid pyrophosphatase) reverse ligation-mediated PCR], as described by Fromont-Racine et al. Fetal liver and adult kidney samples were obtained from the MRC Tissue Bank (Royal Marsden Hospital, London, U.K.) and St Mary's Hospital (London, U.K.) respectively, as described previously. Total RNA was extracted by the use of TRIzol® (Invitrogen, Paisley, Renfrewshire, Scotland, U.K.). RNA was incubated with calf intestinal alkaline phosphatase (New England Biolabs, Hitchin, Herts., U.K.). This removed phosphate groups from the 5′-end of partially degraded, i.e. non-capped, mRNAs. The RNA was then treated with TAP (5 units; Epicentre Technologies, Madison, WI, U.S.A.), which hydrolysed the 5′–5′-phosphodiester-linked cap structure from full-length mRNAs. An RNA linker 5′-gggcauaggcugacccucgcugaaa-3′ was synthesized from a partially double-stranded DNA comprising a 17-nt sequence 5′-taatacgactcactata-3′ bound to a 42-nt sequence 3′-attatgctgagtgatatcccgtatccgactgggagcgacttt-5′. The 17-bp double-stranded region of the DNA contained a T7 promoter, and the 25-nt single-stranded region provided a template for RNA synthesis. An RNA copy of the single-stranded region was produced by transcription with T7 RNA polymerase. The RNA linker was ethanol-precipitated and electrophoresed through a 12% (w/v) polyacrylamide gel containing 8 M urea. Nucleic acids were visualized by UV shadowing and the band corresponding to the 25-mer linker was excised from the gel, eluted from the polyacrylamide and purified by reverse-phase chromatography on a Sep-Pak C18 column (Waters Associates, Milford, MA, U.S.A.).
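The relationship between the partially double-stranded DNA template and the RNA linker can be checked with a short script. The sketch below is illustrative only: the function and constant names are ours, and it simply assumes that T7 RNA polymerase copies the whole single-stranded overhang of the 42-nt strand, base for base, into RNA.

```python
# Sequences exactly as quoted in the text.
TOP_STRAND_5_TO_3 = "taatacgactcactata"  # 17 nt, T7 promoter strand
BOTTOM_STRAND_3_TO_5 = "attatgctgagtgatatcccgtatccgactgggagcgacttt"  # 42 nt, written 3'->5'

# DNA template base -> RNA base incorporated opposite it.
DNA_TO_RNA_COMPLEMENT = {"a": "u", "c": "g", "g": "c", "t": "a"}


def transcribe_template(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') copied from a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5)


def linker_from_template(top: str, bottom: str) -> str:
    """Transcribe only the overhang of the bottom strand not paired with the top strand."""
    single_stranded_region = bottom[len(top):]
    return transcribe_template(single_stranded_region)


linker = linker_from_template(TOP_STRAND_5_TO_3, BOTTOM_STRAND_3_TO_5)
print(linker, len(linker))
```

Because the template is written 3′ to 5′, complementing it position by position yields the RNA directly in the 5′ to 3′ orientation, so no reversal step is needed.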
RNA (1 μg), treated as described above, was ligated to 100 ng of RNA linker with T4 RNA ligase. Ligated RNAs were reverse-transcribed by using the FMO1 primer +203/+229 (5′-cttctgggaatggaaagtctgagtaac-3′) (numbers are relative to the A of the ATG translational initiation codon). The resulting product was used as a template for PCR by using the linker-specific primer DNAPr-1 (5′-gggcataggctgaccctcgctg-3′) and an FMO1 primer (−71 to −90, relative to the A of the ATG) (5′-atcagtatgagccagtgctg-3′) labelled at its 5′-end with [γ-32P]ATP (>5000 Ci/mmol) through the use of a 5′-end-labelling kit (Amersham Biosciences). The PCR was catalysed using ThermoZyme DNA polymerase (Invitrogen).
For sequence analysis of transcriptional start sites, RNA from human fetal liver or adult kidney was ligated to the RNA linker and reverse-transcribed with random primers. The products were amplified by semi-nested PCR using the linker-specific primer (DNAPr-1) and FMO1 primers +373/+399 (5′-ctcttcatgcatagtgaccacctccca-3′) and +203/+229. DNAs were amplified by ‘touchdown’ PCR in a GeneAmp PCR System 96 thermal cycler (PerkinElmer, Norwalk, CT, U.S.A.). Amplified products were cloned into pCR4-TOPO cloning vector (Invitrogen) and sequenced.
Mining cDNA sequences
Human and mouse FMO1 cDNAs were identified from BLASTn analyses (http://www.ncbi.nlm.nih.gov/entrez/) using clones M64082  and NM_010231  respectively as query sequences. Human cDNAs identified were BC047129 [IMAGE consortium (Integrated Molecular Analysis of Genomes and their Expression consortium); http://www.image.llnl.gov] and AK097039 . Mouse cDNAs identified were D16215 , BC011229 , BF532824, AI115B9, AA245076, AI255718, AA238774 and BI247068 (IMAGE consortium), all derived from liver mRNAs, and BF784152, CB954312, CB599568, AI118998 and CB955318 (IMAGE consortium), derived from kidney mRNAs.
Promoter-reporter gene constructs
The parent plasmid for each construct was pGL3 Basic (Promega, Southampton, U.K.). Oligonucleotides used to prime amplification of promoter sequences of FMO1 genes of human and mouse are given below. Restriction sites, for insertion of amplified products into the parent plasmid, were included in the primers and are shown in bold-face. Human sequences were amplified from human genomic DNA by using the reverse primer +27, 5′-caagcttccccagcacagtggataaac-3′, and forward primer −544, 5′-cgagctcccactcgatcatgcctattt-3′, or −3027, 5′-cgagctcgccctgctcatcacattca-3′, to produce plasmids pGL −544(H) and pGL −3027(H) respectively. Mouse sequences were amplified from mouse genomic DNA by using the reverse primer +50, 5′-caagcttgggagttccctgcacacaggat-3′, and forward primer −431, 5′-cgagctcgccaggactcatcatgacttcgaa-3′, or −2955, 5′-cgagctcggcatggcatgaaaggaaaa-3′, to produce plasmids pGL −431(M) and pGL −2955(M) respectively. The construct pGL −544(H)/−2955(M) was prepared by cloning an amplified mouse product (forward primer −2955 and reverse primer −506, 5′-cgagctcgggaatgcaagacagatgtgtg-3′) upstream of the human proximal promoter in pGL −544(H). The products were amplified by using BIO-X-ACT Short or Long DNA polymerase (Bioline, London, U.K.) as appropriate.
Cell transfection and reporter gene assays
HepG2 cells (passage 9–13) were obtained from the European Collection of Animal Cell Cultures. Cells were cultured in Williams' E medium (Sigma–Aldrich) supplemented with gentamicin (50 μg/ml) and 10% (v/v) fetal calf serum. Cells were transfected at 70–75% confluency with the reporter constructs shown in Figure 5. Each 60-mm plate was transfected with 5 μg of a reporter gene construct and 0.25 μg of pRL-TK (Promega, Madison, WI, U.S.A.), as a control for transfection efficiency, using Tfx-20 reagent (Promega) in the ratio of 3:1 (Tfx-20/DNA). Luciferase reporter gene activity was measured after 48 h by using the Dual-luciferase reporter system (Promega) according to the manufacturer's recommendations.
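In a dual-reporter design of this kind, the signal from the test construct is usually expressed relative to the signal from the co-transfected control plasmid (here pRL-TK) before constructs are compared. The text does not give the exact calculation used, so the following is a minimal sketch of one common convention. The function names and all numerical readings are hypothetical and are not data from this study.

```python
def normalized_activity(firefly_signal: float, renilla_signal: float) -> float:
    """Reporter (firefly) signal corrected for transfection efficiency (Renilla control)."""
    if renilla_signal <= 0:
        raise ValueError("control (Renilla) signal must be positive")
    return firefly_signal / renilla_signal


def relative_activity(sample: tuple, reference: tuple) -> float:
    """Normalized activity of a sample construct as a fraction of a reference construct.

    Each argument is a (firefly_signal, renilla_signal) pair from one plate.
    """
    return normalized_activity(*sample) / normalized_activity(*reference)


# Hypothetical luminometer readings (arbitrary units), for illustration only.
reference_plate = (1000.0, 50.0)   # e.g. a proximal-promoter construct
sample_plate = (500.0, 100.0)      # e.g. a construct with added upstream sequence

print(relative_activity(sample_plate, reference_plate))
```

A relative activity below 1 corresponds to a reduction compared with the reference construct, and a value above 1 to an increase.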
Comparative sequence analyses
Human and mouse FMO1 genomic sequences were downloaded from the Wellcome Trust Sanger Centre, U.K., at http://www.ensembl.org/Homo_sapiens/ and http://www.ensembl.org/Mus_musculus/ (Ensembl v37). Rat FMO1 sequences were downloaded from http://genome.ucsc.edu/ (release June 2003). The rabbit FMO1 promoter sequence was taken from the literature. Sequences were aligned using the dot matrix facility of MacVector 6.5.3 and VISTA (http://www.gsd.lbl.gov/vista) [23,24]. Transcription factor sequences were analysed by using MacVector 6.5.3 and AliBaba2 (http://www.alibaba2.com/). Conserved motifs were identified using DNA Footprinter (http://bio.cs.washington.edu/software.html) and ConSite (http://mordor.cgb.ki.se/cgi-bin/ConSite/consite/).
Tissue-specific use of alternative promoters in the mouse Fmo1 gene
Transcriptional start sites of Fmo1 were determined by primer extension. An Fmo1-specific primer complementary to the sequence +26 to +7, relative to the A of the AUG translation initiation codon, was used to prime synthesis from adult mouse liver and kidney RNAs. A complex pattern of extended products was obtained from each of the two tissues (Figure 1a). Although these patterns are similar, a product of 180 nt is clearly more abundant in kidney than in liver, indicating a potential tissue-specific difference in the transcriptional start of Fmo1.
Figure 1. Analysis of transcriptional start sites of mouse Fmo1 in adult liver and kidney
Analysis of available mouse liver FMO1 cDNA clones (NM_010231, D16215, BC011229, BF532824, AI115B9, AA245076, AI255718, AA238774 and BI247068) shows that although their leader sequences differ in length, all are derived from what we now call exon 0 spliced to sequences derived from exon 2 (Figure 2). RT–PCR and sequence analysis of the amplified products confirm that mouse liver FMO1 mRNAs are derived from the splicing of exon 0 to exon 2 (Figure 1b). These two exons are separated by a distance of ∼6.68 kb. Thus, in mouse liver, transcription of Fmo1 is initiated from a promoter (P0) located upstream of exon 0. Of the five available kidney FMO1 cDNAs, three, BF784152, CB954312 and CB599568, have leader sequences derived from the 3′-end of intron 1 and exon 2. RT–PCR and sequence analysis of the amplified products confirmed that, in kidney, transcription can occur from within intron 1, from a promoter designated P2 (Figure 1c). The two other kidney cDNAs, AI118998 and CB955318, are the products of a splicing event between exon 2 and a novel exon we now call exon 1 (Figure 2). These two exons are separated by a short intron of 231 bp. Transcription of these mRNAs starts from a promoter (P1) located upstream of exon 1 (Figure 2). Thus the results of the RT–PCR experiments confirm the use of alternative promoters and promoter-driven splicing events in the expression of the mouse Fmo1 gene.
Figure 2. Alternative promoters are used in the transcription of the FMO1 gene in human and mouse
Although the results of the RT–PCR experiments suggest that the P0 and P2 promoters can be used in both liver and kidney, there is a preference for P0 in liver and a marked preference for P2 in kidney (Figures 1b and 1c). This is supported strongly by the finding that all nine of the liver-derived cDNAs are transcribed from P0, whereas none of the five kidney-derived cDNAs is transcribed from this promoter. Therefore, taken together, the results indicate a marked in vivo tissue-specific preferential, if not exclusive, use of the P0 promoter in liver and the P1 and P2 promoters in kidney.
Tissue-specific use of alternative promoters in the human FMO1 gene
Primer extension analysis of mouse FMO1 mRNA yields a complex pattern of extended products (see above). A similar pattern was reported previously for the rabbit and human FMO1 mRNAs. Consequently, to define the start sites of transcription for the human FMO1 gene in fetal liver and adult kidney, we used TAP-RLPCR. This method is advantageous because only mRNAs that are capped at their 5′-ends (and hence are full-length) are amplified.
Human fetal liver and adult kidney RNAs were reverse-transcribed and then amplified by PCR (Figure 3a). A PCR product of approx. 82 bp was obtained from fetal liver and, in less abundance, also from adult kidney RNAs (Figure 3a), indicative of an mRNA that extends 127 nt upstream of the AUG translation initiation codon. To define the transcriptional start site, a second TAP-RLPCR was performed using semi-nested PCR with FMO1 gene-specific primers corresponding to sequences within exons 4 and 3 respectively. A 380-bp fragment was amplified from fetal liver and adult kidney (results not shown). An additional 400-bp fragment was amplified only from adult kidney. The amplified products were cloned and sequenced. Of ten clones of the 380-bp product, eight corresponded to a leader sequence extending 127 nt upstream of the A of the AUG initiation codon and two to a leader extending 125 nt upstream of the AUG. The 5′-ends of these mRNAs correspond to transcriptional start sites located −9594 bp and −9592 bp respectively upstream of the A of the ATG translation initiation codon. These start sites are used in fetal liver and adult human kidney. In both cases, exon 0 of the FMO1 gene is spliced to exon 2. Sequence analysis of clones corresponding to the 400-bp adult kidney-specific PCR product showed an alternative transcriptional start site 151 bp upstream of the translation initiation codon (Figure 3c). The leader sequence of these mRNAs is derived from intron 1 sequence and in these mRNAs exon 2 is extended at the 5′-end.
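The link between the size of the first TAP-RLPCR product and the length of the leader can be made explicit with a small calculation. The sketch below assumes that the product consists of the full 25-nt linker joined to the mRNA from its first nucleotide down to position −71, the downstream limit of the FMO1 primer (−71 to −90) described in the methods. The function name and defaults are ours.

```python
LINKER_LENGTH_NT = 25           # RNA linker ligated to the decapped 5'-end
PRIMER_DOWNSTREAM_LIMIT = -71   # FMO1 primer spans -71 to -90 (A of ATG = +1)


def expected_product_size(leader_length_nt: int,
                          primer_downstream_limit: int = PRIMER_DOWNSTREAM_LIMIT,
                          linker_length_nt: int = LINKER_LENGTH_NT) -> int:
    """Size (bp) of the linker-to-primer PCR product for a given leader length.

    A leader of N nt starts at position -N, so the gene-derived part of the
    product covers positions -N to primer_downstream_limit inclusive.
    """
    gene_derived = leader_length_nt + primer_downstream_limit + 1
    if gene_derived <= 0:
        raise ValueError("transcript start lies downstream of the primer")
    return linker_length_nt + gene_derived


for leader in (127, 125):
    print(leader, expected_product_size(leader))
```

Under these assumptions, a 127-nt leader gives 25 + 57 = 82 bp, in agreement with the product of approx. 82 bp described above.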
Figure 3. Mapping of the transcriptional start site of the human FMO1 gene in fetal liver and adult kidney
Therefore, in the human, as in the mouse, different start sites can be used to transcribe the FMO1 gene (Figures 1–3). In fetal liver and adult kidney, transcription begins with exon 0, using the promoter P0. In the kidney, transcription can begin also from within intron 1 (Figure 3), from promoter P2. Very few human FMO1 cDNAs have been identified. The only liver-derived cDNA (M64082) is transcribed from P0. A cDNA (BC047129) derived from a pool of colon, kidney and stomach RNAs is transcribed from P2. A single FMO1 cDNA (AK097039) has been isolated from human small intestine . The 5′-end of this cDNA is derived from an exon located between exons 0 and 2, which we call exon 1, and is thus transcribed from promoter P1. The position of exon 1 in human FMO1 is equivalent to that of the corresponding exon in the mouse gene (Figure 2) . The small intestine cDNA has sequence derived from exons 1–9, but also contains intronic sequences from between exons 2 and 3 and between exons 7 and 8. Thus it is not expected that this cDNA would encode a functional protein.
Analysis of transcriptional start sites, RT–PCR and sequence analysis of the amplified products, and a survey of the sequences of available cDNA clones, revealed that in both the human and mouse FMO1 genes, transcription can begin from three different promoters: P0 (upstream of exon 0), P1 (upstream of exon 1) and P2 (upstream of exon 2).
Species-specific repetitive elements and the transcription of the FMO1 gene from P0
RT–PCR analysis showed that, in human fetal liver, FMO1 mRNAs are derived from a splicing event between exon 0 and exon 2. Thus, in this tissue, FMO1 is transcribed exclusively from the P0 promoter. The P0 promoter remains active in adult mouse liver (Figure 1), but is switched off in adult human liver. To identify sequences that may be responsible for the specific silencing of the FMO1 gene in adult human liver, we investigated 5′-flanking sequences upstream of P0 by a combination of sequence analysis and reporter gene assays.
A comparison of sequences upstream of the P0 promoter of FMO1 of human and mouse, and of two other species, rat and rabbit, in which the gene is expressed in adult liver, revealed considerable sequence identity extending for ∼450 bp upstream of the P0 transcriptional start site (Figure 4). In this region, the human sequence is 66, 69 and 70% identical respectively with the sequences of mouse, rat and rabbit, and the mouse sequence is 88 and 60% identical respectively with the sequences of rat and rabbit. Analysis of the proximal region of the P0 promoter, with the phylogenetic footprinting program ConSite , identified five motifs that were conserved in all four species. Of these, three were predicted as binding sites for Sox-5, a protein known to be involved in developmental processes , and two as binding sites for the liver-specific transcription factors HFH1 [HNF3 (hepatocyte nuclear factor 3) homologue 1] and HFH2  (Figure 4). The program Footprinter 2.1  identified 13 evolutionarily conserved footprints in the four species. Analyses of these conserved motifs, both manually and by the transcription factor identification programs MacVector 6.5.3 and AliBaba 2.1 , failed to identify any nucleotide differences that would create or destroy a predicted protein-binding site (results not shown).
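Pairwise identity values of the kind quoted above are computed from aligned sequences. The text does not state how gaps were handled, so the sketch below shows just one common convention, in which alignment columns containing a gap in either sequence are excluded. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical bases over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.lower(), aligned_b.lower()):
        if base_a == gap or base_b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy alignment, for illustration only (not FMO1 sequence).
print(percent_identity("acgtacgtac", "acgtacgtaa"))
```

Other conventions, such as counting gapped columns as mismatches, give lower values for the same alignment, so the convention should be stated when identities are compared across studies.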
Figure 4. The proximal region of the FMO1 P0 promoter of human, mouse, rat and rabbit
Thus the proximal region of the P0 promoter of FMO1 of human, mouse, rat and rabbit is well conserved. HepG2 cells (human hepatocellular carcinoma G2 cells) were transfected with a reporter gene under the control of the proximal region of the P0 promoter of FMO1 of human, pGL[−544 to +27(H)], or mouse, pGL[−431 to +50(M)]. No significant differences were observed in reporter gene activity directed by the proximal P0 of the two species (Figure 5). Therefore the transcriptional activity of the proximal region of the P0 promoter of the FMO1 gene of human is very similar to that of the mouse.
Figure 5. Transcriptional activity of human and mouse FMO1 P0 reporter gene constructs
Although the proximal region of the P0 promoter of human and mouse FMO1 share considerable sequence identity, dot matrix analyses showed that upstream of approx. −470 there is little sequence similarity between the human and mouse genes (Figure 6). This breakdown in sequence similarity is due to the presence of species-specific repetitive DNA elements that have inserted upstream of the proximal region of P0 of both human and mouse. Region −541 to −1835 of the human FMO1 gene contains three L1 [LINE (long-interspersed nuclear element)-1]-like elements, which lie 3′–5′ with respect to the transcriptional start site of the P0 promoter (Figure 6). We have named these L1a, L1b and L1c. The longest of these, L1c (Figure 6), which lies between −1310 and −1835 of P0, was identified by a BLAST search and its characteristics were assigned by alignment to the query sequence human transposon L1.2 (accession no. M80343) . L1c is truncated at its 5′-end and contains an incomplete ORF2 (open reading frame 2) with several stop codons, a 3′-UTR (3′-untranslated region) (220 bp), and an AT-rich region (38 bp). L1a (−707 to −541) and L1b (−1218 to −720) have been identified by the UCSC Genome Browser . They have 47% identity to each other and 41% identity to L1c. Neither L1b nor L1a has an A-rich tail.
Figure 6. Transposable elements upstream of P0 in the FMO1 gene of human, mouse, rat and rabbit
A further two L1 elements, L1d and L1e, were identified upstream of L1c (Figure 6). L1d (−4386 to −3189) is truncated at its 5′-end and comprises a 967-bp ORF2, a 203-bp 3′-UTR and a 26-bp A-rich sequence. It is flanked by an almost perfect (13/14) TSD (target-site duplication). None of the other four L1 elements has clearly identifiable TSDs. L1d and L1c are inverted with respect to one another and are separated by 1874 bp of host FMO1 gene sequence. L1e lies 688 bp upstream of L1d. The sequence separating L1e and L1d elements is host FMO1 sequence. L1e (−11369 to −5080) represents a full-length L1 element. It has a 5′-leader, a 1024-bp ORF1, a 63-bp spacer, a 3767-bp ORF2, a 152-bp 3′-UTR and a 26-bp A-rich region. Both ORF1 and ORF2 have several frameshifts and stop codons. The features of L1d and L1e were identified by comparison with human transposon L1.2 (accession no. M80343). Of the five L1-like elements, L1d and L1c have the greatest sequence identity (85%). These two elements have an identity of approx. 67% to L1e.
Analysis of the 5′-flanking sequence of the mouse Fmo1 P0 promoter revealed a stretch of 35 adenine residues between nt −486 and −450 that is not present in the human flanking sequence (Figure 4). A BLAST query using the sequence upstream of this poly A stretch identified the Fmo1 gene itself and a cDNA clone (AK046657) from a 4-day adipose tissue library, which is identical in sequence with a region extending from −2994 to −450 of the mouse Fmo1 promoter. The poly A stretch originally noted in the mouse Fmo1 gene (see Figures 4 and 6) corresponds to T residues at the 5′-end of the cDNA, indicating that this region can be transcribed, but in the opposite direction to that of the Fmo1 gene. The RIKEN project states that the cDNA has an unclassifiable product. The annotations of the cDNA (http://fantom.gsc.riken.jp/db/link/cloneid.cgi?id=B430304E15) show that it contains a SINE (short-interspersed nuclear element) B2 element. The SINE element lies between −2822 and −2616 of Fmo1 and is flanked by a 6-bp TSD (GGAGAT). An LTR (long terminal repeat)-like sequence (RLTR13D), identified in the cDNA, was found to lie between −2312 and −1593 of Fmo1. The LTR is flanked by a 6-bp TSD (CTAAAG). The presence of the LTR-like sequence in mouse Fmo1 was confirmed by PCR amplification of genomic DNA using primers that flanked the LTR. A product of the expected size (670 bp) was obtained and its identity was verified by DNA sequencing (results not shown). Between −822 and −743 of Fmo1, there is an almost perfect 80-bp direct repeat. Thus, within approx. 2.5 kb, the 5′-flanking sequence of mouse Fmo1 has a number of different retrotransposable elements, which are flanked at the 5′-end by a 33-bp T-rich sequence and at the 3′-end by the 35-bp stretch of A residues. Figure 6(a) shows a dot matrix comparison of the FMO1 P0 sequences of human and mouse. The regions of difference in the matrix correspond to the insertion of species-specific retrotransposons.
The effect of the species-specific retrotransposons on transcriptional activity of FMO1 P0 promoters was investigated (Figure 5). Transfection of HepG2 cells with pGL[−3027/+27(H)], a construct that contains the L1a, b and c elements that lie upstream of the human FMO1 P0 promoter, resulted in a 75% reduction in reporter gene expression compared with cells transfected with the proximal P0 construct pGL[−544/+27(H)], which contains no L1 elements. In contrast, reporter gene activity was 3-fold higher when cells were transfected with the mouse Fmo1 construct pGL[−2955/+50(M)] compared with cells transfected with pGL[−431/+50(M)] (the mouse P0 proximal promoter construct). We then tested the transcriptional activity of a chimaeric construct, pGL[−2955(M)/−544(H)], in which the human P0 proximal promoter (−544 to +27) was placed under the control of the mouse upstream retrotransposon sequence (−2955 to −506). Reporter gene activity from this construct was 8-fold higher than that from a construct, pGL[−3027/+27(H)], in which the human P0 proximal promoter was controlled by the human L1-enriched sequence upstream of P0. The results of these reporter gene assays show that mouse upstream 5′-flanking sequences enhance transcription from P0, whereas human upstream 5′-flanking sequences act to repress transcription from P0.
To determine whether other species that express the FMO1 gene in adult liver contain retrotransposon elements similar to those found in the mouse, we analysed the P0 5′-flanking sequences of the FMO1 genes of rat and rabbit. Dot matrix (Figure 6b) and VISTA alignments (results not shown) of the rabbit and mouse FMO1 flanking sequences show that there is little identity between the two species in the 2.5-kb region containing the mouse retrotransposon elements. In contrast, there are some similarities between rat and mouse in this region. For instance, the rat gene contains the SINE element found in the mouse Fmo1 gene and has a TG-rich sequence in the equivalent position to that of the mouse T-rich sequence (Figure 6e). However, the rat gene lacks the 35-bp poly A tract present in mouse Fmo1 (Figure 4) and it possesses only one copy of the 80-bp direct repeat sequence of the mouse. The rat FMO1 flanking sequence does not contain the LTR sequence found in the mouse (Figure 6e). The absence of the LTR was confirmed by PCR amplification of rat genomic DNA and DNA sequencing (results not shown). The 6-bp TSD that flanks the mouse LTR is present in the corresponding position in the rat gene as a single copy. This suggests that the rat has not lost the LTR, but instead the mouse has gained an additional retrotransposon sequence in its Fmo1 gene after the speciation of rodents. The poly A stretch, which is seen in the mouse, but not rat (Figure 4), may have arisen by a transposition event that split the 5′-end of a transposable element from its 3′-end (3′-transduction) .
The alternative FMO1 promoters P1 and P2
Although transcription from P0 is silenced in adult human liver, the FMO1 gene continues to be expressed in adult kidney. A reason for this is the use in kidney, but not in liver, of an alternative promoter, P2, located downstream of P0 (Figure 3). We used a bioinformatic approach to explore the P1 and P2 regions of the human and mouse FMO1 genes. The P2 transcriptional start site of human FMO1 lies within a consensus INR (initiator) site, TCACAT (base +1 is indicated in boldface), located 151 bp upstream of the ATG translation initiation codon. An identical INR sequence is present in the mouse gene 153 bp upstream of the ATG codon. In both species, the sequence TTAAC is located approx. 30 bp upstream of the transcriptional start site and may represent a binding site for the TATA-binding protein. The sequence GGGCGG, a potential SP-1-binding site, is present between −54 and −49 of the human gene. A similar sequence, GGGTGG, is present between −57 and −52 of the mouse gene. No downstream promoter element consensus sequence is evident in either species. In the mouse, transcription of the Fmo1 gene in the kidney can start also at exon 1, and in this case P1 is used (see above). The location of P1 can be inferred from analysis of cDNA clones from mouse kidney and human small intestine.
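Locating short motifs such as an INR, a TATA-like element or a GC box relative to a reference position is a simple string search. The sketch below is not the procedure used in this study. It is a minimal illustration that reports the first base of each exact match using a numbering with no zero position, in which the reference base is +1 and the base immediately upstream is −1. The function names and the toy sequence are hypothetical.

```python
def position_relative_to_atg(index: int, atg_index: int) -> int:
    """Convert a 0-based string index to a position where the A of ATG is +1 (no zero)."""
    offset = index - atg_index
    return offset + 1 if offset >= 0 else offset


def motif_positions(sequence: str, motif: str, atg_index: int) -> list:
    """Positions of the first base of every exact (possibly overlapping) motif match."""
    sequence, motif = sequence.lower(), motif.lower()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(position_relative_to_atg(start, atg_index))
        start = sequence.find(motif, start + 1)
    return positions


# Toy sequence for illustration only; it is not FMO1 sequence.
TOY_SEQUENCE = "ccgggcggtttaaccatgaa"
ATG_INDEX = TOY_SEQUENCE.find("atg")

for motif in ("GGGCGG", "TTAAC"):
    print(motif, motif_positions(TOY_SEQUENCE, motif, ATG_INDEX))
```

Exact matching is deliberately strict. Degenerate consensus sequences would need a pattern-based or matrix-based search instead.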
Phylogenetic footprinting was used to map evolutionarily conserved transcription factor-binding motifs within a 3.0-kb region upstream of P1 and P2 of human and mouse FMO1 genes. Eight conserved footprints, identified by ConSite , lie upstream of exon 2 and are indicated in Figure 7. Two footprints are located in the region of P2 in the short intronic sequence between exons 1 and 2 (intron lengths are 238 bp in mouse and 231 bp in human). Five conserved footprints are positioned within a ∼500-bp region located just upstream of exon 1, in the region of P1. The eighth conserved phylogenetic footprint is located further upstream, approx. 1500 and 1300 bp from the ATG, in mouse and human respectively. No conserved potential transcription factor footprints were identified within exon 1 of either the human or mouse gene.
Figure 7. Phylogenetic footprint analyses of the intronic sequences encompassing P1 and P2 promoters of the FMO1 gene of human and mouse
We have identified three promoters, P0, P1 and P2, that are used in the transcription of the FMO1 gene in both human and mouse. Different promoters can be used in different tissues: in adult mouse and fetal human liver transcription is exclusively from the P0 promoter, P1 and P2 are used in mouse kidney and P2 is used in human kidney. In addition to the tissue-specific use of FMO1 promoters, there is a species-specific developmental silencing of the P0 promoter in adult human liver. The continued expression of FMO1 in adult human kidney can be explained by the use of an alternative downstream promoter, P2. Whichever promoter is used, FMO1 protein-coding sequences are derived from the splicing of eight constitutive exons, exons 2–9. The use of alternative promoters results in the inclusion in the mRNA of additional leader sequences derived from the mutually exclusive, regulated, cassette exons 0 and 1 or from intron 1. FMO1 mRNA transcribed from P0 contains exon 0. mRNA from P1 has a leader sequence derived from exon 1 and the 5′-end of exon 2. When P2 is used, the leader contains sequences derived from intron 1 and the 5′-end of exon 2. None of the different leader sequences contains protein-coding information. Thus mRNAs derived from the use of different FMO1 promoters encode identical proteins. Humans display up to 10-fold inter-individual variation in amounts of FMO1 [9,37]. FMO1 is not known to be subject to induction by endogenous or exogenous compounds. Thus the observed differences in FMO1 amounts are more likely to be a consequence of genetic variations that influence the strength of one or more of the promoters from which the gene is transcribed, rather than of environmental factors.
Reporter gene assays revealed that the proximal regions of the P0 promoter of human and mouse FMO1 have very similar strengths. However, inclusion in reporter gene constructs of additional human FMO1 upstream sequences was found to markedly down-regulate the activity of the P0 promoter. In contrast, inclusion of the corresponding upstream region of the mouse Fmo1 gene increased the activity of the human P0 core promoter.
Comparisons of sequences upstream of the proximal P0 promoter of human and mouse FMO1 showed that both have undergone species-specific insertion of DNA elements into this region during evolution. In the mouse, a 2.5-kb region upstream of the core P0 promoter contains a collage of different transposable elements, none of which is present in the human P0 promoter. However, most or all of these elements are also absent from the corresponding regions of the FMO1 gene of organisms such as rat and rabbit, in which the FMO1 gene is expressed in adult liver. Thus, although such elements may have a positive effect on gene transcription, their absence from the human P0 promoter cannot account for the negative effect of upstream sequences on the activity of the human P0 promoter. Instead, the repressive effect of upstream sequences on the P0 promoter is most likely due to the unique presence in the human FMO1 gene of L1 elements.
In the human, the bulk of the 11-kb region upstream of the P0 promoter is composed of L1 elements (see Figure 6). Inclusion in a reporter construct of the three most proximal of these, L1a, b and c, down-regulates expression from P0. One explanation for the effect of the L1 elements on the activity of the human P0 promoter is that their insertion during evolution may have led to the separation from the core promoter of a regulatory element that is essential for transcription from P0 in adult human liver, but not in fetal liver. Alternatively, the L1 elements may have a more direct effect on activity of the P0 promoter. For instance, L1 elements can be heavily methylated, a mechanism that is thought to protect our genome from spurious transcription of these sequences. As methylation spreads, it leads to heterochromatin formation and transcriptional repression. Thus, although methylation is unlikely to explain the action of these elements in a reporter gene assay, silencing of the FMO1 gene in adult human liver may be the result of methylation of the battery of L1 elements that lie upstream of the liver-specific promoter P0. The P1 and P2 promoters, which are active in non-hepatic tissues, may be sufficiently removed (approx. 9 kb) from the L1 elements to be unaffected by the methylation.
Although the P0 promoter of the FMO1 gene is inactive in adult human liver, transcription of the gene in fetal liver occurs from this promoter (see Figure 3). This may be due to the presence in the P0 promoter, proximal to the L1 elements, of binding sites for the developmentally regulated transcription factor Sox-5 (see Figure 4). Large-scale analysis of the human transcriptome (HG-U95A) (http://www.ncbi.nlm.nih.gov/projects; gene expression omnibus GDS181) reveals that Sox-5 mRNA is expressed in fetal, but not adult, human liver.
In addition, methylation of L1 elements is an epigenetic modification, which would occur during development. Thus, at the fetal stage, the L1 elements may be insufficiently methylated to inhibit transcription from the P0 promoter, but later during development, methylation may have progressed sufficiently to switch off transcription from the promoter. The continued expression of the FMO1 gene in human kidney after birth can be explained by the use of the downstream promoter P2. The insertion of retrotransposons into promoter regions of genes may be one of the factors that drive tissue-specific switches in promoter use.
The region upstream of the P0 promoter of the FMO1 gene appears to be a hotspot for retrotransposition, given the independent insertion of various transposable elements into the same region of DNA in three species, human, mouse and rat (Figure 6e). The collage of repetitive elements accumulated upstream of P0 by the mouse Fmo1 gene increases transcription and the gene continues to be expressed in the liver after birth. In contrast, in the case of the human, the presence of inserted L1-like elements has a deleterious effect on the expression of the FMO1 gene and appears to be a major factor contributing to the silencing of the gene in adult human liver.
This work was funded by a grant to E. A. S. and I. R. P. from the Wellcome Trust, U.K. (053590). P. C. and M. S.-W. were the recipients of Ph.D. studentships from the MRC, U.K., and Bart's and the Royal London School of Medicine and Dentistry respectively.
- HepG2 cells: human hepatocellular carcinoma G2 cells
- HNF: hepatocyte nuclear factor
- LINE: long-interspersed nuclear element
- LTR: long terminal repeat
- ORF: open reading frame
- SINE: short-interspersed nuclear element
- TAP: tobacco acid pyrophosphatase
- TAP-RLPCR: TAP reverse ligation-mediated PCR