A clone encoding a plastid isoenzyme of aspartate aminotransferase (AAT5) was isolated from an Arabidopsis genomic library and its complete sequence determined. The gene for AAT5 (asp5) contains an open reading frame of 2447 bp comprising 11 exons separated by introns ranging in length from 74 to 207 bp. The upstream regulatory region contains a putative TATA box and multiple copies of two sequence motifs, CTCTT and AAAGAT, previously associated with nodule-specific gene activity in legumes. The deduced primary amino acid sequence of the protein product of asp5 was used to generate a three-dimensional structure of the AAT5 protein, using the computer program Sybyl: Biopolymer Composer and known AAT structures in the protein databases. Both the mature protein and its precursor protein, which contains a putative N-terminal transit peptide, were modelled. The resulting structure of the precursor protein indicated that the transit peptide might also inhibit dimerization of the protein until after its translocation across the chloroplast membrane. The derived structure of the mature protein was then analysed in terms of its component elements of secondary structure, and the positions on the polypeptide backbone corresponding to intron insertion sites were determined. The introns tend to map to regions between structural subdomains of the protein and to sites on the surface of the molecule. The asp5 gene in Arabidopsis is thus consistent with Gilbert's exon-shuffling theory of gene evolution [Gilbert (1985) Science 228, 823–824]. A high degree of conservation of intron insertion sites between AAT genes from different plants and animals is observed, particularly within the part of the gene encoding a large β-sheet structure that forms the structural and functional core of the protein. This β-sheet structure is thus believed to comprise an ancient and very highly conserved moiety of the molecule.

Author notes

The nucleotide sequence reported will appear in the DDBJ, EMBL and GenBank Nucleotide Sequence Databases under the accession number X91865.