Any attempt to characterize a bacterial community and its functional genes coding for enzymes of the nitrogen cycle is confronted with the extreme biodiversity of such communities. In the last few years, novel techniques based on PCR amplification of target genes in DNA from environmental samples have been developed for characterizing both cultured and as yet uncultured bacteria. Computer-based assignment tools have now been developed that utilize terminal restriction fragments obtained from digestions with multiple restriction enzymes. Such programs allow the gross characterization of bacterial life in any complex bacterial community with confidence.
The complexity of bacterial communities
The species richness and complexity in environmental samples are enormous. A soil, for example, harbours approx. 10⁴ ribotypes (∼different bacteria) per g of sample [1,2]. Early attempts (e.g. by us) were restricted to the characterization of bacteria which could be isolated from plates after growth on conventional media. Southern hybridizations using DNA probes and dot-blot analyses gave hints that the abundance of bacteria was higher in the vicinity of roots than in the bulk soil. The concentration of denitrifying bacteria seemed to be higher at the root surface than in the root-free soil. A conclusion such as this could immediately be criticized on the grounds that only a small percentage of bacteria can be assessed by such a method and that the overwhelming majority of bacteria cannot be cultured as yet. Estimates for the culturable fraction vary between 0.1 and 15% of the total community, depending on the habitat and investigator [4–6]. Therefore molecular techniques, based on PCR, have been developed in the past few years to characterize target genes in DNA isolated from different habitats and to quantify their occurrence. Methods have recently been published for the isolation of mRNA suitable for analysis of transcript formations by RT (reverse transcriptase)–PCR [7,8]. The present paper briefly evaluates different methods to assess the bacterial biodiversity in habitats based on PCR amplification of genes using extracted DNA.
Choice of enzymes within the nitrogen cycle for the amplification of their encoding genes by PCR
To relate the abundance of any functional gene to the total community, the 16 S rRNA gene is generally chosen as the marker for the concentration of the bacterial community in an environmental sample. Primers targeting different regions have been selected by different investigators, and it is assumed that the 16 S rRNA gene of all bacteria can be amplified with essentially the same efficiency using universal primers. However, bacteria can have different copy numbers of the 16 S rRNA gene (n=1–15), depending on the species [9,10]. Biases in the amplification, cloning and sequencing of DNA have been critically evaluated by Kitts. Thus the characterization of a bacterial community by amplifying, cloning and sequencing the 16 S rRNA gene provides data on the richness but hardly on the evenness of bacteria.
With regard to dinitrogen fixation, nifH encoding nitrogenase reductase is the gene of choice for assessing the diversity of N2-fixing bacteria by molecular techniques. Early investigators already noted that the nifH gene is very much conserved among all organisms . Therefore nifH is taken by almost all investigators for probing of nitrogenase occurrence in bacteria, unless special problems are studied. An example for such a subject is the distribution of V- or Fe-only nitrogenases in organisms, where primers are centred around the vnf/anfG gene, which is a specific feature of these alternative enzymes. For nifH of both alternative and conventional, Mo-containing, nitrogenases, the primers developed by Zehr and co-workers [13,14] but also by others  have been employed widely by different investigators.
Relatively few sequences have been published for genes encoding nitrification enzymes. The currently available databases allowed the development of primers for amoA, encoding one subunit of ammonium mono-oxygenase. Probing with them gives insights into a limited range of ammonium-oxidizing bacteria. The molecular techniques do not yet allow us to screen for heterotrophic nitrifiers which are said to be the major players in the oxidation of the ammonium ion via nitrite to nitrate in aerated soil .
With regard to denitrification, the sequence databases are better. However, any characterization of denitrification is confronted with the fact that each single step, with the exception of N2O reduction, is catalysed by at least two different enzymes. Many bacteria contain a membrane-bound dissimilatory nitrate reductase (encoded by the nar genes), some have additionally a periplasmic enzyme (nap genes) whereas others such as rhizobial strains possess only the periplasmic enzyme. Nitrate ammonifiers such as Escherichia coli and other Enterobacteriaceae contain an ntr-encoded nitrate reductase. Thus general probes that recognize all dissimilatory nitrate reductases are difficult to construct, although all enzymes possess the molybdopterin cofactor in their prosthetic group. Probing with narG has been performed [17,18], but such an approach can hardly reach the whole sub-community of bacteria which reduce nitrate anaerobically for energy generation.
Dissimilatory nitrite reductases contain either cytochrome cd1 or Cu in their prosthetic group, and bacteria possess either one of these two enzymes or none. The copper nitrite reductase is distributed among bacteria of totally unrelated taxonomic affiliations . Primers for probing nirK encoding Cu-containing nitrite reductase have been developed and the distribution of bacteria with this gene has been tested in environmental samples [15,20,21]. Several authors noted that broad-range primers recognizing nirS encoding cytochrome cd1 nitrite reductase in all organisms can only be designed with great difficulty [15,20,21]. However, this problem might be resolved by the development of the new primers recently communicated . Within one single genus, Azospirillum, strains may possess the cytochrome cd1-containing, the Cu-containing or no nir enzyme .
Two enzymes also exist for catalysing the conversion of NO into N2O: the cytochrome bc-type and the quinol-dependent nitric oxide reductase. Since these enzymes have been discovered fairly recently, relatively few sequences of the genes coding for them have been published. Despite this, primers that allow the amplification of a norB segment of both NO reductases have successfully been employed for assessing the distribution of organisms with this gene in pure cultures and environmental samples.
Only one enzyme is known to catalyse the conversion of N2O into N2. Different areas of the nosZ gene encoding nitrous oxide reductase have been employed for developing primers to be used for gene probing in ecological studies. However, probing with nosZ does not comprehensively assess all denitrifying bacteria, since some of them form N2O as the final product of denitrification (see the list of organisms compiled by Zumft ). In some of these organisms, nosZ may be present, but conditions have not yet been met to express the gene. For us, nosZ seems to be the best gene for monitoring the denitrification in bacteria by molecular approaches, unless a combination of two or more genes (nosZ, nirS/nirK and narG, for example) is employed.
Methods to assess the biodiversity of bacteria in environmental samples
To get access to the total bacterial community in a sample, various methods, some of which are based on DNA extraction followed by PCR amplification of the target gene(s), have been employed in the last few years. Some of them will be described below.
DGGE (denaturing gradient gel electrophoresis)
The amplified segments of the target genes are separated by DGGE or temperature gradient gel electrophoresis where the number, precise position and intensity of the bands give an estimate of the relative abundance of the dominant bacteria in the sample [6,25,26]. In general, the microbial community in a habitat is so complex that too many bands or a smear of unresolved bands may be obtained after the electrophoresis. In addition, bacteria which are less abundant, although physiologically active, may not be represented in the DGGE patterns and can, therefore, be missed. A new avenue in the field is the development of group-specific 16 S rRNA gene primers (for α- and γ-proteobacteria , ammonia-oxidizing bacteria  and others ) which allows the separation of group-specific PCR-generated segments with a sufficient resolution in the DGGE band patterns. As with other methods, DGGE only provides an indication but not an absolute measure of the degree of biodiversity in a bacterial community [6,29]. The method permits the isolation of bands obtained from DGGE gels; these can then be cloned and sequenced, thus identifying the bacterium involved.
FISH (fluorescence in situ hybridization) studies
This method has been employed widely. The probe for the 16 S rDNA or any other target gene is labelled with a fluorescent dye. In situ hybridizations with such labelled oligonucleotide probes, in combination with confocal laser microscopy, allow specific bacteria in a sample to be enumerated from the resulting images. Quantitative FISH experiments can also be performed with broad-range probes for groups of organisms in combination with semi-automatic digital image analyses. This technique can be coupled with microautoradiography, which allows the consumption of specific molecules by distinct organisms to be assessed. In activated sludge, for example, Microthrix parvicella is able to take up and store long-chain fatty acids, in contrast with most other bacteria of this habitat. The uptake of such radioactively labelled fatty acids can be followed specifically for M. parvicella by FISH during the course of the development of the complex bacterial community in activated sludge [31,32]. Whereas FISH gives insights into the correlation between the physiological behaviour and the growth state of an organism (or a group of organisms), it hardly provides a survey of the total community structure.
Quantification of gene content by real-time PCR
Real-time PCR is now widely used in microbiology, plant science and other fields to quantify the concentration of a gene in a sample of extracted DNA. The relative content of a gene can be compared adequately between different DNA samples. The technique is fairly expensive, which restricts its range of applications. Absolute quantifications require very careful calibrations and DNA preparations of high purity. In addition, since the PCR product has to be relatively short (<150 bp), unspecific amplicon production has to be taken into account. The technique is currently employed for the narG gene.
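The calibration needed for absolute quantification is commonly done with a standard curve: threshold cycles (Ct) of a dilution series with known copy numbers are regressed against log10(copies), and unknown samples are interpolated on that line. The following is a minimal sketch of this general approach, not a procedure taken from the studies cited here. The function names and all calibration values are hypothetical and chosen only for illustration.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def amplification_efficiency(slope):
    """Efficiency E = 10**(-1/slope) - 1; E = 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0

def copies_from_ct(ct, slope, intercept):
    """Interpolate the gene copy number of an unknown sample from its Ct."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical dilution series (illustrative values, not measured data).
STANDARD_COPIES = [1e3, 1e4, 1e5, 1e6, 1e7]
STANDARD_CT = [30.0, 26.68, 23.36, 20.04, 16.72]

slope, intercept = fit_standard_curve(STANDARD_COPIES, STANDARD_CT)
efficiency = amplification_efficiency(slope)
unknown_copies = copies_from_ct(21.7, slope, intercept)
```

A slope near -3.32 corresponds to an efficiency close to 1; slopes far from this value usually indicate inhibitors in the DNA preparation or unspecific amplification, which is why purity and calibration matter for absolute numbers.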
tRF (terminal restriction fragment) analysis using single restriction enzymes
The advantages and also the artifacts and bias of the analysis of tRF patterns [tRFLP (tRF length polymorphism) method] were extensively discussed by Kitts . In this method, the target gene in DNA extracted from a sample is amplified by PCR using primers of conserved motifs in the gene sequence. Beforehand, one primer is labelled at the 5′-end with a fluorescent dye. The amplified DNA is digested with a restriction enzyme usually recognizing a tetranucleotide sequence. The digests are separated in an automatic DNA sequencer equipped with a fluorescence detector. An automatic fragment analysis program calculates the tRF lengths in bp by comparison with a DNA size standard. The tRF patterns can be used to determine the relative abundance of PCR amplificates in a mixture and to detect shifts in the functional diversity of bacterial communities, e.g. from one nutritional status to the next. However, a tRF cannot be isolated, cloned and sequenced for the identification of the bacterium involved. Several totally unrelated organisms can produce a tRF of the same size when only one restriction enzyme is used, which causes ambiguities in the analysis of tRF profiles. Therefore two new approaches use tRF digests from multiple restriction enzymes and special algorithms, as discussed in the following section.
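The predicted tRF length of a known sequence can be derived in silico: it is the distance from the labelled 5′-end of the amplicon to the first cut made by the restriction enzyme. The sketch below illustrates this under simplifying assumptions (a single labelled forward primer, complete digestion, no correction for dye-dependent migration). The amplicon is an invented toy sequence, and the four enzymes are common tetranucleotide cutters used as examples only; the text above does not name the enzymes employed.

```python
# Recognition site and cut offset (bases from the start of the site to the cut)
# for a few common four-base cutters. Example enzymes only.
ENZYMES = {
    "HaeIII": ("GGCC", 2),  # GG^CC
    "MspI": ("CCGG", 1),    # C^CGG
    "AluI": ("AGCT", 2),    # AG^CT
    "RsaI": ("GTAC", 2),    # GT^AC
}

def terminal_fragment_length(amplicon, recognition_site, cut_offset):
    """Length in bp of the 5'-labelled terminal fragment.

    The amplicon is assumed to start with the first base of the labelled
    primer. If the enzyme does not cut, the whole amplicon is the fragment.
    """
    position = amplicon.find(recognition_site)
    if position == -1:
        return len(amplicon)
    return position + cut_offset

def trf_profile(amplicon, enzymes):
    """Predicted tRF length for every enzyme, keyed by enzyme name."""
    return {
        name: terminal_fragment_length(amplicon, site, offset)
        for name, (site, offset) in enzymes.items()
    }

# Invented 30 bp toy amplicon (real 16 S rRNA amplicons are far longer).
TOY_AMPLICON = "ATGCATGCAAGGCCTTTACCGGATAGCTTA"
profile = trf_profile(TOY_AMPLICON, ENZYMES)
```

Because two unrelated sequences can share the position of the first site for one enzyme, a single value in such a profile is ambiguous, whereas the combination of values across several enzymes is far more discriminating.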
New tools to assess the biodiversity in a bacterial community from tRFs using multiple restriction enzymes
Two different assignment tools using digests of a series of restriction enzymes have been published recently [36,37]. The PAT tool by Kent et al. (http://www.trflp.limnology.wisc.edu/index.jsp) utilizes the MICA website (http://mica.ibest.uidaho.edu) to generate a database of tRFs calculated for selected restriction enzymes and all bacterial sequences deposited. In their algorithm (Figure 1A), each tRF of an environmental sample is assigned a collection of species from the database. The program first compiles a list of those organisms in the MICA database whose tRFs have the same sizes as those generated by digestion of the environmental sample. This list of species is then screened for the presence of a matching tRF both in the environmental sample and in the in silico digest with a second restriction enzyme. Organisms without a match are eliminated, and the remaining candidates are then examined in the same way for a tRF obtained with a third enzyme. The consecutive generation of files from n restriction enzymes by their algorithm finally results in a list of species retrieved from an environmental sample (Figure 1A). The effectiveness of the PAT tool was demonstrated with the aquatic microbial communities collected from a humic lake, using the 8F primer labelled with the dye FAM (6-carboxyfluorescein).
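The consecutive screening described for PAT can be pictured as a chain of filters, one per restriction enzyme. The sketch below is an illustration of that idea only and is not the PAT code: the function names, the toy database, the sample peaks and the fixed tolerance of ±1 bp are all assumptions made for the example.

```python
def has_match(predicted_length, observed_lengths, tolerance_bp):
    """True if any observed tRF lies within tolerance_bp of the prediction."""
    return any(abs(predicted_length - o) <= tolerance_bp for o in observed_lengths)

def sequential_assignment(database, sample, enzyme_order, tolerance_bp=1):
    """Keep only organisms whose predicted tRF is found for every enzyme in turn.

    database: organism -> {enzyme: predicted tRF length}
    sample:   enzyme -> list of observed tRF lengths
    """
    candidates = set(database)
    for enzyme in enzyme_order:
        observed = sample.get(enzyme, [])
        candidates = {
            organism
            for organism in candidates
            if has_match(database[organism][enzyme], observed, tolerance_bp)
        }
    return sorted(candidates)

# Hypothetical database entries and sample peaks (lengths in bp).
TOY_DATABASE = {
    "organism_A": {"HaeIII": 120, "MspI": 310, "RsaI": 455},
    "organism_B": {"HaeIII": 120, "MspI": 88, "RsaI": 455},
    "organism_C": {"HaeIII": 205, "MspI": 310, "RsaI": 77},
    "organism_D": {"HaeIII": 121, "MspI": 309, "RsaI": 230},
}
TOY_SAMPLE = {"HaeIII": [120, 205], "MspI": [310], "RsaI": [455, 77]}

assigned = sequential_assignment(TOY_DATABASE, TOY_SAMPLE, ["HaeIII", "MspI", "RsaI"])
```

In this toy run all four organisms survive the first enzyme, organism_B is removed by the second and organism_D by the third. A consequence of the sequential design is that a single missing or mis-sized peak eliminates an organism outright.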
Figure 1. The two different algorithms used to assess bacterial communities in environmental samples
In our TReFID program , tRFs obtained from different restriction enzymes are utilized in parallel (Figure 1B). A 16 S rRNA gene database with the tRF lengths of originally 17327  and currently 22145 bacterial sequences using 13 restriction enzymes was constructed for TReFID, which is accessible at http://www.trefid.net. In this database, each bacterium is represented by a unique pattern consisting of tRFs with distinct bp lengths resulting from restriction with 13 enzymes. This can be visualized as a polygon (Figure 2). The computer program assesses the occurrence of a polygon representative for a bacterium in the tRF mixture obtained by restriction of the DNA from an environmental sample.
Figure 2. A schematic representation of how an organism in an environmental sample is retrieved from the TReFID databank
In TReFID, all 22145 sequences have been examined for primer binding of 63F, and low-quality DNA sequence deposits were discarded. Difficulties were encountered in that the tRF size obtained from a sample does not always match that predicted from the databanks. For an organism, the predicted tRF length can be up to 4 bp longer than the observed size (see Table 4 in ). In addition, the fragment size determination is somewhat uncertain in electrophoresis. Therefore TReFID also utilizes those tRFs in a DNA sample which deviate from the length deposited in the databanks by up to 1.5% (for details see ). The match value was arbitrarily set to 2/3, which means that at least two-thirds (≥66%) of the tRFs of a polygon need to have a counterpart among the tRFs retrieved from an environmental sample for the organism to be reported. Controls indicated that (i) tRFs calculated for a clone library of isolates from a soil and deposited in TReFID also occur in the tRFs from DNA extracted from the same soil, (ii) tRF profiles obtained experimentally with DNA from a soil match those predicted from the sequence information in the TReFID result list and (iii) false positives, i.e. hits whose tRF polygons resemble those of deposits although the organisms are taxonomically unrelated, amount to less than 3%.
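The parallel evaluation, with the 1.5% size tolerance and the match value of 2/3 given above, can be sketched as follows. This is a simplified illustration and not the TReFID implementation: the names and toy data are hypothetical, only three enzymes are used instead of 13, and the tolerance is assumed here to be taken relative to the predicted length.

```python
from fractions import Fraction

def has_counterpart(predicted, observed_lengths, rel_tol):
    """True if an observed tRF deviates from the prediction by at most rel_tol."""
    return any(abs(o - predicted) <= rel_tol * predicted for o in observed_lengths)

def polygon_score(pattern, sample, rel_tol):
    """Fraction of an organism's predicted tRFs that are found in the sample."""
    hits = sum(
        1
        for enzyme, predicted in pattern.items()
        if has_counterpart(predicted, sample.get(enzyme, []), rel_tol)
    )
    return Fraction(hits, len(pattern))

def parallel_assignment(database, sample, match_value=Fraction(2, 3), rel_tol=0.015):
    """Report every organism whose polygon score reaches the match value."""
    scores = {org: polygon_score(p, sample, rel_tol) for org, p in database.items()}
    return {org: s for org, s in scores.items() if s >= match_value}

# Hypothetical polygons (enzyme -> predicted tRF length in bp) and sample peaks.
TOY_POLYGONS = {
    "organism_X": {"HaeIII": 200, "MspI": 400, "RsaI": 100},
    "organism_Y": {"HaeIII": 150, "MspI": 415, "RsaI": 300},
    "organism_Z": {"HaeIII": 640, "MspI": 90, "RsaI": 500},
}
TOY_PEAKS = {"HaeIII": [202.0, 640.0], "MspI": [410.0], "RsaI": [100.0, 297.0]}

reported = parallel_assignment(TOY_POLYGONS, TOY_PEAKS)
```

Here organism_X and organism_Y each find two of their three predicted tRFs and are reported, whereas organism_Z finds only one and is rejected. Unlike a sequential filter, this scheme tolerates an occasional missing peak, at the price of some false positives.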
In contrast with the PAT tool, TReFID utilizes the fluorescently labelled 63F primer. The choice of the 63F primer allows us to utilize 55% of the 40000 GenBank® sequences examined by us. In contrast, fewer than 17% can be used in the case of the 8F primer, since for the remainder no sequence information upstream of 63F is available in GenBank®. The TReFID program was extended to nifH and nosZ. The database is sufficient to provide reliable data for nifH, whereas it is still meagre for nosZ, as exemplified for the DNA retrieved from a forest soil.
Our data show that the TReFID program can be used adequately to assess the biodiversity of bacteria along the salt gradient in a potash marsh (Figure 3) (S. Eilmus, C. Rösch and H. Bothe, unpublished work). In such a habitat, the salt load dictates the distribution of plants, which typically form belts. Salicornia europaea occurs at sites with the highest salt load, Puccinellia distans lives in between, whereas Aster tripolium preferentially thrives at lower salt concentrations in the soil. Out of the 1639 bacteria retrieved from the DNA from soil samples taken at the roots of these plants and identified by the use of the TReFID program, a high percentage occurred at the roots of at least two of the plants and a significant portion was found in the soil around all three halophytes (Figure 3). Thus the distribution of bacteria in such a salt marsh is apparently not so strictly dependent on the salt gradient in the soil in contrast with the situation with plants.
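A comparison of this kind amounts to counting, for each organism in the result lists, the combination of sites at which it was retrieved. The sketch below shows one way to tabulate such overlaps; the function names and taxon identifiers are invented placeholders and do not reproduce the data of Figure 3.

```python
from collections import Counter

def overlap_counts(taxa_by_site):
    """Count taxa by the exact combination of sites at which they occur."""
    counts = Counter()
    all_taxa = set().union(*taxa_by_site.values())
    for taxon in all_taxa:
        sites = frozenset(s for s, taxa in taxa_by_site.items() if taxon in taxa)
        counts[sites] += 1
    return counts

def shared_by_at_least(counts, k):
    """Number of taxa found at k or more sites."""
    return sum(n for sites, n in counts.items() if len(sites) >= k)

# Placeholder result lists for the root zones of the three halophytes.
TOY_RESULTS = {
    "Salicornia europaea": {"t1", "t2", "t3", "t4"},
    "Puccinellia distans": {"t2", "t3", "t4", "t5"},
    "Aster tripolium": {"t3", "t4", "t6"},
}

counts = overlap_counts(TOY_RESULTS)
total = sum(counts.values())
shared_two_or_more = shared_by_at_least(counts, 2)
found_everywhere = counts[frozenset(TOY_RESULTS)]
```

The ratio of shared_two_or_more to total gives the kind of percentage referred to above, and the count for the full set of sites corresponds to the central region of a three-set diagram.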
Figure 3. Bacterial community analysis of soils taken from the vicinity of the roots of three different plant species in a potash salt marsh
The computer-based tools described by Kent et al. and by us could complement each other in analysing the species composition of a community. The tools now allow us to assess all bacteria, regardless of whether they can be cultured as yet or not. At present, the certainty of the conclusions is somewhat comparable with that of predictions before parliamentary elections. The situation, particularly for functional genes such as nosZ, will improve with the ever expanding sequence information in the databanks. However, tRF analysis is strictly dependent on the sequences deposited in the databanks, since fluorescently labelled tRFs cannot be cloned and sequenced. The construction of clone libraries is therefore mandatory to improve the quality of the databanks. Such libraries of the bacterial life in unusual habitats such as the potash mine (S. Eilmus, C. Rösch and H. Bothe, unpublished work) reveal that a large number of as yet undiscovered bacteria, with no sequence identities with any other bacterium, thrive there. Thus computer programs such as PAT and TReFID can now grossly characterize the bacterial life in an environment. However, these tools do not provide information on the major and unusual players in the bacterial community and also give no insight into their physiological activities.
The 11th Nitrogen Cycle Meeting 2005: Independent Meeting held at Estación Experimental del Zaidín, Granada, Spain, 15–17 September 2005. Organized and Edited by E.J. Bedmar (Granada, Spain), M.J. Delgado (Granada, Spain) and C. Moreno-Vivián (Córdoba, Spain).