Identification and reconstruction of transcriptional regulons in bacteria using computational comparative genomics is coming of age. Over the past decade, a large number of manually curated, high-quality inferences of transcriptional regulatory interactions have accumulated for diverse taxonomic groups of bacteria. These data provide a good foundation for understanding the molecular mechanisms of transcriptional regulation, identifying regulatory circuits, and mapping the interconnections among circuits within the cell.
RegPrecise is a database for capturing, visualizing, and analyzing transcription factor regulons reconstructed by the comparative genomics approach.
The primary object of the database is a single regulon in a particular genome, which is described by the identified transcription factor (TF), its DNA binding site model (a profile), as well as the set of regulated genes, operons and associated transcription factor binding sites (TFBSs).
The key idea of the comparative genomics approach is that the regulons of a particular TF should be analyzed in several genomes simultaneously. RegPrecise therefore provides a higher-level object, the regulog, which represents a collection of regulons of the same TF inferred in a set of closely related genomes.
The third, highest level in the hierarchical organization of the data in RegPrecise is a collection. Currently regulogs are organized into collections of three types:
- By taxonomic group: the complete set of regulogs analyzed within a particular taxonomic group of closely related genomes.
- By transcription factor: the set of regulogs governed by the same transcription factor across various taxonomic groups.
- By functional classification: set of various regulogs that control genes from a particular metabolic pathway or biological process.
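The three-level hierarchy described above (regulon, regulog, collection) can be sketched as a small set of Python data classes. This is only an illustration of the relationships stated in the text, not the actual RegPrecise schema: all class names, field names, and the `position` and `score` attributes are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Site:
    """A transcription factor binding site (TFBS). Field names are hypothetical."""
    sequence: str
    position: int   # assumed: offset relative to the start of the downstream gene
    score: float    # assumed: score of the site against the TF binding profile


@dataclass
class Operon:
    """A regulated operon together with its associated binding sites."""
    genes: List[str]
    sites: List[Site] = field(default_factory=list)


@dataclass
class Regulon:
    """Level 1: one TF and its regulated operons in one particular genome."""
    tf: str
    genome: str
    operons: List[Operon] = field(default_factory=list)

    def regulated_genes(self) -> List[str]:
        # Flatten the operons into a single list of regulated genes.
        return [gene for operon in self.operons for gene in operon.genes]


@dataclass
class Regulog:
    """Level 2: regulons of the same TF in a set of closely related genomes."""
    tf: str
    taxonomic_group: str
    regulons: List[Regulon] = field(default_factory=list)

    def add(self, regulon: Regulon) -> None:
        # A regulog only groups regulons that belong to the same TF.
        if regulon.tf != self.tf:
            raise ValueError(f"expected a {self.tf} regulon, got {regulon.tf}")
        self.regulons.append(regulon)


@dataclass
class Collection:
    """Level 3: regulogs grouped by one of the three collection types."""
    kind: str   # "taxonomic group", "transcription factor", or "functional classification"
    name: str
    regulogs: List[Regulog] = field(default_factory=list)
```

Checking the TF in `Regulog.add` is one way to enforce the rule that a regulog contains regulons of a single transcription factor. A real implementation would likely also check that the genome belongs to the regulog's taxonomic group.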
A thorough comparison of the reconstructed regulons in Shewanella species with known regulons in the model bacterium Escherichia coli revealed striking differences in the overall regulatory strategy of these two lineages of gamma-proteobacteria. Multiple trends in the diversification and adaptive evolution of transcriptional regulatory networks (TRNs) between lineages were detected, including regulon "shrinking", "expansion", "mergers", and "split-ups", as well as multiple cases of nonorthologous regulators controlling equivalent pathways and orthologous regulators controlling distinct pathways.
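As a rough illustration of how regulon "shrinking" and "expansion" might be detected, the sketch below compares the target sets of one regulator in two lineages. It is a deliberate simplification and not the method used in the cited studies: it assumes that genes have already been mapped to shared ortholog group identifiers, the function name `compare_regulons` is hypothetical, and "mergers" and "split-ups" are not covered because they involve more than one regulator.

```python
from typing import Dict, Iterable


def compare_regulons(reference: Iterable[str], other: Iterable[str]) -> Dict[str, object]:
    """Compare the targets of one TF in a reference lineage and another lineage.

    Both arguments are collections of ortholog group identifiers (an assumption
    of this sketch), so that targets can be compared directly across genomes.
    """
    ref_set, other_set = set(reference), set(other)
    conserved = ref_set & other_set   # targets regulated in both lineages
    lost = ref_set - other_set        # targets regulated only in the reference
    gained = other_set - ref_set      # targets regulated only in the other lineage

    if not lost and not gained:
        trend = "conserved"
    elif not gained:
        trend = "shrinking"           # the other lineage kept only a subset
    elif not lost:
        trend = "expansion"           # the other lineage added new targets
    else:
        trend = "mixed"               # targets were both lost and gained

    return {"conserved": conserved, "lost": lost, "gained": gained, "trend": trend}
```

For example, calling `compare_regulons(["og1", "og2", "og3"], ["og1"])` with placeholder identifiers reports `og2` and `og3` as lost and classifies the change as "shrinking".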
Bacterial transcriptional regulation is highly flexible and mostly not conserved between taxonomic groups. Investigation of regulation by computational genomic approaches therefore needs careful curation of the data. The RegPrecise database is a repository of manually curated, taxonomy-specific reconstructions of bacterial regulons.
The fatty acid degradation (fad) genes are controlled by two nonorthologous regulators, FadR in E. coli and PsrA in Shewanella. The novel PsrA regulon includes not only the fad genes in Shewanella but also other genes of central metabolism. Although a FadR ortholog is present in Shewanella, its regulon is significantly reduced (2 targets). The fatty acid biosynthesis regulon FabR has both conserved and variable parts between E. coli and Shewanella. [Kazakov et al., 2008]
Many aspects of metabolic regulation in Shewanella species differ substantially from the regulatory models known in E. coli. Among the most notable are the differences in the regulons for the central carbohydrate pathways. In E. coli, central carbon metabolism is controlled by the catabolic regulators FruR and Crp, whereas Shewanella species use two other transcription factors, HexR and PdhR, for this control. NagC and NagR are nonorthologous regulators implicated in the control of N-acetylglucosamine utilization in E. coli and Shewanella, respectively. [Fredrickson et al., 2008]