Microbial Genetic Databases



BSRD - Bacterial Small Regulatory RNA Database

The BSRD (Bacterial Small Regulatory RNA Database) hosts sRNAs collected from more than 783 bacterial species and 957 strains.


EcoCyc E. coli Database

EcoCyc is a scientific database for the bacterium Escherichia coli K-12 MG1655. The EcoCyc project performs literature-based curation of the entire genome, and of transcriptional regulation, transporters, and metabolic pathways.

Source: SRI International


EcoGene

EcoGene is an E. coli database that provides genetic information and analysis tools.

Source: University of Miami

EzTaxon database

Bacterial identification used to be a difficult and tedious task requiring special expertise in each group of microorganisms. This has been greatly improved by the introduction of the 16S rRNA gene (rDNA) sequence; the gene is about 1,500 nucleotides long and is present in all bacteria. 16S rDNA is indeed the most widely used marker in biological barcoding of bacteria. One can easily identify an unknown isolate by comparing its 16S rDNA sequence with a database of known species. Although there are millions of 16S rDNA sequences in public databases, only sequences of type strains matter in taxonomic identification. It is therefore important that only taxonomically relevant sequences be included in a database for such a purpose. This is why the EzTaxon database was developed in the first place. The original EzTaxon database (Chun et al., 2007) contained 16S rDNA sequences of the type strains of species with valid names. The EzTaxon-e database (Kim et al., 2012) is an extension of the original EzTaxon database, and was developed with the aim of covering uncultured species that are often found in microbial ecological studies.

Source: The Jongsik Chun Lab. and Chunlab, Inc.
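
The idea of identifying an isolate by comparing its 16S rDNA sequence with a curated set of type-strain sequences can be illustrated with a minimal sketch. This is not how EzTaxon itself works. It assumes a tiny, made-up reference set (the species names and sequences are hypothetical and much shorter than a real 16S gene), and it uses Python's `difflib.SequenceMatcher` as a crude similarity score in place of a proper pairwise alignment.

```python
from difflib import SequenceMatcher

# Hypothetical toy reference set: names and sequences are invented for
# illustration and are far shorter than a real ~1,500-nt 16S rRNA gene.
TYPE_STRAIN_REFERENCES = {
    "Species A (type strain)": "AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGC",
    "Species B (type strain)": "AGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGC",
    "Species C (type strain)": "AGAGTTTGATCCTGGCTCAGAGTGAACGCTGGCGGTAGGC",
}


def similarity(a: str, b: str) -> float:
    """Return a crude similarity score in [0, 1] (not an alignment identity)."""
    return SequenceMatcher(None, a.upper(), b.upper(), autojunk=False).ratio()


def identify(query: str, references: dict[str, str]) -> list[tuple[str, float]]:
    """Rank reference entries by similarity to the query, best first."""
    scores = [(name, similarity(query, seq)) for name, seq in references.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical query that differs from "Species B" only in its last base.
query = "AGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGA"
for name, score in identify(query, TYPE_STRAIN_REFERENCES):
    print(f"{name}: {score:.3f}")
```

Restricting the reference dictionary to type strains mirrors the point made above: the ranking is only as taxonomically meaningful as the sequences it is compared against. A real workflow would use full-length sequences and an alignment-based identity measure.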

HIV databases

The HIV databases contain data on HIV genetic sequences, immunological epitopes, drug resistance-associated mutations, and vaccine trials. The website also gives access to a large number of tools that can be used to analyze these data. This project is funded by the Division of AIDS of the National Institute of Allergy and Infectious Diseases (NIAID), a part of the National Institutes of Health (NIH).

Source: Los Alamos National Security, LLC


iMicrobe

Ecology is quickly changing into a data-intensive science. Powerful new data storage and computing capabilities allow scientists to explore environments through diverse and large-scale datasets ranging from microbial DNA to satellite images of natural ecosystems. At the forefront are computational infrastructures, in which scientists can share data and knowledge, computational resources and analysis tools for ecological research. Through the iMicrobe project, we aim to extend an existing computational infrastructure called iPlant to create a data commons for microbial datasets taken from diverse environments. We will also develop methods for scientists to find, tag and reuse datasets more easily based on standard terms derived from the environmental context of a sample. This project strives to enhance the use of shared microbial datasets and promote large-scale studies of microbial ecology to understand the Earth system.

Source: iMicrobe


MG-RAST

The MG-RAST (Metagenomics RAST) server is an automated analysis platform for metagenomes that provides quantitative insights into microbial populations based on sequence data. The server primarily provides upload, quality control, automated annotation and analysis for prokaryotic metagenomic shotgun samples. MG-RAST is optimized for Firefox.

Source: Metagenomics


Pathogen Portal

Pathogen Portal is a repository linking to the Bioinformatics Resource Centers (BRCs) sponsored by the National Institute of Allergy and Infectious Diseases (NIAID) and maintained by the Virginia Bioinformatics Institute. The BRCs provide web-based resources to the scientific community conducting basic and applied research on organisms considered potential agents of biowarfare or bioterrorism or causing emerging or re-emerging diseases.

Source: Virginia Bioinformatics Institute


PATRIC

PATRIC is the Bacterial Bioinformatics Resource Center, an information system designed to support the biomedical research community’s work on bacterial infectious diseases via integration of vital pathogen information with rich data and analysis tools. PATRIC sharpens and hones the scope of available bacterial phylogenomic data from numerous sources specifically for the bacterial research community, in order to save biologists time and effort when conducting comparative analyses. The freely available PATRIC platform provides an interface for biologists to discover data and information and conduct comprehensive comparative genomics and other analyses in a one-stop shop. PATRIC is an NIH/NIAID-funded project of The University of Chicago with a subcontract to the Biocomplexity Institute of Virginia Tech.

Source: University of Chicago

PhAnToMe database

PhAnToMe (Phage Annotation Tools and Methods) is a platform that we are currently developing for phage genome annotations. PhAnToMe will extend the SEED database to handle the nuances of both phages and prophages, establish a consistent nomenclature for phage genes, and develop a new tool for the identification of prophages. This new resource is expected to provide high-quality annotations to over 1,000 existing phage and prophage genomes and dozens of existing phage metagenomes.

Source: PhAnToMe



RegulonDB

RegulonDB is the primary database on transcriptional regulation in Escherichia coli K-12, containing knowledge manually curated from original scientific publications, complemented with high-throughput datasets and comprehensive computational predictions.

Source: CCG/UNAM


SuperPhy

SuperPhy is an integrated platform for the predictive genomic analysis of Escherichia coli.

Source: SuperPhy



VectorBase

VectorBase is a Bioinformatics Resource Center (BRC) focused on invertebrate vectors of human disease. VectorBase is one of four Bioinformatics Resource Centers funded by NIAID to provide web-based resources to the scientific community conducting basic and applied research on organisms considered potential agents of biowarfare or bioterrorism or causing emerging or re-emerging diseases.

Source: VectorBase



VIRsiRNAdb

VIRsiRNAdb is a curated database of experimentally validated viral siRNA/shRNA targeting diverse genes of 42 important human viruses, including influenza, SARS, and hepatitis viruses.

As of March 2016, the database provides detailed experimental information on 1,358 siRNA/shRNA, including siRNA sequence, virus subtype, target gene, GenBank accession, design algorithm, cell type, test object, test method and efficacy (mostly quantitative efficacies).

Source: Institute of Microbial Technology, Chandigarh

