

Plant & Fungi Genetic Databases



1000 plants (oneKP or 1KP)

The 1000 plants (oneKP or 1KP) initiative is an international multi-disciplinary consortium that has generated large-scale gene sequencing data for over 1000 species of plants. Major supporters include Alberta Ministry of Innovation and Advanced Education, Musea Ventures (Somekh Family Foundation), Beijing Genomics Institute in Shenzhen (BGI-Shenzhen), China National GeneBank (CNGB), iPlant Tree-of-Life (iPToL) Grand Challenge, Compute Canada (Westgrid), and Alberta Innovates Technology Futures (AITF-iCORE Strategic Chair). The sample selection was originally based on a series of overlapping sub-projects with scientific objectives that could be addressed by sequencing multiple plant species.

Source: University of Alberta


AspGD is an organized collection of genetic and molecular biological information about the filamentous fungi of the genus Aspergillus. Among its many species, the genus contains an excellent model organism (A. nidulans, or its teleomorph Emericella nidulans), an important pathogen of the immunocompromised (A. fumigatus), an agriculturally important toxin producer (A. flavus), and two species used in industrial processes (A. niger and A. oryzae).

AspGD contains information about genes and proteins of multiple Aspergillus species; descriptions and classifications of their biological roles, molecular functions, and subcellular localizations; gene, protein, and chromosome sequence information; tools for analysis and comparison of sequences; and links to literature information; as well as a multispecies comparative genomics browser tool (Sybil) for exploration of orthology and synteny across multiple sequenced Aspergillus species.

Source: AspGD


AthaMap provides a genome-wide map of potential transcription factor and small RNA binding sites in Arabidopsis thaliana.


Cacao Genome Project

The release of the cacao genome sequence will provide researchers with access to the latest genomic tools, enabling more efficient research and accelerating the breeding process, thereby expediting the release of superior cacao cultivars. The sequenced genotype, Matina 1-6, is representative of the genetic background most commonly found in the cacao-producing countries, enabling results to be applied immediately and broadly to current commercial cultivars. Matina 1-6 is highly homozygous, which greatly reduces the complexity of the sequence assembly process. While the sequence provided is a preliminary release, it already covers 92% of the genome, with approximately 35,000 genes. We will continue to refine the assembly and annotation, working toward a complete finished sequence. Updates will be made available via this website.


Candida Genome Database

This is the home of the Candida Genome Database, a resource for genomic sequence data and gene and protein information for Candida albicans and related species. CGD is based on the Saccharomyces Genome Database and is funded by the National Institute of Dental & Craniofacial Research at the US National Institutes of Health.

Source: Leland Stanford Junior University

Chlamydomonas Resource Center

The Resource Center is funded by the National Science Foundation and located in the Department of Plant Biology at the University of Minnesota. Our mission is to maintain and distribute materials for Chlamydomonas research and to provide information to researchers.

Chlamydomonas is a haploid unicellular eukaryote; each cell contains a chloroplast similar to those of plants and swims with two flagella (cilia) similar to those found in numerous other eukaryotic groups including mammals. In 2007, the haploid nuclear genome was sequenced and found to encode approximately 15,000 genes. The mitochondrial and chloroplast genomes also have been sequenced.

Source: University of Minnesota

Coffee Genome Hub

The Coffee Genome Hub is an integrated web-based database providing centralized access to coffee community genomics, genetics and breeding data and analysis tools to facilitate basic, translational and applied research in coffee.

The available data comprise the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics (ESTs, RNA-Seq), genetic markers and genetic maps. The hub also provides tools for easy querying, visualizing and downloading of research data.

Source: Coffee

Conifer Genome Network (CGN)

The Conifer Genome Network (CGN) is a virtual nexus for researchers working in conifer genomics. The goal of the CGN is to facilitate information exchange among researchers throughout the world and to serve as a forum for advancing conifer genome sciences. The CGN web site is maintained by the Dendrome Project at the University of California, Davis. We encourage all researchers to join the CGN and to alert the webmaster to conifer genome websites for posting and to important developments in the field.

Source: Conifer Genome Network


CottonGen is a new cotton community genomics, genetics and breeding database being developed to enable basic, translational and applied research in cotton. It is being built using the open-source Tripal database infrastructure.

CottonGen will initially consolidate the data from CottonDB and the Cotton Marker Database, which include sequences, genetic and physical maps, genotypic and phenotypic markers and polymorphisms, QTLs, pathogens, germplasm collections and trait evaluations, pedigrees, and relevant bibliographic citations.

Source: Mainlab at Washington State University


Dendrome is a collection of forest tree genome databases and other forest genetic information resources for the international forest genetics community. Dendrome is part of a larger collaborative effort to construct genome databases for major crop and forest species.



DictyBase is the central resource for Dictyostelid genomics.


FLOR-ID (FLOweRing Interactive Database)

Welcome to FLOR-ID (FLOweRing Interactive Database). These pages contain detailed information about gene networks involved in the flowering-time control of Arabidopsis thaliana.

The database contains information about 306 genes, published in 1646 articles, authored by 4606 scientists (March 2016).


Genome Database for Rosaceae (GDR)

Initiated in 2003, the Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database providing centralized access to Rosaceae genomics, genetics and breeding data and analysis tools to facilitate basic, translational and applied Rosaceae research.

GDR is supported by grants from the NSF Plant Genome Program (2003-2008), USDA NIFA Specialty Crop Research Program (2009-2019), USDA NIFA National Research Support Project 10 (2014-2019), and the Washington Tree Fruit Research Commission (2008-2016), as well as by Clemson University, University of Florida and Washington State University.



Gramene is a curated, open-source, integrated data resource for comparative functional genomics in crops and model plant species. Our goal is to facilitate the study of cross-species comparisons using information generated from projects supported by public funds. Gramene currently hosts annotated whole genomes in over two dozen plant species and partial assemblies for almost a dozen wild rice species in the Ensembl browser, genetic and physical maps with genes, ESTs and QTLs locations, genetic diversity data sets, structure-function analysis of proteins, plant pathways databases (BioCyc and Plant Reactome platforms), and descriptions of phenotypic traits and mutations.



GreenPhylDB is a web resource designed for comparative and functional genomics in plants. The database contains a catalogue of gene families based on gene predictions of genomes, covering a broad taxonomy of green plants.
The results of our automatic clustering are manually annotated and analyzed with a phylogeny-based approach to predict homologous relationships. The database supports evolutionary and functional studies to identify candidate genes affecting agronomic traits in crops.

Source: Bioversity International


LegumeIP is an integrative database and bioinformatics platform for comparative genomics and transcriptomics. Its aims are to facilitate the study of gene function and genome evolution in legumes; to understand mechanisms that are fundamental to legume species, especially the process of nitrogen-fixing endosymbiosis, which will be of great value to healthy, low-input sustainable agriculture by decreasing the use of fertilizers; and ultimately to develop molecular-based breeding tools to improve the yield and quality of crop legumes.

Source: The Samuel Roberts Noble Foundation, Inc.

LIS - Legume Information System

LIS contains agronomically useful information on legume species, including genome sequences, genetic maps, genes, gene families, and mapped traits. We will be maintaining this data in two locations: the "classic LIS" site and the "legumeinfo" site.

The "legumeinfo" site makes use of several open-source projects (Tripal and Chado), facilitating more efficient collaboration with other research groups.



Maize Genetics and Genomics Databases (MaizeGDB) is a community-oriented, long-term, federally funded informatics service to researchers focused on the crop plant and model organism Zea mays.



OrthologID Online identifies orthologous groups for complete genomes compiled in our database (Orthologous Group Search), and classifies user-input query sequences into orthologous groups generated from complete genomes (Query Orthology Classification). It identifies diagnostic characters that define each orthologous group, as well as diagnostic characters responsible for classifying query sequences. The output is presented in phylogenetic tree format.



Oryzabase is a comprehensive rice science database established in 2000 by the rice researchers' committee in Japan. The database was originally intended to gather as much knowledge as possible, ranging from classical rice genetics to recent genomics and from fundamental information to hot topics.

Source: NBRP-Rice


Panzea is an NSF-funded project called "Biology of Rare Alleles in Maize and its Wild Relatives". We are investigating the connection between phenotype (what we see) and genotype (the genes underlying the phenotype) for complex traits in maize and its wild relative, teosinte, and specifically how rare genetic variations contribute to overall plant function.


PASD - Plant Alternative Splicing Database


PGSB Arabidopsis thaliana database

The PGSB Arabidopsis thaliana database provides web access to data of the Arabidopsis Genome Initiative that have been compiled, analysed, annotated and stored at PGSB by the PGSB Arabidopsis group and enhanced by data from many external contributors. TIGR maintains an alternative Arabidopsis genome database.

MAtDB contains all Arabidopsis sequences and annotation produced by the Arabidopsis Genome Initiative, plus the mitochondrial and chloroplast genomes.

Source: Helmholtz Zentrum

Plant microRNA Knowledge Base (PmiRKB)

PmiRKB includes the miRNAs of two model plants, Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa). Four major functional modules, "SNPs", "Pri-miRNAs", "MiR—Tar", and "Self-reg", are provided.

Source: Ming Chen's Lab


The PlantRNA database compiles tRNA gene sequences retrieved from fully annotated plant nuclear, plastidial and mitochondrial genomes. This database is hosted and maintained by the Plant Molecular Biology Institute (IBMP) of Strasbourg.

The PlantRNA database provides, in addition to tRNA gene sequences with their (linear) secondary structure, other important biological information.

Currently, the PlantRNA database contains genome-wide information concerning five flowering plants (Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Medicago truncatula, Brachypodium distachyon), a bryophyte (Physcomitrella patens), a glaucophyte (Cyanophora paradoxa), two green algae (Chlamydomonas reinhardtii, Ostreococcus tauri), a brown alga (Ectocarpus siliculosus) and a pennate diatom (Phaeodactylum tricornutum).



Phytozome is the Plant Comparative Genomics portal of the Department of Energy's Joint Genome Institute. Families of related genes representing the modern descendants of ancestral genes are constructed at key phylogenetic nodes. These families allow easy access to clade-specific orthology/paralogy relationships as well as insights into clade-specific novelties and expansions. As of release v11, Phytozome provides access to fifty-eight sequenced and annotated green plant genomes, fifty-two of which have been clustered into gene families at 15 evolutionarily significant nodes. Each gene has been annotated with PFAM, KOG, KEGG, PANTHER and GO assignments, where possible. Query-based data access is provided by Phytozome's PhytoMine and BioMart instances, while bulk data sets can be accessed via the JGI's Genome Portal (registration required). JBrowse genome browsers are available for all genomes.

Source: The Regents of the University of California


PomBase is a new model organism database that provides organization of and access to scientific data for the fission yeast Schizosaccharomyces pombe. PomBase supports genomic sequence and features, genome-wide datasets and manual literature curation.

PomBase also provides a community hub for researchers, providing genome statistics, a community curation interface, news, events, documentation and mailing lists.


Rice Annotation Project (RAP)

The Rice Annotation Project (RAP) was conceptualized in 2004 upon the completion of the Oryza sativa ssp. japonica cv. Nipponbare genome sequencing by the International Rice Genome Sequencing Project with the aim of providing the scientific community with an accurate and timely annotation of the rice genome sequence.

One of the major objectives of this project is to facilitate a comprehensive analysis of the genome structure and function of rice on the basis of the annotation.

Source: National Institute of Agrobiological Sciences

Rice Expression Profile Database (RiceXPro)

The Rice Expression Profile Database (RiceXPro) is a repository of gene expression profiles derived from microarray analysis of tissues/organs encompassing the entire growth of the rice plant under natural field conditions, rice seedlings treated with various phytohormones, and specific cell types/tissues isolated by laser microdissection (LMD).

This database is part of a project on rice transcriptome analysis using microarray technology aimed at characterizing the expression profile of all predicted genes in rice and providing reference information that can be used in functional genomics.

Source: NIAS National Institute of Agrobiological Sciences, Kannondai

RiceGAAS: Rice Genome Automated Annotation System

RiceGAAS is a rice genome automated annotation system. This system integrates programs for prediction and analysis of protein-coding gene structure.

The integrated software comprises coding region prediction programs (GENSCAN, RiceHMM, FGENESH, MZEF), a splice site prediction program (SplicePredictor), homology search and analysis programs (Blast, HMMER, ProfileScan, MOTIF), a tRNA gene prediction program (tRNAscan-SE), repetitive DNA analysis programs (RepeatMasker, Printrepeats), a signal scan search program (Signal Scan), a protein localization site prediction program (PSORT), and a program for classification and secondary structure prediction of membrane proteins (SOSUI).


Rice Mutant Database (RMD)

Rice Mutant Database (RMD) is developed by the Wuhan group of a joint national program, the National Special Key Program on Rice Functional Genomics of China, and maintained by the National Center of Plant Gene Research (Wuhan) at Huazhong Agricultural University.

RMD currently contains information on approximately 129,000 rice T-DNA insertion (enhancer trap) lines generated by an enhancer trap system (March 2016).

Source: National Center of Plant Gene Research (Wuhan)

Rice SNP-Seek Database

This site provides genotype, phenotype, and variety information for rice (Oryza sativa L.). SNP genotyping data (called against the Nipponbare reference Os-Nipponbare-Reference-IRGSP-1.0) came from the 3,000 Rice Genomes Project. Phenotype and passport data for the 3,000 rice varieties came from the International Rice Information System (IRIS).

We are a part of an ongoing effort by the International Rice Informatics Consortium (IRIC) to centralize information access to rice research data and provide computational tools to facilitate rice improvement via discovery of new gene-trait associations and accelerated breeding.

Source: International Rice Research Institute, Los Banos


RiceVarMap provides comprehensive information on 6,551,358 single nucleotide polymorphisms (SNPs) and 1,214,627 insertions/deletions (INDELs) identified from sequencing data of 1,479 rice accessions. The SNP genotypes of all accessions were imputed and evaluated, resulting in an overall missing data rate of 0.42% and an estimated accuracy greater than 99%.

The SNP/INDEL genotypes of all accessions are available for online queries and downloading. Users can search SNPs/INDELs by identifiers of the SNPs/INDELs, genomic regions, gene identifiers and keywords of gene annotation.

Allele frequencies within various sub-populations and the effects of the variation that may alter the protein sequence of a gene are also listed for each SNP/INDEL.
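As a rough illustration of what per-sub-population allele frequencies and a missing data rate mean, the following minimal Python sketch computes both for a single SNP site. It is not RiceVarMap's own code or API. The function names, the accession and sub-population labels, the "N" missing-call symbol, and the assumption of one allele call per (inbred) accession are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def allele_frequencies(genotypes, subpopulations, missing="N"):
    """Return {subpopulation: {allele: frequency}} for one SNP site.

    genotypes: {accession: allele call}, one call per accession (assumed).
    subpopulations: {accession: subpopulation label} (hypothetical labels).
    Calls equal to `missing` are excluded from the frequency denominator.
    """
    counts = defaultdict(Counter)
    for accession, allele in genotypes.items():
        if allele == missing:
            continue
        counts[subpopulations[accession]][allele] += 1

    frequencies = {}
    for pop, counter in counts.items():
        total = sum(counter.values())
        frequencies[pop] = {a: n / total for a, n in counter.items()}
    return frequencies


def missing_rate(genotypes, missing="N"):
    """Fraction of accessions without a genotype call at this site."""
    n_missing = sum(1 for allele in genotypes.values() if allele == missing)
    return n_missing / len(genotypes)


if __name__ == "__main__":
    # Made-up calls for six hypothetical accessions at one site.
    calls = {"acc1": "A", "acc2": "A", "acc3": "G",
             "acc4": "N", "acc5": "G", "acc6": "G"}
    groups = {"acc1": "indica", "acc2": "indica", "acc3": "indica",
              "acc4": "japonica", "acc5": "japonica", "acc6": "japonica"}
    print(allele_frequencies(calls, groups))
    print(missing_rate(calls))
```

Excluding missing calls from the denominator is one common convention; a database may instead report frequencies over imputed genotypes, in which case no calls would be skipped.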

Source: National Center of Plant Gene Research (Wuhan)

Saccharomyces Genome Database (SGD)

The Saccharomyces Genome Database (SGD) provides comprehensive integrated biological information for the budding yeast Saccharomyces cerevisiae along with search and analysis tools to explore these data, enabling the discovery of functional relationships between sequence and gene products in fungi and higher organisms.

Source: Stanford University

The Arabidopsis Information Resource (TAIR)

The Arabidopsis Information Resource (TAIR) maintains a database of genetic and molecular biology data for the model higher plant Arabidopsis thaliana. Data available from TAIR includes the complete genome sequence along with gene structure, gene product information, gene expression, DNA and seed stocks, genome maps, genetic and physical markers, publications, and information about the Arabidopsis research community. Gene product function data is updated every week from the latest published research literature and community data submissions. TAIR also provides extensive linkouts from our data pages to other Arabidopsis resources.

Source: TAIR

The Banana Genome Hub

The Banana Genome Hub centralises databases of genetic and genomic data for the Musa acuminata crop. The Hub is developed by Cirad and Bioversity International and supported by the South Green Bioinformatics platform. The available data comprise the complete genome sequence along with gene structure, gene product information, metabolism, gene families, transcriptomics (ESTs, RNA-Seq), genetic markers (SSR, DArT, SNPs) and genetic maps.

Source: Cirad (UMR AGAP)

The Plant DNA C-values Database

The Plant DNA C-values Database currently contains data for 8510 plant species. It combines data from the Angiosperm DNA C-values Database (release 8.0, Dec. 2012), the Gymnosperm DNA C-values Database (release 5.0, Dec. 2012), the Pteridophyte DNA C-values Database (release 5.0, Dec. 2012), the Bryophyte DNA C-values Database (release 3.0, Dec. 2010), and the Algae DNA C-values Database (release 1.0, Dec. 2004).

Source: KEW Royal Botanic Gardens



UNITE is an rDNA sequence database designed to provide a stable and reliable platform for sequence-borne identification of ectomycorrhizal asco- and basidiomycetes. It has many of the characteristics of other sequence databases, but one of the things that sets UNITE apart from these is sequence reliability. We aim to include only high-quality sequences of well-identified fungi, hence initially sacrificing quantity for quality.

Source: NordForsk


Yeast snoRNA database

Small Nucleolar RNAs (snoRNAs) from the Yeast Saccharomyces cerevisiae: A comprehensive database of S. cerevisiae H/ACA and C/D box snoRNAs.

