

List of RNA Sequence Databases


This website presents a list of different RNA databases including databases for miRNAs, noncoding RNAs, piRNAs, ribosomal RNAs, snoRNAs, tmRNAs, and tRNAs.


The roles of AU-rich elements (AREs) in post-transcriptional gene expression regulation are now well established, with significant relevance to certain disease states resulting from aberrations in these pathways. Recent experimental results strongly suggest the possibility of active ARE-mediated regulation of pre-mRNA processing. We expanded ARED Organism to search for AREs in the introns of human genes and found them in 9,114 additional genes, raising the full repertoire of the ARE-regulome to at least 50% of human protein coding genes. Here we report this significant new expansion as well as a number of additional enhancements to ARED Organism.

Source: King Faisal Specialist Hospital


Provides a total of 212,950 circRNAs, including 53,687 novel RNAs. To reduce false positives, we defined high-confidence circRNAs by selecting circular junction sites with at least 3 supporting experiments. This resulted in 34,000 high-confidence circRNAs.

Collects 464 RNA-seq samples (without PolyA selection) covering 26 different human tissues, and provides an interface to show circRNA expression profiles across these human tissues.

Predicts circRNA-miRNA interactions and integrates with miRNA-target interactions to generate circRNA-miRNA-gene regulatory networks.

Provides genomic annotation of circRNAs with a genome browser integrated into the web interface.
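The high-confidence filter described above (junction sites with at least 3 supporting experiments) can be illustrated with a minimal sketch. The record layout, field names and example identifiers below are hypothetical and are not taken from the database itself.

```python
from typing import NamedTuple


class CircRNA(NamedTuple):
    # Hypothetical record layout, for illustration only.
    junction_id: str             # identifier of a circular junction site
    supporting_experiments: int  # number of experiments reporting the junction


def high_confidence(records, min_support=3):
    """Keep records whose junction is supported by at least min_support experiments."""
    return [r for r in records if r.supporting_experiments >= min_support]


# Made-up example records.
records = [CircRNA("circ_A", 1), CircRNA("circ_B", 3), CircRNA("circ_C", 7)]
kept = high_confidence(records)
print([r.junction_id for r in kept])
```

Raising `min_support` trades sensitivity for fewer false positives. The threshold of 3 is the one stated in the description.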



deepBase is a platform for annotating and discovering small and long ncRNAs (microRNAs, siRNAs, piRNAs, etc.) from next generation sequencing data. deepBase allows the mapping, storage, retrieval, analysis, integration, annotation, mining and visualization of next generation sequencing data from different technological platforms, tissues and cell lines of different organisms. deepBase also provides an integrative, interactive and versatile web graphical interface to display multidimensional data, and to facilitate transcriptomic research and the discovery of novel ncRNAs.



DIANA LAB, Fleming: Computational predictive models are a key element of current systems biology. The focus of the DIANA lab is on the development of algorithms, databases and tools for interpreting and archiving genomic data in the framework of a systemic analysis. Current emphasis is on the analysis of microRNA (miRNA) and protein coding genes. miRNAs have recently been found to be very abundant in mammalian organisms and play a key role in regulating development.

Source: Univ. of Thessaly & IMIS - "Athena" RC


DIANA-TarBase was initially released in 2006 and was the first database aiming to catalogue published, experimentally validated miRNA:gene interactions. DIANA-TarBase v7.0 provides, for the first time, hundreds of thousands of high-quality, manually curated, experimentally validated miRNA:gene interactions, enhanced with detailed metadata.
With DIANA-TarBase v7.0 you can easily identify positive or negative experimental results, the experimental methodology used, and experimental conditions including cell/tissue type and treatment. The new interface also provides advanced information ranging from the binding site location, as identified experimentally as well as in silico, to the primer sequences used for cloning experiments.
More than half a million miRNA:gene interactions have been curated from published experiments performed in 356 different cell types from 24 species. DIANA-TarBase v7.0 indexes 9- to 250-fold more entries than any other relevant database.



Exosomes are 30-150 nm membrane vesicles of endocytic origin secreted by most cell types in vitro. ExoCarta, an exosome database, catalogues the contents that have been identified in exosomes in multiple organisms.



The genomic tRNA database contains tRNA gene predictions made by tRNAscan-SE on complete or nearly complete genomes. Unless otherwise noted, all annotation is automated, and has not been inspected for agreement with published literature.

Source: University of California Santa Cruz

Integrated microbial genomes database


LNCipedia is an integrated database of 111,685 (February 2015) human annotated lncRNA transcripts obtained from different sources. In addition to basic transcript information and structure, several statistics are calculated for each entry in the database, such as secondary structure information, protein coding potential and microRNA binding sites.

The database is publicly available and allows users to query and download lncRNA sequences and structures based on different search criteria. The database may serve as a source of information on individual lncRNAs or as a starting point for large-scale studies.

Source: LNCipedia

lncrnadb - Long Noncoding RNA Database

The Reference Database For Functional Long Noncoding RNAs


lncRNASNP database

Long non-coding RNAs (lncRNAs) are a class of non-protein coding RNAs >200 nt in length, which are emerging as key factors in the regulation of various cellular processes. LncRNASNP is a database providing comprehensive resources of single nucleotide polymorphisms (SNPs) in human/mouse lncRNAs. It contains SNPs in lncRNAs, SNP effects on lncRNA structure and lncRNA:miRNA binding. LncRNASNP also integrates GWAS data and miRNA expression data into functional SNP selection for genetic association studies.

Source: Guo Lab, College of Life Science and Technology


mESAdb is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language.

Source: Konu Lab - Bilkent University

MethylTranscriptome DataBase (MeT-DB)

Methyltranscriptome is an exciting new area that studies the mechanisms and functions of methylation in transcripts. The MethylTranscriptome DataBase (MeT-DB) is the first comprehensive resource for the N6-methyladenosine (m6A) mammalian methyltranscriptome. It includes a database that records publicly available datasets from methylated RNA immunoprecipitation sequencing (MeRIP-Seq) (Fig. 1), a recently developed technology for interrogating the m6A methyltranscriptome. All data are aligned to either the UCSC/hg19 or the UCSC/mm9 genome build, depending on whether the samples are human or mouse, respectively.


microRNA - Targets and Expression
Predicted microRNA targets & target downregulation scores. Experimentally observed expression patterns.

Source: Memorial Sloan-Kettering Cancer Center

miR2Disease Database

miR2Disease, a manually curated database, aims to provide a comprehensive resource of miRNA deregulation in various human diseases. Each entry in miR2Disease contains detailed information on a miRNA-disease relationship, including miRNA ID, disease name, a brief description of the miRNA-disease relationship, miRNA expression pattern in the disease state, detection method for miRNA expression, experimentally verified miRNA target gene(s), and literature reference.



The miRBase database is a searchable database of published miRNA sequences and annotation. Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR).
Source: University of Manchester



miRDB is an online database for miRNA target prediction and functional annotations. All the targets in miRDB were predicted by a bioinformatics tool, MirTarget, which was developed by analyzing thousands of miRNA-target interactions from high-throughput sequencing experiments. Common features associated with miRNA target binding have been identified and used to predict miRNA targets with machine learning methods. miRDB hosts predicted miRNA targets in five species: human, mouse, rat, dog and chicken.

Source: miRDB


mirEX2 is a comprehensive platform for comparative analysis of primary microRNA expression data. RT–qPCR-based gene expression profiles are stored in a universal and expandable database scheme and wrapped by an intuitive user-friendly interface.



miRNEST is an integrative collection of animal, plant and virus microRNA data.

The database provides you with:

a) microRNAs from our high-throughput predictions as well as from external databases
b) predicted targets for plant candidates and experimental target support
c) integrated data from 15 external databases, which includes e.g. sequences, polymorphism, expression, promoters.
d) mirtrons, miRNA gene structures, degradome data and more!

miRNEST is being gradually developed to create an integrative resource of miRNA-associated data. The data comes from our computational predictions (new miRNAs, targets, mirtrons, miRNA gene structures) as well as from other databases and publications.

Source: Laboratory of Functional Genomics


miROrtho contains predictions of precursor miRNA genes covering several animal genomes combining orthology and a Support Vector Machine. We provide homology extended alignments of already known miRBase families and putative miRNA families exclusively predicted by our SVM and orthology pipeline.

Source: UniGe / SIB


miRTarBase: the experimentally validated microRNA-target interactions database

miRTarBase has accumulated more than three hundred and sixty thousand miRNA-target interactions (MTIs), which are collected by manually surveying the pertinent literature after systematic natural language processing (NLP) of the text to filter research articles related to functional studies of miRNAs. Generally, the collected MTIs are validated experimentally by reporter assay, western blot, microarray and next-generation sequencing experiments. In comparison with other similar, previously developed databases, miRTarBase contains the largest number of validated MTIs and provides the most up-to-date collection.

Source: ISBLab

NAPP (Nucleic Acids Phylogenetic Profiling)

NAPP (Nucleic Acids Phylogenetic Profiling) is a clustering method that efficiently identifies noncoding RNA (ncRNA) elements in a bacterial genome. In short, the intergenic regions of a reference genome are tiled into overlapping 50-nt segments, and all tiles and coding sequences are classified based on their occurrence profiles in 1000 other genomes. Tiles corresponding to actual ncRNAs tend to cluster together and with certain types of protein-coding genes. We term these "RNA-rich clusters". Any non-annotated tile in such clusters can be considered a strong ncRNA candidate (sRNA, cis-acting RNA or other ncRNAs). Furthermore, certain clusters are enriched for genes in specific functional classes, which allows hypotheses to be drawn about the function of associated ncRNAs.

This web server enables users to retrieve RNA-rich clusters from any genome in a list of 1000+ sequenced bacterial genomes. RNA-rich clusters can be viewed separately or, alternatively, all tiles from RNA-rich clusters can be assembled into larger contiguous elements and retrieved at once as a CSV or GFF file for use in a genome browser or for comparison with other predictions or RNA-seq experiments.
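The tiling step mentioned above, in which intergenic regions are cut into overlapping 50-nt segments, might look like the following minimal sketch. The 50-nt tile length comes from the description; the 25-nt step between tile starts is an assumption made here for illustration, since the amount of overlap is not stated, and the function name is hypothetical.

```python
def tile_region(sequence, tile_length=50, step=25):
    """Return (start, tile) pairs of overlapping, fixed-length tiles.

    tile_length=50 follows the description; step=25 is an assumed value.
    Trailing bases that do not fill a complete tile are left out.
    """
    if tile_length <= 0 or step <= 0:
        raise ValueError("tile_length and step must be positive")
    if step >= tile_length:
        raise ValueError("step must be smaller than tile_length for tiles to overlap")
    if len(sequence) < tile_length:
        return []
    last_start = len(sequence) - tile_length
    return [
        (start, sequence[start:start + tile_length])
        for start in range(0, last_start + 1, step)
    ]


# A made-up 120-nt intergenic region.
region = "ACGT" * 30
tiles = tile_region(region)
print([start for start, _ in tiles])
```

Each tile would then be profiled for its occurrence across the other genomes. That classification step is outside the scope of this sketch.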



NONCODE is an integrated knowledge database dedicated to non-coding RNAs (excluding tRNAs and rRNAs). There are now 16 species in NONCODE (human, mouse, cow, rat, chicken, fruit fly, zebrafish, C. elegans, yeast, Arabidopsis, chimpanzee, gorilla, orangutan, rhesus macaque, opossum and platypus). The sources of NONCODE include literature and other public databases. We searched PubMed using the key words ‘ncrna’, ‘noncoding’, ‘non-coding’, ‘no code’, ‘non-code’, ‘lncrna’ or ‘lincrna’.

We retrieved the newly identified lncRNAs and their annotation from the Supplementary Material or web sites of these articles. These, together with the newest data from Ensembl, RefSeq, lncRNAdb and GENCODE, were processed through a standard pipeline for each species.



NPInter documents functional interactions between noncoding RNAs (except tRNAs and rRNAs) and biomolecules (proteins, RNAs and DNAs) which are experimentally verified. By functional interactions, we mean primarily physical interactions, although several interactions of other forms also appear here.

Interactions are manually collected from publications in peer-reviewed journals, followed by an annotation process against known databases including NONCODE, miRBase and UniProt. We introduce a classification of the functional interaction data, which is based on the functional interaction process the ncRNA takes part in. NPInter also provides an efficient search option, allowing discovery of interactions, related publications and other information.

Source: ChenLab


piRNABank is a web analysis system and resource that provides comprehensive information on piRNAs in three widely studied mammals (human, mouse and rat) and in the fruit fly, Drosophila.
Source: Institute of Bioinformatics and Applied Biotechnology (IBAB)


piRNA cluster database

A resource for genomic piRNA clusters in animal species

Statistics for the current piRNA cluster database release:
Species in piRNA cluster database: 12
Total number of SRA datasets: 112
Total number of proTRAC runs: 128
Total number of piRNA clusters: 33964

Source: University Mainz


RNA editing is the post-transcriptional modification of RNA nucleotides from their genome encoded sequence. The most common type of editing in metazoans is deamination of Adenosine into Inosine catalyzed by the ADAR family of proteins. Subsequently, Inosine is interpreted as Guanosine by the cellular machinery.

The development of high-throughput sequencing technologies has enabled the transcriptome-wide identification of A-to-I editing sites. This database aims to present a comprehensive collection of A-to-I editing sites in human, mouse, and fly transcripts. Useful annotations were incorporated as described in the tutorial.
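As a small illustration of the statement above that inosine is interpreted as guanosine, the sketch below rewrites edited adenosines as G, which is how an A-to-I edited transcript would be read. The function name, the example sequence and the 0-based position convention are assumptions made for this example and do not reflect the database's own format.

```python
def read_edited_transcript(rna, edited_positions):
    """Return the RNA sequence as read after A-to-I editing.

    edited_positions holds 0-based indices of adenosines deaminated to
    inosine; each one is read as G. Raises ValueError if a listed
    position does not hold an A.
    """
    bases = list(rna.upper())
    for pos in edited_positions:
        if bases[pos] != "A":
            raise ValueError(f"position {pos} is {bases[pos]!r}, not 'A'")
        bases[pos] = "G"
    return "".join(bases)


# Made-up 8-nt sequence with one editing site at index 4.
print(read_edited_transcript("AUGCAAGU", [4]))
```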

Source: Stanford University


The Rfam database is a collection of RNA families, each represented by multiple sequence alignments, consensus secondary structures and covariance models (CMs).
Source: EMBL-EBI



RMBase (RNA Modification Base) is designed for decoding the landscape of RNA modifications identified from high-throughput sequencing datasets. It contains ~226000 N6-Methyladenosines (m6A), ~9500 pseudouridine (Ψ) modifications, ~1000 5-methylcytosine (m5C) modifications, ~1210 2′-O-methylations (2′-O-Me) and ~3130 other types of RNA modifications.

RMBase demonstrated thousands of RNA modifications located within mRNAs, regulatory ncRNAs (e.g. lncRNAs, miRNAs, pseudogenes, circRNAs, snoRNAs, tRNAs), miRNA target sites and disease-related SNPs.

Source: Qu Lab, Sun Yat-sen University


RNAcentral is a public resource that offers integrated access to a comprehensive and up-to-date set of non-coding RNA sequences provided by a collaborating group of Expert Databases. The development of RNAcentral is coordinated by the European Bioinformatics Institute and is funded by BBSRC.

Source: EMBL-EBI


RNA FRABASE: an engine with database to search the three-dimensional fragments within 3D RNA structures using as an input the sequence(s) and / or secondary structure(s) given in the dot-bracket notation.
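For readers unfamiliar with the dot-bracket notation mentioned above, the following minimal sketch converts a dot-bracket string into a list of base-paired positions. It handles only plain parentheses and dots. Extended symbols sometimes used for pseudoknots are not covered, and the function name is hypothetical rather than part of RNA FRABASE.

```python
def parse_dot_bracket(structure):
    """Return sorted (i, j) pairs of 0-based paired positions.

    '(' opens a pair, ')' closes the most recently opened one,
    and '.' marks an unpaired nucleotide.
    """
    open_positions = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            open_positions.append(index)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((open_positions.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {index}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


# A 6-nt hairpin: two stacked base pairs closing a 2-nt loop.
print(parse_dot_bracket("((..))"))
```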



snOPY is a snoRNA orthological gene database. snOPY provides comprehensive information about snoRNAs, snoRNA gene loci and target RNAs.

Source: University of Miyazaki


snoRNABase, a comprehensive database of human H/ACA and C/D box snoRNAs.



SomamiR is a database of cancer somatic mutations in microRNAs (miRNA) and their target sites that potentially alter the interactions between miRNAs and competing endogenous RNAs (ceRNA) including mRNAs, circular RNAs (circRNA) and long noncoding RNAs (lncRNA). It also provides an integrated platform for the functional analysis of these somatic mutations.

Source: Yan Cui's Lab at University of Tennessee Health Science Center


Small non-coding RNAs (sRNAs) carry out a variety of biological functions and affect protein synthesis and protein activities in prokaryotes. Recently, numerous sRNAs and their targets were identified in Escherichia coli and in other bacteria. It is crucial to have a comprehensive resource concerning the annotation of small non-coding RNAs in microbial genomes.

This work presents an integrated database, namely sRNAMap, to collect the sRNA genes, the transcriptional regulators of sRNAs and the sRNA target genes by integrating a variety of biological databases and by surveying literature. In this resource, we collected 397 sRNAs, 62 regulators/sRNAs and 60 sRNAs/targets in seventy microbial genomes.

Source: National Chiao Tung University


SRPDB assists the study of the structures and functions of signal recognition particles (SRPs). SRPDB provides annotated SRP RNA and SRP protein sequences, phylogenetically ordered and aligned. Included are representative RNA secondary structure diagrams where each base pair is proven by comparative sequence analysis, information about other proteins that play a role in SRP-mediated protein translocation, as well as structural information about components of SRP.



starBase is designed for decoding Pan-Cancer and Interaction Networks of lncRNAs, miRNAs, competing endogenous RNAs (ceRNAs), RNA-binding proteins (RBPs) and mRNAs from large-scale CLIP-Seq (HITS-CLIP, PAR-CLIP, iCLIP, CLASH) data and tumor samples (14 cancer types, >6000 samples).

starBase is also developed for deciphering Protein-RNA and miRNA-target interactions, such as protein-lncRNA, protein-sncRNA, protein-mRNA, protein-pseudogene, miRNA-lncRNA, miRNA-mRNA, miRNA-circRNA, miRNA-pseudogene, miRNA-sncRNA interactions and ceRNA networks from 108 CLIP-Seq (HITS-CLIP, PAR-CLIP, iCLIP, CLASH) datasets.

Source: Sun Yat-sen University


The tmRDB is a tool in the study of the structures and functions of the tmRNA (earlier called "10S RNA"). As the name implies, tmRNA has properties of tRNA and mRNA combined in a single molecule. The tmRDB provides aligned, annotated and phylogenetically ordered tmRNA sequences. The alignments of the sequences represent conserved secondary structure elements where each base pair is proven by comparative sequence analysis. Where possible, we established direct links to primary sources.



tRFs are single-stranded RNAs, 14-32 bases long, that are derived from mature or precursor tRNA and are distinct from the stress-induced tRNA fragments created by cleavage in the anticodon loop. tRFs have been shown to be important for cell-cycle progression and for regulating the dynamics of RISC. We report a comprehensive database of tRFs prepared from publicly available high-throughput sequencing data of >50 short RNA libraries.

Source: tRFdb


The tRNA Gene DataBase Curated by Experts "tRNADB-CE" was constructed by analyzing 2,588 complete and 24,892 draft genomes of Bacteria and Archaea, 151 complete virus genomes, 121 complete chloroplast genomes, 12 complete eukaryote (Plant and Fungi) genomes and approximately 230 million DNA sequence entries that originated from environmental metagenomic clones.

This exhaustive search for tRNA genes from DDBJ/EMBL/GenBank was performed by running tRNAscan-SE, a computer program widely used for tRNA gene searches, in combination with ARAGORN and tRNAfinder, to enhance completeness and accuracy of the prediction.

Source: Niigata University


UTRdb is a curated database of 5' and 3' untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated and also collated as the UTRsite database, where more specific information on the functional motifs and cross-links to interacting regulatory proteins are provided.

In the current update the UTR entries have been organized in a gene-centric structure to better visualize and retrieve 5' and 3'UTR variants generated by alternative initiation and termination of transcription and alternative splicing. Experimentally validated miRNA targets and conserved sequence elements are also annotated. The integration of UTRdb with genomic data has allowed the implementation of an efficient annotation system and a powerful retrieval resource for the selection and extraction of specific UTR subsets.

Source: University of Bari

ym500v2 miR-Seq Database

YM500 is an integrated database for miRNA quantification, isomiR identification, arm switching discovery and novel miRNA prediction from small RNA sequencing (smRNA-seq).

In this update of YM500, we focus on the cancer miRNAome to make the database more disease-orientated. There are more than 8,000 cancer-related smRNA-seq datasets (including those of primary tumours, paired normal tissues, PBMC, and metastatic tissues) incorporated into YM500v2.

Source: National Yang-Ming University

Ribosomal RNA Databases

IRESite database

The IRESite database presents information about the experimentally studied IRES (Internal Ribosome Entry Site) segments. IRES regions are known to attract eukaryotic ribosomal translation initiation complex and thus promote translation initiation independently of the presence of the commonly utilized 5'-terminal 7mG cap structure.

Source: IRESite

ITS2 database

Internal transcribed spacer 2 ribosomal RNA database


RDP (Ribosomal Database)

RDP (Ribosomal Database) provides quality-controlled, aligned and annotated Bacterial and Archaeal 16S rRNA sequences, and Fungal 28S rRNA sequences, and a suite of analysis tools to the scientific community.
Source: Michigan State University



Ribosome profiling is a technique that provides genome-wide information on translated mRNA based on deep sequencing of ribosome-protected mRNA fragments (RPFs). The current version of the database contains 777 samples from 82 studies in eight species, processed and reanalyzed by a unified pipeline.

There are two ways to query the database: by keywords of studies or by genes. The outputs are presented at three levels. 1) Study level: meta information of studies and reprocessed data for gene expression of translated mRNAs; 2) Sample level: a global perspective of translated mRNA and a list of the most translated mRNAs of each sample from a study; 3) Gene level: normalized sequence counts of translated mRNA at different genomic locations of a gene from multiple samples and studies.

Source: Sun Yat-sen University


SILVA provides comprehensive, quality checked and regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya). SILVA are the official databases of the software package ARB.
Source: Silva


The greengenes web

The greengenes web application provides access to the 2011 version of the greengenes 16S rRNA gene sequence alignment for browsing, blasting, probing, and downloading. The data and tools presented by greengenes can assist the researcher in choosing phylogenetically specific probes, interpreting microarray results, and aligning/annotating novel sequences. If you are an ARB user, you can use greengenes to keep your own local database current.
Source: greengenes

