

Cancer Genetic Databases




arrayMap

arrayMap is a curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. The arrayMap database provides an entry point for meta-analysis and systems-level data integration of high-resolution oncogenomic CNA data.

Source: arrayMap

Atlas of Genetics and Cytogenetics in Oncology and Haematology

- The Atlas of Genetics and Cytogenetics in Oncology and Haematology is a peer-reviewed online journal, encyclopaedia and database, freely accessible on the Internet, devoted to genes, cytogenetics, and clinical entities in cancer and cancer-prone diseases.

- The aim is to cover the entire field under study; because the task is huge, the Atlas is, and will remain, incomplete.

- It presents structured reviews (cards) or traditional review papers ('deep insights'), a portal towards genetics and/or cancer databases and journals, teaching items in genetics for students in medicine and in the sciences, and a section of case reports in haematology.

- It is made for and by: clinicians and researchers in cytogenetics, molecular biology, oncology, haematology, and pathology. Contributions are reviewed before acceptance.

- It deals with cancer research and genomics. It is at the crossroads of research, virtual medical university (university and post-university e-learning), and telemedicine. It contributes to "meta-medicine": the mediation, through new information technology, between the growing body of knowledge and the individual who has to use that information, with the aim of a personalized medicine of cancer.


BRCA Share™

BRCA Share is a novel gene datashare initiative that provides scientists and commercial laboratory organizations around the world with open access to BRCA1 and BRCA2 genetic data. The program’s goal is to accelerate research on BRCA gene mutations, particularly variants of uncertain significance, to improve the ability of clinical laboratory diagnostics to predict which individuals are at risk of developing the associated cancers.

BRCA Share is an open user group co-founded by Inserm and Quest Diagnostics. Other participants include Laboratory Corporation of America (LabCorp) and the French UNICANCER Genetic Group (UGG), composed of sixteen academic laboratories performing BRCA1 and BRCA2 testing in France.

Source: BRCA Share™

Breast Cancer Now Tissue Bank Bioinformatics Portal

The Breast Cancer Now Tissue Bank Bioinformatics Portal is a major component of the Breast Cancer Now Tissue Bank.

Our portal is designed to address various cancer research problems, ranging from patient data and specimen type, through cancer development stage to expression patterns. Scientists are able to refine biological data queries according to various criteria across different experimental data types such as transcriptomics, genomics and proteomics to look for information on relevant genes, transcripts, proteins, microRNA, SNPs and other biological data. By bringing complex data together, it is possible for scientists to query and explore results and find new relationships among the factors that contribute to the complex pathogenesis of cancer.

Source: Breast Cancer Now Tissue Bank

Cancer Cell Metabolism Gene Database (ccmGDB)

Cancer Cell Metabolism Gene Database (ccmGDB) is a comprehensive annotation resource for cell metabolism genes in cancer. The objective of this database is to serve both the cancer cell metabolism and broader research communities by providing a useful resource about functional annotation of cell metabolism genes in various cancer types.

Source: Vanderbilt University

Cancer Genome Anatomy Project (CGAP)

The NCI's Cancer Genome Anatomy Project sought to determine the gene expression profiles of normal, precancer, and cancer cells, leading eventually to improved detection, diagnosis, and treatment for the patient. Resources generated by the CGAP initiative are available to the broad cancer community. Interconnected modules provide access to all CGAP data, bioinformatic analysis tools, and biological resources allowing the user to find "in silico" answers to biological questions in a fraction of the time it once took in the laboratory.

Source: National Cancer Institute


CancerResource

CancerResource is an open access, comprehensive knowledgebase for drug-target relationships related to cancer, as well as for supporting information and experimental data. Large-scale cancer genomics data, including mRNA expression and non-synonymous mutation data, are also integrated into the database. CancerResource therefore allows explorative data analysis based on cancer-related drug-target interactions, expression and mutation data, and drug sensitivity data.


Cancer RNA-Seq Nexus

A comprehensive database of phenotype-specific transcriptome profiling in cancer cells.
The CRN database includes 40 cancers (e.g. lung cancer, colon cancer and breast cancer) and 325 phenotype-specific subsets. Each subset contains a group of RNA-seq samples with a specific phenotype or genotype, e.g. breast cancer stage II, ER+ breast cancer and Her2+ breast cancer. The CRN database can thus facilitate personalized medicine. For example, triple-negative breast cancer, which is characterized by negative expression of ER, PR, and Her2/Neu, is not responsive to current targeted therapeutics.


Candidate Cancer Gene Database (CCGD)

The Candidate Cancer Gene Database (CCGD) was developed to disseminate the results of transposon-based forward genetic screens in mice that identify candidate cancer genes. The purpose of the database is to allow cancer researchers to quickly determine whether or not a gene, or list of genes, has been identified as a potential cancer driver in a forward genetic screen in mice.

Source: University of Minnesota


canSAR

canSAR is an integrated knowledge-base that brings together multidisciplinary data across biology, chemistry, pharmacology, structural biology, cellular networks and clinical annotations, and applies machine learning approaches to provide predictions useful for drug discovery.


Catalogue of Somatic Mutations in Cancer (COSMIC)

All cancers arise as a result of the acquisition of a series of fixed DNA sequence abnormalities, mutations, many of which ultimately confer a growth advantage upon the cells in which they have occurred. There is a vast amount of information available in the published scientific literature about these changes. COSMIC is designed to store and display somatic mutation information and related details and contains information relating to human cancers.

There are two types of data in COSMIC: expert manual curation data and systematic screen data. It is useful to understand the differences between these data types and to use them appropriately.

Source: COSMIC


cBioPortal for Cancer Genomics

The cBioPortal for Cancer Genomics provides visualization, analysis and download of large-scale cancer genomics data sets.


CellLineNavigator database

The CellLineNavigator database is a web-based workbench for large-scale comparisons of a vast number of diverse cell lines, supporting experimental design in genomics, systems biology and translational biomedical research. Currently, this compendium holds genome-wide expression profiles of 317 different cancer cell lines, categorized into 57 different pathological states and 28 individual tissues. To enlarge its scope, CellLineNavigator is also closely linked to commonly used bioinformatics databases and knowledge repositories.


Cervical Cancer Gene Database

The Cervical Cancer Gene Database (CCDB) is a manually curated catalog of experimentally validated genes that are thought to be involved in the different stages of cervical carcinogenesis. Each entry contains information on the gene and protein sequences, location, architecture, function, chromosomal position, accession numbers, gene and CDS sizes, gene ontology and homology to other eukaryotic genomes. In addition, rich cross-references to other web resources such as UniGene, HPRD, HGNC, Ensembl and OMIM augment CCDB-specific information with external data. CCDB also provides relevant literature references for the genes included in the database.



ChimerDB

Chromosome translocation and gene fusion are frequent events in the human genome and are often the cause of many types of tumor. ChimerDB is a database of fusion sequences that combines bioinformatics analysis of mRNA and EST sequences in GenBank, manual collection of literature data, and integration with other known databases such as OMIM. The bioinformatics analysis identifies fusion transcripts that have non-overlapping alignments at multiple genomic loci. Fusion events at exon-exon borders are selected to filter out cloning artifacts from cDNA library preparation.

Source: Ewha Womans University
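
The fusion-detection criterion described for ChimerDB above (a transcript with non-overlapping alignments at more than one genomic locus) can be illustrated with a minimal Python sketch. This is not ChimerDB's actual pipeline: the `Alignment` record, the locus labels and the coordinates are hypothetical, and the exon-exon border filter is left out.

```python
from itertools import combinations
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record: one alignment of a transcript to the genome."""
    locus: str        # illustrative label for the genomic locus
    query_start: int  # start on the transcript, 0-based inclusive
    query_end: int    # end on the transcript, exclusive


def intervals_overlap(a: Alignment, b: Alignment) -> bool:
    # Two half-open intervals overlap when each starts before the other ends.
    return a.query_start < b.query_end and b.query_start < a.query_end


def is_fusion_candidate(alignments: list[Alignment]) -> bool:
    # A transcript is a candidate when two alignments hit different loci
    # and cover non-overlapping parts of the transcript.
    return any(
        a.locus != b.locus and not intervals_overlap(a, b)
        for a, b in combinations(alignments, 2)
    )


# Toy data (assumed): the first transcript splits cleanly across two loci;
# the second aligns the same region to two loci, as a paralog would.
split_transcript = [Alignment("LOCUS_1", 0, 300), Alignment("LOCUS_2", 300, 650)]
paralog_transcript = [Alignment("LOCUS_1", 0, 500), Alignment("LOCUS_3", 20, 480)]

print(is_fusion_candidate(split_transcript))    # True
print(is_fusion_candidate(paralog_transcript))  # False
```

Requiring the two alignments to cover different parts of the transcript is what separates a chimeric sequence from one that simply matches several similar loci.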

Colorectal Cancer Atlas

Colorectal Cancer Atlas is a database that catalogs multiple data types pertaining to colorectal cancer:

- Quantitative and non-quantitative protein expression data obtained from various techniques, including mass spectrometry, Western blotting, immunohistochemistry, confocal microscopy, immunoelectron microscopy and FACS
- Mutation data obtained by large- and small-scale sequencing
- Pathway data from Reactome, NCI, Cell Map and HumanCyc
- Protein-protein interactions from BioGRID and HPRD
- Gene Ontology data from Entrez Gene

Source: Colorectal Cancer Atlas

Computational Cancer Genomics (CCG)

The Computational Cancer Genomics (CCG) lab of the Swiss Institute of Bioinformatics (SIB) is dedicated to the development of analysis tools and database resources related to genome structure and gene regulation.

Source: Swiss Institute of Bioinformatics

DPSC-Cancer database

The DPSC-Cancer database is a collection of shRNA dropout signature profiles covering ~16,000 human genes, derived from more than 70 human pancreatic, ovarian and breast cancer cell lines using the microarray detection platform developed in the DPSC (Donnelly - Princess Margaret Screening Centre) facility at the Moffat Lab. All shRNA dropout profiles are freely available through download or queries via this website.

Source: University of Toronto

FORCE Genetic Mutation Database

FORCE has created a Genetic Mutation Database to allow users to search for a particular mutation and connect with others who have the same mutation.

Use the Mutation Search form to search by mutation or ethnicity. Use the Submit Mutation form to enter your own information.

Source: FORCE

IARC TP53 Database

The IARC TP53 Database compiles various types of data and information on human TP53 gene variations related to cancer. Data are compiled from the peer-reviewed literature and from generalist databases.

Source: IARC

Integrated Tumor Transcriptome Array and Clinical data Analysis

ITTACA is a database created for Integrated Tumor Transcriptome Array and Clinical data Analysis. ITTACA centralizes public datasets containing both gene expression and clinical data and currently focuses on the types of cancer that are of particular interest to the Institut Curie: breast carcinoma, bladder carcinoma, and uveal melanoma. ITTACA is developed by the Institut Curie Bioinformatics group and the Molecular Oncology group of UMR144 CNRS/Institut Curie.

Source: Institut Curie


IntOGen

IntOGen collects and analyses somatic mutations in thousands of tumor genomes to identify cancer driver genes.

Source: IntOGen


Lnc2Cancer

Lnc2Cancer is a manually curated database that provides comprehensive, experimentally supported associations between lncRNAs and human cancers. The current version of Lnc2Cancer documents 1,239 entries of associations between 579 human lncRNAs and 93 human cancers, drawn from a review of more than 1,500 published papers.

Source: Lnc2Cancer


MethyCancer

The database of human DNA Methylation and Cancer (MethyCancer) was developed to study the interplay of DNA methylation, gene expression and cancer. It hosts both highly integrated data on DNA methylation, cancer-related genes, mutations and cancer information from public resources, and the CpG Island (CGI) clones derived from our large-scale sequencing. Interconnections between the different data types are analyzed and presented. A search tool and the graphical MethyView were developed to help users access all the data and data connections and view DNA methylation in the context of genomics and genetics data.

Source: Chinese Academy of Sciences


MutationAligner

The MutationAligner website enables you to explore mutation hotspots identified in protein domains from more than 5000 patients across 22 cancer types. Using multiple sequence analysis, protein domain hotspots are identified by tallying missense mutations across analogous residues of domain-containing genes.

Source: MutationAligner
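
The tallying step described for MutationAligner above can be illustrated with a small Python sketch. This is not MutationAligner's implementation: the gene names, the residue-to-column mapping and the mutation list are hypothetical toy data, and the mapping stands in for a real multiple sequence alignment of a protein domain.

```python
from collections import Counter

# Hypothetical domain alignment: for each gene, map a residue position in
# the protein to a shared column index in the domain alignment.
residue_to_column = {
    "GENE_A": {10: 0, 11: 1, 12: 2},
    "GENE_B": {55: 0, 56: 1, 57: 2},
    "GENE_C": {7: 1, 8: 2},
}

# Hypothetical missense mutations as (gene, residue position) pairs.
missense_mutations = [
    ("GENE_A", 11),
    ("GENE_B", 56),
    ("GENE_C", 7),
    ("GENE_A", 12),
    ("GENE_B", 90),  # lies outside the aligned domain, so it is skipped
]


def tally_by_column(mutations, mapping):
    """Count mutations per alignment column, pooling analogous residues."""
    counts = Counter()
    for gene, position in mutations:
        column = mapping.get(gene, {}).get(position)
        if column is not None:
            counts[column] += 1
    return counts


column_counts = tally_by_column(missense_mutations, residue_to_column)
hotspot_column, hotspot_count = column_counts.most_common(1)[0]
print(hotspot_column, hotspot_count)  # 1 3
```

Pooling counts by alignment column rather than by gene lets residues that are rarely mutated in any single gene still stand out when the same domain position is hit across several genes.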

My Cancer Genome

My Cancer Genome is a personalized cancer medicine knowledge resource for physicians, patients, caregivers and researchers.
My Cancer Genome gives up-to-date information on what mutations make cancers grow and related therapeutic implications, including available clinical trials.


NCG - Network of Cancer Genes

NCG 5.0 reports information on the evolutionary appearance of a manually curated list of 1,571 protein-coding cancer genes (26 February 2016).



Oncomine

The Oncomine™ Platform, which ranges from web applications to translational bioinformatics services, provides solutions for individual researchers and multinational companies, with robust, peer-reviewed analysis methods and a powerful set of analysis functions that compute gene expression signatures, clusters and gene-set modules, automatically extracting biological insights from the data.

Source: Thermo Fisher Scientific Inc.


Progenetix

The Progenetix database provides an overview of copy number abnormalities in human cancer, currently drawn from 31,915 array and chromosomal Comparative Genomic Hybridization (CGH) experiments.

Additionally, the website attempts to identify and present all publications (currently 2,743 articles) referring to cancer genome profiling experiments. The database and software are developed by the group of Michael Baudis at the University of Zurich.

Source: Progenetix

RTCGD: Retrovirus and Transposon tagged Cancer Gene Database

The Retrovirus Integration Site (RIS) data in the RTCGD are obtained using a high-throughput inverse PCR method or a splinkerette method. We cloned and sequenced independent RISs from tumors and added these sequences to RTCGD. Then, using the public UCSC mouse genome server, we positioned these RIS sequences in the mouse genome and identified candidate genes.

In addition to retrovirus tagging, Sleeping Beauty (SB) transposon systems are used for mouse mutagenesis. In RTCGD, users can also search the results of cancer-gene screening with the SB transposon system.



SNP500Cancer

The goal of the SNP500Cancer project is to resequence 102 reference samples to find known or newly discovered single nucleotide polymorphisms (SNPs) which are of immediate importance to molecular epidemiology studies in cancer. SNP500Cancer provides a central resource for sequence verification of SNPs.


Stem Cell Discovery Engine

SCDE enables integrated access to tissue and cancer stem cell experimental information and molecular profiling analysis tools.

Source: Harvard Stem Cell Institute

TCGA SpliceSeq

TCGA SpliceSeq is a resource for investigation of cross-tumor and tumor-normal alterations in mRNA splicing patterns of The Cancer Genome Atlas project (TCGA) RNASeq data.

This site presents interactive visualizations of transcript splicing patterns and splicing event details from analysis performed by our SpliceSeq application.

Cross-tumor analysis and plots use 15 randomly selected samples from 33 different tumor types and, when available, 15 independent normal samples. Most TCGA disease studies contain hundreds of samples, and our data download page provides full sets of per-sample splice event data.


The Cancer Genome Atlas (TCGA)

The Cancer Genome Atlas (TCGA) Data Portal provides a platform for researchers to search, download, and analyze data sets generated by TCGA. It contains clinical information, genomic characterization data, and high level sequence analysis of the tumor genomes.

Source: TCGA

The Familial Cancer Database (FaCD)

The goal of FaCD is to assist clinicians and genetic counselors in making a genetic differential diagnosis in cancer patients, as well as in becoming aware of the tumor spectrum associated with hereditary disorders that have already been diagnosed in their patients. FaCD is not an expert system, but a tool for experts. It is not a substitute for consulting an expert on the clinical genetics of cancer.

Source: FaCD

The Mouse Tumor Biology (MTB) Database

The Mouse Tumor Biology (MTB) Database supports the use of the mouse as a model system of hereditary cancer by providing electronic access to:

- Information on endogenous spontaneous and induced tumors in mice, including tumor frequency and latency data
- Information on genetically defined mice (inbred, hybrid, mutant, and genetically engineered strains of mice) in which tumors arise
- Information on genetic factors associated with tumor susceptibility in mice and somatic genetic mutations observed in the tumors
- Tumor pathology reports and images
- References supporting MTB data
- Links to other online resources for cancer


TSGene database

Since our publication of TSGene 1.0 in 2013, we have received many inquiries for more details about TSGene. Additionally, many more cancer datasets, especially those from pan-cancer studies, have been released since we published TSGene. In this updated version, we collected hundreds of additional tumor suppressor genes from the literature. TSGene 2.0 now contains 1217 human genes (1018 coding and 199 non-coding genes) curated from a total of over 9000 PubMed abstracts.

The primary aim of TSGene 2.0 is to support cancer research by maintaining a high-quality tumor suppressor gene list for pan-cancer analysis. This database serves as a comprehensive, fully classified, richly and accurately annotated tumor suppressor gene knowledgebase, with extensive cross-references and querying interfaces freely accessible to the scientific community.

Source: Vanderbilt University

Tumor Associated Gene

The tumor-associated gene (TAG) database was designed to utilize information from well-characterized oncogenes and tumor suppressor genes to facilitate cancer research. All target genes were identified through a text-mining approach from the PubMed database.
A semi-automatic information retrieval engine was built to collect specific information on these target genes from various resources and store it in the TAG database. At the current stage, 662 TAGs have been collected, including 246 oncogenes, 265 tumor suppressor genes, and 151 genes related to oncogenesis. Information in the TAG database can be browsed through user-friendly web interfaces that support searching for genes by chromosome or by keyword.

Source: NCKU Bioinformatics Center

UCSC Cancer Genomics Browser

The browser is a suite of web-based tools to visualize, integrate and analyze cancer genomics data and the associated clinical data. It is developed and maintained by the UCSC Cancer Genomics Group, led by David Haussler and Josh Stuart, working closely with the UCSC Genome Browser team at the Center for Biomolecular Science and Engineering (CBSE) at the University of California Santa Cruz (UCSC).

Source: University of California Santa Cruz
