Biological Databases


Major Online Repositories

  • NCBI is the U.S. National Center for Biotechnology Information.
    • The NCBI Bookshelf provides free online access to many biology textbooks, including previous editions of some of the most widely used ones.
    • The NCBI Online Mendelian Inheritance in Man (OMIM) is a repository of information about gene mutations associated with human diseases.
    • The European Bioinformatics Institute (EMBL-EBI) is the European counterpart to NCBI.
  • The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a major repository dedicated to systems biology data resources. While the site can be overwhelming, the overview page is a good place to start.
  • The Gene Ontology is a database in which scientists are agreeing on standard terms for representing genes, gene products, and their properties.
    • Use the AmiGO browser to explore some of the biological processes, cellular components, and molecular functions catalogued.
  • Education-focused collections: These collections provide sets of tools and databases that have been limited or simplified to make them more appropriate for student use.

Pathway Databases

  • Reactome is an open-source, open-access, manually curated, peer-reviewed, and highly reliable pathway database.
  • BioCyc is a collection of 1004 Pathway/Genome Databases. Each database in the BioCyc collection describes the genome and metabolic pathways of a single organism.
  • PANTHER Pathways consists of over 165 pathways, primarily signaling pathways, each with subfamilies and protein sequences mapped to individual pathway components.
  • KEGG Pathway is a collection of manually drawn pathway maps representing our knowledge of many molecular interaction and reaction networks.

Genomic Databases

  • The Genome Browser is a database of genomes that have been sequenced. The help page is the place to start learning about what you can do at this site.
  • The Gene Expression Omnibus is a repository of data from experiments analyzing gene expression in particular cells, tissues, or cultures as they change over time or in response to perturbation. Start at the documentation page.
  • CCSB Interactome Database is an effort to catalog what we are learning about protein-protein interactions in humans and model organisms. It is still more of a research project than a usable database.
  • The Human Metabolome Database is a repository of information about the identities, abundance and distribution of small molecule metabolites in the human body.

Many other databases are out there, and the big repositories contain large collections of them. Being able to use them to find the information you need, without having to re-run the experiments, is essential in the life sciences.
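
As one illustration of finding information programmatically rather than through a web page, the sketch below builds a search request for NCBI's E-utilities ESearch service and pulls the record IDs out of a JSON reply. It is a minimal sketch, not a recommended workflow: the helper names build_esearch_url and parse_id_list are hypothetical, the choice of the "gds" database (GEO DataSets) and the search term are assumptions made for the example, and no network request is made here.

```python
from urllib.parse import urlencode

# Address of NCBI's E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL that searches Entrez database `db` for `term`.

    Hypothetical helper for illustration; it only assembles the URL.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


def parse_id_list(response):
    """Return the list of record IDs from a decoded ESearch JSON reply.

    Returns an empty list if the expected keys are missing.
    """
    return response.get("esearchresult", {}).get("idlist", [])


if __name__ == "__main__":
    # Assumed example: look for GEO DataSets records mentioning "heat shock".
    print(build_esearch_url("gds", "heat shock", retmax=5))

    # Made-up reply with placeholder IDs, used only to show the parsing step.
    sample_reply = {"esearchresult": {"count": "2", "idlist": ["1", "2"]}}
    print(parse_id_list(sample_reply))
```

To run a real search, the URL could be fetched with any HTTP client and the reply decoded as JSON before being passed to parse_id_list. NCBI publishes usage guidelines for E-utilities, so check them before sending requests in bulk.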

Materials on how to use the various biological data repositories are available in a number of places, including the Molecular Sciences Student Workbench site.