The Human Genome Center (HGC) has constructed and provides various kinds of biological databases. In addition, some major databases constructed by other institutes can also be searched through the entry retrieval system developed at HGC. The following are available.

The JSNP database is a collection of approximately 200,000 gene-based SNPs, with information on their locations, the methods used to genotype them, and their allele frequencies in the Japanese general population. This project was one of the Prime Minister's National Millennium Projects in Japan, carried out in collaboration between the Human Genome Center (HGC), Institute of Medical Science (IMS), The University of Tokyo, and the Japan Science and Technology Agency (JST).


The Cell System Markup Language (CSML) is an XML format for modeling, visualizing and simulating biopathways. CSML can represent several pathway types, including metabolic, signaling, and genetic regulatory pathways. This project aims to facilitate the exchange of biopathway data between different formats, and effort has been made to convert data from other XML formats. In addition, the Cell System Ontology (CSO) has been developed to make CSML extensible and flexible.


The HiGet service provides full-text searches on various biological databases, including GenBank, RefSeq, UniProt, PDB, PROSITE and OMIM. Field-specific searches can narrow down the number of entries retrieved to give more specific results.


Database of human transcriptional start sites and full-length cDNAs (Prof. Sugano and Prof. Nakai)


Database of transcription factors and promoters of Bacillus subtilis, compiled from the literature (Prof. Nakai et al.)


Database of tunicate promoters, transcription factors and conserved regulatory regions.


Full length cDNA
Full-length cDNA database supported by NEDO (Prof. Sugano's group)


Aberrant Splicing Database
An older collection of aberrant splicing events (i.e., abnormal splicing caused mostly by point mutations and manifested as hereditary diseases), compiled by Prof. Nakai (HGC)


Database of homologous, experimentally determined, protein-protein interactions across 9 species.


The Macrophage Pathway Knowledgebase (MACPAK) is a computational system which allows biomedical researchers to query and study the dynamic behaviors of macrophage molecular pathways. It integrates the knowledge of 230 reviews that were carefully checked by specialists for their accuracy and then converted to 230 dynamic mathematical pathway models. MACPAK comprises a total of 24,009 entities and 12,774 processes and is described in the Cell System Markup Language (CSML), an XML format that runs on the Cell Illustrator platform and can be visualized with a customized Cytoscape for further analysis.


Database of protein-protein interactions predicted to be true among those reported in high-throughput interaction datasets.


CGED (Cancer Gene Expression Database) is a database of gene expression profiles and accompanying clinical information. It offers graphical presentation of expression and clinical data, with similarity search and sorting functions.


eF-site is a database of the molecular surfaces of proteins, together with their electrostatic potential and functional site information. The name stands for "electrostatic surface of functional site".


Rat Genome Map
Radiation hybrid map of the OLETF rat by Otsuka GEN Res. Inst., Otsuka Pharm. Co., Ltd. and others


Full-malaria is a database of full-length cDNAs of parasites. 5'-end one-pass sequences of cDNA libraries produced from erythrocytic-stage malaria parasites and from tachyzoites of Toxoplasma parasites are mapped onto the genome sequences. It also contains 5'-end one-pass sequences from cysts of Echinococcus multilocularis.


MBGD is a database for comparative analysis of completely sequenced microbial genomes, the number of which is growing rapidly. The aim of MBGD is to facilitate comparative genomics from various points of view, such as ortholog identification, paralog clustering, motif analysis and gene order comparison. MBGD is now maintained at the National Institute for Basic Biology.


Bacillus subtilis ORF DB by JAFAN (Japan Functional Analysis Network of B. subtilis)


ATTED-II is a database for gene coexpression networks in Arabidopsis. The networks are constructed using publicly available microarray data.


COXPRESdb is a database for gene coexpression networks in human, mouse and rat. The networks are constructed using publicly available microarray data.
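How a coexpression network can be derived from expression data may be easier to see with a small example. The sketch below is only an illustration under stated assumptions: it takes Pearson correlation as the coexpression measure and links gene pairs above a fixed cutoff. The gene names, expression values, threshold and function names are all hypothetical, and this is not presented as the pipeline used by ATTED-II or COXPRESdb.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene_a, gene_b, r) for every gene pair with r >= threshold.

    `profiles` maps a gene name to its expression values across samples.
    The threshold is an arbitrary illustrative cutoff (an assumption).
    """
    edges = []
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene_a], profiles[gene_b])
        if r >= threshold:
            edges.append((gene_a, gene_b, r))
    return edges


# Toy expression profiles (hypothetical values): 3 genes x 5 samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}

edges = coexpression_edges(profiles)
for gene_a, gene_b, r in edges:
    print(f"{gene_a} -- {gene_b}  r={r:.3f}")
```

With these toy values only geneA and geneB are linked, since geneC correlates negatively with both. Real microarray compendia contain many thousands of genes and samples, so a practical implementation would use vectorized numerical routines rather than pairwise loops.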



The University of Tokyo The Institute of Medical Science

Copyright©2005-2017 Human Genome Center