Schema for H-Inv 7.0 - H-Inv 7.0 Gene Predictions


| field | example | SQL type | info | description |
| bin |  | smallint(5) unsigned | range | Indexing field to speed chromosome range queries. |
| chrom | chr1 | varchar(255) | values | Reference sequence chromosome or scaffold |
| chromStart | 89295 | int(10) unsigned | range | Start position in chromosome |
| chromEnd | 745797 | int(10) unsigned | range | End position in chromosome |
| name |  | varchar(255) | values | Name of item |
| score | 761 | int(10) unsigned | range | Optional score, nominal range 0-1000 |
| strand |  | char(1) | values | + or - |
| thickStart | 238436 | int(10) unsigned | range | Start of where display should be thick (start codon) |
| thickEnd | 238558 | int(10) unsigned | range | End of where display should be thick (stop codon) |
| reserved |  | int(10) unsigned | range | Used as itemRgb as of 2004-11-22 |
| blockCount |  | int(10) unsigned | range | Number of blocks |
| blockSizes | 1109,1340,149,104,157,350, | longblob |  | Comma separated list of block sizes |
| chromStarts | 0,147320,149123,169722,1778... | longblob |  | Start positions relative to chromStart |
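The bin value can be derived from chromStart and chromEnd. The Python sketch below assumes that this table uses the standard UCSC Genome Browser binning scheme (five levels, smallest bins of 128 kb, each level eight times larger) and zero-based, half-open coordinates. The function name bin_from_range and the constant names are illustrative and are not part of the schema.

```python
# Sketch of the standard UCSC binning scheme (assumed to apply to this table).
# Offsets of the first bin at each level, from the finest level to the coarsest.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each coarser level is 2**3 = 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            # Both ends fall in the same bin at this level.
            return offset + start_bin
        # Otherwise move up to the next, coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is out of bounds for this scheme")


print(bin_from_range(565039, 566030))
```

Under these assumptions, bin_from_range(565039, 566030) gives 589, which matches the bin value shown in the sample rows for items in that region.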

| bin | chrom | chromStart | chromEnd | name | score | strand | thickStart | thickEnd | reserved | blockCount | blockSizes | chromStarts |
|  | chr1 | 89295 | 745797 |  | 761 |  | 238436 | 238558 |  |  | 1109,1340,149,104,157,350, | 0,147320,149123,169722,177801,656152, |
| 589 | chr1 | 565039 | 566030 |  | 739 |  | 565268 | 565615 |  |  | 991, |  |
| 589 | chr1 | 565110 | 566057 |  | 745 |  | 565268 | 565615 |  |  | 947, |  |
| 589 | chr1 | 566464 | 568060 |  | 723 |  | 567453 | 567995 |  |  | 1596, |  |
| 589 | chr1 | 568148 | 568747 |  | 675 |  | 568622 | 568729 |  |  | 599, |  |
| 589 | chr1 | 568149 | 568804 |  | 675 |  | 568622 | 568729 |  |  | 655, |  |
| 589 | chr1 | 568150 | 568843 |  | 675 |  | 568622 | 568729 |  |  | 693, |  |
| 589 | chr1 | 568156 | 568842 |  | 675 |  | 568753 | 568818 |  |  | 686, |  |
| 589 | chr1 | 568157 | 568814 |  | 675 |  | 568622 | 568729 |  |  | 657, |  |
| 589 | chr1 | 568164 | 568843 |  | 675 |  | 568622 | 568729 |  |  | 679, |  |
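To show how blockSizes and chromStarts combine with chromStart, the Python sketch below expands the first sample row into absolute block coordinates. The helper block_intervals is hypothetical. It assumes that the two longblob fields have already been decoded to text and that coordinates are zero-based and half-open.

```python
def parse_int_list(field: str) -> list[int]:
    """Parse a comma separated list such as '1109,1340,' (trailing comma allowed)."""
    return [int(item) for item in field.split(",") if item]


def block_intervals(chrom_start: int, block_sizes: str, chrom_starts: str):
    """Return absolute (start, end) pairs, one per block."""
    sizes = parse_int_list(block_sizes)
    offsets = parse_int_list(chrom_starts)
    if len(sizes) != len(offsets):
        raise ValueError("blockSizes and chromStarts differ in length")
    # Each offset is relative to chromStart, so add it back to get the position.
    return [(chrom_start + off, chrom_start + off + size)
            for off, size in zip(offsets, sizes)]


# Values taken from the first sample row.
blocks = block_intervals(
    89295,
    "1109,1340,149,104,157,350,",
    "0,147320,149123,169722,177801,656152,",
)
for start, end in blocks:
    print(start, end)
```

The first block starts at chromStart (89295) and the last block ends at chromEnd (745797), which is a quick consistency check for any row.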

Description

This track shows alignments of the full-length cDNAs that were used as the basis of the H-Invitational Gene Database (H-InvDB), version 7.0, the March 2010 update.

H-InvDB entries describe the following entities:

gene structures
functions
novel alternative splicing isoforms
non-coding functional RNAs
functional domains
sub-cellular localizations
metabolic pathways
predictions of protein 3D structure
mapping of SNPs and microsatellite repeat motifs in relation to orphan diseases
gene expression profiling
comparisons with the gene structures of mouse full-length cDNAs

Methods

To cluster redundant cDNAs and alternative splicing variants within the H-Inv cDNAs, a total of 41,118 H-Inv cDNAs were mapped to the human genome using the mapping pipeline developed by the Japan Biological Information Research Center (JBIRC). Of these, 40,140 cDNAs aligned to the genome under the stringent criteria of at least 95% identity and at least 90% length coverage. These 40,140 cDNAs were clustered into 20,190 loci, an average of 2.0 cDNAs per locus. The remaining 978 unmapped cDNAs were grouped by cDNA-based clustering, yielding 847 clusters. In total, 21,037 clusters (20,190 mapped and 847 unmapped) were identified and integrated into H-InvDB. H-Inv cluster IDs (e.g. HIX0000001) were assigned to these clusters. A representative sequence was selected from each cluster and used for further analyses and annotation.
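The counts in this paragraph can be cross-checked with a few lines of Python. The sketch below is not the JBIRC pipeline. The function passes_mapping_criteria and the constant names are hypothetical, and the thresholds are treated as inclusive lower bounds, which is an assumption about how "at least" was applied.

```python
MIN_IDENTITY = 0.95  # at least 95% identity
MIN_COVERAGE = 0.90  # at least 90% length coverage


def passes_mapping_criteria(identity: float, coverage: float) -> bool:
    """Return True if an alignment meets both thresholds (hypothetical helper)."""
    return identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE


# Counts quoted in the Methods text.
total_cdnas = 41_118
mapped_cdnas = 40_140
mapped_loci = 20_190
unmapped_clusters = 847

unmapped_cdnas = total_cdnas - mapped_cdnas        # cDNAs left for cDNA-based clustering
total_clusters = mapped_loci + unmapped_clusters   # clusters integrated into H-InvDB
cdnas_per_locus = mapped_cdnas / mapped_loci       # average over mapped loci

print(unmapped_cdnas, total_clusters, round(cdnas_per_locus, 1))
```

The derived values agree with the figures stated in the text: 978 unmapped cDNAs, 21,037 clusters, and about 2.0 cDNAs per locus.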

The construction of H-InvDB is described in full in the report by the H-Inv Consortium (see the References section).

Credits

The H-InvDB is hosted at the Biomedicinal Information Research Center (BIRC), National Institute of Advanced Industrial Science and Technology (AIST) in Japan. The human-curated annotations were produced during invitational annotation meetings held in Japan during the summer of 2002, with a follow-up meeting in November 2004. Participants included 158 scientists representing 67 institutions from 12 countries.

The full-length cDNA clones and sequences were produced by the Chinese National Human Genome Center (CHGC), the Deutsches Krebsforschungszentrum (DKFZ/MIPS), Helix Research Institute, Inc. (HRI), the Institute of Medical Science in the University of Tokyo (IMSUT), the Kazusa DNA Research Institute (KDRI), the Mammalian Gene Collection (MGC/NIH) and the Full-Length Long Japan (FLJ) project.

References

Genome Information Integration Project And H-Invitational 2, Yamasaki C, Murakami K, Fujii Y, Sato Y, Harada E, Takeda J, Taniya T, Sakate R, Kikugawa S et al. The H-Invitational Database (H-InvDB), a comprehensive annotation resource for human genes and transcripts. Nucleic Acids Res. 2008 Jan;36(Database issue):D793-9. PMID: 18089548; PMC: PMC2238988

Imanishi T, Itoh T, Suzuki Y, O'Donovan C, Fukuchi S, Koyanagi KO, Barrero RA, Tamura T, Yamaguchi-Kabata Y, Tanino M et al. Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol. 2004 Jun;2(6):e162. PMID: 15103394; PMC: PMC393292

Yamasaki C, Murakami K, Takeda J, Sato Y, Noda A, Sakate R, Habara T, Nakaoka H, Todokoro F, Matsuya A et al. H-InvDB in 2009: extended database and data mining resources for human genes and transcripts. Nucleic Acids Res. 2010 Jan;38(Database issue):D626-32. PMID: 19933760; PMC: PMC2808976