Schema for Genome In a Bottle - Genome In a Bottle Structural Variants and Trios
Database: hg38
Primary Table: giabSv
Data last updated: 2020-11-13
Big Bed File Download: /gbdb/hg38/giab/structuralVariants/giabSv.bb
Item Count: 25,422
Format description: Genome In a Bottle Structural Variants (dbVar nstd175)
field | example | description
chrom | chr1 | Reference sequence chromosome or scaffold
chromStart | 166077871 | Start position in chromosome
chromEnd | 166077996 | End position in chromosome
name | nssv15764312 | Short Name of item
score | 0 | Score from 0-1000
strand | . | + or -
thickStart | 166077871 | Start of where display should be thick (start codon)
thickEnd | 166077996 | End of where display should be thick (stop codon)
reserved | 255,0,0 | Used as itemRgb as of 2004-11-22
size | 125 | Size of variant
varType | deletion | Type of structural variant
varRegion | nsv4439980 | Variant Region
dbVarUrl | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4439980 | dbVar Region Link
sample_name | Unknown | Sample Name
sampleset_name | HG002 | Sampleset Name
phenotype | not_reported | Phenotype
_mouseOver | Position: chr1:166077872-166077996, Size: 125, Variant Type: deletion, Phenotype: not_reported | mouseover string
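The field list above can be read as a BED9 record followed by eight extra columns. The sketch below parses one row and shows why the mouseover text reports chr1:166077872 while chromStart is 166077871: chromStart is 0-based, and the displayed position is 1-based. It assumes a tab-separated text row with the columns in the order listed above; the names FIELDS, parse_row and display_position are illustrative, not part of any UCSC tool.

```python
# Column order taken from the schema table; the helper names are hypothetical.
FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "reserved", "size", "varType", "varRegion",
    "dbVarUrl", "sample_name", "sampleset_name", "phenotype", "_mouseOver",
]
INT_FIELDS = ("chromStart", "chromEnd", "score", "thickStart", "thickEnd", "size")


def parse_row(line):
    """Split one tab-separated row into a dict keyed by field name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    row = dict(zip(FIELDS, values))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    return row


def display_position(row):
    """Convert the 0-based, half-open BED interval to a 1-based position string."""
    return f"{row['chrom']}:{row['chromStart'] + 1}-{row['chromEnd']}"


# Example values copied from the schema table above.
example_line = "\t".join([
    "chr1", "166077871", "166077996", "nssv15764312", "0", ".",
    "166077871", "166077996", "255,0,0", "125", "deletion", "nsv4439980",
    "https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4439980",
    "Unknown", "HG002", "not_reported",
    "Position: chr1:166077872-166077996, Size: 125, Variant Type: deletion, "
    "Phenotype: not_reported",
])
row = parse_row(example_line)
print(display_position(row))
```

For this example row, chromEnd minus chromStart equals the size field (125); whether that holds for every item in the table is not stated in the schema.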

Sample Rows
 
chrom | chromStart | chromEnd | name | score | strand | thickStart | thickEnd | reserved | size | varType | varRegion | dbVarUrl | sample_name | sampleset_name | phenotype | _mouseOver
chr1 | 166077871 | 166077996 | nssv15764312 | 0 | . | 166077871 | 166077996 | 255,0,0 | 125 | deletion | nsv4439980 | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4439980 | Unknown | HG002 | not_reported | Position: chr1:166077872-166077996, Size: 125, Variant Type: deletion, Phenotype: not_reported
chr1 | 166077871 | 166077996 | nsv4439980 | 0 | . | 166077871 | 166077996 | 128,128,128 | 125 | copy_number_variation |  | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4439980 |  |  |  | Position: chr1:166077872-166077996, Size: 125, Variant Type: copy_number_variation
chr1 | 166463980 | 166464039 | nssv15767667 | 0 | . | 166463980 | 166464039 | 255,0,0 | 59 | deletion | nsv4439981 | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4439981 | Unknown | HG002 | not_reported | Position: chr1:166463981-166464039, Size: 59, Variant Type: deletion, Phenotype: not_reported
chr1 | 166463980 | 166464039 | nsv4439981 | 0 | . | 166463980 | 166464039 | 128,128,128 | 59 | copy_number_variation |  | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4439981 |  |  |  | Position: chr1:166463981-166464039, Size: 59, Variant Type: copy_number_variation
chr1 | 166677436 | 166677437 | nssv15757028 | 0 | . | 166677436 | 166677437 | 0,0,255 | 1 | insertion | nsv4448384 | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4448384 | Unknown | HG002 | not_reported | Position: chr1:166677437-166677437, Size: 1, Variant Type: insertion, Phenotype: not_reported
chr1 | 166677436 | 166677437 | nsv4448384 | 0 | . | 166677436 | 166677437 | 0,0,255 | 1 | insertion |  | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4448384 |  |  |  | Position: chr1:166677437-166677437, Size: 1, Variant Type: insertion
chr1 | 166740088 | 166740089 | nssv15761485 | 0 | . | 166740088 | 166740089 | 0,0,255 | 1 | insertion | nsv4448385 | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4448385 | Unknown | HG002 | not_reported | Position: chr1:166740089-166740089, Size: 1, Variant Type: insertion, Phenotype: not_reported
chr1 | 166740088 | 166740089 | nsv4448385 | 0 | . | 166740088 | 166740089 | 0,0,255 | 1 | insertion |  | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4448385 |  |  |  | Position: chr1:166740089-166740089, Size: 1, Variant Type: insertion
chr1 | 167205988 | 167205989 | nssv15761173 | 0 | . | 167205988 | 167205989 | 0,0,255 | 1 | insertion | nsv4448386 | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4448386 | Unknown | HG002 | not_reported | Position: chr1:167205989-167205989, Size: 1, Variant Type: insertion, Phenotype: not_reported
chr1 | 167205988 | 167205989 | nsv4448386 | 0 | . | 167205988 | 167205989 | 0,0,255 | 1 | insertion |  | https://www.ncbi.nlm.nih.gov/dbvar/variants/nsv4448386 |  |  |  | Position: chr1:167205989-167205989, Size: 1, Variant Type: insertion
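In the sample rows, each variant call (name starting with nssv) carries the accession of its variant region in varRegion, while each region row (name starting with nsv) leaves varRegion empty. Assuming that pattern holds beyond these sample rows, which the schema does not state, calls can be grouped under their region as sketched below. The function name calls_by_region is hypothetical, and the (name, varRegion) pairs are copied from the first six sample rows.

```python
from collections import defaultdict

# (name, varRegion) pairs taken from the first six sample rows.
sample_pairs = [
    ("nssv15764312", "nsv4439980"),
    ("nsv4439980", ""),
    ("nssv15767667", "nsv4439981"),
    ("nsv4439981", ""),
    ("nssv15757028", "nsv4448384"),
    ("nsv4448384", ""),
]


def calls_by_region(pairs):
    """Group call names under the region accession they point to.

    Assumption: a row with an empty varRegion is itself a region row.
    """
    grouped = defaultdict(list)
    for name, var_region in pairs:
        if var_region:
            grouped[var_region].append(name)
        else:
            # Make sure a region appears even if no call references it.
            grouped.setdefault(name, [])
    return dict(grouped)


regions = calls_by_region(sample_pairs)
for region, calls in regions.items():
    print(region, calls)
```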

Genome In a Bottle (giab) Track Description
 

Description

The tracks listed here contain data from The Genome in a Bottle Consortium (GIAB), an open, public consortium hosted by NIST. The priority of GIAB is to develop reference standards, reference methods, and reference data by authoritative characterization of human genomes for use in benchmarking, including analytical validation and technology development that will support translation of whole human genome sequencing to clinical practice. The sole purpose of this work is to provide validated variants and regions to enable technology and bioinformatics developers to benchmark and optimize their detection methods.

The Ashkenazim and the Chinese Trio tracks show benchmark SNV calls from two son/father/mother trios of Ashkenazi Jewish and Han Chinese ancestry from the Personal Genome Project, consented for commercial redistribution.

The Genome In a Bottle Structural Variants track shows benchmark SV calls (nssv) and variant regions (nsv) (5,262 insertions and 4,095 deletions, > 50 bp, in 2.51 Gb of the genome) from the son (HG002/NA24385) of the Ashkenazi Jewish trio.

Samples are disseminated as National Institute of Standards and Technology (NIST) Reference Materials.

Display Conventions and Configuration

These tracks are multi-view composite tracks that contain multiple data types (views). Each view within a track has separate display controls, as described here.

Unlike a regular genome browser track, the Ashkenazim and the Chinese Trio tracks display the genome variants of each individual as two haplotypes. SNPs and small insertions and deletions are mapped to each haplotype based on the phasing information in the VCF file. Haplotype 1 and haplotype 2 are displayed as two separate black lanes across the browser window region, and each variant is drawn as a vertical dash. Homozygous variants appear as identical dashes on both haplotype lanes. Phased heterozygous variants are placed on one of the haplotype lanes, and unphased heterozygous variants are displayed in the area between the two lanes.
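The lane placement rules described above can be sketched as a small function over VCF GT strings. This is an illustration of the described behavior, not the browser's own drawing code. The function name and lane labels are hypothetical, and the mapping of the first phased allele to haplotype 1 is an assumption.

```python
from typing import List


def lanes_for_genotype(gt: str) -> List[str]:
    """Return the display lanes for a diploid VCF genotype such as '0|1' or '1/1'."""
    phased = "|" in gt
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) != 2:
        raise ValueError(f"expected a diploid genotype, got {gt!r}")
    first, second = alleles
    # "0" is the reference allele and "." is a missing call; neither is drawn.
    first_alt = first not in ("0", ".")
    second_alt = second not in ("0", ".")
    if not (first_alt or second_alt):
        return []
    # Homozygous variant: identical dashes on both lanes.
    if first_alt and first == second:
        return ["haplotype 1", "haplotype 2"]
    # Unphased heterozygous variant: drawn between the two lanes.
    if not phased:
        return ["between"]
    # Phased variant: drawn on the lane(s) carrying a non-reference allele.
    # Assumption: the first allele corresponds to haplotype 1.
    lanes = []
    if first_alt:
        lanes.append("haplotype 1")
    if second_alt:
        lanes.append("haplotype 2")
    return lanes


for gt in ("1|1", "0|1", "1|0", "0/1", "0/0"):
    print(gt, lanes_for_genotype(gt))
```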

Predicted de novo variants and variants that are inconsistent with phasing in the trio son can be colored in red using the track Configuration options.

Data Access

The raw data can be explored interactively with the Table Browser or the Data Integrator. For automated analysis, the data may be queried from our REST API.
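As a rough sketch of an automated query, the snippet below only builds a request URL and makes no network call. It assumes the REST API is served at api.genome.ucsc.edu with a getData/track endpoint taking genome, track, chrom, start and end parameters, and that the track name matches the primary table name giabSv. Both points should be checked against the REST API documentation before use. The function name track_query_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL and endpoint; verify against the REST API documentation.
API_BASE = "https://api.genome.ucsc.edu"


def track_query_url(genome, track, chrom, start, end):
    """Build a URL requesting track items in a 0-based, half-open interval."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    params = {
        "genome": genome,
        "track": track,
        "chrom": chrom,
        "start": start,
        "end": end,
    }
    return f"{API_BASE}/getData/track?{urlencode(params)}"


# Interval of the first sample row; "giabSv" as the track name is an assumption.
url = track_query_url("hg38", "giabSv", "chr1", 166077871, 166077996)
print(url)
```

The resulting URL can then be fetched with any HTTP client; the response layout is not described on this page, so it is left out of the sketch.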

Benchmark VCF and BED files for small variants are available for GRCh37 and GRCh38 under each genome at the NCBI FTP site. Structural variants are available for GRCh37 at dbVar nstd175.

References

Zook JM, McDaniel J, Olson ND, Wagner J, Parikh H, Heaton H, Irvine SA, Trigg L, Truty R, McLean CY et al. An open resource for accurately benchmarking small variant and reference calls. Nat Biotechnol. 2019 May;37(5):561-566. PMID: 30936564; PMC: PMC6500473

Zook JM, Hansen NF, Olson ND, Chapman L, Mullikin JC, Xiao C, Sherry S, Koren S, Phillippy AM, Boutros PC et al. A robust benchmark for detection of germline large deletions and insertions. Nat Biotechnol. 2020 Jun 15. PMID: 32541955