Schema for GTEx Transcript - Transcript Expression in 53 tissues from GTEx RNA-seq of 8555 samples/570 donors
  Database: hg19    Primary Table: gtexTranscExpr    Data last updated: 2021-10-05
Big Bed File Download: /gbdb/hg19/gtex/gtexTranscExpr.bb
Item Count: 176,219
The data is stored in the binary BigBed format.

Format description: BED6+5 with additional fields for category count and median values, and sample matrix fields
field | example | description
chrom | chr1 | Reference sequence chromosome or scaffold
chromStart | 166304120 | Start position in chromosome
chromEnd | 166304999 | End position in chromosome
name | ENST00000425271 | Name or ID of item, ideally both human readable and unique
score | 555 | Score from 0-1000, typically derived from total of median value from all categories
strand | + | + or - for strand. Use . if not applicable
name2 | RP11-479J7.2 | Alternative name for item
expCount | 53 | Number of categories
expScores | 0.13,0,0,0,0,0,0,1.3,1.5,2.4,1.4,1.2,1.8,1.8,1.7,0.77,5.8,1.6,0.13,0.21,0.12,0,0,0,0.19,0.25,0,0,0,0,0,0.35,0.52,0,0,0,0,0.089,0.14,0,0,0.35,0,0.098,0.16,0,0,0,0.54,0,0.12,0,0 | Comma separated list of category values
_dataOffset | 2277699797 | Offset of sample data in data matrix file, for boxplot on details page
_dataLen | 41826 | Length of sample data row in data matrix file
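
The field list above maps naturally onto a typed record. The following is a minimal sketch, assuming the bigBed has first been exported to tab-separated text (for example with the UCSC bigBedToBed utility). The names BarChartBedRow, parse_row and read_sample_row are hypothetical helpers, not part of any UCSC tool, and the internal layout of a matrix row is not described here, so the sketch returns it as raw bytes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BarChartBedRow:
    """One BED6+5 record, with fields in the order listed in the schema."""
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    name2: str
    exp_count: int
    exp_scores: List[float]
    data_offset: int
    data_len: int


def parse_row(line: str) -> BarChartBedRow:
    """Parse one tab-separated BED6+5 line and check expCount against expScores."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    # Tolerate a trailing comma at the end of the comma-separated list.
    scores = [float(x) for x in fields[8].rstrip(",").split(",")]
    row = BarChartBedRow(
        chrom=fields[0],
        chrom_start=int(fields[1]),
        chrom_end=int(fields[2]),
        name=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        name2=fields[6],
        exp_count=int(fields[7]),
        exp_scores=scores,
        data_offset=int(fields[9]),
        data_len=int(fields[10]),
    )
    if row.exp_count != len(row.exp_scores):
        raise ValueError("expCount does not match the number of expScores values")
    return row


def read_sample_row(matrix_path: str, row: BarChartBedRow) -> bytes:
    """Read the raw sample-data row that _dataOffset/_dataLen point at."""
    with open(matrix_path, "rb") as fh:
        fh.seek(row.data_offset)
        return fh.read(row.data_len)
```

Validating expCount against the parsed list catches truncated expScores values early, which matters because display tools often abbreviate that field.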

Sample Rows
 
chrom | chromStart | chromEnd | name | score | strand | name2 | expCount | expScores | _dataOffset | _dataLen
chr1 | 166304120 | 166304999 | ENST00000425271 | 555 | + | RP11-479J7.2 | 53 | 0.13,0,0,0,0,0,0,1.3,1.5,2.4,1.4,1.2,1.8,1.8,1.7,0.77,5.8,1.6,0.13,0.21,0.12,0,0,0,0.19,0.25,0,0,0,0,0,0.35,0.52,0,0,0,0,0.089,0 ... | 2277699797 | 41826
chr1 | 166356963 | 166421869 | ENST00000448643 | 444 | - | RP11-479J7.1 | 53 | 0.3,0.24,0.19,0.44,0.42,0.41,0.29,0.084,0.1,0.1,0.73,0.91,0.14,0.12,0.093,0.11,0.14,0.077,0.11,0.072,0.34,0.16,0.13,0.28,0.38,0. ... | 2855705354 | 67333
chr1 | 166445008 | 166459276 | ENST00000426519 | 111 | - | RP11-276E17.2 | 53 | 0,0,0,0,0,0,0.079,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.42,0,0,0,0,0,0,0,0,0.084,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.055,0.33,0,0.26,0,0,0,0 | 2308179179 | 28887
chr1 | 166808683 | 166818445 | ENST00000449930 | 444 | + | POGK | 53 | 0.21,0.23,0.17,0.35,0.33,0.37,0.41,0.27,0.17,0.23,0.3,0.28,0.21,0.22,0.31,0.23,0.18,0.21,0.57,0.34,0.19,0.98,1.4,0.32,0.29,0.2,0 ... | 2889396444 | 60805
chr1 | 166808693 | 166825581 | ENST00000367876 | 888 | + | POGK | 53 | 2.9,2.7,1.8,4.8,4.2,5.1,5.5,5.3,5.3,5.1,6.3,5,5.5,7.2,5.6,5.8,4.8,5.3,10,7.2,3,4.6,5.4,5,5,3.5,2.4,3.6,3.1,3.8,3.8,1.5,0.95,1.5, ... | 1088098306 | 62502
chr1 | 166809325 | 166823710 | ENST00000367875 | 999 | + | POGK | 53 | 6,6.1,4.1,8.9,8.1,6.4,10,5.1,4,4.5,9.3,10,5.4,4.6,5.2,4.8,4.1,4.5,8.6,6,5.9,7.9,6.7,6.3,8.6,6.7,6.3,7.7,7.5,8,7.8,3.3,2.7,3.8,2. ... | 1088036180 | 62125
chr1 | 166825746 | 166832453 | ENST00000467021 | 666 | - | TADA1 | 53 | 1.3,0.86,0.78,1.3,1.1,1.2,1.3,0.32,0.28,0.3,1.2,1.3,0.52,0.39,0.29,0.35,0.35,0.27,0.55,0.39,1.5,0.94,0.51,1.1,1.6,1.2,1.2,1.2,0. ... | 3414741626 | 65379
chr1 | 166825746 | 166845564 | ENST00000367874 | 999 | - | TADA1 | 53 | 4.2,3.6,6.5,4.8,4.9,4.9,7.2,4.5,6.7,5.2,14,10,7.2,9.3,5.1,6.1,6.3,4,5.4,4.7,4.9,12,5.2,7,8.1,6.1,5,5.4,5.7,5.6,7.7,2.9,2.3,4.1,6 ... | 1087974023 | 62156
chr1 | 166864947 | 166877317 | ENST00000414590 | 666 | - | ILDR2 | 53 | 0,0.094,0,0,0,0,0.1,3.5,5,0.93,2,1.7,5.3,5.7,3,2.9,1.3,0.53,0.38,0.56,0,0,0,0,0.11,0.35,0.093,0.23,0.081,0.29,0,0.21,0.072,0.16, ... | 2008252848 | 41584
chr1 | 166877300 | 166887691 | ENST00000614979 | 111 | - | ILDR2 | 53 | 0,0,0,0,0,0,0.17,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.21,0,0.0044,0,0.071,0.039,0,0,0.1,0,0.084,0.0086,0,0,0,0,0,0.083,0,0,0.064,0,0,0, ... | 8309674601 | 37286

GTEx Transcript (gtexTranscExpr) Track Description
 

Description

The NIH Genotype-Tissue Expression (GTEx) project was created to establish a sample and data resource for studies on the relationship between genetic variation and gene expression in multiple human tissues. This track displays median transcript expression levels in 53 tissues, based on RNA-seq data from the GTEx midpoint milestone data release (V6, October 2015). To view the GTEx tissues in anatomical context, see the GTEx Body Map.

Data for this track were computed at UCSC from GTEx RNA-seq sequence data using the Toil pipeline running the kallisto transcript-level quantification tool.

Display Conventions

In Full and Pack display modes, expression for each transcript is represented by a colored bar chart, where the height of each bar represents the median expression level across all samples for a tissue, and the bar color indicates the tissue.

The bar chart display has the same width and tissue order for all transcripts. Hovering the mouse over a bar shows the tissue and median expression level. The Squish display mode draws a rectangle for each transcript, colored to indicate the tissue with the highest expression level if that tissue contributes more than 10% of the overall expression (and colored black if no tissue predominates). In Dense mode, the darkness of the grayscale rectangle displayed for the transcript reflects the total median expression level across all tissues.
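
The Squish coloring rule can be illustrated with a short sketch. This is not the Genome Browser's own drawing code: the function name predominant_tissue and the min_fraction parameter are hypothetical, and the sketch assumes "overall expression" means the sum of the per-tissue values in expScores.

```python
from typing import Optional, Sequence


def predominant_tissue(exp_scores: Sequence[float],
                       min_fraction: float = 0.10) -> Optional[int]:
    """Return the index of the top tissue, or None when no tissue predominates.

    A None result corresponds to the black rectangle described for Squish mode.
    """
    total = sum(exp_scores)
    if total <= 0:
        return None  # no expression at all, so nothing can predominate
    top = max(range(len(exp_scores)), key=lambda i: exp_scores[i])
    # The top tissue must contribute strictly more than min_fraction of the total.
    if exp_scores[top] / total > min_fraction:
        return top
    return None
```

The returned index refers to a position in the shared tissue order, so the caller would look up the tissue color from that position.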

Click-through on a graph displays a boxplot of expression level quartiles with outliers, per tissue.
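
As an illustration of what such a boxplot summarizes, the sketch below computes quartiles and flags outliers for the samples of one tissue. The outlier rule used here (values more than 1.5 times the interquartile range beyond the quartiles) and the "inclusive" quantile method are assumptions, as this description does not state how the details page defines them. The function name boxplot_stats is hypothetical.

```python
import statistics
from typing import Dict, List, Sequence, Union


def boxplot_stats(values: Sequence[float]) -> Dict[str, Union[float, List[float]]]:
    """Quartiles plus outliers for one tissue's sample values (needs >= 2 values)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr    # assumed outlier rule, see lead-in
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}
```

Running this once per tissue, on the sample values belonging to that tissue, would give one box per tissue.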

Methods

Tissue samples were obtained using the GTEx standard operating procedures for informed consent and tissue collection, in conjunction with the National Cancer Institute Biorepositories and Biospecimen Research Branch. All tissue specimens were reviewed by pathologists to characterize and verify organ source. Images from stained tissue samples can be viewed via the NCI histopathology viewer. The Qiagen PAXgene non-formalin tissue preservation product was used to stabilize tissue specimens without cross-linking biomolecules.

RNA-seq was performed by the GTEx Laboratory, Data Analysis and Coordinating Center (LDACC) at the Broad Institute. The Illumina TruSeq protocol was used to create an unstranded polyA+ library sequenced on the Illumina HiSeq 2000 platform to produce 76-bp paired-end reads at a depth averaging 50M aligned reads per sample.

Sequence reads for this track were quantified against the hg38/GRCh38 human genome using kallisto, assisted by the GENCODE v23 transcriptome definition. Read quantification was performed at UCSC by the Computational Genomics lab, using the Toil pipeline. The resulting kallisto files were combined to generate a transcripts per million (TPM) expression matrix using the UCSC tool, kallistoToMatrix. Average TPM expression values for each tissue were calculated and used to generate a bed6+5 file that is the base of the track. This was done using the UCSC tool, expMatrixToBarchartBed. The bed track was then converted to a bigBed file using the UCSC tool, bedToBigBed.
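
The per-tissue summary step can be sketched as follows. This is an illustration of the idea only, not the implementation of expMatrixToBarchartBed: the function names, the dictionary-based inputs, and the choice to report 0 for a tissue with no samples are all assumptions. The summary function is a parameter so that either a mean or a median can be used.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict, List, Sequence


def tissue_summaries(sample_tpm: Dict[str, float],
                     sample_tissue: Dict[str, str],
                     tissue_order: Sequence[str],
                     summarize: Callable[[List[float]], float] = mean) -> List[float]:
    """Collapse one transcript's per-sample TPM values into one value per tissue."""
    by_tissue: Dict[str, List[float]] = defaultdict(list)
    for sample, tpm in sample_tpm.items():
        by_tissue[sample_tissue[sample]].append(tpm)
    # Keep a fixed tissue order so every transcript's list lines up the same way.
    return [summarize(by_tissue[t]) if by_tissue[t] else 0.0 for t in tissue_order]


def format_exp_scores(values: Sequence[float]) -> str:
    """Render the summaries as a comma-separated expScores-style string."""
    return ",".join(f"{v:g}" for v in values)
```

For example, two liver samples at 1.0 and 3.0 TPM and one lung sample at 10.0 TPM, with the tissue order liver, lung, brain, would give the values 2, 10 and 0.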

The data in the hg19/GRCh37 version of this track was generated by converting the coordinates from the hg38/GRCh38 track data. Of the 189,615 BED entries from the original hg38 track, 176,220 were mapped over by transcript name to hg19 using wgEncodeGencodeCompV24lift37 (~93% coverage).
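
Mapping by transcript name amounts to a join between the hg38 entries and a table of hg19 coordinates, with unmatched entries dropped. The sketch below shows that idea under stated assumptions: the dictionary layouts and the function name lift_by_name are hypothetical, and the actual UCSC procedure may differ in how names and versions are matched. The last lines reproduce the coverage figure from the counts quoted above.

```python
from typing import Dict, List, Tuple

Coords = Tuple[str, int, int, str]  # chrom, chromStart, chromEnd, strand


def lift_by_name(hg38_rows: List[dict],
                 hg19_coords: Dict[str, Coords]) -> List[dict]:
    """Replace hg38 coordinates with hg19 ones, dropping names with no hg19 match."""
    lifted = []
    for row in hg38_rows:
        coords = hg19_coords.get(row["name"])
        if coords is None:
            continue  # transcript not present in the hg19 annotation
        chrom, start, end, strand = coords
        # Expression fields are carried over unchanged; only coordinates change.
        lifted.append({**row, "chrom": chrom, "chromStart": start,
                       "chromEnd": end, "strand": strand})
    return lifted


# Coverage from the counts quoted in the text.
mapped, original = 176_220, 189_615
coverage_percent = round(100 * mapped / original, 1)
```

With the quoted counts, coverage_percent comes to 92.9, consistent with the stated figure of roughly 93%.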

Subject and Sample Characteristics

The scientific goal of the GTEx project required that the donors and their biospecimens present with no evidence of disease. The tissue types collected were chosen based on their clinical significance, logistical feasibility and their relevance to the scientific goal of the project and the research community. Postmortem samples were collected from non-diseased donors with ages ranging from 20 to 79. Of the donors, 34.4% were female and 65.6% were male.

Additional summary plots of GTEx sample characteristics are available at the GTEx Portal Tissue Summary page.

Credits

Samples were collected by the GTEx Consortium. RNA-seq was performed by the GTEx Laboratory, Data Analysis and Coordinating Center (LDACC) at the Broad Institute. John Vivian, Melissa Cline, and Benedict Paten of the UCSC Computational Genomics lab were responsible for the sequence read quantification used to produce this track. Kate Rosenbloom and Chris Eisenhart of the UCSC Genome Browser group were responsible for data file post-processing and track configuration.

References

Vivian J et al. Rapid and efficient analysis of 20,000 RNA-seq samples with Toil. bioRxiv, vol. 2, p. 62497, 2016.

GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat Genet. 2013 Jun;45(6):580-5. PMID: 23715323; PMC: PMC4010069

Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS, Peter-Demchok J, Gelfand ET et al. A Novel Approach to High-Quality Postmortem Tissue Procurement: The GTEx Project. Biopreserv Biobank. 2015 Oct;13(5):311-9. PMID: 26484571; PMC: PMC4675181

Melé M, Ferreira PG, Reverter F, DeLuca DS, Monlong J, Sammeth M, Young TR, Goldmann JM, Pervouchine DD, Sullivan TJ et al. Human genomics. The human transcriptome across tissues and individuals. Science. 2015 May 8;348(6235):660-5. PMID: 25954002; PMC: PMC4547472

DeLuca DS, Levin JZ, Sivachenko A, Fennell T, Nazaire MD, Williams C, Reich M, Winckler W, Getz G. RNA-SeQC: RNA-seq metrics for quality control and process optimization. Bioinformatics. 2012 Jun 1;28(11):1530-2. PMID: 22539670; PMC: PMC3356847