Schema for Duke Affy Exon - Affymetrix Exon Array from ENCODE/Duke

Database: hg19
Primary Table: wgEncodeDukeAffyExonGlioblaSimpleSignalRep3V2
Data last updated: 2012-03-28
Big Bed File Download: /gbdb/hg19/bbi/wgEncodeDukeAffyExonGlioblaSimpleSignalRep3V2.bigBed
Item Count: 38,378
Format description: BED6 + exon count + constitutive exons

field	example	description
`chrom`	chr1	Chromosome (or contig, scaffold, etc.)
`chromStart`	166244884	Start position in chromosome
`chromEnd`	166245508	End position in chromosome
`name`	RP11-7G12.2	Name of item
`score`	581	Score from 0-1000. Capped number of reads
`strand`	-	+ or -
`signalValue`	5.8194	Measurement of expression value of the gene
`exonCount`	3	Number of exons used to estimate expression value
`constituitiveExons`	0	Number of constitutive exons used to estimate the expression value
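As an illustration of the field layout above, here is a minimal Python sketch that parses one tab-separated row into a typed record. The names `ExonArrayItem` and `parse_item` are hypothetical and are not part of any UCSC tool. The sketch assumes rows have been exported as tab-separated text in the field order listed above, and it accepts rows with only eight columns (no constitutive exon count), since the sample rows on this page show eight.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExonArrayItem:
    """One row of the table, in the field order listed above (hypothetical type)."""
    chrom: str
    chrom_start: int   # BED start coordinate (0-based)
    chrom_end: int     # BED end coordinate (exclusive)
    name: str
    score: int
    strand: str
    signal_value: float
    exon_count: int
    constitutive_exons: Optional[int]  # None when the row has only 8 columns


def parse_item(line: str) -> ExonArrayItem:
    """Parse one tab-separated row; raise ValueError on an unexpected column count."""
    f = line.rstrip("\n").split("\t")
    if len(f) not in (8, 9):
        raise ValueError(f"expected 8 or 9 fields, got {len(f)}")
    return ExonArrayItem(
        chrom=f[0],
        chrom_start=int(f[1]),
        chrom_end=int(f[2]),
        name=f[3],
        score=int(f[4]),
        strand=f[5],
        signal_value=float(f[6]),
        exon_count=int(f[7]),
        constitutive_exons=int(f[8]) if len(f) == 9 else None,
    )


# The example values from the field table above.
item = parse_item("chr1\t166244884\t166245508\tRP11-7G12.2\t581\t-\t5.8194\t3\t0")
print(item.name, item.chrom_end - item.chrom_start, item.signal_value)
```

Because BED intervals are half-open, `chrom_end - chrom_start` gives the item length in bases.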

Sample Rows

chrom	chromStart	chromEnd	name	score	strand	signalValue	exonCount
chr1	166244884	166245508	RP11-7G12.2	581	-	5.8194	3
chr1	166304143	166304911	RP11-479J7.2	365	+	3.3028	3
chr1	166445064	166459248	RP11-276E17.2	239	-	0.7809	3
chr1	166445406	166450798	FMO7P	356	+	3.1224	3
chr1	166535420	166549885	FMO8P	328	+	2.5789	9
chr1	166573168	166600610	FMO9P	275	+	1.5197	11
chr1	166635152	166651258	FMO10P	329	+	2.5897	11
chr1	166717522	166717642	RP11-54B9.2	308	+	2.1648	1
chr1	166745970	166761823	FMO11P	200	+	0	5
chr1	166765536	166766322	CNN2P10	578	+	5.786	4
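The scores in the sample rows can be checked against the score ranges given in the track description for this table: signal values between 4 and 9 are multiplied by 100, and values below 4 are scaled linearly into the 200-400 range. The Python sketch below is an illustration, not the pipeline's code. The exact low-range formula (200 + 50 × signalValue, rounded down) is an assumption inferred from the ten sample rows, and `max_signal`, the per-cell-type maximum used for the 900-1000 range, is a hypothetical parameter that the source does not specify.

```python
import math


def score_from_signal(signal: float, max_signal: float = 14.0) -> int:
    """Map a signalValue to a 200-1000 display score.

    Assumptions (not confirmed by the source):
    - results are rounded down to an integer;
    - below 4, the linear scaling is 200 + 50 * signal;
    - above 9, the signal is scaled linearly from 9..max_signal to 900..1000,
      where max_signal is a hypothetical per-cell-type maximum.
    """
    if signal < 4:
        return math.floor(200 + 50 * signal)
    if signal <= 9:
        return math.floor(signal * 100)
    fraction = min((signal - 9) / (max_signal - 9), 1.0)
    return math.floor(900 + 100 * fraction)


# (signalValue, score) pairs copied from the sample rows above.
SAMPLE_ROWS = [
    (5.8194, 581), (3.3028, 365), (0.7809, 239), (3.1224, 356), (2.5789, 328),
    (1.5197, 275), (2.5897, 329), (2.1648, 308), (0.0, 200), (5.786, 578),
]

mismatches = [(s, expected, score_from_signal(s))
              for s, expected in SAMPLE_ROWS
              if score_from_signal(s) != expected]
print("mismatches:", mismatches)
```

All ten sample rows fall below a signal value of 9, so they exercise only the first two branches; the branch for values above 9 remains untested against real data.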

Duke Affy Exon (wgEncodeDukeAffyExon) Track Description


Description

This track displays human tissue microarray data using Affymetrix Human Exon 1.0 ST expression arrays. This RNA expression track was produced as part of the ENCODE Project. The RNA was extracted from cells that were also analyzed by DNaseI hypersensitivity (Duke DNaseI HS), FAIRE (UNC FAIRE), and ChIP (UTA TFBS).

Display Conventions and Configuration

In contrast to the hg18 annotation, this track now displays exon array data that has been aggregated to the gene level for those probes that have been linked to genes. Probes not linked to genes are not included.

The display for this track shows gene probe location and signal value as grayscale-colored items where higher signal values correspond to darker-colored blocks. Items with scores between 900-1000 have signal values greater than 9 that have been linearly scaled for that particular cell type. Items scoring 400-900 have signal values between 4 and 9, and the signal is simply multiplied by 100 to get the score. Items with scores between 200-400 have signal values below 4 that have been linearly scaled to fit that score range.

The subtracks within this composite annotation track correspond to data from different cell types and tissues. The configuration options are shown at the top of the track description page, followed by a list of subtracks. To display only selected subtracks, uncheck the boxes next to the tracks you wish to hide. For information regarding specific microarray probes, turn on the Affy Exon Probes track, which can be found in the Expression track group. See Methods for a description as to how probe level data was processed to produce gene level annotations.

Metadata for a particular subtrack can be found by clicking the down arrow in the list of subtracks.

Data from these tracks are stored as bed files whose first six fields follow the bed file standard. The three additional fields are as follows:

signalValue: The normalized expression value for a gene, calculated as described below.
exonCount: The number of exons used in the calculation of the expression value.
constitutiveExons: The number of constitutive exons used in the calculation of the expression value.

Methods

Cells were grown according to the approved ENCODE cell culture protocols. Total RNA was isolated from these cells using TRIzol extraction followed by cleanup on an RNeasy column (Qiagen) that included a DNaseI step. The RNA was checked for quality using a NanoDrop and an Agilent Bioanalyzer. RNA (1 µg) deemed to be of good quality was then processed either by 1) the standard Affymetrix Whole transcript Sense Target labeling protocol that included a riboreduction step, or 2) the NuGEN labeling system. The fragmented biotin-labeled cDNA was hybridized over 16 h to Affymetrix Exon 1.0 ST arrays and scanned on an Affymetrix Scanner 3000 7G using AGCC software.

Data from all replicates were then normalized together. Probesets flagged as cross-hybridizing were removed from the analysis (Salomonis et al. 2010). Though these arrays provide exon-level resolution, gene-level expression was estimated by grouping probesets by gene for normalization (Bemmo et al. 2008). Probesets were assigned to genes based on the GENCODE v10 annotation (July 2011). An exon was classified as constitutive or non-constitutive based on whether it was present in all protein-coding transcripts. For genes with at least 4 constitutive probes, only constitutive probesets were used to estimate gene expression. For all other genes, including all non-protein-coding genes, all (non-cross-hybridizing) probesets that mapped to an expressed exon in any transcript of the gene were used.

Gene-level expression estimates were normalized using Affymetrix Power Tools (APT) (Lockstone 2011) with the chipstream command "rma-bg, med-norm, pm-gcbg, med-polish". This chipstream calls for an RMA normalization with gc-background correction using antigenomic background probes.
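
The probeset selection rule described above (drop cross-hybridizing probesets, then use only constitutive probesets when a gene has at least 4 of them) can be sketched in Python as follows. This is a minimal illustration, not the authors' code. The `Probeset` type and `select_probesets` function are hypothetical, the threshold is assumed to count constitutive probesets that survive cross-hybridization filtering, and the input is assumed to already contain only probesets mapped to an expressed exon of one gene.

```python
from dataclasses import dataclass
from typing import List

MIN_CONSTITUTIVE = 4  # threshold stated in the Methods text


@dataclass
class Probeset:
    """Hypothetical record for one probeset assigned to a gene."""
    probeset_id: str
    constitutive: bool       # targets an exon present in all protein-coding transcripts
    cross_hybridizing: bool  # flagged as cross-hybridizing


def select_probesets(probesets: List[Probeset]) -> List[Probeset]:
    """Return the probesets used to estimate expression for one gene."""
    # Cross-hybridizing probesets are always removed.
    usable = [p for p in probesets if not p.cross_hybridizing]
    constitutive = [p for p in usable if p.constitutive]
    # With enough constitutive probesets, use only those; otherwise use all usable ones.
    if len(constitutive) >= MIN_CONSTITUTIVE:
        return constitutive
    return usable


gene = [Probeset(f"ps{i}", constitutive=(i < 4), cross_hybridizing=(i == 5))
        for i in range(6)]
print([p.probeset_id for p in select_probesets(gene)])
```

In this example the gene has four constitutive probesets, so the two non-constitutive ones are excluded. For a non-protein-coding gene, no exon would be marked constitutive and the function falls through to the second branch.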

While the data was generated using the same microarray platform, two different experimental backgrounds were present due to a change in labeling reagents (Affymetrix vs. NuGEN; see Methods above). It was found that batch effects related to this change were causing array data to group by experimental protocol rather than cell type relatedness. We used an R script (ComBat) to correct for this batch effect (Johnson et al. 2007).

Verification

When biological replicates were available, data were verified by analyzing replicates displaying a Pearson correlation coefficient > 0.9.

Release Notes

This is release 3 of this track (April 2012). Several new cell types have been added. The name of cell line Astrocy was changed to NH-A.

Credits

RNA was extracted from each cell type by Greg Crawford's group at Duke University. RNA was purified and hybridized to Affymetrix Exon arrays by Sridar Chittur and Scott Tenenbaum at the University of Albany-SUNY. Data analyses were primarily performed by Nathan Sheffield (Duke University) with assistance from Melissa Cline (UCSC), Zhancheng Zhang (UNC Chapel Hill), and Darin London (Duke University).

Contact: Terry Furey

References

Bemmo A, Benovoy D, Kwan T, Gaffney DJ, Jensen RV, Majewski J. Gene expression and isoform variation analysis using Affymetrix Exon Arrays. BMC Genomics. 2008 Nov 7;9:529.

Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007 Jan;8(1):118-27.

Lockstone HE. Exon array data analysis using Affymetrix power tools and R statistical software. Brief Bioinform. 2011 Nov;12(6):634-44.

Salomonis N, Schlieve CR, Pereira L, Wahlquist C, Colas A, Zambon AC, Vranizan K, Spindler MJ, Pico AR, Cline MS et al. Alternative splicing regulates mouse embryonic stem cell pluripotency and differentiation. Proc Natl Acad Sci U S A. 2010 Jun 8;107(23):10514-9.

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.
