Schema for TF ChIP - Transcription Factor ChIP-seq Peaks (340 factors in 129 cell types) from ENCODE 3

Database: hg38
Primary Table: encTfChipPkENCFF105PFS
Row Count: 31,488
Data last updated: 2019-04-01
Format description: BED6+4. Peaks of signal enrichment based on pooled, normalized (interpreted) data.
On download server: MariaDB table dump directory

field	example	SQL type	info	description
`bin`	589	`smallint(5) unsigned`	range	Indexing field to speed chromosome range queries.
`chrom`	chr1	`varchar(255)`	values	Reference sequence chromosome or scaffold
`chromStart`	633910	`int(10) unsigned`	range	Start position in chromosome
`chromEnd`	634129	`int(10) unsigned`	range	End position in chromosome
`name`	.	`varchar(255)`	values	Name given to a region (preferably unique). Use . if no name is assigned
`score`	141	`int(10) unsigned`	range	Indicates how dark the peak will be displayed in the browser (0-1000)
`strand`	.	`char(1)`	values	+ or - or . for unknown
`signalValue`	235.811	`float`	range	Measurement of average enrichment for the region
`pValue`	-1	`float`	range	Statistical significance of signal value (-log10). Set to -1 if not used.
`qValue`	4.40176	`float`	range	Statistical significance with multiple-test correction applied (FDR -log10). Set to -1 if not used.
`peak`	107	`int(11)`	range	Point-source called for this peak; 0-based offset from chromStart. Set to -1 if no point-source called.
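
The `bin` value in the example row (589 for chr1:633910-634129) can be reproduced with the standard UCSC binning scheme. The following is a minimal Python sketch of that scheme, assuming the standard (non-extended) bin levels, which cover coordinates below 512 Mb. The function and constant names are illustrative and are not part of this schema.

```python
# Offsets of each bin level, from the finest (128 kb bins) to the coarsest (512 Mb bin).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
BIN_FIRST_SHIFT = 17  # 2**17 = 128 kb, the size of the finest bins
BIN_NEXT_SHIFT = 3    # each coarser level holds 8 bins of the level below


def bin_from_range(chrom_start, chrom_end):
    """Return the smallest bin that fully contains [chrom_start, chrom_end)."""
    start_bin = chrom_start >> BIN_FIRST_SHIFT
    end_bin = (chrom_end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range is outside the standard binning scheme")


print(bin_from_range(633910, 634129))
```

A range query can then restrict itself to the handful of bins that may overlap the region of interest rather than scanning every row for a chromosome.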

Sample Rows

bin	chrom	chromStart	chromEnd	name	score	strand	signalValue	pValue	qValue	peak
589	chr1	633910	634129	.	141	.	235.811	-1	4.40176	107
590	chr1	778598	778998	.	44	.	73.6754	-1	3.56315	200
591	chr1	827161	827479	.	159	.	266.095	-1	4.40176	171
591	chr1	861729	862129	.	30	.	50.6458	-1	2.81658	200
592	chr1	940140	940298	.	49	.	83.0633	-1	3.61446	97
592	chr1	1000019	1000399	.	634	.	1060.04	-1	4.40176	189
593	chr1	1064055	1064455	.	25	.	42.19	-1	2.41961	200
593	chr1	1065565	1065965	.	34	.	56.9249	-1	2.96346	200
593	chr1	1068848	1069052	.	57	.	96.613	-1	4.40176	102
593	chr1	1068986	1069386	.	16	.	27.9523	-1	1.3194	200
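
As an illustration of how the fields fit together, the sketch below parses one tab-separated row into a small record, derives the absolute summit position from `chromStart` and `peak`, and converts the -log10 `qValue` back to a linear FDR. The `Peak` class, `parse_peak` function, and attribute names are hypothetical helpers, not part of any UCSC or ENCODE library.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    bin_index: int
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float
    q_value: float
    peak: int

    def summit(self):
        """Absolute 0-based summit position, or None if no point-source was called."""
        return None if self.peak == -1 else self.chrom_start + self.peak

    def fdr(self):
        """Linear-scale q-value, or None if qValue holds the -1 'not used' sentinel."""
        return None if self.q_value == -1 else 10 ** (-self.q_value)


def parse_peak(line):
    """Parse one tab-separated row laid out as in the schema above."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 11:
        raise ValueError(f"expected 11 fields, got {len(f)}")
    return Peak(int(f[0]), f[1], int(f[2]), int(f[3]), f[4], int(f[5]),
                f[6], float(f[7]), float(f[8]), float(f[9]), int(f[10]))


row = parse_peak("589\tchr1\t633910\t634129\t.\t141\t.\t235.811\t-1\t4.40176\t107")
print(row.chrom, row.summit(), row.fdr())
```

For the first sample row, the summit falls at 633910 + 107 = 634017 (0-based), and `pValue` is -1, so only the q-value carries significance information.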

Note: all start coordinates in our database are 0-based, not 1-based. See explanation here.
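
The sketch below shows one way to convert a table row to the 1-based position string used in the browser's position box. It assumes the usual BED convention that intervals are half-open (0-based start, exclusive end), so only the start is shifted. The function names are illustrative.

```python
def to_browser_position(chrom, chrom_start, chrom_end):
    """Convert a 0-based, half-open interval to a 1-based, closed position string."""
    return f"{chrom}:{chrom_start + 1}-{chrom_end}"


def region_length(chrom_start, chrom_end):
    """Length in bases of a 0-based, half-open interval."""
    return chrom_end - chrom_start


print(to_browser_position("chr1", 633910, 634129))
print(region_length(633910, 634129))
```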

TF ChIP (encTfChipPk) Track Description


Description

This track represents a comprehensive set of human transcription factor binding sites based on ChIP-seq experiments generated by production groups in the ENCODE Consortium between February 2011 and November 2018.

Transcription factors (TFs) are proteins that bind to DNA and interact with RNA polymerases to regulate gene expression. Some TFs contain a DNA binding domain and can bind directly to specific short DNA sequences ('motifs'); others bind to DNA indirectly through interactions with TFs containing a DNA binding domain. High-throughput antibody capture and sequencing methods (e.g. chromatin immunoprecipitation followed by sequencing, or 'ChIP-seq') can be used to identify regions of TF binding genome-wide. These regions are commonly called ChIP-seq peaks.

The related Transcription Factor ChIP-seq Clusters tracks (hg19, hg38) provide summary views of this data.

Display and File Conventions and Configuration

The display for this track shows site location, with the point-source of the peak marked with a colored vertical bar and the level of enrichment at the site indicated by the darkness of the item.

The subtracks are colored by UCSC ENCODE 2 cell type color conventions on the hg19 assembly, and by similarity of cell types in DNaseI hypersensitivity assays (as in the DNase Signal track) in the hg38 assembly.

The display can be filtered to higher valued items using the Score range: configuration item. The score values were computed at UCSC based on signal values assigned by the ENCODE pipeline. The input signal values were multiplied by a normalization factor calculated as the ratio of the maximum score value (1000) to the signal value at 1 standard deviation above the mean, with values exceeding 1000 capped at 1000. This has the effect of distributing scores up to the mean plus 1 standard deviation across the score range, while assigning the maximum score to everything above that.
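
The score normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the UCSC implementation. It assumes the mean and standard deviation are taken over all signal values in one table, uses the population standard deviation, and truncates to an integer. Truncation appears consistent with the sample rows but is not stated in the track description. All function names are hypothetical.

```python
import statistics

MAX_SCORE = 1000


def normalization_factor(signal_values):
    """Ratio of the maximum score to the signal value at mean + 1 standard deviation."""
    mean = statistics.fmean(signal_values)
    std = statistics.pstdev(signal_values)  # assumption: population standard deviation
    return MAX_SCORE / (mean + std)


def scores_from_signals(signal_values):
    """Scale signal values into 0-1000, capping anything above the maximum score."""
    factor = normalization_factor(signal_values)
    # assumption: fractional scores are truncated rather than rounded
    return [min(MAX_SCORE, int(s * factor)) for s in signal_values]


print(scores_from_signals([10.0, 20.0, 30.0]))
```

In this toy input the largest value lies above the mean plus one standard deviation, so it is capped at 1000 while the others spread across the lower range.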
Methods

The ChIP-seq peaks in this track were generated by the ENCODE Transcription Factor ChIP-seq Processing Pipeline. Methods documentation and full metadata for each track can be found at the ENCODE project portal, using the ENCODE file accession (ENCFF) listed in the track label.

Credits

Thanks to the ENCODE Consortium, the ENCODE ChIP-seq production laboratories, and the ENCODE Data Coordination Center for generating and processing the datasets used here. Special thanks to Henry Pratt, Jill Moore, Michael Purcaro, and Zhiping Weng, PI, at the ENCODE Data Analysis Center (ZLab at UMass Medical Center) for providing the peak datasets, metadata, and guidance developing this track. Please check the ZLab ENCODE Public Hubs for the most updated data.

References

ENCODE Project Consortium. A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol. 2011 Apr;9(4):e1001046. PMID: 21526222; PMC: PMC3079585

ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74. PMID: 22955616; PMC: PMC3439153

Sloan CA, Chan ET, Davidson JM, Malladi VS, Strattan JS, Hitz BC, Gabdank I, Narayanan AK, Ho M, Lee BT et al. ENCODE data at the ENCODE portal. Nucleic Acids Res. 2016 Jan 4;44(D1):D726-32. PMID: 26527727; PMC: PMC4702836

Gerstein MB, Kundaje A, Hariharan M, Landt SG, Yan KK, Cheng C, Mu XJ, Khurana E, Rozowsky J, Alexander R et al. Architecture of the human regulatory network derived from ENCODE data. Nature. 2012 Sep 6;489(7414):91-100. PMID: 22955619

Wang J, Zhuang J, Iyer S, Lin X, Whitfield TW, Greven MC, Pierce BG, Dong X, Kundaje A, Cheng Y et al. Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors. Genome Res. 2012 Sep;22(9):1798-812. PMID: 22955990; PMC: PMC3431495

Wang J, Zhuang J, Iyer S, Lin XY, Greven MC, Kim BH, Moore J, Pierce BG, Dong X, Virgil D et al. Factorbook.org: a Wiki-based database for transcription factor-binding data generated by the ENCODE consortium. Nucleic Acids Res. 2013 Jan;41(Database issue):D171-6. PMID: 23203885; PMC: PMC3531197

Data Use Policy

Users may freely download, analyze and publish results based on any ENCODE data without restrictions. Researchers using unpublished ENCODE data are encouraged to contact the data producers to discuss possible coordinated publications; however, this is optional. Users of ENCODE datasets are requested to cite the ENCODE Consortium and the ENCODE production laboratory(s) that generated the datasets used, as described in Citing ENCODE.
