Schema for UW RNA-seq - RNA-seq from ENCODE/UW

field        description
qName        Query template name - name of a read
flag         Flags. 0x10 set for reverse complement. See SAM docs for others.
rName        Reference sequence name (often a chromosome)
pos          1-based position
mapQ         Mapping quality 0-255; higher is better. The SAM specification reserves 255 for "not available".
cigar        CIGAR encoded alignment string
rNext        Ref sequence for next (mate) read. '=' if same as rName, '*' if no mate
pNext        Position (1-based) of next (mate) sequence. May be -1 or 0 if no mate
tLen         Size of DNA template for mated pairs; negative (-size) for one mate of each pair
seq          Query template sequence
qual         ASCII of Phred-scaled base QUALity+33. Just '*' if no quality scores
tagTypeVals  Tab-delimited list of tag:type:value optional extra fields
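A few of these encodings are easier to see in code. The following is a minimal sketch, not part of the browser or of any SAM library; the helper names `is_reverse`, `phred_scores` and `mate_reference` are hypothetical and chosen only for illustration. It tests the 0x10 flag bit, decodes a Phred+33 quality string, and resolves the '=' and '*' shorthand used in rNext.

```python
from typing import List, Optional

REVERSE_STRAND = 0x10  # the one flag bit spelled out in the field table


def is_reverse(flag: int) -> bool:
    """True when the 0x10 bit is set (read aligned as reverse complement)."""
    return bool(flag & REVERSE_STRAND)


def phred_scores(qual: str) -> List[int]:
    """Decode a Phred+33 string; '*' means no quality scores were stored."""
    if qual == "*":
        return []
    return [ord(ch) - 33 for ch in qual]


def mate_reference(r_name: str, r_next: str) -> Optional[str]:
    """Resolve rNext: '=' means same as rName, '*' means no mate."""
    if r_next == "*":
        return None
    return r_name if r_next == "=" else r_next


# 'I' is ASCII 73, so it encodes a Phred score of 73 - 33 = 40.
print(phred_scores("2I<6D/"))
print(is_reverse(16), is_reverse(0))
```

Under this encoding the character 'I', which dominates the qual strings on this page, corresponds to a Phred score of 40, and '/' to 14.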

Sample rows (columns: qName, flag, rName, pos, mapQ, cigar, rNext, pNext, tLen, seq, qual, tagTypeVals; no values are shown for flag, mapQ, rNext, pNext and tLen in these rows):

qName        419_1169_533
rName        chr1
pos          3001125
cigar        25H25M
seq          TCTCACTGGTGTGGCCTGAGTCAGA
qual         2IIIIIIIIII<6D//IIIIIIIII
tagTypeVals  RG:Z:20110727143716699 CS:Z:T02212122121302111012112221230210030333111212120112 AS:c:18 CQ:Z:@8@@6@2@>6//6=<2@<@28@@6@26<2@82<=8/6<>?2@@/82///; NH:c:1 IH:c:1 HI:c:1 MD:Z:25

qName        195_419_1108
rName        chr1
pos          3012373
cigar        3H28M19H
seq          GAAGACAGCCACAAGAACAAAATCCCAG
qual         8IIIIIIII66IIIIIIIIIIIIII?@"
tagTypeVals  RG:Z:20110727143716699 CS:Z:T01301231102010110312220023000110220110032112202000 AS:c:21 CQ:Z:@?@@@@@>@@@<;@@@@8@@@@?@@@A@@@@?;@@@@6@@@@/@8@88?@ XN:c:20 NH:c:10 IH:c:1 HI:c:1 MD:Z:28

qName        153_910_1764
rName        chr1
pos          3012616
cigar        19H27M4H
seq          TCCGAAGAAACAAGCTGGAGTAGACAT
qual         "D=IIII2//2@==III==II@IIII2
tagTypeVals  RG:Z:20110727143716699 CS:Z:T33032132000023212333203202220210232102213221133331 AS:c:20 CQ:Z:/66//@2=>@/@6=6////=6//@@@26/>2///?;///@2/@2@22/@? XN:c:20 NH:c:4 IH:c:1 HI:c:1 MD:Z:27

qName        404_1698_1815
rName        chr1
pos          3012817
cigar        48M2H
seq          ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCACT
qual         IIIIIIIIIIIIIIIIIIII//@II66III--IIIIIIII<<IIIIDI
tagTypeVals  RG:Z:201107291881848 CS:Z:T31111301311022111110331332310020212021011000011222 AS:c:35 CQ:Z:@@@@@@@@@@@@@@@@@=@@<//2@@6@@@@-@@@@-?@@@<@@@@-8@@ XN:c:31 NH:c:10 IH:c:1 HI:c:1 MD:Z:48

qName        546_536_133
rName        chr1
pos          3012817
cigar        46M4H
seq          ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCA
qual         IIIIIIIIIIIIIIIIIIII//D@IIIIII//IIIIIIIIIIIII6
tagTypeVals  RG:Z:201107291881848 CS:Z:T31111301311022111110331332110020212021011100013222 AS:c:39 CQ:Z:@@@@;@@?@@@@@@@@@8@@=/6/2>288=@/28??@@6@@28??68<>@ XN:c:37 NH:c:10 IH:c:1 HI:c:1 MD:Z:46

qName        99_241_1302
rName        chr1
pos          3012817
cigar        48M2H
seq          ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCACT
qual         IIIIIIIIIIIIIIIIIIII0/@II..IIII>IIIIIIII00IIIIII
tagTypeVals  RG:Z:201107291881848 CS:Z:T31111301311022111110311332010022212021011200011222 AS:c:38 CQ:Z:@@@@@@@@@@@@@@@@@<@@@0/2<@.;8@@-2?@@-@=@@0@?@@=@@@ XN:c:34 NH:c:10 IH:c:1 HI:c:1 MD:Z:48

qName        58_1642_631
rName        chr1
pos          3012817
cigar        48M2H
seq          ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCACT
qual         IIIIIIIIIIIIIIIIIIII..I=I--III22III6/IIIIIIIIIID
tagTypeVals  RG:Z:20110729164416858 CS:Z:T31111301311022111110301332010020212011011100011222 AS:c:35 CQ:Z:@@@@@@@@@@@@@@@@@?@@@.;//@-@@@@2;@;@6/?;@/@=@@66/2 XN:c:31 NH:c:10 IH:c:1 HI:c:1 MD:Z:48

qName        313_720_1021
rName        chr1
pos          3012820
cigar        43M7H
seq          CATTGCACCTCACACAATCATAGTGGGAGACTTTGACACCCCA
qual         IIIIIIIIIIIIIIIII//CFC@IIIIIEGII////II@<HI6
tagTypeVals  RG:Z:20110727134646918 CS:Z:T21301311022111110301332110022212001211100010222211 AS:c:33 CQ:Z:@@;@@@@@@@@@@@>@<@//522/@2=@062@@/@/=;2/.;@66:2@6/ XN:c:30 NH:c:10 IH:c:1 HI:c:1 MD:Z:33CA8

qName        15_1736_1131
rName        chr1
pos          3016366
cigar        43M7H
seq          ATGGACCATCTAGAGACTGCTGCATCGTGGGATCCATCCCATA
qual         IIIIIII//IIIIIIIIIIIFI//II@IIIIIIIIIIIIIII2
tagTypeVals  RG:Z:20110727134646918 CS:Z:T33102101222322221213213032311002320132001332303220 AS:c:30 CQ:Z:@@;6;@6;/@2?;/;@/@2</88/2@/2;/@?/@6@@@@<8=82>/;2// XN:c:27 NH:c:10 IH:c:1 HI:c:1 MD:Z:33A9

qName        2_1419_209
rName        chr1
pos          3016506
cigar        20H25M5H
seq          CCTGGGAAACACAGAAAAATGGATG
qual         /FIFFFFII@=DD/2III//=@@@/
tagTypeVals  RG:Z:20110727143716699 CS:Z:T21030100022202102023202100200111123000021023120112 AS:c:18 CQ:Z:</;@/6@8//62>?/6@2/<2/88/8/8;2//6/2688@///2/2/@6/2 NH:c:1 IH:c:1 HI:c:1 MD:Z:25
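The cigar and tagTypeVals values in the sample rows can be unpacked with a few lines of code. This is a rough sketch rather than a full SAM parser; `parse_cigar`, `original_read_length` and `parse_tags` are hypothetical helper names. It relies on the SAM convention that H marks hard-clipped bases, which are absent from seq but still count toward the original read length, and it treats the integer type letters (such as the `c` seen here) as plain integers.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
INT_TYPES = set("cCsSiI")  # integer type letters used in tag:type:value


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Reject input where the pattern silently skipped some characters.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return ops


def original_read_length(cigar):
    """Read length before clipping: read-consuming ops plus hard clips."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=XH")


def parse_tags(tag_type_vals):
    """Turn whitespace-separated tag:type:value items into {tag: value}."""
    tags = {}
    for item in tag_type_vals.split():
        # Split at most twice: values such as CQ strings may contain ':'.
        tag, typ, raw = item.split(":", 2)
        tags[tag] = int(raw) if typ in INT_TYPES else raw
    return tags


# CIGAR and a subset of the tags (CS and CQ left out for brevity)
# from the 195_419_1108 sample row.
print(parse_cigar("3H28M19H"))
print(original_read_length("3H28M19H"))
print(parse_tags("RG:Z:20110727143716699 AS:c:21 XN:c:20 NH:c:10 MD:Z:28"))
```

Every CIGAR string in the sample rows adds up to 50 under this calculation. NH is the standard SAM tag for the number of reported alignments of a read, so NH:c:10 marks a read reported at ten locations.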

Description

This track, produced as part of the mouse ENCODE Project, shows RNA-seq signal measured genome-wide in mouse tissues and cell lines. Poly-A selected mRNA was used as the source for transcriptome profiling of tissues and cell types that also had corresponding DNase I hypersensitivity profiles.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here. Color differences among the views are arbitrary and they provide a visual cue for distinguishing between the different cell and tissue types. This track contains the following views:

Plus and Minus Signals: These views display clusters of overlapping read mappings on the forward and reverse genomic strands.
Signal: Density graph (wiggle) of signal enrichment based on processed data.
Alignments: Mappings of short 50-base single-end reads to the genome. See the SAM Format Specification for more information on the SAM/BAM file format.
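To make the relationship between the Alignments view and the strand-specific signal views concrete, here is a simplified sketch that counts, for each strand, how many alignments overlap each position. It is not the pipeline used to build this track; `strand_coverage` is a hypothetical helper, spliced alignments are ignored, and the flag values in the example are assumptions, since the sample rows on this page do not show them.

```python
from collections import Counter


def strand_coverage(alignments):
    """Count, per strand, how many alignments cover each 1-based position.

    `alignments` is an iterable of (pos, aligned_length, flag) tuples.
    """
    coverage = {"+": Counter(), "-": Counter()}
    for pos, length, flag in alignments:
        strand = "-" if flag & 0x10 else "+"  # 0x10 = reverse complement
        for offset in range(length):
            coverage[strand][pos + offset] += 1
    return coverage


# Positions and aligned lengths follow three sample rows (48M, 46M, 43M);
# the flags 0, 0 and 16 are made up for illustration.
reads = [(3012817, 48, 0), (3012817, 46, 0), (3012820, 43, 16)]
cov = strand_coverage(reads)
print(cov["+"][3012817], cov["+"][3012863], cov["-"][3012820])
```

A dictionary of counters keeps the sketch short; a real signal file would store runs of equal coverage rather than one entry per base.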

Methods

Cells were grown according to the approved ENCODE cell culture protocols. Fresh tissues were harvested from mice and stored until used for preparing total RNA samples. Poly-A RNA was selected from the total RNA and used to construct SOLiD libraries according to the manufacturer's protocols. NIST standards were spiked into all RNA samples before library construction. The RNA-seq libraries were sequenced on the ABI SOLiD platform as 50-base reads according to the manufacturer's recommendations.

Reads were aligned to the mm9 reference genome using ABI BioScope software version 1.2.1. Colorspace FASTQ format files were created using Heng Li's solid2fastq.pl script version 0.1.4 (Li et al., 2009a), representing 0, 1, 2, 3 color codes with the letters A, C, G, T respectively. Signal files were created from the BAM (Li et al., 2009b) alignments using BEDTools (Quinlan et al., 2010).
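The letter substitution described above can be sketched in a couple of lines. The code below is an illustration, not the solid2fastq.pl script itself; `colors_as_letters` and `naive_decode` are hypothetical names. The second function is an extra, assuming the usual SOLiD two-base encoding in which each color is the XOR of the 2-bit codes (A=0, C=1, G=2, T=3) of two adjacent bases. It shows how a color string such as a CS tag value relates to bases; an aligner works in color space instead, because a single color error corrupts every base decoded after it.

```python
COLOR_TO_LETTER = str.maketrans("0123", "ACGT")
BASES = "ACGT"


def colors_as_letters(colors: str) -> str:
    """Rewrite color digits 0-3 as the letters A, C, G, T (no decoding)."""
    return colors.translate(COLOR_TO_LETTER)


def naive_decode(cs: str) -> str:
    """Decode a color-space read (known first base, then color digits).

    Each color is XORed into the previous base code to get the next base.
    """
    base = BASES.index(cs[0])
    decoded = []
    for color in cs[1:]:
        base ^= int(color)
        decoded.append(BASES[base])
    return "".join(decoded)


# First 12 colors of the CS tag from the 404_1698_1815 sample row.
print(colors_as_letters("311113013110"))
print(naive_decode("T311113013110"))
```

For that row the naive decoding gives ACACATTGCACC, which matches the first 12 bases of its seq value; the letter substitution alone gives TCCCCTACTCCA, which is a relabeling of colors and not a base sequence.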

Release Notes

This is Release 1 (July 2012). It contains a total of 25 RNA-seq experiments.

Credits

These data were generated by the UW ENCODE group.

Contact: Richard Sandstrom

References

Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009 Jul 15;25(14):1754-60. PMID: 19451168; PMC: PMC2705234

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. PMID: 19505943; PMC: PMC2723002

Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010 Mar 15;26(6):841-2. PMID: 20110278; PMC: PMC2832824

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.