Schema for UW RNA-seq - RNA-seq from ENCODE/UW
  Database: mm9    Primary Table: wgEncodeUwRnaSeqLungCellPolyaMAdult8wksC57bl6AlnRep1
BAM File: /gbdb/mm9/bbi/wgEncodeUwRnaSeqLungCellPolyaMAdult8wksC57bl6AlnRep1.bam
Format description: The fields of a SAM short read alignment, the text version of BAM.
See the SAM Format Specification for more details.
field        description
qName        Query template name - name of a read
flag         Flags. 0x10 set for reverse complement. See SAM docs for others.
rName        Reference sequence name (often a chromosome)
pos          1-based position
mapQ         Mapping quality 0-255; higher is better (the SAM specification reserves 255 for "not available")
cigar        CIGAR encoded alignment string.
rNext        Ref sequence for next (mate) read. '=' if same as rName, '*' if no mate
pNext        Position (1-based) of next (mate) sequence. May be -1 or 0 if no mate
tLen         Size of DNA template for mated pairs. -size for one of mate pairs
seq          Query template sequence
qual         ASCII of Phred-scaled base QUALity+33. Just '*' if no quality scores
tagTypeVals  Tab-delimited list of tag:type:value optional extra fields
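
The field layout above can be illustrated with a minimal Python sketch that splits one tab-delimited SAM text line into the named fields, tests the 0x10 reverse-complement flag, and decodes the Phred+33 quality string. The names `SamRecord`, `parse_sam_line`, `is_reverse`, and `phred_scores` are hypothetical helpers and not part of any UCSC or SAMtools API. The example line reuses values from the first sample row on this page with only two of its optional tags, and it assumes the field boundaries implied by the CIGAR string and sequence length.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    """One alignment, using the field names from the schema table."""
    qName: str
    flag: int
    rName: str
    pos: int
    mapQ: int
    cigar: str
    rNext: str
    pNext: int
    tLen: int
    seq: str
    qual: str
    tagTypeVals: List[str]


def parse_sam_line(line: str) -> SamRecord:
    """Split a tab-delimited SAM alignment line into its fields."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 11:
        raise ValueError("a SAM alignment line needs 11 mandatory fields")
    return SamRecord(
        cols[0], int(cols[1]), cols[2], int(cols[3]), int(cols[4]),
        cols[5], cols[6], int(cols[7]), int(cols[8]),
        cols[9], cols[10], cols[11:],
    )


def is_reverse(rec: SamRecord) -> bool:
    """True when flag bit 0x10 (reverse complement) is set."""
    return bool(rec.flag & 0x10)


def phred_scores(rec: SamRecord) -> List[int]:
    """Decode Phred+33 qualities; '*' means no qualities are stored."""
    if rec.qual == "*":
        return []
    return [ord(ch) - 33 for ch in rec.qual]


# Assumed field split of the first sample row, with a subset of its tags.
example_line = "\t".join([
    "419_1169_533", "16", "chr1", "3001125", "7", "25H25M", "*", "0", "0",
    "TCTCACTGGTGTGGCCTGAGTCAGA", "2IIIIIIIIII<6D//IIIIIIIII",
    "NH:c:1", "MD:Z:25",
])
rec = parse_sam_line(example_line)
print(rec.qName, rec.rName, rec.pos, is_reverse(rec), phred_scores(rec)[:3])
```

Keeping the optional tags as an unparsed list mirrors the single tagTypeVals column in the table; a fuller parser would split each tag on its first two colons.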

Sample Rows
 
qName	flag	rName	pos	mapQ	cigar	rNext	pNext	tLen	seq	qual	tagTypeVals
419_1169_533	16	chr1	3001125	7	25H25M	*	0	0	TCTCACTGGTGTGGCCTGAGTCAGA	2IIIIIIIIII<6D//IIIIIIIII	RG:Z:20110727143716699 CS:Z:T02212122121302111012112221230210030333111212120112 AS:c:18 CQ:Z:@8@@6@2@>6//6=<2@<@28@@6@26<2@82<=8/6<>?2@@/82///; NH:c:1 IH:c:1 HI:c:1 MD:Z:25
195_419_1108	16	chr1	3012373	1	3H28M19H	*	0	0	GAAGACAGCCACAAGAACAAAATCCCAG	8IIIIIIII66IIIIIIIIIIIIII?@"	RG:Z:20110727143716699 CS:Z:T01301231102010110312220023000110220110032112202000 AS:c:21 CQ:Z:@?@@@@@>@@@<;@@@@8@@@@?@@@A@@@@?;@@@@6@@@@/@8@88?@ XN:c:20 NH:c:10 IH:c:1 HI:c:1 MD:Z:28
153_910_1764	0	chr1	3012616	1	19H27M4H	*	0	0	TCCGAAGAAACAAGCTGGAGTAGACAT	"D=IIII2//2@==III==II@IIII2	RG:Z:20110727143716699 CS:Z:T33032132000023212333203202220210232102213221133331 AS:c:20 CQ:Z:/66//@2=>@/@6=6////=6//@@@26/>2///?;///@2/@2@22/@? XN:c:20 NH:c:4 IH:c:1 HI:c:1 MD:Z:27
404_1698_1815	0	chr1	3012817	9	48M2H	*	0	0	ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCACT	IIIIIIIIIIIIIIIIIIII//@II66III--IIIIIIII<<IIIIDI	RG:Z:201107291881848 CS:Z:T31111301311022111110331332310020212021011000011222 AS:c:35 CQ:Z:@@@@@@@@@@@@@@@@@=@@<//2@@6@@@@-@@@@-?@@@<@@@@-8@@ XN:c:31 NH:c:10 IH:c:1 HI:c:1 MD:Z:48
546_536_133	0	chr1	3012817	7	46M4H	*	0	0	ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCA	IIIIIIIIIIIIIIIIIIII//D@IIIIII//IIIIIIIIIIIII6	RG:Z:201107291881848 CS:Z:T31111301311022111110331332110020212021011100013222 AS:c:39 CQ:Z:@@@@;@@?@@@@@@@@@8@@=/6/2>288=@/28??@@6@@28??68<>@ XN:c:37 NH:c:10 IH:c:1 HI:c:1 MD:Z:46
99_241_1302	0	chr1	3012817	10	48M2H	*	0	0	ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCACT	IIIIIIIIIIIIIIIIIIII0/@II..IIII>IIIIIIII00IIIIII	RG:Z:201107291881848 CS:Z:T31111301311022111110311332010022212021011200011222 AS:c:38 CQ:Z:@@@@@@@@@@@@@@@@@<@@@0/2<@.;8@@-2?@@-@=@@0@?@@=@@@ XN:c:34 NH:c:10 IH:c:1 HI:c:1 MD:Z:48
58_1642_631	0	chr1	3012817	11	48M2H	*	0	0	ACACATTGCACCTCACACAATCATAGTGGGAGACTTCAACACCCCACT	IIIIIIIIIIIIIIIIIIII..I=I--III22III6/IIIIIIIIIID	RG:Z:20110729164416858 CS:Z:T31111301311022111110301332010020212011011100011222 AS:c:35 CQ:Z:@@@@@@@@@@@@@@@@@?@@@.;//@-@@@@2;@;@6/?;@/@=@@66/2 XN:c:31 NH:c:10 IH:c:1 HI:c:1 MD:Z:48
313_720_1021	0	chr1	3012820	7	43M7H	*	0	0	CATTGCACCTCACACAATCATAGTGGGAGACTTTGACACCCCA	IIIIIIIIIIIIIIIII//CFC@IIIIIEGII////II@<HI6	RG:Z:20110727134646918 CS:Z:T21301311022111110301332110022212001211100010222211 AS:c:33 CQ:Z:@@;@@@@@@@@@@@>@<@//522/@2=@062@@/@/=;2/.;@66:2@6/ XN:c:30 NH:c:10 IH:c:1 HI:c:1 MD:Z:33CA8
15_1736_1131	0	chr1	3016366	10	43M7H	*	0	0	ATGGACCATCTAGAGACTGCTGCATCGTGGGATCCATCCCATA	IIIIIII//IIIIIIIIIIIFI//II@IIIIIIIIIIIIIII2	RG:Z:20110727134646918 CS:Z:T33102101222322221213213032311002320132001332303220 AS:c:30 CQ:Z:@@;6;@6;/@2?;/;@/@2</88/2@/2;/@?/@6@@@@<8=82>/;2// XN:c:27 NH:c:10 IH:c:1 HI:c:1 MD:Z:33A9
2_1419_209	0	chr1	3016506	7	20H25M5H	*	0	0	CCTGGGAAACACAGAAAAATGGATG	/FIFFFFII@=DD/2III//=@@@/	RG:Z:20110727143716699 CS:Z:T21030100022202102023202100200111123000021023120112 AS:c:18 CQ:Z:</;@/6@8//62>?/6@2/<2/88/8/8;2//6/2688@///2/2/@6/2 NH:c:1 IH:c:1 HI:c:1 MD:Z:25
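
The CIGAR strings in the sample rows combine hard clips (H) with matches (M). As a rough illustration of how such strings can be read, the following Python sketch splits a CIGAR string into (length, operation) pairs and computes two lengths: the number of bases stored in the seq field, and the original read length including hard-clipped bases. The helper names `parse_cigar`, `stored_seq_length`, and `original_read_length` are hypothetical. The CIGAR values used are those that appear to be present in the sample rows, where each sums to the 50-base read length given in the Methods.

```python
import re
from typing import List, Tuple

# One CIGAR element: a run length followed by a SAM operation code.
_CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar: str) -> List[Tuple[int, str]]:
    """Return (length, op) pairs; '*' means the CIGAR is unavailable."""
    if cigar == "*":
        return []
    ops = [(int(n), op) for n, op in _CIGAR_ELEMENT.findall(cigar)]
    # Reject strings containing anything the pattern did not consume.
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def stored_seq_length(cigar: str) -> int:
    """Bases present in the seq field (M, I, S, =, X consume the query)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=X")


def original_read_length(cigar: str) -> int:
    """Read length before hard clipping (adds H to the query operations)."""
    return sum(n for n, op in parse_cigar(cigar) if op in "MIS=XH")


for cigar in ["25H25M", "3H28M19H", "48M2H", "20H25M5H"]:
    print(cigar, stored_seq_length(cigar), original_read_length(cigar))
```

Hard-clipped bases are absent from seq and qual, which is why the sequences in the sample rows are shorter than 50 bases.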

UW RNA-seq (wgEncodeUwRnaSeq) Track Description
 

Description

This track was produced as part of the mouse ENCODE Project. This track shows RNA-seq measured genome-wide in mouse tissues and cell lines. Poly-A selected mRNA was used as the source for transcriptome profiling of tissues and cell types that also had corresponding DNase I hypersensitive profiles.

Display Conventions and Configuration

This track is a multi-view composite track that contains multiple data types (views). For each view, there are multiple subtracks that display individually on the browser. Instructions for configuring multi-view tracks are here. Color differences among the views are arbitrary and they provide a visual cue for distinguishing between the different cell and tissue types. This track contains the following views:

Plus and Minus Signals
    These views display clusters of overlapping read mappings on the forward and reverse genomic strands.
Signal
    Density graph (wiggle) of signal enrichment based on processed data.
Alignments
    Mappings of short 50-base single-end reads to the genome. See the SAM Format Specification for more information on the SAM/BAM file format.

Methods

Cells were grown according to the approved ENCODE cell culture protocols. Fresh tissues were harvested from mice and stored until used for preparing total RNA samples. Poly-A RNA was selected from the total RNA and used to construct SOLiD libraries according to the protocols supplied by the manufacturer. NIST standards were spiked into all RNA samples before the libraries were constructed. The RNA-seq libraries were sequenced on the ABI SOLiD sequencing platform as 50-base reads according to the manufacturer's recommendations.

Reads were aligned to the mm9 reference genome using ABI BioScope software version 1.2.1. Colorspace FASTQ format files were created using Heng Li's solid2fastq.pl script version 0.1.4 (Li and Durbin, 2009), representing the color codes 0, 1, 2, 3 with the letters A, C, G, T, respectively. Signal files were created from the BAM (Li et al., 2009) alignments using BEDTools (Quinlan and Hall, 2010).
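
The color-code representation described above amounts to a character-for-character substitution. The following Python sketch shows only that mapping (0 to A, 1 to C, 2 to G, 3 to T); it is not a reimplementation of solid2fastq.pl, and the function name `double_encode` is hypothetical. How the script treats the leading primer base of a SOLiD read is not described here, so the sketch accepts color digits only. Note that the resulting letters are a re-labelling of colors, not decoded nucleotides.

```python
# Translation table for the mapping stated in the Methods: 0->A, 1->C, 2->G, 3->T.
_COLOR_TO_LETTER = str.maketrans("0123", "ACGT")


def double_encode(colors: str) -> str:
    """Rewrite a string of SOLiD color codes (digits 0-3) as letters."""
    unexpected = set(colors) - set("0123")
    if unexpected:
        raise ValueError(f"unexpected color codes: {sorted(unexpected)}")
    return colors.translate(_COLOR_TO_LETTER)


# The first few colors of a read, written as letters.
print(double_encode("0221"))  # AGGC
```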

Release Notes

This is Release 1 (July 2012). It contains a total of 25 RNA-seq experiments.

Credits

These data were generated by the UW ENCODE group.

Contact: Richard Sandstrom

References

Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009 Jul 15;25(14):1754-60. PMID: 19451168; PMC: PMC2705234

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. PMID: 19505943; PMC: PMC2723002

Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010 Mar 15;26(6):841-2. PMID: 20110278; PMC: PMC2832824

Data Release Policy

Data users may freely use ENCODE data, but may not, without prior consent, submit publications that use an unpublished ENCODE dataset until nine months following the release of the dataset. This date is listed in the Restricted Until column, above. The full data release policy for ENCODE is available here.