Schema for Other RefSeq - Non-Mouse RefSeq Genes
  Database: mm39    Primary Table: xenoRefGene    Row Count: 197,343   Data last updated: 2020-08-17
Format description: A gene prediction with some additional info.
On download server: MariaDB table dump directory
field        | example                        | SQL type                              | info   | description
bin          | 76                             | smallint(5) unsigned                  | range  | Indexing field to speed chromosome range queries.
name         | NM_052898                      | varchar(255)                          | values | Name of gene (usually transcript_id from GTF)
chrom        | chr1                           | varchar(255)                          | values | Reference sequence chromosome or scaffold
strand       | -                              | char(1)                               | values | + or - for strand
txStart      | 3270069                        | int(10) unsigned                      | range  | Transcription start position (or end position for minus strand item)
txEnd        | 3742017                        | int(10) unsigned                      | range  | Transcription end position (or start position for minus strand item)
cdsStart     | 3286244                        | int(10) unsigned                      | range  | Coding region start (or end position for minus strand item)
cdsEnd       | 3741571                        | int(10) unsigned                      | range  | Coding region end (or start position for minus strand item)
exonCount    | 59                             | int(10) unsigned                      | range  | Number of exons
exonStarts   | 3270069,3270271,3270366,327... | longblob                              |        | Exon start positions (or end positions for minus strand item)
exonEnds     | 3270248,3270287,3270442,327... | longblob                              |        | Exon end positions (or start positions for minus strand item)
score        | 0                              | int(11)                               | range  | score
name2        | XKR4                           | varchar(255)                          | values | Alternate name (e.g. gene_id from GTF)
cdsStartStat | cmpl                           | enum('none', 'unk', 'incmpl', 'cmpl') | values | Status of CDS start annotation (none, unknown, incomplete, or complete)
cdsEndStat   | cmpl                           | enum('none', 'unk', 'incmpl', 'cmpl') | values | Status of CDS end annotation (none, unknown, incomplete, or complete)
exonFrames   | -1,-1,-1,-1,-1,-1,-1,-1,-1,... | longblob                              |        | Reading frame of the start of the CDS region of the exon, in the direction of transcription (0,1,2), or -1 if there is no CDS region.
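
To make the field layout concrete, here is a minimal Python sketch that parses one record into a typed structure and derives per-exon lengths. It assumes the table dump is tab-separated with the 16 columns in the order listed above; the names `GenePredRow`, `parse_int_list`, `parse_row`, and `exon_lengths` are hypothetical helpers introduced for illustration, not part of any UCSC tool. The sample line is the NM_001033036 record from the Sample Rows section, re-joined with tabs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GenePredRow:
    """One xenoRefGene record; attributes mirror the schema columns."""
    bin: int
    name: str
    chrom: str
    strand: str
    tx_start: int
    tx_end: int
    cds_start: int
    cds_end: int
    exon_count: int
    exon_starts: List[int]
    exon_ends: List[int]
    score: int
    name2: str
    cds_start_stat: str
    cds_end_stat: str
    exon_frames: List[int]


def parse_int_list(blob: str) -> List[int]:
    """Turn a comma-terminated list such as '1,2,3,' into [1, 2, 3]."""
    return [int(item) for item in blob.split(",") if item]


def parse_row(line: str) -> GenePredRow:
    """Parse one tab-separated line (assumed dump format) into a record."""
    f = line.rstrip("\n").split("\t")
    if len(f) != 16:
        raise ValueError(f"expected 16 columns, got {len(f)}")
    row = GenePredRow(
        int(f[0]), f[1], f[2], f[3],
        int(f[4]), int(f[5]), int(f[6]), int(f[7]), int(f[8]),
        parse_int_list(f[9]), parse_int_list(f[10]),
        int(f[11]), f[12], f[13], f[14], parse_int_list(f[15]),
    )
    # The three per-exon lists must each contain exonCount entries.
    for values in (row.exon_starts, row.exon_ends, row.exon_frames):
        if len(values) != row.exon_count:
            raise ValueError("per-exon list length does not match exonCount")
    return row


def exon_lengths(row: GenePredRow) -> List[int]:
    """Exon lengths in bases; valid because intervals are half-open."""
    return [end - start for start, end in zip(row.exon_starts, row.exon_ends)]


# Sample record NM_001033036, re-joined with tabs for this sketch.
SAMPLE_LINE = "\t".join([
    "76", "NM_001033036", "chr1", "-",
    "3286157", "3741669", "3286244", "3741571", "6",
    "3286157,3491924,3740774,3740924,3741036,3741358,",
    "3287191,3492124,3740878,3741036,3741316,3741669,",
    "0", "XKR4", "cmpl", "cmpl", "1,2,0,1,0,0,",
])

row = parse_row(SAMPLE_LINE)
print(row.name, row.name2, exon_lengths(row))
```

Keeping the list columns as parsed integer lists, rather than raw strings, makes the consistency check against exonCount a one-line loop.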

Connected Tables and Joining Fields
      hgFixed.gbCdnaInfo.acc (via xenoRefGene.name)
      hgFixed.gbMiscDiff.acc (via xenoRefGene.name)
      hgFixed.gbSeq.acc (via xenoRefGene.name)
      hgFixed.gbWarn.acc (via xenoRefGene.name)
      hgFixed.imageClone.acc (via xenoRefGene.name)
      mm39.all_est.qName (via xenoRefGene.name)
      mm39.all_mrna.qName (via xenoRefGene.name)
      mm39.refGene.name (via xenoRefGene.name)
      mm39.refSeqAli.qName (via xenoRefGene.name)
      mm39.xenoMrna.qName (via xenoRefGene.name)
      mm39.xenoRefFlat.name (via xenoRefGene.name)
      mm39.xenoRefSeqAli.qName (via xenoRefGene.name)

Sample Rows
 
bin | name | chrom | strand | txStart | txEnd | cdsStart | cdsEnd | exonCount | exonStarts | exonEnds | score | name2 | cdsStartStat | cdsEndStat | exonFrames
76 | NM_052898 | chr1 | - | 3270069 | 3742017 | 3286244 | 3741571 | 59 | 3270069,3270271,3270366,3270659,3270864,3270930,3271359,3271862,3275217,3275445,3275507,3275526,3275685,3275814,3276161,3276401, ... | 3270248,3270287,3270442,3270746,3270888,3270993,3271428,3271922,3275286,3275494,3275517,3275634,3275711,3275896,3276172,3276414, ... | 0 | XKR4 | cmpl | cmpl | -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1 ...
76 | NM_001011971 | chr1 | - | 3284702 | 3742636 | 3286244 | 3742487 | 13 | 3284702,3285201,3285877,3285914,3491924,3740774,3740924,3742103,3742184,3742242,3742378,3742432,3742450, | 3285161,3285677,3285901,3287191,3492124,3740878,3741190,3742179,3742242,3742371,3742432,3742449,3742636, | 0 | Xkr4 | cmpl | cmpl | -1,-1,-1,1,2,0,0,2,1,0,0,1,0,
76 | NM_001033036 | chr1 | - | 3286157 | 3741669 | 3286244 | 3741571 | 6 | 3286157,3491924,3740774,3740924,3741036,3741358, | 3287191,3492124,3740878,3741036,3741316,3741669, | 0 | XKR4 | cmpl | cmpl | 1,2,0,1,0,0,
76 | NM_001012258 | chr1 | - | 3286188 | 3741571 | 3286244 | 3741571 | 8 | 3286188,3286244,3286622,3491924,3740769,3740929,3741045,3741460, | 3286209,3286526,3287191,3492119,3740857,3740956,3741304,3741571, | 0 | xkr4 | cmpl | cmpl | -1,0,1,1,0,0,0,0,
76 | NM_001032714 | chr1 | - | 3286244 | 3741574 | 3286244 | 3741571 | 7 | 3286244,3286622,3491922,3740774,3740929,3741036,3741460, | 3286505,3286835,3492124,3740851,3740956,3741310,3741574, | 0 | xkr4 | cmpl | cmpl | 0,0,2,0,0,0,0,
76 | NM_001032307 | chr1 | - | 3286244 | 3741575 | 3286244 | 3741571 | 7 | 3286244,3491924,3740774,3740929,3741045,3741445,3741533, | 3287191,3492124,3740874,3740980,3741325,3741530,3741575, | 0 | xkr4 | cmpl | cmpl | 1,2,1,0,0,2,0,
613 | NM_001077752 | chr1 | - | 3740776 | 3741571 | 3740776 | 3741571 | 4 | 3740776,3740929,3741045,3741460, | 3740857,3740956,3741304,3741571, | 0 | xkr4 | cmpl | incmpl | 0,0,0,0,
9 | NM_001375654 | chr1 | - | 4069791 | 4423060 | 4069791 | 4423048 | 31 | 4069791,4077944,4089292,4094958,4112141,4162839,4189890,4190237,4212834,4218034,4218832,4234077,4240427,4267808,4276882,4296834, ... | 4069842,4077960,4089371,4095115,4112330,4163003,4189935,4190296,4212989,4218184,4218967,4234164,4240601,4267864,4277060,4297050, ... | 0 | RP1 | cmpl | incmpl | 0,1,0,2,0,1,1,2,0,0,0,0,0,1,0,2,0,1,0,0,2,1,0,1,0,0,0,1,2,0,2,
618 | NM_001195676 | chr1 | - | 4414416 | 4430615 | 4414773 | 4423048 | 14 | 4414416,4414556,4414830,4416111,4417038,4417411,4417434,4417914,4419139,4419957,4420083,4422132,4422424,4430422, | 4414527,4414829,4416111,4417038,4417291,4417430,4417911,4419136,4419952,4420080,4420314,4422304,4423060,4430615, | 0 | Rp1 | cmpl | cmpl | -1,1,0,1,0,2,1,0,0,1,0,0,0,-1,
618 | NM_001280013 | chr1 | - | 4414827 | 4423048 | 4414827 | 4423048 | 26 | 4414827,4415026,4415314,4416202,4416520,4416757,4417115,4417237,4417549,4418229,4418317,4418680,4418869,4418921,4418925,4419104, ... | 4414953,4415062,4416052,4416517,4416617,4416895,4417233,4417288,4418148,4418257,4418631,4418788,4418920,4418925,4419027,4419113, ... | 0 | RP1 | cmpl | incmpl | 1,0,0,0,2,0,1,0,1,0,1,0,1,2,1,1,1,1,0,0,0,1,2,0,1,0,
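
The bin values in the sample rows can be reproduced from txStart and txEnd. The Python sketch below assumes the standard UCSC binning scheme (five levels, smallest bins of 128 kb, each level eight times larger); the constants are stated from the commonly documented scheme rather than from this page, and `bin_from_range` is a hypothetical name chosen for this illustration.

```python
# Assumed standard UCSC binning constants (not stated on this page).
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)  # finest level first
BIN_FIRST_SHIFT = 17  # 2**17 = 128 kb, size of the smallest bins
BIN_NEXT_SHIFT = 3    # each coarser level is 2**3 = 8 times larger


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # The range straddles a boundary at this level; try the next coarser level.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} is out of bounds for this scheme")


# (txStart, txEnd) pairs taken from the sample rows above.
for tx_start, tx_end in [(3270069, 3742017), (3740776, 3741571),
                         (4069791, 4423060), (4414416, 4430615)]:
    print(tx_start, tx_end, bin_from_range(tx_start, tx_end))
```

A range query can then restrict its search to the few bins that could overlap the region of interest before comparing exact coordinates, which is what makes the bin column useful as an index.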

Note: all start coordinates in our database are 0-based, not 1-based. Intervals are half-open: the start position is included and the end position is excluded.
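
As a small illustration of the coordinate note above, the following Python sketch converts a stored range into the 1-based, fully closed form typically used for display, and computes its length. It assumes that stored end positions need no adjustment (the usual half-open convention); the function names are hypothetical.

```python
def to_one_based(start: int, end: int) -> tuple:
    """Convert a 0-based, half-open range to a 1-based, closed range."""
    return start + 1, end


def span_length(start: int, end: int) -> int:
    """Length in bases of a 0-based, half-open range."""
    return end - start


def position_string(chrom: str, start: int, end: int) -> str:
    """Format a stored range as a 1-based position string."""
    one_start, one_end = to_one_based(start, end)
    return f"{chrom}:{one_start}-{one_end}"


# txStart and txEnd of sample record NM_001033036.
print(position_string("chr1", 3286157, 3741669))
print(span_length(3286157, 3741669))
```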

Other RefSeq (xenoRefGene) Track Description
 

Description

The RefSeq mRNAs gene track for the mouse genome assembly (Jun. 2020, GRCm39/mm39) displays translated blat alignments of vertebrate and invertebrate mRNAs in GenBank.

Track statistics summary

Total genome size: 2,654,624,157 (not counting gaps)
Gene count: 22,442
Bases in genes: 838,462,469 (txStart to txEnd)
Genes percent genome coverage: 31.585%
Bases in exons: 53,564,706
Exons percent genome coverage: 2.018%
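
The two coverage percentages follow directly from the base counts in this summary. A minimal Python check, assuming coverage is simply covered bases divided by the ungapped genome size (the helper name `percent_coverage` is hypothetical):

```python
GENOME_SIZE = 2_654_624_157   # total genome size, not counting gaps
BASES_IN_GENES = 838_462_469  # txStart to txEnd
BASES_IN_EXONS = 53_564_706


def percent_coverage(covered: int, total: int) -> float:
    """Covered bases as a percentage of the total."""
    return 100.0 * covered / total


print(f"genes: {percent_coverage(BASES_IN_GENES, GENOME_SIZE):.3f}%")
print(f"exons: {percent_coverage(BASES_IN_EXONS, GENOME_SIZE):.3f}%")
```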

Search tips

Please note that the name search is not completely case-insensitive. When in doubt, enter gene names in all lowercase.

Methods

The mRNAs were aligned against the mouse genome (Jun. 2020, GRCm39/mm39) using translated blat. When a single mRNA aligned in multiple places, the alignment with the highest base identity was identified. Only alignments with a base identity within 1% of the best and at least 25% base identity with the genomic sequence were kept.

Specifically, the translated blat command is:

blat -noHead -q=rnax -t=dnax -mask=lower target.fa query.fa target.query.psl

where target.fa is one of the chromosome sequences of the genome assembly, and query.fa contains the mRNAs from RefSeq.

The resulting PSL outputs are filtered:

pslCDnaFilter -minId=0.35 -minCover=0.25 -globalNearBest=0.0100 -minQSize=20 \
  -ignoreIntrons -repsAsMatch -ignoreNs -bestOverlap \
  all.results.psl mm39.xenoRefGene.psl

The filtered mm39.xenoRefGene.psl is converted to genePred data to display for this track.
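
The near-best selection described in the Methods text can be sketched in Python as follows. This is a simplified illustration, not the pslCDnaFilter implementation: it assumes each alignment is summarized by a single identity fraction, interprets "within 1% of the best" as relative to the best identity for that mRNA, and leaves the minimum identity as a caller-supplied parameter. The `Alignment` type, the `filter_near_best` function, and all identity values in the example are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Alignment(NamedTuple):
    query: str       # mRNA accession
    target: str      # chromosome the mRNA aligned to
    identity: float  # base identity as a fraction in [0, 1]


def filter_near_best(alignments: Iterable[Alignment],
                     min_identity: float,
                     near_best: float = 0.01) -> List[Alignment]:
    """Keep, per query, alignments near the best identity and above a floor."""
    by_query = defaultdict(list)
    for aln in alignments:
        by_query[aln.query].append(aln)

    kept = []
    for query_alignments in by_query.values():
        best = max(a.identity for a in query_alignments)
        # Assumption: "within 1% of the best" is relative to the best identity.
        threshold = best * (1.0 - near_best)
        kept.extend(a for a in query_alignments
                    if a.identity >= threshold and a.identity >= min_identity)
    return kept


# Hypothetical alignments, invented purely to exercise the filter.
example = [
    Alignment("mrnaA", "chr1", 0.900),
    Alignment("mrnaA", "chr2", 0.895),
    Alignment("mrnaA", "chr3", 0.600),
    Alignment("mrnaB", "chr1", 0.200),
]
kept = filter_near_best(example, min_identity=0.25)
print([(a.query, a.target) for a in kept])
```

Grouping by query first keeps the comparison local to each mRNA, so a highly conserved transcript does not raise the bar for an unrelated one.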

Credits

The mRNA track was produced at UCSC from mRNA sequence data submitted to the international public sequence databases by scientists worldwide.

References

Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res. 2013 Jan;41(Database issue):D36-42. PMID: 23193287; PMC: PMC3531190

Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank: update. Nucleic Acids Res. 2004 Jan 1;32(Database issue):D23-6. PMID: 14681350; PMC: PMC308779

Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64. PMID: 11932250; PMC: PMC187518