Schema for Segmental Dups - Duplications of >1000 Bases of Non-RepeatMasked Sequence


| field | example | SQL type | description |
| --- | --- | --- | --- |
| bin | 611 | smallint(6) | Indexing field to speed chromosome range queries. |
| chrom | chr1 | varchar(255) | Reference sequence chromosome or scaffold |
| chromStart | 3531709 | int(10) unsigned | Start position in chromosome |
| chromEnd | 3535022 | int(10) unsigned | End position in chromosome |
| name | chr5:133839256 | varchar(255) | Other chromosome involved |
| score | | int(10) unsigned | Score based on the raw BLAST alignment score. Set to 0 and not used in later versions. |
| strand | | char(1) | Value should be + or - |
| otherChrom | chr5 | varchar(255) | Other chromosome or scaffold |
| otherStart | 133839256 | int(10) unsigned | Start in other sequence |
| otherEnd | 133842605 | int(10) unsigned | End in other sequence |
| otherSize | 3349 | int(10) unsigned | Total size of other chromosome |
| uid | 386 | int(10) unsigned | Unique id shared by the query and subject |
| posBasesHit | 1000 | int(10) unsigned | For future use |
| testResult | N/A | varchar(255) | For future use |
| verdict | N/A | varchar(255) | For future use |
| chits | N/A | varchar(255) | For future use |
| ccov | N/A | varchar(255) | For future use |
| alignfile | align_both/0004/both0022602 | varchar(255) | alignment file path |
| alignL | 3371 | int(10) unsigned | spaces/positions in alignment |
| indelN | | int(10) unsigned | number of indels |
| indelS | | int(10) unsigned | indel spaces |
| alignB | 3291 | int(10) unsigned | bases Aligned |
| matchB | 3070 | int(10) unsigned | aligned bases that match |
| mismatchB | 221 | int(10) unsigned | aligned bases that do not match |
| transitionsB | 121 | int(10) unsigned | number of transitions |
| transversionsB | 100 | int(10) unsigned | number of transversions |
| fracMatch | 0.932847 | float | fraction of matching bases |
| fracMatchIndel | 0.927492 | float | fraction of matching bases with indels |
| jcK | 0.0703516 | float | K-value calculated with Jukes-Cantor |
| k2K | 0.0705369 | float | Kimura K |
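The bin field can be reproduced from chromStart and chromEnd. The sketch below assumes the table uses the standard UCSC binning scheme (finest bins of 128 kb, each coarser level 8 times larger) and zero-based, half-open coordinates; the function name `bin_from_range` is illustrative and not part of the schema. Under those assumptions it returns the bin values shown in the example data.

```python
# Standard UCSC binning constants (assumed to apply to this table):
# the finest bins span 2**17 bases (128 kb); each coarser level is 8x larger.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17
BIN_NEXT_SHIFT = 3


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest standard bin that fully contains [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        # Move up one level: bins become 8 times wider.
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError("range is outside the 512 Mb covered by the standard scheme")


print(bin_from_range(3531709, 3535022))  # 611
print(bin_from_range(5423434, 5429545))  # 626
```

Because every feature is stored in the smallest bin that contains it, a range query only needs to inspect the few bins that can overlap the query interval rather than scanning a whole chromosome.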

| bin | chrom | chromStart | chromEnd | name | score | strand | otherChrom | otherStart | otherEnd | otherSize | uid | posBasesHit | testResult | verdict | chits | ccov | alignfile | alignL | indelN | indelS | alignB | matchB | mismatchB | transitionsB | transversionsB | fracMatch | fracMatchIndel | jcK | k2K |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 611 | chr1 | 3531709 | 3535022 | chr5:133839256 | | | chr5 | 133839256 | 133842605 | 3349 | 386 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022602 | 3371 | | | 3291 | 3070 | 221 | 121 | 100 | 0.932847 | 0.927492 | 0.0703516 | 0.0705369 |
| 626 | chr1 | 5423434 | 5429545 | chr10:55653424 | | | chr10 | 55653424 | 55659587 | 6163 | 570 | 1000 | N/A | N/A | N/A | N/A | align_both/0000/both0000536 | 6167 | | | 6107 | 5873 | 234 | 136 | | 0.961683 | 0.960425 | 0.0393301 | 0.0394048 |
| 626 | chr1 | 5423434 | 5429532 | chrX:108566838 | | | chrX | 108566838 | 108572997 | 6159 | 498 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022644 | 6159 | | | 6098 | 5894 | 204 | 124 | | 0.966546 | 0.965122 | 0.0342226 | 0.0342915 |
| 626 | chr1 | 5423435 | 5429437 | chr5:7766445 | | | chr5 | 7766445 | 7772518 | 6073 | 387 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022622 | 6076 | | | 5999 | 5732 | 267 | 149 | 118 | 0.955493 | 0.953744 | 0.0458827 | 0.0459669 |
| 626 | chr1 | 5423435 | 5429545 | chr4:97080152 | | | chr4 | 97080152 | 97086317 | 6165 | 355 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022620 | 6172 | | | 6103 | 5877 | 226 | 141 | | 0.962969 | 0.961551 | 0.0379764 | 0.0380718 |
| 626 | chr1 | 5423442 | 5429532 | chr2:151880429 | | | chr2 | 151880429 | 151886585 | 6156 | 301 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022609 | 6183 | | 120 | 6063 | 5887 | 176 | 102 | | 0.970971 | 0.968894 | 0.0296052 | 0.0296465 |
| 626 | chr1 | 5423455 | 5429514 | chr3:11880252 | | | chr3 | 11880252 | 11886348 | 6096 | 317 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022610 | 6119 | | | 6036 | 5597 | 439 | 239 | 200 | 0.92727 | 0.923902 | 0.0765027 | 0.0767171 |
| 626 | chr1 | 5423456 | 5429545 | chr18:89137183 | | | chr18 | 89137183 | 89143333 | 6150 | 16972 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022323 | 6154 | | | 6085 | 5879 | 206 | 130 | | 0.966146 | 0.964719 | 0.0346416 | 0.0347246 |
| 626 | chr1 | 5423456 | 5429545 | chr18:15366528 | | | chr18 | 15366528 | 15372672 | 6144 | 16957 | 1000 | N/A | N/A | N/A | N/A | align_both/0004/both0022172 | 6145 | | | 6088 | 5889 | 199 | 118 | | 0.967313 | 0.966202 | 0.033421 | 0.0334797 |
| 626 | chr1 | 5423456 | 5429437 | chr11:41080586 | | | chr11 | 41080586 | 41086626 | 6040 | 1420 | 1000 | N/A | N/A | N/A | N/A | align_both/0000/both0001261 | 6050 | | | 5971 | 5660 | 311 | 167 | 144 | 0.947915 | 0.94554 | 0.053982 | 0.0540787 |
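The derived columns fracMatch, jcK and k2K appear to follow from the count columns. The sketch below assumes the textbook Jukes-Cantor and Kimura two-parameter distance formulas with alignB as the denominator; the page does not state the exact formulas, so this is an inference that happens to reproduce the first sample row, and the function names are illustrative.

```python
import math


def frac_match(match_b: int, align_b: int) -> float:
    """Fraction of aligned bases that match."""
    return match_b / align_b


def jukes_cantor_k(mismatch_b: int, align_b: int) -> float:
    """Jukes-Cantor distance: K = -3/4 * ln(1 - 4p/3), p = mismatch fraction."""
    p = mismatch_b / align_b
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def kimura_k(transitions_b: int, transversions_b: int, align_b: int) -> float:
    """Kimura two-parameter distance: K = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))."""
    p = transitions_b / align_b    # transition fraction P
    q = transversions_b / align_b  # transversion fraction Q
    return -0.5 * math.log((1.0 - 2.0 * p - q) * math.sqrt(1.0 - 2.0 * q))


# Counts taken from the first sample row (alignB=3291, matchB=3070,
# mismatchB=221, transitionsB=121, transversionsB=100).
print(round(frac_match(3070, 3291), 6))   # table value: 0.932847
print(jukes_cantor_k(221, 3291))          # table value: 0.0703516
print(kimura_k(121, 100, 3291))           # table value: 0.0705369
```

In the rows where both values are present, transitionsB and transversionsB sum to mismatchB (for example, 121 + 100 = 221).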

Description

This track shows regions detected as putative genomic duplications within the golden path. The following display conventions are used to distinguish levels of similarity:

Light to dark gray: 90 - 98% similarity
Light to dark yellow: 98 - 99% similarity
Light to dark orange: greater than 99% similarity
Red: duplications of greater than 98% similarity that lack sufficient Segmental Duplication Database evidence (most likely missed overlaps)
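
The similarity bands above can be sketched as a small classifier. This is a hypothetical helper, not Genome Browser code: it assumes fracMatch is the similarity measure used for shading, and it makes an arbitrary choice at the 98% and 99% boundaries, which the list leaves ambiguous. The red category is not modeled because it depends on Segmental Duplication Database evidence that is not in this table.

```python
def similarity_shade(frac_match: float) -> str:
    """Map a similarity fraction to the color family described for the track."""
    if not 0.90 <= frac_match <= 1.0:
        raise ValueError("track only contains duplications with 90-100% identity")
    if frac_match > 0.99:
        return "orange"
    if frac_match >= 0.98:  # boundary handling is an assumption
        return "yellow"
    return "gray"


print(similarity_shade(0.932847))  # gray
print(similarity_shade(0.985))     # yellow
print(similarity_shade(0.995))     # orange
```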

For a region to be included in the track, at least 1 kb of the total sequence (containing at least 500 bp of non-RepeatMasked sequence) had to align, with a sequence identity of at least 90%.

Methods

Segmental duplications play an important role in both genomic disease and gene evolution. This track displays an analysis of the global organization of these long-range segments of identity in genomic sequence.

Large recent duplications (>= 1 kb and >= 90% identity) were detected by identifying high-copy repeats, removing these repeats from the genomic sequence ("fuguization") and searching all sequence for similarity. The repeats were then reinserted into the pairwise alignments, the ends of alignments trimmed, and global alignments were generated. For a full description of the "fuguization" detection method, see Bailey et al. (2001) in the References section below. This method has become known as WGAC (whole-genome assembly comparison); for example, see Bailey et al. (2002).
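
The coordinate bookkeeping behind removing and reinserting repeats can be illustrated with a toy example. This is only a sketch of the idea, not the pipeline described by Bailey et al. (2001): it assumes repeats are marked as lowercase (soft-masked) bases, and the function names `fuguize` and `to_original_interval` are hypothetical.

```python
from typing import List, Tuple


def fuguize(seq: str) -> Tuple[str, List[int]]:
    """Strip soft-masked (lowercase) bases and remember where the rest came from.

    Returns the repeat-free sequence and, for each retained base, its
    zero-based position in the original sequence.
    """
    kept = []
    positions = []
    for i, base in enumerate(seq):
        if base.isupper():
            kept.append(base)
            positions.append(i)
    return "".join(kept), positions


def to_original_interval(positions: List[int], start: int, end: int) -> Tuple[int, int]:
    """Map a half-open interval on the stripped sequence back to the original.

    The returned interval spans any repeats lying between the retained bases,
    which corresponds to reinserting the repeats into the alignment.
    """
    if not 0 <= start < end <= len(positions):
        raise ValueError("interval is outside the stripped sequence")
    return positions[start], positions[end - 1] + 1


stripped, positions = fuguize("ACGTacgtacgtTTGACC")
print(stripped)                                   # ACGTTTGACC
print(to_original_interval(positions, 2, 7))      # (2, 15)
```

A hit found at positions 2 to 7 of the stripped sequence maps back to positions 2 to 15 of the original, so the masked repeat in the middle is part of the region that is subsequently trimmed and globally aligned.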

Credits

These data were provided by Archana Raja and Evan Eichler at the University of Washington.

References

Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE. Recent segmental duplications in the human genome. Science. 2002 Aug 9;297(5583):1003-7. PMID: 12169732

Bailey JA, Yavor AM, Massa HF, Trask BJ, Eichler EE. Segmental duplications: organization and impact within the current human genome project assembly. Genome Res. 2001 Jun;11(6):1005-17. PMID: 11381028; PMC: PMC311093