Schema for Pairwise Alignments - Human Genomes, Chain/Net pairwise alignments, as mapped by the HPRC project


| field | example | SQL type | info | description |
| --- | --- | --- | --- | --- |
| bin | | smallint(5) unsigned | range | Indexing field to speed chromosome range queries. |
| score | 71992464 | double | range | score of chain |
| tName | chr1 | varchar(255) | values | Target sequence name |
| tSize | 248956422 | int(10) unsigned | range | Target sequence size |
| tStart | 260783 | int(10) unsigned | range | Alignment start position in target |
| tEnd | 1356910 | int(10) unsigned | range | Alignment end position in target |
| qName | JAHBBZ010000189.1 | varchar(255) | values | Query sequence name |
| qSize | 803069 | int(10) unsigned | range | Query sequence size |
| qStrand | | char(1) | values | Query strand |
| qStart | 12731 | int(10) unsigned | range | Alignment start position in query |
| qEnd | 803069 | int(10) unsigned | range | Alignment end position in query |
| id | 732620 | int(10) unsigned | range | chain id |
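
To make the field list concrete, here is a minimal Python sketch of one row as a typed record. The class name `ChainRow` and its two helper properties are illustrative and not part of the browser's code. The values come from the example column; no example is listed for `bin` or `qStrand`, so the values used for them are placeholders. The span arithmetic assumes zero-based, half-open coordinates, which is the usual convention for UCSC tables but is not stated on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChainRow:
    """One row of the chain table; field names follow the schema."""
    bin: int
    score: float
    tName: str
    tSize: int
    tStart: int
    tEnd: int
    qName: str
    qSize: int
    qStrand: str
    qStart: int
    qEnd: int
    id: int

    @property
    def target_span(self) -> int:
        # Bases covered on the target, gaps included (half-open assumed).
        return self.tEnd - self.tStart

    @property
    def query_span(self) -> int:
        # Bases covered on the query, gaps included (half-open assumed).
        return self.qEnd - self.qStart


row = ChainRow(
    bin=0,        # placeholder: no example value is listed for bin
    score=71992464,
    tName="chr1", tSize=248956422, tStart=260783, tEnd=1356910,
    qName="JAHBBZ010000189.1", qSize=803069,
    qStrand="+",  # placeholder: no example value is listed for qStrand
    qStart=12731, qEnd=803069,
    id=732620,
)
print(row.target_span, row.query_span)
```

Under these assumptions the example chain covers 1096127 bases of chr1 and 790338 bases of the query contig.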

| bin | score | tName | tSize | tStart | tEnd | qName | qSize | qStrand | qStart | qEnd | id |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | 71992464 | chr1 | 248956422 | 260783 | 1356910 | JAHBBZ010000189.1 | 803069 | | 12731 | 803069 | 732620 |
| | 85029734 | chr1 | 248956422 | 1428667 | 2325086 | JAHBBZ010000201.1 | 897271 | | | 897259 | 730927 |
| | 31627463 | chr1 | 248956422 | 2325147 | 2656440 | JAHBBZ010000181.1 | 465475 | | 1313 | 341483 | 736862 |
| | 704510269 | chr1 | 248956422 | 2752826 | 10188049 | JAHBBZ010000082.1 | 7546380 | | 58289 | 7546378 | 663308 |
| | 3928404109 | chr1 | 248956422 | 10191328 | 51906853 | JAHBBZ010000033.1 | 42227865 | | | 42227157 | 221471 |
| 711 | 1286715 | chr1 | 248956422 | 16552868 | 16567539 | JAHBBZ010000033.1 | 42227865 | | 6538174 | 6559355 | 745378 |
| | 10069830 | chr1 | 248956422 | 16565775 | 16680665 | JAHBBZ010000033.1 | 42227865 | | 6285758 | 6494621 | 742453 |
| 788 | 513918 | chr1 | 248956422 | 26641331 | 26646733 | JAHBBZ010000033.1 | 42227865 | | 25301635 | 25307036 | 745855 |
| | 3278740027 | chr1 | 248956422 | 51908358 | 86797362 | JAHBBZ010000076.1 | 34843587 | | | 34843369 | 325603 |
| | 5183394290 | chr1 | 248956422 | 86800940 | 168472103 | JAHBBZ010000075.1 | 73274481 | | 1176 | 73274446 | 113928 |
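
The bin field is described only as an index that speeds up range queries. The sketch below assumes this table uses the standard UCSC binning scheme (five levels, with the smallest bins covering 128 kb); that scheme is not stated on this page. Under that assumption, the function reproduces the two bin values that appear in the sample rows, 711 and 788, from their tStart and tEnd values.

```python
# Constants of the standard UCSC binning scheme (assumed to apply here).
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0]
BIN_FIRST_SHIFT = 17  # smallest bins cover 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3    # each level up covers 8 times more sequence


def bin_from_range(start: int, end: int) -> int:
    """Return the smallest bin that fully contains the range [start, end)."""
    start_bin = start >> BIN_FIRST_SHIFT
    end_bin = (end - 1) >> BIN_FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= BIN_NEXT_SHIFT
        end_bin >>= BIN_NEXT_SHIFT
    raise ValueError(f"range {start}-{end} does not fit the 5-level scheme")


print(bin_from_range(16552868, 16567539))  # 711
print(bin_from_range(26641331, 26646733))  # 788
```

A range query then only has to look at the bins that can overlap the requested region instead of scanning every chain on the chromosome.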

Description

This track shows regions of the human genome that are alignable to other Homo sapiens genomes. The alignable parts are shown with thick blocks that look like exons. Non-alignable parts between these are shown with thin lines like introns. More description on this display can be found below.

Other assemblies included in this track are from the HPRC project.

Display Conventions and Configuration

Chain Track

The chain track shows alignments of the human genome to other Homo sapiens genomes using a gap scoring system that allows longer gaps than traditional affine gap scoring systems. It can also tolerate gaps in both the query and target assemblies simultaneously. These "double-sided" gaps can be caused by local inversions and overlapping deletions in both assemblies.

The chain track displays boxes joined together by either single or double lines. The boxes represent aligning regions. Single lines indicate gaps that are largely due to a deletion in the query assembly or an insertion in the target assembly. Double lines represent more complex gaps that involve substantial sequence in both assemblies. This may result from inversions, overlapping deletions, an abundance of local mutation, or an unsequenced gap in one assembly. In cases where multiple chains align over a particular region of the target genome, the chains with single-lined gaps are often due to processed pseudogenes, while chains with double-lined gaps are more often due to paralogs and unprocessed pseudogenes.
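
The single-line versus double-line distinction can be sketched in a few lines of Python. This is a simplified illustration, not the browser's drawing code: it treats any gap with unaligned bases on both sides as "double", whereas the text above speaks of "substantial sequence", so the real rule may apply a threshold. The `Block` type and the coordinates are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    """An ungapped aligned block: target start, query start, length."""
    t_start: int
    q_start: int
    size: int


def classify_gaps(blocks: list[Block]) -> list[str]:
    """Label the gap between each pair of consecutive blocks."""
    labels = []
    for left, right in zip(blocks, blocks[1:]):
        t_gap = right.t_start - (left.t_start + left.size)
        q_gap = right.q_start - (left.q_start + left.size)
        # Unaligned bases on both sides: double line; otherwise single line.
        labels.append("double" if t_gap > 0 and q_gap > 0 else "single")
    return labels


# Hypothetical blocks, not taken from the track data.
blocks = [Block(100, 50, 20), Block(130, 70, 15), Block(160, 95, 10)]
print(classify_gaps(blocks))  # ['single', 'double']
```

The first gap has 10 unaligned target bases and none in the query, so it is single-sided; the second has unaligned bases on both sides.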

In the "pack" and "full" display modes, the individual feature names indicate the chromosome, strand, and location (in thousands) of the match for each matching alignment.
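
As a rough illustration of such a name, the helper below builds a label from the query sequence name, strand, and start position in thousands. The exact layout and rounding used by the browser are assumptions here, and the "+" strand is a placeholder because no strand value is shown in the sample rows.

```python
def chain_label(q_name: str, q_strand: str, q_start: int) -> str:
    """Build a display label: sequence name, strand, start in thousands."""
    # Integer division truncates to whole thousands (assumed rounding rule).
    return f"{q_name} {q_strand} {q_start // 1000}k"


# qName and qStart from the first sample row; the strand is a placeholder.
print(chain_label("JAHBBZ010000189.1", "+", 12731))  # JAHBBZ010000189.1 + 12k
```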

By default, the chains to chromosome-based assemblies are colored based on which chromosome they map to in the aligning organism. To turn off the coloring, check the "off" button next to: Color track based on chromosome.

To display only the chains of one chromosome in the aligning organism, enter the name of that chromosome (e.g. chr4) in the box next to: Filter by chromosome.
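
The same filter can be applied to downloaded table rows. The sketch below keeps only the chains whose query sequence matches a given name, using the qName, tStart, tEnd, and id values from the sample rows shown earlier. In this track the query sequences are contigs such as JAHBBZ010000033.1 rather than chromosomes like chr4. The function name `filter_by_query` is illustrative.

```python
# (qName, tStart, tEnd, id) for the ten sample rows.
SAMPLE_ROWS = [
    ("JAHBBZ010000189.1", 260783, 1356910, 732620),
    ("JAHBBZ010000201.1", 1428667, 2325086, 730927),
    ("JAHBBZ010000181.1", 2325147, 2656440, 736862),
    ("JAHBBZ010000082.1", 2752826, 10188049, 663308),
    ("JAHBBZ010000033.1", 10191328, 51906853, 221471),
    ("JAHBBZ010000033.1", 16552868, 16567539, 745378),
    ("JAHBBZ010000033.1", 16565775, 16680665, 742453),
    ("JAHBBZ010000033.1", 26641331, 26646733, 745855),
    ("JAHBBZ010000076.1", 51908358, 86797362, 325603),
    ("JAHBBZ010000075.1", 86800940, 168472103, 113928),
]


def filter_by_query(rows, q_name):
    """Keep only the chains whose query sequence name equals q_name."""
    return [row for row in rows if row[0] == q_name]


kept = filter_by_query(SAMPLE_ROWS, "JAHBBZ010000033.1")
print([chain_id for _, _, _, chain_id in kept])
# [221471, 745378, 742453, 745855]
```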

Methods

The bigChain files were obtained from the HPRC S3 bucket (Amazon Web Services). For more information about how the bigChain files were generated, please refer to the HPRC publication below.

Credits

Thank you to Glenn Hickey for providing the HAL file from the HPRC project.

References

Liao WW, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, Lu S, Lucas JK, Monlong J, Abel HJ et al. A draft human pangenome reference. Nature. 2023 May;617(7960):312-324. DOI: 10.1038/s41586-023-05896-x; PMID: 37165242; PMC: PMC10172123

Hickey G, Monlong J, Ebler J, Novak AM, Eizenga JM, Gao Y, Human Pangenome Reference Consortium, Marschall T, Li H, Paten B. Pangenome graph construction from genome alignments with Minigraph-Cactus. Nat Biotechnol. 2023 May 10. DOI: 10.1038/s41587-023-01793-w; PMID: 37165083; PMC: PMC10638906

Armstrong J, Hickey G, Diekhans M, Fiddes IT, Novak AM, Deran A, Fang Q, Xie D, Feng S, Stiller J et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020 Nov;587(7833):246-251. DOI: 10.1038/s41586-020-2871-y; PMID: 33177663; PMC: PMC7673649

Paten B, Earl D, Nguyen N, Diekhans M, Zerbino D, Haussler D. Cactus: Algorithms for genome multiple sequence alignment. Genome Res. 2011 Sep;21(9):1512-28. DOI: 10.1101/gr.123356.111; PMID: 21665927; PMC: PMC3166836