Schema for CD8 RosettaMHC - CD8 Epitopes predicted by NetMHC and Rosetta
Database: wuhCor1
Primary Table: rosettaMhc
Data last updated: 2020-03-30
Big Bed File Download: /gbdb/wuhCor1/bbi/rosetta.bb
Item Count: 718
The data is stored in the binary BigBed format.

Format description: Browser extensible data, with extended fields for detail page
field         example                  description
chrom         NC_045512v2              Reference sequence chromosome or scaffold
chromStart    20116                    Start position in chromosome
chromEnd      20141                    End position in chromosome
name          TLIGEAVKT                Short Name of item
score         640                      Score from 0-1000
strand        +                        + or -
thickStart    20116                    Start of where display should be thick (start codon)
thickEnd      20141                    End of where display should be thick (stop codon)
reserved      246,169,138              Used as itemRgb as of 2004-11-22
id            335                      ID of peptide
description   YP_009724389_TLIGEAVKT   Description
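
The field list above maps naturally onto a small record type. The following is a minimal sketch, assuming the BigBed file has first been converted to tab-separated text (for example with a tool such as bigBedToBed) with the eleven fields in the order listed. The class name `RosettaMhcItem` and the function `parse_row` are hypothetical names chosen for illustration; they are not part of any UCSC tool.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RosettaMhcItem:
    """One row of the rosettaMhc table (hypothetical container type)."""
    chrom: str
    chrom_start: int
    chrom_end: int
    name: str
    score: int
    strand: str
    thick_start: int
    thick_end: int
    item_rgb: Tuple[int, int, int]  # the 'reserved' field, used as itemRgb
    peptide_id: int
    description: str


def parse_row(line: str) -> RosettaMhcItem:
    """Parse one tab-separated row (assumed layout: the 11 fields above)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    red, green, blue = (int(part) for part in fields[8].split(","))
    return RosettaMhcItem(
        chrom=fields[0],
        chrom_start=int(fields[1]),
        chrom_end=int(fields[2]),
        name=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        thick_start=int(fields[6]),
        thick_end=int(fields[7]),
        item_rgb=(red, green, blue),
        peptide_id=int(fields[9]),
        description=fields[10],
    )


# The example values from the field list, joined with tabs.
row = "\t".join([
    "NC_045512v2", "20116", "20141", "TLIGEAVKT", "640", "+",
    "20116", "20141", "246,169,138", "335", "YP_009724389_TLIGEAVKT",
])
item = parse_row(row)
print(item.name, item.score, item.item_rgb)
```

Keeping the color as a tuple of integers rather than the raw comma-separated string makes it easier to compare or reuse when rendering items by score.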

Sample Rows
 
chrom         chromStart   chromEnd   name         score   strand   thickStart   thickEnd   reserved      id    description
NC_045512v2   20116        20141      TLIGEAVKT    640     +        20116        20141      246,169,138   335   YP_009724389_TLIGEAVKT
NC_045512v2   20161        20186      KVDGVVQQL    548     +        20161        20186      219,220,222   293   YP_009724389_KVDGVVQQL
NC_045512v2   20239        20267      SQMEIDFLEL   423     +        20239        20267      123,158,248   468   YP_009724389_SQMEIDFLEL
NC_045512v2   20242        20270      QMEIDFLELA   481     +        20242        20270      170,198,253   600   YP_009724389_QMEIDFLELA
NC_045512v2   20257        20285      FLELAMDEFI   423     +        20257        20285      123,158,248   469   YP_009724389_FLELAMDEFI
NC_045512v2   20347        20375      SQLGGLHLLI   488     +        20347        20375      176,203,251   613   YP_009724389_SQLGGLHLLI
NC_045512v2   20347        20372      SQLGGLHLL    505     +        20347        20372      189,210,246   315   YP_009724389_SQLGGLHLL
NC_045512v2   20350        20375      QLGGLHLLI    522     +        20350        20375      201,215,238   339   YP_009724389_QLGGLHLLI
NC_045512v2   20506        20534      DLLLDDFVEI   413     +        20506        20534      115,149,244   450   YP_009724389_DLLLDDFVEI
NC_045512v2   20509        20537      LLLDDFVEII   466     +        20509        20537      157,189,254   567   YP_009724389_LLLDDFVEII

CD8 RosettaMHC (rosettaMhc) Track Description
 

Description

As a first step toward the development of diagnostic and therapeutic tools to fight the coronavirus disease (COVID-19), it is important to characterize CD8+ T cell epitopes in the SARS-CoV-2 peptidome that can trigger adaptive immune responses. Here, we use RosettaMHC, a comparative modeling approach that leverages existing high-resolution X-ray structures of peptide/MHC complexes available in the Protein Data Bank, to derive physically realistic 3D models for high-affinity SARS-CoV-2 epitopes. We outline an application of our method to model 439 9mer and 279 10mer predicted epitopes displayed by the common allele HLA-A*02:01, and we make our models publicly available through an online database (https://rosettamhc.chemistry.ucsc.edu). As more detailed studies on antigen-specific T cell recognition become available, RosettaMHC models of antigens from different strains and HLA alleles can be used as a basis to understand how the structure and surface chemistry of peptide/HLA complexes relate to immunogenicity in the context of SARS-CoV-2 infection.

This track includes 718 CD8 epitopes restricted to HLA-A*02:01, as predicted by NetMHCpan4.0 and RosettaMHC. Structural models of all 718 epitopes are available in the online database linked in the paragraph above. Each epitope is scored by combining its NetMHCpan4.0 (eluted ligand) predicted binding affinity with its binding energy calculated using the Rosetta force field. Each quantity is centered on its average and divided by its range, and the result is rescaled to the 0-1000 score range:

  score = (0.5 * (A + E) + 1) * 500

  A = (NetMHCpan affinity - average NetMHCpan affinity) / range of NetMHCpan affinities
  E = (Rosetta binding energy - average Rosetta binding energy) / range of Rosetta binding energies
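
The scoring formula above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' pipeline: "range" is taken to mean maximum minus minimum over the epitope set, the function names `centered` and `combined_scores` are hypothetical, and the input numbers are made-up toy values rather than real NetMHCpan or Rosetta output.

```python
from statistics import mean
from typing import List, Sequence


def centered(values: Sequence[float]) -> List[float]:
    """Subtract the average and divide by the range.

    Assumption: 'range' means max - min over the whole set.
    """
    average = mean(values)
    spread = max(values) - min(values)
    if spread == 0:
        raise ValueError("all values are identical; range is zero")
    return [(value - average) / spread for value in values]


def combined_scores(affinities: Sequence[float],
                    energies: Sequence[float]) -> List[float]:
    """Apply score = (0.5 * (A + E) + 1) * 500 to paired inputs."""
    if len(affinities) != len(energies):
        raise ValueError("affinities and energies must be the same length")
    a_terms = centered(affinities)
    e_terms = centered(energies)
    return [(0.5 * (a + e) + 1) * 500 for a, e in zip(a_terms, e_terms)]


# Toy inputs for three hypothetical epitopes (not real predictions).
toy_affinities = [0.0, 1.0, 2.0]
toy_energies = [-2.0, 0.0, 2.0]
scores = combined_scores(toy_affinities, toy_energies)
print(scores)  # [250.0, 500.0, 750.0]
```

Because each centered term lies between -1 and 1, the combined value always falls between 0 and 1000, which matches the range of the score field in the schema.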

Methods

Epitopes of lengths 9 and 10 from all reading frames of the SARS-CoV-2 proteome are generated and filtered using NetMHCpan4.0 (eluted ligand prediction). All epitopes predicted by NetMHCpan4.0 to be strong or weak binders to HLA-A*02:01 (718 in total, using the default %Rank cut-offs) are modeled using RosettaMHC. The binding energies of all 718 epitopes to HLA-A*02:01 are then calculated in Rosetta. The models, together with their NetMHCpan predictions and binding energies, are made available through a database and in Supplementary Table 1 of Nerli and Sgourakis (2020), listed in the References section below.
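
The enumeration and filtering steps described above can be illustrated with a short sketch. It is not the authors' code: `peptide_windows` and `classify_binder` are hypothetical names, and the %Rank thresholds of 0.5 (strong) and 2.0 (weak) are assumed values that should be checked against the NetMHCpan documentation. The toy input is a 10-residue fragment taken from the sample rows, not a full protein.

```python
from typing import Iterator, Optional, Sequence, Tuple


def peptide_windows(protein: str,
                    lengths: Sequence[int] = (9, 10)) -> Iterator[Tuple[int, str]]:
    """Yield (0-based start, peptide) for every window of each length."""
    for k in lengths:
        for start in range(len(protein) - k + 1):
            yield start, protein[start:start + k]


def classify_binder(percent_rank: float,
                    strong_cutoff: float = 0.5,   # assumed default, verify
                    weak_cutoff: float = 2.0      # assumed default, verify
                    ) -> Optional[str]:
    """Label a %Rank value as 'strong', 'weak', or None (not a binder)."""
    if percent_rank <= strong_cutoff:
        return "strong"
    if percent_rank <= weak_cutoff:
        return "weak"
    return None


# Toy fragment from the sample rows; yields two 9mers and one 10mer.
fragment = "SQLGGLHLLI"
windows = list(peptide_windows(fragment))
for start, peptide in windows:
    print(start, peptide)
```

Applied to the fragment, the sliding window recovers the three overlapping peptides that appear in the sample rows (SQLGGLHLL, QLGGLHLLI, and SQLGGLHLLI), which is why neighboring items in the track often share most of their residues.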

Notes

For a full description of the methods used, refer to Nerli and Sgourakis (2020) in the References section below.

Credits

Nikolaos Sgourakis (nsgourak@ucsc.edu)

Santrupti Nerli (snerli@ucsc.edu)

Data were generated and processed at UCSC. For inquiries, please contact Nikolaos Sgourakis from the Sgourakis Research Group at UCSC.

References

Nerli and Sgourakis (2020). Manuscript submitted (bioRxiv).