license: mit
tags:
- biology

Model description

MHC-II-EpiPred (MHC II molecular epitope prediction) is a protein language model fine-tuned from the pretrained ESM2 model (facebook/esm2_t33_650M_UR50D) on a T cell MHC II epitope dataset.

MHC-II-EpiPred is a classification model that predicts the class of an MHC II epitope.

Dataset

The original data were downloaded from the IEDB database at https://www.iedb.org/home_v3.php. The full dataset can be downloaded from https://www.iedb.org/downloader.php?file_name=doc/tcell_full_v3.zip
This dataset comprises 543,717 T-cell epitope entries, spanning a variety of species and infections caused by diverse viruses. The epitope information covers a broad range of potential sources, including data relevant to disease immunotherapy.

The dataset used to train the model contains 60,256 positive and negative samples and is stored at https://github.com/pengsihua2023/MHC-II-EpiPred/tree/main/data.

Results

MHC-II-EpiPred achieved the following results:
Training Loss (cross-entropy loss, CEL): 0.0355
Training Accuracy: 0.9916
Training F1: 0.9916
Evaluation Loss (cross-entropy loss, CEL): 0.0537
Evaluation Accuracy: 0.9824
Evaluation F1: 0.9824
Epochs: 39
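
For readers who want to see what the accuracy and F1 figures measure, here is a minimal sketch in plain Python. It assumes binary labels with 1 as the positive class; this card does not state how F1 was averaged, so the function is an illustration rather than the evaluation code used for the numbers above. The function name and the toy labels are hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    `positive` marks the positive class (ASSUMPTION: label 1 means epitope).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels for illustration only, not taken from the MHC-II-EpiPred dataset.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} f1={f1:.4f}")
```

On a balanced dataset, accuracy and F1 tend to be close to each other, which is consistent with the identical values reported above.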

Model training code on GitHub

https://github.com/pengsihua2023/MHC-II-EpiPred

How to use MHC-II-EpiPred

An example

The PyTorch and transformers libraries must be installed on your system.

Install PyTorch

pip install torch torchvision torchaudio

Install transformers

pip install transformers

Run the following code

Coming soon!
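
The official example has not been published yet. In the meantime, the sketch below shows how a fine-tuned ESM2 sequence classifier can typically be loaded and queried with transformers. Several things here are assumptions and not confirmed by this card: the Hub repository id `pengsihua2023/MHC-II-EpiPred` (guessed from the GitHub name), that the checkpoint loads with `AutoModelForSequenceClassification`, and that the tokenizer of the base model facebook/esm2_t33_650M_UR50D is the right one to use. The helper names `clean_sequences` and `predict_epitopes` are hypothetical, and restricting input to the 20 standard amino acids is a choice made for this sketch.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def clean_sequences(sequences):
    """Strip whitespace, upper-case, and reject non-standard residues."""
    cleaned = []
    for seq in sequences:
        seq = seq.strip().upper()
        if not seq or set(seq) - VALID_RESIDUES:
            raise ValueError(f"not a standard amino-acid sequence: {seq!r}")
        cleaned.append(seq)
    return cleaned


def predict_epitopes(
    sequences,
    model_id="pengsihua2023/MHC-II-EpiPred",      # ASSUMED Hub repo id
    tokenizer_id="facebook/esm2_t33_650M_UR50D",  # base model named in this card
):
    """Return one dict per peptide with the predicted label and its probability."""
    # Imported here so clean_sequences can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    peptides = clean_sequences(sequences)
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    batch = tokenizer(peptides, padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    probs = torch.softmax(logits, dim=-1)

    results = []
    for peptide, row in zip(peptides, probs):
        idx = int(row.argmax())
        results.append({
            "sequence": peptide,
            # id2label comes from the checkpoint config; which index means
            # "epitope" is not documented here, so check it before relying on it.
            "label": model.config.id2label[idx],
            "probability": float(row[idx]),
        })
    return results
```

Calling `predict_epitopes(["PKYVKQNTLKLAT"])` with an arbitrary example peptide would download the weights on first use and return a list with one dictionary. If the repository id or label mapping differs from the assumptions above, adjust `model_id` and inspect `model.config.id2label`.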

Funding

This project was supported by CDC funding to Justin Bahl (BAA 75D301-21-R-71738).

Model architecture, coding and implementation

Sihua Peng

Group, Department and Institution

Lab: Justin Bahl

Department: Department of Infectious Diseases, College of Veterinary Medicine

Institution: The University of Georgia
