ProteoRift Model
Model Description
ProteoRift is an end-to-end machine learning model for peptide database search in mass spectrometry-based proteomics. The model predicts several peptide properties (length, number of missed cleavages, and modification status) directly from spectra, so the database search space can be narrowed to candidates consistent with those predictions.
Usage
Training Data
The model was trained on large-scale mass spectrometry datasets including:
- NIST human peptide libraries
- MassIVE public datasets
System Requirements
- GPU Memory: 12GB+ recommended
- Python: 3.8+
- PyTorch: 1.10+
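A quick way to check an environment against the requirements above is sketched below. The helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical and not part of ProteoRift. The sketch also assumes that "12GB" means 12 × 10^9 bytes and that the first CUDA device is the one that will be used. Versions are compared as integer tuples because plain string comparison would rank "1.9" above "1.10".

```python
import re
import sys
from typing import List, Tuple

MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 10)
# Assumption: "12GB" is read as 12 * 10**9 bytes; the card does not say GB vs GiB.
RECOMMENDED_GPU_BYTES = 12 * 10**9


def parse_version(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric components, e.g. '1.10.0+cu113' -> (1, 10, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: Tuple[int, ...]) -> bool:
    """Return True if `version` is at least `minimum`, compared numerically."""
    return parse_version(version) >= minimum


def check_environment() -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.8 or newer is required")
    try:
        import torch  # imported lazily so the version helpers work without PyTorch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if not meets_minimum(torch.__version__, MIN_TORCH):
        problems.append(f"PyTorch 1.10 or newer is required, found {torch.__version__}")
    if not torch.cuda.is_available():
        problems.append("no CUDA GPU detected")
    else:
        total = torch.cuda.get_device_properties(0).total_memory
        if total < RECOMMENDED_GPU_BYTES:
            problems.append(
                f"GPU 0 has {total / 10**9:.1f} GB, below the recommended 12 GB"
            )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("warning:", problem)
```

Since the GPU memory figure is a recommendation rather than a hard limit, the sketch reports it as a warning instead of raising an error.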
License
This model and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. Any commercial use, sale, or other monetization of this model and its derivatives, which include models trained on outputs from the model or datasets created from the model, is prohibited without prior approval. Downloading the model requires prior registration on Hugging Face and agreement to the terms of use. By downloading this model, you agree not to distribute, publish, or reproduce a copy of the model. If another user within your organization wishes to use the model, they must register as an individual user and agree to comply with the terms of use. Users may not attempt to re-identify the deidentified data used to develop the underlying model. If you are a commercial entity, please contact the corresponding author.