Update README.md
README.md
CHANGED
|
@@ -1,6 +1,12 @@
|
|
| 1 |
---
|
| 2 |
base_model: NeeruAjith/SpiderSeqGen
|
| 3 |
library_name: peft
|
| 4 |
---
|
| 5 |
|
| 6 |
# Model Card for Model ID
|
|
@@ -17,25 +23,31 @@ library_name: peft
|
|
| 17 |
|
| 18 |
|
| 19 |
|
| 20 |
-
- **Developed by:**
|
| 21 |
- **Funded by [optional]:** [More Information Needed]
|
| 22 |
- **Shared by [optional]:** [More Information Needed]
|
| 23 |
-
- **Model type:**
|
| 24 |
-
- **Language(s) (NLP):**
|
| 25 |
- **License:** [More Information Needed]
|
| 26 |
-
- **Finetuned from model [optional]:**
|
| 27 |
|
| 28 |
### Model Sources [optional]
|
| 29 |
|
| 30 |
<!-- Provide the basic links for the model. -->
|
| 31 |
|
| 32 |
- **Repository:** [More Information Needed]
|
| 33 |
-
- **Paper [optional]:**
|
| 34 |
- **Demo [optional]:** [More Information Needed]
|
| 35 |
|
| 36 |
## Uses
|
| 37 |
|
| 38 |
-
<!--
|
| 39 |
|
| 40 |
### Direct Use
|
| 41 |
|
|
|
|
| 1 |
---
|
| 2 |
base_model: NeeruAjith/SpiderSeqGen
|
| 3 |
library_name: peft
|
| 4 |
+
language:
|
| 5 |
+
- en
|
| 6 |
+
tags:
|
| 7 |
+
- protein
|
| 8 |
+
- spiderfiber
|
| 9 |
+
- generativemodel
|
| 10 |
---
|
| 11 |
|
| 12 |
# Model Card for Model ID
|
|
|
|
| 23 |
|
| 24 |
|
| 25 |
|
| 26 |
+
- **Developed by:** Neeru Dubey (NeeruAjith)
|
| 27 |
- **Funded by [optional]:** [More Information Needed]
|
| 28 |
- **Shared by [optional]:** [More Information Needed]
|
| 29 |
+
- **Model type:** GPT
|
| 30 |
+
- **Language(s) (NLP):** English
|
| 31 |
- **License:** [More Information Needed]
|
| 32 |
+
- **Finetuned from model [optional]:** ProtGPT2
|
| 33 |
|
| 34 |
### Model Sources [optional]
|
| 35 |
|
| 36 |
<!-- Provide the basic links for the model. -->
|
| 37 |
|
| 38 |
- **Repository:** [More Information Needed]
|
| 39 |
+
- **Paper [optional]:** https://arxiv.org/abs/2504.08437
|
| 40 |
- **Demo [optional]:** [More Information Needed]
|
| 41 |
|
| 42 |
## Uses
|
| 43 |
|
| 44 |
+
<!-- The exceptional mechanical performance of spider silk, characterized by its high tensile strength and extensibility, is largely attributed to the repetitive domains within major ampullate spidroin (MaSp) proteins. Even so, establishing clear relationships between these mechanical properties and the underlying repeat sequences remains a significant challenge, owing to the complex interplay among sequence, structure, and function, as well as the scarcity of comprehensive annotated datasets.
|
| 45 |
+
|
| 46 |
+
In this work, we introduce a computational approach for the design of MaSp repeat sequences with tunable mechanical characteristics. Our framework is built upon a compact GPT-style generative model, derived by distilling the ProtGPT2 protein language model. The resulting model underwent a two-stage fine-tuning process using curated data from the Spider Silkome resource. First, it was trained on 6,000 repeat sequences to learn MaSp-specific patterns. It was then fine-tuned through cross-validation on 592 repeats annotated with experimentally measured fiber-level mechanical properties.
|
| 47 |
+
|
| 48 |
+
This generative model produces biologically relevant repeat sequences aligned with target mechanical attributes, and also predicts properties from given sequences. Model validation was performed through analyses of sequence motifs, physicochemical properties, and secondary structure trends. In addition, generated sequences were aligned against natural sequences using BLAST, and prediction accuracy was evaluated on a test set with known mechanical properties.
|
| 49 |
+
|
| 50 |
+
Overall, our framework provides a new avenue for the rational design of spider silk-inspired proteins, offering a flexible tool for engineering MaSp repeats with desired mechanical functions. -->
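The description above mentions validating generated repeats through sequence motifs and physicochemical properties. A minimal Python sketch of such a check is given here, under stated assumptions: the motif patterns (poly-alanine runs, GGX, GPGXX) are ones commonly described for MaSp repeats, the function names `motif_counts` and `composition` are hypothetical, and the example sequence is made up. This is not the authors' validation pipeline.

```python
import re
from collections import Counter

# Illustrative motif patterns often described for MaSp repeats.
# These regexes are assumptions for this sketch, not the authors' definitions.
MOTIFS = {
    "poly_A": re.compile(r"A{4,}"),     # runs of four or more alanines
    "GGX": re.compile(r"(?=GG[^G])"),   # Gly-Gly-X with X != Gly, overlaps allowed
    "GPGXX": re.compile(r"(?=GPG..)"),  # Gly-Pro-Gly-X-X, overlaps allowed
}


def motif_counts(sequence: str) -> dict:
    """Count occurrences of each illustrative motif in a one-letter sequence."""
    seq = sequence.upper()
    return {name: len(pattern.findall(seq)) for name, pattern in MOTIFS.items()}


def composition(sequence: str) -> dict:
    """Return the fraction of each residue in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {residue: n / len(seq) for residue, n in counts.items()}


if __name__ == "__main__":
    example = "GGAGQGGYGPGQQAAAAAA"  # made-up repeat fragment, not model output
    print(motif_counts(example))
    print(composition(example))
```

Comparing these counts and residue fractions between generated and natural repeats gives a quick sanity check before heavier analyses such as BLAST alignment or secondary structure prediction.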
|
| 51 |
|
| 52 |
### Direct Use
|
| 53 |
|