---
library_name: transformers
pipeline_tag: text-generation
license: apache-2.0
tags:
- text-generation
- lora
- peft
- presentation-templates
- information-retrieval
- gemma
base_model:
- unsloth/gemma-3-4b-it-unsloth-bnb-4bit
datasets:
- cyberagent/crello
language:
- en
---
# Field-adaptive-query-generator
## Model Details
### Model Description
A text-generation model fine-tuned to produce search queries from presentation-template metadata. It applies LoRA adapters to Google Gemma-3-4B for parameter-efficient fine-tuning, generating diverse, relevant queries as a component of the Field-Adaptive Dense Retrieval framework.
- **Developed by:** Mudasir Syed (mudasir13cs)
- **Model type:** Causal Language Model with LoRA
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** unsloth/gemma-3-4b-it-unsloth-bnb-4bit
- **Paper:** [Field-Adaptive Dense Retrieval of Structured Documents](https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE12352544)
### Model Sources
- **Repository:** https://github.com/mudasir13cs/hybrid-search
- **Paper:** https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE12352544
- **Base Model:** https://huggingface.co/unsloth/gemma-3-4b-it-unsloth-bnb-4bit
## Uses
### Direct Use
This model is designed for generating search queries from presentation template metadata including titles, descriptions, industries, categories, and tags. It serves as a key component in the Field-Adaptive Dense Retrieval system for structured documents.
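The model expects flat template metadata serialized into a single user turn. As a minimal sketch (the `build_query_prompt` helper and its field names are illustrative, not part of the released code), the metadata can be assembled like this:

```python
# Illustrative helper (not part of the released code) that serializes template
# metadata into the prompt format shown in "How to Get Started" below.
def build_query_prompt(metadata: dict, num_queries: int = 8) -> str:
    lines = [
        f"Generate {num_queries} different search queries that users might use to find this presentation template:",
        f"Title: {metadata['title']}",
        f"Description: {metadata['description']}",
        f"Industries: {', '.join(metadata['industries'])}",
        f"Categories: {', '.join(metadata['categories'])}",
        f"Tags: {', '.join(metadata['tags'])}",
    ]
    # Wrap the request in Gemma chat turns so the model sees the same format it was trained on
    return "<start_of_turn>user\n" + "\n".join(lines) + "\n<end_of_turn>\n<start_of_turn>model\n"
```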
### Downstream Use
- Content generation systems
- SEO optimization tools
- Template recommendation engines
- Automated content creation
- Field-adaptive search query generation
- Dense retrieval systems for structured documents
- Query expansion and reformulation
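For the retrieval-oriented uses above, the sketch below shows one generic way generated queries can be scored against candidate documents with an off-the-shelf dense encoder. This is not the Field-Adaptive Dense Retrieval implementation from the paper; the `sentence-transformers` encoder and the query strings are only examples.

```python
# Generic illustration of using generated queries for dense retrieval scoring;
# NOT the Field-Adaptive Dense Retrieval implementation from the paper.
from sentence_transformers import SentenceTransformer, util

# Assumed off-the-shelf encoder; any dense sentence encoder would work here
encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Placeholder queries as this model might generate them for one template
expanded_queries = [
    "modern business presentation template",
    "minimalist corporate slide deck",
    "professional marketing pitch deck",
]

# Score a candidate document against every expanded query and keep the best match
query_emb = encoder.encode(expanded_queries, convert_to_tensor=True)
doc_emb = encoder.encode("Minimalist corporate slide deck for marketing teams", convert_to_tensor=True)
scores = util.cos_sim(query_emb, doc_emb).squeeze(-1)
print(float(scores.max()))
```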
### Out-of-Scope Use
- Factual information generation
- Medical or legal advice
- Harmful content generation
- Tasks unrelated to presentation templates or structured document retrieval
## Bias, Risks, and Limitations
- The model may generate biased or stereotypical content based on training data
- Generated content should be reviewed for accuracy and appropriateness
- Performance depends on input quality and relevance
- Model outputs are optimized for presentation template domain
## How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model = AutoModelForCausalLM.from_pretrained("mudasir13cs/Field-adaptive-query-generator")
tokenizer = AutoTokenizer.from_pretrained("mudasir13cs/Field-adaptive-query-generator")

# Build the prompt using the Gemma chat-turn format
input_text = """<start_of_turn>user
Generate 8 different search queries that users might use to find this presentation template:
Title: Modern Business Presentation
Description: This modern business presentation template features a minimalist design...
Industries: Business, Marketing
Categories: Corporate, Professional
Tags: Modern, Clean, Professional
<end_of_turn>
<start_of_turn>model
"""

# Generate search queries
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```
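If the tokenizer ships Gemma's chat template (an assumption; inspect `tokenizer.chat_template` to confirm), the same prompt can be built with `apply_chat_template` instead of hand-written turn markers:

```python
# Equivalent prompt construction via the tokenizer's chat template
# (assumes the tokenizer includes Gemma's chat template).
messages = [
    {
        "role": "user",
        "content": (
            "Generate 8 different search queries that users might use to find this presentation template:\n"
            "Title: Modern Business Presentation\n"
            "Description: This modern business presentation template features a minimalist design...\n"
            "Industries: Business, Marketing\n"
            "Categories: Corporate, Professional\n"
            "Tags: Modern, Clean, Professional"
        ),
    }
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens after the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```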
## Training Details
### Training Data
- **Dataset:** Presentation template dataset with metadata
- **Size:** Custom dataset with template-query pairs
- **Source:** Curated presentation template collection from structured documents
- **Domain:** Presentation templates with field-adaptive metadata
### Training Procedure
- **Architecture:** Google Gemma-3-4B with LoRA adapters
- **Base Model:** unsloth/gemma-3-4b-it-unsloth-bnb-4bit
- **Loss Function:** Cross-entropy loss
- **Optimizer:** AdamW
- **Learning Rate:** 2e-4
- **Batch Size:** 4
- **Epochs:** 3
- **Framework:** Unsloth for efficient fine-tuning
### Training Hyperparameters
- **Training regime:** Supervised fine-tuning with LoRA (PEFT)
- **LoRA Rank:** 16
- **LoRA Alpha:** 32
- **Hardware:** GPU (NVIDIA)
- **Training time:** ~3 hours
- **Fine-tuning method:** Parameter-Efficient Fine-Tuning (PEFT)
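For reference, the hyperparameters listed above map roughly onto the PEFT configuration sketched below. This is an assumed reconstruction rather than the released training script: the original run used Unsloth, and the target modules, dropout, and precision flags are guesses.

```python
# Sketch of a comparable LoRA setup with plain PEFT; the released run used Unsloth,
# and target_modules, dropout, and bf16 are assumptions, not the exact original config.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                      # LoRA rank, as listed above
    lora_alpha=32,             # LoRA alpha, as listed above
    lora_dropout=0.05,         # assumed; not reported in this card
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
)

# Training arguments mirroring the reported learning rate, batch size, and epoch count
training_args = TrainingArguments(
    output_dir="field-adaptive-query-generator",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    num_train_epochs=3,
    logging_steps=10,
    bf16=True,                 # assumed; depends on the GPU used
)
```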
## Evaluation
### Testing Data, Factors & Metrics
- **Testing Data:** Validation split from template dataset
- **Factors:** Content quality, relevance, diversity, field-adaptive retrieval performance
- **Metrics:**
- BLEU score
- ROUGE score
- Human evaluation scores
- Query relevance metrics
- Retrieval accuracy metrics
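The automatic metrics can be reproduced with the `evaluate` library along the lines of the sketch below; the query strings are placeholders, not the validation data behind the reported scores.

```python
# Sketch of BLEU/ROUGE scoring with the `evaluate` library; the strings below are
# placeholders, not the actual validation split used for the reported numbers.
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

predictions = ["modern business presentation template with minimalist design"]
references = ["minimalist modern business presentation template"]

# BLEU expects a list of reference lists per prediction; ROUGE accepts flat strings
print(bleu.compute(predictions=predictions, references=[[r] for r in references])["bleu"])
print(rouge.compute(predictions=predictions, references=references)["rougeL"])
```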
### Results
- **BLEU Score:** ~0.75
- **ROUGE Score:** ~0.80
- **Performance:** Optimized for query generation quality in structured document retrieval
- **Domain:** High performance on presentation template metadata
## Environmental Impact
- **Hardware Type:** NVIDIA GPU
- **Hours used:** ~3 hours
- **Cloud Provider:** Local/Cloud
- **Carbon Emitted:** Minimal (LoRA training with efficient Unsloth framework)
## Technical Specifications
### Model Architecture and Objective
- **Base Architecture:** Google Gemma-3-4B transformer decoder
- **Adaptation:** LoRA adapters for parameter-efficient fine-tuning
- **Objective:** Generate relevant search queries from template metadata for field-adaptive dense retrieval
- **Input:** Template metadata (title, description, industries, categories, tags)
- **Output:** Generated search queries for structured document retrieval
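Because the output is free-form text, downstream retrieval code usually needs to split it into individual queries. The `parse_queries` helper below is an assumed post-processing sketch: it presumes the model emits one query per line, possibly numbered or bulleted, which should be verified against actual outputs.

```python
import re

# Assumed post-processing sketch: split the generated text into individual queries,
# presuming one query per line, optionally prefixed with a number or bullet.
def parse_queries(generated_text: str) -> list[str]:
    queries = []
    for line in generated_text.splitlines():
        # Strip common list prefixes such as "1.", "2)", "-", or "*"
        cleaned = re.sub(r"^\s*(?:\d+[\.\)]|[-*])\s*", "", line).strip()
        if cleaned:
            queries.append(cleaned)
    return queries

print(parse_queries("1. modern business slides\n2. minimalist corporate deck"))
```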
### Compute Infrastructure
- **Hardware:** NVIDIA GPU
- **Software:** PyTorch, Transformers, PEFT, Unsloth
## Citation
**Paper:**
```bibtex
@article{field_adaptive_dense_retrieval,
  title={Field-Adaptive Dense Retrieval of Structured Documents},
  author={Mudasir Syed},
  journal={DBPIA},
  year={2024},
  url={https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE12352544}
}
```
**Model:**
```bibtex
@misc{field_adaptive_query_generator,
  title={Field-adaptive-query-generator for Presentation Template Query Generation},
  author={Mudasir Syed},
  year={2024},
  howpublished={Hugging Face},
  url={https://huggingface.co/mudasir13cs/Field-adaptive-query-generator}
}
```
**APA:**
Syed, M. (2024). Field-adaptive-query-generator for Presentation Template Query Generation. Hugging Face. https://huggingface.co/mudasir13cs/Field-adaptive-query-generator
## Model Card Authors
Mudasir Syed (mudasir13cs)
## Model Card Contact
- **GitHub:** https://github.com/mudasir13cs
- **Hugging Face:** https://huggingface.co/mudasir13cs
- **LinkedIn:** https://pk.linkedin.com/in/mudasir-sayed
## Framework versions
- Transformers: 4.35.0+
- PEFT: 0.16.0+
- PyTorch: 2.0.0+
- Unsloth: Latest