# ZeroGPU-LLM-Inference / requirements.txt
wheel
streamlit
ddgs
gradio>=5.0.0
torch>=2.8.0
transformers>=4.53.3
spaces
sentencepiece
accelerate
vllm>=0.6.0
# llm-compressor is optional: it is only needed for quantizing models, not for loading pre-quantized AWQ checkpoints.
# vLLM has native AWQ support built in (see the commented usage sketch at the end of this file).
# llmcompressor>=0.1.0  # commented out: not needed for loading pre-quantized models
autoawq
flash-attn>=2.5.0
timm
compressed-tensors
bitsandbytes
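
# Minimal sketch of loading a pre-quantized AWQ model with vLLM's native AWQ support,
# kept as comments so pip ignores it. The model ID below is illustrative only, not a
# dependency of this Space.
#
#   from vllm import LLM, SamplingParams
#
#   # Point vLLM at an AWQ checkpoint; quantization="awq" tells it to use the built-in kernels.
#   llm = LLM(model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ", quantization="awq")
#   outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
#   print(outputs[0].outputs[0].text)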