# ZeroGPU-LLM-Inference / requirements.txt
wheel
streamlit
ddgs
gradio>=5.0.0
torch>=2.8.0
transformers>=4.53.3
spaces
sentencepiece
accelerate
vllm>=0.6.0
# llm-compressor is optional: it is only needed for quantizing models, not for loading pre-quantized AWQ checkpoints.
# vLLM has native AWQ support built in (see the commented usage sketch at the end of this file).
# llmcompressor>=0.1.0  # commented out: not needed for loading pre-quantized models
autoawq
flash-attn>=2.5.0
timm
compressed-tensors
bitsandbytes
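
# Minimal sketch of loading a pre-quantized AWQ model with vLLM's native AWQ support,
# kept as comments so pip ignores it. The model ID below is illustrative only, not a
# dependency of this Space.
#
#   from vllm import LLM, SamplingParams
#
#   # Point vLLM at an AWQ checkpoint; quantization="awq" tells it to use the built-in kernels.
#   llm = LLM(model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ", quantization="awq")
#   outputs = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
#   print(outputs[0].outputs[0].text)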