Spaces: Alovestocode/ZeroGPU-LLM-Inference

67.3 kB

1 contributor

History: 22 commits

Latest commit by Alikestocode: Fix syntax error: correct indentation in BitsAndBytes fallback block (f43bdac, 4 months ago)

File                       Size       Updated       Last commit
.gitattributes             1.52 kB    4 months ago  Initial commit: ZeroGPU LLM Inference Space
.gitignore                 27 Bytes   4 months ago  Add .gitignore and remove cache files
README.md                  4.23 kB    4 months ago  Implement vLLM with LLM Compressor and performance optimizations
UI_UX_IMPROVEMENTS.md      6.81 kB    4 months ago  Initial commit: ZeroGPU LLM Inference Space
USER_GUIDE.md              7.9 kB     4 months ago  Initial commit: ZeroGPU LLM Inference Space
app.py                     33.2 kB    4 months ago  Fix syntax error: correct indentation in BitsAndBytes fallback block
apt.txt                    11 Bytes   4 months ago  Initial commit: ZeroGPU LLM Inference Space
requirements.txt           197 Bytes  4 months ago  Implement vLLM with LLM Compressor and performance optimizations
style.css                  2.84 kB    4 months ago  Initial commit: ZeroGPU LLM Inference Space
test_api.py                3.43 kB    4 months ago  Migrate to AWQ quantization with FlashAttention-2
test_api_gradio_client.py  7.2 kB     4 months ago  Implement vLLM with LLM Compressor and performance optimizations