Commit History

Implement vLLM with LLM Compressor and performance optimizations
a79facb

Alikestocode committed on

Migrate to AWQ quantization with FlashAttention-2
06b4cf5

Alikestocode committed on

Update README and clean up old files
9592189

Alikestocode committed on

Update README: Focus on CourseGPT-Pro router checkpoints
4706b45

Alikestocode committed on

Update README with correct space URL
9af4b77

Alikestocode committed on

Initial commit: ZeroGPU LLM Inference Space
f91e906

Alikestocode committed on