huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated

#1469
by osxest - opened

Hello mradermacher team,

Model URL: https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated

Thank you in advance!

It's queued! :D
I'm currently testing the model myself and have to say I'm quite satisfied with it. This one urgently needs quants, as somehow even 3x A100 40 GiB are only barely enough to run it in full precision. This model is extremely GPU-memory hungry when run in vLLM.
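For reference, a minimal sketch of what such a multi-GPU vLLM launch might look like (the model name is from above; the flags are standard vLLM options, but the exact values are assumptions and depend on your hardware and on whether the model's attention head count divides evenly across the GPUs):

```shell
# Serve the model across 3 GPUs with vLLM's OpenAI-compatible server.
# --tensor-parallel-size must evenly divide the model's attention heads;
# if 3 does not, pipeline parallelism is an alternative split.
vllm serve huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated \
    --tensor-parallel-size 3 \
    --max-model-len 8192 \
    --gpu-memory-utilization 0.95
```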

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Huihui-Qwen3-VL-32B-Instruct-abliterated-GGUF for quants to appear.

Our GGUFs will unfortunately be text-only for now and might have to be redone once vision is implemented. Please follow https://github.com/ggml-org/llama.cpp/issues/16207; we will let you know once proper Qwen3-VL support is merged. This does not just affect this model but every Qwen3-VL based model.

Turns out llama.cpp does not currently support Qwen3VLForConditionalGeneration at all so even the text-only quants failed.

In the meantime, I tested the vision capabilities of this model on vLLM, and it really is the first vision model whose vision performance I consider satisfactory for most of my tasks. All vision models I tried before were honestly quite useless and failed even the simplest real-world vision tasks. While there is still big room for improvement, this model is far better than Qwen2.5 or Gemma3 vision, both of which were close to useless.
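For anyone wanting to reproduce such a vision test against a vLLM OpenAI-compatible endpoint, here is a minimal sketch. The model name matches the one above; the helper function, example image URL, and endpoint are assumptions for illustration, and the payload follows the standard OpenAI chat-completions image format that vLLM accepts:

```python
import json

# Model name from the request above; adjust to whatever your server loads.
MODEL = "huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated"

def build_vision_request(image_url: str, prompt: str) -> dict:
    """Return a chat-completions payload containing one image plus a text prompt."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        "max_tokens": 256,
    }

payload = build_vision_request(
    "https://example.com/photo.jpg",
    "Describe what you see in this image.",
)
print(json.dumps(payload, indent=2))
# POST this to /v1/chat/completions on the running vLLM server,
# e.g. requests.post("http://localhost:8000/v1/chat/completions", json=payload)
```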

Let's follow https://github.com/ggml-org/llama.cpp/issues/16207 and do all Qwen3 vision models once proper support for them is implemented in llama.cpp.

Awesome, I just updated our llama.cpp fork. We can now either wait for mradermacher to update all workers, or I can provide a custom version I compile myself to nico1 and rich1. I guess this model is in such high demand that it justifies a custom version.

It's queued! :D

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Huihui-Qwen3-VL-32B-Instruct-abliterated-GGUF for quants to appear.

It unfortunately didn't do mmproj extraction because it is not marked as a vision model. We need @mradermacher to do so. Luckily I still have all the SafeTensors models locally, so it should be easy to add them once the model is marked for MMPROJ extraction.
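For context, mmproj extraction from a local SafeTensors checkout is normally done with llama.cpp's conversion script and its --mmproj flag; a hedged sketch (the local paths and output filename here are illustrative, not the team's actual pipeline):

```shell
# Extract the multimodal projector (mmproj) into its own GGUF file.
# --mmproj tells convert_hf_to_gguf.py to emit the vision projector
# instead of the text model; paths are illustrative.
python convert_hf_to_gguf.py /models/Huihui-Qwen3-VL-32B-Instruct-abliterated \
    --mmproj \
    --outfile mmproj-Huihui-Qwen3-VL-32B-Instruct-abliterated.gguf
```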

I was likely able to mark it as a vision model myself using llmjob is-vision-arch arch Qwen3VLForConditionalGeneration.
