huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated

#1469
by osxest - opened

Hello mradermacher team,

Model URL: https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated

Thank you in advance!

It's queued! :D
I'm currently testing the model myself and have to say I'm quite satisfied with it. This one urgently needs quants, as somehow even 3x A100 40 GiB are only barely enough to run it in full precision. This model is extremely GPU-memory hungry when run in vLLM.
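For reference, a minimal sketch of what such a multi-GPU vLLM launch might look like (the model name is from above; the flags are standard vLLM options, but the exact values are assumptions and depend on your hardware and on whether the model's attention head count divides evenly across the GPUs):

```shell
# Serve the model across 3 GPUs with vLLM's OpenAI-compatible server.
# --tensor-parallel-size must evenly divide the model's attention heads;
# if 3 does not, pipeline parallelism is an alternative split.
vllm serve huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated \
    --tensor-parallel-size 3 \
    --max-model-len 8192 \
    --gpu-memory-utilization 0.95
```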

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Huihui-Qwen3-VL-32B-Instruct-abliterated-GGUF for quants to appear.

Our GGUFs will unfortunately be text-only for now and might have to be redone once vision is implemented. Please follow https://github.com/ggml-org/llama.cpp/issues/16207; we will let you know once proper Qwen3-VL support is merged. This does not just affect this model but every Qwen3-VL based model.

Turns out llama.cpp does not currently support Qwen3VLForConditionalGeneration at all so even the text-only quants failed.

In the meantime, I tested the vision capabilities of this model on vLLM, and it really is the first vision model whose vision performance I consider satisfactory for most of my tasks. All vision models I tried before were honestly quite useless and failed even the simplest real-world vision tasks. While there is still big room for improvement, this model is far better than Qwen2.5 or Gemma3 vision, both of which were close to useless.
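For anyone wanting to reproduce such a vision test against a vLLM OpenAI-compatible endpoint, here is a minimal sketch. The model name matches the one above; the helper function, example image URL, and endpoint are assumptions for illustration, and the payload follows the standard OpenAI chat-completions image format that vLLM accepts:

```python
import json

# Model name from the request above; adjust to whatever your server loads.
MODEL = "huihui-ai/Huihui-Qwen3-VL-32B-Instruct-abliterated"

def build_vision_request(image_url: str, prompt: str) -> dict:
    """Return a chat-completions payload containing one image plus a text prompt."""
    return {
        "model": MODEL,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        "max_tokens": 256,
    }

payload = build_vision_request(
    "https://example.com/photo.jpg",
    "Describe what you see in this image.",
)
print(json.dumps(payload, indent=2))
# POST this to /v1/chat/completions on the running vLLM server,
# e.g. requests.post("http://localhost:8000/v1/chat/completions", json=payload)
```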

Let's follow https://github.com/ggml-org/llama.cpp/issues/16207 and do all Qwen3 vision models once proper support for them is implemented in llama.cpp.

Awesome, I just updated our llama.cpp fork. We can now either wait for mradermacher to update all workers, or I can provide a custom version I compile myself to nico1 and rich1. I guess this model is in such high demand that it justifies a custom version.

It's queued! :D

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Huihui-Qwen3-VL-32B-Instruct-abliterated-GGUF for quants to appear.

It unfortunately didn't do mmproj extraction because it is not marked as a vision model. We need @mradermacher to do so. Luckily I still have all the SafeTensors models locally, so it should be easy to add them once the model is marked for MMPROJ extraction.
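For context, mmproj extraction from a local SafeTensors checkout is normally done with llama.cpp's conversion script and its --mmproj flag; a hedged sketch (the local paths and output filename here are illustrative, not the team's actual pipeline):

```shell
# Extract the multimodal projector (mmproj) into its own GGUF file.
# --mmproj tells convert_hf_to_gguf.py to emit the vision projector
# instead of the text model; paths are illustrative.
python convert_hf_to_gguf.py /models/Huihui-Qwen3-VL-32B-Instruct-abliterated \
    --mmproj \
    --outfile mmproj-Huihui-Qwen3-VL-32B-Instruct-abliterated.gguf
```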

I was likely able to mark it as a vision model myself using llmjob is-vision-arch arch Qwen3VLForConditionalGeneration.
