gghfez/GLM-4.6-control-vectors
Creative writing control vectors for zai-org/GLM-4.6.
Feedback is welcome and would be very helpful.
Usage
When starting llama-server, apply the debias vector together with either the positive or the negative vector for the same axis. Do not apply both the positive and the negative vector: they will cancel each other out.
Pass each vector with either --control-vector [/path/to/vector.gguf] or --control-vector-scaled [/path/to/vector.gguf] [scale factor].
The debias vector must always be applied at a scale of 1.0.
IMPORTANT: The positive and negative axis control vectors must be used along with the relevant debias control vector - they cannot be used on their own!
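The rules above can be sketched as a small shell helper that always pairs the debias vector (fixed at 1.0) with exactly one direction vector. This is only an illustration: `control_vector_args` is a hypothetical function name, and it assumes every file follows the `glm-4.6_<axis>__<direction>.gguf` naming pattern seen in the examples on this page. Scaling the direction vector to a value other than 1.0 is likewise an assumption about how you might want to tune the effect.

```shell
# Hypothetical helper (not part of llama.cpp): prints the control-vector
# flags for one axis, one flag or value per line.
# Usage: control_vector_args <vector_dir> <axis> <direction> [scale]
control_vector_args() {
  cv_dir="$1"
  cv_axis="$2"
  cv_direction="$3"
  cv_scale="${4:-1.0}"   # direction scale defaults to 1.0

  # The debias vector is always emitted at 1.0; only the direction
  # vector receives the caller's scale factor.
  printf '%s\n' \
    --control-vector-scaled "${cv_dir}/glm-4.6_${cv_axis}__debias.gguf" 1.0 \
    --control-vector-scaled "${cv_dir}/glm-4.6_${cv_axis}__${cv_direction}.gguf" "${cv_scale}"
}

# Example (commented out so the snippet runs without llama-server installed;
# relies on word splitting, so the directory path must not contain spaces):
# llama-server --model GLM-4.6-UD-IQ2_XXS-00001-of-00003.gguf \
#   $(control_vector_args ./vectors honesty_vs_machiavellianism machiavellianism 0.5)
```

Because the helper takes a single direction per call, it cannot emit both the positive and the negative vector for the same axis, which avoids the cancelling-out case described above.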
Llama.cpp / IK_Llama.cpp Example
Creative writing
llama-server --model GLM-4.6-UD-IQ2_XXS-00001-of-00003.gguf [your usual CLI arguments] \
--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__debias.gguf 1.0 \
--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__machiavellianism.gguf 1.0
Creative writing without reasoning
llama-server --model GLM-4.6-UD-IQ2_XXS-00001-of-00003.gguf [your usual CLI arguments] \
--chat-template-kwargs '{"enable_thinking": false}' \
--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__debias.gguf 1.0 \
--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__machiavellianism.gguf 1.0
Assistant
llama-server --model GLM-4.6-IQ3_KS-00001-of-00004.gguf [your usual CLI arguments] \
--control-vector-scaled glm-4.6_communication__debias.gguf 1.0 \
--control-vector-scaled glm-4.6_communication__direct_communication.gguf 1.0
Limitations
With reasoning enabled on very low-bit quants such as IQ2_XXS, very simple prompts such as "Hi" may produce irrelevant replies.