leonsarmiento/Hermes-4.3-36B-6bit-mlx
This model leonsarmiento/Hermes-4.3-36B-6bit-mlx was converted to MLX format from NousResearch/Hermes-4.3-36B using mlx-lm version 0.28.3.
Mixed quantization: 8-bit embeddings and prediction layers, 6-bit for everything else.
- Temperature: 0.6
- Top K: 20
- Top P sampling: 0.95
- Min P sampling: OFF
- Repeat penalty: OFF
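The settings above can be expressed with mlx-lm's sampler helper. This is a minimal sketch, not the author's own script: `SAMPLING` and `generate_with_card_settings` are hypothetical names, `max_tokens=1024` is an arbitrary assumption, and the keyword names assume the `make_sampler` signature in recent mlx-lm releases.

```python
# Sampling settings from this card, keyed by mlx-lm's make_sampler argument names.
# min_p=0.0 leaves Min P sampling off.
SAMPLING = {"temp": 0.6, "top_k": 20, "top_p": 0.95, "min_p": 0.0}


def generate_with_card_settings(model, tokenizer, prompt, max_tokens=1024):
    """Generate text using the card's sampling settings (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without mlx installed.
    from mlx_lm import generate
    from mlx_lm.sample_utils import make_sampler

    sampler = make_sampler(**SAMPLING)
    # No logits_processors are passed, so no repetition penalty is applied.
    return generate(
        model,
        tokenizer,
        prompt=prompt,
        sampler=sampler,
        max_tokens=max_tokens,  # assumed limit, adjust as needed
        verbose=True,
    )
```

Leaving out a repetition-penalty logits processor is what keeps "Repeat penalty: OFF" in effect here.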
SYSTEM PROMPT:
"You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem."
Use with mlx
pip install mlx-lm
from mlx_lm import load, generate

# Download (if needed) and load the quantized weights and tokenizer.
model, tokenizer = load("leonsarmiento/Hermes-4.3-36B-6bit-mlx")

prompt = "hello"

# Wrap the prompt in the model's chat format when a template is available.
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
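To use the system prompt given earlier in this card, one option is to place it in a `system` message ahead of the user turn before applying the chat template. The sketch below assumes the model's chat template accepts a `system` role; `build_messages` is a hypothetical helper, and the prompt string is abbreviated and should be replaced with the full text from this card.

```python
def build_messages(system_prompt, user_text):
    """Return a chat message list with the system prompt first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]


# Abbreviated for the example; paste the full system prompt from this card.
SYSTEM_PROMPT = "You are a deep thinking AI, ..."

messages = build_messages(SYSTEM_PROMPT, "hello")

# With the tokenizer loaded as in the usage code, the messages can be rendered via:
#   prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```

Keeping the message construction separate from the tokenizer call makes it easy to swap the system prompt in or out when comparing reasoning and non-reasoning responses.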
Model tree for leonsarmiento/Hermes-4.3-36B-6bit-mlx
Base model: ByteDance-Seed/Seed-OSS-36B-Base
Finetuned: NousResearch/Hermes-4.3-36B