leonsarmiento/Hermes-4.3-36B-6bit-mlx

This model leonsarmiento/Hermes-4.3-36B-6bit-mlx was converted to MLX format from NousResearch/Hermes-4.3-36B using mlx-lm version 0.28.3.

MIXED QUANT: 8-BIT EMBEDDINGS AND PREDICTION LAYERS, 6-BIT EVERYTHING ELSE.
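
As a rough, back-of-the-envelope sketch of what the 6-bit setting implies for memory, the snippet below estimates the weight footprint. It assumes MLX affine quantization with a group size of 64 and one 16-bit scale plus one 16-bit bias per group (the usual mlx-lm defaults, which this card does not state). It also ignores the 8-bit embedding and prediction layers, so the real size will be somewhat larger. The function name `estimate_weight_gb` is hypothetical.

```python
def estimate_weight_gb(n_params: float, bits: int, group_size: int = 64) -> float:
    """Approximate quantized weight size in GB (10**9 bytes)."""
    # Assumption: each group of `group_size` weights stores a 16-bit scale
    # and a 16-bit bias, i.e. 32 extra bits shared across the group.
    overhead_bits = 32 / group_size
    bits_per_weight = bits + overhead_bits
    return n_params * bits_per_weight / 8 / 1e9


# 36B parameters (from this card) at 6 bits per weight.
print(f"{estimate_weight_gb(36e9, 6):.2f} GB")
```

Under these assumptions the effective cost is 6.5 bits per weight, which is why the estimate lands above a naive 6-bit calculation.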

Temperature: 0.6
Top K: 20
Repeat penalty: OFF
Min P sampling: OFF
Top P sampling: 0.95
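
One way to apply these settings from Python is through a sampler object, sketched below. It assumes `make_sampler` from `mlx_lm.sample_utils` accepts `temp`, `top_k`, and `top_p` keyword arguments, as it does in recent mlx-lm releases. The options listed as OFF are left at their defaults, and the helper names `build_sampler` and `generate_with_settings` are hypothetical.

```python
# Settings listed on this card; Min P and repeat penalty are OFF, so they are omitted.
SAMPLING = {"temp": 0.6, "top_k": 20, "top_p": 0.95}


def build_sampler(settings=SAMPLING):
    """Create an mlx-lm sampler from the settings above."""
    # Imported lazily so the settings can be inspected without mlx-lm installed.
    from mlx_lm.sample_utils import make_sampler

    return make_sampler(**settings)


def generate_with_settings(model, tokenizer, prompt, max_tokens=1024):
    """Run generation with the card's sampling settings (max_tokens is an arbitrary choice)."""
    from mlx_lm import generate

    return generate(
        model,
        tokenizer,
        prompt=prompt,
        sampler=build_sampler(),
        max_tokens=max_tokens,
    )
```

Without a sampler, `generate` falls back to its defaults rather than the values listed here.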

SYSTEM PROMPT:

"You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem."
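
A minimal sketch of how the system prompt might be passed alongside a user message through the tokenizer's chat template. The helper names `build_messages` and `build_prompt` are hypothetical; `apply_chat_template` is the standard tokenizer method also used in the usage section of this card.

```python
def build_messages(user_prompt: str, system_prompt: str) -> list[dict]:
    """Return a chat message list with the system prompt placed first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def build_prompt(tokenizer, user_prompt: str, system_prompt: str):
    """Render the messages with the model's chat template, ready for generation."""
    messages = build_messages(user_prompt, system_prompt)
    return tokenizer.apply_chat_template(messages, add_generation_prompt=True)
```

Pass the system prompt text quoted above as `system_prompt`, and hand the returned prompt to `generate`.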

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("leonsarmiento/Hermes-4.3-36B-6bit-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
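
When the deep-thinking system prompt is in use, the response is expected to contain the model's reasoning followed by the final answer. The sketch below separates the two, assuming the reasoning is delimited by `<think>` and `</think>` tags. The helper name `split_reasoning` is hypothetical and not part of mlx-lm.

```python
import re


def split_reasoning(text: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer).

    Assumes the reasoning is wrapped in <think>...</think>; if no such
    block is found, the reasoning is returned as an empty string.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first think block is treated as the answer.
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer
```

For example, `reasoning, answer = split_reasoning(response)` leaves only the final answer in `answer` for display.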


Safetensors

Model size: 36B params
Tensor type: BF16, U32

Model tree for leonsarmiento/Hermes-4.3-36B-6bit-mlx

Base model: ByteDance-Seed/Seed-OSS-36B-Base
Finetuned: NousResearch/Hermes-4.3-36B
Quantized (17): this model
