---
base_model: NousResearch/Hermes-4.3-36B
language:
- en
library_name: mlx
license: apache-2.0
pipeline_tag: text-generation
tags:
- Bytedance Seed
- instruct
- finetune
- reasoning
- hybrid-mode
- chatml
- function calling
- tool use
- json mode
- structured outputs
- atropos
- dataforge
- long context
- roleplaying
- chat
- mlx
widget:
- example_title: Hermes 4
  messages:
  - role: system
    content: >-
      You are Hermes 4, a capable, neutrally-aligned assistant. Prefer
      concise, correct answers.
  - role: user
    content: Explain the difference between BFS and DFS to a new CS student.
model-index:
- name: Hermes-4.3-ByteDance-Seed-36B
  results: []
---
# leonsarmiento/Hermes-4.3-36B-3bit-mlx

This model [leonsarmiento/Hermes-4.3-36B-3bit-mlx](https://huggingface.co/leonsarmiento/Hermes-4.3-36B-3bit-mlx) was converted to MLX format from [NousResearch/Hermes-4.3-36B](https://huggingface.co/NousResearch/Hermes-4.3-36B) using mlx-lm version **0.28.3**.
**MIXED QUANT:** 6-bit embeddings and prediction layers, 3-bit everything else.

Recommended sampling settings:

- Temperature: 0.6
- Top K: 20
- Top P: 0.95
- Min P sampling: off
- Repeat penalty: off
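A minimal sketch of applying these settings in Python, assuming the `make_sampler` helper from `mlx_lm.sample_utils` (available in recent mlx-lm releases):

```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("leonsarmiento/Hermes-4.3-36B-3bit-mlx")

# Recommended settings from above: temp 0.6, top-k 20, top-p 0.95;
# min-p disabled (0.0) and no repeat penalty applied.
sampler = make_sampler(temp=0.6, top_k=20, top_p=0.95, min_p=0.0)

response = generate(model, tokenizer, prompt="hello", sampler=sampler, verbose=True)
```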
## System prompt

"You are a deep thinking AI, you may use extremely long chains of thought to deeply consider the problem and deliberate with yourself via systematic reasoning processes to help come to a correct solution prior to answering. You should enclose your thoughts and internal monologue inside <think> </think> tags, and then provide your solution or response to the problem."
## Use with mlx

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("leonsarmiento/Hermes-4.3-36B-3bit-mlx")

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
```
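mlx-lm also ships command-line entry points; the exact invocation varies across versions, but in recent releases something like the following should work (verify flag names with `--help`):

```bash
mlx_lm.generate --model leonsarmiento/Hermes-4.3-36B-3bit-mlx --prompt "hello"
```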