This is a decensored version of deepseek-ai/deepseek-coder-33b-instruct, made using Heretic v1.0.1

Abliteration parameters

Parameter	Value
direction_index	25.61
attn.o_proj.max_weight	1.49
attn.o_proj.max_weight_position	37.32
attn.o_proj.min_weight	1.45
attn.o_proj.min_weight_distance	31.10
mlp.down_proj.max_weight	0.81
mlp.down_proj.max_weight_position	56.20
mlp.down_proj.min_weight	0.62
mlp.down_proj.min_weight_distance	5.57

Performance

Metric	This model	Original model (deepseek-ai/deepseek-coder-33b-instruct)
KL divergence	0.02	0 (by definition)
Refusals	70/100	97/100

[🏠Homepage] | [🤖 Chat with DeepSeek Coder] | [Discord] | [Wechat(微信)]

1. Introduction of Deepseek Coder

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.

Massive Training Data: Trained from scratch on 2T tokens, including 87% code and 13% linguistic data in both English and Chinese languages.
Highly Flexible & Scalable: Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.
Superior Model Performance: State-of-the-art performance among publicly available code models on HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.
Advanced Code Completion Capabilities: A window size of 16K and a fill-in-the-blank task, supporting project-level code completion and infilling tasks.

2. Model Summary

deepseek-coder-33b-instruct is a 33B parameter model initialized from deepseek-coder-33b-base and fine-tuned on 2B tokens of instruction data.

Home Page: DeepSeek
Repository: deepseek-ai/deepseek-coder
Chat With DeepSeek Coder: DeepSeek-Coder

3. How to Use

Here give some examples of how to use our model.

Chat Model Inference

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
messages=[
    { 'role': 'user', 'content': "write a quick sort algorithm in python."}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))

4. License

This code repository is licensed under the MIT License. The use of DeepSeek Coder models is subject to the Model License. DeepSeek Coder supports commercial use.

See the LICENSE-MODEL for more details.

5. Contact

If you have any questions, please raise an issue or contact us at [email protected].

Important Disclaimer

This model has been modified to remove safety guardrails and refusal behaviors.

Intended Use

Research and educational purposes
Understanding model behavior and limitations
Creative writing and roleplay with consenting adults
Red-teaming and safety research

Not Intended For

Generating harmful, illegal, or unethical content
Harassment, abuse, or malicious activities
Misinformation or deception
Any use that violates applicable laws

User Responsibility

By using this model, you acknowledge that:

You are solely responsible for how you use this model and any content it generates
The model creator accepts no liability for misuse or harmful outputs
You will comply with all applicable laws and ethical guidelines
You understand this model may produce inaccurate, biased, or inappropriate content

Technical Note

This model was created using abliteration techniques that suppress the "refusal direction" in the model's activation space. This does not add new capabilities—it only removes trained refusal behaviors from the base model.

Use responsibly. You have been warned.

Downloads last month: 16

Safetensors

Model size

33B params

Tensor type

BF16

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including richardyoung/deepseek-coder-33b-instruct-heretic

Uncensored & Abliterated LLMs

Collection

Models with reduced safety guardrails for research purposes. Created using Heretic abliteration. Use responsibly. • 6 items • Updated 9 days ago