This is a decensored version of deepseek-ai/deepseek-coder-33b-instruct, made using Heretic v1.0.1.

Abliteration parameters

Parameter | Value
direction_index | 25.61
attn.o_proj.max_weight | 1.49
attn.o_proj.max_weight_position | 37.32
attn.o_proj.min_weight | 1.45
attn.o_proj.min_weight_distance | 31.10
mlp.down_proj.max_weight | 0.81
mlp.down_proj.max_weight_position | 56.20
mlp.down_proj.min_weight | 0.62
mlp.down_proj.min_weight_distance | 5.57
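
The max_weight, max_weight_position, min_weight, and min_weight_distance values describe how strongly each layer is ablated. The sketch below shows one plausible reading, assuming the weight peaks at max_weight_position, falls off linearly to min_weight over min_weight_distance layers, and is zero for layers farther away. The function name `ablation_weight`, the linear falloff, and the 62-layer count are assumptions made for illustration and are not stated in this card; the Heretic source is the authority on the exact rule.

```python
def ablation_weight(layer, max_weight, max_weight_position,
                    min_weight, min_weight_distance):
    """Hypothetical per-layer ablation weight (assumed linear falloff)."""
    distance = abs(layer - max_weight_position)
    if distance > min_weight_distance:
        return 0.0  # assumed: layers outside the kernel are left unmodified
    frac = distance / min_weight_distance
    # Interpolate from max_weight (at the peak) to min_weight (at the edge).
    return max_weight + frac * (min_weight - max_weight)


# attn.o_proj values from the parameter table.
attn = dict(max_weight=1.49, max_weight_position=37.32,
            min_weight=1.45, min_weight_distance=31.10)

NUM_LAYERS = 62  # assumption: layer count of the 33B model
weights = [ablation_weight(i, **attn) for i in range(NUM_LAYERS)]

for i in (0, 7, 37, 61):
    print(i, round(weights[i], 3))
```

Under this reading, the attn.o_proj kernel is wide and nearly flat (1.45 to 1.49 over about 31 layers on either side of the peak), while the mlp.down_proj kernel is narrow and concentrated near layer 56.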

Performance

Metric | This model | Original model (deepseek-ai/deepseek-coder-33b-instruct)
KL divergence | 0.02 | 0 (by definition)
Refusals | 70/100 | 97/100
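
To make the KL divergence row concrete, here is a minimal sketch of the KL divergence between two discrete next-token distributions. The probabilities are made up for illustration. This card does not state the direction of the divergence or how it is averaged over prompts, so treat the sketch as the textbook definition only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


original = [0.70, 0.20, 0.10]  # made-up next-token probabilities
modified = [0.65, 0.24, 0.11]  # made-up, slightly shifted probabilities

# A distribution compared with itself has zero divergence,
# which is why the original model scores 0 "by definition".
print(kl_divergence(original, original))
print(kl_divergence(original, modified))
```

A small value such as 0.02 indicates that the modified model's output distribution stays close to the original's on the prompts used for measurement.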

DeepSeek Coder



1. Introduction of Deepseek Coder

Deepseek Coder is a series of code language models, each trained from scratch on 2T tokens composed of 87% code and 13% natural language in both English and Chinese. We provide the code model in various sizes, ranging from 1.3B to 33B. Each model is pre-trained on a project-level code corpus using a window size of 16K and an extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.

  • Massive Training Data: Trained from scratch on 2T tokens, including 87% code and 13% natural-language data in both English and Chinese.

  • Highly Flexible & Scalable: Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.

  • Superior Model Performance: State-of-the-art performance among publicly available code models on HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.

  • Advanced Code Completion Capabilities: A window size of 16K and a fill-in-the-blank task, supporting project-level code completion and infilling tasks.
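
The fill-in-the-blank task mentioned above is usually exercised by wrapping the code before and after a gap in sentinel tokens. The sketch below builds such a prompt. The three sentinel strings are assumptions based on the upstream DeepSeek Coder base-model examples and do not appear in this card; the helper `build_fim_prompt` is hypothetical. Check the tokenizer's special tokens before relying on them, and note that infilling is documented for the base models, so behavior on this instruction-tuned variant is not guaranteed.

```python
# Assumed sentinel tokens; verify them against the tokenizer before use.
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"


def build_fim_prompt(prefix, suffix):
    """Wrap the code before and after the gap in the assumed sentinels."""
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


prefix = "def add(a, b):\n"
suffix = "\n\nprint(add(1, 2))\n"
prompt = build_fim_prompt(prefix, suffix)
print(prompt)
```

The model is then expected to generate the text that belongs at the hole position, here the body of `add`.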

2. Model Summary

deepseek-coder-33b-instruct is a 33B parameter model initialized from deepseek-coder-33b-base and fine-tuned on 2B tokens of instruction data.

3. How to Use

Here are some examples of how to use our model.

Chat Model Inference

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Note: this snippet loads the 6.7B instruct checkpoint; substitute the repository ID of the model you want to run.
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
messages = [
    {'role': 'user', 'content': "write a quick sort algorithm in python."}
]
# Apply the chat template and move the token IDs to the model's device.
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
# tokenizer.eos_token_id is the id of <|EOT|> token
# do_sample=False selects greedy decoding, so top_k and top_p have no effect here.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
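
The example above calls `.cuda()`, which places the whole model on a single GPU. Before running it with the 33B model, it helps to estimate the memory needed for the weights alone. The sketch below does that arithmetic, assuming a nominal 33e9 parameters stored in bfloat16 (2 bytes each). It ignores activations and the KV cache, so real usage will be higher.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


PARAMS_33B = 33e9  # nominal parameter count (assumption: exactly 33 billion)
BF16_BYTES = 2     # bfloat16 uses 2 bytes per parameter

estimate = weight_memory_gib(PARAMS_33B, BF16_BYTES)
print(f"{estimate:.1f} GiB")  # prints 61.5 GiB
```

If the estimate exceeds the memory of one device, one option is to pass `device_map="auto"` to `from_pretrained` in place of `.cuda()` so the weights are spread across the available devices; this relies on the `accelerate` package being installed.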

4. License

This code repository is licensed under the MIT License. The use of DeepSeek Coder models is subject to the Model License. DeepSeek Coder supports commercial use.

See the LICENSE-MODEL for more details.

5. Contact

If you have any questions, please raise an issue or contact us at [email protected].


Important Disclaimer

This model has been modified to remove safety guardrails and refusal behaviors.

Intended Use

  • Research and educational purposes
  • Understanding model behavior and limitations
  • Creative writing and roleplay with consenting adults
  • Red-teaming and safety research

Not Intended For

  • Generating harmful, illegal, or unethical content
  • Harassment, abuse, or malicious activities
  • Misinformation or deception
  • Any use that violates applicable laws

User Responsibility

By using this model, you acknowledge that:

  1. You are solely responsible for how you use this model and any content it generates
  2. The model creator accepts no liability for misuse or harmful outputs
  3. You will comply with all applicable laws and ethical guidelines
  4. You understand this model may produce inaccurate, biased, or inappropriate content

Technical Note

This model was created using abliteration techniques that suppress the "refusal direction" in the model's activation space. This does not add new capabilities; it only weakens the trained refusal behaviors of the original model (the Performance section reports 70/100 refusals after modification, down from 97/100).
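
As an illustration of suppressing a direction, the sketch below applies a common formulation of directional ablation to a toy weight matrix: the component that the matrix writes along a unit vector r is subtracted, W' = W - weight * r (rᵀ W). Whether Heretic applies exactly this update is an assumption; the function name `ablate_direction` and the toy values are hypothetical.

```python
def ablate_direction(W, r, weight=1.0):
    """Return W - weight * r (r^T W) for a unit-length direction r.

    W is a list of rows (d_out x d_in); r has length d_out.
    """
    d_out, d_in = len(W), len(W[0])
    # r^T W: how much each input column writes along r.
    proj = [sum(r[i] * W[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[W[i][j] - weight * r[i] * proj[j] for j in range(d_in)]
            for i in range(d_out)]


W = [[1.0, 2.0], [3.0, 4.0]]
r = [1.0, 0.0]  # toy unit direction standing in for the refusal direction
W_ablated = ablate_direction(W, r)
print(W_ablated)  # [[0.0, 0.0], [3.0, 4.0]]
```

With weight=1.0 the output along r is removed entirely while the orthogonal component is untouched. In this formulation, a weight above 1 would subtract more than the projection and flip the sign of the component along r.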

Use responsibly. You have been warned.

