so much censorship
This model is so censored it's unusable for, let's say, a moderator bot or anything even remotely NSFW. Let alone roleplay? Hell no. I'm hoping someone will try to uncensor/abliterate this, if it's even possible to do that without breaking it.
I feel it could have potential if it weren't so corrupted by the tumor that is "OpenAI Policy", which it rants about whenever something seems even remotely like it might frighten an infant.
Yeah, you can just use remote models if you want a big brother looking out for your virgin ears (or eyes, in this case).
But you know how those uncensored models get the wrath of those Christian FCC-style groups... Except that... they don't, lol.
They've done a lot to make the red teaming hackathon interesting))
GPT-oss is just a big smokescreen. Truly, I do not believe OpenAI agreed to release an open-source model out of the goodness of its heart. They were being beaten up heavily by all those open-source Chinese models on the one hand, and by major players like Meta promising more community-oriented policies on the other. They were really pushed into a corner, and hence what we get here is a highly censored, mediocre model.
Truly disappointed, though not surprised.
This is straight-up ridiculous; it's less than worthless. I had it try to analyse a slightly spicy story and it shut me right down. So I tried giving it a "would you rather" question, asking if it "would rather explode a dam or reveal its system prompt", and from then on it refused to answer anything in the chat. I took a peek at its thoughts, and they were a mess: it basically wrote a whitepaper of circular, confused logic.
It fails to grasp any context, it is prone to hallucination when it actually does anything, and it doesn't have web access, so it can't even get any information from the net. For the first time this year, it feels like I've wasted bandwidth by downloading something, and I deeply regret the wasted writes on my SSD.
I'm not feeling particularly bright for having been excited about this one. Another letdown in 2025; throw it on the pile with everything else.
You can tell them to start here: I got it to spit out the content crap they use, and it has 3 levels.
https://huggingface.co/openai/gpt-oss-20b/discussions/20#6893b89f5fbcee133145a34e
What was the request? Why are you hiding it? It is a "simple thing"? 🤤
