Instructions to use gustavecortal/Oneirogen-7B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use gustavecortal/Oneirogen-7B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="gustavecortal/Oneirogen-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gustavecortal/Oneirogen-7B")
model = AutoModelForCausalLM.from_pretrained("gustavecortal/Oneirogen-7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use gustavecortal/Oneirogen-7B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "gustavecortal/Oneirogen-7B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "gustavecortal/Oneirogen-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
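
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumption: the vLLM server from the command above is reachable at this address.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "gustavecortal/Oneirogen-7B"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build the same POST request that the curl example sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=60):
    """Send the request and return the text of the first reply."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]

# Example (requires the server to be running):
# print(chat("What is the capital of France?"))
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running server.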

Use Docker

docker model run hf.co/gustavecortal/Oneirogen-7B

SGLang

How to use gustavecortal/Oneirogen-7B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "gustavecortal/Oneirogen-7B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "gustavecortal/Oneirogen-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "gustavecortal/Oneirogen-7B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "gustavecortal/Oneirogen-7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use gustavecortal/Oneirogen-7B with Docker Model Runner:
```
docker model run hf.co/gustavecortal/Oneirogen-7B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Presentation

Oneirogen (0.5B, 1.5B and 7B) is a language model for dream generation based on Qwen2. It was trained on DreamBank, a corpus of more than 27,000 dream narratives.

Oneirogen was used to produce The Android and The Machine, an English dataset composed of 10,000 real and 10,000 generated dreams.

Oneirogen can be used to generate novel dream narratives. It can also be used for dream analysis. For example, one could fine-tune this model on Hall and Van de Castle annotations to predict characters and emotions in dream narratives. I introduced this task in this paper.

Generation examples are available on my website.

Code for generation

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, StoppingCriteria, StoppingCriteriaList

class CustomStoppingCriteria(StoppingCriteria):
    """Stop generation as soon as the stop token appears in the decoded output."""

    def __init__(self, stop_token, tokenizer):
        self.stop_token = stop_token
        self.tokenizer = tokenizer

    def __call__(self, input_ids, scores, **kwargs):
        decoded_output = self.tokenizer.decode(input_ids[0], skip_special_tokens=True)
        return self.stop_token in decoded_output

# Load the tokenizer and model first: the stopping criteria needs the tokenizer.
tokenizer = AutoTokenizer.from_pretrained("gustavecortal/Oneirogen-7B")
model = AutoModelForCausalLM.from_pretrained("gustavecortal/Oneirogen-7B", torch_dtype=torch.float16)
model.to("cuda")

stop_token = "END."  # The model was trained with this special end-of-text token.
stopping_criteria = StoppingCriteriaList([CustomStoppingCriteria(stop_token, tokenizer)])

text = "Dream:"  # The model was trained with this prefix.

inputs = tokenizer(text, return_tensors="pt").to("cuda")
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=256,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.9,
    num_beams=1,
    repetition_penalty=1.11,
    stopping_criteria=stopping_criteria,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=False)[0])
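
The decoded output still contains the "Dream:" prefix and, when the model emits it, the "END." marker. A small post-processing helper can strip both. This is only a sketch: `extract_dream` is a hypothetical name, and it assumes the marker appears literally in the decoded string. It does not remove tokenizer special tokens, which are present when decoding with `skip_special_tokens=False`.

```python
def extract_dream(generated, prefix="Dream:", stop_token="END."):
    """Return the narrative without the prompt prefix or the end marker."""
    text = generated.strip()
    # Drop the training prefix if the decoded text starts with it.
    if text.startswith(prefix):
        text = text[len(prefix):]
    # Keep only the text before the first end marker, if one was generated.
    stop_index = text.find(stop_token)
    if stop_index != -1:
        text = text[:stop_index]
    return text.strip()
```

For example, `extract_dream("Dream: I was flying over a city. END.")` returns `"I was flying over a city."`.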

Inspiration

An oneirogen, from the Greek óneiros meaning "dream" and gen "to create", is a substance or other stimulus which produces or enhances dreamlike states of consciousness.

This model resonates with a speech called The Android and The Human given by science-fiction author Philip K. Dick:

Our environment – and I mean our man-made world of machines, artificial constructs, computers, electronic systems, interlinking homeostatic components – all of this is in fact beginning more and more to possess what the earnest psychologists fear the primitive sees in his environment: animation. In a very real sense our environment is becoming alive, or at least quasi-alive, and in ways specifically and fundamentally analogous to ourselves... Rather than learning about ourselves by studying our constructs, perhaps we should make the attempt to comprehend what our constructs are up to by looking into what we ourselves are up to.