Instructions for using HuggingFaceH4/zephyr-7b-alpha with libraries and local apps.

Libraries

How to use HuggingFaceH4/zephyr-7b-alpha with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-alpha")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-alpha")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens (everything after the prompt)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
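The slice in the final `print` call is there because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. The toy sketch below illustrates that slicing with plain Python lists; the token IDs are made up and `strip_prompt` is a hypothetical helper, not part of Transformers.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that come after the first `prompt_length` tokens."""
    return output_ids[prompt_length:]


# Made-up token IDs standing in for the templated prompt.
prompt_ids = [1, 523, 28766, 1838]
# Stand-in for generate() output: the prompt followed by three new tokens.
output_ids = prompt_ids + [315, 837, 2]

new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)  # [315, 837, 2]
```

With real tensors the same idea applies along the last dimension, which is why the snippet above uses `inputs["input_ids"].shape[-1]` as the prompt length.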

Local Apps

vLLM

How to use HuggingFaceH4/zephyr-7b-alpha with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "HuggingFaceH4/zephyr-7b-alpha"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "HuggingFaceH4/zephyr-7b-alpha",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
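The same request can also be sent from Python. The following is a minimal sketch using only the standard library; `build_chat_request` and `chat` are hypothetical helper names, and the response parsing assumes the usual OpenAI chat-completions shape (`choices[0].message.content`), which this page does not show.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, user_message, timeout=60):
    """Send the request and return the text of the first choice."""
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the server replies in the OpenAI chat-completions format.
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server, so it is left commented out):
# print(chat("http://localhost:8000", "HuggingFaceH4/zephyr-7b-alpha",
#            "What is the capital of France?"))
```

Pointing `base_url` at `http://localhost:30000` should work the same way for the SGLang server described further down, since both expose the same endpoint path.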

Use Docker

```
docker model run hf.co/HuggingFaceH4/zephyr-7b-alpha
```

SGLang

How to use HuggingFaceH4/zephyr-7b-alpha with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "HuggingFaceH4/zephyr-7b-alpha" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "HuggingFaceH4/zephyr-7b-alpha",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "HuggingFaceH4/zephyr-7b-alpha" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "HuggingFaceH4/zephyr-7b-alpha",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Docker Model Runner
How to use HuggingFaceH4/zephyr-7b-alpha with Docker Model Runner:
```
docker model run hf.co/HuggingFaceH4/zephyr-7b-alpha
```

This license is NonCommercial, can we not have Apache2.0 like Mistral was?

by artificialgenerations4gsdfg - opened Oct 11, 2023

Discussion

artificialgenerations4gsdfg

Oct 11, 2023

As a SWE I know the common licenses (Apache, MIT, GPL, etc.) and how I can or can't incorporate them, but I don't know why this is using a CC license, and unfortunately I cannot use this for anything. Reading through it, it's pretty harsh actually, and I would have guessed this finetune would be released in the same spirit as Mistral, i.e. Apache.

Any chance for Apache?

lewtun

Hugging Face H4 org Oct 11, 2023

We opted for the NC license in order to comply with the license from one of the source datasets (UltraChat https://huggingface.co/datasets/stingning/ultrachat). If the dataset owners are happy to change the license to a permissive one, then it would likely be fine to update the license of this model as well since UltraFeedback is MIT licensed.

clem

Oct 11, 2023

cc @cyl @stingning maybe from ultrachat?

cyl

Oct 12, 2023

Hi, thanks for your suggestion. We have changed the license to MIT license for UltraChat. @clem @lewtun

latent-variable

Oct 12, 2023

https://huggingface.co/datasets/stingning/ultrachat/discussions/3#65278c58bc018e940257de29
@lewtun UltraChat is now MIT licensed.
Best regards,
Lino

eek

Oct 12, 2023

That's great to hear, I guess now it's ok to change Zephyr to MIT? @lewtun ?

backendmagier

Oct 13, 2023

A license change for commercial use would be huge! I'd love to use the model!

lewtun

Hugging Face H4 org Oct 13, 2023

The Zephyr license is now MIT 🤗. Thank you very much @cyl @stingning for enabling this change!

lewtun changed discussion status to closed Oct 13, 2023

Aspie96

Oct 13, 2023

(I am not a lawyer and this is not legal advice).

The MIT license is great.

However, if models are copyright-worthy at all (which is an open question, and I think the answer is and should be "no"), it's likely that this model is a derivative of Mistral, while it's not necessarily the case that it's a derivative of training data (which would be a problem anyways, since that'd include Mistral's training data).

If the user is bound to the MIT license, they are likely bound to the Apache 2.0 license (from Mistral) too anyways, so I think the Apache 2.0 license would actually make more sense, for this repo, than the MIT license (even if some of the data is MIT-licensed).

alexweberk

Oct 14, 2023 (edited Oct 14, 2023)

Curious, since UltraChat is synthetic data generated by ChatGPT by OpenAI, wouldn't it violate 2(c)(iii) of their terms of use if this model was somehow used for production for commercial purposes?

(iii) use output from the Services to develop models that compete with OpenAI

(not a lawyer, just a naive dev asking. Love the model btw)

Aspie96

Oct 14, 2023

@alexweberk I'm not sure if it's even relevant unless you are a client of OpenAI. ChatGPT's TOS aren't a law. What right would they have to restrict this model, regardless of what they say in the TOS? Using an argument from copyright would require a copyright-maximalist take that would not benefit what OpenAI needs to do to train models.

alexweberk

Oct 16, 2023

Thanks @Aspie96 ! I would love to think that way, and you raise great points. Still curious what the implications are if you were to use this model for business use cases (I hope there are none).

Aspie96

Oct 16, 2023

I hope so too.

As I said, I am not a lawyer, and also there are multiple jurisdictions in the world. Regardless, I don't see how a mere user of this model, commercial or not, would be bound by OpenAI's TOS.

gopalsai

Nov 2, 2023

@lewtun thank you for the model! A quick question: since UltraFeedback uses commercial models (including Bard) and the Llama 2 series (which have their own license), I am still a bit confused about how the model (or the dataset, for that matter) can be issued under MIT. Thanks in advance!
Reference: https://github.com/OpenBMB/UltraFeedback#dataset-construction
