Video Generation Endpoint API (Universal Handler)

This repository is configured for deployment as a Hugging Face Inference Endpoint using a Universal Custom Handler. It supports both Text-to-Video and Image-to-Video workflows and returns results as GIF, WebM, or raw frames (ZIP).


Endpoint URL

After deployment, your endpoint URL will look like:

https://<your-endpoint>.aws.endpoints.huggingface.cloud

Authentication

All requests require a Hugging Face token with permission to call the endpoint.

Authorization: Bearer YOUR_HF_TOKEN

Request Format

Requests must be wrapped in a top-level inputs object.
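
As a rough illustration of the wrapping and the Authorization header, the sketch below builds (but does not send) a POST request with Python's standard library. The function name `build_request`, the placeholder URL, and the placeholder token are assumptions for illustration and are not part of this repository.

```python
import json
import urllib.request


def build_request(endpoint_url: str, token: str, inputs: dict) -> urllib.request.Request:
    """Wrap `inputs` in the top-level "inputs" object and attach auth headers."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own endpoint URL and token.
req = build_request(
    "https://example.invalid",
    "YOUR_HF_TOKEN",
    {"prompt": "cinematic drone shot of a futuristic city", "num_frames": 32, "outputs": ["gif"]},
)
# To send it and parse the reply:
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```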

1. Text-to-Video (T2V)

{
  "inputs": {
    "prompt": "cinematic drone shot of a futuristic city",
    "num_frames": 32,
    "outputs": ["gif"]
  }
}

2. Image-to-Video (I2V)

To animate a static image, provide a base64-encoded string in the image field.

{
  "inputs": {
    "prompt": "slow zoom in, volumetric fog",
    "image": "BASE64_STRING_HERE",
    "num_frames": 32,
    "outputs": ["gif"]
  }
}
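
A minimal Python sketch of preparing the image field is shown below. It assumes the handler expects plain base64 with no `data:` URI prefix, which the examples here suggest but do not state outright. The helper name `build_i2v_inputs` and the stand-in bytes are hypothetical.

```python
import base64
import json


def build_i2v_inputs(image_bytes: bytes, prompt: str, num_frames: int = 32) -> dict:
    """Return an `inputs` dict whose `image` field is a base64 string."""
    # b64encode emits a single line with no newlines, so it is safe inside JSON.
    image_b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "prompt": prompt,
        "image": image_b64,
        "num_frames": num_frames,
        "outputs": ["gif"],
    }


# Stand-in bytes; in practice read them from disk,
# e.g. open("my_input.jpg", "rb").read()
inputs = build_i2v_inputs(b"\xff\xd8\xff\xe0 stand-in", "slow zoom in, volumetric fog")
payload = json.dumps({"inputs": inputs})
```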

API Parameters

Field | Type | Default | Description
prompt | string | required | Description of the video or motion.
image | string | null | (New) Base64-encoded input image for I2V.
negative_prompt | string | "" | Elements to avoid in the generation.
num_frames | int | 32 | Total frames to generate.
fps | int | 12 | Playback frame rate.
height | int | 512 | Video height (must be divisible by 32).
width | int | 512 | Video width (must be divisible by 32).
seed | int | null | Random seed for reproducibility.
num_inference_steps | int | 30 | Higher = better quality, slower generation.
guidance_scale | float | 7.5 | How strictly to follow the prompt.
outputs | array | ["gif"] | Output formats: ["gif", "webm", "zip"].
return_base64 | bool | true | Returns file content as base64 string.
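
Some of these constraints can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not the handler's own validation: `check_inputs` is a hypothetical helper, and the duration it returns assumes the top-level `fps` is the rate actually used for playback. With the defaults (32 frames at 12 fps) that works out to roughly 2.67 seconds.

```python
ALLOWED_OUTPUTS = {"gif", "webm", "zip"}


def check_inputs(inputs: dict) -> float:
    """Validate a few documented constraints; return clip duration in seconds."""
    if not inputs.get("prompt"):
        raise ValueError("prompt is required")
    for key in ("height", "width"):
        value = inputs.get(key, 512)  # documented default
        if value % 32 != 0:
            raise ValueError(f"{key}={value} is not divisible by 32")
    unknown = set(inputs.get("outputs", ["gif"])) - ALLOWED_OUTPUTS
    if unknown:
        raise ValueError(f"unsupported output formats: {sorted(unknown)}")
    # Duration follows from the frame count and the playback rate.
    return inputs.get("num_frames", 32) / inputs.get("fps", 12)


duration = check_inputs({"prompt": "a cyberpunk street in rain", "num_frames": 48, "fps": 24})
print(duration)  # 2.0
```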

Output Configuration

You can customize specific output formats by adding a matching key to inputs.

"inputs": {
  "outputs": ["webm"],
  "webm": { 
    "quality": "best", 
    "fps": 24 
  }
}

Supported per-format options:

  • GIF: { "fps": int }
  • WebM: { "fps": int, "quality": "fast"|"good"|"best" }
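
The per-format keys above could be assembled programmatically along these lines. This is only a sketch: `with_webm_options` is a hypothetical helper, and the quality check simply mirrors the three values listed here rather than anything the handler is confirmed to enforce.

```python
from typing import Optional

WEBM_QUALITIES = ("fast", "good", "best")


def with_webm_options(inputs: dict, quality: Optional[str] = None,
                      fps: Optional[int] = None) -> dict:
    """Return a copy of `inputs` that requests WebM with per-format options."""
    if quality is not None and quality not in WEBM_QUALITIES:
        raise ValueError(f"quality must be one of {WEBM_QUALITIES}")
    options = {}
    if quality is not None:
        options["quality"] = quality
    if fps is not None:
        options["fps"] = fps
    result = dict(inputs)  # shallow copy so the caller's dict is untouched
    result["outputs"] = ["webm"]
    if options:
        result["webm"] = options  # key name matches the output format
    return result


inputs = with_webm_options({"prompt": "slow pan over a mars landscape"}, quality="best", fps=24)
```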

Response Format

Success response:

{
  "ok": true,
  "outputs": {
    "gif_base64": "R0lGODlh...", 
    "webm_base64": "..."
  },
  "diagnostics": {
    "timing_ms": { ... },
    "mode": "i2v"  // or "t2v"
  }
}
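
On the client side, a response of this shape can be unpacked into files roughly as follows. The sketch assumes every returned file appears under a key ending in `_base64` (only `gif_base64` and `webm_base64` are shown above) and that a failed call sets `ok` to something falsy; the error format is not documented here. `save_outputs` and the fabricated response are for illustration only.

```python
import base64
import tempfile
from pathlib import Path


def save_outputs(response: dict, out_dir: Path) -> list:
    """Write every `<format>_base64` entry in `outputs` to `out_dir`."""
    if not response.get("ok"):
        raise RuntimeError(f"endpoint reported failure: {response}")
    written = []
    for key, value in response.get("outputs", {}).items():
        if not key.endswith("_base64"):
            continue
        extension = key[: -len("_base64")]  # "gif_base64" -> "gif"
        path = out_dir / f"output.{extension}"
        path.write_bytes(base64.b64decode(value))
        written.append(path)
    return written


# Fabricated response for demonstration only; a real one comes from the endpoint.
fake = {"ok": True, "outputs": {"gif_base64": base64.b64encode(b"GIF89a").decode("ascii")}}
out_dir = Path(tempfile.mkdtemp())
paths = save_outputs(fake, out_dir)
```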

Usage Examples (curl)

1. Simple Text-to-Video (GIF)

curl -sS -X POST "https://<ENDPOINT>.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "a cyberpunk street in rain, neon lights",
      "num_frames": 24,
      "outputs": ["gif"]
    }
  }' \
| jq -er '.outputs.gif_base64' | base64 --decode > output.gif

2. Image-to-Video (GIF)

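# macOS/Linux: convert the image to base64 and strip newlines
# (GNU base64 wraps its output at 76 columns, which would break the JSON string)
IMG_B64=$(base64 < my_input.jpg | tr -d '\n')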

curl -sS -X POST "https://<ENDPOINT>.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d "{
    \"inputs\": {
      \"prompt\": \"waves crashing on the shore, moving water\",
      \"image\": \"$IMG_B64\",
      \"num_frames\": 32,
      \"outputs\": [\"gif\"]
    }
  }" \
| jq -er '.outputs.gif_base64' | base64 --decode > animated.gif

3. High-Quality Video (WebM)

curl -sS -X POST "https://<ENDPOINT>.aws.endpoints.huggingface.cloud" \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": {
      "prompt": "slow pan over a mars landscape",
      "num_frames": 48,
      "fps": 24,
      "outputs": ["webm"],
      "webm": { "quality": "best" }
    }
  }' \
| jq -er '.outputs.webm_base64' | base64 --decode > output.webm