OpenVLA - MCX Card Task

Fine-tuned OpenVLA (7B) for the MCX card pick-and-place task in Isaac Sim using a Franka Panda robot.

Training Details

Parameter                 Value
Base model                openvla/openvla-7b
Learning rate             2e-5
Batch size                8
Epochs                    8 (checkpoint at epoch 6)
Optimizer                 AdamW (weight_decay=0.01)
Scheduler                 Cosine with warmup (5% of total steps)
Gradient clipping         1.0
Precision                 bfloat16
Gradient checkpointing    Enabled
Hardware                  1x NVIDIA A100 80GB
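
To make the scheduler row concrete, here is a minimal sketch of a cosine schedule with a 5% warmup at the listed base learning rate of 2e-5. It assumes a linear warmup from zero and a decay to zero at the final step, which is the usual behavior of helpers such as get_cosine_schedule_with_warmup in transformers; the actual training script may differ. The function name lr_at_step and the step counts in the usage note are illustrative only.

```python
import math


def lr_at_step(step, total_steps, base_lr=2e-5, warmup_frac=0.05):
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    Assumption: warmup starts at 0 and the decay ends at 0.
    """
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical run of 1000 optimizer steps, the rate would climb during the first 50 steps, peak at 2e-5 at step 50, and then decay smoothly toward zero by step 1000.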

Dataset

  • Source: tshiamor/mcx-card-openvla
  • Task: MCX card pick-and-place manipulation
  • Language instruction: "Pick up the blue block and place it on the target"
  • Action dimensions: 7 (end-effector control)
  • Format: Episode-based with per-step language instructions
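
As an illustration of the 7-dimensional action and the per-step instruction format, the sketch below splits an action vector into named parts and pairs it with its instruction. The layout (three position deltas, three rotation deltas, one gripper command) follows the common OpenVLA convention and is an assumption here; the dataset itself should be checked for the real ordering. The names EndEffectorAction, Step, and split_action are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class EndEffectorAction:
    delta_position: List[float]  # assumed order: dx, dy, dz
    delta_rotation: List[float]  # assumed order: droll, dpitch, dyaw
    gripper: float               # assumed: gripper open/close command


@dataclass
class Step:
    instruction: str             # language instruction stored with every step
    action: EndEffectorAction


def split_action(action: Sequence[float]) -> EndEffectorAction:
    """Split a flat 7-value action into position, rotation, and gripper parts."""
    values = [float(v) for v in action]
    if len(values) != 7:
        raise ValueError(f"expected 7 action values, got {len(values)}")
    return EndEffectorAction(values[0:3], values[3:6], values[6])


def make_step(instruction: str, action: Sequence[float]) -> Step:
    """Build one episode step from an instruction and a raw action vector."""
    return Step(instruction=instruction, action=split_action(action))
```

An episode would then be a list of Step objects, each carrying the same instruction string alongside its own action.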

Usage

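from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor
import torch

# Load the fine-tuned checkpoint in bfloat16. trust_remote_code is required
# because the model class is shipped with the checkpoint.
model = AutoModelForVision2Seq.from_pretrained(
    "tshiamor/openvla-mcx-card",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained("tshiamor/openvla-mcx-card", trust_remote_code=True)

# Placeholder input: replace this path with a camera frame from your own setup.
image = Image.open("camera_frame.png").convert("RGB")

# Predict a 7-dimensional end-effector action for the current frame.
action = model.predict_action(
    image,
    instruction="Pick up the blue block and place it on the target",
    processor=processor,
)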

Dependencies

  • transformers >=4.40.0, <4.50.0
  • torch==2.5.1 (CUDA 12.1)
  • timm >=0.9.10, <1.0.0
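
Because the version ranges above are fairly narrow, it can help to check an environment before loading the model. The following is a rough sketch using only the standard library; the helper names (parse_version, in_range, check_environment) are hypothetical, and the parser is deliberately simple: it ignores local suffixes such as +cu121 and pre-release tags.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# (inclusive lower bound, exclusive upper bound), taken from the list above.
REQUIREMENTS = {
    "transformers": ((4, 40, 0), (4, 50, 0)),  # >=4.40.0, <4.50.0
    "torch": ((2, 5, 1), (2, 5, 2)),           # ==2.5.1
    "timm": ((0, 9, 10), (1, 0, 0)),           # >=0.9.10, <1.0.0
}


def parse_version(text):
    """Turn a string like '2.5.1+cu121' into the tuple (2, 5, 1)."""
    release = text.split("+")[0]
    numbers = []
    for piece in release.split(".")[:3]:
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 3:
        numbers.append(0)  # pad short versions such as '4.40'
    return tuple(numbers)


def in_range(installed, lower, upper):
    """True if lower <= installed < upper."""
    return lower <= parse_version(installed) < upper


def check_environment():
    """Return a list of human-readable problems; empty if all ranges match."""
    problems = []
    for name, (lower, upper) in REQUIREMENTS.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if not in_range(installed, lower, upper):
            problems.append(f"{name}: found {installed}, outside the listed range")
    return problems
```

Calling check_environment() returns an empty list when every package falls inside its range, and otherwise one message per mismatch.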