StarVLA GR00T
This page focuses on the GR00T pipeline and documents FGManip training/evaluation examples based on the StarVLA codebase. Official GR00T resource: NVIDIA Isaac GR00T.
Integration status: StarVLA/GR00T currently runs as a two-process integration (policy server + FGManip evaluator). This differs from single-process runners used by some other policy pages.
1. Evaluation (StarVLA + FGManip)
The current setup runs in two separate conda environments (one for the policy server, one for the FGManip evaluator) because the project is not yet packaged as a single library.
1.1 Required Script Changes
Update both scripts, run_eval.sh and run_policy_server.sh, before execution:
export Python=/export/anaconda3/envs/maniskill/bin/python
export FGManip_HOME=/export/xuhy/zpy/FGManip
CKPT_PATH=<path to VLA checkpoint>
Note: Qwen-GR00T is trained with end-effector (EE) delta actions, so control_mode should be EE and the action dimension is 7.
Other models may require a different control mode and action dimension.
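Before launching either script, it can help to confirm that the three settings above (Python, FGManip_HOME, CKPT_PATH) point at things that exist. The following is a minimal sketch; check_eval_env is a hypothetical helper, not part of StarVLA or FGManip, and it assumes the variables are set in the same shell that calls it.

```shell
#!/usr/bin/env bash
# check_eval_env: hypothetical helper, not part of StarVLA or FGManip.
# Returns non-zero and prints a message for each setting that looks wrong.
check_eval_env() {
    local status=0

    # Python must point at an executable interpreter.
    if [ -z "${Python:-}" ] || [ ! -x "${Python}" ]; then
        echo "Python is unset or not executable: '${Python:-}'" >&2
        status=1
    fi

    # FGManip_HOME must be an existing directory.
    if [ -z "${FGManip_HOME:-}" ] || [ ! -d "${FGManip_HOME}" ]; then
        echo "FGManip_HOME is unset or not a directory: '${FGManip_HOME:-}'" >&2
        status=1
    fi

    # CKPT_PATH must exist (file or directory, depending on the checkpoint).
    if [ -z "${CKPT_PATH:-}" ] || [ ! -e "${CKPT_PATH}" ]; then
        echo "CKPT_PATH is unset or does not exist: '${CKPT_PATH:-}'" >&2
        status=1
    fi

    return "$status"
}
```

One way to use it is to call `check_eval_env || exit 1` right after the variable assignments in each script, so a wrong path fails fast instead of surfacing later as a model-loading error.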
1.2 Run Evaluation
Use two terminals/environments:
Environment A (policy server, StarVLA model):
conda activate starVLA
cd core/vla/eval
bash run_policy_server.sh
Environment B (FGManip env runner):
conda activate maniskill
cd core/vla/eval
bash starVLA/examples/FGManip/eval_files/run_eval.sh
2. Training (StarVLA codebase)
conda activate starVLA
Prepare data and config:
Place datasets (e.g. lerobot_xx) at FGManip/core/policies/starvla/starVLA/playground/Dataset/.
Modify FGManip mixture in core/policies/vla/starvla/starVLA/dataloader/gr00t_lerobot/mixtures.py.
StarVLA is not fully compatible with LeRobot 3.0 dataset layout. Convert to 2.1 format using lerobot_v30_to_v21, and add modality.json under meta.
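A quick pre-flight check on a converted dataset can catch the missing-metadata case before training starts. The sketch below is illustrative only: check_dataset_meta is a hypothetical helper, and the version test assumes that meta/info.json records the layout in a "codebase_version" field, which should be verified against the actual converted files.

```shell
#!/usr/bin/env bash
# check_dataset_meta: hypothetical helper, not part of StarVLA or LeRobot.
# Usage: check_dataset_meta <dataset_dir>
check_dataset_meta() {
    local dataset_dir="$1"
    local meta_dir="${dataset_dir}/meta"

    # modality.json must be present under meta, as required above.
    if [ ! -f "${meta_dir}/modality.json" ]; then
        echo "missing ${meta_dir}/modality.json" >&2
        return 1
    fi

    # Assumption: info.json, when present, carries "codebase_version": "v2.1"
    # after conversion. Skip the check if the file is absent.
    if [ -f "${meta_dir}/info.json" ] &&
        ! grep -q '"codebase_version"[[:space:]]*:[[:space:]]*"v2\.1"' "${meta_dir}/info.json"; then
        echo "${meta_dir}/info.json does not report codebase_version v2.1" >&2
        return 1
    fi

    return 0
}
```

It could be run over every dataset directory placed under the Dataset folder, for example in a `for d in .../Dataset/*/; do check_dataset_meta "$d" || exit 1; done` loop before the training script.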
bash run_libero_train.sh
3. Maintenance Checklist
Model Variant: GR00T variant name, checkpoint source, expected control mode.
Runtime Topology: single-process vs dual-environment, port mapping, server/client launch order.
Dataset Versioning: raw format, converted format, conversion script/version, required metadata files.
Task Mapping: FGManip env IDs, object/part naming conventions, prompt templates.
Failure Notes: common mismatch symptoms (shape mismatch, wrong control mode, missing modality fields) and fixes.
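The "server/client launch order" item in the checklist can be made concrete with a readiness check that the evaluator runs before connecting. This is a rough sketch under stated assumptions: wait_for_port is a hypothetical helper, the host and port are placeholders supplied by the caller (this page does not specify the server's port), and the probe relies on bash's /dev/tcp pseudo-device. An open port only shows that something is listening, not that the model has finished loading.

```shell
#!/usr/bin/env bash
# wait_for_port: hypothetical helper, not part of StarVLA or FGManip.
# Usage: wait_for_port <host> <port> [timeout_seconds]
# Returns 0 once a TCP connection succeeds, 1 if the timeout expires first.
wait_for_port() {
    local host="$1" port="$2" timeout_s="${3:-60}"
    local waited=0

    while [ "$waited" -lt "$timeout_s" ]; do
        # bash opens a TCP connection when redirecting to /dev/tcp/<host>/<port>;
        # the child shell exits non-zero if the connection is refused.
        if bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done

    echo "timed out after ${timeout_s}s waiting for ${host}:${port}" >&2
    return 1
}
```

In Environment B, a call such as `wait_for_port "$POLICY_HOST" "$POLICY_PORT" 120 || exit 1` placed before the evaluation command would enforce the order, where POLICY_HOST and POLICY_PORT are illustrative variable names for whatever address the policy server is configured to use.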