Policies overview

MetaFine vendors seven VLA backbones under core/policies/<name>, each a self-contained subpackage that is installed separately. Two training paths are verified (LeRobot and StarVLA), as is π0.5 closed-loop inference in the simulator. Every backbone ships its own evaluate.py and a README.md that documents its exact flags.

Vendored backbones

Backbone	Package path	Status	Page
π0.5	`core/policies/pi05`	inference verified	π0 / π0.5
StarVLA	`core/policies/starvla`	training verified	StarVLA
π0	`core/policies/pi0`	vendored	π0 / π0.5
ACT	`core/policies/act`	vendored	ACT
DP3	`core/policies/diffusion_policy_3d`	vendored	DP3
OpenVLA	`core/policies/openvla`	vendored	OpenVLA / OFT
OpenVLA-OFT	`core/policies/openvla-oft`	vendored	OpenVLA / OFT

"vendored" means the upstream code ships in-tree and installs, but its train/eval path has not yet been validated end-to-end inside MetaFine. Treat those backbones as starting points, not turnkey solutions.
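
Because every backbone is described as shipping its own evaluate.py and README.md, a quick layout check can confirm that a checkout is complete before any install. The following is a minimal sketch: `check_backbones` is a hypothetical helper, not part of MetaFine, and it only tests that the two files exist.

```shell
# Sketch: report which backbone subpackages ship both evaluate.py and README.md.
# check_backbones is a hypothetical helper, not a MetaFine command.
check_backbones() {
    root="${1:-core/policies}"
    for dir in "$root"/*/; do
        # Skip the literal pattern when the glob matches nothing.
        [ -d "$dir" ] || continue
        name="$(basename "$dir")"
        if [ -f "$dir/evaluate.py" ] && [ -f "$dir/README.md" ]; then
            echo "$name ok"
        else
            echo "$name incomplete"
        fi
    done
}
```

Run it from the repository root as `check_backbones core/policies`; it prints one line per subdirectory.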

# Base MetaFine
$ pip install -e .

# A specific policy + its native deps (conflicting pins → install in isolation)
$ pip install -e core/policies/pi05
$ pip install -e core/policies/openvla-oft
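
One way to install policies "in isolation" is a separate virtual environment per backbone, so that conflicting pins never share a site-packages directory. The sketch below assumes that interpretation; `install_policy`, the `.venv-<name>` naming, and the `DRY_RUN` switch are all hypothetical and not MetaFine conventions.

```shell
# Sketch: one virtual environment per backbone (an assumed reading of
# "install in isolation"). install_policy and .venv-<name> are hypothetical.
install_policy() {
    name="$1"
    venv=".venv-$name"
    # With DRY_RUN=1 each command is printed instead of executed.
    run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }
    run python -m venv "$venv"
    # Base MetaFine first, then the policy and its native deps.
    run "$venv/bin/pip" install -e .
    run "$venv/bin/pip" install -e "core/policies/$name"
}
```

For example, `DRY_RUN=1 install_policy pi05` prints the three commands without touching the filesystem, which is a cheap way to review them before running `install_policy pi05` for real.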

Training

Training runs inside the policy's own framework — there is no universal trainer. Two verified paths consume a LeRobot dataset:

LeRobot: feed the convert_to_lerobot output directory to the standard LeRobot training pipeline.
StarVLA: place the dataset under core/policies/starvla/starVLA/playground/Dataset/ and register it in dataloader/gr00t_lerobot/mixtures.py. StarVLA expects LeRobot 2.1, so a 3.0 dataset must first be converted with lerobot_v30_to_v21 and given a modality.json. Then run bash run_libero_train.sh. See core/policies/starvla/train/README.md.

Evaluation / inference

Each backbone has its own evaluate.py (standalone argparse) — there is no shared --task-graph adapter. Flags are documented in core/policies/<name>/README.md. The π0.5 example:

$ python core/policies/pi05/evaluate.py \
    --policy-path /path/to/pretrained_model \
    --env-id grasp_part \
    --object-name 100221 \
    --part-name cap \
    --obs-mode rgb \
    --control-mode pd_joint_delta_pos \
    --n-episodes 50 \
    --device cuda \
    --task "Grasp the cap of the bottle." \
    --record-dir eval_out --save-video
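
When the same evaluation is repeated for different objects or parts, the example can be wrapped so that only the changing values are passed in. This is a sketch under stated assumptions: `pi05_eval` and `DRY_RUN` are hypothetical, and the flags are copied from the π0.5 example above without checking them against the README.

```shell
# Sketch: wrap the π0.5 example so only the varying values are arguments.
# pi05_eval is a hypothetical helper; flags are taken from the example above.
pi05_eval() {
    policy_path="$1"; object_name="$2"; part_name="$3"; task="$4"
    # Build the argument list in the positional parameters (POSIX-safe quoting).
    set -- python core/policies/pi05/evaluate.py \
        --policy-path "$policy_path" \
        --env-id grasp_part \
        --object-name "$object_name" \
        --part-name "$part_name" \
        --obs-mode rgb \
        --control-mode pd_joint_delta_pos \
        --n-episodes 50 \
        --device cuda \
        --task "$task" \
        --record-dir eval_out --save-video
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print one argument per line so embedded spaces stay unambiguous.
        printf '%s\n' "$@"
    else
        "$@"
    fi
}
```

A call such as `DRY_RUN=1 pi05_eval /path/to/pretrained_model 100221 cap "Grasp the cap of the bottle."` lists the arguments that would be passed to evaluate.py; dropping `DRY_RUN=1` runs the evaluation.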

For the staged diagnostic (semantic-intervention / object-swap — the protocol behind the Understanding axis), each backbone provides a wrapper, e.g. bash core/policies/pi05/run_eval_three_stage.sh.