Architecture

MetaFine sits on a three-layer pipeline: atomic skills compose into task graphs, task graphs drive recording and rollout, and rollouts feed a three-dimension diagnostic. Every concept maps onto something concrete in the source tree.

The pipeline

A MetaFine evaluation has the same shape every time: atomic-skill primitives are composed into a multi-step task graph; the task graph drives a recording or rollout; the rollout's trajectory and per-stage outcomes are scored along three orthogonal axes.

Layer                       Source                                 Contents
Atomic skills               core/skill.py                          21 typed primitives; @register_skill
Compositional task graph    configs/*.yaml · core/predicates.py    Stages + success predicates; YAML or Python DSL
Diagnostic evaluation       utils/eval_*.py                        Understanding / Perception / Behavior; one results.json per run

Layer 1 — Atomic skills

A skill is a motion-planning solver that achieves one well-defined interaction with one well-defined part of an articulated asset — grasp this handle, rotate this knob 90°, slide this drawer 5 cm. The 21 atomic skills MetaFine ships fall into three phases:

  • Interaction — engages the object (transitions contact state). E.g. grasp_part, press, flip_switch, lift_lid.
  • Continuation — operates on an already-engaged part; safe to chain after an interaction. E.g. pure_rotate, pure_slide, pure_insert, release_gripper.
  • Bundle — pre-composed multi-step routines kept atomic for now (until we split them). E.g. lid_opening, stand_up, toggle_switch.

Every skill declares the affordances it requires of its target part — grasp_part requires graspable, pure_rotate requires rotatable, etc. See Affordances for the closed-set vocabulary.
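As an illustration of how a skill might declare its phase and required affordances, here is a minimal sketch of a decorator-based registry. Only the names register_skill, SKILL_REGISTRY, grasp_part, pure_rotate, graspable and rotatable come from this page; the SkillSpec fields, the decorator's signature and the solver bodies are assumptions made for the example, not the actual contents of core/skill_registry.py.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class SkillSpec:
    """Hypothetical record kept for each registered skill."""
    name: str
    phase: str                 # "interaction", "continuation", or "bundle"
    requires: FrozenSet[str]   # affordances the target part must offer
    solver: Callable[..., object]


# Assumed shape: a plain dict keyed by skill name.
SKILL_REGISTRY: Dict[str, SkillSpec] = {}


def register_skill(name: str, phase: str, requires: Iterable[str] = ()):
    """Decorator that records a solver together with its declared affordances."""
    def decorator(fn):
        if name in SKILL_REGISTRY:
            raise ValueError(f"skill {name!r} is already registered")
        SKILL_REGISTRY[name] = SkillSpec(name, phase, frozenset(requires), fn)
        return fn
    return decorator


# Placeholder solvers: a real solver would return a motion plan.
@register_skill("grasp_part", phase="interaction", requires=["graspable"])
def grasp_part(target):
    return f"plan: grasp {target}"


@register_skill("pure_rotate", phase="continuation", requires=["rotatable"])
def pure_rotate(target):
    return f"plan: rotate {target}"
```

Keeping the affordance set on the spec, rather than inside the solver, is what would let a lookup reject an incompatible target before any planning starts.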

Layer 2 — Compositional task graphs

A task graph is a YAML chain of skill calls with optional per-stage success predicates. Multi-step tasks ("grasp the cap, then lift it 5 cm") become 20-line YAML files rather than new env classes:

name: grasp_and_lift_cap
stages:
  - skill: grasp_part
    target: { object: 100221, part: cap }
    success: 'grasped("cap")'
  - skill: pure_lift
    # Quoted because a plain YAML scalar cannot contain ": ".
    success: 'and(grasped("cap"), lifted("cap", height_m: 0.05))'

Each predicate compiles to a callable that is evaluated at every step, so per-stage success rates come at no extra cost. The predicate DSL (and / or / not plus six atomic predicates) is documented under Predicate DSL.
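To make the idea of compiled predicates concrete, the sketch below builds the second stage's success condition from plain Python closures and then derives per-stage success rates from recorded outcomes. It is an assumption-laden illustration, not the compiler in core/predicates.py: the state dictionary layout, the trailing underscores on and_ / or_ / not_ (needed because and, or and not are Python keywords) and the stage_success_rates helper are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

State = Dict[str, object]            # hypothetical per-step snapshot
Predicate = Callable[[State], bool]


def grasped(part: str) -> Predicate:
    """True when the part is in the (assumed) set of currently grasped parts."""
    return lambda s: part in s.get("grasped_parts", ())


def lifted(part: str, height_m: float) -> Predicate:
    """True when the part has risen at least height_m metres (assumed field)."""
    return lambda s: s.get("lift_m", {}).get(part, 0.0) >= height_m


def and_(*preds: Predicate) -> Predicate:
    return lambda s: all(p(s) for p in preds)


def or_(*preds: Predicate) -> Predicate:
    return lambda s: any(p(s) for p in preds)


def not_(pred: Predicate) -> Predicate:
    return lambda s: not pred(s)


def stage_success_rates(episodes: Sequence[Sequence[bool]]) -> List[float]:
    """Fraction of episodes in which each stage's predicate was satisfied."""
    n_stages = len(episodes[0])
    return [sum(ep[i] for ep in episodes) / len(episodes) for i in range(n_stages)]


# Second stage of grasp_and_lift_cap, written with the combinators above.
stage2 = and_(grasped("cap"), lifted("cap", height_m=0.05))
state = {"grasped_parts": {"cap"}, "lift_m": {"cap": 0.06}}
```

Calling stage2(state) evaluates the whole expression against one snapshot. Because every stage exposes the same callable interface, a runner can record one boolean per stage per episode and aggregate them without any task-specific code.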

Layer 3 — Diagnostic evaluation

A rollout yields three orthogonal signals, one per diagnostic axis:

  • Understanding — per-stage success rates over the task graph; surfaces where the chain breaks.
  • Perception — domain-randomisation sweeps (lighting, view, jitter) with AUSC normalisation; surfaces robustness to visual variation.
  • Behavior — trajectory smoothness (jerk RMS, velocity variance, path length); surfaces jerky / hesitant / chunk-artefact policies.

The three scores are emitted into a single results.json per run, so two policies can be compared across the full diagnostic plane.
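The Behavior metrics named above can be approximated with finite differences over a sampled trajectory. The sketch below is one plausible reading, assuming positions sampled at a fixed time step dt; the function name smoothness, the returned keys and the exact formulas are assumptions, and the real compute_smoothness in utils/eval_metrics.py may define them differently.

```python
import math
from typing import Dict, List, Sequence


def _diff(seq: Sequence[Sequence[float]], dt: float) -> List[List[float]]:
    """Forward finite difference of a sequence of vectors."""
    return [[(b - a) / dt for a, b in zip(p, q)] for p, q in zip(seq, seq[1:])]


def smoothness(positions: Sequence[Sequence[float]], dt: float) -> Dict[str, float]:
    """Path length, speed variance and jerk RMS of a sampled trajectory."""
    if len(positions) < 4:
        raise ValueError("need at least 4 samples to estimate jerk")
    vel = _diff(positions, dt)     # first derivative
    acc = _diff(vel, dt)           # second derivative
    jerk = _diff(acc, dt)          # third derivative
    speed = [math.hypot(*v) for v in vel]
    mean_speed = sum(speed) / len(speed)
    return {
        # Sum of segment lengths (speed times step duration).
        "path_length": sum(s * dt for s in speed),
        # Population variance of the speed magnitude.
        "velocity_variance": sum((s - mean_speed) ** 2 for s in speed) / len(speed),
        # Root mean square of the jerk magnitude.
        "jerk_rms": math.sqrt(sum(math.hypot(*j) ** 2 for j in jerk) / len(jerk)),
    }


# A constant-velocity line versus a stop-and-go trajectory over the same axis.
steady = smoothness([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)], dt=1.0)
jerky = smoothness([(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (3.0, 0.0), (3.0, 0.0)], dt=1.0)
```

Under these definitions the constant-velocity trajectory has zero jerk and zero speed variance, while the stop-and-go one scores strictly higher on both, which is the kind of separation the Behavior axis is meant to surface.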

How a request flows

  1. Resolve. The task graph names a skill plus a target part. The SKILL_REGISTRY looks up the skill spec; the asset's capabilities.json confirms the target part offers the required affordances. A mismatch fails fast with a clear error.
  2. Plan. The skill solver constructs a motion plan (or the policy network predicts an action chunk).
  3. Roll out. Each step's observation is captured; the stage predicate is evaluated.
  4. Score. At end-of-episode, the three diagnostic dimensions are aggregated; the result is appended to the run's results.json.
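The four steps above can be sketched end to end as follows. This is a simplified illustration under several assumptions: the stage dictionaries, the required_by_skill and capabilities mappings, the step_fn callback standing in for the planner or policy, the rule that later stages count as failed once one stage fails, and the list-of-records layout of results.json are all hypothetical rather than taken from the MetaFine source.

```python
import json
from pathlib import Path


class AffordanceMismatch(ValueError):
    """Raised when a target part lacks an affordance the skill requires."""


def resolve(stage, required_by_skill, capabilities):
    # Step 1: look up the skill and check the part's affordances up front.
    skill, part = stage["skill"], stage["part"]
    if skill not in required_by_skill:
        raise KeyError(f"unknown skill: {skill}")
    missing = set(required_by_skill[skill]) - set(capabilities.get(part, ()))
    if missing:
        raise AffordanceMismatch(f"{skill} on {part!r} is missing {sorted(missing)}")
    return skill, part


def run_episode(stages, required_by_skill, capabilities, step_fn, max_steps=50):
    # Resolve every stage before acting, so a mismatch fails fast.
    for stage in stages:
        resolve(stage, required_by_skill, capabilities)
    outcomes = []
    for stage in stages:
        ok = False
        for t in range(max_steps):
            state = step_fn(stage, t)       # Steps 2-3: plan or predict, then observe
            if stage["success"](state):     # stage predicate evaluated each step
                ok = True
                break
        outcomes.append(ok)
        if not ok:
            # Assumption: stages after a failed one are recorded as failed.
            outcomes.extend([False] * (len(stages) - len(outcomes)))
            break
    return outcomes


def append_result(path, record):
    # Step 4: append one record to the run's results file (assumed JSON list).
    path = Path(path)
    records = json.loads(path.read_text()) if path.exists() else []
    records.append(record)
    path.write_text(json.dumps(records, indent=2))
```

Resolving all stages before the first step is a design choice in this sketch: it means a bad target in the last stage is reported before any simulation time is spent on the earlier ones.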

Source-tree map

Module                    Role
core/skill.py             21 motion-planning skill solvers.
core/skill_registry.py    @register_skill, SKILL_REGISTRY, the 11-affordance vocabulary.
core/predicates.py        Predicate-DSL compiler.
core/env.py               19 Gym envs (single-skill + bundle).
core/scene.py             SceneBuilders (data-driven, no per-asset branches).
core/env_mixins.py        EvalDREnvMixin: camera / light jitter helpers.
utils/task_graph.py       TaskGraph dataclass + YAML loader + runner.
utils/eval_metrics.py     EpisodeResult, EvalSummary, compute_smoothness.
utils/eval_sweep.py       dr_sweep + standard_dr_sweeps with AUSC.
utils/eval_setup.py       make_eval_env: dispatches single-skill ↔ task-graph mode.