Atlas · Servo7 · branch JvL/digital_twin_integration

Digital twin integration — interactive flow

Commands travel along whichever path the current control mode selects. The twin's record queue fills while recording in any non-replay mode; REPLAY drains it back through the bus in FIFO order.

Update 2026-05-16 (later): resurveyed after 9 commits. The soft snap-back was reverted to an immediate snap_twin_to_self() on DT disable. Robot.commanded_action is now a single-slot store (last action), populated inside set_state. STATE_UPDATE now carries four joint vectors per follower (joint_positions, hw_joint_positions, dt_joint_positions, commanded_joint_positions). The frontend renders a new AtlasJointDebugStrip that diffs the state, commanded, and twin vectors per joint.

The flow, stage by stage:

Leaders (action sources)
· TELEOP — Quest / Dummy leader, event-driven (callback)
· AI — AINode.predict_sync(), polled at 30 Hz
· ROBOT_CONTROLLER — controller.get_action(), polled at 30 Hz
· REPLAY (synthetic) — twin.get_action() drains the record queue

Orchestrator — orchestrator_node.py
· holds the control mode; command_dispatch_worker (30 Hz) pulls AI, CONTROLLER, and REPLAY actions; TELEOP arrives event-driven via callback
· all paths converge on smoothing + blending (EMA smoother · ModeBlender), then _publish_command() — the single seam onto the bus topic action.{id}

PubSub bus
· topic "action.{follower_id}", in-process or ZMQ (RemoteRobotClient)

RobotNode (follower)
· robot.set_state(action) — captures Robot.commanded_action, then routes

DT gate (inside set_state)
· if use_digital_twin → twin.set_state(); the hardware stays frozen
· else → _set_state_impl() drives the real arm, and the action is mirrored to the twin

DigitalTwin (DigitalTwinPID) — digital_twin/base.py · pid_6dof.py
· twin.set_state(action); _record(action) if recording and not replay-active
· _set_state_impl() runs a 100 Hz P/D step · _record_queue : deque[RobotState] · replay drains via twin.get_action()

SystemState · STATE_UPDATE @ 5 Hz
· annotate_state() now emits four vectors: joint_positions (route-aware: twin if DT on), hw_joint_positions (real hardware, always), dt_joint_positions (twin pose, always), commanded_joint_positions (last action)
· callbacks fire on update → orchestrator._on_leader_state_change (TELEOP)

Viser viewer — servo7/viz/viewer.py · :8080
· subscribes SystemState["follower"]; the URDF redraws on every state update
· buttons + jog sliders; "Scrub recording" → twin.get_trajectory(); jog → controller.set_joint(i, v)

Frontend (React) — robot-frontend/ · WS :8765
· AtlasOrchestrator panel: SET_CONTROL_MODE · EMERGENCY_STOP · SET_DIGITAL_TWIN
· AtlasJointDebugStrip (new): 3 rows per joint — state · action · dt (amber/cyan on drift)
· iframe → viser viewer :8080 (URDF · jog · recording · scrub)
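To make the four STATE_UPDATE vectors concrete, here is a minimal sketch of the per-follower annotation. The vector names match the list above; the helpers used to read them (hw_pose(), digital_twin, commanded_action, use_digital_twin) follow names used in this doc, but their exact shapes are assumptions, not the real annotate_state() body.

def annotate_state(robot) -> dict:
    # hedged sketch: helper names are assumptions, field names match STATE_UPDATE above
    hw = robot.hw_pose()                                               # real hardware joints, always (assumed helper)
    dt = robot.digital_twin.get_state().joint_positions if robot.digital_twin else None
    cmd = robot.commanded_action.joint_positions if robot.commanded_action else None
    return {
        "joint_positions": dt if robot.use_digital_twin else hw,      # route-aware: twin pose while DT routing is on
        "hw_joint_positions": hw,
        "dt_joint_positions": dt,
        "commanded_joint_positions": cmd,                             # last action captured by set_state()
    }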


How to read this

Where state actually lives (the part that doesn't lie)

Store | Owner | Who writes · who reads
SystemState.states | SystemState | written by every robot's state publish · read by the orchestrator (TELEOP callback), the viewer (URDF), and the STATE_UPDATE broadcast
RobotController._queue | RobotController | pushed by viser jog · drained by the orchestrator at 30 Hz in CONTROLLER mode
DigitalTwin._record_queue | DigitalTwin | FIFO · written by _record() when recording and not replay-active · drained by REPLAY via popleft() · snapshotted by the viewer scrub
Robot.commanded_action | each Robot | NEW · single-slot store (last action) · populated at the top of set_state() before routing · surfaced as commanded_joint_positions in STATE_UPDATE for the joint-debug strip
orchestrator.control_mode | Orchestrator | set by the frontend WS (SET_CONTROL_MODE) · broadcast as MODE_CHANGED
Robot._use_digital_twin | each Robot | toggled by enable_digital_twin() · independent of control_mode · disable does an immediate snap_twin_to_self() (no settle window)
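The last row's disable behaviour, sketched. enable_digital_twin and snap_twin_to_self are the names from the table; the body below is an assumption about how the described behaviour could be written, not the actual base.py code.

def enable_digital_twin(self, on: bool) -> None:
    # routing flag (independent of control_mode)
    self._use_digital_twin = on
    if not on and self.digital_twin is not None:
        # immediate snap, no settle window: the twin jumps to the real arm's
        # current pose so the next mirrored write starts from reality
        self.snap_twin_to_self()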

The one insight that makes the rest make sense

A DigitalTwin is a Robot, and it lives inside its parent Robot.

It inherits from the same Robot ABC as PiperRobot or SO100Robot. Same set_state(action), same get_state(), same RobotState dataclass on the wire. That's why the orchestrator, the system state, and the viewer don't have to care whether they're talking to silicon or to a P/D integrator — they just talk to "a Robot".

The twin isn't a sibling of the real robot — it's a field on the real robot. piper.digital_twin holds an instance of another Robot subclass. The "DT toggle" doesn't switch between two different APIs — it switches which Robot instance the parent's set_state call routes to: the parent's own hardware path, or the twin sitting in self.digital_twin.
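Concretely, attaching and routing might look like the sketch below. DummyRobot and DigitalTwinPID are real classes from the hierarchy in the next section, and the import paths follow the folder layout later in this doc, but the constructor arguments are elided; treat this as an illustration of the composition rather than working boot code.

from servo7.robot.mock.mock_6dof import DummyRobot       # import paths assumed from the folder layout below
from servo7.digital_twin.pid_6dof import DigitalTwinPID

parent = DummyRobot(...)                                  # any concrete Robot (constructor args elided)
parent.set_digital_twin(DigitalTwinPID(...))              # the twin becomes a field: parent.digital_twin

parent.enable_digital_twin(False)  # set_state() drives the parent's own hardware path, mirrored to the twin
parent.enable_digital_twin(True)   # set_state() routes only to the Robot sitting in parent.digital_twin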

Class hierarchy

«abstract base» Robot — servo7/robot/hardware/base.py
· @abstractmethod connect() → bool · @abstractmethod disconnect() → bool
· set_state(action) → bool (subclasses override _set_state_impl)
· get_state() → RobotState (subclasses override _get_state_impl)
· set_digital_twin(twin: "Robot | None") · enable_digital_twin(on: bool)

Hardware (real motors) — servo7/robot/hardware/ + mock/
· PiperRobot — hardware/piper.py
· DualPiperRobot — hardware/dual_piper.py
· SO100Robot · DualSO100Robot — hardware/so100*.py
· SO100FollowerRobot — hardware/so100_follower.py
· QuestVRRobot — hardware/quest_vr.py (used as leader)
· RosRobot (base) — hardware/base_ros.py
  · R1RosRobot — hardware/r1.py
  · R1TRosRobot — hardware/r1t.py
· DummyRobot — mock/mock_6dof.py
· SignalRobot (extends DummyRobot) — mock/mock_6dof_signal.py

Digital twin (sim / record) — servo7/digital_twin/
· «abstract» DigitalTwin — digital_twin/base.py
  · start_recording() / stop_recording() · get_action() → RobotState (popleft)
  · begin_replay() / end_replay() · get_trajectory() — non-destructive
  · _record_queue : deque[RobotState] · _recording · _replay_active (gate)
· DigitalTwinPID — digital_twin/pid_6dof.py · P/D integrator · 100 Hz step thread
· R1LiteSimRobot — digital_twin/r1_mujoco.py · MuJoCo backend (uses digital_twin/mujoco/r1/*)

Remote (network proxy) — servo7/robot/remote/
· RemoteRobotClient — remote/robot_client.py · ZMQ / MP / InProcess · talks to a remote RobotServer
· ROSRobotClient — remote/ros_robot_client.py · talks to a remote ROSRobotServer
· companion servers in remote/: RobotServer · ROSRobotServer

COMPOSITION — the twin lives INSIDE the parent. PiperRobot (real) holds self.digital_twin = … — a DigitalTwinPID (also a Robot) living as a field inside the parent; the inner type can be any DigitalTwin subclass.

FACTORY — RobotFactory.create(config) in servo7/robot/factory.py reads the RobotType enum and instantiates the right class (PIPER · DUAL_PIPER · SO100 · DUAL_SO100 · R1 · R1T · R1_LITE_SIM · QUEST_VR · DUMMY · SIGNAL · PID), then attaches a DigitalTwin via set_digital_twin() when configured.

RUNTIME OWNER — RobotNode in servo7/nodes/robot_node.py owns one Robot per follower, subscribes "action.{robot_id}" on the bus, publishes state changes to SystemState, and is orchestrated by OrchestratorNode (the mode state machine).

Inheritance runs from the Robot ABC down to every concrete class above. Composition is the separate relationship: any concrete Robot can hold another Robot as a field via self.digital_twin — the twin doesn't sit next to the parent, it sits inside it.
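For orientation, the template-method shape implied by the listing above, as a sketch; the real base.py adds the DT routing, mirroring, and commanded_action capture described elsewhere in this doc, which are stubbed out here.

from abc import ABC, abstractmethod

class Robot(ABC):
    # sketch of the contract, not the real servo7/robot/hardware/base.py
    digital_twin: "Robot | None" = None

    @abstractmethod
    def connect(self) -> bool: ...

    @abstractmethod
    def disconnect(self) -> bool: ...

    def set_state(self, action) -> bool:
        # the real body also captures commanded_action and routes through the DT gate
        return self._set_state_impl(action)

    def get_state(self):
        return self._get_state_impl()

    def set_digital_twin(self, twin: "Robot | None") -> None:
        self.digital_twin = twin

    # concrete subclasses override these two instead of set_state / get_state
    def _set_state_impl(self, action) -> bool:
        raise NotImplementedError

    def _get_state_impl(self):
        raise NotImplementedError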

Folder layout

Only the parts relevant to the robot/digital-twin story. The // notes annotate each entry; the key files defining the contract and the pieces new on this branch are called out inline.

servo7/
├── robot/
│   ├── factory.py              // RobotFactory.create(config) — picks the right class from RobotType
│   ├── robot_types.py          // RobotType enum · RobotState · RobotObservation · RobotStatus
│   │
│   ├── hardware/               // real motors / cameras / IK
│   │   ├── base.py             // Robot ABC · DT routing · snap_twin_to_self · commanded_action
│   │   ├── base_ros.py         // shared RosRobot base
│   │   ├── piper.py            // PiperRobot
│   │   ├── dual_piper.py       // DualPiperRobot
│   │   ├── so100.py            // SO100Robot
│   │   ├── dual_so100.py       // DualSO100Robot
│   │   ├── so100_follower.py   // SO100FollowerRobot
│   │   ├── quest_vr.py         // QuestVRRobot — used as leader (input)
│   │   ├── r1.py               // R1RosRobot
│   │   └── r1t.py              // R1TRosRobot — teleop variant
│   │
│   ├── mock/                   // deterministic stand-ins for tests + jogging
│   │   ├── mock_6dof.py        // DummyRobot — the generic 6-DOF mock
│   │   ├── mock_6dof_signal.py // SignalRobot — DummyRobot with deterministic signal trace
│   │   └── mock_r1.py          // MockRobotNode for R1 integration tests
│   │
│   └── remote/                 // network proxies — used when twin/HW run in different processes
│       ├── robot_client.py     // RemoteRobotClient (ZMQ) — looks like a Robot, forwards to RobotServer
│       ├── robot_server.py     // RobotServer — wraps a local Robot, exposes it over ZMQ
│       ├── ros_robot_client.py
│       └── ros_robot_server.py
│
├── digital_twin/               // NEW package on this branch — Robot subclasses that simulate
│   ├── base.py                 // DigitalTwin ABC — adds _record_queue, replay gate, snap()
│   ├── pid_6dof.py             // DigitalTwinPID — P/D 6-DOF integrator at 100Hz
│   ├── r1_mujoco.py            // R1LiteSimRobot — MuJoCo-backed twin
│   └── mujoco/                 // embedded sim env (URDF, scene, scripted policies)
│       └── r1/ …
│
└── nodes/
    ├── orchestrator_node.py    // mode state machine · command dispatch worker · broadcast
    ├── robot_node.py           // owns one Robot per follower · subscribes "action.{id}"
    ├── frontend_node.py        // WS :8765 server · SET_CONTROL_MODE / STATE_UPDATE handlers
    ├── system_state_node.py    // SystemState container + callback registry
    ├── ai_node.py              // AI policy leader (predict_sync)
    ├── recorder_node.py        // dataset episode recorder
    ├── training_node.py        // LoRA / policy training lifecycle
    └── message_protocol.py     // MessageType enum, ControlMode enum, message dataclasses

What a "robot with a twin" looks like at runtime

The twin literally lives inside the parent robot. It's a field, not a sibling. piper.digital_twin holds a DigitalTwinPID instance — which is itself a Robot. The layout below shows the nesting: the outer block is the real Piper, the inner block is the twin it owns.
RobotNode (one per follower — owns ONE Robot, which is…)
└── PiperRobot (real arm) — inherits Robot
    · self.commanded_action : RobotState | None
    · self._use_digital_twin : bool
    · self.digital_twin : Robot | None ← the field that holds the twin
    └── self.digital_twin = DigitalTwinPID(…) — the twin itself, attached as a field
        DigitalTwinPID (sim) · inherits DigitalTwin < Robot
        · self._record_queue : deque[RobotState] · self._recording · self._replay_active
        · self._target / self._joints (P/D state) · 100 Hz step thread converges _joints → _target

piper.set_state(action) — the actual body:
    self.commanded_action = action
    if self.digital_twin is not None:
        self._set_state_digital_twin(action)   # ← always feeds the inner twin
    if not self.use_digital_twin:
        self._set_state_impl(action)           # ← HW only when DT routing is off

Two independent if statements, not an if/else. The twin always sees the write (so its pose tracks the real arm even when DT routing is off — that's how the joint-debug-strip stays meaningful). The hardware path only runs when the routing flag is off; when it's on, the motors stay still and only the inner twin moves. Same Robot contract on both sides — nothing outside this diagram changes either way.
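The same write path as a commented method sketch. The helper names (_set_state_digital_twin, _set_state_impl) are the ones shown above; return handling and locking in the real base.py may differ.

def set_state(self, action) -> bool:
    # 1. single-slot capture, surfaced later as commanded_joint_positions
    self.commanded_action = action
    # 2. the twin always sees the write (mirror), so its pose tracks commands
    #    even while DT routing is off
    if self.digital_twin is not None:
        self._set_state_digital_twin(action)
    # 3. hardware only moves when DT routing is off
    if not self.use_digital_twin:
        return self._set_state_impl(action)
    return True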

Quick recipe — how a new digital twin gets added

  1. New subclass under servo7/digital_twin/ extending DigitalTwin (recording / replay-gate machinery is inherited).
  2. Override the two @abstractmethods: connect() and disconnect().
  3. Override _set_state_impl(action) and _get_state_impl() for the sim's write/read. (Don't override set_state / get_state directly unless you really need to — the DT-routing logic in the base class only kicks in if you let it.)
  4. Add an enum value to RobotType in robot/robot_types.py.
  5. Wire it up in RobotFactory.create() — map the new type to your class.
  6. Configure it on the hardware robot at boot: real_robot.set_digital_twin(new_twin); flip it on with real_robot.enable_digital_twin(True).
  7. Nothing in the orchestrator, system state, or viewer changes.
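Steps 1 to 5, sketched as a template. GravityDropTwin and GRAVITY_DROP are invented example names, the import paths are assumed from the folder layout above, and the RobotState constructor call is an assumption; treat this as a shape to copy, not tested code.

# servo7/digital_twin/gravity_drop.py — hypothetical example module
from servo7.digital_twin.base import DigitalTwin      # step 1: inherit recording / replay-gate machinery
from servo7.robot.robot_types import RobotState

class GravityDropTwin(DigitalTwin):
    def connect(self) -> bool:                         # step 2: the two @abstractmethods
        self._joints = [0.0] * 6
        return True

    def disconnect(self) -> bool:
        return True

    def _set_state_impl(self, action) -> bool:         # step 3: the sim's write path
        self._joints = list(action.joint_positions)    # assumes RobotState carries joint_positions
        return True

    def _get_state_impl(self) -> RobotState:           # step 3: the sim's read path
        return RobotState(joint_positions=self._joints)  # constructor args are an assumption

# step 4: add GRAVITY_DROP to RobotType in robot/robot_types.py
# step 5: map RobotType.GRAVITY_DROP → GravityDropTwin in RobotFactory.create()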

One contract, many transports

The whole point of PubSub: nothing above the bus knows or cares whether bytes travel via callbacks, an mp.Queue, a ZMQ socket, or a UDP datagram. Same four methods (publish / subscribe / get_latest / close) on every implementation. Same client code, swap the bus at construction time.

That's how the same RobotServer + RemoteRobotClient pair runs in-process in tests, in a subprocess for local isolation, and across hosts in production.

«abstract base» PubSub — servo7/bus/bus.py
· publish(topic, payload) · subscribe(topic, callback) · get_latest(topic) · close()

SAME PROCESS — InProcessBus (bus/in_process_bus.py)
· publish → direct callback dispatch · zero serialization, zero IPC
· use: tests, sim-only, integration · latency: ~µs · scope: one Python process

SIBLING PROCESS — MpBus (bus/mp_bus.py)
· backed by an mp.Queue pair · + request() / serve() — sync RPC on the same wire
· use: spawn a robot in a subprocess without OS networking · latency: ~ms · scope: parent + child process

NETWORK · STREAM — ZmqBus (bus/zmq_bus.py)
· PUB / SUB sockets per topic · bind() server-side, connect() client-side · reliable, ordered, queued
· use: production cross-host control, backbone for RemoteRobotClient · latency: a few ms (LAN) · scope: any host that can route

NETWORK · DATAGRAM — UdpBus (bus/udp_bus.py)
· raw UDP sockets per topic · unreliable, lossy, no ordering · low-overhead streaming
· use: Quest VR pose stream (hardware/quest_vr.py) · latency: sub-ms · scope: any host on the LAN
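A sketch of what "swap the bus at construction time" means for calling code; the import path follows the table above, and the robot objects in the comments are placeholders.

from servo7.bus.in_process_bus import InProcessBus     # import path assumed from the table above

def wire_follower(bus, robot, follower_id):
    # identical regardless of transport: only the four-method PubSub contract is used
    bus.subscribe(f"action.{follower_id}", robot.set_state)   # incoming actions drive the Robot

# tests / sim-only:  wire_follower(InProcessBus(), dummy_robot, "follower")
# production:        wire_follower(<a connected ZmqBus>, piper_robot, "follower")   # same call, different transport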

The remote pair — wrapping a Robot, not a RobotNode

Important design choice: RemoteRobotClient and RobotServer wrap a single Robot instance — not a RobotNode.

That means the remoting boundary sits one level below where you might expect. Anything that takes a Robot can take a remote-wrapped one without noticing. The orchestrator, the RobotNode, the SystemState callbacks — none of them know whether the Robot under them is local silicon, a digital twin, or a ZMQ stub pointing at another machine.

The flip side: today there's one RobotNode per follower. The modularity is at the Robot level, not yet at the RobotNode level. See the "Future" panel below.

CLIENT PROCESS (e.g. atlas main)
· OrchestratorNode · RobotNode — treat the Robot below as just "a Robot"
· RemoteRobotClient (a Robot subclass) — robot/remote/robot_client.py
  · set_state(action) → bus.publish(TOPIC_COMMAND, …)
  · on bus(TOPIC_STATE) → cache as self._state
· PubSub bus, client end: ZmqBus.connect(host, ports) / MpBus / InProcessBus — TOPIC_COMMAND out, TOPIC_STATE in

WIRE (whatever the bus is)
· TOPIC_COMMAND carries RobotState actions → server
· TOPIC_STATE carries RobotState frames ← server

SERVER PROCESS (e.g. piper-host)
· PubSub bus, server end: ZmqBus.bind(ports) / MpBus / InProcessBus — TOPIC_COMMAND in, TOPIC_STATE out
· RobotServer — robot/remote/robot_server.py
  · subscribes TOPIC_COMMAND → robot.set_state()
  · polls robot.get_state() → publishes TOPIC_STATE
· Robot (the actual one): PiperRobot · DigitalTwinPID · DummyRobot · anything

Same RobotServer class, same RemoteRobotClient class — the only thing that changes between dev/test/prod is which PubSub implementation gets passed in.
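A minimal sketch of what the server side amounts to, using only the behaviour described above (subscribe to commands, poll and publish state). The topic strings, poll rate, and constructor are assumptions; the real RobotServer lives in robot/remote/robot_server.py.

import time

class MiniRobotServer:
    # hedged sketch of RobotServer's job, not the real robot_server.py
    def __init__(self, robot, bus, state_hz=30.0):
        self.robot, self.bus, self.period = robot, bus, 1.0 / state_hz
        bus.subscribe("TOPIC_COMMAND", self._on_command)      # real topic constants live in servo7

    def _on_command(self, action):
        self.robot.set_state(action)                          # forward wire actions to the wrapped Robot

    def serve_forever(self):
        while True:
            self.bus.publish("TOPIC_STATE", self.robot.get_state())   # stream state back out
            time.sleep(self.period)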

How to instantiate — same client, four transports

RemoteRobotClient.over_zmq(
    robot_type="piper",
    host="piper-host.lan",
    state_port=5555, command_port=5556,
)
# production · server already running on another host · ZmqBus.connect

RemoteRobotClient.over_zmq_local_server(
    robot_config="piper.yaml",
    state_port=5555, command_port=5556,
)
# spawn a robot_server subprocess on localhost, talk to it via ZMQ
# useful when the robot lib needs a different env than atlas

RemoteRobotClient.over_in_process(robot_config="dummy.yaml")
# server runs in the SAME process, talks over InProcessBus
# useful for tests — exercises the proxy code path with zero IPC overhead

RemoteRobotClient.over_mp_subprocess(robot_config="dummy.yaml")
# mp.Process running RobotServer, talk over MpBus (mp.Queue pair)
# useful when you want process isolation without OS sockets

All four return a RemoteRobotClient — which IS a Robot. Pass it to RobotNode or call real_robot.set_digital_twin(client) just like any other Robot.

Where this sits in the abstraction stack

Orchestrator — mode state machine
└── RobotNode — owns ONE Robot (today)
    └── Robot (abstract contract) — .set_state / .get_state
        …and "that Robot" might actually be:
        · PiperRobot — talks to the real arm directly
        · DigitalTwinPID — P/D simulation
        · RemoteRobotClient → bus → RobotServer (another proc/host) → PiperRobot over there, or DigitalTwinPID, or anything else

Future direction

Today: one RobotNode per follower; modularity lives at the Robot level. You can swap silicon for sim, swap local for remote, attach a twin — all without touching anything above RobotNode. But you can't yet teach a single RobotNode to look after multiple robots at once.

Where this is going: the long-term plan is RobotNode as a collection. The communication boundary already exists at the right place (per-Robot, not per-Node), so the future shape is additive — wrap each member of the collection in whatever transport it needs, the rest of the system keeps treating them as Robots.

TODAY
· Orchestrator
  · RobotNode (one robot) → Robot (local / twin / remote-wrapped)
  · RobotNode (one robot) → Robot (local / twin / remote-wrapped)

FUTURE (planned)
· Orchestrator
  · RobotNode (collection)
    · Robot A — local (PiperRobot)
    · Robot B — remote (ZMQ) via RemoteRobotClient
    · Robot C — twin (in-process) via DigitalTwinPID

The FUTURE half hasn't happened yet — that's the planned direction. The point of the comparison is to show that the building blocks (transport-agnostic Robot wrappers) are already in place to support it without changing the surface.