How to Train an End-to-End Model
This guide walks you through collecting ML training data and then using a Diffusion Policy for end-to-end ML control.
By following this guide, you'll learn how to:
- Collect demonstration data using teleoperation (or an ML oracle)
- Train and export a diffusion-based policy model
- Integrate the policy into a MoveIt Pro Objective for deployment
This is a feature exclusive to the MoveIt Pro Ultimate tier.
Diffusion policy overview
Diffusion policy is a cutting-edge method for learning robot control policies using diffusion models, offering strong empirical performance and solid theoretical foundations. It outperforms traditional behavior cloning approaches, especially in tasks requiring long-horizon planning or complex sensorimotor coordination. The approach benefits from a relatively simple training process and stable convergence, avoiding the complexities of adversarial or reinforcement learning. Its compatibility with standard datasets and frameworks, along with several open-source implementations, has contributed to its rapid adoption in the robotics community. Thanks to its reliability, efficiency, and generalization ability, diffusion policy is increasingly used as a default baseline in state-of-the-art robotics control pipelines.
Teleoperation Setup
MoveIt Pro supports many types of teleoperation, built into the UI, for human-in-the-loop demonstrations. In this tutorial, we recommend you use a Meta Quest device. See our tutorial on setting up the Quest here. Follow that tutorial up to and including the Run the Meta Quest APK section; the rest of it is not necessary for this guide.
MoveIt Pro also supports using Oracles: automated, scripted demonstrations built with our standard Objective approach.
Data collection and processing
PyTorch integration
PyTorch is an open-source machine learning framework that provides flexible tools for building, training, and deploying deep learning models using dynamic computation graphs and GPU acceleration. The framework provides two key concepts for integrating user data into the training process: Datasets and DataLoaders. The PyTorch DataLoader utility efficiently loads data from a dataset in batches, with optional shuffling, parallelism, and preprocessing. The Datasets are responsible for providing access to individual data samples and their labels, defining how data is read, preprocessed, and returned to the DataLoader. We provide the MCAPDataset, which loads ROS bag MCAP-formatted data into a compatible format.
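As an illustration, the following sketch shows how a dataset like this plugs into a DataLoader. The MCAPDataset import path and constructor arguments here are assumptions for illustration only; refer to the training scripts in diffusion_model_trainer for the actual usage.

from torch.utils.data import DataLoader

# NOTE: the import path and constructor argument below are illustrative
# assumptions; see the diffusion_model_trainer scripts for the real API.
from diffusion_model_trainer.dataset import MCAPDataset

dataset = MCAPDataset("{PATH_TO_DATA_DIRECTORY}")

# The DataLoader batches, shuffles, and parallelizes loading of the
# samples that the Dataset exposes.
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=4)

for batch in loader:
    # Each batch holds synchronized observations (joint states, images)
    # and the corresponding action targets used to train the policy.
    pass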
For the MCAPDataset to function correctly, the recorded ROS bag must contain the following topics:
- /joint_states_synchronized (sensor_msgs/msg/JointState)
- /joint_commands_synchronized (sensor_msgs/msg/JointState)
- /wrist_camera/color_synchronized (sensor_msgs/msg/Image)
- /scene_camera/color_synchronized (sensor_msgs/msg/Image)
- /demonstration_indicator (std_msgs/msg/String)
The topics should be synchronized. The best way to achieve this is to run a node that subscribes to all needed topics and republishes them. We provide the diffusion_policy_observation_recorder node to do exactly that.
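For intuition, one common way to synchronize topics in ROS 2 is message_filters. The following is a conceptual sketch only: the provided node additionally resamples at a fixed period dt, which this sketch omits, and the camera topic name is a placeholder.

import rclpy
from rclpy.node import Node
from message_filters import Subscriber, ApproximateTimeSynchronizer
from sensor_msgs.msg import Image, JointState

class SyncRepublisher(Node):
    def __init__(self):
        super().__init__("sync_republisher")
        joint_sub = Subscriber(self, JointState, "/joint_states")
        image_sub = Subscriber(self, Image, "/wrist_camera/color")  # placeholder topic
        self.joint_pub = self.create_publisher(JointState, "/joint_states_synchronized", 10)
        self.image_pub = self.create_publisher(Image, "/wrist_camera/color_synchronized", 10)
        # Pair up messages whose timestamps fall within 50 ms of each other.
        self.sync = ApproximateTimeSynchronizer([joint_sub, image_sub], queue_size=10, slop=0.05)
        self.sync.registerCallback(self.republish)

    def republish(self, joint_msg, image_msg):
        # Republish the time-aligned pair on the synchronized topics.
        self.joint_pub.publish(joint_msg)
        self.image_pub.publish(image_msg)

def main():
    rclpy.init()
    rclpy.spin(SyncRepublisher())

if __name__ == "__main__":
    main()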
diffusion_policy_observation_recorder node
The diffusion_policy_observation_recorder node is responsible for subscribing to sensor topics, including joint states and camera images, and republishing the data on synchronized topics. The following parameters are configurable.
- joints: The names of the joints to record. The configured joints must be contained in the robot model defined by the SRDF.
- wrist_camera_topic: The name of the wrist camera topic to record. The images will be republished to the /wrist_camera/color_synchronized topic.
- scene_camera_topic: The name of the scene camera topic to record. The images will be republished to the /scene_camera/color_synchronized topic.
- joint_states_topic: The name of the topic providing the joint state data, usually /joint_states. The joint states will be republished to the /joint_states_synchronized topic.
- joint_commands_topic: The name of the topic providing the joint command data. The default value is /joint_states, but model performance is generally better if you publish the commanded values to a dedicated topic, e.g., /joint_commands, and configure that topic here. The joint commands will be republished to the /joint_commands_synchronized topic.
- dt: The publishing period for the synchronized topics. A value of 0.1 seconds is a reasonable default. Tasks requiring higher reactivity, e.g., pushing a block, benefit from a smaller dt; tasks that are more forgiving of slower reaction can tolerate a larger dt. Some trial and error will be necessary to determine the best value for this parameter.
- image_size: Defaults to [240, 320]. This parameter rescales the image topics before republishing. The default is likely a good value for most applications, though you can experiment to find the best size for a given application.
As an example, the following can be added to a site config launch file to include this node:
<node pkg="moveit_pro_ml_ros" exec="diffusion_policy_observation_recorder_main">
<param name="joints" value="[joint_1, joint_2, joint_3, joint_4, joint_5, joint_6, joint_7, robotiq_85_left_knuckle_joint]" />
<param name="wrist_camera_topic" value="/zed2i_camera/right/image_raw_color" />
<param name="scene_camera_topic" value="/zed2i_camera/left/image_raw_color" />
<param name="joint_states_topic" value="/joint_states" />
<param name="joint_commands_topic" value="/joint_commands" />
<param name="dt" value="0.1" />
<param name="image_size" value="[240, 320]" />
</node>
ROS Bag Data Collection
Record the data with the following ROS bag command.
ros2 bag record --max-cache-size 32000000000 --snapshot-mode -s mcap /robot_description /joint_states_synchronized /joint_commands_synchronized /demonstration_indicator /wrist_camera/color_synchronized /scene_camera/color_synchronized -o {PATH_TO_DATA_DIRECTORY}
You can now begin data collection:
- Manually perturb the scene, e.g., move the robot to a novel start state and randomize the positions of any objects to be interacted with.
- Publish a string with data “Starting demonstration” to the demonstration indicator topic.
- Provide a demonstration, either by teleoperation or an oracle policy.
- Publish a string with data “Stopping demonstration” to the demonstration indicator topic.
- Go to step 1 and repeat.
A general rule is to collect between 50 and 150 demonstrations for a task. The data should cover a wide range of conditions, and it is important to include demonstrations of recovering from various failure modes.
About training with ROS topic interfaces
The control logic of data collection is designed to be implemented via ROS topics and services. The ROS bag MCAP data loader used in our PyTorch training scripts relies on a string topic called /demonstration_indicator. Whenever a demonstration begins, a string with the data Starting demonstration is expected to be published to the topic. When the demonstration finishes, a string with the data Stopping demonstration is expected to be published. If a start demonstration message is published while a demonstration is already active, that demonstration will be restarted.
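As a sketch, assuming you are publishing the boundary markers manually, a minimal rclpy script could look like this:

import time

import rclpy
from rclpy.node import Node
from std_msgs.msg import String

def main():
    rclpy.init()
    node = Node("demonstration_indicator_publisher")
    pub = node.create_publisher(String, "/demonstration_indicator", 10)
    time.sleep(1.0)  # give the bag recorder time to discover the publisher

    # Mark the start of the demonstration.
    pub.publish(String(data="Starting demonstration"))

    input("Demonstrate the task, then press Enter to stop...")

    # Mark the end of the demonstration.
    pub.publish(String(data="Stopping demonstration"))
    time.sleep(0.5)  # let the last message flush before shutting down

    node.destroy_node()
    rclpy.shutdown()

if __name__ == "__main__":
    main()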
The provided ROS bag record command runs in snapshot mode, which creates a service named /rosbag2_recorder/snapshot. When the service is called, the data currently buffered in the bag will be saved to a chunk. This is beneficial because we want the chunks to line up with the demonstrations, so we avoid collecting too much or too little data.
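For example, after each Stopping demonstration message you can trigger a snapshot from another terminal. This assumes the default recorder node name from the command above and the Snapshot service type that ships with rosbag2:

ros2 service call /rosbag2_recorder/snapshot rosbag2_interfaces/srv/Snapshot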
Finally, our data collection Objective subscribes to a topic for end effector control. We recommend a VR headset for use with the Objective. We have also configured a data collection Objective for direct joint control, which can be used with leader-follower setups or GELLO models.
Collecting Data with Oracle Policies
In this guide, we explored collecting teleoperation data, but MoveIt Pro can also be used to create Objectives that automatically generate training data without human interaction.
To do this, create and run an Objective and record the trajectory and camera data like you would with normal teleop.
We recommend using Model Predictive Control (MPC) Behaviors to get smooth, consistent data. See the example Pushing Sim Objective in the diffusion policy workspace.
Train the model
Local setup
Prerequisites for your training PC
- NVIDIA GPU with 16 GB VRAM or better (e.g., RTX 4080)
- 32 GB RAM recommended
- 1 TB disk space recommended for collecting data
- MoveIt Pro Ultimate license
You will need around 10 hours to train a new skill.
Cloud instance
We do not officially support any particular GPU cloud provider. MoveIt Pro requires access to either a virtual machine or a bare metal instance. The minimum requirements are an NVIDIA GPU with at least 16 GB of VRAM, 32 GB of CPU RAM, and 50 GB of disk space. We have verified hyperbolic.ai as a viable service for GPU-powered model training. Training a new diffusion policy on an NVIDIA H100 80GB GPU should cost around $10. You will need to transfer your data to the instance for training. The data is expected to exceed 1 GB, so a remote storage option is likely needed. One option is to add the training data to the git repository for your workspace. You must set up git LFS for this to be a viable option.
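For example, to track the recorded MCAP files with git LFS before committing them (illustrative commands; adjust the pattern and path to your workspace layout):

git lfs install
git lfs track "*.mcap"
git add .gitattributes {PATH_TO_DATA_DIRECTORY}
git commit -m "Add diffusion policy training data"
git push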
Install MoveIt Pro by following the MoveIt Pro installation instructions. You will likely need to follow the steps for installing Docker. Make sure to follow the steps for installing the NVIDIA Container Toolkit, which enables the Docker container to access the GPU for training.
Training the model
Once MoveIt Pro is installed and configured, clone the diffusion_model_trainer GitHub repository and add your workspace as a submodule. Make sure to clone the repository recursively to pull in all of the submodules. You also need to set up git LFS for the large model files.
git clone https://github.com/PickNikRobotics/diffusion_model_trainer.git
cd diffusion_model_trainer
git submodule update --init --recursive
git lfs install
git lfs pull
Contact support if you are an Ultimate licensed customer and need help accessing this repository.
After the first iteration of data collection is complete, you can run a dev container with the following command.
moveit_pro dev
This command will automatically configure the environment specified with pixi.
Please note that the Python version in the environment may differ from the installed Python version of ROS 2, so you may run into errors when running ROS 2 commands. It is best to use the dev container only for training and exporting the model. If you need to run ROS 2 commands, you can run moveit_pro shell in another terminal.
To train the model, run the following command.
ros2 run diffusion_model_trainer train_diffusion_model -- --config-name=train_diffusion_model task.dataset_path={PATH_TO_DATA_DIRECTORY}
This will train the model and create checkpoints every 50 epochs. We recommend periodically exporting the ONNX model and testing on the robot or in simulation to validate the model.
The diffusion model training is configured via Hydra. Hydra is a Python framework for managing complex configurations, enabling you to compose, override, and organize settings dynamically from multiple sources like YAML files and command-line arguments. Hydra has become widely adopted specifically for configuring model hyperparameters. The Hydra YAML should be configured on a per-application basis.
You need to set the following parameters for your application (an example follows the list):
- In joint_based_task.yaml:
  - dataset.joint_names
  - dataset.joint_states_topic
  - dataset.joint_commands_topic
  - shape_meta.obs./joint_states_synchronized.shape
  - shape_meta.obs.action.shape
- Note: We only support the DDIMScheduler.
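For illustration, these settings for the eight-joint arm configured earlier might look as follows. The exact schema comes from the provided joint_based_task.yaml; treat every value here as a placeholder to adapt to your robot and recorder configuration.

# Illustrative values only; match these to your robot and recorder settings.
dataset:
  joint_names: [joint_1, joint_2, joint_3, joint_4, joint_5, joint_6, joint_7, robotiq_85_left_knuckle_joint]
  joint_states_topic: /joint_states_synchronized
  joint_commands_topic: /joint_commands_synchronized
shape_meta:
  obs:
    /joint_states_synchronized:
      shape: [8]  # one entry per configured joint
    action:
      shape: [8]  # action dimension; here, one command per joint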
Deploy the model
MoveIt Pro uses ONNX to execute models within its Behavior Tree framework. Since model publishers typically do not provide scripts for exporting PyTorch models, we include ONNX export scripts to enable the use of diffusion policies with MoveIt Pro.
Prerequisites for Robot PC
- NVIDIA GPU with 8 GB VRAM or better (e.g., RTX 4080)
- MoveIt Pro Ultimate license
Exporting the model
In a MoveIt Pro dev container, run the following command.
ros2 run diffusion_model_trainer export_onnx_diffusion_unet_image_workspace -- --config-name=export_onnx_model task.dataset_path={PATH_TO_DATA_DIRECTORY}
This will create two ONNX models in the data directory. You can copy those into your site config's models directory for use with MoveIt Pro. Create the models directory if necessary. You will need to run colcon build if this is your first time copying the models into that directory.
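Before deploying, you can sanity-check an exported model with onnxruntime in the dev container. A minimal sketch follows; the file name is a placeholder for one of the exported models.

# Verify that an exported model loads and inspect its input/output signature.
import onnxruntime as ort

session = ort.InferenceSession("policy.onnx")  # placeholder file name

for tensor in session.get_inputs():
    print("input:", tensor.name, tensor.shape, tensor.type)
for tensor in session.get_outputs():
    print("output:", tensor.name, tensor.shape, tensor.type)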
Running in MoveIt Pro
Create an Objective that uses the DiffusionPolicy Behavior. It depends on the JTAC controller, so you will need to make sure that controller is activated before running the DiffusionPolicy Behavior. Make sure the exported ONNX models are installed in a site config; we recommend putting them in a directory called models. Run the Objective and verify the policy in action.
Conclusion
You have now completed the end-to-end setup for the Diffusion Policy in MoveIt Pro. Starting from demonstration data collection using either teleoperation or an ML oracle, through training and exporting the model as ONNX, and finally integrating the policy into a MoveIt Pro Objective, you have transformed raw demonstrations into a deployable learned controller.