Full-Body Visual Self-Modeling of Robot Morphologies

Boyuan Chen, Robert Kwiatkowski, Carl Vondrick, Hod Lipson

Internal computational models of physical bodies are fundamental to the ability of robots and animals alike to plan and control their actions. These "self-models" allow robots to consider outcomes of multiple possible future actions, without trying them out in physical reality. Recent progress in fully data-driven self-modeling has enabled machines to learn their own forward kinematics directly from task-agnostic interaction data. However, forward-kinematics models can only predict limited aspects of the morphology, such as the position of end effectors or velocity of joints and masses. A key challenge is to model the entire morphology and kinematics, without prior knowledge of what aspects of the morphology will be relevant to future tasks. Here, we propose that instead of directly modeling forward-kinematics, a more useful form of self-modeling is one that could answer space occupancy queries, conditioned on the robot's state. Such query-driven self models are continuous in the spatial domain, memory efficient, fully differentiable and kinematic aware. In physical experiments, we demonstrate how a visual self-model is accurate to about one percent of the workspace, enabling the robot to perform various motion planning and control tasks. Visual self-modeling can also allow the robot to detect, localize and recover from real-world damage, leading to improved machine resiliency.


Overview Video (with narrations and subtitles)


Latest version: arXiv:2111.06389 [cs.CV]

Code and Dataset

We release our codebase at this link. Please follow the instructions on our GitHub page to use the code and the dataset..


Columbia University


This research was supported by DARPA MTO Lifelong Learning Machines (L2M) Program HR0011-18-2-0020, NSF NRI Award #1925157, NSF CAREER Award #2046910, and a gift from Facebook Research.


If you have any questions, please feel free to contact Boyuan Chen