We have developed ThinkGrasp, a plug-and-play vision-language system for grasping strategies in heavily cluttered environments.
We develop an approach for efficiently learning open-vocabulary, language-conditioned manipulation policies.
We study how to ground language for robotic grasping while preserving the geometric structure of its symmetry.
We train an equivariant network for pose prediction from a single 2D image using induced and restricted representations.