Dexterous Functional Grasping

All videos play at 1x speed

Objects placed upright


Hard Objects Grasp


New setup

We install three cameras to capture the front, side, and top views of the object, which makes the system more robust to the initial object orientation.


Affordance prediction from multiple views

We obtain affordance predictions from each of the three cameras, which capture the top, side, and front views of the object. The prediction with the highest confidence score is then used to retarget the hand to the desired pre-grasp position. Below, we show the affordance predictions from all three cameras for different objects.
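The view-selection step described above can be illustrated with a minimal sketch. It is not the project's actual code: the `AffordancePrediction` type, its fields (including `contact_point` as pixel coordinates), and the function name `select_best_prediction` are hypothetical names introduced only for illustration. The sketch assumes each camera yields one prediction with a scalar confidence score.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class AffordancePrediction:
    """Hypothetical container for one camera's affordance prediction."""
    view: str                           # e.g. "front", "side", or "top"
    confidence: float                   # assumed scalar confidence score
    contact_point: Tuple[float, float]  # assumed pixel coordinates (u, v)


def select_best_prediction(
    predictions: Sequence[AffordancePrediction],
) -> AffordancePrediction:
    """Return the prediction with the highest confidence score.

    Ties are resolved in favor of the earliest entry in the sequence.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return max(predictions, key=lambda p: p.confidence)
```

With one prediction per camera, the selected prediction would then be passed on to whatever routine retargets the hand to the pre-grasp position; that routine is outside the scope of this sketch.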



Drill







Mug