Affordance prediction from multiple views
We get the affordance predictions from each of the three cameras capturing the top, side and front view of the object. The prediction with the highest confidence score is finally used to retarget the hand in the desired pre-grasp position.
Below, we show the affordance predictions from all three cameras for different objects.
|
|