
Vision-Language Grounding

In Tier 1, the image encoder was a black box. Now we open it: how embeddings let the robot see what language describes.
