RoboVision

Vision–language–action (VLA) policy loops first, then classical kinematics rendered in WebGL (Three.js). Every lab is stepped like AlgoVision and builds both ML policy intuition and geometric intuition.

Learning path
Tier 1–2 · Foundations

VLA policy loop (OpenVLA-style)

Cameras + instruction → vision encoder → language model → action tokens, decoded to joint or end-effector (EE) deltas and kept in sync with the real robot.

Step-through · ~20 min · no WebGL

Sim Loop & Reward Shaping

The RL training loop: observation, policy inference, environment step, reward signal, replay buffer, and episode lifecycle.

Step-through · ~20 min · SVG animation

Demo Collection Pipeline

How robot demonstrations become training data: teleoperation recording, annotation, RLDS/HDF5 conversion, and validation.

Step-through · ~18 min · SVG flowchart

Planar 2-DOF arm & IK

Two revolute joints, analytic inverse kinematics, 3D visualization. Drag the target or step through the derivation.

Three.js · ~25 min · Interactive
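
As a preview of the derivation this lab steps through, here is a minimal sketch of analytic IK for a planar two-link arm. The function names, the unit link lengths, and the `flip_elbow` flag are illustrative assumptions, not the lab's actual code.

```python
import math

def two_link_fk(theta1, theta2, l1=1.0, l2=1.0):
    """Forward kinematics: joint angles (radians) -> end-effector (x, y)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

def two_link_ik(x, y, l1=1.0, l2=1.0, flip_elbow=False):
    """Analytic IK for a planar 2R arm.

    Returns (theta1, theta2) in radians, or None if the target is out of reach.
    flip_elbow selects the mirrored elbow solution (hypothetical flag name).
    """
    r2 = x * x + y * y
    # Law of cosines on the triangle (l1, l2, distance to target).
    c2 = (r2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        return None  # target lies outside the reachable annulus
    s2 = math.sqrt(1.0 - c2 * c2)
    if flip_elbow:
        s2 = -s2
    theta2 = math.atan2(s2, c2)
    # Shoulder angle: direction to target minus the offset caused by link 2.
    theta1 = math.atan2(y, x) - math.atan2(l2 * s2, l1 + l2 * c2)
    return theta1, theta2
```

Both elbow branches reach the same target; feeding either result back through `two_link_fk` should reproduce the requested point up to floating-point error.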

3-DOF Arm & Orientation

Add a wrist joint for tool orientation. Redundancy, null-space motion, and joint limits with interactive Three.js.

Three.js · ~25 min · Interactive

Jacobian & Velocity IK

Numerical IK via the Jacobian matrix. Live matrix display, condition number, and damped least-squares toggle.

Three.js · ~25 min · Interactive
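
To make the damped least-squares idea concrete, the sketch below iterates dq = Jᵀ (J Jᵀ + λ² I)⁻¹ e on a planar two-link arm. The arm model, the unit link lengths, the damping value, and the `gain` step scale are all assumptions chosen for illustration; the lab's own implementation may differ.

```python
import math

def fk(q, l1=1.0, l2=1.0):
    """End-effector position of a planar 2R arm for joint angles q = (t1, t2)."""
    t1, t12 = q[0], q[0] + q[1]
    return (l1 * math.cos(t1) + l2 * math.cos(t12),
            l1 * math.sin(t1) + l2 * math.sin(t12))

def jacobian(q, l1=1.0, l2=1.0):
    """2x2 position Jacobian d(x, y) / d(t1, t2), returned as rows."""
    t1, t12 = q[0], q[0] + q[1]
    return ((-l1 * math.sin(t1) - l2 * math.sin(t12), -l2 * math.sin(t12)),
            ( l1 * math.cos(t1) + l2 * math.cos(t12),  l2 * math.cos(t12)))

def dls_step(q, target, damping=0.1, gain=0.5):
    """One damped least-squares update toward target (gain is an assumed step scale)."""
    (a, b), (c, d) = jacobian(q)
    px, py = fk(q)
    ex, ey = target[0] - px, target[1] - py
    lam2 = damping * damping
    # A = J J^T + lambda^2 I is symmetric positive definite when damping > 0.
    a11 = a * a + b * b + lam2
    a12 = a * c + b * d
    a22 = c * c + d * d + lam2
    det = a11 * a22 - a12 * a12
    # y = A^-1 e using the closed-form 2x2 inverse.
    y1 = (a22 * ex - a12 * ey) / det
    y2 = (-a12 * ex + a11 * ey) / det
    # dq = J^T y
    dq1 = a * y1 + c * y2
    dq2 = b * y1 + d * y2
    return (q[0] + gain * dq1, q[1] + gain * dq2)

q = (0.0, 1.0)          # initial joint angles in radians
target = (1.2, 0.5)     # reachable, since |target| = 1.3 < l1 + l2
for _ in range(100):
    q = dls_step(q, target)
```

The damping term keeps the 2×2 system invertible even when the arm is fully stretched or folded, at the cost of slightly slower convergence away from singularities.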
Tier 3 · Bridging Sim & Real

Sim-to-Real Transfer

Domain randomization, system identification, and progressive transfer. Bridging the reality gap from sim to hardware.

Step-through · ~22 min · SVG

Multi-Task Policy

One policy, many tasks: language conditioning, task embeddings, curriculum learning, and multi-task generalization.

Step-through · ~22 min · SVG

Grasp Planning & Contact

Parallel-jaw grasping with force closure, antipodal grasps, grasp quality metrics, and friction analysis.

Three.js · ~25 min · Interactive

Trajectory Planning

Configuration space, RRT path planning, spline smoothing, velocity profiles, and collision-free motion.

Three.js · ~25 min · Interactive
Tier 4 · Perception & 3D

Vision-Language Grounding

CLIP embeddings, open-vocabulary detection, visual grounding, spatial reasoning, and depth perception for robots.

Step-through · ~22 min · SVG

World Models & Prediction

Latent dynamics, reward prediction, MPC, Dreamer, TD-MPC, and planning in imagination instead of simulation.

Step-through · ~22 min · SVG

6-DOF Manipulation

Full 3D arm control: quaternions, 6×6 Jacobian, numerical IK, wrist singularities, and 7-DOF redundancy.

Three.js · ~25 min · Interactive

Force Control & Compliance

Impedance control, stiffness tuning, hybrid position-force, peg insertion, and compliance layers for VLA deployment.

Three.js · ~25 min · Interactive
Tier 5 · Full Integration

End-to-End Deployment

The complete pipeline: camera → VLA → action chunk → safety filter → IK → robot hardware. The capstone.

Step-through · ~22 min · SVG

Failure Recovery & Safety

OOD detection, workspace limits, force monitoring, E-stop, recovery strategies, and building trustworthy systems.

Step-through · ~22 min · SVG

16 labs — 5 tiers — ~6 hours total — cognitive (ML) + physical (kinematics) tracks.