Simulation Loop & Reward Shaping

How a robot policy trains in simulation: the loop that generates experience, and how the shapes what the agent learns. This is the loop that produces the data a or RL policy learns from.

Step 1 of 1 —