On-Policy Imitation Learning

Generated by AI

2026.05

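On-policy distillation is a form of imitation learning, closely related to reinforcement learning, in which a student policy is trained to imitate a teacher policy on data generated by the student’s own current behavior (on-policy data) rather than on a fixed offline dataset. Because the teacher’s feedback is given on the states the student actually visits, the student learns to correct its own mistakes instead of seeing only the situations the teacher would reach.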
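The idea can be illustrated with a minimal tabular sketch. Everything in it is an assumption made for illustration and not taken from any particular library or paper: the toy dynamics in `step`, the randomly generated teacher in `make_teacher`, the sizes `N_STATES` and `N_ACTIONS`, and the choice of a cross-entropy loss against the teacher's action distribution (other divergences, such as a reverse KL, are also used in practice). The key point it tries to show is that the training states come from rollouts of the student's current policy, while the targets come from the teacher.

```python
import math
import random

N_STATES, N_ACTIONS = 5, 3  # hypothetical toy problem sizes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_teacher(rng):
    """Hypothetical fixed teacher: one action distribution per state."""
    return [softmax([rng.uniform(-1, 1) for _ in range(N_ACTIONS)])
            for _ in range(N_STATES)]


def step(state, action):
    """Toy deterministic dynamics, assumed purely for illustration."""
    return (state + action + 1) % N_STATES


def rollout(student_logits, rng, horizon=20):
    """Collect states by sampling actions from the student's current policy."""
    state, visited = 0, []
    for _ in range(horizon):
        visited.append(state)
        probs = softmax(student_logits[state])
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        state = step(state, action)
    return visited


def distill(teacher, iterations=2000, lr=0.5, seed=0):
    """On-policy loop: roll out the student, then match the teacher there."""
    rng = random.Random(seed)
    student_logits = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(iterations):
        for s in rollout(student_logits, rng):
            p_student = softmax(student_logits[s])
            # Gradient of cross-entropy(teacher, student) with respect to
            # the student's logits is (p_student - p_teacher).
            for a in range(N_ACTIONS):
                student_logits[s][a] -= lr * (p_student[a] - teacher[s][a])
    return student_logits


def max_gap(teacher, student_logits):
    """Largest absolute difference between teacher and student probabilities."""
    return max(abs(t - p)
               for s in range(N_STATES)
               for t, p in zip(teacher[s], softmax(student_logits[s])))


if __name__ == "__main__":
    teacher = make_teacher(random.Random(1))
    student_logits = distill(teacher)
    print("max probability gap:", max_gap(teacher, student_logits))
```

An offline variant would replace `rollout` with a fixed set of states recorded from the teacher's own trajectories; the on-policy version instead re-samples states from the student at every iteration, so the data distribution shifts as the student changes.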
