Integrate easy-to-modify versions of common RL algorithms

Algorithms

  • PPO
  • SAC + HER
  • simple Q-Learning
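
As a rough idea of how small the simple Q-Learning variant could be, here is a minimal tabular sketch. `ChainEnv`, `q_learning`, `greedy_policy` and all hyperparameter defaults are hypothetical names and values chosen for illustration only; they are not taken from CleanRL or SB3, and the real implementation may look different.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor used only for illustration.

    The agent starts in state 0 and receives reward 1 on reaching the last state.
    """

    n_states = 5
    n_actions = 2  # 0 = move left, 1 = move right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        delta = 1 if action == 1 else -1
        # Clamp so the agent cannot leave the corridor.
        self.state = min(max(self.state + delta, 0), self.n_states - 1)
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1,
               max_steps=100, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)
            else:
                # Greedy action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a, v in enumerate(q[state]) if v == best]
                )
            next_state, reward, done = env.step(action)
            # No bootstrapping from terminal states.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = q_learning(ChainEnv())
    print(greedy_policy(q_table))
```

Keeping the update rule in one plain function like this is what would make the algorithm easy to modify, e.g. swapping the exploration strategy or the target computation.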

Functionality

  • saving and loading of models
  • logging of metrics, similar to the SB3 (Stable-Baselines3) algorithms
  • rendering
  • at least one performance test per algorithm
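
For the saving and loading item, a minimal sketch of a save/load round trip for a tabular agent could look like the following. The function names (`save_q_table`, `load_q_table`, `roundtrip_check`) and the JSON layout are assumptions made for this example; the neural-network algorithms (PPO, SAC) would more likely use their framework's own checkpoint format.

```python
import json
import os
import tempfile


def save_q_table(q_table, path, hyperparams=None):
    """Write the Q-table and optional hyperparameters to a JSON file."""
    payload = {"q_table": q_table, "hyperparams": hyperparams or {}}
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f)


def load_q_table(path):
    """Read back the Q-table and hyperparameters written by save_q_table."""
    with open(path, "r", encoding="utf-8") as f:
        payload = json.load(f)
    return payload["q_table"], payload["hyperparams"]


def roundtrip_check():
    """Save to a temporary file, load again, and compare with the original."""
    q_table = [[0.0, 0.5], [0.25, 1.0]]
    hyperparams = {"alpha": 0.1, "gamma": 0.9}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "q_learning.json")
        save_q_table(q_table, path, hyperparams)
        loaded_q, loaded_hp = load_q_table(path)
    return loaded_q == q_table and loaded_hp == hyperparams


if __name__ == "__main__":
    print(roundtrip_check())
```

A round-trip check of this kind could sit next to the per-algorithm performance tests, so that a saved model is known to reproduce the same values after loading.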

I'll probably rely mostly on CleanRL.
