Enable A2C and PPO
A2C and PPO differ from the other SB3 algorithms that run in our framework so far: they are on-policy and collect experience in a rollout buffer instead of a replay buffer. With some adjustments to the code, we should still be able to use them.
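One possible starting point, as a rough sketch only: code in the framework that currently assumes a replay buffer could first check which kind of buffer a model carries. The helper name `buffer_kind` is hypothetical and not part of SB3 or of our framework. The sketch relies on SB3 on-policy models exposing a `rollout_buffer` attribute and off-policy models exposing a `replay_buffer` attribute; stand-in objects are used here so the snippet runs without SB3 installed.

```python
from types import SimpleNamespace


def buffer_kind(model):
    """Return "rollout" for on-policy models and "replay" for off-policy ones.

    Hypothetical helper: it only inspects which buffer attribute the model has.
    """
    if hasattr(model, "rollout_buffer"):
        return "rollout"
    if hasattr(model, "replay_buffer"):
        return "replay"
    raise TypeError(f"{type(model).__name__} has neither a rollout nor a replay buffer")


# Stand-ins for SB3 models (assumption: real A2C/PPO models carry
# `rollout_buffer`, real DQN/SAC/TD3 models carry `replay_buffer`).
ppo_like = SimpleNamespace(rollout_buffer=object())
sac_like = SimpleNamespace(replay_buffer=object())

print(buffer_kind(ppo_like))
print(buffer_kind(sac_like))
```

Branching on the result would let the replay-buffer-specific parts of the framework be skipped or replaced for A2C and PPO, while the existing off-policy path stays unchanged.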