Consider generating the random seed differently
In main.py, if the provided seed is 0, we replace it with a time-based seed:
cfg['seed'] = int(time.time())
However, int(time.time()) only has one-second resolution, so when the launcher starts several trials of a performance test within the same second (which happens), all of those runs get exactly the same seed.
I had 3 runs of cleanppo with identical learning curves during a performance test (which at least shows that the algorithm is highly reproducible when the random seed is known).
If we want to avoid this, we'll have to generate the random seed differently.
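
One possible approach, as a rough sketch rather than a finished patch: draw the seed from the operating system's entropy source instead of the clock, so that trials launched in the same second still get different seeds. The helper name `resolve_seed` is hypothetical and not part of the existing code. The sketch assumes a 32-bit seed is wanted and that 0 should stay reserved as the "pick one for me" value.

```python
import secrets


def resolve_seed(seed: int) -> int:
    """Return `seed` unchanged, or a fresh OS-random seed if `seed` is 0.

    Hypothetical helper, not part of the existing main.py.
    """
    if seed != 0:
        # An explicit seed was provided; keep it so the run stays reproducible.
        return seed
    # Draw from the OS entropy source rather than the clock.
    # randbelow(2**32 - 1) yields 0 .. 2**32 - 2; adding 1 shifts this to
    # 1 .. 2**32 - 1, so the result is never the sentinel 0 and still fits
    # in an unsigned 32-bit integer.
    return secrets.randbelow(2**32 - 1) + 1


# Usage, assuming cfg is the config dict used in main.py:
cfg = {'seed': 0}
cfg['seed'] = resolve_seed(cfg['seed'])
print(f"Using seed {cfg['seed']}")
```

Logging the chosen seed, as in the last line, keeps a run reproducible after the fact even though the seed no longer depends on the launch time.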