Consider generating the random seed differently

In main.py, if the provided seed is 0, we replace it with a time-based seed:

cfg['seed'] = int(time.time())

However, int(time.time()) only has one-second resolution, so when the launcher starts several trials within the same second during a performance test (which happens), all of those runs get exactly the same random seed. I had 3 runs of cleanppo with identical learning curves during a performance test (which at least means that the algorithm is highly reproducible for a given random seed).

If we want to avoid this, we'll have to generate the random seed differently.
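
One possible direction, as a minimal sketch rather than a finished patch: draw the seed from the operating system's entropy source instead of the clock, so that trials started in the same second still get different seeds. The helper name `resolve_seed` is hypothetical and not something that exists in main.py; the 32-bit upper bound is an assumption, chosen so the value would also be accepted by `numpy.random.seed` if we use that anywhere.

```python
import secrets


def resolve_seed(seed: int) -> int:
    """Return `seed` unchanged unless it is 0, in which case draw a fresh one.

    Hypothetical helper; the name and the 32-bit range are assumptions.
    """
    if seed != 0:
        # An explicit seed was provided, so keep it for reproducibility.
        return seed
    # secrets.randbelow(n) returns an int in [0, n) from the OS entropy
    # source, so this yields a value in [1, 2**32 - 1]. It is never 0,
    # which keeps 0 free to mean "pick a seed for me".
    return secrets.randbelow(2**32 - 1) + 1
```

In main.py this could replace the time-based line with something like `cfg['seed'] = resolve_seed(cfg['seed'])`. We would still want to log the seed that was chosen, otherwise a run started with seed 0 can no longer be reproduced afterwards.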