Hyperopt result does not match actual best value
When optimizing the learning rate for SAC+HER, I noticed that the "best value" reported in the optimization results does not match the actual best value, which I know from looking at the logs. The reported "best value" was not even used in any run. Why is that the case?
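
To make the mismatch concrete, here is a minimal sketch of the kind of cross-check described above. It assumes the per-run results can be pulled out of the logs as (learning rate, score) pairs. The function name `check_reported_best`, the `maximize` flag, and all numbers are hypothetical and for illustration only; they are not taken from the actual logs or from any particular optimization library.

```python
import math


def check_reported_best(trials, reported_lr, maximize=True, rel_tol=1e-6):
    """Compare a reported best learning rate against logged runs.

    trials: list of (learning_rate, score) pairs, e.g. parsed from logs.
    maximize: True if a higher score is better (assumption; some
              optimizers minimize a negated reward instead).
    """
    if not trials:
        raise ValueError("no trials to compare against")
    pick = max if maximize else min
    best_lr, best_score = pick(trials, key=lambda t: t[1])
    # Use a tolerance: logged values may be rounded when printed.
    was_tried = any(
        math.isclose(lr, reported_lr, rel_tol=rel_tol) for lr, _ in trials
    )
    return {
        "best_lr_in_logs": best_lr,
        "best_score_in_logs": best_score,
        "reported_lr_was_tried": was_tried,
        "matches_logs": math.isclose(best_lr, reported_lr, rel_tol=rel_tol),
    }


# Hypothetical numbers, not real results.
trials = [(1e-4, -35.2), (3e-4, -12.7), (1e-3, -20.1)]
result = check_reported_best(trials, reported_lr=5e-4)
print(result)
```

If `reported_lr_was_tried` comes back false, the reported value really does not correspond to any logged run, which is the situation I am seeing.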