Unstable Hyperparams handling

Ghost User requested to merge algo-crash-handling into devel

SB3 algorithms sometimes become unstable when poorly chosen hyperparameters are used. In this case, we now catch the resulting ValueError and return a hyperopt score of 0. We also return n_epochs, the configured number of epochs, as the number of epochs run instead of the number actually run, because of the following case:

The hyperopt starts, the first hyperparameter configuration is unstable, and the algorithm fails in the first epoch, returning hyperopt_score = 0 and epochs = 0. Because it is the first result, it is still the best score so far, so the maximum number of epochs becomes 0 * 1.5 = 0, and all following configurations stop immediately.
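A minimal sketch of the behaviour described above, not the code in this merge request. The names `run_trial`, `next_epoch_budget`, and `unstable` are hypothetical, and the way the epoch budget is derived from the best run is an assumption based on the 1.5 factor mentioned above.

```python
from typing import Callable, Tuple


def run_trial(train_fn: Callable[[int], float], n_epochs: int) -> Tuple[float, int]:
    """Run one hyperparameter configuration (hypothetical helper).

    Returns (hyperopt_score, epochs). If training raises ValueError,
    the score is 0 and the configured n_epochs is reported instead of
    the number of epochs actually completed.
    """
    try:
        score = train_fn(n_epochs)
    except ValueError:
        # Unstable configuration: worst score, but keep the configured
        # epoch count so later budgets do not collapse to zero.
        return 0.0, n_epochs
    return score, n_epochs


def next_epoch_budget(best_epochs: int, factor: float = 1.5) -> int:
    """Assumed budget rule: epochs of the best run so far times 1.5."""
    return int(best_epochs * factor)


def unstable(n_epochs: int) -> float:
    """Stand-in for a training run that diverges immediately."""
    raise ValueError("training became unstable")


score, epochs = run_trial(unstable, n_epochs=10)
budget = next_epoch_budget(epochs)
```

With the old behaviour the failed run would report 0 epochs, and `next_epoch_budget(0)` would give every later configuration a budget of 0. Reporting the configured `n_epochs` keeps the budget positive.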

Other changes:

  • made SAC the default algorithm
  • added another MuJoCo installation error fix to the README
