Verify and fix hyperopt mechanism
The hyperparameter optimization does not seem to work properly. I have already fixed the instructions in the README.md on devel, and the easiest way to debug the hyperopt is to use the command provided there, i.e., python experiment/train.py +performance=FetchReach/sac_her-opti.yaml --multirun.
For me, this started well but after a while it threw the following error:
| learning_rate | 0.0217 |
| n_updates | 11949 |
---------------------------------
Training finished!
Finishing main training function.
MLflow run: <ActiveRun: >.
Hyperopt score: 0.03333333333333334, epochs: 6.
/home/eppe/Scilab-RL/hydra_plugins/hydra_custom_optuna_sweeper/_impl.py:285: FutureWarning: _tell has been deprecated in v2.5.0. This feature will be removed in v4.0.0. See https://github.com/optuna/optuna/releases/tag/v2.5.0.
study._tell(trial, state, values)
Error executing job with overrides: ['+algorithm.learning_rate=0.02169932803339823', '++algorithm.replay_buffer_kwargs.n_sampled_goal=8', 'algorithm=sac', '+performance=FetchReach/sac_her-opti.yaml', 'n_epochs=6']
Traceback (most recent call last):
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/hydra/_internal/utils.py", line 386, in <lambda>
lambda: hydra.multirun(
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 140, in multirun
ret = sweeper.sweep(arguments=task_overrides)
File "/home/eppe/Scilab-RL/hydra_plugins/hydra_custom_optuna_sweeper/custom_optuna_sweeper.py", line 45, in sweep
return self.sweeper.sweep(arguments)
File "/home/eppe/Scilab-RL/hydra_plugins/hydra_custom_optuna_sweeper/_impl.py", line 271, in sweep
ret.return_value) == 3, "The return value of main() should be a triple where the first element " \
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
Exception: Traceback (most recent call last):
File "/home/eppe/Scilab-RL/hydra_plugins/hydra_custom_joblib_launcher/_core.py", line 84, in run_job
ret.return_value = task_function(task_cfg)
File "/home/eppe/Scilab-RL/experiment/train.py", line 164, in main
launch(cfg, logger, kwargs)
File "/home/eppe/Scilab-RL/experiment/train.py", line 121, in launch
train(baseline, train_env, eval_env, cfg, logger)
File "/home/eppe/Scilab-RL/experiment/train.py", line 59, in train
baseline.learn(total_timesteps=total_steps, callback=callback, log_interval=None)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/sac/sac.py", line 289, in learn
return super(SAC, self).learn(
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 352, in learn
rollout = self.collect_rollouts(
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 563, in collect_rollouts
action, buffer_action = self._sample_action(learning_starts, action_noise)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/off_policy_algorithm.py", line 407, in _sample_action
unscaled_action, _ = self.predict(self._last_obs, deterministic=False)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/base_class.py", line 539, in predict
return self.policy.predict(observation, state, mask, deterministic)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/policies.py", line 302, in predict
actions = self._predict(observation, deterministic=deterministic)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/sac/policies.py", line 362, in _predict
return self.actor(observation, deterministic)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/sac/policies.py", line 185, in forward
return self.action_dist.actions_from_params(mean_actions, log_std, deterministic=deterministic, **kwargs)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/distributions.py", line 178, in actions_from_params
self.proba_distribution(mean_actions, log_std)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/distributions.py", line 210, in proba_distribution
super(SquashedDiagGaussianDistribution, self).proba_distribution(mean_actions, log_std)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/stable_baselines3/common/distributions.py", line 152, in proba_distribution
self.distribution = Normal(mean_actions, action_std)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/torch/distributions/normal.py", line 50, in __init__
super(Normal, self).__init__(batch_shape, validate_args=validate_args)
File "/home/eppe/Scilab-RL/venv/lib/python3.9/site-packages/torch/distributions/distribution.py", line 55, in __init__
raise ValueError(
ValueError: Expected parameter loc (Tensor of shape (1, 4)) of distribution Normal(loc: torch.Size([1, 4]), scale: torch.Size([1, 4])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan, nan]], device='cuda:0')
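A likely culprit is the sampled learning rate (~0.0217), which is quite high for SAC and can drive the actor network's weights to NaN — this is an assumption, not a confirmed diagnosis. As a minimal, library-free sketch of the failure mode, plain gradient descent already produces inf/NaN once the step size is too large:

```python
# Minimal sketch (plain Python, no RL libraries): gradient descent on f(w) = w^4.
# With a small learning rate the iterate shrinks toward the minimum at w = 0;
# with a large one it overshoots, the magnitude explodes, and the arithmetic
# eventually yields inf and then nan, the same way a too-aggressive optimizer
# step can poison a policy network's parameters.
def gradient_descent(lr, steps=50, w=2.0):
    for _ in range(steps):
        grad = 4 * w * w * w  # derivative of w**4 (w*w*w avoids OverflowError for huge w)
        w = w - lr * grad
    return w

print(gradient_descent(0.001))  # small step size: converges toward 0
print(gradient_descent(0.5))    # large step size: diverges to nan
```

If this is indeed the cause, narrowing the learning-rate search range in the sweeper config (or adding gradient clipping) should avoid the crash rather than just masking it.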
For debugging, it is best to disable multiprocessing by commenting out the override hydra/launcher: custom_joblib in main.yaml.
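Concretely, that means commenting out the launcher override in the defaults list of main.yaml (a sketch; the surrounding entries of the actual defaults list will differ):

```yaml
defaults:
  # Commented out to run trials in a single process for easier debugging:
  # - override hydra/launcher: custom_joblib
```

With the joblib launcher disabled, Hydra falls back to its basic sequential launcher, so tracebacks appear directly in the main process.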
Edited by Manfred Eppe