Find hyperparameters for Blocks-o1 or fix learning process
I checked whether the hyperparameter optimization works for Blocks-o1 (with gripper_random, gripper_above, gripper_none). I also checked the AntReacher env. However, in all cases there is no success. It does work for Blocks-o1_gripper_random, though.
I even tried increasing the distance_threshold, which determines whether a goal has been achieved, from 0.05 to 0.1. In that case the testing success rate is higher, but only because the block more often happens to start at the right position by chance. There is no learning progress.
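To illustrate the chance effect, here is a minimal sketch. It assumes the usual sparse goal check of Gym-style goal-conditioned environments (success if the achieved goal is within distance_threshold of the desired goal); I have not verified that our envs compute it exactly this way. The function names, the uniform sampling, and the `spread` value are hypothetical and only serve the illustration.

```python
import math
import random


def is_success(achieved_goal, desired_goal, distance_threshold=0.05):
    """Assumed success criterion: Euclidean distance below the threshold."""
    return math.dist(achieved_goal, desired_goal) < distance_threshold


def chance_success_rate(distance_threshold, n_episodes=10000, spread=0.15, seed=0):
    """Fraction of episodes that count as success without any action.

    Block and goal positions are drawn uniformly from a square with
    half-width `spread` (a made-up value, not taken from the env config).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_episodes):
        block = (rng.uniform(-spread, spread), rng.uniform(-spread, spread))
        goal = (rng.uniform(-spread, spread), rng.uniform(-spread, spread))
        if is_success(block, goal, distance_threshold):
            hits += 1
    return hits / n_episodes


if __name__ == "__main__":
    for threshold in (0.05, 0.1):
        print(threshold, chance_success_rate(threshold))
```

Under these assumptions, an agent that never moves the block already gets a noticeably higher success rate at 0.1 than at 0.05, so a higher test success rate alone does not indicate learning.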
Either I have not yet found good hyperparameters or something is broken. To reproduce, use the new optimization file conf/performance/Blocks/o1-above-sac_her-opti.yaml in the branch test_hyperparam_opt_wandb.