Check online resampling in HAC FUTURE / FUTURE2 strategy
Online resampling supports multiple goal selection strategies. FUTURE and FUTURE2 should behave identically, but their performance appears to differ. This needs to be checked. One possible reason for the difference: in FUTURE, we use the achieved goal at t+1 to determine the hindsight goal, which is the intended behaviour. In FUTURE2, we instead use the next achieved goal at t. The two values should be the same, yet FUTURE2 appears to perform better.
We should check whether the next achieved goal at t is always equal to the achieved goal at t+1. A check for this already exists in hher_replay_buffer.py, below the condition elif self.goal_selection_strategy == GoalSelectionStrategy.FUTURE:. Its output needs to be evaluated to better understand the differences between the two strategies.
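
A standalone version of such a consistency check could look roughly like the sketch below. It is not the code in hher_replay_buffer.py: the function name find_goal_mismatches and the array layout (one episode, shape (T, goal_dim)) are assumptions made for illustration only.

```python
import numpy as np


def find_goal_mismatches(achieved_goal, next_achieved_goal, atol=1e-8):
    """Return the time steps t where next_achieved_goal[t] != achieved_goal[t + 1].

    Hypothetical helper: both inputs are assumed to hold a single episode
    with shape (T, goal_dim), stored in time order.
    """
    achieved_goal = np.asarray(achieved_goal, dtype=float)
    next_achieved_goal = np.asarray(next_achieved_goal, dtype=float)
    if achieved_goal.ndim != 2 or achieved_goal.shape != next_achieved_goal.shape:
        raise ValueError("expected two arrays of identical shape (T, goal_dim)")

    # Compare the "next" goal stored at t with the goal stored at t + 1.
    # The last step has no successor inside the episode, so it is skipped.
    close = np.isclose(next_achieved_goal[:-1], achieved_goal[1:], atol=atol)
    step_matches = close.all(axis=1)
    return np.flatnonzero(~step_matches)


if __name__ == "__main__":
    achieved = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
    consistent = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
    inconsistent = consistent.copy()
    inconsistent[2] = [9.0, 0.0]  # deliberately break step t = 2

    print(find_goal_mismatches(achieved, consistent))
    print(find_goal_mismatches(achieved, inconsistent))
```

If the returned index array is non-empty for real buffer contents, the two strategies are sampling different hindsight goals at those steps, which would be one candidate explanation for the performance gap.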