Refactor CustomEvalCallback
Currently, many of the framework's features are provided by the `CustomEvalCallback`. It handles:
- the train / test loop
- the evaluation
- early stopping
- rendering & recording
Should new algorithms use the callback?
How do new algorithms get information like `n_eval_episodes`, which is currently only provided to the callback?
We need to discuss these questions and refactor the callback to ease the integration of new algorithms and provide a clear structure.
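
As a starting point for the discussion, here is a minimal sketch of one possible direction: move the evaluation settings into a small shared object that both the callback and new algorithms can read, and split responsibilities such as early stopping and evaluation into separate helpers. Everything here is hypothetical and not existing framework code: `EvalSettings`, `EarlyStopping`, `evaluate`, and every field except `n_eval_episodes` are names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalSettings:
    """Hypothetical shared settings object (not part of the current code)."""

    n_eval_episodes: int = 10
    eval_freq: int = 1000   # assumed: evaluate every `eval_freq` training steps
    patience: int = 5       # assumed: evaluations without improvement before stopping
    render: bool = False    # assumed: whether to render / record during evaluation


class EarlyStopping:
    """Tracks the best mean reward and signals when training should stop."""

    def __init__(self, patience: int) -> None:
        self.patience = patience
        self.best = float("-inf")
        self.stale = 0  # consecutive evaluations without improvement

    def update(self, mean_reward: float) -> bool:
        """Record one evaluation result; return True if training should stop."""
        if mean_reward > self.best:
            self.best = mean_reward
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience


def evaluate(run_episode: Callable[[], float], settings: EvalSettings) -> float:
    """Average the return of `run_episode()` over `settings.n_eval_episodes` episodes."""
    returns = [run_episode() for _ in range(settings.n_eval_episodes)]
    return sum(returns) / len(returns)
```

With something like this, a new algorithm could take an `EvalSettings` instance directly and call `evaluate` from its own loop, while the callback would become a thin wrapper around the same helpers. Whether that split is the right one is part of what we need to decide.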