Currently, many of the framework's features are provided by the
CustomEvalCallback. It handles:
- the train / test loop
- the evaluation
- early stopping
- rendering & recording

Should new algorithms use the callback?
How do they obtain settings such as n_eval_episodes, which are currently passed only to the callback?
We need to discuss these questions and refactor the callback so that new algorithms are easier to integrate and the structure is clear.
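
As a starting point for the discussion, here is one possible direction, sketched under assumptions: move settings such as n_eval_episodes into a small shared config object that both the callback and any new algorithm receive. Apart from n_eval_episodes, every name here is hypothetical (EvalConfig, eval_freq, SketchEvalCallback, SketchAlgorithm, evaluate) and none of it reflects the framework's current code.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class EvalConfig:
    """Hypothetical shared settings; only n_eval_episodes is taken from the issue."""
    n_eval_episodes: int = 5
    eval_freq: int = 1000  # assumed name: evaluate every `eval_freq` training steps


def evaluate(run_episode: Callable[[], float], config: EvalConfig) -> float:
    """Run config.n_eval_episodes episodes and return the mean episode return."""
    returns: List[float] = [run_episode() for _ in range(config.n_eval_episodes)]
    return sum(returns) / len(returns)


class SketchEvalCallback:
    """Stand-in for CustomEvalCallback: it reads its settings from the shared config."""

    def __init__(self, config: EvalConfig) -> None:
        self.config = config

    def should_evaluate(self, step: int) -> bool:
        # Trigger an evaluation every eval_freq steps, but not at step 0.
        return step > 0 and step % self.config.eval_freq == 0


class SketchAlgorithm:
    """Stand-in for a new algorithm that does not go through the callback."""

    def __init__(self, config: EvalConfig) -> None:
        self.config = config

    def evaluate(self, run_episode: Callable[[], float]) -> float:
        # Same evaluation helper and same settings as the callback would use.
        return evaluate(run_episode, self.config)


# Both consumers are built from one config, so neither owns the settings.
config = EvalConfig(n_eval_episodes=3, eval_freq=100)
callback = SketchEvalCallback(config)
algorithm = SketchAlgorithm(config)
mean_return = algorithm.evaluate(lambda: 1.0)  # dummy episode with a fixed return
```

With this split, the callback would no longer be the only place that knows the evaluation settings. Whether early stopping and rendering should move out of the callback as well is a separate question.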