`learning_starts` limited to only one episode

We currently couple the SB3 (Stable-Baselines3) option `learning_starts` to the length of a single episode (`max_ep_steps`). However, it is common to allow significantly more environment steps before training starts.

I think we should get rid of the following lines:

    if 'learning_starts' in alg_kwargs:
        alg_kwargs['learning_starts'] = max(alg_kwargs['learning_starts'], max_ep_steps)
    else:
        alg_kwargs['learning_starts'] = max_ep_steps

These lines are in `util/util.py`: https://collaborating.tuhh.de/ckv0173/Scilab-RL/-/blob/devel/util/util.py#L74
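
To make the effect of the quoted lines easier to check, here is a minimal sketch that wraps them in a function and prints the resulting `learning_starts` for a few configurations. The function name `apply_learning_starts` and the value `max_ep_steps = 50` are assumptions made for illustration only; they are not taken from `util.py`.

```python
def apply_learning_starts(alg_kwargs, max_ep_steps):
    """Hypothetical wrapper around the quoted lines; returns a modified copy."""
    alg_kwargs = dict(alg_kwargs)  # copy so the caller's dict is not mutated
    if 'learning_starts' in alg_kwargs:
        alg_kwargs['learning_starts'] = max(alg_kwargs['learning_starts'], max_ep_steps)
    else:
        alg_kwargs['learning_starts'] = max_ep_steps
    return alg_kwargs


max_ep_steps = 50  # assumed episode length, for illustration only

# No value given, a value below one episode, and a value far above one episode.
for user_kwargs in ({}, {'learning_starts': 10}, {'learning_starts': 10000}):
    result = apply_learning_starts(user_kwargs, max_ep_steps)
    print(user_kwargs, '->', result['learning_starts'])
# {} -> 50
# {'learning_starts': 10} -> 50
# {'learning_starts': 10000} -> 10000
```

If the lines were removed, `alg_kwargs` would presumably be passed on unchanged, so a missing `learning_starts` would fall back to the default of the chosen SB3 algorithm.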