rlberry-scool API

Manager

Main class

rlberry.manager.ExperimentManager(agent_class)

Class to train, optimize hyperparameters, evaluate and gather statistics about an agent.

Evaluation and plotting

rlberry.manager.evaluate_agents(...[, ...])

Evaluate and compare each of the agents in experiment_manager_list.

rlberry.manager.plot_writer_data(...[, ...])

Given a list of ExperimentManager or a folder, plot data (corresponding to info) obtained in each episode.

rlberry.manager.read_writer_data(data_source)

Given a list of ExperimentManager or a folder, read data (corresponding to info) obtained in each episode.

rlberry.manager.compare_agents(agent_source)

Compare several trained agents using the mean over n_simulations evaluations for each agent.

rlberry.manager.AdastopComparator([n, K, B, ...])

Compare agents sequentially, with possible early stopping.

Agents & Environments

Basic agents

rlberry_scool.agents.linear.LSVIUCBAgent(...)

A version of Least-Squares Value Iteration with UCB (LSVI-UCB), proposed by Jin et al. (2020).

rlberry_scool.agents.tabular_rl.QLAgent(env)

Q-Learning Agent.
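
For orientation, the tabular Q-learning update can be sketched in plain Python. This is a minimal illustration of the textbook rule, not QLAgent's actual implementation; the function name q_learning_update and the parameters alpha (learning rate) and gamma (discount) are assumptions made for this example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, n_actions,
                      alpha=0.5, gamma=0.9, done=False):
    """Apply one tabular Q-learning step and return the updated value.

    Hypothetical helper for illustration only; not part of rlberry_scool.
    """
    # Off-policy target: bootstrap from the best action in the next state.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Q-table stored as a dict keyed by (state, action), defaulting to 0.
q = defaultdict(float)
q[(1, 0)] = 2.0
q[(1, 1)] = 4.0
new_value = q_learning_update(q, state=0, action=1, reward=1.0,
                              next_state=1, n_actions=2)
```

The target uses the maximum over next-state actions, so the value learned does not depend on which action the agent actually takes next.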

rlberry_scool.agents.tabular_rl.SARSAAgent(env)

SARSA Agent.
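
As a point of reference, the tabular SARSA update can be sketched in plain Python. This is a minimal illustration of the textbook rule, not SARSAAgent's actual implementation; the function name sarsa_update and the parameters alpha and gamma are assumptions made for this example.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.5, gamma=0.9, done=False):
    """Apply one tabular SARSA step and return the updated value.

    Hypothetical helper for illustration only; not part of rlberry_scool.
    """
    # On-policy target: bootstrap from the action actually chosen next.
    next_value = 0.0 if done else q[(next_state, next_action)]
    td_target = reward + gamma * next_value
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Q-table stored as a dict keyed by (state, action), defaulting to 0.
q = defaultdict(float)
q[(1, 0)] = 2.0
q[(1, 1)] = 4.0
new_value = sarsa_update(q, state=0, action=1, reward=1.0,
                         next_state=1, next_action=0)
```

The target depends on next_action, the action the behaviour policy selected, rather than on the greedy action used by Q-learning.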

Basic environments

rlberry_scool.envs.GridWorld([nrows, ncols, ...])

Simple GridWorld environment.

rlberry_scool.envs.Chain([L, fail_prob])

Simple chain environment.

Agent importation tools

rlberry.agents.stable_baselines.StableBaselinesAgent(env)

Wraps a StableBaselines3 algorithm in an rlberry Agent.

Environment tools

rlberry.envs.gym_make(id[, wrap_spaces])

Same as gym.make, but wraps the environment to ensure unified seeding with rlberry.

Seeding

rlberry.seeding.safe_reseed(obj, seeder[, ...])

Calls obj.reseed(seed_seq) if that method is available; otherwise, if an obj.seed() method is available, calls obj.seed(seed_val), where seed_val is generated by the seeder.
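
The dispatch described above can be sketched with duck typing. This is a rough illustration under stated assumptions, not rlberry's code: reseed_sketch is a hypothetical name, a random.Random instance stands in for the rlberry seeder, and the boolean return value is a convention chosen for this example.

```python
import random


def reseed_sketch(obj, rng):
    """Reseed obj via reseed() if it exists, else via seed(); return True if either ran.

    Hypothetical stand-in for illustration; rng plays the role of the seeder.
    """
    if callable(getattr(obj, "reseed", None)):
        obj.reseed(rng)
        return True
    if callable(getattr(obj, "seed", None)):
        # Fall back to an integer seed drawn from the seed source.
        obj.seed(rng.randrange(2**32))
        return True
    return False


class WithReseed:
    def reseed(self, seed_source):
        self.called = "reseed"


class WithSeed:
    def seed(self, seed_val):
        self.called = "seed"
        self.seed_val = seed_val


rng = random.Random(0)
a, b = WithReseed(), WithSeed()
results = [reseed_sketch(a, rng), reseed_sketch(b, rng), reseed_sketch(object(), rng)]
```

Objects exposing neither method are left untouched, which is what makes this kind of helper safe to call on arbitrary third-party objects.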

rlberry.seeding.set_external_seed(seeder)

Set seeds of external libraries.

Environment Wrappers

rlberry.wrappers.discretize_state.DiscretizeStateWrapper(...)

Discretize an environment with continuous states and discrete actions.
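
One common way to discretize a bounded continuous state is uniform binning per dimension. The sketch below illustrates that idea only; it is not DiscretizeStateWrapper's implementation, and the function name discretize and its parameters low, high and n_bins are assumptions made for this example.

```python
def discretize(observation, low, high, n_bins):
    """Map a bounded continuous observation to a single integer state index.

    Hypothetical helper for illustration; each dimension is split into
    n_bins equal-width bins and the bin ids are flattened row-major.
    """
    index = 0
    for x, lo, hi in zip(observation, low, high):
        ratio = (x - lo) / (hi - lo)
        # Clamp so that x == hi (or out-of-range values) lands in a valid bin.
        bin_id = min(n_bins - 1, max(0, int(ratio * n_bins)))
        index = index * n_bins + bin_id
    return index


low, high = [0.0, 0.0], [1.0, 1.0]
state_a = discretize([0.25, 0.9], low, high, n_bins=4)
state_b = discretize([1.0, 1.0], low, high, n_bins=4)
```

With 4 bins per dimension in 2 dimensions there are 16 discrete states, which is what lets tabular agents run on the discretized environment.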

rlberry.wrappers.RescaleRewardWrapper(env, ...)

Rescale the reward function to a bounded range.
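
When the original reward range is known and bounded, rescaling can be done with a linear map. The sketch below shows that case only; it is not RescaleRewardWrapper's implementation, the function name rescale_reward and its parameters are assumptions made for this example, and the library may treat unbounded rewards differently.

```python
def rescale_reward(reward, source_range, target_range=(0.0, 1.0)):
    """Linearly map reward from source_range onto target_range.

    Hypothetical helper for illustration; assumes a finite, non-empty
    source_range.
    """
    src_lo, src_hi = source_range
    tgt_lo, tgt_hi = target_range
    fraction = (reward - src_lo) / (src_hi - src_lo)
    return tgt_lo + fraction * (tgt_hi - tgt_lo)


unit_reward = rescale_reward(-5.0, source_range=(-10.0, 10.0))
signed_reward = rescale_reward(10.0, source_range=(-10.0, 10.0),
                               target_range=(-1.0, 1.0))
```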

rlberry.wrappers.WriterWrapper(env, writer)

Environment wrapper that automatically records the reward or action in a writer.