rlberry.manager.AdastopComparator
- class rlberry.manager.AdastopComparator(n=5, K=5, B=10000, comparisons=None, alpha=0.01, beta=0, seed=None)[source]
 Bases: MultipleAgentsComparator

 Compare agents sequentially, with possible early stopping. At most n * K fits are performed for each agent.
 See the adastop library for more details: https://github.com/TimotheeMathieu/adastop
- Parameters:
- n: int, or array of ints of size self.n_agents, default=5
 If int, number of fits before each early-stopping check. If array of ints, a different number of fits is used for each agent.
- K: int, default=5
 Number of early-stopping checks (interims).
- B: int, default=10000
 Number of random permutations used to approximate the permutation distribution.
- comparisons: list of tuples of indices, or None
 If None, all pairwise comparisons are done. If, for instance, comparisons = [(0, 1), (0, 2)], then only agent 0 vs agent 1 and agent 0 vs agent 2 are compared.
- alpha: float, default=0.01
 Significance level of the test.
- beta: float, default=0
 Power spent in early acceptance.
- seed: int or None, default=None
 Seed for the random number generator.
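The decision at each check is based on a permutation test over the agents' evaluation scores. The following is a minimal, self-contained sketch of that idea for a single two-sample test with B sampled permutations at level alpha; it is an illustration only, not the library's implementation, which calibrates the K sequential interim tests jointly.

```python
import numpy as np


def permutation_test(scores_a, scores_b, B=10000, alpha=0.01, seed=0):
    """Two-sample permutation test on the absolute difference of means.

    Simplified illustration of the statistic behind one AdaStop
    comparison; AdaStop itself runs K such tests sequentially and
    controls their joint error level.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([scores_a, scores_b])
    n = len(scores_a)
    observed = abs(np.mean(scores_a) - np.mean(scores_b))
    # Approximate the permutation distribution with B random permutations
    count = 0
    for _ in range(B):
        perm = rng.permutation(pooled)
        if abs(perm[:n].mean() - perm[n:].mean()) >= observed:
            count += 1
    p_value = count / B
    return "equal" if p_value >= alpha else "different"
```

Using B random permutations instead of enumerating all possible splits keeps the cost bounded while still approximating the permutation distribution well for large B.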
- Attributes:
 - agent_names: list of str
 list of the agents’ names.
- managers_paths: dictionary
 managers_paths[agent_name] is a list of the paths to the trained experiment managers. Can be loaded with ExperimentManager.load.
- decision: dict
 decision of the tests for each comparison, keys are the comparisons and values are in {“equal”, “larger”, “smaller”}.
- n_iters: dict
 number of iterations (i.e. number of fits) used for each agent. Keys are the agents’ names and values are ints.
Methods

- compare(manager_list[, n_evaluations, verbose]): Run AdaStop on the managers from manager_list.
- compute_mean_diffs(k, Z): Compute the absolute value of the sum differences.
- get_results(): Return a dataframe with the results of the tests.
- partial_compare(eval_values[, verbose]): Do the test of the k-th interim.
- plot_results([agent_names, axes]): Visual representation of results.
- plot_results_sota([agent_names, axes]): Visual representation of results when the first agent is compared to all the others.
- print_results(): Print the results of the tests.
- compare(manager_list, n_evaluations=50, verbose=True)[source]
 Run AdaStop on the managers from manager_list.
- Parameters:
 - manager_list: list of ExperimentManager kwargs
List of managers containing the agents we want to compare.
- n_evaluations: int, default = 50
Number of evaluations used to estimate the score of each agent for AdaStop.
- verbose: bool
 If True, print intermediate steps.
- Returns:
 - decisions: dictionary with comparisons as keys and values in {“equal”, “larger”, “smaller”, “continue”}
  Decision of the test at this step.
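The overall flow of compare can be pictured with the following sketch. The helpers train_and_eval and decide are hypothetical stand-ins (not the rlberry API): at each of the K interims, n new instances of every agent are fitted and evaluated, then an interim test either stops early or asks for more fits, so at most n * K fits are done per agent.

```python
def compare_sketch(train_and_eval, decide, agents, n=5, K=5):
    """Sketch of the compare() loop with hypothetical stand-ins.

    train_and_eval(name) -> scalar score of one freshly trained agent.
    decide(evals) -> "continue" or a final decision string.
    """
    evals = {name: [] for name in agents}
    decision = "continue"
    for k in range(K):
        for name in agents:
            # n fresh fits per agent at every interim: at most n * K in total
            evals[name].extend(train_and_eval(name) for _ in range(n))
        decision = decide(evals)  # interim test on all evaluations so far
        if decision != "continue":
            break  # early stopping: no need to spend the full budget
    return decision, len(evals[agents[0]])  # decision and fits used per agent
```

Early stopping is the point of the procedure: clearly separated agents are decided after the first few interims, and the full n * K budget is spent only on genuinely close comparisons.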
- compute_mean_diffs(k, Z)
 Compute the absolute value of the sum differences.
- get_results()
 Return a dataframe with the results of the tests.
- partial_compare(eval_values, verbose=True)
 Do the test of the k-th interim.
- Parameters:
- eval_values: dict of agents and evaluations
 Keys are agent names and values are the concatenation of evaluations up to interim k, e.g. {“PPO”: [1, 1, 1, 1, 1], “SAC”: [42, 42, 42, 42, 42]}.
- verbose: bool
 If True, print intermediate steps.
- Returns:
 - decisions: dictionary with comparisons as keys and values in {“equal”, “larger”, “smaller”, “continue”}
  Decision of the test at this step.
- id_finished: bool
 Whether the test is finished or not.
- T: float
 Test statistic.
- bk: float
 Threshold of the test at the current interim.
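The interim logic for a single comparison can be illustrated as follows. Here thresholds is assumed given, whereas AdaStop actually calibrates each threshold b_k from the permutation distribution; this is a toy sketch, not the library's implementation.

```python
def interim_decisions(batches_a, batches_b, thresholds):
    """Toy interim logic for one comparison between two agents.

    After each new batch of evaluations, compare the statistic T
    (absolute difference of the sums) with the threshold b_k for
    interim k; thresholds are assumed precomputed here.
    """
    evals_a, evals_b = [], []
    for k, (batch_a, batch_b) in enumerate(zip(batches_a, batches_b)):
        evals_a += batch_a
        evals_b += batch_b
        T = abs(sum(evals_a) - sum(evals_b))  # test statistic at interim k
        if T > thresholds[k]:  # reject: one agent is significantly better
            return ("larger" if sum(evals_a) > sum(evals_b) else "smaller"), k
    return "equal", k  # budget exhausted without rejection
```

The returned index k shows when the comparison was settled; a clear gap between the agents triggers a decision at an early interim, while close agents run through all batches.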
- plot_results(agent_names=None, axes=None)
 Visual representation of results.
- Parameters:
 - agent_names: list of str or None
 - axes: tuple of two matplotlib axes, or None
  If None, the following is used: fig, (ax1, ax2) = plt.subplots(2, 1, gridspec_kw={"height_ratios": [1, 2]}, figsize=(6, 5))
- plot_results_sota(agent_names=None, axes=None)
 Visual representation of results when the first agent is compared to all the others.
- Parameters:
 - agent_names: list of str or None
 - axes: tuple of two matplotlib axes, or None
  If None, the following is used: fig, (ax1, ax2) = plt.subplots(2, 1, gridspec_kw={"height_ratios": [1, 2]}, figsize=(6, 5))