.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_writer_wrapper.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_writer_wrapper.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_writer_wrapper.py:


==============================================
Record reward during training and then plot it
==============================================

This script shows how to modify an agent so that it records the reward (or the
action) during fitting, and how to plot the recorded data with the plot utilities.

.. note::
    If you have already run this script once, the fitted agent has been saved
    in the ``rlberry_data`` folder. You can then comment out the line

    .. code-block:: python

        xp_manager.fit(budget=10)

    to avoid fitting the agent again: the statistics from the last fit are
    loaded automatically. See the
    `rlberry.manager.plot_writer_data` documentation for more information.

.. GENERATED FROM PYTHON SOURCE LINES 21-82


.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /auto_examples/images/sphx_glr_plot_writer_wrapper_001.png
         :alt: Cumulative Reward
         :srcset: /auto_examples/images/sphx_glr_plot_writer_wrapper_001.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /auto_examples/images/sphx_glr_plot_writer_wrapper_002.png
         :alt: Cumulative Reward
         :srcset: /auto_examples/images/sphx_glr_plot_writer_wrapper_002.png
         :class: sphx-glr-multi-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [INFO] 13:51: ... trained! 
    [INFO] 13:51: Saved ExperimentManager(UCBVIAgent) using pickle. 
    [INFO] 13:51: The ExperimentManager was saved in : 'rlberry_data/temp/manager_data/UCBVIAgent_2025-03-07_13-51-50_43ae8643/manager_obj.pickle' 


|

.. code-block:: python3


    import numpy as np

    from rlberry_scool.envs import GridWorld
    from rlberry.manager import plot_writer_data, ExperimentManager
    from rlberry_scool.agents import UCBVIAgent
    import matplotlib.pyplot as plt

    # We wrap the default writer of the agent in a WriterWrapper
    # to record rewards.


    class VIAgent(UCBVIAgent):
        name = "UCBVIAgent"

        def __init__(self, env, **kwargs):
            UCBVIAgent.__init__(self, env, writer_extra="reward", horizon=50, **kwargs)


    env_ctor = GridWorld
    env_kwargs = dict(
        nrows=3,
        ncols=10,
        reward_at={(1, 1): 0.1, (2, 9): 1.0},
        walls=((1, 4), (2, 4), (1, 5)),
        success_probability=0.7,
    )

    env = env_ctor(**env_kwargs)
    xp_manager = ExperimentManager(VIAgent, (env_ctor, env_kwargs), fit_budget=10, n_fit=3)

    xp_manager.fit(budget=10)
    # Comment out the line above if you only want to load data from rlberry_data.


    # We use the following preprocessing function to plot the cumulative reward.
    def compute_reward(rewards):
        return np.cumsum(rewards)


    # Plot of the cumulative reward.
    output = plot_writer_data(
        xp_manager, tag="reward", preprocess_func=compute_reward, title="Cumulative Reward"
    )
    # The plot spans 500 global steps: fit_budget (10) * horizon (50).

    # Log-log plot:
    fig, ax = plt.subplots(1, 1)
    plot_writer_data(
        xp_manager,
        tag="reward",
        preprocess_func=compute_reward,
        title="Cumulative Reward",
        ax=ax,
        show=False,  # necessary to customize axes
    )
    ax.set_xlim(100, 500)
    ax.relim()
    ax.set_xscale("log")
    ax.set_yscale("log")
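
The preprocessing step and the step count above can be checked without
rlberry. The following is a minimal sketch, assuming that the fit budget counts
episodes of ``horizon`` steps each (as the comment in the script implies). The
reward sequence is synthetic and purely illustrative; it is not what the agent
actually records.

.. code-block:: python

    import numpy as np

    fit_budget = 10  # budget passed to fit() in the script above
    horizon = 50  # horizon passed to UCBVIAgent in the script above
    n_steps = fit_budget * horizon  # number of global steps on the x-axis

    # Hypothetical per-step rewards: 0.1 on every 5th step, 0 elsewhere.
    rewards = np.zeros(n_steps)
    rewards[4::5] = 0.1

    # Same transformation as compute_reward in the script above.
    cumulative = np.cumsum(rewards)

    print(n_steps, cumulative.shape)

Because the rewards in this sketch are non-negative, the cumulative curve never
decreases, which is also what makes a logarithmic y-axis usable once the curve
is strictly positive.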


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 3.161 seconds)


.. _sphx_glr_download_auto_examples_plot_writer_wrapper.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_writer_wrapper.py <plot_writer_wrapper.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_writer_wrapper.ipynb <plot_writer_wrapper.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_