.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/demo_env/video_plot_twinrooms.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_demo_env_video_plot_twinrooms.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_demo_env_video_plot_twinrooms.py:


===================================
A demo of the TwinRooms environment
===================================

Illustration of the TwinRooms environment.

.. video:: ../../video_plot_twinrooms.mp4
   :width: 600

.. GENERATED FROM PYTHON SOURCE LINES 11-39

.. code-block:: python3

    from rlberry_research.envs.benchmarks.generalization.twinrooms import TwinRooms
    from rlberry_scool.agents.mbqvi import MBQVIAgent
    from rlberry.wrappers.discretize_state import DiscretizeStateWrapper
    from rlberry.seeding import Seeder

    # Fixed seed so that the run is reproducible.
    seeder = Seeder(123)

    # Build the environment and discretize its state space for the tabular agent.
    env = TwinRooms()
    env = DiscretizeStateWrapper(env, n_bins=20)
    env.reseed(seeder)

    # Fit a model-based Q-value iteration agent on the discretized environment.
    horizon = 20
    agent = MBQVIAgent(env, n_samples=10, gamma=1.0, horizon=horizon)
    agent.reseed(seeder)
    agent.fit()

    # Roll out the learned policy for a few steps with rendering enabled.
    observation, info = env.reset()
    env.enable_rendering()
    for ii in range(10):
        action = agent.policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated

        # Reset every `horizon` steps.
        if (ii + 1) % horizon == 0:
            observation, info = env.reset()

    # Render the recorded frames and save them as a video.
    env.render()
    video = env.save_video("_video/video_plot_twinrooms.mp4")


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.000 seconds)


.. _sphx_glr_download_auto_examples_demo_env_video_plot_twinrooms.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: video_plot_twinrooms.py <video_plot_twinrooms.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: video_plot_twinrooms.ipynb <video_plot_twinrooms.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_