{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# This notebook present the most basic use of Grid2Op" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Objectives**\n", "\n", "This notebook will cover some basic raw functionality at first. It will then show how these raw functionalities are encapsulated with easy to use functions.\n", "\n", "The recommended way to use these is to through the Runner, and not by getting through the instanciation of class one by one." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import os\n", "import sys\n", "import grid2op" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
run previous cell, wait for 2 seconds
\n", "" ], "text/plain": [ "" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "res = None\n", "try:\n", " from jyquickhelper import add_notebook_menu\n", " res = add_notebook_menu()\n", "except ModuleNotFoundError:\n", " print(\"Impossible to automatically add a menu / table of content to this notebook.\\nYou can download \\\"jyquickhelper\\\" package with: \\n\\\"pip install jyquickhelper\\\"\")\n", "res" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0) Summary of RL method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Though the `Grid2Op` package can be used to perform many different tasks, these set of notebooks will be focused on the machine learning part, and its usage in a Reinforcement learning framework. \n", "\n", "The reinforcement learning is a framework that allows to train \"agent\" to solve time dependant domain. We tried to cast the grid operation planning into this framework. The package `Grid2Op` was inspired by it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In a reinforcement learning (RL), there are 2 distinct entities:\n", "* **Environment**: is a modeling of the \"world\" in which the *agent* takes some *actions* to achieve some pre definite objectives.\n", "* **Agent**: will do actions on the environment that will have consequences.\n", "\n", "These 2 entities exchange 3 main type of information:\n", "* **Action**: it's an information sent by the Agent that will modify the internal state of the environment.\n", "* **State** / **Observation**: is the (partial) view of the environment by the Agent. The Agent receive a new state after each actions. He can use the observation (state) at time step *t* to take the action at time *t*.\n", "* **Reward**: is the score received by the agent for the previous action.\n", "\n", "A schematic representaiton of this is shown in the figure bellow (Credit: [Sutton & Barto](http://incompleteideas.net/book/bookdraft2017nov5.pdf)):" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![title](img/reinforcement-learning.jpg)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we will develop a simple Agent that takes some action (powerline disconnection) based on the observation of the environment." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For more information about the problem, please visit the [Example_5bus](Example_5bus.ipynb) notebook which dive more into the casting of the real time operation planning into a RL framework. Note that this notebook is still under development at the moment.\n", "\n", "A good material is also provided in the white paper [Reinforcement Learning for Electricity Network Operation](https://arxiv.org/abs/2003.07339) presented for the L2RPN 2020 Neurips edition." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## I) Creating an Environment" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### I.A) Default settings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We provide a one function call that will handle the creation of the Environment with default values.\n", "\n", "In this example we will use the `rte_case14_redisp`. 
 in a test setting.\n", "\n", "To define/create it, we can call:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/tezirg/Code/Grid2Op.BDonnot/getting_started/grid2op/MakeEnv/Make.py:223: UserWarning: You are using a development environment. This environment is not intended for training agents.\n", " warnings.warn(_MAKE_DEV_ENV_WARN)\n" ] } ], "source": [ "env = grid2op.make(\"rte_case14_redisp\", test=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NB** By setting \"test=True\" in the above call, we use only 2 different months of data for our environment. If you remove it, grid2op.make will attempt to download more data. By default, the data corresponding to this environment will be downloaded into your \"home\" directory, which corresponds to the location returned by this script:\n", "\n", "```python\n", "import os\n", "print(f\"grid2op dataset will be downloaded in {os.path.expanduser('~/data_grid2op')}\")\n", "```\n", "\n", "If you want another default saving path, you can add a `.grid2opconfig.json` file and specify where the data should be downloaded." ] }, 
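{ "cell_type": "markdown", "metadata": {}, "source": [ "As a minimal sketch, such a configuration file could be written from Python as shown below. **Assumptions to check against the grid2op documentation** (they are not guaranteed by this notebook): the file is looked for in your home directory, and the download location is read from a \"data_dir\" entry.\n", "\n", "```python\n", "import json\n", "import os\n", "\n", "# assumption: grid2op reads \"~/.grid2opconfig.json\" and uses its \"data_dir\" entry\n", "config_path = os.path.expanduser(\"~/.grid2opconfig.json\")\n", "with open(config_path, \"w\", encoding=\"utf-8\") as f:\n", "    json.dump({\"data_dir\": \"/path/where/you/want/the/data\"}, f)\n", "```" ] }, 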
{ "cell_type": "markdown", "metadata": {}, "source": [ "Only 4 environments are available locally when you install grid2op:\n", "- `rte_case5_example`, only available locally; its main goal is to provide examples on a really small system that people can study manually and build intuition on.\n", "- `rte_case14_test`, only available locally; its main goal is to be used inside the unit tests of grid2op. It is **NOT** recommended to use it.\n", "- `rte_case14_redisp`, an environment based on the IEEE case14 files, which introduces redispatching. Only 2 datasets are available locally for this environment. More can be downloaded automatically by specifying \"local=False\" [default value of the argument \"local\"].\n", "- `rte_case14_realistic`, a \"realistic\" environment based on the same grid, which has been adapted to be closer to a \"real\" grid. More can be downloaded automatically by specifying \"local=False\" [default value of the argument \"local\"].\n", "\n", "Other environments are available through the \"make\" command. To get a list of the possible environments, you can do:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['l2rpn_2019',\n", " 'l2rpn_case14_sandbox',\n", " 'rte_case14_realistic',\n", " 'rte_case14_redisp']" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grid2op.list_available_remote_env() # this only works if you have an internet connection" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is also possible to list the environments you have already downloaded (if any). **NB** downloading is automatic and happens the first time you call \"make\" with an environment that has not already been downloaded locally." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['rte_case14_realistic', 'rte_case14_redisp', 'l2rpn_2019']" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "grid2op.list_available_local_env()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### I.B) Custom settings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the `make` function, you can pass additional arguments to customize the environment (useful for training):\n", " - `param`: Parameters used for the Environment. If provided, it must be an instance of `grid2op.Parameters.Parameters`.\n", " - `backend`: The backend to use for the computation. If provided, it must be an instance of class `grid2op.Backend.Backend`.\n", " - `action_class`: Type of BaseAction the BaseAgent will be able to perform. If provided, it must be a subclass of `grid2op.BaseAction.BaseAction`.\n", " - `observation_class`: Type of BaseObservation the BaseAgent will receive. If provided, it must be a subclass of `grid2op.BaseObservation.BaseObservation`.\n", " - `reward_class`: Type of reward signal the BaseAgent will receive. If provided, it must be a subclass of `grid2op.BaseReward.BaseReward`.\n", " - `gamerules_class`: Type of \"Rules\" the BaseAgent needs to comply with. Rules are here to model some operational constraints. If provided, it must be a subclass of `grid2op.RulesChecker.BaseRules`.\n", " - `data_feeding_kwargs`: Dictionary that is used to build the `data_feeding` (chronics) objects.\n", " - `chronics_class`: The type of chronics that represents the dynamics of the created Environment. Usually they come from different folders.\n", " - `data_feeding`: The type of chronics handler you want to use.\n", " - `volagecontroler_class`: The type of `grid2op.VoltageControler.VoltageControler` to use.\n", " - `chronics_path`: Path where to look for the chronics dataset (optional).\n", " - `grid_path`: The path where the powergrid is located. If provided, it must be a string pointing to a valid file present on the hard drive.\n", " \n", "For example, to set the number of substation changes allowed per step:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/tezirg/Code/Grid2Op.BDonnot/getting_started/grid2op/MakeEnv/Make.py:223: UserWarning: You are using a development environment. This environment is not intended for training agents.\n", " warnings.warn(_MAKE_DEV_ENV_WARN)\n" ] } ], "source": [ "from grid2op.Parameters import Parameters\n", "\n", "custom_params = Parameters()\n", "custom_params.MAX_SUB_CHANGED = 1\n", "env = grid2op.make(\"rte_case14_redisp\", param=custom_params, test=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**NB** The function \"make\" is highly customizable. For example, you can change the reward you are using:\n", "\n", "```python\n", "from grid2op.Reward import L2RPNReward\n", "env = grid2op.make(reward_class=L2RPNReward)\n", "```\n", "\n", "We also give the possibility to assess several rewards at once. This can be done with the following code:\n", "\n", "```python\n", "from grid2op.Reward import L2RPNReward, FlatReward\n", "env = grid2op.make(reward_class=L2RPNReward,\n", "                   other_rewards={\"other_reward\": FlatReward})\n", "```\n", "\n", "The results of these rewards can be accessed through the \"info\" value returned by the call to env.step. See the official documentation of rewards [here](https://grid2op.readthedocs.io/en/latest/reward.html) for more information." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## II) Creating an Agent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An *Agent* is the name given to the \"operator\" / \"bot\" / \"algorithm\" that performs some modifications of the powergrid when it faces some \"observation\".\n", "\n", "Some examples of Agents are provided in the file [Agent.py](grid2op/Agent/Agent.py).\n", "\n", "A deeper look at the different Agents provided can be found in the [4_StudyYourAgent](4_StudyYourAgent.ipynb) notebook." ] }, 
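{ "cell_type": "markdown", "metadata": {}, "source": [ "As a side note, here is a minimal sketch of what a custom Agent could look like: it subclasses `BaseAgent` and implements the `act` method. The class `MyCustomAgent` below is purely illustrative (it is not part of grid2op), and it simply reproduces the \"do nothing\" behaviour.\n", "\n", "```python\n", "from grid2op.Agent import BaseAgent\n", "\n", "class MyCustomAgent(BaseAgent):\n", "    \"\"\"Illustrative agent: it always returns the 'do nothing' action.\"\"\"\n", "    def __init__(self, action_space):\n", "        BaseAgent.__init__(self, action_space)\n", "\n", "    def act(self, observation, reward, done=False):\n", "        # an empty dictionary builds the \"do nothing\" action\n", "        return self.action_space({})\n", "\n", "# it would be used like the built-in agents, for example:\n", "# my_custom_agent = MyCustomAgent(env.helper_action_player)\n", "```" ] }, 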
{ "cell_type": "markdown", "metadata": {}, "source": [ "In this notebook, we will use the simplest Agent, the one that does nothing:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "from grid2op.Agent import DoNothingAgent\n", "my_agent = DoNothingAgent(env.helper_action_player)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## III) Assess how the Agent is performing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The performance of each Agent is assessed with the reward. For this example, the reward is a *FlatReward* that simply counts how many time steps the *Agent* has successfully managed before breaking any rules. For more control over this reward, it is recommended to look at the documentation of the Environment class.\n", "\n", "More examples of rewards are also available in the official documentation or [here](https://grid2op.readthedocs.io/en/latest/reward.html)." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "done = False\n", "time_step = int(0)\n", "cum_reward = 0.\n", "obs = env.reset()\n", "reward = env.reward_range[0]\n", "max_iter = 10\n", "while not done:\n", "    act = my_agent.act(obs, reward, done)  # choose an action to do, in this case \"do nothing\"\n", "    obs, reward, done, info = env.step(act)  # implement this action on the powergrid\n", "    cum_reward += reward\n", "    time_step += 1\n", "    if time_step > max_iter:\n", "        break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now evaluate how well this *agent* is performing:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "This agent managed to survive 11 timesteps\n", "Its final cumulative reward is 12072.310007859756\n" ] } ], "source": [ "print(\"This agent managed to survive {} timesteps\".format(time_step))\n", "print(\"Its final cumulative reward is {}\".format(cum_reward))" ] }, 
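{ "cell_type": "markdown", "metadata": {}, "source": [ "Before moving on, note that `env.step` also returns an `info` dictionary (already unpacked in the loop above). A minimal sketch of how it could be inspected is given below; the exact content of `info`, and in particular the \"rewards\" key mentioned in the comments, are assumptions to check against the grid2op documentation.\n", "\n", "```python\n", "obs = env.reset()\n", "act = my_agent.act(obs, env.reward_range[0], False)\n", "obs, reward, done, info = env.step(act)\n", "print(info)  # extra diagnostic information about the last step\n", "# if the environment was created with `other_rewards` (see section I.B),\n", "# their values are expected under a dedicated entry (assumed key: \"rewards\"):\n", "# print(info.get(\"rewards\", {}))\n", "```" ] }, 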
{ "cell_type": "markdown", "metadata": {}, "source": [ "## IV) More convenient ways to assess an agent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the above steps have been detailed as a \"quick start\", to give an example of the main classes of the Grid2Op package. Coding all of the above by hand offers a lot of flexibility, but it is quite tedious to do before starting to evaluate an agent. What we expose here is a much shorter way to perform all of the above. In this section we will expose 2 ways:\n", "* The quickest way, using the grid2op.main API, most suited when basic computations need to be carried out.\n", "* The recommended way, using a *Runner*; it gives more flexibility than the grid2op.main API but can be harder to configure.\n", "\n", "For this section, we assume the same as before:\n", "* The Agent is \"Do Nothing\"\n", "* The Environment is the default Environment\n", "* PandaPower is used as the backend\n", "* The chronics come from the files included in this package\n", "* etc." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### IV.A) Using the grid2op.runner API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When only a simple assessment needs to be performed, the grid2op.main API is perfectly suited. This API can also be accessed from the command line:\n", "```bash\n", "python3 -m grid2op.main\n", "```\n", "\n", "We detail here its usage as an API, to assess the performance of a given Agent.\n", "\n", "As opposed to building an environment from scratch (see the previous section), this requires much less effort: we don't need to initialize (instantiate) anything. Everything is carried out inside the Runner called by the *main* function.\n", "\n", "We ask here for 1 episode (i.e. we play one scenario until either the agent reaches a game over or the scenario ends), but this method works just as well with more episodes (see the sketch after the results below)." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "from grid2op.Runner import Runner\n", "runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)\n", "res = runner.run(nb_episode=1, max_iter=max_iter)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Running the few lines above will:\n", "* Create a valid environment\n", "* Create a valid agent\n", "* Assess how well an agent performs on one episode." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The results are:\n", "\tFor chronics located at /home/tezirg/Code/Grid2Op.BDonnot/getting_started/grid2op/data/rte_case14_redisp/chronics/0\n", "\t\t - cumulative reward: 10948.05\n", "\t\t - number of time steps completed: 10 / 10\n" ] } ], "source": [ "print(\"The results are:\")\n", "for chron_name, _, cum_reward, nb_time_step, max_ts in res:\n", "    msg_tmp = \"\\tFor chronics located at {}\\n\".format(chron_name)\n", "    msg_tmp += \"\\t\\t - cumulative reward: {:.2f}\\n\".format(cum_reward)\n", "    msg_tmp += \"\\t\\t - number of time steps completed: {:.0f} / {:.0f}\".format(nb_time_step, max_ts)\n", "    print(msg_tmp)" ] }, 
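{ "cell_type": "markdown", "metadata": {}, "source": [ "As mentioned above, the Runner can evaluate more than one episode in a single call. A minimal sketch (nothing new compared to the cells above, except the `nb_episode` value; with \"test=True\" only a couple of chronics are available, so keep this number small):\n", "\n", "```python\n", "# run 2 episodes instead of 1; the result then contains one entry per episode\n", "res_more = runner.run(nb_episode=2, max_iter=max_iter)\n", "for chron_name, _, cum_reward, nb_time_step, max_ts in res_more:\n", "    print(\"{}: reward {:.2f}, {:.0f} / {:.0f} time steps\".format(\n", "        chron_name, cum_reward, nb_time_step, max_ts))\n", "```" ] }, 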
{ "cell_type": "markdown", "metadata": {}, "source": [ "This is particularly suited to evaluate different agents; for example, we can quickly evaluate a second agent. For the example below, we import the agent class *PowerLineSwitch*, whose job is to connect and disconnect the power lines in the power network. This *PowerLineSwitch* Agent will simulate the effect of disconnecting each powerline on the powergrid and take the best action found (its execution can take a long time, depending on the scenario and the number of powerlines on the grid). **The execution of the code below can take a few moments.**" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The results are:\n", "\tFor chronics located at /home/tezirg/Code/Grid2Op.BDonnot/getting_started/grid2op/data/rte_case14_redisp/chronics/0\n", "\t\t - cumulative reward: 10950.26\n", "\t\t - number of time steps completed: 10 / 10\n" ] } ], "source": [ "from grid2op.Agent import PowerLineSwitch\n", "runner = Runner(**env.get_params_for_runner(), agentClass=PowerLineSwitch)\n", "res = runner.run(nb_episode=1, max_iter=max_iter)\n", "print(\"The results are:\")\n", "for chron_name, _, cum_reward, nb_time_step, max_ts in res:\n", "    msg_tmp = \"\\tFor chronics located at {}\\n\".format(chron_name)\n", "    msg_tmp += \"\\t\\t - cumulative reward: {:.2f}\\n\".format(cum_reward)\n", "    msg_tmp += \"\\t\\t - number of time steps completed: {:.0f} / {:.0f}\".format(nb_time_step, max_ts)\n", "    print(msg_tmp)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using this API, it is also possible to store the results for a detailed examination of the actions taken by the Agent. Note that writing to the hard drive adds an overhead to the computation time.\n", "\n", "To do this, only a single argument needs to be added to the *main* function call. An example can be found below (where the outcome of the experiment will be stored in the `saved_experiment_donothing` directory):" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The results are:\n", "\tFor chronics located at /home/tezirg/Code/Grid2Op.BDonnot/getting_started/grid2op/data/rte_case14_redisp/chronics/0\n", "\t\t - cumulative reward: 10950.26\n", "\t\t - number of time steps completed: 10 / 10\n" ] } ], "source": [ "runner = Runner(**env.get_params_for_runner(),\n", "                agentClass=PowerLineSwitch\n", "                )\n", "res = runner.run(nb_episode=1, max_iter=max_iter, path_save=os.path.abspath(\"saved_experiment_donothing\"))\n", "print(\"The results are:\")\n", "for chron_name, _, cum_reward, nb_time_step, max_ts in res:\n", "    msg_tmp = \"\\tFor chronics located at {}\\n\".format(chron_name)\n", "    msg_tmp += \"\\t\\t - cumulative reward: {:.2f}\\n\".format(cum_reward)\n", "    msg_tmp += \"\\t\\t - number of time steps completed: {:.0f} / {:.0f}\".format(nb_time_step, max_ts)\n", "    print(msg_tmp)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "actions.npy\t\t\t episode_meta.json _parameters.json\r\n", "agent_exec_times.npy\t\t episode_times.json rewards.npy\r\n", "disc_lines_cascading_failure.npy observations.npy\r\n", "env_modifications.npy\t\t other_rewards.json\r\n" ] } ], "source": [ "!ls saved_experiment_donothing/0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the information saved is shown above. For more details about it, please don't hesitate to read the documentation of the [Runner](https://grid2op.readthedocs.io/en/latest/runner.html).\n", "\n", "**NB** A lot more information about *Action* is provided in the [2_Action_GridManipulation](2_Action_GridManipulation.ipynb) notebook. The [3_TrainingAnAgent](3_TrainingAnAgent.ipynb) notebook contains a quick example of how to read / write actions from a saved directory.\n", "\n", "Notebook 7 gives more details about the advantages of the Runner, especially for post-analysis of agent performance." ] }, 
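{ "cell_type": "markdown", "metadata": {}, "source": [ "As a sketch of how a stored episode could be loaded back for inspection (the `EpisodeData` class, its `from_disk` method and the attribute used below are assumptions to check against the Runner documentation linked above):\n", "\n", "```python\n", "from grid2op.Episode import EpisodeData\n", "\n", "# load the episode saved above: agent directory \"saved_experiment_donothing\", episode \"0\"\n", "this_episode = EpisodeData.from_disk(os.path.abspath(\"saved_experiment_donothing\"), \"0\")\n", "print(\"number of actions stored: {}\".format(len(this_episode.actions)))\n", "```" ] }, 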
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The use of `make` + `runner` makes it easy to assess the performance of trained agent. Beside, Runner has been particularly integrated with other tools and makes easy the replay / post analysis of the episode. It is the recommended method to use in grid2op for evaluation." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.2" } }, "nbformat": 4, "nbformat_minor": 2 }