{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Dfs0\n", "\n", "See [Dfs0 in MIKE IO Documentation](https://dhi.github.io/mikeio/user-guide/dfs0.html)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import mikeio" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Reading data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = mikeio.read(\"data/Oresund_ts.dfs0\")\n", "ds" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(ds)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The mikeio read function returns a `Dataset` which is a container of `DataArray`s.\n", "\n", "A `DataArray` can be selected by name or by index." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da = ds[\"Drogden: Surface elevation\"] # or ds.Drogden_Surface_elevation or ds[2]\n", "da" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Upon `read`, specific items can be selected with the `items` argument using name or index." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = mikeio.read(\"data/Oresund_ts.dfs0\", items=[0,2,3])\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wildcards can be used to select multiple items:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = mikeio.read(\"data/Oresund_ts.dfs0\", items=\"*Surf*\")\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A specific time subset can be selected using `.sel`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.sel(time=slice(\"2018-03-04\",\"2018-03-04 12:00\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or with positional indexing using `.isel`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.isel(time=slice(10,20))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Dataset` and `DataArray` have a number of useful attributes, such as `time`, `items`, `ndim`, `shape` and `values` (`DataArray` only)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.time" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.items" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da.item" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da.shape" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "da.values" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The time series can be plotted with the `plot` method."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.plot();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A simple time series `Dataset` can easily be converted to a pandas DataFrame." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = ds.to_pandas()\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing data\n", "\n", "Often, time series data will come from a CSV or an Excel file. Here is an example of how to read a CSV file with pandas and then write the pandas DataFrame to a dfs0 file." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv(\"data/naples_fl.csv\", skiprows=1, parse_dates=True, index_col=0)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You will probably need to parse a specific data format many times, so it is a good idea to create a function for it."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def read_ncei_obs(filename):\n", " # old name : new name\n", " mapping = {'TAVG (Degrees Fahrenheit)': 'temperature_avg_f',\n", " 'TMAX (Degrees Fahrenheit)': 'temperature_max_f',\n", " 'TMIN (Degrees Fahrenheit)': 'temperature_min_f',\n", " 'PRCP (Inches)': 'prec_in'}\n", " \n", " df_renamed = (\n", " pd.read_csv(filename, skiprows=1, parse_dates=True, index_col=0)\n", " .rename(columns=mapping)\n", " )\n", " sel_cols = mapping.values() # No need to repeat ['temperature_avg_f',...]\n", " df_selected = df_renamed[sel_cols]\n", " return df_selected" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = read_ncei_obs(\"data/naples_fl.csv\")\n", "df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df.tail()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert temperature to Celsius and precipitation to mm." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_final = df.assign(temperature_max_c=(df['temperature_max_f'] - 32)/1.8,\n", " prec_mm=df['prec_in'] * 25.4)\n", "\n", "df_final.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df_final.loc['2021'].plot();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Creating a dfs0 file from a dataframe is pretty straightforward.\n", "\n", "1. Convert the dataframe to a `Dataset`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = mikeio.from_pandas(df_final)\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Write the `Dataset` to a dfs0 file." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds.to_dfs(\"output/naples_fl.dfs0\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's read it back in again..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "saved_ds = mikeio.read(\"output/naples_fl.dfs0\")\n", "saved_ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, EUM types are undefined, but they can be specified. Let's select a few columns." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df2 = df_final[['temperature_max_c', 'prec_in']]\n", "df2.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mikeio import ItemInfo, EUMType, EUMUnit\n", "\n", "ds2 = mikeio.from_pandas(df2, \n", " items=[\n", " ItemInfo(EUMType.Temperature),\n", " ItemInfo(EUMType.Precipitation_Rate, EUMUnit.inch_per_day)]\n", " )\n", "ds2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## EUM" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mikeio.eum import ItemInfo, EUMType, EUMUnit\n", "\n", "EUMType.search(\"wind\")\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "EUMType.Wind_speed.units" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Inline Exercise\n", "\n", "What is the best EUM Type for \"peak wave direction\"? What is the default unit? 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# insert your code here" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Precipitation data" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv(\"data/precipitation.csv\", parse_dates=True, index_col=0)\n", "df.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from mikecore.DfsFile import DataValueType\n", "\n", "(mikeio.from_pandas(df, items=ItemInfo(EUMType.Precipitation_Rate, EUMUnit.mm_per_hour, data_value_type=DataValueType.MeanStepBackward))\n", " .to_dfs(\"output/precipitation.dfs0\")\n", ")\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Selecting " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = mikeio.read(\"output/precipitation.dfs0\", items=[1,4]) # select item by item number (starting from zero)\n", "ds" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = mikeio.read(\"output/precipitation.dfs0\", items=[\"Precipitation station 5\",\"Precipitation station 1\"]) # or by name (in the order you like it)\n", "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Inline Exercise\n", "\n", "Read all items to a variable ds. Select \"Precipitation station 3\" - which different ways can you select this item? 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# insert your code here" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import utils\n", "\n", "utils.sysinfo()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.0" } }, "nbformat": 4, "nbformat_minor": 4 }