{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Dfs0\n",
    "\n",
    "See [Dfs0 in MIKE IO Documentation](https://dhi.github.io/mikeio/user-guide/dfs0.html)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import mikeio"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Reading data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = mikeio.read(\"data/Oresund_ts.dfs0\")\n",
    "ds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "type(ds)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The mikeio read function returns a `Dataset` which is a container of `DataArray`s.\n",
    "\n",
    "A `DataArray` can be selected by name or by index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "da = ds[\"Drogden: Surface elevation\"]   # or ds.Drogden_Surface_elevation or ds[2]\n",
    "da"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Upon `read`, specific items can be selected with the `items` argument using name or index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = mikeio.read(\"data/Oresund_ts.dfs0\", items=[0,2,3])\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Wildcards can be used to select multiple items:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = mikeio.read(\"data/Oresund_ts.dfs0\", items=\"*Surf*\")\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A specific time subset can be using .sel:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.sel(time=slice(\"2018-03-04\",\"2018-03-04 12:00\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or with positional indexing using .isel:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.isel(time=slice(10,20))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Dataset and DataArray have a number of useful attributes like `time`, `items`, `ndims`, `shape`, `values` (only DataArray) etc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.items"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "da.item"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "da.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "da.values"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The time series can be plotted with the plot method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.plot();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A simple timeseries Dataset can easily be converted to a Pandas DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = ds.to_pandas()\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing data\n",
    "\n",
    "Often, time series data will come from a csv or an Excel file. Here is an example of how to read a csv file with pandas and then write the pandas DataFrame to a dfs0 file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"data/naples_fl.csv\", skiprows=1, parse_dates=True, index_col=0)\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You will probably have the need to parse certain a specific data formats many times, then it is a good idea to create a function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def read_ncei_obs(filename):\n",
    "    # old name : new name\n",
    "    mapping = {'TAVG (Degrees Fahrenheit)': 'temperature_avg_f',\n",
    "               'TMAX (Degrees Fahrenheit)': 'temperature_max_f',\n",
    "               'TMIN (Degrees Fahrenheit)': 'temperature_min_f',\n",
    "               'PRCP (Inches)': 'prec_in'}\n",
    "    \n",
    "    df_renamed = (\n",
    "        pd.read_csv(filename, skiprows=1, parse_dates=True, index_col=0)\n",
    "           .rename(columns=mapping)\n",
    "    )\n",
    "    sel_cols = mapping.values() # No need to repeat ['temperature_avg_f',...]\n",
    "    df_selected = df_renamed[sel_cols]\n",
    "    return df_selected"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = read_ncei_obs(\"data/naples_fl.csv\")\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Convert temperature to Celsius and precipitation to mm."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_final = df.assign(temperature_max_c=(df['temperature_max_f'] - 32)/1.8,\n",
    "                     prec_mm=df['prec_in'] * 25.4)\n",
    "\n",
    "df_final.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df_final.loc['2021'].plot();"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Creating a dfs0 file from a dataframe is pretty straightforward.\n",
    "\n",
    "1. Convert the dataframe to a `Dataset`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = mikeio.from_pandas(df_final)\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Write the `Dataset` to a dfs0 file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds.to_dfs(\"output/naples_fl.dfs0\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's read it back in again..."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "saved_ds = mikeio.read(\"output/naples_fl.dfs0\")\n",
    "saved_ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "By default, EUM types are undefined. But it can be specified. Let's select a few colums."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df2 = df_final[['temperature_max_c', 'prec_in']]\n",
    "df2.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mikeio import ItemInfo, EUMType, EUMUnit\n",
    "\n",
    "ds2 = mikeio.from_pandas(df2, \n",
    "                         items=[\n",
    "                   ItemInfo(EUMType.Temperature),\n",
    "                   ItemInfo(EUMType.Precipitation_Rate, EUMUnit.inch_per_day)]\n",
    "           )\n",
    "ds2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## EUM"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mikeio.eum import ItemInfo, EUMType, EUMUnit\n",
    "\n",
    "EUMType.search(\"wind\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "EUMType.Wind_speed.units"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inline Exercise\n",
    "\n",
    "What is the best EUM Type for \"peak wave direction\"? What is the default unit? "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# insert your code here"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Precipitation data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"data/precipitation.csv\", parse_dates=True, index_col=0)\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mikecore.DfsFile import DataValueType\n",
    "\n",
    "(mikeio.from_pandas(df, items=ItemInfo(EUMType.Precipitation_Rate, EUMUnit.mm_per_hour, data_value_type=DataValueType.MeanStepBackward))\n",
    "        .to_dfs(\"output/precipitation.dfs0\")\n",
    ")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Selecting "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = mikeio.read(\"output/precipitation.dfs0\", items=[1,4]) # select item by item number (starting from zero)\n",
    "ds"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "ds = mikeio.read(\"output/precipitation.dfs0\", items=[\"Precipitation station 5\",\"Precipitation station 1\"]) # or by name (in the order you like it)\n",
    "ds"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Inline Exercise\n",
    "\n",
    "Read all items to a variable ds. Select \"Precipitation station 3\" - which different ways can you select this item?  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# insert your code here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import utils\n",
    "\n",
    "utils.sysinfo()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}