{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# SOSE Heat and Salt Budgets\n", "\n", "Evaluating the conservation of heat and salt content in the Southern Ocean State Estimate \n", "\n", "Author: [Ryan Abernathey](http://github.com/rabernat)\n", "\n", "![SOSE Logo](http://sose.ucsd.edu/images/SOSEpic.png)\n", "\n", "## Connect to Dask Cluster\n", "\n", "Below we create a dask cluster with 30 nodes to do our work for us." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from dask.distributed import Client, progress\n", "from dask_kubernetes import KubeCluster\n", "cluster = KubeCluster(n_workers=30)\n", "cluster" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "** ☝️ don't forget to follow along on the dashboard **" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "client = Client(cluster)\n", "client" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Import necessary Python packages" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "from matplotlib import pyplot as plt\n", "import gcsfs\n", "import dask\n", "import dask.array as dsa\n", "import numpy as np\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Open SOSE Dataset from Cloud Storage" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds = xr.open_zarr(gcsfs.GCSMap('pangeo-data/SOSE'))\n", "ds" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('Total Size: %6.2F GB' % (ds.nbytes / 1e9))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A trick for optimization: split the dataset into coordinates and data variables, and then drop the coordinates from the data variables.\n", "This makes it easier to align the data variables in arithmetic operations." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "coords = ds.coords.to_dataset().reset_coords()\n", "dsr = ds.reset_coords(drop=True)\n", "dsr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "coords" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualize Some Data\n", "\n", "As a sanity check, let's look at some values." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import holoviews as hv\n", "from holoviews.operation.datashader import regrid\n", "hv.extension('bokeh')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%opts Image [width=900, height=400 colorbar=True] (cmap='Magma')\n", "\n", "hv_image = hv.Dataset(ds.THETA.where(ds.hFacC>0).rename('THETA')).to(hv.Image, kdims=['XC', 'YC'], dynamic=True)\n", "with dask.config.set(scheduler='single-threaded'):\n", " display(regrid(hv_image, precompute=True))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create xgcm grid\n", "\n", "[Xgcm](http://xgcm.readthedocs.io) is a package which helps with the analysis of GCM data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import xgcm\n", "grid = xgcm.Grid(ds, periodic=('X', 'Y'))\n", "grid" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tracer Budgets\n", "\n", "Here we will do the heat and salt budgets for SOSE. In integral form, these budgets can be written as\n", "\n", "$$\n", "\\mathcal{V} \\frac{\\partial S}{\\partial t} = G^S_{adv} + G^S_{diff} + G^S_{surf} + G^S_{linfs}\n", "$$\n", "\n", "\n", "$$\n", "\\mathcal{V} \\frac{\\partial \\theta}{\\partial t} = G^\\theta_{adv} + G^\\theta_{diff} + G^\\theta_{surf} + G^\\theta_{linfs} + G^\\theta_{sw}\n", "$$\n", "\n", "where $\\mathcal{V}$ is the volume of the grid cell. The terms on the right-hand side are called _tendencies_. They add up to the total tendency (the left hand side).\n", "\n", "The first term is the convergence of advective fluxes. The second is the convergence of diffusive fluxes. The third is the explicit surface flux. The fourth is the correction due to the linear free-surface approximation. The fifth is shortwave penetration (only for temperature).\n", "\n", "### Flux Divergence\n", "\n", "First we define a function to calculate the convergence of the advective and diffusive fluxes, since this has to be repeated for both tracers." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def tracer_flux_budget(suffix):\n", " \"\"\"Calculate the convergence of fluxes of tracer `suffix` where \n", " `suffix` is `TH` or `SLT`. Return a new xarray.Dataset.\"\"\"\n", " conv_horiz_adv_flux = -(grid.diff(dsr['ADVx_' + suffix], 'X') +\n", " grid.diff(dsr['ADVy_' + suffix], 'Y')).rename('conv_horiz_adv_flux_' + suffix)\n", " conv_horiz_diff_flux = -(grid.diff(dsr['DFxE_' + suffix], 'X') +\n", " grid.diff(dsr['DFyE_' + suffix], 'Y')).rename('conv_horiz_diff_flux_' + suffix)\n", " # sign convention is opposite for vertical fluxes\n", " conv_vert_adv_flux = grid.diff(dsr['ADVr_' + suffix], 'Z', boundary='fill').rename('conv_vert_adv_flux_' + suffix)\n", " conv_vert_diff_flux = (grid.diff(dsr['DFrE_' + suffix], 'Z', boundary='fill') +\n", " grid.diff(dsr['DFrI_' + suffix], 'Z', boundary='fill') +\n", " grid.diff(dsr['KPPg_' + suffix], 'Z', boundary='fill')).rename('conv_vert_diff_flux_' + suffix)\n", " \n", " all_fluxes = [conv_horiz_adv_flux, conv_horiz_diff_flux, conv_vert_adv_flux, conv_vert_diff_flux]\n", " conv_all_fluxes = sum(all_fluxes).rename('conv_total_flux_' + suffix)\n", " \n", " return xr.merge(all_fluxes + [conv_all_fluxes])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_slt = tracer_flux_budget('SLT')\n", "budget_slt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_th = tracer_flux_budget('TH')\n", "budget_th" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Surface Fluxes\n", "\n", "The surface fluxes are only active in the top model layer. We need to specify some constants to convert to the proper units and scale factors to convert to integral form. They also require some xarray special sauce to merge with the 3D variables." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# constants\n", "heat_capacity_cp = 3.994e3\n", "runit2mass = 1.035e3\n", "\n", "# treat the shortwave flux separately from the rest of the surface flux\n", "surf_flux_th = (dsr.TFLUX - dsr.oceQsw) * coords.rA / (heat_capacity_cp * runit2mass)\n", "surf_flux_th_sw = dsr.oceQsw * coords.rA / (heat_capacity_cp * runit2mass)\n", "\n", "# salt\n", "surf_flux_slt = dsr.SFLUX * coords.rA / runit2mass\n", "lin_fs_correction_th = -(dsr.WTHMASS.isel(Zl=0, drop=True) * coords.rA)\n", "lin_fs_correction_slt = -(dsr.WSLTMASS.isel(Zl=0, drop=True) * coords.rA)\n", "\n", "# in order to align the surface fluxes with the rest of the 3D budget terms,\n", "# we need to give them a z coordinate. We can do that with this function\n", "def surface_to_3d(da):\n", " da.coords['Z'] = dsr.Z[0]\n", " return da.expand_dims(dim='Z', axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Shortwave Flux\n", "\n", "Special treatment is needed for the shortwave flux, which penetrates into the interior of the water column" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def swfrac(coords, fact=1., jwtype=2):\n", " \"\"\"Clone of MITgcm routine for computing sw flux penetration.\n", " z: depth of output levels\"\"\"\n", " \n", " rfac = [0.58 , 0.62, 0.67, 0.77, 0.78]\n", " a1 = [0.35 , 0.6 , 1.0 , 1.5 , 1.4]\n", " a2 = [23.0 , 20.0 , 17.0 , 14.0 , 7.9 ]\n", " \n", " facz = fact * coords.Zl.sel(Zl=slice(0, -200))\n", " j = jwtype-1\n", " swdk = (rfac[j] * np.exp(facz / a1[j]) +\n", " (1-rfac[j]) * np.exp(facz / a2[j]))\n", " \n", " return swdk.rename('swdk')\n", "\n", "_, swdown = xr.align(dsr.Zl, surf_flux_th_sw * swfrac(coords), join='left', )\n", "swdown = swdown.fillna(0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# now we can add the to the budget datasets and they will align correctly\n", "# into the top cell (lazily filling with NaN's elsewhere)\n", "budget_slt['surface_flux_conv_SLT'] = surface_to_3d(surf_flux_slt)\n", "budget_slt['lin_fs_correction_SLT'] = surface_to_3d(lin_fs_correction_slt)\n", "budget_slt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_th['surface_flux_conv_TH'] = surface_to_3d(surf_flux_th)\n", "budget_th['lin_fs_correction_TH'] = surface_to_3d(lin_fs_correction_th)\n", "budget_th['sw_flux_conv_TH'] = -grid.diff(swdown, 'Z', boundary='fill').fillna(0.)\n", "budget_th" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add it all up\n", "\n", "The total tendency should be given by" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_th['total_tendency_TH'] = (budget_th.conv_total_flux_TH + \n", " budget_th.surface_flux_conv_TH.fillna(0.) +\n", " budget_th.lin_fs_correction_TH.fillna(0.) + \n", " budget_th.sw_flux_conv_TH)\n", "budget_th" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_slt['total_tendency_SLT'] = (budget_slt.conv_total_flux_SLT + \n", " budget_slt.surface_flux_conv_SLT.fillna(0.) +\n", " budget_slt.lin_fs_correction_SLT.fillna(0.))\n", "budget_slt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Include the \"truth\"\n", "\n", "MITgcm keeps track of the true total tendence in the `TOT?TEND` variables.\n", "We can use these as check on our budget calculation." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "volume = (coords.drF * coords.rA * coords.hFacC)\n", "client.scatter(volume)\n", "day2seconds = (24*60*60)**-1\n", "\n", "budget_th['total_tendency_TH_truth'] = dsr.TOTTTEND * volume * day2seconds\n", "budget_slt['total_tendency_SLT_truth'] = dsr.TOTSTEND * volume * day2seconds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Validate Budget\n", "\n", "Now we do some checks to verify that the budget adds up.\n", "\n", "### Vertical and Horizontal Integrals of Budget\n", "\n", "We will take an average over the first 10 timesteps" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time_slice = dict(time=slice(0, 10))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "valid_range = dict(YC=slice(-90,-30))\n", "\n", "def check_vertical(budget, suffix):\n", " ds_chk = (budget[[f'total_tendency_{suffix}', f'total_tendency_{suffix}_truth']]\n", " .sel(**valid_range).sum(dim=['Z', 'XC']).mean(dim='time'))\n", " return ds_chk\n", "\n", "def check_horizontal(budget, suffix):\n", " ds_chk = (budget[[f'total_tendency_{suffix}', f'total_tendency_{suffix}_truth']]\n", " .sel(**valid_range).sum(dim=['YC', 'XC']).mean(dim='time'))\n", " return ds_chk" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "th_vert = check_vertical(budget_th.isel(**time_slice), 'TH').load()\n", "th_vert.total_tendency_TH.plot(linewidth=2)\n", "th_vert.total_tendency_TH_truth.plot(linestyle='--', linewidth=2)\n", "plt.legend()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "th_horiz = check_horizontal(budget_th.isel(**time_slice), 'TH').load()\n", "th_horiz.total_tendency_TH.plot(linewidth=2, y='Z')\n", "th_horiz.total_tendency_TH_truth.plot(linestyle='--', linewidth=2, y='Z')\n", "plt.legend()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slt_vert = check_vertical(budget_slt.isel(**time_slice), 'SLT').load()\n", "slt_vert.total_tendency_SLT.plot(linewidth=2)\n", "slt_vert.total_tendency_SLT_truth.plot(linestyle='--', linewidth=2)\n", "plt.legend()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slt_horiz = check_horizontal(budget_slt.isel(**time_slice), 'SLT').load()\n", "slt_horiz.total_tendency_SLT.plot(linewidth=2, y='Z')\n", "slt_horiz.total_tendency_SLT_truth.plot(linestyle='--', linewidth=2, y='Z')\n", "plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Histogram Analysis of Error\n", "\n", "The curves look the same. But how do we know whether our error is truly \"small\"?\n", "Answer: we compare the distribution of the error\n", "\n", "$$ P( G_{est} - G_{truth} ) $$\n", "\n", "to the distribution of the other terms in the equation, e.g. $P(G_{adv})$.\n", "\n", "First we try to determine the appropriate range for our histograms by looking at the maximum values." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_th.isel(**time_slice).max().load()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_slt.isel(**time_slice).max().load()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# parameters for histogram calculation\n", "th_range = (-2e7, 2e7)\n", "slt_range = (-1e8, 1e8)\n", "valid_region = dict(YC=slice(-90, -30))\n", "nbins = 301" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# budget errors\n", "error_th = budget_th.total_tendency_TH - budget_th.total_tendency_TH_truth\n", "error_slt = budget_slt.total_tendency_SLT - budget_slt.total_tendency_SLT_truth" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# calculate theta error histograms over the whole time range\n", "adv_hist_th, hbins_th = dsa.histogram(budget_th.conv_horiz_adv_flux_TH.sel(**valid_region).data,\n", " bins=nbins, range=th_range)\n", "err_hist_th, hbins_th = dsa.histogram(error_th.sel(**valid_region).data,\n", " bins=nbins, range=th_range)\n", "err_hist_th, adv_hist_th = dask.compute(err_hist_th, adv_hist_th)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bin_c_th = 0.5*(hbins_th[:-1] + hbins_th[1:])\n", "plt.semilogy(bin_c_th, adv_hist_th, label='Advective Tendency')\n", "plt.semilogy(bin_c_th, err_hist_th, label='Budget Residual')\n", "plt.legend()\n", "plt.title('THETA Budget')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# calculate salt error histograms over the whole time range\n", "adv_hist_slt, hbins_slt = dsa.histogram(budget_slt.conv_horiz_adv_flux_SLT.sel(**valid_region).data,\n", " bins=nbins, range=slt_range)\n", "err_hist_slt, hbins_slt = dsa.histogram(error_slt.sel(**valid_region).fillna(-9e13).data,\n", " bins=nbins, range=slt_range)\n", "err_hist_slt, adv_hist_slt = dask.compute(err_hist_slt, adv_hist_slt)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "bin_c_slt = 0.5*(hbins_slt[:-1] + hbins_slt[1:])\n", "plt.semilogy(bin_c_slt, adv_hist_slt, label='Advective Tendency')\n", "plt.semilogy(bin_c_slt, err_hist_slt, label='Budget Residual')\n", "plt.title('SALT Budget')\n", "plt.legend()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These figures show that the error is extremely small compared to the other terms in the budget.\n", "\n", "## Weddell Sea Budget Timeseries\n", "\n", "Finally, we can do some science: compute the salinity budget for a specific region, such as the upper Weddell Sea." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_slt_weddell = (budget_slt\n", " .sel(YC=slice(-80, -68), XC=slice(290, 360), Z=slice(0, -500))\n", " .sum(dim=('XC', 'YC', 'Z'))\n", " .load())\n", "budget_slt_weddell" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(18,8))\n", "for v in budget_slt_weddell.data_vars:\n", " budget_slt_weddell[v].rolling(time=30).mean().plot(label=v)\n", "plt.ylim([-4e7, 4e7])\n", "plt.legend()\n", "plt.grid()\n", "plt.title('Weddell Sea Salt Budget')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(18,8))\n", "for v in budget_slt_weddell.data_vars:\n", " budget_slt_weddell[v].rolling(time=30).mean().plot(label=v)\n", "plt.ylim([-4e6, 4e6])\n", "plt.legend()\n", "plt.grid()\n", "plt.title('Weddell Sea Salt Budget')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These figures show that, while they advective terms are the largest ones in the budget, the actual variability in salinity is driven primarily by the surface fluxes.\n", "\n", "### Removing Climatlogy\n", "\n", "The timeseries is pretty short, but nevertheless we can try to remove the climatology to get a better idea of what drives the interannual variability" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "budget_slt_weddell_mm = budget_slt_weddell.groupby('time.month').mean(dim='time')\n", "budget_slt_weddell_anom = budget_slt_weddell.groupby('time.month') - budget_slt_weddell_mm" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(18,8))\n", "for v in budget_slt_weddell.data_vars:\n", " budget_slt_weddell_anom[v].rolling(time=30).mean().plot(label=v)\n", "plt.ylim([-2.5e7, 2.5e7])\n", "plt.legend()\n", "plt.grid()\n", "plt.title('Weddell Sea Anomaly Salt Budget')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "plt.figure(figsize=(18,8))\n", "for v in budget_slt_weddell.data_vars:\n", " budget_slt_weddell_anom[v].rolling(time=30).mean().plot(label=v)\n", "plt.ylim([-2.5e6, 2.5e6])\n", "plt.legend()\n", "plt.grid()\n", "plt.title('Weddell Sea Anomaly Salt Budget')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The monthly anomaly is also premoninantly driven by surface forcing." ] } ], "metadata": { "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }