{ "cells": [ { "cell_type": "markdown", "source": [ "# Probability map of phytoplankton in the North Sea using DIVAnd and a neural network\n", "The first step is to load the required modules." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "using DIVAnd\n", "using DIVAndNN\n", "using LinearAlgebra\n", "using Statistics\n", "using Random\n", "using Dates" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "The domain and the directory path `datadir` are defined in the file `emodnet_bio_grid.jl`." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "include(\"../scripts/emodnet_bio_grid.jl\");\n", "include(\"../scripts/validate_probability.jl\");\n", "include(\"../scripts/PhytoInterp.jl\");" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Create the working directories." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "mkpath(datadir)\n", "mkpath(joinpath(datadir,\"tmp\"))" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Helper function that downloads a file from a URL unless it already exists locally." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "function maybedownload(url,fname)\n", "    if !isfile(fname)\n", "        # download to a temporary file, then move it to its final location\n", "        mv(download(url),fname)\n", "    else\n", "        @info(\"$url is already downloaded\")\n", "    end\n", "end" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Download the GEBCO bathymetry." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "bathname = joinpath(datadir,\"gebco_30sec_4.nc\");\n", "bathisglobal = true\n", "maybedownload(\"https://dox.ulg.ac.be/index.php/s/RSwm4HPHImdZoQP/download\",\n", "              bathname)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Download a sample data file.\n", "Here we use the _Biddulphia sinensis_ data set prepared by Deltares (NL)." ], "metadata": {} }, { 
"outputs": [], "cell_type": "code", "source": [ "datafile = joinpath(datadir, \"Biddulphia sinensis-1995-2020.csv\")\n", "maybedownload(\"https://dox.ulg.ac.be/index.php/s/VgLglubaTLetHzc/download\", datafile)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "## Mask and bathymetry\n", "Interpolate land-sea mask" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "maskname = joinpath(datadir,\"mask.nc\")\n", "DIVAndNN.prep_mask(bathname,bathisglobal,gridlon,gridlat,years,maskname)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Load the mask (true: sea, false: land)" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "ds = Dataset(maskname,\"r\")\n", "mask = nomissing(ds[\"mask\"][:,:]) .== 1\n", "close(ds)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Interpolate the bathymetry" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "DIVAndNN.prep_bath(bathname,bathisglobal,gridlon,gridlat,datadir)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "## Environmental covariables\n", "These files are quite large and processing them takes some time. We therefore\n", "download the prepared data files for the North Sea." 
], "metadata": {} }, { "cell_type": "markdown", "source": [ "The prepared files can be generated by:\n", "```julia\n", "maybedownload(\"https://ec.oceanbrowser.net/data/emodnet-projects/Phase-3/Combined/Water_body_phosphate_combined_V1.nc\",\n", " joinpath(datadir,\"tmp\",\"Water_body_phosphate_combined_V1.nc\"))\n", "\n", "maybedownload(\"https://ec.oceanbrowser.net/data/emodnet-projects/Phase-3/Combined/Water_body_nitrogen_combined_V1.nc\",\n", " joinpath(datadir,\"tmp\",\"Water_body_nitrogen_combined_V1.nc\"))\n", "\n", "maybedownload(\"https://ec.oceanbrowser.net/data/emodnet-projects/Phase-3/Combined/Water_body_silicate_combined_V1.nc\",\n", " joinpath(datadir,\"tmp\",\"Water_body_silicate_combined_V1.nc\"))\n", "\n", "DIVAndNN.prep_tempsalt(gridlon,gridlat,data_TS,datadir)\n", "```" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "maybedownload(\"https://dox.ulg.ac.be/index.php/s/y9Z0c1wb5YshVDW/download\",\n", " joinpath(datadir,\"silicate.nc\"))\n", "\n", "maybedownload(\"https://dox.ulg.ac.be/index.php/s/A1NPSWwQYkx6Wy6/download\",\n", " joinpath(datadir,\"phosphate.nc\"))\n", "\n", "maybedownload(\"https://dox.ulg.ac.be/index.php/s/LDPbPWBvW6wPmCw/download\",\n", " joinpath(datadir,\"nitrogen.nc\"))\n", "\n", "\n", "BLAS.set_num_threads(1)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Compute the local resolution (the scale factors `pmn`) and the coordinates `xyi` of the grid." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "mask_unused,pmn,xyi = DIVAnd.domain(bathname,bathisglobal,gridlon,gridlat);" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Next we load the covariables.\n", "The entries below correspond to the file name, the variable name and\n", "the transformation function." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "covars_fname = [\n", " (\"bathymetry.nc\" , \"batymetry\" , identity),\n", " (\"nitrogen.nc\" , \"nitrogen\" , identity),\n", " (\"phosphate.nc\" , 
\"phosphate\" , identity),\n", " (\"silicate.nc\" , \"silicate\" , identity),\n", "]" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Add `datadir` to the file file names" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "covars_fname = map(entry -> (joinpath(datadir,entry[1]),entry[2:end]...),covars_fname)\n", "\n", "field = DIVAndNN.loadcovar((gridlon,gridlat),covars_fname;\n", " covars_const = true);" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Normalize covariables" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "DIVAndNN.normalize!(mask,field)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Inventory of all data files\n", "For this example we have just one file" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "data_analysis = DIVAndNN.Format2020(datadir,\"\")\n", "scientificname_accepted = listnames(data_analysis);" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Parameters for the analysis\n", "Except `len`, all parameters are adimensional." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "niter = 500 # number of iterations\n", "trainfrac = 0.01 # fraction of data using during training\n", "epsilon2ap = 10 # data constraint parameter\n", "epsilon2_background = 10 # error variance of obs. 
relative to background\n", "NLayers = [size(field)[end],4,1] # number of neurons in each layer of the neural network\n", "learning_rate = 0.001 # learning rate for the optimizer\n", "L2reg = 0.0001 # L2 regularization for the weights\n", "dropoutprob = 0.6 # drop-out probability\n", "len = 75e3 # correlation length-scale (meters)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Output directory." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "outdir = joinpath(datadir,\"Results\",\"test\")\n", "mkpath(outdir)\n", "\n", "sname = String(scientificname_accepted[1])\n", "\n", "@info sname" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "Load the data." ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "lon_a,lat_a,obstime_a,value_a,ids_a = loadbyname(data_analysis,years,sname)\n", "\n", "Random.seed!(1234)\n", "\n", "xobs_a = (lon_a,lat_a)\n", "lenxy = (len,len)" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "## Start the analysis" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "value_analysis,fw0 = DIVAndNN.analysisprob(\n", " mask,pmn,xyi,xobs_a,\n", " value_a,\n", " lenxy,epsilon2ap,\n", " field,\n", " NLayers,\n", " costfun = DIVAndNN.nll,\n", " niter = niter,\n", " dropoutprob = dropoutprob,\n", " L2reg = L2reg,\n", " learning_rate = learning_rate,\n", " rmaverage = true,\n", " trainfrac = trainfrac,\n", " epsilon2_background = epsilon2_background,\n", ");" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "## Save the results" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "outname = joinpath(outdir,\"DIVAndNN_$(sname)_interp.nc\")\n", "create_nc_results(outname, gridlon, gridlat, value_analysis, sname;\n", " varname = \"probability\", long_name=\"occurrence probability\");" ], "metadata": {}, "execution_count": null }, { "cell_type": 
"markdown", "source": [ "## Plots" ], "metadata": {} }, { "outputs": [], "cell_type": "code", "source": [ "include(\"../scripts/emodnet_bio_plot2.jl\")" ], "metadata": {}, "execution_count": null }, { "cell_type": "markdown", "source": [ "---\n", "\n", "*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*" ], "metadata": {} } ], "nbformat_minor": 3, "metadata": { "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.4.1" }, "kernelspec": { "name": "julia-1.4", "display_name": "Julia 1.4.1", "language": "julia" } }, "nbformat": 4 }