{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UTO98CS_feHV",
    "tags": []
   },
   "source": [
    "# Tutorial: Data Submission to ESS-DIVE SANDBOX Using API\n",
    "\n",
    "This notebook includes steps to submit data to ESS-DIVE through the API. This is a tutorial notebook and contains extra guidance and information about the API and metadata creation process. \n",
    "\n",
    "The ESS-DIVE Dataset API is a service that enables projects to programmatically submit and manage datasets with ESS-DIVE. This is an alternative to using the ESS-DIVE Online form for data uploads. This service encodes metadata using the JSON-LD specification. JSON-LD is a schema to encode linked Data using JSON, and in the future will be used by Google to index metadata for searches. The use of the standardized JSON-LD schema will dramatically increase the visibility of datasets, and also enable projects to create one-time code that can be reused for periodic uploads of datasets to ESS-DIVE. \n",
    "\n",
    "\n",
    "---\n",
    "⭐ Contact ess-dive-support@lbl.gov to **submit more than 10GB per upload attempt**. Additional permissions are required. <br>\n",
    "⭐ Current Maximum Upload Limit: **500 GB per upload attempt** <br> Please contact ess-dive-support@lbl.gov to submit more than 500GB of data at once."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ln1UsEzr48Dv"
   },
   "source": [
    "- Use **Sandbox** https://api-sandbox.ess-dive.lbl.gov when testing code to submit datasets to ESS-DIVE. All code examples use sandbox. \n",
    "- Once you have tested your code and you're ready to create new datasets for publication, use our **production** domain https://api.ess-dive.lbl.gov/.\n",
    "\n",
    "\n",
    "For additional information about the API, review the documentation at https://api-sandbox.ess-dive.lbl.gov. <br>\n",
    "Email ESS-DIVE at ess-dive-support@lbl.gov if you require assistance\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "yujnqOtsgIBz",
    "tags": []
   },
   "source": [
    "## 1. Get Authentication Token\n",
    "\n",
    "\n",
    "1. Go to https://data-sandbox.ess-dive.lbl.gov\n",
    "2. Sign in with Orcid\n",
    "3. Click your Name in the right hand corner and select My Profile \n",
    "4. Now Click the Settings > Authentication Token\n",
    "5. Scroll down and click Copy on the “Token” tab to get your authentication token \n",
    "\n",
    "---\n",
    "⭐️ If you are not already registered to submit data with ESS-DIVE, follow the steps on the Register to Submit Data page: https://docs.ess-dive.lbl.gov/contributing-data/new-contributor-registration. \n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Z97JGmb4Tvj9"
   },
   "source": [
    "## 2. Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "_xnrI6sTVPRb",
    "outputId": "b91da6b5-76e5-4cfb-a171-0a57497d965a"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: requests in /Users/emilyarobles/anaconda3/lib/python3.9/site-packages (2.28.2)\n",
      "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /Users/emilyarobles/anaconda3/lib/python3.9/site-packages (from requests) (1.26.9)\n",
      "Requirement already satisfied: certifi>=2017.4.17 in /Users/emilyarobles/anaconda3/lib/python3.9/site-packages (from requests) (2021.10.8)\n",
      "Requirement already satisfied: charset-normalizer<4,>=2 in /Users/emilyarobles/anaconda3/lib/python3.9/site-packages (from requests) (2.0.4)\n",
      "Requirement already satisfied: idna<4,>=2.5 in /Users/emilyarobles/anaconda3/lib/python3.9/site-packages (from requests) (3.3)\n",
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "pip install requests"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "eK7_dkbqU3M6"
   },
   "source": [
    "Enter your Authentication Token below by running the cell, then pasting your token into the text box. Do not re-run the cell after entering your token, unless you are updating your token. See step 1 for instructions to access your token through ESS-DIVE.\n",
    "\n",
    "*Reminder: Tokens expire every 24 hours. Always repeat this setup step when you update your token.*"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 49,
     "referenced_widgets": [
      "ef68fd157bb6494fbf438cc0c6b5f447",
      "b205804e6f5c4f238b499e832ee13936",
      "9a1b718673da41abb24fe4326cb18b9b"
     ]
    },
    "id": "qzNunXQkfhcA",
    "outputId": "22e7e5a3-e4ea-4517-a820-cd80619b65dd"
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "1b4c7fdf589846798e0bf9e430feccbe",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Text(value='', description='Token:')"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import requests\n",
    "import os\n",
    "import json\n",
    "from ipywidgets import widgets, interact\n",
    "import re\n",
    "\n",
    "token_text = widgets.Text(\"\", description=\"Token:\")\n",
    "display(token_text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "6vi1ZU5s4p8M",
    "outputId": "0d76480b-7843-41dc-e6d4-057becd21c35"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "You have successfully entered your token. Please remember to update every 24hrs.\n"
     ]
    }
   ],
   "source": [
    "token = token_text.value\n",
    "base = \"https://api-sandbox.ess-dive.lbl.gov/\"\n",
    "header_authorization =  \"bearer {}\".format(token)\n",
    "endpoint = \"packages\"\n",
    "\n",
    "if token == '':\n",
    "  print('Please enter your token after running the cell above.')\n",
    "else:\n",
    "  print('You have successfully entered your token. Please remember to update every 24hrs.')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "93wHjt7ABHNh",
    "tags": []
   },
   "source": [
    "## 3. Create Metadata \n",
    "The following lines of code validate JSON-LD metadata for a single dataset. The original code for creating a JSON object can be found at https://docs.ess-dive.lbl.gov/programmatic-tools/ess-dive-dataset-api/python-example#create-metadata. For the purpose of this tutorial, we have organized elements into smaller groups. \n",
    "<br><br>\n",
    "**For each metadata field we have provided general guidance on what to include. For more information about our requirements for metadata quality and to review checks that are performed during the publication review process, please see our Dataset Requirements documentation: https://docs.ess-dive.lbl.gov/contributing-data/package-level-metadata**"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "byk8uGU81mhG",
    "tags": []
   },
   "source": [
    "### Project Information - *provider*\n",
    "First, let's add **your** information as provider. You will need your project name and personal information, the same as it appears in your ESS-DIVE profile: \n",
    "\n",
    "*   ORCID\n",
    "*   First and last name\n",
    "*   Email\n",
    "*   Job title (eg. Principal Investigator)\n",
    "\n",
    "\n",
    "Revise the example information directly in the cell bellow. After running the cell, check that the output information matches your project information correctly. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "PDMvGoRQTovl",
    "outputId": "8e6bd4b1-47e7-4602-c0cd-ee4c8e64c801"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Please review your information and revise if needed.\n",
      "Project name: SPRUCE\n",
      "Submitter ORCID: http://orcid.org/0000-0001-7293-3561\n",
      "First name: Paul\n",
      "Last name: Hanson\n",
      "Email: hansonpj@ornl.gov\n",
      "Job title: Principal Investigator\n"
     ]
    }
   ],
   "source": [
    "provider_details = {\n",
    "   \"name\": \"SPRUCE\",\n",
    "   \"member\": {\n",
    "     \"@id\": \"http://orcid.org/0000-0001-7293-3561\",\n",
    "     \"givenName\": \"Paul\",\n",
    "     \"familyName\": \"Hanson\",\n",
    "     \"email\": \"hansonpj@ornl.gov\",\n",
    "     \"jobTitle\": \"Principal Investigator\"\n",
    "   }\n",
    " }\n",
    "\n",
    "#DO NOT EDIT BELOW\n",
    "print('Please review your information and revise if needed.')\n",
    "print(f\"Project name: {provider_details['name']}\\nSubmitter ORCID: {provider_details['member']['@id']}\\nFirst name: {provider_details['member']['givenName']}\\nLast name: {provider_details['member']['familyName']}\\nEmail: {provider_details['member']['email']}\\nJob title: {provider_details['member']['jobTitle']}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "bK__ppCUC_lk"
   },
   "source": [
    "**Coming soon: Use project identifier instead of manually entering project metadata.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {
    "id": "G5LWqNZ9C_lk"
   },
   "outputs": [],
   "source": [
    "#provider_spruce = {\n",
    "#            \"identifier\": {\n",
    "#            \"@type\": \"PropertyValue\",\n",
    "#                \"propertyID\": \"ess-dive\",\n",
    "#                \"value\": \"1e6d50d3-9532-43fb-a63f-bdcb4350bf0c\"\n",
    "#   }\n",
    "# }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8fy8cZw87ZR6"
   },
   "source": [
    "### Dataset Author(s) - *creator*\n",
    "Revise the examples in the cell below to include dataset authors in the order that you would like them to appear in the citation. Please add the ORCID for all authors, especially the first author, if possible. You can add or delete authors as needed. \n",
    "<br>\n",
    "**After running the cell, check that the output author listing is correct.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {
    "id": "zH2BKZD4zJCs"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "This is how your author list will appear in citation. Please make revisions if needed. \n",
      "['Hanson P', 'Riggs J', 'Nettles C', 'Dorrance W', 'Hook L']\n"
     ]
    }
   ],
   "source": [
    "creators =  [\n",
    "   {\n",
    "     \"@id\": \"http://orcid.org/0000-0001-7293-3561\",\n",
    "     \"givenName\": \"Paul J\",\n",
    "     \"familyName\": \"Hanson\",\n",
    "     \"affiliation\": \"Oak Ridge National Laboratory\",\n",
    "     \"email\": \"hansonpj@ornl.gov\"\n",
    "   },\n",
    "   {\n",
    "     \"givenName\": \"Jeffrey\",\n",
    "     \"familyName\": \"Riggs\",\n",
    "     \"affiliation\": \"Oak Ridge National Laboratory\"\n",
    "   },\n",
    "   {\n",
    "     \"givenName\": \"C\",\n",
    "     \"familyName\": \"Nettles\",\n",
    "     \"affiliation\": \"Oak Ridge National Laboratory\"\n",
    "   },\n",
    "   {\n",
    "     \"givenName\": \"William\",\n",
    "     \"familyName\": \"Dorrance\",\n",
    "     \"affiliation\": \"Oak Ridge National Laboratory\"\n",
    "   },\n",
    "   {\n",
    "     \"givenName\": \"Les\",\n",
    "     \"familyName\": \"Hook\",\n",
    "     \"affiliation\": \"Oak Ridge National Laboratory\"\n",
    "   }\n",
    " ]\n",
    "\n",
    "# DO NOT EDIT BELOW\n",
    "author_list_preview = []\n",
    "for i in creators:\n",
    "  names = i['familyName'] + ' ' + i['givenName'][0]\n",
    "  author_list_preview.append(names)\n",
    "\n",
    "\n",
    "print(f'This is how your author list will appear in citation. Please make revisions if needed. \\n{author_list_preview}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dataset Contact - *editor*\n",
    "List the person who should be contacted by users seeking further information for the data. Only one contact is allowed. Including the ORCID of this individual is strongly encouraged. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [],
   "source": [
    "contact = {\n",
    "   \"@id\": \"http://orcid.org/0000-0001-7293-3561\",\n",
    "   \"givenName\": \"Paul J\",\n",
    "   \"familyName\": \"Hanson\",\n",
    "   \"email\": \"hansonpj@ornl.gov\"\n",
    " }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2rMbdAvgC_lk",
    "tags": []
   },
   "source": [
    "### Dataset Title - *name*\n",
    "Replace the example below with a title 7-20 words in length."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "id": "N1qbB2nWC_lk"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SPRUCE S1 Bog Environmental Monitoring Data: 2010-2016\n"
     ]
    }
   ],
   "source": [
    "dataset_title = \"SPRUCE S1 Bog Environmental Monitoring Data: 2010-2016\"\n",
    "\n",
    "# DO NOT EDIT\n",
    "print(dataset_title)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Dataset Variables and Keywords - *variableMeasured, keywords*\n",
    "Enter all variables that you would like to include in your dataset, **separated by commas.** You can use your own variables, or choose standard names from the Global Change Master Directory (GCMD) Keywords list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "bgA319CTH5lK",
    "outputId": "461cd0cc-080c-4589-a753-db02fe081541"
   },
   "outputs": [],
   "source": [
    "variables = 'ENTER, VARIABLES, HERE'\n",
    "\n",
    "# DO NOT EDIT\n",
    "variables_list = variables.split(',')\n",
    "print(variables_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Repeat for keywords; enter all keywords that you would like to include in your dataset, **separated by commas.** You can use your own keywords, or choose standard names from the Global Change Master Directory (GCMD) Keywords list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "keywords = 'ENTER, KEYWORDS, HERE'\n",
    "\n",
    "# DO NOT EDIT\n",
    "keywords_list = keywords.split(',')\n",
    "print(keywords_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Dataset Abstract - *description*\n",
    "Enter an abstract for your dataset below. The abstract should be at least 100 words in length and include all details necessary to understand the purpose of your dataset and the research question it addresses, in addition to information required for data use and reproducibility."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Abstract: This data set reports selected ambient environmental monitoring data from the S1 bog in Minnesota for the period June 2010 through December 2016. Measurements of the environmental conditions at these stations will serve as a pre-treatment baseline for experimental treatments and provide driver data for future modeling activities. The site is the S1 bog, a Picea mariana [black spruce] – Sphagnum spp. bog forest in northern Minnesota, 40 km north of Grand Rapids, in the USDA Forest Service Marcell Experimental Forest (MEF). There are/were three monitoring sites located in the bog: Stations 1 and 2 are co-located at the southern end of the bog and Station 3 is located north central and adjacent to an existing U.S. Forest Service monitoring well. There are eight data files with selected results of ambient environmental monitoring in the S1 bog for the period June 2010 through December 2016. One file has the complete set of measurements and the other seven have the available data for a given calendar year. Not all measurements started in June 2010 and EM3 measurements ended in May 2014. Further details about the data package are in the attached pdf file (SPRUCE_EM_DATA_2010_2016_20170620).\n",
      "\n",
      "Dataset abstract is 196 words in length.\n"
     ]
    }
   ],
   "source": [
    "abstract = ['This data set reports selected ambient environmental monitoring data from the S1 bog in Minnesota for the period June 2010 through December 2016. Measurements of the environmental conditions at these stations will serve as a pre-treatment baseline for experimental treatments and provide driver data for future modeling activities. The site is the S1 bog, a Picea mariana [black spruce] – Sphagnum spp. bog forest in northern Minnesota, 40 km north of Grand Rapids, in the USDA Forest Service Marcell Experimental Forest (MEF). There are/were three monitoring sites located in the bog: Stations 1 and 2 are co-located at the southern end of the bog and Station 3 is located north central and adjacent to an existing U.S. Forest Service monitoring well. There are eight data files with selected results of ambient environmental monitoring in the S1 bog for the period June 2010 through December 2016. One file has the complete set of measurements and the other seven have the available data for a given calendar year. Not all measurements started in June 2010 and EM3 measurements ended in May 2014. Further details about the data package are in the attached pdf file (SPRUCE_EM_DATA_2010_2016_20170620).']\n",
    "\n",
    "# DO NOT EDIT\n",
    "print(f'Abstract: {abstract[0]}')\n",
    "for i in abstract:\n",
    "    res = len(re.findall(r'\\w+', i))\n",
    "    print (f\"\\nDataset abstract is {str(res)} words in length.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Geographic Coverage Information - *spatialCoverage*\n",
    "Revise the example in the below cell to add the description and coordinates of data collection sites. You can create additional sites if needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
    "geographic_info = [\n",
    "   {\n",
    "     \"description\": \"Site ID: S1 Bog Site name: S1 Bog, Marcell Experimental Forest Description: The site is the 8.1-ha S1 bog, a Picea mariana [black spruce] - Sphagnum spp. ombrotrophic bog forest in northern Minnesota, 40 km north of Grand Rapids, in the USDA Forest Service Marcell Experimental Forest (MEF). The S1 bog was harvested in successive strip cuts in 1969 and 1974 and the cut areas were allowed to naturally regenerate. Stations 1 and 2 are located in a 1974 strip that is characterized by a medium density of 3-5 meter black spruce and larch trees with an open canopy. The area was suitable for siting a monitoring station for representative meteorological conditions on the S1 bog. Station 3 is located in a 1969 harvest strip that is characterized by a higher density of 3-5 meter black spruce and larch trees with a generally closed canopy. Measurements at this station represent conditions in the surrounding stand. Site Photographs are in the attached document\",\n",
    "     \"geo\": [\n",
    "       {\n",
    "         \"name\": \"Northwest\",\n",
    "         \"latitude\": 47.50285,\n",
    "         \"longitude\": -93.48283\n",
    "       },\n",
    "       {\n",
    "         \"name\": \"Southeast\",\n",
    "         \"latitude\": 47.50285,\n",
    "         \"longitude\": -93.48283\n",
    "       }\n",
    "     ]\n",
    "   },\n",
    "    # SITE TWO\n",
    "    {\n",
    "     \"description\": \"Description of sectond site\",\n",
    "     \"geo\": [\n",
    "       {\n",
    "         \"name\": \"Northwest\",\n",
    "         \"latitude\": 47.50285,\n",
    "         \"longitude\": -93.48283\n",
    "       },\n",
    "       {\n",
    "         \"name\": \"Southeast\",\n",
    "         \"latitude\": 47.50285,\n",
    "         \"longitude\": -93.48283\n",
    "       }\n",
    "     ]\n",
    "   }\n",
    " ]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Start and End Dates - *temporalCoverage*\n",
    "Enter start and end dates for your data in YYYY-MM-DD format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [],
   "source": [
    "start_end_dates = {\n",
    "   \"startDate\": \"2010-07-16\",\n",
    "   \"endDate\": \"2016-12-31\"\n",
    " }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Methods - *measurementTechnique* \n",
    "Revise the example text below "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Methods: The stations are equipped with standard sensors for measuring meteorological parameters, solar radiation, soil temperature and moisture, and groundwater temperature and elevation. Note that some sensor locations are relative to nearby vegetation and bog microtopographic features (i.e., hollows and hummocks). See Table 1 in the attached pdf (SPRUCE_EM_DATA_2010_2016_20170620) for a list of measurements and further details. Sensors and data loggers were initially installed and became operational in June, July, and August of 2010. Additional sensors were added in September 2011. Station 3 was removed from service on May 12, 2014. These data are considered at Quality Level 1. Level 1 indicates an internally consistent data product that has been subjected to quality checks and data management procedures. Established calibration procedures were followed.\n"
     ]
    }
   ],
   "source": [
    "methods = ['The stations are equipped with standard sensors for measuring meteorological parameters, solar radiation, soil temperature and moisture, and groundwater temperature and elevation. Note that some sensor locations are relative to nearby vegetation and bog microtopographic features (i.e., hollows and hummocks). See Table 1 in the attached pdf (SPRUCE_EM_DATA_2010_2016_20170620) for a list of measurements and further details. Sensors and data loggers were initially installed and became operational in June, July, and August of 2010. Additional sensors were added in September 2011. Station 3 was removed from service on May 12, 2014. These data are considered at Quality Level 1. Level 1 indicates an internally consistent data product that has been subjected to quality checks and data management procedures. Established calibration procedures were followed.']\n",
    "\n",
    "# DO NOT EDIT\n",
    "print(f'Methods: {methods[0]}') "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Funding Information - *funder*\n",
    "Revise the example text below to include information about project funding. All datasets submitted to ESS-DIVE *should* include DOE BER in their list of funders (see below), except in special cases."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [],
   "source": [
    "funding = {\n",
    "   \"@id\": \"http://dx.doi.org/10.13039/100006206\",\n",
    "   \"name\": \"U.S. DOE > Office of Science > Biological and Environmental Research (BER)\"\n",
    " }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 4. Create the JSON LD\n",
    "The below cell incorporates all of your previous entries and creates your JSON LD object. \n",
    "<br> **There are three additional fields you may wish to revise: \"@id,\" \"datePublished,\" and \"license.\" If your dataset has been previously published and you would like to publish it on ESS-DIVE using the *same* DOI, enter this DOI in the @id field (as seen in example). datePublished will assume current date, unless your dataset has been previously published, in which case you should enter the year of publication (2015 in example). Additionally, all datasets have default license: http://creativecommons.org/licenses/by/4.0/** "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "id": "hvW7MFOs7f3R"
   },
   "outputs": [],
   "source": [
    "json_ld = {\n",
    " \"@context\": \"http://schema.org/\",\n",
    " \"@type\": \"Dataset\",\n",
    " \"@id\": \"http://dx.doi.org/10.3334/CDIAC/spruce.001\",\n",
    " \"name\": dataset_title,\n",
    " \"description\": abstract,\n",
    " \"creator\": creators,\n",
    " \"datePublished\": \"2015\",\n",
    " \"keywords\": keywords_list,\n",
    " \"variableMeasured\": variables_list,\n",
    " \"license\": \"http://creativecommons.org/licenses/by/4.0/\",\n",
    " \"spatialCoverage\": geographic_info,\n",
    " \"funder\": funding,\n",
    " \"temporalCoverage\": start_end_dates,\n",
    " \"editor\": contact,\n",
    " \"provider\": provider_details,\n",
    " \"measurementTechnique\": methods\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EiszmKKJCsxg"
   },
   "source": [
    "## 5. Submit your dataset\n",
    "There are three options for creating a new dataset:\n",
    "\n",
    "*   submit metadata only\n",
    "*   submit metadata and a single data file\n",
    "*   submit metadata and multiple data files \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZUBkHPbk7gCS"
   },
   "source": [
    "### Metadata Only\n",
    "Use the following cell to submit only your JSON-LD object. This will create a dataset with only metadata and no files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {
    "id": "nAkr83H37pJM"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "View URL:https://data-sandbox.ess-dive.lbl.gov/view/ess-dive-9de3d58449bb69a-20230511T050754345911\n",
      "Name:SPRUCE S1 Bog Environmental Monitoring Data: 2010-2016\n"
     ]
    }
   ],
   "source": [
    "post_packages_url = \"{}{}\".format(base,endpoint)\n",
    "post_package_response = requests.post(post_packages_url,\n",
    "                                      headers={\"Authorization\":header_authorization},\n",
    "                                      json=json_ld)\n",
    "\n",
    "if post_package_response.status_code == 201:\n",
    "    # Success\n",
    "    response=post_package_response.json()\n",
    "    print(f\"View URL:{response['viewUrl']}\")\n",
    "    print(f\"Name:{response['dataset']['name']}\")\n",
    "else:\n",
    "    # There was an error\n",
    "    print(post_package_response.text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "zXca7qo68fE-"
   },
   "source": [
    "### Metadata and Single Data File\n",
    "To submit the JSON-LD object along with a data file, use the following cell block. Replace \"file_path\" with the path to your file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "b1kHlWYY8in-"
   },
   "outputs": [],
   "source": [
    "files_tuples_array = []\n",
    "upload_file = \"file_path\"\n",
    "\n",
    "files_tuples_array.append(((\"json-ld\", json.dumps(json_ld))))\n",
    "files_tuples_array.append((\"data\", open(upload_file ,'rb')))\n",
    "\n",
    "post_packages_url = \"{}{}\".format(base,endpoint)\n",
    "post_package_response = requests.post(post_packages_url,\n",
    "                                    headers={\"Authorization\":header_authorization},\n",
    "                                    files= files_tuples_array)\n",
    "\n",
    "if post_package_response.status_code == 201:\n",
    "    # Success\n",
    "    response=post_package_response.json()\n",
    "    print(f\"View URL:{response['viewUrl']}\")\n",
    "    print(f\"Name:{response['dataset']['name']}\")\n",
    "else:\n",
    "    # There was an error\n",
    "    print(post_package_response.text)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "lsZTNg9FAivU"
   },
   "source": [
    "### Metadata and Multiple Data Files\n",
    "If you have many files to be uploaded, you can place them all inside a directory named 'files' and use the following code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "KM20zaFwAmEh"
   },
   "outputs": [],
   "source": [
    "files_tuples_array = []\n",
    "files_upload_directory = \"/Users/emilyarobles/Desktop/API_TEST_FILES/\"\n",
    "files = os.listdir(files_upload_directory)\n",
    "\n",
    "files_tuples_array.append(((\"json-ld\", json.dumps(json_ld))))\n",
    "\n",
    "for filename in files:\n",
    "   file_directory = files_upload_directory + filename\n",
    "   files_tuples_array.append(((\"data\", open(file_directory, 'rb'))))\n",
    "\n",
    "post_packages_url = \"{}{}\".format(base,endpoint)\n",
    "post_package_response = requests.post(post_packages_url,\n",
    "                                    headers={\"Authorization\":header_authorization},\n",
    "                                    files= files_tuples_array)\n",
    "\n",
    "if post_package_response.status_code == 201:\n",
    "   # Success\n",
    "   response=post_package_response.json()\n",
    "   print(f\"View URL:{response['viewUrl']}\")\n",
    "   print(f\"Name:{response['dataset']['name']}\")\n",
    "else:\n",
    "   # There was an error\n",
    "   print(post_package_response.text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "FbZX7HQD_Get"
   },
   "source": [
    "# Revise Existing Datasets\n",
    "It is possible to both update the metadata and data of an existing dataset.  The following update scenarios are possible \n",
    "\n",
    "*   update metadata only\n",
    "*   replace/add files only\n",
    "*   both update metadata and replace/add files. \n",
    "\n",
    "These examples will demonstrate both updating metadata and adding new files to the dataset created in previous sections."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "wswTOzr1BbD1"
   },
   "source": [
    "### Update metadata only\n",
    "Use the PUT function to update the metadata of a dataset.  This example updates the title (name) of a dataset. You will need the ESS-DIVE identifier of the dataset that you want to revise. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "3RsF5D4LC_ll"
   },
   "outputs": [],
   "source": [
    "dataset_id = input('Enter an ESS-DIVE Identifier here: ')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "gBHjV8GPCzQr"
   },
   "outputs": [],
   "source": [
    "put_package_url = \"{}{}/{}\".format(base,endpoint, dataset_id)\n",
    "\n",
    "metadata_update_dict = {\"name\": \"Updated Dataset Name\"}\n",
    "\n",
    "put_package_response = requests.put(put_package_url,\n",
    "                                    headers={\"Authorization\":header_authorization},\n",
    "                                    json=metadata_update_dict)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Yf4LKPf2By8E"
   },
   "source": [
    "Check the results for the changed metadata attribute"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "7XaUOvneBzbM"
   },
   "outputs": [],
   "source": [
    "# Check for errors\n",
    "if put_package_response.status_code == 200:\n",
    "   # Success\n",
    "   response=put_package_response.json()\n",
    "   print(f\"View URL:{response['viewUrl']}\")\n",
    "   print(f\"Name:{response['dataset']['name']}\")\n",
    "else:\n",
    "   # There was an error\n",
    "   print(put_package_response.text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9gAwOeh4CC0Q"
   },
   "source": [
    "### Metadata plus a new data file\n",
    "Use the PUT function to update a dataset.  This example updates the date published to 2019 of a dataset and adds a new data file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "V-C13qxFC_ll"
   },
   "outputs": [],
   "source": [
    "dataset_id = input('Enter an ESS-DIVE Identifier here: ')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Zs9cgaT4CDEb"
   },
   "outputs": [],
   "source": [
    "files_tuples_array = []\n",
    "upload_file = \"path/to/your_file\"\n",
    "files_tuples_array.append(((\"json-ld\", json.dumps(metadata_update_dict))))\n",
    "files_tuples_array.append((\"data\", open(upload_file ,'rb')))\n",
    "\n",
    "put_package_url = \"{}{}/{}\".format(base,endpoint, dataset_id)\n",
    "\n",
    "\n",
    "\n",
    "put_package_response = requests.put(put_package_url,\n",
    "                                   headers={\"Authorization\":header_authorization},\n",
    "                                   files= files_tuples_array)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0bVDRztkCKwy"
   },
   "source": [
    "Check the results for the changed metadata attribute.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "-ul8pIEyCK9_"
   },
   "outputs": [],
   "source": [
    "# Check for errors\n",
    "if put_package_response.status_code == 200:\n",
    "    # Success\n",
    "    response=put_package_response.json()\n",
    "    print(f\"View URL:{response['viewUrl']}\")\n",
    "    print(f\"Date Published:{response['dataset']['datePublished']}\")\n",
    "    print(f\"Files In Dataset:{response['dataset']['distribution']}\")\n",
    "else:\n",
    "   # There was an error\n",
    "   print(put_package_response.text)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Cy0iTU6pCaec"
   },
   "source": [
    "Check the results for the added metadata attribute."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "-aj_RkdqCapf"
   },
   "outputs": [],
   "source": [
    "get_packages_url = \"{}{}\".format(base,endpoint)\n",
    "get_packages_response = requests.get(get_packages_url, \n",
    "    headers={\"Authorization\":header_authorization})\n",
    "\n",
    "if get_packages_response.status_code == 200:\n",
    "   #Success\n",
    "   print(get_packages_response.json())\n",
    "else:\n",
    "   # There was an error\n",
    "   print(get_packages_response.text)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.12"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "9a1b718673da41abb24fe4326cb18b9b": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "DescriptionStyleModel",
     "state": {
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "DescriptionStyleModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "StyleView",
      "description_width": ""
     }
    },
    "b205804e6f5c4f238b499e832ee13936": {
     "model_module": "@jupyter-widgets/base",
     "model_module_version": "1.2.0",
     "model_name": "LayoutModel",
     "state": {
      "_model_module": "@jupyter-widgets/base",
      "_model_module_version": "1.2.0",
      "_model_name": "LayoutModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/base",
      "_view_module_version": "1.2.0",
      "_view_name": "LayoutView",
      "align_content": null,
      "align_items": null,
      "align_self": null,
      "border": null,
      "bottom": null,
      "display": null,
      "flex": null,
      "flex_flow": null,
      "grid_area": null,
      "grid_auto_columns": null,
      "grid_auto_flow": null,
      "grid_auto_rows": null,
      "grid_column": null,
      "grid_gap": null,
      "grid_row": null,
      "grid_template_areas": null,
      "grid_template_columns": null,
      "grid_template_rows": null,
      "height": null,
      "justify_content": null,
      "justify_items": null,
      "left": null,
      "margin": null,
      "max_height": null,
      "max_width": null,
      "min_height": null,
      "min_width": null,
      "object_fit": null,
      "object_position": null,
      "order": null,
      "overflow": null,
      "overflow_x": null,
      "overflow_y": null,
      "padding": null,
      "right": null,
      "top": null,
      "visibility": null,
      "width": null
     }
    },
    "ef68fd157bb6494fbf438cc0c6b5f447": {
     "model_module": "@jupyter-widgets/controls",
     "model_module_version": "1.5.0",
     "model_name": "TextModel",
     "state": {
      "_dom_classes": [],
      "_model_module": "@jupyter-widgets/controls",
      "_model_module_version": "1.5.0",
      "_model_name": "TextModel",
      "_view_count": null,
      "_view_module": "@jupyter-widgets/controls",
      "_view_module_version": "1.5.0",
      "_view_name": "TextView",
      "continuous_update": true,
      "description": "Token:",
      "description_tooltip": null,
      "disabled": false,
      "layout": "IPY_MODEL_b205804e6f5c4f238b499e832ee13936",
      "placeholder": "​",
      "style": "IPY_MODEL_9a1b718673da41abb24fe4326cb18b9b",
      "value": ""
     }
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}