{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# ODM2 API: Retrieve, manipulate and visualize ODM2 water quality measurement-type data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example shows how to use the ODM2 Python API (`odm2api`) to connect to an ODM2 database, retrieve data, and analyze and visualize the data. The [database (iUTAHGAMUT_waterquality_measurementresults_ODM2.sqlite)](https://github.com/ODM2/ODM2PythonAPI/blob/master/Examples/data/iUTAHGAMUT_waterquality_measurementresults_ODM2.sqlite) contains [\"measurement\"-type results](http://vocabulary.odm2.org/resulttype/measurement/).\n", "\n", "This example uses SQLite for the database because it doesn't require a server. However, the ODM2 Python API demonstrated here can also be used with ODM2 databases implemented in MySQL, PostgreSQL or Microsoft SQL Server.\n", "\n", "More details on the ODM2 Python API, its source code, and its latest development can be found at https://github.com/ODM2/ODM2PythonAPI\n", "\n", "Adapted from notebook https://github.com/BiG-CZ/wshp2017_tutorial_content/blob/master/notebooks/ODM2_Example3.ipynb, based in part on earlier code and an ODM2 database from [Jeff Horsburgh's group](http://jeffh.usu.edu) at Utah State University.\n", "\n", "[Emilio Mayorga](https://github.com/emiliom/)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/home/mayorga/miniconda/envs/odm2client/lib/python2.7/site-packages/folium/__init__.py:59: UserWarning: This version of folium is the last to support Python 2. Transition to Python 3 to be able to receive updates and fixes. 
Check out https://python3statement.org/ for more info.\n", " UserWarning\n" ] } ], "source": [ "import os\n", "import datetime\n", "\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "\n", "from shapely.geometry import Point\n", "import pandas as pd\n", "import geopandas as gpd\n", "import folium\n", "from folium.plugins import MarkerCluster\n", "\n", "import odm2api\n", "from odm2api.ODMconnection import dbconnection\n", "import odm2api.services.readService as odm2rs\n", "from odm2api.models import SamplingFeatures" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'2019-05-12 01:58:17.943605 UTC'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\"{} UTC\".format(datetime.datetime.utcnow())" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(u'0.24.2', u'0.5.0', u'0.8.3')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.__version__, gpd.__version__, folium.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**odm2api version used** to run this notebook:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "u'0.7.2'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "odm2api.__version__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Connect to the ODM2 SQLite Database\n", "\n", "This example uses an ODM2 SQLite database file loaded with water quality sample data from multiple monitoring sites in the [iUTAH](https://iutahepscor.org/) Gradients Along Mountain to Urban Transitions ([GAMUT](http://data.iutahepscor.org/mdf/Data/Gamut_Network/)) water quality monitoring network. Water quality samples have been collected and analyzed for nitrogen, phosphorus, total coliform, E-coli, and some water isotopes. 
The [database (iUTAHGAMUT_waterquality_measurementresults_ODM2.sqlite)](https://github.com/ODM2/ODM2PythonAPI/blob/master/Examples/data/iUTAHGAMUT_waterquality_measurementresults_ODM2.sqlite) contains [\"measurement\"-type results](http://vocabulary.odm2.org/resulttype/measurement/).\n", "\n", "The example database is located in the `data` sub-directory." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Assign directory paths and SQLite file name\n", "dbname_sqlite = \"iUTAHGAMUT_waterquality_measurementresults_ODM2.sqlite\"\n", "\n", "sqlite_pth = os.path.join(\"data\", dbname_sqlite)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Database connection successful!\n" ] } ], "source": [ "try:\n", "    session_factory = dbconnection.createConnection('sqlite', sqlite_pth)\n", "    read = odm2rs.ReadODM2(session_factory)\n", "    print(\"Database connection successful!\")\n", "except Exception as e:\n", "    print(\"Unable to establish connection to the database: {}\".format(e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run Some Basic Queries on the ODM2 Database\n", "\n", "This section shows how to use the API to run both simple and more advanced queries on the ODM2 database, and how to examine the query output conveniently with Python tools.\n", "\n", "Simple query functions like **getVariables( )** return objects similar to the entities in ODM2, and individual attributes can then be retrieved from the objects returned. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Get all Variables\n", "A simple query with simple output." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>NoDataValue</th><th>SpeciationCV</th><th>VariableCode</th><th>VariableDefinition</th><th>VariableNameCV</th><th>VariableTypeCV</th><th>_sa_instance_state</th></tr>\n",
"    <tr><th>VariableID</th><th></th><th></th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>-9999.0000000000</td><td>N</td><td>TN</td><td>None</td><td>Nitrogen, total</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>2</th><td>-9999.0000000000</td><td>P</td><td>TP</td><td>None</td><td>Phosphorus, total</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>3</th><td>-9999.0000000000</td><td>N</td><td>Nitrate</td><td>None</td><td>Nitrogen, dissolved nitrite (NO2) + nitrate (NO3)</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>4</th><td>-9999.0000000000</td><td>N</td><td>Ammonia</td><td>None</td><td>Nitrogen, NH4</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>5</th><td>-9999.0000000000</td><td>P</td><td>Phosphate</td><td>None</td><td>Phosphorus, orthophosphate dissolved</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>6</th><td>-9999.0000000000</td><td>Not Applicable</td><td>Tcoliform</td><td>None</td><td>Coliform, total</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>7</th><td>-9999.0000000000</td><td>Not Applicable</td><td>E-Coli</td><td>None</td><td>E-coli</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>8</th><td>-9999.0000000000</td><td>C</td><td>DOC</td><td>None</td><td>Carbon, dissolved organic</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>9</th><td>-9999.0000000000</td><td>N</td><td>TDN</td><td>None</td><td>Nitrogen, total dissolved</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>10</th><td>-9999.0000000000</td><td>Not Applicable</td><td>Abs254</td><td>None</td><td>Absorbance</td><td>Water quality</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " NoDataValue SpeciationCV VariableCode VariableDefinition \\\n", "VariableID \n", "1 -9999.0000000000 N TN None \n", "2 -9999.0000000000 P TP None \n", "3 -9999.0000000000 N Nitrate None \n", "4 -9999.0000000000 N Ammonia None \n", "5 -9999.0000000000 P Phosphate None \n", "6 -9999.0000000000 Not Applicable Tcoliform None \n", "7 -9999.0000000000 Not Applicable E-Coli None \n", "8 -9999.0000000000 C DOC None \n", "9 -9999.0000000000 N TDN None \n", "10 -9999.0000000000 Not Applicable Abs254 None \n", "\n", " VariableNameCV VariableTypeCV \\\n", "VariableID \n", "1 Nitrogen, total Water quality \n", "2 Phosphorus, total Water quality \n", "3 Nitrogen, dissolved nitrite (NO2) + nitrate (NO3) Water quality \n", "4 Nitrogen, NH4 Water quality \n", "5 Phosphorus, orthophosphate dissolved Water quality \n", "6 Coliform, total Water quality \n", "7 E-coli Water quality \n", "8 Carbon, dissolved organic Water quality \n", "9 Nitrogen, total dissolved Water quality \n", "10 Absorbance Water quality \n", "\n", " _sa_instance_state \n", "VariableID \n", "1 \n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
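<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>PersonFirstName</th><th>PersonID</th><th>PersonLastName</th><th>PersonMiddleName</th><th>_sa_instance_state</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>Nancy</td><td>1</td><td>Mesner</td><td></td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>1</th><td>Dane</td><td>2</td><td>Brophy</td><td></td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>2</th><td>Ben</td><td>3</td><td>Rider</td><td></td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>3</th><td>Michelle</td><td>4</td><td>Baker</td><td></td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"    <tr><th>4</th><td>Erin</td><td>5</td><td>Jones</td><td></td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td></tr>\n",
"  </tbody>\n",
"</table>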
\n", "" ], "text/plain": [ " PersonFirstName PersonID PersonLastName PersonMiddleName \\\n", "0 Nancy 1 Mesner \n", "1 Dane 2 Brophy \n", "2 Ben 3 Rider \n", "3 Michelle 4 Baker \n", "4 Erin 5 Jones \n", "\n", " _sa_instance_state \n", "0 \n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
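<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>ElevationDatumCV</th><th>Elevation_m</th><th>FeatureGeometryWKT</th><th>Latitude</th><th>Longitude</th><th>SamplingFeatureCode</th><th>SamplingFeatureDescription</th><th>SamplingFeatureGeotypeCV</th><th>SamplingFeatureID</th><th>SamplingFeatureName</th><th>SamplingFeatureTypeCV</th><th>SamplingFeatureUUID</th><th>SiteTypeCV</th><th>SpatialReferenceID</th><th>_sa_instance_state</th><th>geometry</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>EGM96</td><td>1356.0</td><td>None</td><td>40.745078</td><td>-111.854449</td><td>RB_1300E</td><td>None</td><td>None</td><td>1</td><td>Red Butte Creek at 1300E (downstream of spring)</td><td>Site</td><td>0DDE8EF6-EC2F-42C0-AB50-20C6C02E89B2</td><td>Stream</td><td>1</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td><td>POINT (-111.854449 40.745078)</td></tr>\n",
"    <tr><th>1</th><td>EGM96</td><td>1356.0</td><td>None</td><td>40.745106</td><td>-111.854389</td><td>RB_1300ESpring</td><td>None</td><td>None</td><td>2</td><td>Spring that enters Red Butte Creek at 1300E</td><td>Site</td><td>9848BBFE-EA3F-4918-A324-13E8EDE5381C</td><td>Spring</td><td>1</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td><td>POINT (-111.854389 40.745106)</td></tr>\n",
"    <tr><th>2</th><td>EGM96</td><td>1289.0</td><td>None</td><td>40.741583</td><td>-111.917667</td><td>RB_900W_BA</td><td>None</td><td>None</td><td>3</td><td>Red Butte Creek terminus at Jordan River at 13...</td><td>Site</td><td>688017BC-9E02-4444-A21D-270366BE2348</td><td>Stream</td><td>1</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td><td>POINT (-111.917667 40.741583)</td></tr>\n",
"    <tr><th>3</th><td>EGM96</td><td>1519.0</td><td>None</td><td>40.766134</td><td>-111.826530</td><td>RB_Amphitheater</td><td>None</td><td>None</td><td>4</td><td>Red Butte Creek below Red Butte Garden Amphith...</td><td>Site</td><td>9CFE685B-5CDA-4E38-98D9-406D645C7D21</td><td>Stream</td><td>1</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td><td>POINT (-111.82653 40.766134)</td></tr>\n",
"    <tr><th>4</th><td>EGM96</td><td>1648.0</td><td>None</td><td>40.779602</td><td>-111.806669</td><td>RB_ARBR_AA</td><td>None</td><td>None</td><td>5</td><td>Red Butte Creek above Red Butte Reservoir Adan...</td><td>Site</td><td>98C7F63A-FDFB-4898-87C6-5AA8EC34D1E4</td><td>Stream</td><td>1</td><td>&lt;sqlalchemy.orm.state.InstanceState object at ...</td><td>POINT (-111.806669 40.779602)</td></tr>\n",
"  </tbody>\n",
"</table>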
\n", "" ], "text/plain": [ " ElevationDatumCV Elevation_m FeatureGeometryWKT Latitude Longitude \\\n", "0 EGM96 1356.0 None 40.745078 -111.854449 \n", "1 EGM96 1356.0 None 40.745106 -111.854389 \n", "2 EGM96 1289.0 None 40.741583 -111.917667 \n", "3 EGM96 1519.0 None 40.766134 -111.826530 \n", "4 EGM96 1648.0 None 40.779602 -111.806669 \n", "\n", " SamplingFeatureCode SamplingFeatureDescription SamplingFeatureGeotypeCV \\\n", "0 RB_1300E None None \n", "1 RB_1300ESpring None None \n", "2 RB_900W_BA None None \n", "3 RB_Amphitheater None None \n", "4 RB_ARBR_AA None None \n", "\n", " SamplingFeatureID SamplingFeatureName \\\n", "0 1 Red Butte Creek at 1300E (downstream of spring) \n", "1 2 Spring that enters Red Butte Creek at 1300E \n", "2 3 Red Butte Creek terminus at Jordan River at 13... \n", "3 4 Red Butte Creek below Red Butte Garden Amphith... \n", "4 5 Red Butte Creek above Red Butte Reservoir Adan... \n", "\n", " SamplingFeatureTypeCV SamplingFeatureUUID SiteTypeCV \\\n", "0 Site 0DDE8EF6-EC2F-42C0-AB50-20C6C02E89B2 Stream \n", "1 Site 9848BBFE-EA3F-4918-A324-13E8EDE5381C Spring \n", "2 Site 688017BC-9E02-4444-A21D-270366BE2348 Stream \n", "3 Site 9CFE685B-5CDA-4E38-98D9-406D645C7D21 Stream \n", "4 Site 98C7F63A-FDFB-4898-87C6-5AA8EC34D1E4 Stream \n", "\n", " SpatialReferenceID _sa_instance_state \\\n", "0 1 " ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# A trivial but easy-to-generate GeoPandas plot\n", "gdf.plot();" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A site has a `SiteTypeCV`. Let's examine the site type distribution, and use that information to create a new GeoDataFrame column to specify a map marker color by `SiteTypeCV`." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Stream 24\n", "Spring 1\n", "Name: SiteTypeCV, dtype: int64" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "gdf['SiteTypeCV'].value_counts()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "gdf[\"color\"] = gdf.apply(lambda feat: 'green' if feat['SiteTypeCV'] == 'Stream' else 'red', axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: While the database holds a copy of the **ODM2 Controlled Vocabularies**, the complete description of each CV term is available from a web request to the CV API at http://vocabulary.odm2.org. Want to know more about how a \"spring\" is defined? Here's one simple way, using `Pandas` to access and parse the CSV web service response." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>term</th><th>name</th><th>definition</th><th>category</th><th>provenance</th><th>provenance_uri</th><th>note</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>spring</td><td>Spring</td><td>A location at which the water table intersects...</td><td>Spring Sites</td><td>Adapted from USGS Site Types.</td><td>NaN</td><td>http://wdr.water.usgs.gov/nwisgmap/help/sitety...</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " term name definition \\\n", "0 spring Spring A location at which the water table intersects... \n", "\n", " category provenance provenance_uri \\\n", "0 Spring Sites Adapted from USGS Site Types. NaN \n", "\n", " note \n", "0 http://wdr.water.usgs.gov/nwisgmap/help/sitety... " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sitetype = 'spring'\n", "pd.read_csv(\"http://vocabulary.odm2.org/api/v1/sitetype/{}/?format=csv\".format(sitetype))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Now we'll create an interactive and helpful `Folium` map of the sites.** This map features:\n", "- Automatic panning to the location of the sites (no hard wiring, except for the zoom scale), based on GeoPandas functionality and information from the ODM2 Site Sampling Features\n", "- Color coding by `SiteTypeCV` \n", "- Marker clustering\n", "- Simple marker pop ups with content from the ODM2 Site Sampling Features" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c = gdf.unary_union.centroid\n", "m = folium.Map(location=[c.y, c.x], tiles='CartoDB positron', zoom_start=11)\n", "\n", "marker_cluster = MarkerCluster().add_to(m)\n", "for idx, feature in gdf.iterrows():\n", "    folium.Marker(location=[feature.geometry.y, feature.geometry.x],\n", "                  icon=folium.Icon(color=feature['color']),\n", "                  popup=\"{0} ({1}): {2}\".format(\n", "                      feature['SamplingFeatureCode'], feature['SiteTypeCV'],\n", "                      feature['SamplingFeatureName'])\n", "                  ).add_to(marker_cluster)\n", "\n", "\n", "# Done with setup. Now render the map\n", "m" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Add a new Sampling Feature\n", "This example illustrates how to add a new entry. We won't \"commit\" (save) the sampling feature to the database." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "New sampling feature created, but not saved to database.\n", "\n", "\n" ] } ], "source": [ "sitesf0 = siteFeatures[0]\n", "\n", "try:\n", "    newsf = SamplingFeatures()\n", "    session = session_factory.getSession()\n", "    newsf.FeatureGeometryWKT = \"POINT(-111.946 41.718)\"\n", "    newsf.Elevation_m = 100\n", "    newsf.ElevationDatumCV = sitesf0.ElevationDatumCV\n", "    newsf.SamplingFeatureCode = \"TestSF\"\n", "    newsf.SamplingFeatureDescription = \"this is a test to add a sampling feature\"\n", "    newsf.SamplingFeatureGeotypeCV = \"Point\"\n", "    newsf.SamplingFeatureTypeCV = sitesf0.SamplingFeatureTypeCV\n", "    newsf.SamplingFeatureUUID = sitesf0.SamplingFeatureUUID + \"2\"\n", "    session.add(newsf)\n", "    # To save the new sampling feature, do session.commit()\n", "    print(\"New sampling feature created, but not saved to database.\\n\")\n", "    print(newsf)\n", "except Exception as e:\n", "    print(\"error adding a sampling feature: {}\".format(e))" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "### Get Objects and Related Objects from the Database (SamplingFeatures example)\n", "\n", "This code shows some examples of how objects and related objects can be retrieved using the API. In the following, we use the **getSamplingFeatures( )** function to return a particular sampling feature by passing in its SamplingFeatureCode. This function returns a list of SamplingFeature objects, so just get the first one in the returned list." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "odm2api.models.Sites" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get the SamplingFeature object for a particular SamplingFeature by passing its SamplingFeatureCode\n", "sf = read.getSamplingFeatures(codes=['RB_1300E'])[0]\n", "type(sf)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'ElevationDatumCV': u'EGM96',\n", " 'Elevation_m': 1356.0,\n", " 'FeatureGeometryWKT': None,\n", " 'Latitude': 40.745078,\n", " 'Longitude': -111.854449,\n", " 'SamplingFeatureCode': u'RB_1300E',\n", " 'SamplingFeatureDescription': None,\n", " 'SamplingFeatureGeotypeCV': None,\n", " 'SamplingFeatureID': 1,\n", " 'SamplingFeatureName': u'Red Butte Creek at 1300E (downstream of spring)',\n", " 'SamplingFeatureTypeCV': u'Site',\n", " 'SamplingFeatureUUID': u'0DDE8EF6-EC2F-42C0-AB50-20C6C02E89B2',\n", " 'SiteTypeCV': u'Stream',\n", " 'SpatialReferenceID': 1,\n", " '_sa_instance_state': }" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Simple way to examine the content (properties) of a Python object, as if it were a dictionary\n", "vars(sf)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can also drill down and get objects linked by foreign keys. 
The API returns related objects in a nested hierarchy so they can be interrogated in an object-oriented way. So, if I use the **getResults( )** function to return a Result from the database (e.g., a \"Measurement\" Result), I also get the associated Action that created that Result (e.g., a \"Specimen analysis\" Action)." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('The FeatureAction object for the Result is: ', )\n", "('The Action object for the Result is: ', )\n", "\n", "The following are some of the attributes for the Action that created the Result: \n", "ActionTypeCV: Specimen analysis\n", "ActionDescription: None\n", "BeginDateTime: 2014-10-30 00:00:00\n", "EndDateTime: None\n", "MethodName: Astoria Total Phosphorus\n", "MethodDescription: Determination of total phosphorus by persulphate oxidation digestion and ascorbic acid method\n" ] } ], "source": [ "try:\n", "    # Call getResults, but return only the first Result\n", "    firstResult = read.getResults()[0]\n", "    frfa = firstResult.FeatureActionObj\n", "    frfaa = firstResult.FeatureActionObj.ActionObj\n", "    print(\"The FeatureAction object for the Result is: \", frfa)\n", "    print(\"The Action object for the Result is: \", frfaa)\n", "\n", "    # Print some Action attributes in a more human readable form\n", "    print(\"\\nThe following are some of the attributes for the Action that created the Result: \")\n", "    print(\"ActionTypeCV: {}\".format(frfaa.ActionTypeCV))\n", "    print(\"ActionDescription: {}\".format(frfaa.ActionDescription))\n", "    print(\"BeginDateTime: {}\".format(frfaa.BeginDateTime))\n", "    print(\"EndDateTime: {}\".format(frfaa.EndDateTime))\n", "    print(\"MethodName: {}\".format(frfaa.MethodObj.MethodName))\n", "    print(\"MethodDescription: {}\".format(frfaa.MethodObj.MethodDescription))\n", "except Exception as e:\n", "    print(\"Unable to demo Foreign Key Example: {}\".format(e))" ] }, { "cell_type": "markdown", 
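"metadata": {}, "source": [ "The first two lines of the output above appear as tuples because this notebook was run under Python 2, where `print(a, b)` prints a tuple. The following is a minimal sketch, assuming `frfa` and `frfaa` are still defined from the previous cell; it uses `str.format`, as the rest of this notebook does, so the output should read the same under Python 2 and Python 3." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch only: reuses frfa and frfaa from the cell above (assumed to be defined)\n", "print(\"The FeatureAction object for the Result is: {}\".format(frfa))\n", "print(\"The Action object for the Result is: {}\".format(frfaa))" ] }, { "cell_type": "markdown", 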
"metadata": {}, "source": [ "### Get a Result and its Attributes\n", "\n", "Because all of the objects are returned in a nested form, if you retrieve a result, you can interrogate it to get all of its related attributes. When a Result object is returned, it includes objects that contain information about Variable, Units, ProcessingLevel, and the related Action that created that Result." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "------- Example of Retrieving Attributes of a Result -------\n", "The following are some of the attributes for the Result retrieved: \n", "ResultID: 1\n", "ResultTypeCV: Measurement\n", "ValueCount: 1\n", "ProcessingLevel: Raw Data\n", "SampledMedium: Liquid aqueous\n", "Variable: TP: Phosphorus, total\n", "Units: milligrams per liter\n", "SamplingFeatureID: 26\n", "SamplingFeatureCode: 3\n" ] } ], "source": [ "print(\"------- Example of Retrieving Attributes of a Result -------\")\n", "try:\n", " firstResult = read.getResults()[0]\n", " frfa = firstResult.FeatureActionObj\n", " print(\"The following are some of the attributes for the Result retrieved: \")\n", " print(\"ResultID: {}\".format(firstResult.ResultID))\n", " print(\"ResultTypeCV: {}\".format(firstResult.ResultTypeCV))\n", " print(\"ValueCount: {}\".format(firstResult.ValueCount))\n", " print(\"ProcessingLevel: {}\".format(firstResult.ProcessingLevelObj.Definition))\n", " print(\"SampledMedium: {}\".format(firstResult.SampledMediumCV))\n", " print(\"Variable: {}: {}\".format(firstResult.VariableObj.VariableCode, \n", " firstResult.VariableObj.VariableNameCV))\n", " print(\"Units: {}\".format(firstResult.UnitsObj.UnitsName))\n", " print(\"SamplingFeatureID: {}\".format(frfa.SamplingFeatureObj.SamplingFeatureID))\n", " print(\"SamplingFeatureCode: {}\".format(frfa.SamplingFeatureObj.SamplingFeatureCode))\n", "except Exception as e:\n", " print(\"Unable to demo example of retrieving 
Attributes of a Result: {}\".format(e))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The last block of code returns a particular Measurement Result. From that I can get the SamplingFeatureID (in this case 26) for the Specimen from which the Result was generated. But if I want to figure out at which Site the Specimen was collected, I need to query the database to get the related Site SamplingFeature. I can use **getRelatedSamplingFeatures( )** for this. Once I've got the SamplingFeature for the Site, I can get the rest of the SamplingFeature attributes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Retrieve the \"Related\" Site at which a Specimen was collected" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "# Pass the Sampling Feature ID of the specimen, and the relationship type\n", "relatedSite = read.getRelatedSamplingFeatures(sfid=26, relationshiptype='Was Collected at')[0]" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'ElevationDatumCV': u'EGM96',\n", " 'Elevation_m': 1356.0,\n", " 'FeatureGeometryWKT': None,\n", " 'Latitude': 40.745078,\n", " 'Longitude': -111.854449,\n", " 'SamplingFeatureCode': u'RB_1300E',\n", " 'SamplingFeatureDescription': None,\n", " 'SamplingFeatureGeotypeCV': None,\n", " 'SamplingFeatureID': 1,\n", " 'SamplingFeatureName': u'Red Butte Creek at 1300E (downstream of spring)',\n", " 'SamplingFeatureTypeCV': u'Site',\n", " 'SamplingFeatureUUID': u'0DDE8EF6-EC2F-42C0-AB50-20C6C02E89B2',\n", " 'SiteTypeCV': u'Stream',\n", " 'SpatialReferenceID': 1,\n", " '_sa_instance_state': }" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "vars(relatedSite)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "-----------------------------------------\n", "\n", "## Return Results and Data Values for a Particular Site/Variable\n", "\n", "From the list 
of Variables returned above and the information about the SamplingFeature I queried above, I know that VariableID = 2 for Total Phosphorus and SiteID = 1 for the Red Butte Creek site at 1300E. I can use the **getResults( )** function to get all of the Total Phosphorus results for this site by passing in the VariableID and the SiteID." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "siteID = 1 # Red Butte Creek at 1300 E (obtained from the getRelatedSamplingFeatures query)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "18" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "v = variables_df[variables_df['VariableCode'] == 'TP']\n", "variableID = v.index[0]\n", "\n", "results = read.getResults(siteid=siteID, variableid=variableID, restype=\"Measurement\")\n", "# Get the list of ResultIDs so I can retrieve the data values associated with all of the results\n", "resultIDList = [x.ResultID for x in results]\n", "len(resultIDList)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Retrieve the Result (Data) Values, Then Create a Quick Time Series Plot of the Data\n", "\n", "Now I can retrieve all of the data values associated with the list of Results I just retrieved. In ODM2, water chemistry measurements are stored as \"Measurement\" results. Each \"Measurement\" Result has a single data value associated with it. So, for convenience, the **getResultValues( )** function allows you to pass in a list of ResultIDs so you can get the data values for all of them back in a Pandas data frame object, which is easier to work with. Once I've got the data in a Pandas data frame object, I can use the **plot( )** function directly on the data frame to create a quick visualization." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>ValueID</th><th>ResultID</th><th>DataValue</th><th>ValueDateTime</th><th>ValueDateTimeUTCOffset</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>1</td><td>0.0100</td><td>2015-10-27 13:26:24</td><td>-7</td></tr>\n",
"    <tr><th>1</th><td>100</td><td>1</td><td>0.0100</td><td>2015-11-17 13:55:12</td><td>-7</td></tr>\n",
"    <tr><th>2</th><td>109</td><td>10</td><td>0.0574</td><td>2015-05-12 14:24:00</td><td>-7</td></tr>\n",
"    <tr><th>3</th><td>10</td><td>10</td><td>0.0574</td><td>2015-06-18 12:43:12</td><td>-7</td></tr>\n",
"    <tr><th>4</th><td>198</td><td>99</td><td>0.0424</td><td>2015-10-27 13:55:12</td><td>-7</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " ValueID ResultID DataValue ValueDateTime ValueDateTimeUTCOffset\n", "0 1 1 0.0100 2015-10-27 13:26:24 -7\n", "1 100 1 0.0100 2015-11-17 13:55:12 -7\n", "2 109 10 0.0574 2015-05-12 14:24:00 -7\n", "3 10 10 0.0574 2015-06-18 12:43:12 -7\n", "4 198 99 0.0424 2015-10-27 13:55:12 -7" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get all of the data values for the Results in the list created above\n", "# Call getResultValues, which returns a Pandas Data Frame with the data\n", "resultValues = read.getResultValues(resultids=resultIDList, lowercols=False)\n", "resultValues.head()" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Plot the time sequence of Measurement Result Values \n", "ax = resultValues.plot(x='ValueDateTime', y='DataValue', title=relatedSite.SamplingFeatureName,\n", " kind='line', use_index=True, linestyle='solid', style='o')\n", "ax.set_ylabel(\"{0} ({1})\".format(results[0].VariableObj.VariableNameCV, \n", " results[0].UnitsObj.UnitsAbbreviation))\n", "ax.set_xlabel('Date/Time')\n", "ax.legend().set_visible(False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### End with a fancier plot, facilitated via a function\n", "\n", "If I'm going to reuse a series of steps, it's always helpful to write little generic functions that can be called to quickly and consistently get what we need. To conclude this demo, here's one such function that encapsulates the `VariableID`, `getResults` and `getResultValues` queries we showed above. Then we leverage it to create a nice 2-variable (2-axis) plot of TP and TN vs time, and conclude with a reminder that we have ready access to related metadata about analytical lab methods and such." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "def get_results_and_values(siteid, variablecode):\n", " v = variables_df[variables_df['VariableCode'] == variablecode]\n", " variableID = v.index[0]\n", " \n", " results = read.getResults(siteid=siteid, variableid=variableID, restype=\"Measurement\")\n", " resultIDList = [x.ResultID for x in results]\n", " resultValues = read.getResultValues(resultids=resultIDList, lowercols=False)\n", " \n", " return resultValues, results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fancy plotting, leveraging the `Pandas` plot method and `matplotlib`." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "# Plot figure and axis set up\n", "f, ax = plt.subplots(1, figsize=(13, 6))\n", "\n", "# First plot (left axis)\n", "VariableCode = 'TP'\n", "resultValues_TP, results_TP = get_results_and_values(siteID, VariableCode)\n", "resultValues_TP.plot(x='ValueDateTime', y='DataValue', label=VariableCode, \n", " style='o-', kind='line', ax=ax)\n", "ax.set_ylabel(\"{0}: {1} ({2})\".format(VariableCode, results_TP[0].VariableObj.VariableNameCV, \n", " results_TP[0].UnitsObj.UnitsAbbreviation))\n", "\n", "# Second plot (right axis)\n", "VariableCode = 'TN'\n", "resultValues_TN, results_TN = get_results_and_values(siteID, VariableCode)\n", "resultValues_TN.plot(x='ValueDateTime', y='DataValue', label=VariableCode, \n", " style='^-', kind='line', ax=ax,\n", " secondary_y=True)\n", "ax.right_ax.set_ylabel(\"{0}: {1} ({2})\".format(VariableCode, results_TN[0].VariableObj.VariableNameCV, \n", " results_TN[0].UnitsObj.UnitsAbbreviation))\n", "\n", "# Tweak the figure\n", "ax.legend(loc='upper left')\n", "ax.right_ax.legend(loc='upper right')\n", "\n", "ax.grid(True)\n", "ax.set_xlabel('')\n", "ax.set_title(relatedSite.SamplingFeatureName);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, let's show some useful metadata. Use the `Results` records and their relationship to `Actions` (via `FeatureActions`) to **extract and print out the Specimen Analysis methods used for TN and TP**. Or at least for the *first* result for each of the two variables; methods may have varied over time, but the specific method associated with each result is stored in ODM2 and available." 
] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "TP METHOD: Astoria Total Phosphorus (Determination of total phosphorus by persulphate oxidation digestion and ascorbic acid method)\n", "TN METHOD: Astoria Total Nitrogen (Determination of total Nitrogen by persulphate oxidation digestion and cadmium reduction method)\n" ] } ], "source": [ "results_faam = lambda results, i: results[i].FeatureActionObj.ActionObj.MethodObj\n", "\n", "print(\"TP METHOD: {0} ({1})\".format(results_faam(results_TP, 0).MethodName,\n", " results_faam(results_TP, 0).MethodDescription))\n", "print(\"TN METHOD: {0} ({1})\".format(results_faam(results_TN, 0).MethodName,\n", " results_faam(results_TN, 0).MethodDescription))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python [conda env:odm2client]", "language": "python", "name": "conda-env-odm2client-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.15" } }, "nbformat": 4, "nbformat_minor": 2 }