{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Getting started with Ibmdbpy - Part 3: Advanced geospatial features \n", "\n", "This notebook is an extensive guide to help you to get familiar with ibmdbpy concepts and geospatial functionalities. It showcases various operations on IdaGeoDataFrames, providing step-by-step examples to get started with the package and its geospatial extension.\n", "\n", "___\n", "\n", "### Accelerate Python analytics with in-database processing by using ibmdbpy and IBM Db2 Warehouse\n", " \n", "The ibmdbpy project provides a Python interface for data manipulation and access to in-database algorithms in IBM Db2 Warehouse. It accelerates Python analytics by seamlessly pushing operations written in Python into the underlying database for execution, thereby benefitting from in-database performance-enhancing features, such as columnar storage and parallel processing. For more details about ibmdbpy, please refer to the [documentation](https://pythonhosted.org/ibmdbpy/index.html) and to the dedicated [Git repository](https://github.com/ibmdbanalytics/ibmdbpy/tree/master/ibmdbpy). This notebook provides you with an overview of ibmdbpy geospatial extension.\n", "\n", "__About ibmdbpy's geospatial extension__\n", "\n", "Ibmdbpy supports a wrapper for spatial functions which enables you to generate and analyze spatial information about geographic features, and to store and manage the data on which this information is based. The spatial data is identified by ibmdbpy as a special class called IdaGeoDataFrame that extends all the properties of an IdaDataFrame and has additional methods supported for geospatial types like ST_Point, ST_LineString, ST_Polygon etc. Like Python package GeoPandas, which is an extension of Pandas and provides the GeoDataFrame abstraction, ibmdbpy spatial extension lets you play with IdaGeoDataFrames and IdaGeoSeries as an extension of IdaDataFrames and IdaSeries. The Python wrappers for spatial functions which Db2 currently supports make the querying process much simpler for users. For more details about ibmdbpy geospatial extension, please refer to the dedicated [documentation](https://pythonhosted.org/ibmdbpy/geospatial.html). More details about Db2 spatial extender can be found on the [IBM Knowledge Center](https://www.ibm.com/support/knowledgecenter/SSCJDQ/com.ibm.swg.im.dashdb.spatial.doc/doc/csbp1001.html).\n", "\n", "__Prerequisites__\n", "* Db2 account: see [IBM Cloud](https://cloud.ibm.com/login) or [Db2 Warehouse](https://www.ibm.com/support/knowledgecenter/en/SSCJDQ/com.ibm.swg.im.dashdb.kc.doc/welcome.html)\n", "* Db2 driver: learn more on [IBM Knowledge Center](https://www.ibm.com/support/knowledgecenter/en/SSFMBX/com.ibm.swg.im.dashdb.doc/connecting/connect_applications_by_type.html) and see [IBM Support](https://www.ibm.com/support/pages/db2-jdbc-driver-versions-and-downloads)\n", "* Having installed the [ibmdbpy](https://pypi.org/project/ibmdbpy/) python library with pip: \n", "> pip install ibmdbpy \n", "* Optional dependency for JDBC is the [jaydebeapi](https://pypi.org/project/JayDeBeApi/) library. Run the following command to install ibmdbpy, as well as the dependencies for the JDBC feature:\n", "> pip install ibmdbpy[jdbc]\n", "\n", "__Contents__ \n", "\n", "1. __Establish connection to Db2 database__\n", " 1. JDBC connection\n", " 2. Verbosity and autocommit settings\n", "2. __Create and open IdaGeoDataFrames__\n", " 1. Load sample data from Db2 tables\n", " 2. Extract an IdaGeoSeries\n", " 3. Create sample tables with various geometry types \n", "3. __Get familiar with geospatial operations__ \n", " 1. General properties of geometry columns\n", " 2. Dealing with point coordinates\n", " 3. Start, middle and end points of curves\n", " 4. Area and boundaries of polygons\n", " 5. Length versus perimeter\n", " 6. Envelope, MBR and convex hull\n", " 7. Centroid and buffer \n", " 8. Rings\n", " 9. Generalization of a geospatial pattern \n", " 10. Boolean methods\n", " 11. Operations on multiple geometry columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__Imports__" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from ibmdbpy import IdaDataBase, IdaDataFrame, IdaGeoDataFrame" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Establish connection to Db2 database\n", "\n", "In ibmdbpy, users can choose to use JDBC to connect to a remote Db2 instance. The JDBC Connection is based on a Java Virtual Machine, so it is available on every machine that can run Java. You could also use an ODBC connection, however this is not the option we use in this notebook.\n", "\n", "__1. JDBC connection__\n", "\n", "First, you need to dowload a valid driver (more info on [IBM Support](https://www.ibm.com/support/pages/db2-jdbc-driver-versions-and-downloads)). Then you need to put the `db2jcc.jar` or`db2jcc4.jar` file in the ibmdbpy site-package folder. When ibmdbpy runs, it checks whether one of those files exists in its installation folder and uses it to establish the connection. \n", "\n", "More details on IBM Knowledge Center:\n", "* JDBC for [IBM Db2 Warehouse](https://www.ibm.com/support/knowledgecenter/en/SSCJDQ/com.ibm.swg.im.dashdb.doc/connecting/connect_connecting_jdbc_applications.html)\n", "* JDBC for [IBM Db2 on Cloud](https://www.ibm.com/support/knowledgecenter/en/SSFMBX/com.ibm.swg.im.dashdb.doc/connecting/connect_connecting_jdbc_applications.html)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "#Enter the values for you database connection\n", "dsn_database = \"___\" # e.g. \"BLUDB\"\n", "dsn_hostname = \"___\" # e.g.: \"abc.url.example\"\n", "dsn_port = \"___\" # e.g. \"50000\"\n", "dsn_uid = \"___\" # e.g. \"db2_1234\"\n", "dsn_pwd = \"___\" # e.g. \"zorglub\"" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "connection_string='jdbc:db2://'+dsn_hostname+':'+dsn_port+'/'+dsn_database+':user='+dsn_uid+';password='+dsn_pwd+\";\" \n", "# the IdaDataBase object holds the connection to database.\n", "idadb=IdaDataBase(dsn=connection_string)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Congratulations! You successfully connected to Db2 with ibmdbpy! When you are done, use `idadb.close()` to close the connection. To reconnect, or if the connection was broken, just use `idadb.reconnect()`. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__2. Verbosity and autocommit settings__\n", "\n", "The verbose mode automatically prints all SQL-communication between ibmdbpy and Db2, which can be very useful for debugging or understanding how ibmdbpy works. Choose the mode with `set_verbose()` or by setting the `verbosity` option when defining the IdaDataBase object. We encourage you to take a look at the prints in the first place, then feel free to silence the verbose. Note that printing adds delay when running cells. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Verbosity\n", "from ibmdbpy.utils import set_verbose\n", "# if you want to see what takes place under the hood : set verbose to True\n", "set_verbose(False) # set it to True if you want to see the detail of ibmdbpy's operations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default the environment variable `AUTOCOMMIT` is then set to True, which means that every SQL statement which is submitted through the connection is executed within its own transaction and then committed implicitly. When you close the connection to Db2, if the environment variable `AUTOCOMMIT` is set to False, then all changes after the last explicit commit are discarded. \n", "\n", "Let's get to it!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Create and open IdaGeoDataFrames\n", "\n", "Let's explore the functionalities of ibmdbpy's geospatial extension! In this notebook, we will get familiar with IdaGeoDataFrames. An IdaGeoDataFrame is a reference to a spatial table in a remote instance of Db2. It has inherited the properties of an IdaDataFrame and bebefits from additional functionalities derived from Db2 spatial extension. \n", "\n", "In this notebook, we will use sample data available out of the box in Db2 Warehouse. The `SAMPLES.GEO_COUNTY` dataset contains geographical and administrative data about US counties. The `SAMPLES.GEO_TORNADO` dataset holds 5 decades of tornado records in the US. We will also use much smaller sample table of various data types. We will create these tables ourselves with a few SQL statements." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__1. Load sample data from Db2 tables__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we show you how to directly open Db2 tables as IdaGeoDataFrames.\n", "\n", "When defining an IdaGeoDataFrame, the first argument is the name of the IdaDataBase object which holds the connection to the database, the second argument designates the Db2 table you want to open. Optionally you may set an indexer column and a geometry column. Indexer and geometry can be defined and / or changed anytime using `indexer` and `set_geometry`.\n", "\n", "Note that in the following cell we have set the `OBJECTID` column as indexer when defining the IdaDataFrame. Otherwise, since the data is partitioned, it cannot be guaranteed that rows are always displayed in the same order. (Although, in paractice, an implicit sorting is operated by ibmdbpy). To ensure a behavior closer to Pandas' we therefore explicitly set an eligible column as index. You can either directly set the geometry column when defining an IdaGeoDataFrame object, or set it afterwards with `set_geometry('')`." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/eva.feillet/.local/lib/python3.7/site-packages/jaydebeapi/__init__.py:250: UserWarning: No type mapping for JDBC type 'STRUCT' (constant value 2002). Using None as a default type_code.\n", " \"Using None as a default type_code.\" % (type_name, jdbc_type_const))\n" ] } ], "source": [ "# prepopulated table in Db2\n", "\n", "counties = IdaGeoDataFrame(idadb,'SAMPLES.GEO_COUNTY',indexer='OBJECTID')\n", "counties.set_geometry('SHAPE')\n", "\n", "tornadoes = IdaGeoDataFrame(idadb,'SAMPLES.GEO_TORNADO',indexer='OBJECTID')\n", "tornadoes.set_geometry('SHAPE')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__2. Extract an IdaGeoSeries__\n", "\n", "An IdaGeoSeries inherits the properties of IdaSeries. The specificity of IdaGeoSeries relies in the data type: `ST_GEOMETRY` or one of its subtypes: `ST_POINT`, `ST_LINESTRING`, `ST_MULTIPOLYGON` etc. Through this notebook, you will get familiar with them.\n", "\n", "To isolate an IdaGeoSeries, just select the corresponding column of the IdaGeoDataFrame, like you would do in Pandas." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "idageoseries = counties[\"SHAPE\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a look at the data type:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TYPENAME
SHAPEST_MULTIPOLYGON
\n", "
" ], "text/plain": [ " TYPENAME\n", "SHAPE ST_MULTIPOLYGON" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "idageoseries.dtypes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each element contains a list of list of coordinates. Coordinates define points. Each list of coordinates defines a polygon. A set of polygons makes a multipolygon. Same applies for curves (called linestrings and multilinestrings)." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 MULTIPOLYGON (((-119.0414320216 46.1928618799,...\n", "1 MULTIPOLYGON (((-82.3369660445 40.5550088866, ...\n", "2 MULTIPOLYGON (((-82.6497086900 33.6087744671, ...\n", "3 MULTIPOLYGON (((-84.1141995028 39.5779906201, ...\n", "4 MULTIPOLYGON (((-94.1405779888 35.0996842940, ...\n", "Name: SHAPE, dtype: object" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "idageoseries.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__3. Create sample tables with various geometry types__\n", "\n", "A separate text file called `sql_script` containing SQL statements accompanies this notebook. You can either copy these statements and let them run directly in your Db2 Warehouse console, or simply run the next cell of this notebook to execute the queries automatically with ibmdbpy. \n", "\n", "You obtain a few sample tables with toy data. They will be useful to showcase some particular geospational functionalities of ibmdbpy." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "DROP TABLE sample_polygons;\n", "\n", "SAMPLE_POLYGONS\n", "Table already exists.\n", "DROP TABLE sample_geometries; \n", "\n", "SAMPLE_GEOMETRIES\n", "Table already exists.\n", "DROP TABLE sample_points;\n", "\n", "SAMPLE_POINTS\n", "Table already exists.\n", "DROP TABLE sample_mlines;\n", "\n", "SAMPLE_MLINES\n", "Table already exists.\n", "DROP TABLE sample_lines;\n", "\n", "SAMPLE_LINES\n", "Table already exists.\n", "SET CURRENT FUNCTION PATH = CURRENT FUNCTION PATH, db2gse;\n", "\n", "CREATE TABLE sample_polygons (id INTEGER, geometry db2gse.ST_Polygon) ORGANIZE BY ROW;\n", "\n", "INSERT INTO sample_polygons VALUES (1101, db2gse.ST_Polygon ('polygon ((110 120, 110 140, 120 130, 110 120))', 1));\n", "\n", "INSERT INTO sample_polygons VALUES (1102, db2gse.ST_Polygon ('polygon ((110 120, 110 140, 130 140, 130 120, 110 120), (115 125, 115 135, 125 135, 125 135, 115 125))', 1)); \n", "\n", "CREATE TABLE sample_geometries (id INTEGER, geometry db2gse.ST_Geometry) organize by row;\n", "\n", "INSERT INTO sample_geometries(id, geometry) VALUES (1, db2gse.ST_Geometry('point(1 2)', 1) );\n", "\n", "INSERT INTO sample_geometries(id, geometry) VALUES(2, db2gse.ST_Geometry ('polygon ((0 0, 0 4, 5 4, 5 0, 0 0))', 1) );\n", "\n", "INSERT INTO sample_geometries(id, geometry) VALUES(3, db2gse.ST_Geometry ('point empty', 1) );\n", "\n", "INSERT INTO sample_geometries(id, geometry) VALUES(4, db2gse.ST_Geometry ('multipolygon empty', 1) ); \n", "\n", "INSERT INTO sample_geometries(id, geometry) VALUES(5, db2gse.ST_Geometry('linestring(33 2, 34 3, 35 6)', 1) );\n", "\n", "INSERT INTO sample_geometries(id, geometry) VALUES(6, db2gse.ST_Geometry('polygon((3 3, 4 6, 5 3, 3 3))', 1));\n", "\n", "INSERT INTO sample_geometries(id, geometry) VALUES(7, db2gse.ST_Geometry('5060', 1)); \n", "\n", "CREATE TABLE sample_points(id INT GENERATED BY DEFAULT AS IDENTITY NOT NULL, loc db2gse.ST_POINT NOT NULL, PRIMARY KEY (id)) ORGANIZE BY ROW;\n", "\n", "INSERT INTO SAMPLE_POINTS(loc) VALUES (db2gse.ST_Point(14, 58, CAST(NULL AS DOUBLE), 1));\n", "\n", "INSERT INTO SAMPLE_POINTS(loc) VALUES (db2gse.ST_Point(12, 35, 12, 1));\n", "\n", "INSERT INTO SAMPLE_POINTS(loc) VALUES (db2gse.ST_Point(12, 66, 43, 45, 1));\n", "\n", "INSERT INTO SAMPLE_POINTS(loc) VALUES (db2gse.ST_Point(14, 58, CAST(NULL AS DOUBLE), 4, 1));\n", "\n", "INSERT INTO SAMPLE_POINTS(loc) VALUES (db2gse.ST_Point(12, 35, 12, CAST(NULL AS DOUBLE),1));\n", "\n", "INSERT INTO SAMPLE_POINTS(loc) VALUES (db2gse.ST_Point(17, 65, 32, 1));\n", "\n", "INSERT INTO SAMPLE_POINTS(loc) VALUES (db2gse.ST_Point(17, 65, 34, 3, 1)); \n", "\n", "CREATE TABLE sample_mlines (id INTEGER, geometry db2gse.ST_MultiLineString) ORGANIZE BY ROW;\n", "\n", "INSERT INTO sample_mlines VALUES (1110, db2gse.ST_MultiLineString ('multilinestring ( (33 2, 34 3, 35 6), (28 4, 29 5, 31 8, 43 12), (39 3, 37 4, 36 7) )', 1) ); \n", "\n", "CREATE TABLE sample_lines(id SMALLINT, geometry db2gse.ST_LineString) organize by row; \n", "\n", "INSERT INTO sample_lines(id, geometry) VALUES (1110, db2gse.ST_LineString('linestring(850 250, 850 850)', 1) );\n", "\n", "INSERT INTO sample_lines(id, geometry) VALUES (1111, db2gse.ST_LineString('9090100100', 1) ); \n" ] } ], "source": [ "with open(\"sql_script\", \"r\") as f:\n", " for line in f:\n", " print(line)\n", " idadb.ida_query(str(line))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "# Define the corresponding IdaDataFrames\n", "\n", "sample_points = IdaGeoDataFrame(idadb, \"SAMPLE_POINTS\", indexer = \"ID\", geometry = \"LOC\")\n", "sample_lines = IdaGeoDataFrame(idadb, \"SAMPLE_LINES\", indexer = \"ID\", geometry = \"GEOMETRY\")\n", "sample_mlines = IdaGeoDataFrame(idadb, \"SAMPLE_MLINES\", indexer = \"ID\", geometry = \"GEOMETRY\")\n", "sample_geometries = IdaGeoDataFrame(idadb, \"SAMPLE_GEOMETRIES\", indexer = \"ID\", geometry = \"GEOMETRY\")\n", "sample_polygons = IdaGeoDataFrame(idadb, \"SAMPLE_POLYGONS\", indexer = \"ID\", geometry = \"GEOMETRY\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! We now have everything we need to explore ibmdbpy's geospatial features!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Get familiar with geospatial operations \n", "\n", "In the following cells of this notebook, you will find examples for each of the geospatial methods ot the package which return an IdaGeoSeries. These methods use functions from Db2 geospatial extender (db2gse). All you need is the sample data we defined above." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__1. General properties of geometry columns__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note on geospatial methods\n", "\n", "When calling methods to perform operations on IdaGeoDataFrames, these operations will automatically be performed on the chosen geometry column(s). \n", "\n", "For example, if you want to compute the area of a county, *provided that the column containing the coordinates of each county as a collection of polygons has been set as geometry column*, calling `counties.area()` will compute the area of each county. \n", "> counties[\"AREA\"] = counties.area()\n", "\n", "A new column called `AREA` is created in the Db2 table. The area is computed on the basis of the collection of polygons contained in the `SHAPE` column of the IdaGeoDataFrame, which we have defined as the geometry column.\n", "\n", "But if you want to perform an operation on a geospatial column which is not defined as *the* geometry column, then you may either set this column as the new geometry column, or explicitly specify this column as the targeted IdaGeoSeries of the method you call. \n", "> counties[\"AREA\"] = counties[\"SHAPE\"].area()\n", "\n", "Note that all of the methods we will show you in this notebook return null if the input geometry is null or empty, and will raise an error if the input geometry has not the appropriate data type. If not specified otherwise, the returned geometries have the same spatial reference system as the input geometry." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Setting a geometry column as *the* geometry column\n", "\n", "Db2 documentation might refer to columns with geospatial data types as __geometries__. But don't get confused about it: there is __only one column at a time__ which can be defined as __*the* geometry__ column of a dataset. \n", "\n", "Here we have directly defined the geometry column when defining the IdaDataFrame:\n", "> sample_points = IdaGeoDataFrame(idadb, \"SAMPLE_POINTS\", indexer = \"ID\", geometry = \"LOC\")\n", "\n", "If you want to set another column as geometry, use this structure:\n", "> your_idageodf.set_geometry(\"eligible_column\")\n", "\n", "You will find useful to check which column is defined as your current \"official\" geometry column by calling the attributes `geometry.columns`. " ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['GEOMETRY']\n", "['SHAPE']\n", "['LOC']\n" ] } ], "source": [ "print(sample_geometries.geometry.columns)\n", "print(tornadoes.geometry.columns)\n", "print(sample_points.geometry.columns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Data type and Geometry type\n", "\n", "Each geospatial method of ibmdbpy may have restrictions as to which data type is allowed as input. You can check the data type(s) of any geometry column by using the `geometry_type` method and `dtypes` attribute. What is the difference? `geometry_type` gives the particular geometry type of each row, whereas `dtypes` gives you the data type by column.\n", "\n", "Geometry columns can contain elements of different data types! For example, a column containing LineStrings and MultiLineStrings will have `ST_MultiLineString` as overall data type. A column containing points and lines or polygons will have the general `ST_GEOMETRY` data type.\n", "\n", "\n", "The spatial data types supported by DB2 Spatial Extender areimplementations of the geometries shown in the figure.\n", "\n", "![](geom_dtypes.png)\n", "\n", "*Fig: Hierarchy of geometries supported by DB2 Spatial Extender. Instantiable geometries inthis figure include examples of how they might be rendered visually.*" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 \"DB2GSE \".\"ST_POINT\"\n", "1 \"DB2GSE \".\"ST_POLYGON\"\n", "2 \"DB2GSE \".\"ST_POINT\"\n", "3 \"DB2GSE \".\"ST_MULTIPOLYGON\"\n", "4 \"DB2GSE \".\"ST_LINESTRING\"\n", "Name: DB2GSE.ST_GEOMETRYTYPE(GEOMETRY), dtype: object" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Geometry type --> output by row\n", "sample_geometries['GEOMETRY'].geometry_type().head()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TYPENAME
GEOMETRYST_GEOMETRY
\n", "
" ], "text/plain": [ " TYPENAME\n", "GEOMETRY ST_GEOMETRY" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# dtypes --> output by column\n", "sample_geometries[['GEOMETRY']].dtypes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Dimension of geometries\n", "\n", "If the given geometry is empty, then -1 is returned. For points and multipoints, the dimension is 0; for curves and multicurves, the dimension is 1; and for polygons and multipolygons, the dimension is 2. If the given geometry is null, then null is returned. Here are various examples of the results you can obtain." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GEOMETRYDIM
0POINT (1.000000 2.000000)0
1POLYGON ((0.000000 0.000000, 5.000000 0.000000...2
2POINT EMPTY-1
3MULTIPOLYGON EMPTY-1
4LINESTRING (33.000000 2.000000, 34.000000 3.00...1
\n", "
" ], "text/plain": [ " GEOMETRY DIM\n", "0 POINT (1.000000 2.000000) 0\n", "1 POLYGON ((0.000000 0.000000, 5.000000 0.000000... 2\n", "2 POINT EMPTY -1\n", "3 MULTIPOLYGON EMPTY -1\n", "4 LINESTRING (33.000000 2.000000, 34.000000 3.00... 1" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Mixed types and Empty geometries\n", "sample_geometries[\"DIM\"] = sample_geometries.dimension()\n", "sample_geometries[[\"GEOMETRY\", \"DIM\"]].head()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
SHAPESHAPE_DIM
0MULTILINESTRING ((-90.2200062071 38.7700071663...1
1MULTILINESTRING ((-89.3000059755 39.1000072739...1
2MULTILINESTRING ((-84.5800047496 40.8800078382...1
3MULTILINESTRING ((-94.3700070010 34.4000061520...1
4MULTILINESTRING ((-90.6800062393 37.6000069289...1
\n", "
" ], "text/plain": [ " SHAPE SHAPE_DIM\n", "0 MULTILINESTRING ((-90.2200062071 38.7700071663... 1\n", "1 MULTILINESTRING ((-89.3000059755 39.1000072739... 1\n", "2 MULTILINESTRING ((-84.5800047496 40.8800078382... 1\n", "3 MULTILINESTRING ((-94.3700070010 34.4000061520... 1\n", "4 MULTILINESTRING ((-90.6800062393 37.6000069289... 1" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Dimension of a MultiLineString\n", "tornadoes[\"SHAPE_DIM\"] = tornadoes.dimension()\n", "tornadoes[[\"SHAPE\", \"SHAPE_DIM\"]].head()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
buffer_20_kmbuffer_20_km_dim
0POLYGON ((-90.3065519651 38.9369737029, -90.32...2
1POLYGON ((-89.3798853739 39.2690904737, -89.39...2
2POLYGON ((-84.7257488606 41.0222185578, -84.73...2
3POLYGON ((-94.5212609425 34.5296645617, -94.53...2
4POLYGON ((-90.8575378881 37.7120296620, -90.86...2
\n", "
" ], "text/plain": [ " buffer_20_km buffer_20_km_dim\n", "0 POLYGON ((-90.3065519651 38.9369737029, -90.32... 2\n", "1 POLYGON ((-89.3798853739 39.2690904737, -89.39... 2\n", "2 POLYGON ((-84.7257488606 41.0222185578, -84.73... 2\n", "3 POLYGON ((-94.5212609425 34.5296645617, -94.53... 2\n", "4 POLYGON ((-90.8575378881 37.7120296620, -90.86... 2" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Dimension of a polygon\n", "\n", "tornadoes[\"buffer_20_km\"] = tornadoes.buffer(distance = 20, unit = \"KILOMETER\")\n", "# Note: see section 7. to learn more about the buffer method\n", "\n", "tornadoes[\"buffer_20_km_dim\"] = tornadoes[\"buffer_20_km\"].dimension()\n", "tornadoes[[\"buffer_20_km\", \"buffer_20_km_dim\"]].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Number of items in a geometry\n", "\n", "You can compute the number of geometries in a collection with `num_geometries`. Similarly, to know how many points, lines or polygons a given geometry contains, use `num_points`, `num_line_strings` and `num_polygons`." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " TYPENAME\n", "SHAPE ST_MULTIPOLYGON\n" ] }, { "data": { "text/plain": [ "0 1\n", "1 1\n", "2 1\n", "3 1\n", "4 1\n", "Name: DB2GSE.ST_NUMGEOMETRIES(SHAPE), dtype: int64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Num Geometries for an ST_MULTIPOLYGON column\n", "print(counties.geometry.dtypes)\n", "counties.num_geometries().head()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " TYPENAME\n", "GEOMETRY ST_MULTILINESTRING\n" ] }, { "data": { "text/plain": [ "0 3\n", "Name: DB2GSE.ST_NUMGEOMETRIES(GEOMETRY), dtype: int64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Num Geometries for an ST_MULTILINESTRING column\n", "print(sample_mlines.geometry.dtypes)\n", "sample_mlines.num_geometries().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can be more precise and get the number of geometries of a particular type: points, lines and polygons." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GEOMETRYnum_points
0POINT (1.000000 2.000000)1.0
1POLYGON ((0.000000 0.000000, 5.000000 0.000000...5.0
2POINT EMPTYNaN
3MULTIPOLYGON EMPTYNaN
4LINESTRING (33.000000 2.000000, 34.000000 3.00...3.0
\n", "
" ], "text/plain": [ " GEOMETRY num_points\n", "0 POINT (1.000000 2.000000) 1.0\n", "1 POLYGON ((0.000000 0.000000, 5.000000 0.000000... 5.0\n", "2 POINT EMPTY NaN\n", "3 MULTIPOLYGON EMPTY NaN\n", "4 LINESTRING (33.000000 2.000000, 34.000000 3.00... 3.0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Num Points\n", "sample_geometries[\"num_points\"] = sample_geometries.num_points()\n", "sample_geometries[[\"GEOMETRY\", \"num_points\"]].head()" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 3\n", "Name: DB2GSE.ST_NUMLINESTRINGS(GEOMETRY), dtype: int64" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Num Linestrings\n", "sample_mlines.num_line_strings().head()" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(57, 1)\n" ] }, { "data": { "text/plain": [ "0 2\n", "1 2\n", "2 2\n", "3 2\n", "4 2\n", "Name: NUM_POLY, dtype: int64" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Num Polygons\n", "counties['NUM_POLY'] = counties.num_polygons()\n", "print(counties['NUM_POLY'][counties['NUM_POLY']>1].shape)\n", "counties['NUM_POLY'][counties['NUM_POLY']>1].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__2. Dealing with point coordinates__\n", "\n", "When you have a geometry column containing points or a list of points, you might need to extract some of the coordinates or compute min and max in a particular dimension. Let's see how it works." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Dimension of point coordinates\n", "\n", "Use this method to check the dimension of your coordinates. Here you see that the points obtained with the `centroid` method have 2 dimensions (e.g. X and Y), whereas some of the sample points we have created have a Z and/or an M coordinate. \n", "\n", "X, Y and Z are traditional cartesian coordinates. The measure value M can be used to record additional non-coordinte values at various points within a geometry, for example an angle or a temperature." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDLOCcoord_dim
01POINT (14.000000 58.000000)2
12POINT Z(12.000000 35.000000 12)3
23POINT ZM(12.000000 66.000000 43 45)4
34POINT M(14.000000 58.000000 4)3
45POINT Z(12.000000 35.000000 12)3
\n", "
" ], "text/plain": [ " ID LOC coord_dim\n", "0 1 POINT (14.000000 58.000000) 2\n", "1 2 POINT Z(12.000000 35.000000 12) 3\n", "2 3 POINT ZM(12.000000 66.000000 43 45) 4\n", "3 4 POINT M(14.000000 58.000000 4) 3\n", "4 5 POINT Z(12.000000 35.000000 12) 3" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Sample points\n", "sample_points['coord_dim'] = sample_points.coord_dim()\n", "sample_points[['ID', 'LOC','coord_dim']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Extract coordinates" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDLOCXYZM
01POINT (14.000000 58.000000)14.058.0NaNNaN
12POINT Z(12.000000 35.000000 12)12.035.012.0NaN
23POINT ZM(12.000000 66.000000 43 45)12.066.043.045.0
34POINT M(14.000000 58.000000 4)14.058.0NaN4.0
45POINT Z(12.000000 35.000000 12)12.035.012.0NaN
\n", "
" ], "text/plain": [ " ID LOC X Y Z M\n", "0 1 POINT (14.000000 58.000000) 14.0 58.0 NaN NaN\n", "1 2 POINT Z(12.000000 35.000000 12) 12.0 35.0 12.0 NaN\n", "2 3 POINT ZM(12.000000 66.000000 43 45) 12.0 66.0 43.0 45.0\n", "3 4 POINT M(14.000000 58.000000 4) 14.0 58.0 NaN 4.0\n", "4 5 POINT Z(12.000000 35.000000 12) 12.0 35.0 12.0 NaN" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# X, Y, Z and M coordinates\n", "\n", "sample_points_extractor = IdaGeoDataFrame(idadb, \"SAMPLE_POINTS\", indexer = \"ID\")\n", "sample_points_extractor.set_geometry(\"LOC\")\n", "\n", "sample_points_extractor[\"X\"] = sample_points_extractor.x()\n", "sample_points_extractor[\"Y\"] = sample_points_extractor.y()\n", "sample_points_extractor[\"Z\"] = sample_points_extractor.z()\n", "sample_points_extractor[\"M\"] = sample_points_extractor.m()\n", "sample_points_extractor.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Min and max" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDGEOMETRYmax_Xmax_Ymax_Zmax_M
01POINT (1.000000 2.000000)1.02.0NoneNone
12POLYGON ((0.000000 0.000000, 5.000000 0.000000...5.04.0NoneNone
23POINT EMPTYNaNNaNNoneNone
34MULTIPOLYGON EMPTYNaNNaNNoneNone
45LINESTRING (33.000000 2.000000, 34.000000 3.00...35.06.0NoneNone
\n", "
" ], "text/plain": [ " ID GEOMETRY max_X max_Y max_Z \\\n", "0 1 POINT (1.000000 2.000000) 1.0 2.0 None \n", "1 2 POLYGON ((0.000000 0.000000, 5.000000 0.000000... 5.0 4.0 None \n", "2 3 POINT EMPTY NaN NaN None \n", "3 4 MULTIPOLYGON EMPTY NaN NaN None \n", "4 5 LINESTRING (33.000000 2.000000, 34.000000 3.00... 35.0 6.0 None \n", "\n", " max_M \n", "0 None \n", "1 None \n", "2 None \n", "3 None \n", "4 None " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Max X, Y, Z and M\n", "sample_geometries = IdaGeoDataFrame(idadb, \"SAMPLE_GEOMETRIES\", indexer = \"ID\", geometry = \"GEOMETRY\")\n", "\n", "sample_geometries[\"max_X\"] = sample_geometries.max_x()\n", "sample_geometries[\"max_Y\"] = sample_geometries.max_y()\n", "sample_geometries[\"max_Z\"] = sample_geometries.max_z()\n", "sample_geometries[\"max_M\"] = sample_geometries.max_m()\n", "sample_geometries.head()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDGEOMETRYmin_Xmin_Ymin_Zmin_M
01POINT (1.000000 2.000000)1.02.0NoneNone
12POLYGON ((0.000000 0.000000, 5.000000 0.000000...0.00.0NoneNone
23POINT EMPTYNaNNaNNoneNone
34MULTIPOLYGON EMPTYNaNNaNNoneNone
45LINESTRING (33.000000 2.000000, 34.000000 3.00...33.02.0NoneNone
\n", "
" ], "text/plain": [ " ID GEOMETRY min_X min_Y min_Z \\\n", "0 1 POINT (1.000000 2.000000) 1.0 2.0 None \n", "1 2 POLYGON ((0.000000 0.000000, 5.000000 0.000000... 0.0 0.0 None \n", "2 3 POINT EMPTY NaN NaN None \n", "3 4 MULTIPOLYGON EMPTY NaN NaN None \n", "4 5 LINESTRING (33.000000 2.000000, 34.000000 3.00... 33.0 2.0 None \n", "\n", " min_M \n", "0 None \n", "1 None \n", "2 None \n", "3 None \n", "4 None " ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Min X, Y, Z and M\n", "sample_geometries = IdaGeoDataFrame(idadb, \"SAMPLE_GEOMETRIES\", indexer = \"ID\", geometry = \"GEOMETRY\")\n", "\n", "sample_geometries[\"min_X\"] = sample_geometries.min_x()\n", "sample_geometries[\"min_Y\"] = sample_geometries.min_y()\n", "sample_geometries[\"min_Z\"] = sample_geometries.min_z()\n", "sample_geometries[\"min_M\"] = sample_geometries.min_m()\n", "sample_geometries.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__3. Start, middle and end point of curves__\n", "\n", "These methods take a curve (`ST_LineString`) as input and return the corresponding points:\n", "* start point: the first point of the curve\n", "* last point: the last point of the curve\n", "* mid point: the point geometrically in the middle of the two points above. " ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDGEOMETRYstartmiddleend
01110LINESTRING (850.000000 250.000000, 850.000000 ...POINT (850.000000 250.000000)POINT (850.000000 550.000000)POINT (850.000000 850.000000)
11111LINESTRING (90.000000 90.000000, 100.000000 10...POINT (90.000000 90.000000)POINT (95.000000 95.000000)POINT (100.000000 100.000000)
\n", "
" ], "text/plain": [ " ID GEOMETRY \\\n", "0 1110 LINESTRING (850.000000 250.000000, 850.000000 ... \n", "1 1111 LINESTRING (90.000000 90.000000, 100.000000 10... \n", "\n", " start middle \\\n", "0 POINT (850.000000 250.000000) POINT (850.000000 550.000000) \n", "1 POINT (90.000000 90.000000) POINT (95.000000 95.000000) \n", "\n", " end \n", "0 POINT (850.000000 850.000000) \n", "1 POINT (100.000000 100.000000) " ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Start point\n", "sample_lines['start'] = sample_lines.start_point()\n", "# Mid point\n", "sample_lines[\"middle\"] = sample_lines.mid_point()\n", "# End point\n", "sample_lines['end'] = sample_lines.end_point()\n", "\n", "# Take a look at the new columns\n", "sample_lines.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__4. Area and boundaries of polygons__\n", "\n", "You can compute the area and boundary of geometries with type `ST_Polygon` or `ST_MultiPolygon`. Let's see what we obtain." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Area of a surface\n", "\n", "If the input surface is composed of several polygons, then the area is the sum of each polygon's areas." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NAMEarea_in_km2
0Hall1430.061292
1Baker898.279987
2Wright1508.045410
3Charlotte1243.314514
4Sangamon2280.381965
\n", "
" ], "text/plain": [ " NAME area_in_km2\n", "0 Hall 1430.061292\n", "1 Baker 898.279987\n", "2 Wright 1508.045410\n", "3 Charlotte 1243.314514\n", "4 Sangamon 2280.381965" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Area\n", "counties['area_in_km2'] = counties.area(unit = 'KILOMETER')\n", "counties[['NAME','area_in_km2']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Under the hood, an SQL statements using `db2gse.ST_AREA` function is applyed to the `SHAPE`column, defined as geometry column. You can print the corresponding statements by enabling the `verbose` option." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note on allowed units\n", "\n", "To see which linear units are allowed, use the `linear_units` method of IdaGeoDataFrames. Examples include FOOT, MILE, METER, KILOMETRE, INDIAN YARD, NAUTICAL MILE and many other country and time specific units." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Boundary of a shape\n", "\n", "The `boundary` method returns the boundary of the input geometry as a new geometry column. Here you can see that the boundary of a polygon with no holes is a single linestring, represented as ST_LineString (example: Hodgeman county). The boundary of a polygon with one or more holes would consist of multiple linestrings, represented as ST_MultiLineString (example: Bedford county). " ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NAMEboundary
0WilbargerLINESTRING (-99.4756582604 33.8340108094, -99....
1AustinLINESTRING (-96.6219873342 30.0442882117, -96....
2LoganLINESTRING (-99.4497297204 46.6316377481, -99....
3La PlataLINESTRING (-107.4817473750 37.0000108736, -10...
4RandolphLINESTRING (-91.2589262966 36.2578866492, -91....
\n", "
" ], "text/plain": [ " NAME boundary\n", "0 Wilbarger LINESTRING (-99.4756582604 33.8340108094, -99....\n", "1 Austin LINESTRING (-96.6219873342 30.0442882117, -96....\n", "2 Logan LINESTRING (-99.4497297204 46.6316377481, -99....\n", "3 La Plata LINESTRING (-107.4817473750 37.0000108736, -10...\n", "4 Randolph LINESTRING (-91.2589262966 36.2578866492, -91...." ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Boundary\n", "counties['boundary'] = counties.boundary()\n", "counties[['NAME','boundary']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note on Spatial Reference Systems (SRS)\n", "\n", "The resulting coordinates are given in the same spatial reference system as the input coordinates. This is also true for the other methods showcased in this notebook. Let's check this with the `srid` method. If no spatial reference system identifier is given as an input parameter, the `srid` method simply returns the current spatial reference system identifier of the given geometry. Otherwise it resets the ID of the spatial reference system." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1005\n", "1 1005\n", "2 1005\n", "Name: DB2GSE.ST_SRID(DB2GSE.ST_BOUNDARY(SHAPE)), dtype: int64" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Look at the reference system\n", "# The bounadry column is not the specified geometry column, \n", "# so we explicitly call the srid function on it.\n", "counties['boundary'].srid().head(3)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1005\n", "1 1005\n", "2 1005\n", "Name: DB2GSE.ST_SRID(SHAPE), dtype: int64" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Here the srid function is applied to the SHAPE column\n", "# because it is set as the geometry column of the IdaGeoDataFrame.\n", "counties.srid().head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you can additionally check the *name* of the spatial reference system of your data with `srs_name`." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 SAMPLE_GCS_WGS_1984\n", "1 SAMPLE_GCS_WGS_1984\n", "2 SAMPLE_GCS_WGS_1984\n", "3 SAMPLE_GCS_WGS_1984\n", "4 SAMPLE_GCS_WGS_1984\n", "Name: DB2GSE.ST_SRSNAME(SHAPE), dtype: object" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "counties.srs_name().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__5. Length versus perimeter__\n", "\n", "The `length` method is for curves or collection of curves, i.e. geometries with data type `ST_LineString` or `ST_MultiLineString`. The`perimeter` method is a method for surfaces or collection of surfaces, i.e. geometries with data type ST_Polygon or ST_MultiPolygon. " ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
OBJECTIDSHAPELEN_KM
01MULTILINESTRING ((-90.2200062071 38.7700071663...17.798545
12MULTILINESTRING ((-89.3000059755 39.1000072739...6.448745
23MULTILINESTRING ((-84.5800047496 40.8800078382...0.014213
34MULTILINESTRING ((-94.3700070010 34.4000061520...0.014173
45MULTILINESTRING ((-90.6800062393 37.6000069289...4.254681
\n", "
" ], "text/plain": [ " OBJECTID SHAPE LEN_KM\n", "0 1 MULTILINESTRING ((-90.2200062071 38.7700071663... 17.798545\n", "1 2 MULTILINESTRING ((-89.3000059755 39.1000072739... 6.448745\n", "2 3 MULTILINESTRING ((-84.5800047496 40.8800078382... 0.014213\n", "3 4 MULTILINESTRING ((-94.3700070010 34.4000061520... 0.014173\n", "4 5 MULTILINESTRING ((-90.6800062393 37.6000069289... 4.254681" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Length\n", "tornadoes['LEN_KM'] = tornadoes.length(unit = 'KILOMETER')\n", "tornadoes[['OBJECTID', 'SHAPE', 'LEN_KM']].head()" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NAMESHAPEPERI_MILE
0CrowleyMULTIPOLYGON (((-104.0582517183 38.1464982543,...116.393352
1CrookMULTIPOLYGON (((-119.6555997519 44.3069696862,...261.563538
2SomersetMULTIPOLYGON (((-75.9484559154 38.2159676862, ...126.374737
3ClarkeMULTIPOLYGON (((-77.9619585898 39.0138487739, ...55.450347
4ErieMULTIPOLYGON (((-78.4643840244 42.8674694862, ...173.074296
\n", "
" ], "text/plain": [ " NAME SHAPE PERI_MILE\n", "0 Crowley MULTIPOLYGON (((-104.0582517183 38.1464982543,... 116.393352\n", "1 Crook MULTIPOLYGON (((-119.6555997519 44.3069696862,... 261.563538\n", "2 Somerset MULTIPOLYGON (((-75.9484559154 38.2159676862, ... 126.374737\n", "3 Clarke MULTIPOLYGON (((-77.9619585898 39.0138487739, ... 55.450347\n", "4 Erie MULTIPOLYGON (((-78.4643840244 42.8674694862, ... 173.074296" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Perimeter\n", "counties[\"PERI_MILE\"] = counties.perimeter(unit = 'MILE')\n", "counties[[\"NAME\", \"SHAPE\", \"PERI_MILE\"]].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__6. Envelope, MBR and Convex hull__\n", "\n", "What is the difference between the envelope, the Minimum Bounding Rectangle and the convex hull of a geometry?\n", "\n", "All 3 methods return a shape containing the input geometry, but these shapes do not have the same properties.\n", "\n", "* The Minimum Bounding Rectangle (MBR) of a geometry is the smallest rectangle to contain the whole input shape. You can obtain it with the `mbr` method.\n", "* The `envelope` method outputs simple rectangles, represented as a polygon. If the given geometry is a point, a horizontal linestring, or a vertical linestring, then a rectangle, which is slightly larger than the given geometry, is returned. Otherwise, the Minimum Bounding Rectangle of the geometry is returned as the envelope.\n", "* The `convex hull` method returns the smallest *convex* set that contains the input shape. The convex hull is also called convex envelope or convex closure. For example, if you have a bounded subset of points in the Euclidean space, the convex hull can be visualized as the shape enclosed by an elastic band stretched around the outside points of the subset. \n", "\n", "Note that if vertices of the geometry do not form a convex, the convexhull method returns a null.\n", "If possible, the specific type of the returned geometry will be `ST_Point`, `ST_LineString`, or `ST_Polygon`. The convex hull of a convex polygon with no holes is a single linestring, represented as ST_LineString. The convex hull of a non convex polygon does not exit. For None geometries, for empty geometries and for non convex geometries the output is None." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NAMESHAPEMBRenvelopeconvex_envelope
0LafayetteMULTIPOLYGON (((-90.4263836312 42.5071807967, ...POLYGON ((-90.4269086653 42.5056648248, -89.83...POLYGON ((-90.4269086653 42.5056648248, -89.83...POLYGON ((-90.4269086653 42.8128698462, -90.42...
1SanilacMULTIPOLYGON (((-82.1455052616 43.6955954588, ...POLYGON ((-83.1204005291 43.1541073218, -82.12...POLYGON ((-83.1204005291 43.1541073218, -82.12...POLYGON ((-83.1204005291 43.3270573447, -83.11...
2TaylorMULTIPOLYGON (((-84.0691810519 32.5918031946, ...POLYGON ((-84.4532361602 32.3720591397, -84.00...POLYGON ((-84.4532361602 32.3720591397, -84.00...POLYGON ((-84.4532361602 32.5423801709, -84.45...
3OhioMULTIPOLYGON (((-80.5191234475 40.0164178652, ...POLYGON ((-80.7338065145 40.0164178652, -80.51...POLYGON ((-80.7338065145 40.0164178652, -80.51...POLYGON ((-80.7338065145 40.0334168595, -80.59...
4HoustonMULTIPOLYGON (((-83.7877562454 32.5016909466, ...POLYGON ((-83.8568549803 32.2825891390, -83.48...POLYGON ((-83.8568549803 32.2825891390, -83.48...POLYGON ((-83.8568549803 32.3751601541, -83.85...
\n", "
" ], "text/plain": [ " NAME SHAPE \\\n", "0 Lafayette MULTIPOLYGON (((-90.4263836312 42.5071807967, ... \n", "1 Sanilac MULTIPOLYGON (((-82.1455052616 43.6955954588, ... \n", "2 Taylor MULTIPOLYGON (((-84.0691810519 32.5918031946, ... \n", "3 Ohio MULTIPOLYGON (((-80.5191234475 40.0164178652, ... \n", "4 Houston MULTIPOLYGON (((-83.7877562454 32.5016909466, ... \n", "\n", " MBR \\\n", "0 POLYGON ((-90.4269086653 42.5056648248, -89.83... \n", "1 POLYGON ((-83.1204005291 43.1541073218, -82.12... \n", "2 POLYGON ((-84.4532361602 32.3720591397, -84.00... \n", "3 POLYGON ((-80.7338065145 40.0164178652, -80.51... \n", "4 POLYGON ((-83.8568549803 32.2825891390, -83.48... \n", "\n", " envelope \\\n", "0 POLYGON ((-90.4269086653 42.5056648248, -89.83... \n", "1 POLYGON ((-83.1204005291 43.1541073218, -82.12... \n", "2 POLYGON ((-84.4532361602 32.3720591397, -84.00... \n", "3 POLYGON ((-80.7338065145 40.0164178652, -80.51... \n", "4 POLYGON ((-83.8568549803 32.2825891390, -83.48... \n", "\n", " convex_envelope \n", "0 POLYGON ((-90.4269086653 42.8128698462, -90.42... \n", "1 POLYGON ((-83.1204005291 43.3270573447, -83.11... \n", "2 POLYGON ((-84.4532361602 32.5423801709, -84.45... \n", "3 POLYGON ((-80.7338065145 40.0334168595, -80.59... \n", "4 POLYGON ((-83.8568549803 32.3751601541, -83.85... " ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# MBR\n", "counties[\"MBR\"] = counties.mbr()\n", "# Envelope\n", "counties['envelope'] = counties.envelope()\n", "# Convex Hull\n", "counties['convex_envelope'] = counties.convex_hull()\n", "\n", "# Compare coordinates\n", "counties[[\"NAME\", \"SHAPE\", \"MBR\", 'envelope', 'convex_envelope']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__7. Buffer and Centroid__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `buffer` method returns the geometry that surrounds the input geometry by the input distance, measured in a given unit. \n", "\n", "Note that any circular curve in the boundary of the resulting geometry is approximated by linear strings! For example, the buffer around a point, which would result in a circular region, is approximated by a polygon whose boundary is a linestring." ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
SHAPEbuffer_20_km
0MULTILINESTRING ((-90.2200062071 38.7700071663...POLYGON ((-90.3065519651 38.9369737029, -90.32...
1MULTILINESTRING ((-89.3000059755 39.1000072739...POLYGON ((-89.3798853739 39.2690904737, -89.39...
2MULTILINESTRING ((-84.5800047496 40.8800078382...POLYGON ((-84.7257488606 41.0222185578, -84.73...
3MULTILINESTRING ((-94.3700070010 34.4000061520...POLYGON ((-94.5212609425 34.5296645617, -94.53...
4MULTILINESTRING ((-90.6800062393 37.6000069289...POLYGON ((-90.8575378881 37.7120296620, -90.86...
\n", "
" ], "text/plain": [ " SHAPE \\\n", "0 MULTILINESTRING ((-90.2200062071 38.7700071663... \n", "1 MULTILINESTRING ((-89.3000059755 39.1000072739... \n", "2 MULTILINESTRING ((-84.5800047496 40.8800078382... \n", "3 MULTILINESTRING ((-94.3700070010 34.4000061520... \n", "4 MULTILINESTRING ((-90.6800062393 37.6000069289... \n", "\n", " buffer_20_km \n", "0 POLYGON ((-90.3065519651 38.9369737029, -90.32... \n", "1 POLYGON ((-89.3798853739 39.2690904737, -89.39... \n", "2 POLYGON ((-84.7257488606 41.0222185578, -84.73... \n", "3 POLYGON ((-94.5212609425 34.5296645617, -94.53... \n", "4 POLYGON ((-90.8575378881 37.7120296620, -90.86... " ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Buffer\n", "tornadoes[\"buffer_20_km\"] = tornadoes.buffer(distance = 20, unit = \"KILOMETER\")\n", "tornadoes[['SHAPE', 'buffer_20_km']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The other way around: The `centroid` method returns the geometric center of the input geometry. This point is defined as the center of the minimum bounding rectangle of the given geometry. The resulting point is represented in the spatial reference system of the given geometry." ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NAMEcentroid
0LafayettePOINT (-90.1321430727 42.6602443432)
1SanilacPOINT (-82.6226543867 43.4248513903)
2TaylorPOINT (-84.2276055947 32.5598456804)
3OhioPOINT (-80.6263529839 40.1009943786)
4HoustonPOINT (-83.6721899279 32.4873821842)
\n", "
" ], "text/plain": [ " NAME centroid\n", "0 Lafayette POINT (-90.1321430727 42.6602443432)\n", "1 Sanilac POINT (-82.6226543867 43.4248513903)\n", "2 Taylor POINT (-84.2276055947 32.5598456804)\n", "3 Ohio POINT (-80.6263529839 40.1009943786)\n", "4 Houston POINT (-83.6721899279 32.4873821842)" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Centroid\n", "counties['centroid'] = counties.centroid()\n", "counties[['NAME','centroid']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__8. Rings__\n", "\n", "You can create a ring around a surface, or count the rings inside a geometry. More precisely, `num_interior_ring` gives the number of rings inside a polygon and `exterior_ring` computes the coordinates of the exterior ring of a polygon. " ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDGEOMETRYext_ring
01101POLYGON ((110.000000 120.000000, 120.000000 13...LINESTRING (110.000000 120.000000, 120.000000 ...
11102POLYGON ((110.000000 120.000000, 130.000000 12...LINESTRING (110.000000 120.000000, 130.000000 ...
\n", "
" ], "text/plain": [ " ID GEOMETRY \\\n", "0 1101 POLYGON ((110.000000 120.000000, 120.000000 13... \n", "1 1102 POLYGON ((110.000000 120.000000, 130.000000 12... \n", "\n", " ext_ring \n", "0 LINESTRING (110.000000 120.000000, 120.000000 ... \n", "1 LINESTRING (110.000000 120.000000, 130.000000 ... " ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Exterior ring\n", "sample_polygons[\"ext_ring\"] = sample_polygons.exterior_ring()\n", "sample_polygons.head()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
GEOMETRYint_ring
0POLYGON ((110.000000 120.000000, 120.000000 13...0
1POLYGON ((110.000000 120.000000, 130.000000 12...1
\n", "
" ], "text/plain": [ " GEOMETRY int_ring\n", "0 POLYGON ((110.000000 120.000000, 120.000000 13... 0\n", "1 POLYGON ((110.000000 120.000000, 130.000000 12... 1" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# num Interior Ring\n", "sample_polygons[\"int_ring\"] = sample_polygons.num_interior_ring()\n", "sample_polygons[[\"GEOMETRY\", \"int_ring\"]].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__9. Generalization of geospatial patterns__\n", "\n", "The `generalize` takes a threshold value as option and represents the given geometry with a reduced number of points, while preserving the general characteristics of this geometry.\n", "\n", "It uses the Douglas-Peucker line-simplification algorithm. This algorithm recursively subdivides the sequence of points which define the geometry, until a set of points can be replaced by a straight line segment. In this line segment, none of the defining points deviates from the straight line segment by more than the given threshold. For simplification, Z and M coordinates are not considered when performing this operation. " ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
OBJECTIDSHAPEgeneralize
01MULTILINESTRING ((-90.2200062071 38.7700071663...MULTILINESTRING ((-90.2200062071 38.7700071663...
12MULTILINESTRING ((-89.3000059755 39.1000072739...MULTILINESTRING ((-89.3000059755 39.1000072739...
23MULTILINESTRING ((-84.5800047496 40.8800078382...MULTILINESTRING ((-84.5800047496 40.8800078382...
34MULTILINESTRING ((-94.3700070010 34.4000061520...MULTILINESTRING ((-94.3700070010 34.4000061520...
45MULTILINESTRING ((-90.6800062393 37.6000069289...MULTILINESTRING ((-90.6800062393 37.6000069289...
\n", "
" ], "text/plain": [ " OBJECTID SHAPE \\\n", "0 1 MULTILINESTRING ((-90.2200062071 38.7700071663... \n", "1 2 MULTILINESTRING ((-89.3000059755 39.1000072739... \n", "2 3 MULTILINESTRING ((-84.5800047496 40.8800078382... \n", "3 4 MULTILINESTRING ((-94.3700070010 34.4000061520... \n", "4 5 MULTILINESTRING ((-90.6800062393 37.6000069289... \n", "\n", " generalize \n", "0 MULTILINESTRING ((-90.2200062071 38.7700071663... \n", "1 MULTILINESTRING ((-89.3000059755 39.1000072739... \n", "2 MULTILINESTRING ((-84.5800047496 40.8800078382... \n", "3 MULTILINESTRING ((-94.3700070010 34.4000061520... \n", "4 MULTILINESTRING ((-90.6800062393 37.6000069289... " ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Generalize\n", "tornadoes['generalize'] = tornadoes.generalize(threshold = 4)\n", "tornadoes[['OBJECTID', 'SHAPE', 'generalize']].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__10. Boolean methods__\n", "\n", "Last but not least in our toolbox: a few convenience methods for property checking." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* General\n", "\n", "The method `is_valid`tells you whether an element is topologically correct." ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1\n", "1 1\n", "2 1\n", "3 1\n", "4 1\n", "Name: DB2GSE.ST_ISVALID(LOC), dtype: int64" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# is_valid\n", "sample_points.is_valid().head()\n", "# Hopefully, we obtain only 1s (True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Methods for points\n", "\n", "Use `is_3d` to assess if a point is 3-dimensional, and `is_measured` to know if it has an additional measure property." ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDLOCcoord_dimis_3dis_M
01POINT (14.000000 58.000000)200
12POINT Z(12.000000 35.000000 12)310
23POINT ZM(12.000000 66.000000 43 45)411
34POINT M(14.000000 58.000000 4)301
45POINT Z(12.000000 35.000000 12)310
\n", "
" ], "text/plain": [ " ID LOC coord_dim is_3d is_M\n", "0 1 POINT (14.000000 58.000000) 2 0 0\n", "1 2 POINT Z(12.000000 35.000000 12) 3 1 0\n", "2 3 POINT ZM(12.000000 66.000000 43 45) 4 1 1\n", "3 4 POINT M(14.000000 58.000000 4) 3 0 1\n", "4 5 POINT Z(12.000000 35.000000 12) 3 1 0" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# is_3d\n", "sample_points[\"is_3d\"] = sample_points.is_3d()\n", "\n", "# is_measured\n", "sample_points[\"is_M\"]=sample_points.is_measured()\n", "sample_points.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Geometric properties of shapes\n", "\n", "Quite intuitively, a geometry is said to be empty if it does not have any points." ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 0\n", "1 0\n", "2 0\n", "Name: DB2GSE.ST_ISEMPTY(DB2GSE.ST_BOUNDARY(SHAPE)), dtype: int64" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#is_empty\n", "counties[\"boundary\"].is_empty().head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Objects with type `ST_LineStrings`, `ST_MultiPoints`, and `ST_MultiLineStrings` are either simple or nonsimple. They are simple if they obey all topological rules that are imposed on their respective subclass and nonsimple if they bend a few rules. Here are the rules: `ST_LineString` is simple if it does not intersect its interior.\n", "`ST_MultiPoint` is simple if none of its elements occupy the same coordinate space. `ST_MultiLineString` is simple if none of its element's interiors intersect." ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(0, 28)" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# is_simple\n", "counties[\"is_simple\"] = counties.is_simple()\n", "filtered_counties = counties[counties['is_simple'] == 0]\n", "filtered_counties.shape" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(37, 28)" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "counties[\"is_simple\"] = counties['boundary'].is_simple()\n", "filtered_counties = counties[counties['is_simple'] == 0]\n", "filtered_counties.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, you can ask yourself whether a curve is closed i.e. has a loop structure. Use the `is_closed` method for this." ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 0\n", "1 0\n", "Name: DB2GSE.ST_ISCLOSED(GEOMETRY), dtype: int64" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#is_closed\n", "sample_lines.is_closed().head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: additional geospatial functionalities are available in Db2 geospatial extender (DB2GSE) and can be used through ibmdbpy with the `IdaDataBase.ida_query` function . Find out more about DB2GSE on the [IBM Knowledge Center](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.spatial.topics.doc/doc/csbp1001.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "__11. Operations on multiple geometry columns__\n", "\n", "The last section of this guide shows you how to use geospatial methods which return an IdaGeoDataFrame. So far, all the methods we have showcased return an IdaGeoSeries i.e. only one column. The methods we show you here are applied on an IdaGeoDataFrame (say, `ida1`) with a second IdaGeoDataFrame as parameter (say, `ida2`). It returns a new IdaGeoDataFrame (let's call it `result`) containing three columns: IDs from `ida1`, IDs from `ida2`and the result of a given operation for each pair of input values. Let's look at an example." ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(31, 19)\n", "(5, 19)\n", "column1_for_db2gse: SHAPE\n", "column2_for_db2gse: SHAPE\n", "(155, 3)\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
0622313872.238731
1618402280.202293
2629393832.304173
361092927.972900
461634064.978945
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 6 2231 3872.238731\n", "1 6 1840 2280.202293\n", "2 6 2939 3832.304173\n", "3 6 109 2927.972900\n", "4 6 163 4064.978945" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Distance \n", "\n", "counties = IdaGeoDataFrame(idadb,'SAMPLES.GEO_COUNTY',indexer='OBJECTID')\n", "counties.set_geometry('SHAPE')\n", "ida1 = counties[counties['NAME'] == 'Washington']\n", "ida2 = counties[counties['NAME'] == 'Kent']\n", "\n", "print(ida1.shape)\n", "print(ida2.shape)\n", "\n", "result = ida1.distance(ida2,unit = 'KILOMETER')\n", "\n", "print(result.shape)\n", "result.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So what happened here ? `ida1`and `ida2` are obtained with a filtering statement on the `NAME` column. There are 31 counties called Washington in the US, and 5 called Kent, hence the number of rows in each table. The two IdaGeoDataFrames have inherited their `geometry` attribute from the `counties` IdaGeoDataFrame. Their column `SHAPE` is set as geometry. `result` is a new IdaGeoDataFrame obtained by applying the `distance` method to `ida1` with `ida2` as parameter. It contains as many distances as pairs (geom1, geom2) of elements geom1 from the geometry column of `ida1`, geom2 from the geometry column of `ida2`. 31 times 5 makes 155, so we are good. \n", "\n", "Okay, but what exactly does this distance actually stands for? Under the hood, ibmdbpy executes an SQL query using db2gse.ST_DISTANCE. This geospatial function from Db2 returns *the shortest distance between any point in the first geometry to any point in the second geometry*, measured in the default or given units. Here we have chosen to output the result in kilometers.\n", "\n", "A note on this particular method: our input geometries are polygons not points, so you might wonder which points are actually used to compute this distance. The output is the closest distance between both geometries, i.e. the distance between the two closest points from each polygons. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Methods returning a new geometry\n", "\n", "The `difference` method and the `intersection` method respectively compute the geometry corresponding to the geospatial difference and the intersection of two input geometries. \n", "\n", "More precisely, `difference` returns the part of the first geometry which does not intersect with the second geometry. Both geometries must be of the same dimension. If either geometry is null, null is returned. If the first geometry is empty, an empty geometry of type ST_Point is returned. If the second geometry is empty, then the first geometry is returned unchanged.\n", "\n", "`intersection` returns a new geometry that is the intersection of the two given geometries, represented in the spatial reference system of the first geometry. If possible, the specific type of the returned geometry will be `ST_Point`, `ST_LineString`, or `ST_Polygon`. For example, the intersection of a point and a polygon is either empty or a single point, represented as ST_MultiPoint." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: SHAPE\n", "column2_for_db2gse: SHAPE\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
02109POLYGON ((-96.6219873342 30.0442882117, -96.61...
12163POLYGON ((-96.6219873342 30.0442882117, -96.61...
222231POLYGON ((-96.6219873342 30.0442882117, -96.61...
322939POLYGON ((-96.6219873342 30.0442882117, -96.61...
421840POLYGON ((-96.6219873342 30.0442882117, -96.61...
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 2 109 POLYGON ((-96.6219873342 30.0442882117, -96.61...\n", "1 2 163 POLYGON ((-96.6219873342 30.0442882117, -96.61...\n", "2 2 2231 POLYGON ((-96.6219873342 30.0442882117, -96.61...\n", "3 2 2939 POLYGON ((-96.6219873342 30.0442882117, -96.61...\n", "4 2 1840 POLYGON ((-96.6219873342 30.0442882117, -96.61..." ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Difference\n", "\n", "ida1 = counties[counties['NAME'] == 'Austin']\n", "ida2 = counties[counties['NAME'] == 'Kent']\n", "result = ida1.difference(ida2)\n", "result.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you can apply such a method on an IdaGeoDataFrame to itself. For example here:" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: GEOMETRY\n", "column2_for_db2gse: GEOMETRY\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
011POINT (1.000000 2.000000)
112POINT (1.000000 2.000000)
213POINT EMPTY
314POINT EMPTY
415POINT EMPTY
516POINT EMPTY
617POINT EMPTY
721POINT (1.000000 2.000000)
822POLYGON ((0.000000 0.000000, 5.000000 0.000000...
923POINT EMPTY
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 1 1 POINT (1.000000 2.000000)\n", "1 1 2 POINT (1.000000 2.000000)\n", "2 1 3 POINT EMPTY\n", "3 1 4 POINT EMPTY\n", "4 1 5 POINT EMPTY\n", "5 1 6 POINT EMPTY\n", "6 1 7 POINT EMPTY\n", "7 2 1 POINT (1.000000 2.000000)\n", "8 2 2 POLYGON ((0.000000 0.000000, 5.000000 0.000000...\n", "9 2 3 POINT EMPTY" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Intersection\n", "\n", "sample_geometries.intersection(sample_geometries).head(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`union` gives you the geometric union of the two input shapes. Geometry must be compatible e.g. only curves or only surfaces." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: GEOMETRY\n", "column2_for_db2gse: GEOMETRY\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
011101110MULTILINESTRING ((850.000000 250.000000, 850.0...
111111110MULTILINESTRING ((90.000000 90.000000, 100.000...
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 1110 1110 MULTILINESTRING ((850.000000 250.000000, 850.0...\n", "1 1111 1110 MULTILINESTRING ((90.000000 90.000000, 100.000..." ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new_sample = sample_lines.union(sample_mlines)\n", "new_sample.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Additionally, there is a bunch of methods to explore relationships between geometries, for example whether two geometries overlap (surfaces), intersect (curves), or contain, cross or touches one another. In the result column, you will obtain 0 for False, 1 for True, NaN if not applicable. Here are a few examples." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Inclusion\n", "\n", "`contains` indicates whether an element from the first geometry contains an element fron the second geometry.\n", "\n", "`within` indicates whether an element from the first geometry is within an element fron the second geometry.\n", "\n", "`equals` method can be used to look for duplications." ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: SHAPE\n", "column2_for_db2gse: SHAPE\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
0134214471
1134214501
2134215171
3134215181
4134215191
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 134 21447 1\n", "1 134 21450 1\n", "2 134 21517 1\n", "3 134 21518 1\n", "4 134 21519 1" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Contains\n", "customers = IdaGeoDataFrame(idadb,'SAMPLES.GEO_CUSTOMER',indexer='OBJECTID')\n", "customers.set_geometry('SHAPE')\n", "\n", "ida1 = customers[customers['INSURANCE_VALUE']>250000]\n", "ida2 = counties[counties['NAME']=='Madison']\n", "result = ida2.contains(ida1)\n", "\n", "result[result['RESULT']==1].head()" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: GEOMETRY\n", "column2_for_db2gse: GEOMETRY\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
013NaN
1111.0
214NaN
3170.0
4150.0
5160.0
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 1 3 NaN\n", "1 1 1 1.0\n", "2 1 4 NaN\n", "3 1 7 0.0\n", "4 1 5 0.0\n", "5 1 6 0.0" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Within\n", "sample_geometries.within(sample_geometries).head(6)" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
IDGEOMETRYmin_Xmin_Ymin_Zmin_M
01POINT (1.000000 2.000000)1.02.0NoneNone
12POLYGON ((0.000000 0.000000, 5.000000 0.000000...0.00.0NoneNone
25LINESTRING (33.000000 2.000000, 34.000000 3.00...33.02.0NoneNone
\n", "
" ], "text/plain": [ " ID GEOMETRY min_X min_Y min_Z \\\n", "0 1 POINT (1.000000 2.000000) 1.0 2.0 None \n", "1 2 POLYGON ((0.000000 0.000000, 5.000000 0.000000... 0.0 0.0 None \n", "2 5 LINESTRING (33.000000 2.000000, 34.000000 3.00... 33.0 2.0 None \n", "\n", " min_M \n", "0 None \n", "1 None \n", "2 None " ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Reminder: how the data looks like\n", "criterion = (sample_geometries['ID'] == 1)|(sample_geometries['ID'] == 2)|(sample_geometries['ID'] == 5)\n", "sample_geometries[criterion].head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Comment: a point is considered to contain itself. Polygon with ID 2 contains point with ID 1. Empty geometries can't contain any object so NaN is returned. Pairs of objects without inclusion relationship get a 0." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Intersections and contact\n", "\n", "A handful of methods to help you explore these topics: `crosses` (whether the geometry of the first IdaGeoDataFrame crosses the second),`intersects` (whether two geometries intersect each other), `mbr_intersects` (same for minimum bounding rectangles of each geometry), `disjoint` (opposite of intersection), `overlaps` () and `touches` ().\n", "\n", "Basic rules about geometry dimensions apply. For example, in the case of `crosses`: if the intersection of the two geometries results in a geometry that has a dimension that is one less than the maximum dimension of the two given geometries, and if the resulting geometry is not equal to any of the two given geometries, then 1 is returned. Incompatible dimensions: if the first geometry is a polygon or a multipolygon, or if the second geometry is a point or multipoint, or if any of the geometries is null value or is empty, then null is returned. Otherwise, the result is 0 (zero).\n", "\n", "Note: `intersects` returns the exact opposite result of `disjoint`.\n", "\n", "More details about intersections and allowed relationships between data types: [IBM Knowledge Center](https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.spatial.topics.doc/doc/rsbp4060.html)." ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: SHAPE\n", "column2_for_db2gse: SHAPE\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
0127001
1152441
21575621
3828021
48168861
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 1 2700 1\n", "1 1 5244 1\n", "2 1 57562 1\n", "3 8 2802 1\n", "4 8 16886 1" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Crosses\n", "\n", "result = tornadoes.crosses(tornadoes)\n", "result[result[\"RESULT\"]==1].head()" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: SHAPE\n", "column2_for_db2gse: SHAPE\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
031340
131600
232630
333950
4317090
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 3 134 0\n", "1 3 160 0\n", "2 3 263 0\n", "3 3 395 0\n", "4 3 1709 0" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Intersects\n", "result = ida1.intersects(ida2)\n", "result.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A last example:`overlaps`. If the intersection of the geometries results in a geometry of the same dimension but is not equal to either of the given geometries, it returns 1 . Otherwise, it returns 0." ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "column1_for_db2gse: SHAPE\n", "column2_for_db2gse: SHAPE\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INDEXERIDA1INDEXERIDA2RESULT
0479145340
1479146740
2479142460
3479145300
4479142770
\n", "
" ], "text/plain": [ " INDEXERIDA1 INDEXERIDA2 RESULT\n", "0 4791 4534 0\n", "1 4791 4674 0\n", "2 4791 4246 0\n", "3 4791 4530 0\n", "4 4791 4277 0" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Overlaps\n", "\n", "texas59 = tornadoes[(tornadoes[\"ST\"]=='TX')&(tornadoes[\"YR\"]==1959)]\n", "texas60 = tornadoes[(tornadoes[\"ST\"]=='TX')&(tornadoes[\"YR\"]==1960)]\n", "# tornado records from Texas in 1959 and 1960 respectively\n", "\n", "result = texas60.overlaps(texas59)\n", "result.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Wow! You reached the end of this notebook about ibmdbpy geospatial extension! You learned so much!\n", "\n", "Before closing this notebook, a last step to perform: closing the connection to Db2." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "\n", "__Close the connection__" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Connection closed.\n" ] } ], "source": [ "idadb.close()\n", "#idadb.reconnect()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Where to go from here?\n", "\n", "__Congratulations!__ You are now familiar with advanced functionalities of ibmdbpy's geospatial extension! You are ready to get hands-on experience by playing with other notebooks of this series, and apply your new skills on your own data!\n", "\n", "* Getting started with ibmdbpy :\n", " \n", " [Basics](./ibmdbpy_GettingStarted_1-basics.ipynb)\n", " \n", " \n", "* More on ibmdbpy's geospatial extension:\n", "\n", " [GeoBasics](./ibmdbpy_GettingStarted_2-geo_basics.ipynb)\n", " \n", "\n", "* More practice : analyze the Museums dataset, understand how to create IdaDataFrames and IdaGeoDataFrames:\n", " \n", " [Preprocessing](../MuseumsUseCase/ibmdbpy_Museums_DataAnalysis_1-preprocessing.ipynb)\n", "\n", " [Geospatial recommendation](../MuseumsUseCase/ibmdbpy_Museums_DataAnalysis_2-geospatial.ipynb)\n", "\n", "\n", "* Machine learning with ibmdbpy: \n", " \n", " [Naïve Bayes](../MachineLearning/ibmdbpy_NaiveBayes.ipynb)\n", "\n", " [Association Rules Mining](../MachineLearning/ibmdbpy_AssociationRulesMining.ipynb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "____\n", "\n", "__Authors__\n", "\n", "Eva Feillet - ML intern, IBM Cloud and Cognitive Software @ IBM Lab in Böblingen, Germany" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 2 }