# Getting started with ibmdbpy - Part 2: Geospatial extension

This notebook showcases the key abstractions and features of the ibmdbpy geospatial extension. It provides you with step-by-step examples to get started with the package and its geospatial functionalities.

____

### Accelerate Python analytics with in-database processing by using ibmdbpy and IBM Db2 Warehouse
 
The ibmdbpy project provides access to in-database algorithms in IBM Db2 Warehouse through a Python interface for data manipulation. It accelerates Python analytics by seamlessly pushing operations written in Python into the underlying database for execution, thereby benefitting from in-database performance-enhancing features, such as columnar storage and parallel processing. For more details about ibmdbpy, please refer to the [documentation](https://pythonhosted.org/ibmdbpy/index.html) and to the dedicated [Git repository](https://github.com/ibmdbanalytics/ibmdbpy/tree/master/ibmdbpy). This notebook provides you with an overview of ibmdbpy geospatial extension.

__About ibmdbpy's geospatial extension__

Ibmdbpy supports a wrapper for spatial functions which enables you to generate and analyze spatial information about geographic features, and to store and manage the data on which this information is based. The spatial data is identified by ibmdbpy as a special class called IdaGeoDataFrame that extends all the properties of an IdaDataFrame and has additional methods for geospatial types like ST_Point, ST_LineString, ST_Polygon, etc. Like the Python package GeoPandas, which is an extension of Pandas and provides the GeoDataFrame abstraction, the ibmdbpy spatial extension lets you work with IdaGeoDataFrames and IdaGeoSeries as extensions of IdaDataFrames and IdaSeries. The Python wrappers for the spatial functions which Db2 currently supports make the querying process much simpler for users. For more details about the ibmdbpy geospatial extension, please refer to the dedicated [documentation](https://pythonhosted.org/ibmdbpy/geospatial.html). More details about the Db2 spatial extender can be found on the [IBM Knowledge Center](https://www.ibm.com/support/knowledgecenter/SSCJDQ/com.ibm.swg.im.dashdb.spatial.doc/doc/csbp1001.html).

__Prerequisites__
* Db2 account: see [IBM Cloud](https://cloud.ibm.com/login) or [Db2 Warehouse](https://www.ibm.com/support/knowledgecenter/en/SSCJDQ/com.ibm.swg.im.dashdb.kc.doc/welcome.html)
* Db2 driver: learn more on [IBM Knowledge Center](https://www.ibm.com/support/knowledgecenter/en/SSFMBX/com.ibm.swg.im.dashdb.doc/connecting/connect_applications_by_type.html) and see [IBM Support](https://www.ibm.com/support/pages/db2-jdbc-driver-versions-and-downloads)
* Having installed the [ibmdbpy](https://pypi.org/project/ibmdbpy/) python library with pip: 
> pip install ibmdbpy 
* Optional dependency for JDBC is the [jaydebeapi](https://pypi.org/project/JayDeBeApi/) library. Run the following command to install ibmdbpy, as well as the dependencies for the JDBC feature:
> pip install ibmdbpy[jdbc]

__Contents__

1. Establish connection to Db2 database
2. Open and define IdaGeoDataFrames
3. Learn how to handle IdaGeoDataFrames 

__Imports__

In [1]:
from ibmdbpy import IdaDataBase, IdaDataFrame, IdaGeoDataFrame

## 1. Establish connection to Db2 database

In ibmdbpy, users can choose to use JDBC to connect to a remote Db2 instance. The JDBC connection is based on a Java Virtual Machine, so it is available on every machine that can run Java. You could also use an ODBC connection, but this is not the option we use in this notebook.

__JDBC__

First, you need to download a valid driver (more info on [IBM Support](https://www.ibm.com/support/pages/db2-jdbc-driver-versions-and-downloads)). Then you need to put the `db2jcc.jar` or `db2jcc4.jar` file in the ibmdbpy site-package folder. When ibmdbpy runs, it checks whether one of those files exists in its installation folder and uses it to establish the connection. 

More details on IBM Knowledge Center:
* JDBC for [IBM Db2 Warehouse](https://www.ibm.com/support/knowledgecenter/en/SSCJDQ/com.ibm.swg.im.dashdb.doc/connecting/connect_connecting_jdbc_applications.html)
* JDBC for [IBM Db2 on Cloud](https://www.ibm.com/support/knowledgecenter/en/SSFMBX/com.ibm.swg.im.dashdb.doc/connecting/connect_connecting_jdbc_applications.html)
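
To check the driver placement described above, a small sketch like the following may help. It only uses the Python standard library; the helper names `find_package_dir` and `find_driver_jars` are hypothetical and not part of ibmdbpy, and we assume that the "ibmdbpy site-package folder" is the folder the package is imported from.

```python
import importlib.util
import os

# Driver file names as given in the text above.
DRIVER_JARS = ("db2jcc.jar", "db2jcc4.jar")


def find_package_dir(package_name):
    """Return the folder a package is imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return os.path.dirname(spec.origin)


def find_driver_jars(package_name="ibmdbpy"):
    """List the driver jars present in the package folder (hypothetical helper)."""
    folder = find_package_dir(package_name)
    if folder is None:
        return []
    return [name for name in DRIVER_JARS
            if os.path.isfile(os.path.join(folder, name))]


print("ibmdbpy folder:", find_package_dir("ibmdbpy"))
print("driver jars found:", find_driver_jars())
```

If the second line prints an empty list, the jar file is probably not where ibmdbpy looks for it.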

In [2]:
# Enter the values for your database connection
dsn_database = "___" # e.g. "BLUDB"
dsn_hostname = "___" # e.g.: "abc.url.example"
dsn_port = "___"    # e.g. "50000"
dsn_uid = "___"     # e.g. "db2_1234"
dsn_pwd = "___"     # e.g. "zorglub"

In [3]:
connection_string='jdbc:db2://'+dsn_hostname+':'+dsn_port+'/'+dsn_database+':user='+dsn_uid+';password='+dsn_pwd+";" 
# the IdaDataBase object holds the connection to database.
idadb=IdaDataBase(dsn=connection_string)

Congratulations! You successfully connected to Db2 with ibmdbpy! When you are done, use `idadb.close()` to close the connection. To reconnect, or if the connection was broken, just use `idadb.reconnect()`. 
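
The string concatenation in the cell above can also be wrapped in a small function, which makes the expected format easier to read and to reuse. This is only a sketch: `build_jdbc_dsn` is a hypothetical helper, not an ibmdbpy function, and the values are the placeholder examples from the comments above.

```python
def build_jdbc_dsn(hostname, port, database, uid, pwd):
    """Assemble a Db2 JDBC connection string in the format used in this notebook."""
    return "jdbc:db2://{}:{}/{}:user={};password={};".format(
        hostname, port, database, uid, pwd
    )


# Placeholder values taken from the comments of the cell above.
example_dsn = build_jdbc_dsn("abc.url.example", 50000, "BLUDB", "db2_1234", "zorglub")
print(example_dsn)

# With a reachable database, the string is used exactly as before:
# idadb = IdaDataBase(dsn=example_dsn)
```

Since the port is inserted with `format`, it can be given either as a string or as an integer.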

__Verbosity and autocommit__

The verbose mode automatically prints all SQL communication between ibmdbpy and Db2, which can be very useful for debugging or for understanding how ibmdbpy works. Choose the mode with `set_verbose()` or by setting the `verbosity` option when defining the IdaDataBase object. We encourage you to look at the printed statements at first, then feel free to switch verbosity off. Note that printing adds delay when running cells.

In [4]:
# Verbosity
from ibmdbpy.utils import set_verbose
set_verbose(False) # set it to True if you want to see the detail of ibmdbpy operations

By default, the environment variable `AUTOCOMMIT` is set to True, which means that every SQL statement submitted through the connection is executed within its own transaction and then committed implicitly. If `AUTOCOMMIT` is set to False, all changes made after the last explicit commit are discarded when you close the connection to Db2.
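
To make the commit semantics concrete without needing a Db2 connection, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in. It is not ibmdbpy code and does not use the `AUTOCOMMIT` variable; it only illustrates the general rule that uncommitted changes are lost when a connection is closed.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "autocommit_demo.db")


def count_rows(path):
    """Open a fresh connection and count the rows of the demo table."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]
    finally:
        conn.close()


# Create the table and commit it.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE demo (x INTEGER)")
conn.commit()

# Insert a row but close the connection without committing.
conn.execute("INSERT INTO demo VALUES (1)")
conn.close()
rows_without_commit = count_rows(db_path)

# Insert a row and commit explicitly before closing.
conn = sqlite3.connect(db_path)
conn.execute("INSERT INTO demo VALUES (2)")
conn.commit()
conn.close()
rows_with_commit = count_rows(db_path)

print("rows after uncommitted insert:", rows_without_commit)
print("rows after committed insert:", rows_with_commit)
```

The first insert is discarded and only the committed one survives, which mirrors the behavior described above for a connection where autocommit is disabled.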

Let's get to it!

## 2. Open and define IdaGeoDataFrames

Let's explore the functionalities of ibmdbpy's geospatial extension! In this notebook, we will get familiar with IdaGeoDataFrames. An IdaGeoDataFrame is a reference to a spatial table in a remote instance of Db2. It has inherited the properties of an IdaDataFrame and benefits from additional functionalities derived from Db2 spatial extension. 

In the examples below, we simply use sample data available out of the box in Db2 Warehouse. The `SAMPLES.GEO_COUNTY` dataset contains geographical and administrative data about US counties.

__Load the data__

* Method 1: open the data directly as IdaGeoDataFrame

Simply create an object of the IdaGeoDataFrame class. The first argument is the name of the IdaDataBase object which holds the connection to the database, the second argument designates the Db2 table you want to open. Optionally you may set an `indexer` column and a `geometry` column.

Note that in the following cell we set the `OBJECTID` column as indexer when defining the IdaGeoDataFrame. Otherwise, since the data is partitioned, there is no guarantee that rows are always displayed in the same order (although, in practice, ibmdbpy applies an implicit sorting). To get a behavior closer to that of Pandas, we therefore explicitly set an eligible column as index.

In [5]:
# prepopulated table in Db2
idageodf = IdaGeoDataFrame(idadb, 'SAMPLES.GEO_COUNTY', indexer='OBJECTID')

In [6]:
idageodf.head()


Unnamed: 0,OBJECTID,SHAPE,STATEFP,COUNTYFP,COUNTYNS,NAME,GEOID,NAMELSAD,LSAD,CLASSFP,MTFCC,CSAFP,CBSAFP,METDIVFP,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON
0,1,"MULTIPOLYGON (((-99.4756582604 33.8340108094, ...",48,487,1384029,Wilbarger,48487,Wilbarger County,6,H1,G4020,,46900.0,,A,2514474000.0,18257915.0,34.0849199,-99.2424397
1,2,"MULTIPOLYGON (((-96.6219873342 30.0442882117, ...",48,15,1383793,Austin,48015,Austin County,6,H1,G4020,288.0,26420.0,,A,1674401000.0,25610780.0,29.8919013,-96.2701696
2,3,"MULTIPOLYGON (((-99.4497297204 46.6316377481, ...",38,47,1035620,Logan,38047,Logan County,6,H1,G4020,,,,A,2571371000.0,47715597.0,46.469278,-99.5045846
3,4,"MULTIPOLYGON (((-107.4817473750 37.0000108736,...",8,67,198148,La Plata,8067,La Plata County,6,H1,G4020,,20420.0,,A,4382664000.0,19545452.0,37.2873673,-107.8397178
4,5,"MULTIPOLYGON (((-91.2589262966 36.2578866492, ...",5,121,69178,Randolph,5121,Randolph County,6,H1,G4020,,,,A,1689166000.0,9968439.0,36.3412985,-91.0284409


* Method 2: convert an IdaDataFrame with eligible geometry column into an IdaGeoDataFrame

You can create an IdaGeoDataFrame by converting an IdaDataFrame that has an eligible geometry column. Note that an IdaGeoDataFrame inherits all the properties of an IdaDataFrame, so you could also simply open the table as an IdaDataFrame if you do not need to manipulate its geospatial information, and convert it into an IdaGeoDataFrame at any time. For the conversion, use the `IdaGeoDataFrame.from_IdaDataFrame` function. In any case, an IdaGeoDataFrame is characterized by the presence of a column with a geometry type: its data type belongs to the `ST_Geometry` family (in Db2: `db2gse.ST_Geometry`). 

In [7]:
# here we open the Db2 table SAMPLES.GEO_COUNTY as IdaDataFrame
idadf = IdaDataFrame(idadb, 'SAMPLES.GEO_COUNTY', indexer='OBJECTID')

In [8]:
# now we convert this IdaDataFrame into an IdaGeoDataFrame
# we set the 'SHAPE' column as geometry
geo_idadf = IdaGeoDataFrame.from_IdaDataFrame(idadf, geometry = "SHAPE")

__Understand the geometry column__

Have a look at the data types: you can see that the `SHAPE` column is of type `ST_MULTIPOLYGON`. What does that mean? It is a geospatial data type that comes from db2gse, the Db2 Spatial Extender.

In [9]:
# Take a look at the datatypes of our dataset
idageodf.dtypes

Unnamed: 0,TYPENAME
OBJECTID,INTEGER
SHAPE,ST_MULTIPOLYGON
STATEFP,VARCHAR
COUNTYFP,VARCHAR
COUNTYNS,VARCHAR
NAME,VARCHAR
GEOID,VARCHAR
NAMELSAD,VARCHAR
LSAD,VARCHAR
CLASSFP,VARCHAR


<font color='blue'>Note on geospatial data types</font>

* The simplest spatial data item consists of two coordinates which define the position of a single geographic feature. It is denoted with the type `ST_Point`. You may also define 3D points with an angle measure; they are still of type `ST_Point`. 

* More extensive spatial data items may consist of several coordinates that define a linear path (such as a road or river), denoted as `ST_LineString`, or of coordinates that define the boundary of an area (for example, the boundary of a land parcel or flood plain), denoted as `ST_Polygon`. Each spatial data item is an instance of a spatial data type. 

* A collection of such instances can also be defined: `ST_MultiLineString` for the union of several `ST_LineString` objects, `ST_MultiPolygon` for the union of several `ST_Polygon` objects, and so on. Spatial data types are structured types that belong to a single hierarchy, `ST_Geometry`. 

* If a column has mixed data types (for example, `ST_LineString` and `ST_MultiLineString`, or `ST_Polygon` and `ST_Point`), then the type `ST_Geometry` is attributed to the column.

A dataset may have several columns with a geospatial datatype. However, only one column can be defined as *the* geometry column of the IdaGeoDataFrame. 
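
The rules in the note above can be sketched in a few lines of plain Python. The geometries in this notebook are displayed as well-known text (WKT), for example `MULTIPOLYGON (((...)))`; the functions `st_type_of` and `column_type` below are hypothetical helpers that map such strings to the type names listed above. They only mimic the naming rules and do not call Db2.

```python
# WKT tag -> spatial type name, limited to the types named in the note above.
WKT_TO_ST_TYPE = {
    "POINT": "ST_Point",
    "LINESTRING": "ST_LineString",
    "POLYGON": "ST_Polygon",
    "MULTILINESTRING": "ST_MultiLineString",
    "MULTIPOLYGON": "ST_MultiPolygon",
}


def st_type_of(wkt):
    """Return the spatial type name for one WKT string (ST_Geometry if unknown)."""
    head = wkt.strip().split("(", 1)[0].split()
    tag = head[0].upper() if head else ""
    return WKT_TO_ST_TYPE.get(tag, "ST_Geometry")


def column_type(wkt_values):
    """A column with a single type keeps it; a mixed column becomes ST_Geometry."""
    types = {st_type_of(value) for value in wkt_values}
    return types.pop() if len(types) == 1 else "ST_Geometry"


print(st_type_of("POINT (8.68 49.41)"))
print(column_type(["POLYGON ((0 0, 1 0, 1 1, 0 0))", "POINT (0 0)"]))
```

The second call mixes a polygon and a point, so the sketch reports the root type `ST_Geometry`, as described in the last bullet.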

<font color='blue'>Defining the geometry column</font>

An IdaGeoDataFrame has by definition at least one column which can be defined as geometry. However the geometry is not attributed by default. It needs to be set explicitly. Here are two ways to define a geometry column.

1. When defining the IdaGeoDataFrame, use the `geometry` option:
> idageodf = IdaGeoDataFrame(idadb, "TABLENAME", geometry = "GEOM_COL")

2. Or afterwards, use `set_geometry`:
> idageodf.set_geometry('GEOM_COL')

If you want to switch the geometry column:
> idageodf.set_geometry('OTHER_GEOM_COL')

In [10]:
# Has a geometry column already been set?
try:
    idageodf.geometry.column
except Exception:
    print("Geometry property has not been set yet.")

Geometry property has not been set yet.


In [11]:
# Explicitly designate a column as geometry.
idageodf.set_geometry('SHAPE')

In [12]:
# Check the change.
idageodf.geometry.column

'SHAPE'

Congratulations! You have made your first steps with IdaGeoDataFrames, ibmdbpy's abstraction dedicated to geospatial data. Let's see how to manipulate them and what interesting features ibmdbpy offers!

## 3. Learn how to handle IdaGeoDataFrames 

There are two types of functions you can apply to IdaGeoDataFrames. 
* Single-input functions work on a single IdaGeoDataFrame with one spatial column.
* Double-input functions can either work on a single IdaGeoDataFrame with two spatial columns or two different IdaGeoDataFrames with one spatial column each. 

When calling methods to perform operations on IdaGeoDataFrames, these operations will automatically be performed on the chosen geometry column(s). 

For example, if you want to compute the area of a county, provided that the geometry column contains the coordinates of each county as a collection of polygons, calling `idageodf.area()` will compute the area of each county. 
> idageodf["AREA"] = idageodf.area()

A new column called `AREA` is created in the Db2 table. The area is computed on the basis of the collection of polygons contained in the `SHAPE` column of the IdaGeoDataFrame, which we have defined as the geometry column.

But if you want to perform an operation on a geospatial column which is not defined as *the* geometry column, then you may either set this column as the new geometry column, or explicitly specify this column as the targeted IdaGeoSeries of the method you call. 
> idageodf["AREA"] = idageodf["SHAPE"].area()

Under the hood, an SQL statement using the `db2gse.ST_AREA` function is executed. You can print this statement by enabling the verbose option.
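
To build some intuition for what "the area of a collection of polygons" means, here is a planar sketch based on the shoelace formula. It is only an illustration under simplifying assumptions: coordinates are treated as flat x/y values, whereas Db2 works on geographic coordinates and handles units, so this is not how `db2gse.ST_AREA` is implemented. All function names are hypothetical.

```python
def ring_area(ring):
    """Planar area enclosed by a ring given as [(x, y), ...] (shoelace formula)."""
    twice_area = 0.0
    # Pair each vertex with the next one, wrapping around to the first vertex.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def polygon_area(polygon):
    """A polygon is [outer_ring, hole_1, ...]; hole areas are subtracted."""
    outer, holes = polygon[0], polygon[1:]
    return ring_area(outer) - sum(ring_area(hole) for hole in holes)


def multipolygon_area(multipolygon):
    """A multipolygon is a list of polygons; their areas add up."""
    return sum(polygon_area(polygon) for polygon in multipolygon)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
hole = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]
unit_square = [(10, 10), (11, 10), (11, 11), (10, 11)]

print(multipolygon_area([[square, hole], [unit_square]]))
```

The example combines a 2 x 2 square with a 1 x 1 hole and a separate unit square, in the same nested structure as a `MULTIPOLYGON`.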

Let's get familiar with IdaGeoDataFrame manipulation!

__Operations on a single IdaGeoDataFrame__

Here we create new columns by applying methods directly to our IdaGeoDataFrame, as explained above.

* Compute the area of each county and store the information in a new column

We apply the `area` method as explained in this section's introduction.

In [13]:
# optional: define a unit, here the area will be given in square kilometers.

idageodf['AREA_KM2'] = idageodf.area(unit='KILOMETER')

The `DB2GSE.ST_AREA()` function has been applied to the `SHAPE` column in order to compute the area associated with each polygon. The output of this function is an IdaGeoSeries. We have obtained a new column that is added to the original IdaGeoDataFrame, meaning the underlying Db2 table has been physically modified.  

* Compute the coordinates of the line defining each county's boundary 

We apply the `boundary` method to our `SHAPE` column.

In [14]:
idageodf['BOUNDARIES'] = idageodf.boundary()

In [15]:
# Take a look at the new columns!
idageodf.head()

Unnamed: 0,OBJECTID,SHAPE,STATEFP,COUNTYFP,COUNTYNS,NAME,GEOID,NAMELSAD,LSAD,CLASSFP,...,CSAFP,CBSAFP,METDIVFP,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON,AREA_KM2,BOUNDARIES
0,1,"MULTIPOLYGON (((-99.4756582604 33.8340108094, ...",48,487,1384029,Wilbarger,48487,Wilbarger County,6,H1,...,,46900.0,,A,2514474000.0,18257915.0,34.0849199,-99.2424397,2531.368865,"LINESTRING (-99.4756582604 33.8340108094, -99...."
1,2,"MULTIPOLYGON (((-96.6219873342 30.0442882117, ...",48,15,1383793,Austin,48015,Austin County,6,H1,...,288.0,26420.0,,A,1674401000.0,25610780.0,29.8919013,-96.2701696,1741.293545,"LINESTRING (-96.6219873342 30.0442882117, -96...."
2,3,"MULTIPOLYGON (((-99.4497297204 46.6316377481, ...",38,47,1035620,Logan,38047,Logan County,6,H1,...,,,,A,2571371000.0,47715597.0,46.469278,-99.5045846,2618.142954,"LINESTRING (-99.4497297204 46.6316377481, -99...."
3,4,"MULTIPOLYGON (((-107.4817473750 37.0000108736,...",8,67,198148,La Plata,8067,La Plata County,6,H1,...,,20420.0,,A,4382664000.0,19545452.0,37.2873673,-107.8397178,4404.97532,"LINESTRING (-107.4817473750 37.0000108736, -10..."
4,5,"MULTIPOLYGON (((-91.2589262966 36.2578866492, ...",5,121,69178,Randolph,5121,Randolph County,6,H1,...,,,,A,1689166000.0,9968439.0,36.3412985,-91.0284409,1701.895596,"LINESTRING (-91.2589262966 36.2578866492, -91...."


In [16]:
# What is the data type of our new columns?
idageodf[['AREA_KM2','BOUNDARIES']].dtypes

Unnamed: 0,TYPENAME
AREA_KM2,DOUBLE
BOUNDARIES,ST_GEOMETRY


As you can see, the area is a number of type DOUBLE, and the boundaries are displayed as `ST_LineString` objects in the view in the cell above. So you may wonder why our `BOUNDARIES` column has the `ST_GEOMETRY` data type. The reason is the `db2gse.ST_BOUNDARY` function, which ibmdbpy uses under the hood. By definition, this function returns a column of data type `ST_GEOMETRY`, whatever the input data type.

Reminder: Our dataset now has two eligible geometry columns, but there is no ambiguity about which is used because we have explicitly defined the `SHAPE` column as geometry. We can change this setting any time if needed with `set_geometry`.

__SQL queries on a single IdaGeoDataFrame__

Calling geospatial methods directly on the IdaGeoDataFrame works because we have explicitly set the geometry column, so ibmdbpy knows which feature to pick to perform the requested action. Alternatively, you can write custom SQL queries using db2gse to manipulate the tables. Here is an example. 

In [17]:
# here we only select the first 9 rows
idadb.ida_query('SELECT OBJECTID, DB2GSE.ST_BOUNDARY(SHAPE) AS BOUNDARIES FROM (SELECT "OBJECTID","SHAPE" FROM SAMPLES.GEO_COUNTY WHERE ("OBJECTID" < 10))')
                

Unnamed: 0,OBJECTID,BOUNDARIES
0,1,"LINESTRING (-99.4756582604 33.8340108094, -99...."
1,2,"LINESTRING (-96.6219873342 30.0442882117, -96...."
2,3,"LINESTRING (-99.4497297204 46.6316377481, -99...."
3,4,"LINESTRING (-107.4817473750 37.0000108736, -10..."
4,5,"LINESTRING (-91.2589262966 36.2578866492, -91...."
5,6,"LINESTRING (-123.3609766418 45.7796744083, -12..."
6,7,"LINESTRING (-81.6941497935 39.8426437809, -81...."
7,8,"LINESTRING (-87.7623450952 44.6770233072, -87...."
8,9,"LINESTRING (-72.1021669126 42.0288105765, -72...."
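
If you write such queries often, it can help to assemble the SQL text in a small function. The sketch below builds a flattened variant of the query above as a plain string; `boundary_query` is a hypothetical helper, and the identifier check is a simple precaution we add here, not something ibmdbpy requires.

```python
import re

# Accept plain or schema-qualified identifiers such as SAMPLES.GEO_COUNTY.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def boundary_query(table, id_column, geometry_column, max_id):
    """Build a query returning the boundary of each row with id below max_id."""
    for name in (table, id_column, geometry_column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unexpected identifier: %r" % (name,))
    return (
        "SELECT {id_col}, DB2GSE.ST_BOUNDARY({geom}) AS BOUNDARIES "
        "FROM {table} WHERE {id_col} < {max_id}"
    ).format(id_col=id_column, geom=geometry_column, table=table, max_id=int(max_id))


sql = boundary_query("SAMPLES.GEO_COUNTY", "OBJECTID", "SHAPE", 10)
print(sql)

# With an open connection, the string can be passed on as in the cell above:
# idadb.ida_query(sql)
```

The generated statement selects directly from the table instead of using a nested `SELECT`, which we expect to return the same rows for this simple filter.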


__Operations on two IdaGeoDataFrames__

* Preparation step

Here we create two IdaGeoDataFrames that we will use to showcase a new type of functionality.
Using a filtering statement, we obtain ida1 (county name is Austin) and ida2 (county name is Kent).

In [18]:
# Filtering on NAME
ida1 = idageodf[idageodf['NAME'] == 'Austin']
ida2 = idageodf[idageodf['NAME'] == 'Kent']

# How many rows in each dataframe?
print("Shape of ida1:")
print(ida1.shape)
print("Shape of ida2:")
print(ida2.shape)

Shape of ida1:
(1, 21)
Shape of ida2:
(5, 21)


In [19]:
ida2.head()

Unnamed: 0,OBJECTID,SHAPE,STATEFP,COUNTYFP,COUNTYNS,NAME,GEOID,NAMELSAD,LSAD,CLASSFP,...,CSAFP,CBSAFP,METDIVFP,FUNCSTAT,ALAND,AWATER,INTPTLAT,INTPTLON,AREA_KM2,BOUNDARIES
0,109,"MULTIPOLYGON (((-85.7825032878 42.7682081077, ...",26,81,1622983,Kent,26081,Kent County,6,H1,...,266.0,24340.0,,A,2195539000.0,62785346.0,43.0324971,-85.547446,2258.324412,"LINESTRING (-85.7825032878 42.7682081077, -85...."
1,163,"MULTIPOLYGON (((-71.3805426551 41.6503345267, ...",44,3,1219778,Kent,44003,Kent County,6,H4,...,148.0,39300.0,,N,436512000.0,50723547.0,41.6689489,-71.5776343,484.77799,"LINESTRING (-71.3805426551 41.6503345267, -71...."
2,1840,"MULTIPOLYGON (((-100.5174574836 33.3978716832,...",48,263,1383917,Kent,48263,Kent County,6,H1,...,,,,A,2337522000.0,1065234.0,33.1847797,-100.7697199,2338.550487,"LINESTRING (-100.5174574836 33.3978716832, -10..."
3,2231,"MULTIPOLYGON (((-75.1384627071 39.0027048751, ...",10,1,217271,Kent,10001,Kent County,6,H1,...,428.0,20100.0,,A,1518598000.0,549069137.0,39.0970884,-75.5029819,2080.430658,"LINESTRING (-75.1384627071 39.0027048751, -75...."
4,2939,"MULTIPOLYGON (((-75.7560059150 39.2460739025, ...",24,29,593907,Kent,24029,Kent County,6,H1,...,,,,A,717511200.0,353307672.0,39.2412793,-76.1259867,1076.193887,"LINESTRING (-75.7560059150 39.2460739025, -75...."


Comment: there are 5 counties called Kent in the US. If you look at the `STATEFP` column containing the FIPS codes, you can tell that they are located in Michigan, Rhode Island, Texas, Delaware and Maryland respectively.
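
The lookup described in the comment above can be sketched with a small dictionary. The mapping below is only a subset of the US state FIPS codes, limited to the states that appear in the outputs of this notebook; the `state_name` helper is hypothetical.

```python
# Subset of the US state FIPS codes (two-digit strings).
STATE_FIPS = {
    "05": "Arkansas",
    "08": "Colorado",
    "10": "Delaware",
    "24": "Maryland",
    "26": "Michigan",
    "38": "North Dakota",
    "44": "Rhode Island",
    "48": "Texas",
}


def state_name(statefp):
    """Translate a STATEFP value into a state name, restoring a dropped leading zero."""
    return STATE_FIPS[str(statefp).zfill(2)]


# STATEFP values of the five Kent counties, in the order of the output above.
kent_statefp = ["26", "44", "48", "10", "24"]
kent_states = [state_name(code) for code in kent_statefp]
print(kent_states)
```

The `zfill(2)` call is there because codes such as `08` may be displayed without their leading zero.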

* Mutual distance computation on two IdaGeoDataFrames

So we have two IdaGeoDataFrames, the first one with 1 row only, the second one with 5 rows in total. Let's use the `distance` method to compute the distance between any county of the first dataset and any county of the second dataset. We apply it on ida1, with ida2 as argument, and obtain a new table. <font color='blue'>The output is an IdaGeoDataFrame containing the row IDs of each element from the input datasets, and the computed result.</font>

Okay, but what exactly does this distance stand for? Under the hood, ibmdbpy executes an SQL query using db2gse.ST_DISTANCE. This geospatial function from Db2 returns the shortest distance between any point in the first geometry and any point in the second geometry, measured in the default or given units. Here we have chosen to output the result in miles.

In [20]:
result = ida1.distance(ida2, unit='MILE')
result.head()

column1_for_db2gse: SHAPE
column2_for_db2gse: SHAPE


Unnamed: 0,INDEXERIDA1,INDEXERIDA2,RESULT
0,2,109,1047.000738
1,2,163,1572.738833
2,2,2231,1308.108878
3,2,2939,1282.652624
4,2,1840,305.497341
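
As a plausibility check of these values, the following sketch computes great-circle distances with the haversine formula between the `INTPTLAT`/`INTPTLON` coordinates shown in the outputs above. Two assumptions are made here: these columns hold a point located inside each county, and the Earth is a sphere with a mean radius of 3958.8 miles. This is not what `db2gse.ST_DISTANCE` computes. Since `ST_DISTANCE` returns the shortest distance between the two shapes, the distance between interior points should come out somewhat larger for every pair.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# INTPTLAT, INTPTLON of Austin County (OBJECTID 2), copied from the output above.
austin = (29.8919013, -96.2701696)

# OBJECTID -> (INTPTLAT, INTPTLON, RESULT of the distance method), copied from above.
kent = {
    109: (43.0324971, -85.547446, 1047.000738),
    163: (41.6689489, -71.5776343, 1572.738833),
    1840: (33.1847797, -100.7697199, 305.497341),
    2231: (39.0970884, -75.5029819, 1308.108878),
    2939: (39.2412793, -76.1259867, 1282.652624),
}

interior_point_miles = {}
for objectid, (lat, lon, st_distance) in kent.items():
    distance = haversine_miles(austin[0], austin[1], lat, lon)
    interior_point_miles[objectid] = distance
    print(objectid, round(distance, 1), "miles between interior points, vs", st_distance)
```

The loop also shows the pairing logic of the two-input method: the single row of ida1 is combined with each of the five rows of ida2.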


In [21]:
print(type(result))

<class 'ibmdbpy.geoFrame.IdaGeoDataFrame'>


Many more functionalities are available: intersections, overlaps etc. Take a look at [ibmdbpy documentation](https://pythonhosted.org/ibmdbpy/geoFrame.html) to learn more! An **extensive guide** is also provided in the next notebook of this series.

____

__Final step: Close the connection__

Closing the IdaDataBase object is equivalent to closing the connection: once the connection is closed, it is not possible anymore to use the IdaDataBase instance and any IdaDataFrame instances that were opened in this notebook. Only the changes which have physically impacted the database remain.

So far, all the modifications you made are visible in the notebook as query outputs or as temporary views that have automatically been dropped after use. For example, no table corresponding to ida1, ida2 or their distances has been saved, so once the connection to the database is closed, none of these changes will remain in Db2. If you want to save a particular table to Db2, use:
> idadf.save_as("TABLE_NAME", clear_existing=True|False)

In [22]:
idadb.close()
# idadb.reconnect()

Connection closed.


## Where to go from here?

Congratulations! You are now familiar with the basic functionalities of ibmdbpy's geospatial extension! You are ready to get hands-on experience by playing with other notebooks of this series:

* Getting started with ibmdbpy: 
        
    [Basics](./ibmdbpy_GettingStarted_1-basics.ipynb)
    
    
* More on ibmdbpy's geospatial functions:

    [Extensive guide](./ibmdbpy_GettingStarted_3-geo_guide.ipynb)
    

* Ibmdbpy in practice: analyze the Museums dataset, understand how to create IdaDataFrames and IdaGeoDataFrames:
        
    [Preprocessing](../MuseumsUseCase/ibmdbpy_Museums_DataAnalysis_1-preprocessing.ipynb)

    [Geospatial recommendation](../MuseumsUseCase/ibmdbpy_Museums_DataAnalysis_2-geospatial.ipynb)


* Machine learning with ibmdbpy: 
        
    [Naïve Bayes](../MachineLearning/ibmdbpy_NaiveBayes.ipynb)

    [Association Rules Mining](../MachineLearning/ibmdbpy_AssociationRulesMining.ipynb)

____

__Authors__

Eva Feillet - ML intern, IBM Cloud and Cognitive Software @ IBM Lab in Böblingen, Germany