# CARTOFrames in Action

Data science workflows that leverage CARTO

## Motivation

Many data scientists rely on the de facto standard analysis tools in a Jupyter notebook. We want to support that with a Python module that lets these users develop analyses while interacting seamlessly with CARTO. We aim to feature:

* Stunning cartography from CARTO maps
* Seamless reading and writing to CARTO for arbitrary updates to a DataFrame
* Interactions with the Data Observatory to enrich a user's analysis


## Basics

You'll need the following for this:

1. Your CARTO username
2. Your API key
3. Your favorite table (I recommend duplicating it and using the copy, because we will perform some operations on it)

Paste these values in the quotes (`''`) below.

In [None]:
import pandas as pd
import cartoframes

username = '' # <-- insert your username here
api_key = '' # <-- insert your API key here
tablename = '' # <-- insert your tablename here

cc = cartoframes.CartoContext('https://{}.carto.com/'.format(username),
 api_key)
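
Pasting an API key into a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the credentials could instead be read from environment variables. The variable names `CARTO_USERNAME` and `CARTO_API_KEY` and the helper `load_carto_credentials` are assumptions made for this example, not part of cartoframes.

```python
import os


def load_carto_credentials(env=None):
    """Return (base_url, api_key) built from environment variables.

    CARTO_USERNAME and CARTO_API_KEY are hypothetical variable names
    chosen for this sketch; cartoframes does not read them itself.
    """
    env = os.environ if env is None else env
    username = env.get('CARTO_USERNAME', '').strip()
    api_key = env.get('CARTO_API_KEY', '').strip()
    if not username or not api_key:
        raise ValueError('Set CARTO_USERNAME and CARTO_API_KEY first')
    # Same URL pattern as in the cell above
    base_url = 'https://{}.carto.com/'.format(username)
    return base_url, api_key
```

The returned pair could then be passed to `cartoframes.CartoContext(base_url, api_key)` in place of the pasted values.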

### Read from your CARTO account

In [None]:
df = cc.read(tablename)
df.head()

### Make updates to the DataFrame and Sync Changes

In [None]:
df['favorite_cookie'] = 'pecan'
# use .loc for the conditional update to avoid pandas chained assignment
df.loc[df.index % 2 == 0, 'favorite_cookie'] = 'oatmeal'
cc.write(df, tablename, overwrite=True)

### Map it out

In [None]:
df.carto_map(color='favorite_cookie')

### Create a DataFrame from a query

Query your CARTO account and create a table from the query. Finally, pull that new table into a pandas DataFrame.

In [None]:
df_buffer = cc.query(query='''
 SELECT ST_Buffer(the_geom::geography, 10000)::geometry as the_geom,
 cartodb_id, mag, depth, place
 FROM all_month_3
 LIMIT 100
 ''',
 tablename='buffered_earthquakes')
df_buffer.head()
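
The `::geography` cast makes `ST_Buffer` interpret its distance in meters, so `10000` above is a 10 km radius. As a sketch, the query text could be generated by a small helper so that the radius and row limit are explicit. `buffer_query` and its parameters are hypothetical names for this example, and the table-name check is a simple guard, not full SQL escaping.

```python
import re


def buffer_query(table, radius_m=10000, limit=100):
    """Build the buffer query used above for a given table (hypothetical helper)."""
    # Allow only plain identifiers, since the name is interpolated into SQL
    if not re.fullmatch(r'[A-Za-z_][A-Za-z0-9_]*', table):
        raise ValueError('unexpected table name: {!r}'.format(table))
    return (
        'SELECT ST_Buffer(the_geom::geography, {radius})::geometry AS the_geom,\n'
        '       cartodb_id, mag, depth, place\n'
        '  FROM {table}\n'
        ' LIMIT {limit}'
    ).format(radius=float(radius_m), table=table, limit=int(limit))


# Example: a 5 km buffer around the first 10 rows
print(buffer_query('all_month_3', radius_m=5000, limit=10))
```

The resulting string could be passed as the `query` argument in the cell above.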

In [None]:
print(df_buffer.get_carto_datapage())

## Model workflow

Let's recreate the workflow from the blog post where the author explores [`dask`](http://dask.pydata.org/en/latest/) to split computations across multiple cores of a machine and complete tasks more quickly.

In [None]:
from dask import dataframe as dd
import pandas as pd
columns = ["name", "amenity", "Longitude", "Latitude"]
data = dd.read_csv('POIWorld.csv', usecols=columns)

In [None]:
with_name = data[data.name.notnull()]
with_amenity = data[data.amenity.notnull()]

is_starbucks = with_name.name.str.contains('[Ss]tarbucks')
is_dunkin = with_name.name.str.contains('[Dd]unkin')

starbucks = with_name[is_starbucks].compute()
dunkin = with_name[is_dunkin].compute()
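
With its default `regex=True`, `str.contains` treats the pattern as a regular expression and matches it anywhere in the string, much like `re.search`. The standard-library sketch below (the sample names are made up for illustration) shows what `'[Ss]tarbucks'` does and does not match, and how a case-insensitive flag would broaden it.

```python
import re

# Made-up sample names, for illustration only
names = ['Starbucks', 'starbucks coffee', 'STARBUCKS', 'Dunkin Donuts', 'Cafe Nero']

# Same pattern as in the cell above: only the first letter may vary in case
pattern = re.compile('[Ss]tarbucks')
matched = [n for n in names if pattern.search(n)]

# A case-insensitive variant also catches all-caps spellings
pattern_ci = re.compile('starbucks', flags=re.IGNORECASE)
matched_ci = [n for n in names if pattern_ci.search(n)]

print(matched)
print(matched_ci)
```

In pandas, the case-insensitive variant would be `with_name.name.str.contains('starbucks', case=False)`.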

In [None]:
starbucks['type'] = 'starbucks'
dunkin['type'] = 'dunkin'
coffee_places = pd.concat([starbucks, dunkin])
coffee_places.head(20)

## Write DataFrame to CARTO

In [None]:
import pandas as pd
import cartoframes

username = 'eschbacher' # <-- replace with your username
api_key = 'abcdefghijklmnopqrstuvwxyz' # <-- placeholder; replace with your API key

In [None]:
# specify columns for lng/lat so carto will create a geometry
cc.write(coffee_places,
 tablename='coffee_places',
 lnglat=('longitude', 'latitude'))

### Let's visualize this DataFrame

Category map on Dunkin' Donuts vs. Starbucks (aka, color by 'type')

In [None]:
from cartoframes import Layer
cc.map(layers=Layer('coffee_places', color='type', size=5),
 zoom=9, lng=-71.0637, lat=36.4275,
 interactive=False)

## Fast Food

In [None]:
is_fastfood = with_amenity.amenity.str.contains('fast_food')
fastfood = with_amenity[is_fastfood]
fastfood.name.value_counts().head(12)

In [None]:
ff = fastfood.compute()
ff.sync_carto(username=username,
 api_key=api_key,
 requested_tablename='fastfood_dask',
 lnglat_cols=('longitude', 'latitude'))

### Number of Fast Food places in this OSM dump

In [None]:
len(ff)

### Recreating the map from the blog

In [None]:
cc.map(layers=Layer('fastfood_dask', size=2, color='#FFF'))

### Going crazy with the Data Observatory

This method relies on you having the `do_augment_table` function that John had you load into your account. This might be kind of slow.

In [None]:
# DO measures: Total population,
# Population for whom poverty status is determined
# Median household income

data_obs_measures = [{'numer_id': 'us.census.acs.B01003001'},
 {'numer_id': 'us.census.acs.B17001001'},
 {'numer_id': 'us.census.acs.B19013001'}]
cc.data_augment('coffee_places', data_obs_measures)