{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Overwriting feature layers\n", "\n", "As content publishers, you may be required to keep certain web layers up to date. As new data arrives, you may have to append new features, update existing features etc. There are a couple of different options to accomplish this:\n", " \n", " - Method 1: editing individual features as updated datasets are available\n", " - Method 2: overwriting feature layers altogether with updated datasets\n", " \n", "Depending on the number of features that are updated, your workflow requirements, you may adopt either or both kinds of update mechanisms.\n", "\n", "In the sample [Updating features in a feature layer](python/sample-notebooks/updating-features-in-a-feature-layer/) we explore method 1. In this sample, we explore method 2.\n", "\n", "**Method 2**\n", " - [Introduction](#Introduction)\n", " - [Publish the cities feature layer using the initial dataset](Publish-the-cities-feature-layer-using-the-initial-dataset)\n", " - [Merge updates from spreadsheets 1 and 2](#Merge-updates-from-spreadsheets-1-and-2)\n", " - [Write the updates to disk](#Write-the-updates-to-disk)\n", " - [Overwrite the feature layer](#Overwrite-the-feature-layer)\n", " - [Access the overwritten feature layer](#Access-the-overwritten-feature-layer)\n", " - [Conclusion](#Conclusion)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Import libraries\n", "from arcgis.gis import GIS\n", "from arcgis import features\n", "from getpass import getpass #to accept passwords in an interactive fashion\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "········\n" ] } ], "source": [ "# Connect to the GIS\n", "password = getpass()\n", "gis = GIS(\"https://geosaurus.maps.arcgis.com\",'arcgis_python', password)" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "Let us consider a scenario where we need to update a feature layer containing the capital cities of the US. We have 2 csv datasets simulating an update workflow as described below:\n", "\n", " 1. capitals_1.csv -- contains the initial, incomplete dataset which is published as a feature layer\n", " 2. capitals_2.csv -- contains additional points and updates to existing points, building on top of usa_capitals_1.csv\n", " \n", "Our goal is to update the features in the feature layer with the latest information contained in both the spreadsheets. We will accomplish this through the following steps\n", "\n", " 1. Add `capitals_1.csv` as an item.\n", " 2. Publish the csv as a feature layer. This simulates a typical scenario where a feature layer is published with initial set of data that is available.\n", " 3. After updated information is available in `capitals_2.csv`, we will merge both spread sheets.\n", " 4. Overwrite the feature layer using the new spread sheet file.\n", " \n", "When you overwrite a feature layer, only the features get updated. All other information such as the feature layer's item id, comments, summary, description etc. remain the same. This way, any web maps or scenes that have this layer remains valid. Overwriting a feature layer also updates the related data item from which it was published. In this case, it will also update the csv data item with the updated spreadsheet file.\n", "\n", "**Note**: Overwrite capability was introduced in ArcGIS Enterprise 10.5 and in ArcGIS Online. This capability is currently only available for feature layers. Further, ArcGIS sets some limits when overwriting feature layers:\n", "\n", " 1. The name of the file that used to update in step 4 above should match the original file name of the item.\n", " 2. 
The schema -- the number of layers (applicable when your original file is a file geodatabase / shapefile / service definition), and the name and number of attribute columns -- should remain the same as before.\n", " \n", "**Method 2**, explained in this sample, is much simpler than **method 1**, explained in [Updating features in a feature layer](https://developers.arcgis.com/python/sample-notebooks/updating-features-in-a-feature-layer/). However, we cannot make use of the third spreadsheet, which has additional columns for our capitals. To do that, we would first update the features through overwriting, then edit the definition of the feature layer to add the new columns, and then edit each feature to fill in the appropriate column values, similar to what is explained in method 1." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Publish the cities feature layer using the initial dataset" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
city_idnamestatecapitalpop2000pop2007longitudelatitude
01HonoluluHIState371657378587-157.82343621.305782
12JuneauAKState3071131592-134.51158258.351418
23Boise CityIDState185787203529-116.23765543.613736
34OlympiaWAState2751445523-122.89307347.042418
45SalemORState136924152039-123.02915544.931109
\n", "
" ], "text/plain": [ " city_id name state capital pop2000 pop2007 longitude latitude\n", "0 1 Honolulu HI State 371657 378587 -157.823436 21.305782\n", "1 2 Juneau AK State 30711 31592 -134.511582 58.351418\n", "2 3 Boise City ID State 185787 203529 -116.237655 43.613736\n", "3 4 Olympia WA State 27514 45523 -122.893073 47.042418\n", "4 5 Salem OR State 136924 152039 -123.029155 44.931109" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read the initial csv\n", "csv1 = 'data/updating_gis_content/capitals_1.csv'\n", "cities_df_1 = pd.read_csv(csv1)\n", "cities_df_1.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(19, 8)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# print the number of records in this csv\n", "cities_df_1.shape" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", " \n", " \n", " \n", "
\n", "\n", "
\n", " USA Capitals spreadsheet 2\n", " \n", "
CSV by arcgis_python\n", "
Last Modified: April 27, 2017\n", "
0 comments, 0 views\n", "
\n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# add the csv as an item\n", "item_prop = {'title':'USA Capitals spreadsheet 2'}\n", "csv_item = gis.content.add(item_properties=item_prop, data=csv1)\n", "csv_item" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", " \n", " \n", " \n", "
\n", "\n", "
\n", " USA Capitals spreadsheet 2\n", " \n", "
Feature Layer Collection by arcgis_python\n", "
Last Modified: April 27, 2017\n", "
0 comments, 0 views\n", "
\n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# publish the csv item into a feature layer\n", "cities_item = csv_item.publish()\n", "cities_item" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "
\n", " \n", " \n", " \n", "
\n", "\n", "
\n", " USA Capitals 2\n", " \n", "
Feature Layer Collection by arcgis_python\n", "
Last Modified: April 27, 2017\n", "
0 comments, 0 views\n", "
\n", "
\n", " " ], "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# update the item metadata\n", "item_prop = {'title':'USA Capitals 2'}\n", "cities_item.update(item_properties = item_prop, \n", " thumbnail='data/updating_gis_content/capital_cities.png')\n", "cities_item" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": true }, "outputs": [], "source": [ "map1 = gis.map('USA')\n", "map1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![original web layer](http://esri.github.io/arcgis-python-api/notebooks/nbimages/05_overwriting_feature_layers_01.PNG)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "map1.add_layer(cities_item)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'https://services7.arcgis.com/JEwYeAy2cc8qOe3o/arcgis/rest/services/USA_Capitals_spreadsheet_2/FeatureServer'" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cities_item.url" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Merge updates from spreadsheet 2 with 1\n", "The next set of updates have arrived and are stored in `capitals_2.csv`. We are told it contains corrections for the original set of features and also has new features.\n", "\n", "Instead of applying the updates one at a time, we will merge both the spreadsheets into a new one." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
city_idnamestatecapitalpop2000pop2007longitudelatitude
020Baton RougeLAState227818228810-91.14022730.458091
121HelenaMTState2578026007-112.02702746.595809
222BismarckNDState5553259344-100.77900046.813346
323PierreSDState1387614169-100.33638244.367964
424St. PaulMNState287151291643-93.11411844.954364
\n", "
" ], "text/plain": [ " city_id name state capital pop2000 pop2007 longitude latitude\n", "0 20 Baton Rouge LA State 227818 228810 -91.140227 30.458091\n", "1 21 Helena MT State 25780 26007 -112.027027 46.595809\n", "2 22 Bismarck ND State 55532 59344 -100.779000 46.813346\n", "3 23 Pierre SD State 13876 14169 -100.336382 44.367964\n", "4 24 St. Paul MN State 287151 291643 -93.114118 44.954364" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read the second csv set\n", "csv2 = 'data/updating_gis_content/capitals_2.csv'\n", "cities_df_2 = pd.read_csv(csv2)\n", "cities_df_2.head(5)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(36, 8)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# get the dimensions of this csv\n", "cities_df_2.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us `append` the spreadsheets 1 and 2 and store it in a DataFrame called `updated_df`. Note, this step introduces duplicate rows that were updated in spreadsheet 2." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(55, 8)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "updated_df = cities_df_1.append(cities_df_2)\n", "updated_df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we must drop the duplicate rows. Note, in this sample, the `city_id` column has unique values and is present in all spreadsheets. Thus, we are able to determine duplicate rows using this column and drop them." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(51, 8)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "updated_df.drop_duplicates(subset='city_id', keep='last', inplace=True)\n", "# we specify argument keep = 'last' to retain edits from second spreadsheet\n", "updated_df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Thus we have dropped 4 rows from spreadsheet 1 and retained the same 4 rows with updated values from spreadsheet 2. Let us see how the DataFrame looks so far:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
city_idnamestatecapitalpop2000pop2007longitudelatitude
01HonoluluHIState371657378587-157.82343621.305782
12JuneauAKState3071131592-134.51158258.351418
23Boise CityIDState185787203529-116.23765543.613736
45SalemORState136924152039-123.02915544.931109
56CarsonNVState5245756641-119.75387339.160946
\n", "
" ], "text/plain": [ " city_id name state capital pop2000 pop2007 longitude latitude\n", "0 1 Honolulu HI State 371657 378587 -157.823436 21.305782\n", "1 2 Juneau AK State 30711 31592 -134.511582 58.351418\n", "2 3 Boise City ID State 185787 203529 -116.237655 43.613736\n", "4 5 Salem OR State 136924 152039 -123.029155 44.931109\n", "5 6 Carson NV State 52457 56641 -119.753873 39.160946" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "updated_df.head(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Write the updates to disk\n", "Let us create a new folder called `updated_capitals_csv` and write the updated features to a csv with the same name as our first csv file." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import os\n", "if not os.path.exists('data/updating_gis_content/updated_capitals_csv'):\n", " os.mkdir('data/updating_gis_content/updated_capitals_csv')" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "updated_df.to_csv('data/updating_gis_content/updated_capitals_csv/capitals_1.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overwrite the feature layer\n", "Let us overwrite the feature layer using the new csv file we just created. To overwrite, we will use the `overwrite()` method." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "from arcgis.features import FeatureLayerCollection\n", "cities_flayer_collection = FeatureLayerCollection.fromitem(cities_item)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'success': True}" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#call the overwrite() method which can be accessed using the manager property\n", "cities_flayer_collection.manager.overwrite('data/updating_gis_content/updated_capitals_csv/capitals_1.csv')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "### Access the overwritten feature layer\n", "Let us query the feature layer and verify the number of features has increased to `51`." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "51" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cities_flayer = cities_item.layers[0] #there is only 1 layer\n", "cities_flayer.query(return_count_only=True) #get the total number of features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let us draw this new layer in map" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": true }, "outputs": [], "source": [ "map2 = gis.map(\"USA\")\n", "map2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![overwritten web layer](http://esri.github.io/arcgis-python-api/notebooks/nbimages/05_overwriting_feature_layers_02.PNG)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": true }, "outputs": [], "source": [ "map2.add_layer(cities_item)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As seen from the map, the number of features has increased while the symbology while the attribute columns remain the same as original." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conclusion\n", "Thus, in this sample, we observed how update a feature layer by overwriting it with new content. This method is a lot simpler than method 1 explained in [Updating features in a feature layer](https://developers.arcgis.com/python/sample-notebooks/updating-features-in-a-feature-layer/) sample. However, with this simplicity, we compromise on our ability to add new columns or change the schema of the feature layer during the update. Further, if your feature layer was updated after it was published, then those updates get overwritten when you perform the overwrite operation. To retain those edits, [extract the data](https://developers.arcgis.com/python/guide/checking-out-data-from-feature-layers-using-replicas/#Verify-Extract-capability) from the feature layer, merge your updates with this extract, then overwrite the feature layer." ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 1 }