\n
Forest Fire Prediction through KMeans Clustering
\n
\n
The United States Forest Service provides datasets describing forest fires that have occurred in Canada and the United States since 2000. We can predict where forest fires are prone to occur by partitioning the locations of past burns into clusters. The centroids of those clusters indicate where to stage heavy firefighting equipment so that it is as near as possible to where fires are likely to occur.
\n
Dataset:
https://fsapps.nwcg.gov/gisdata.php
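
To make the clustering idea concrete, here is a minimal sketch of Lloyd's algorithm, the standard iterative procedure behind k-means, written in plain Python. It is not the Spark ML implementation used later in this notebook: the function name `kmeans`, the six sample coordinates, and the choice of squared Euclidean distance on raw latitude/longitude degrees are all assumptions made for illustration.

```python
import random

def kmeans(points, k, max_iter=20, seed=5043):
    """Illustrative Lloyd's algorithm on (lat, lon) tuples (hypothetical helper)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k sampled observations
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for i, members in enumerate(clusters):
            if members:
                lat = sum(m[0] for m in members) / len(members)
                lon = sum(m[1] for m in members) / len(members)
                updated.append((lat, lon))
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid in place
        if updated == centroids:  # no centroid moved, so the assignment is stable
            break
        centroids = updated
    return centroids

# Made-up detections forming two well-separated groups (not taken from the dataset).
fires = [
    (42.1, -122.0), (42.3, -122.2), (42.2, -121.9),
    (48.5, -113.4), (48.6, -113.2), (48.4, -113.5),
]
centers = sorted(kmeans(fires, k=2))
for lat, lon in centers:
    print("staging area near %.3f, %.3f" % (lat, lon))
```

Each returned centroid is the mean position of the fires assigned to it, which is why it serves as a reasonable staging location for that group.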
\n
"}]},"apps":[],"jobName":"paragraph_1508822183588_673811793","id":"20171024-051623_427668146","dateCreated":"2017-10-24T05:16:23+0000","dateStarted":"2017-10-25T15:23:14+0000","dateFinished":"2017-10-25T15:23:14+0000","status":"FINISHED","progressUpdateIntervalMs":500,"focus":true,"$$hashKey":"object:4333"},{"title":"Download Raw Data ","text":"%sh\nmkdir -p /mapr/my.cluster.com/user/mapr/data/new/fires\ncd /mapr/my.cluster.com/user/mapr/data/new/fires\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2016_365_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2015_365_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2014_365_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2013_365_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2012_366_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2011_365_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2010_365_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2009_365_conus_shapefile.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2008_005_01_conus_shp.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2007_005_01_conus_shp.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2006_005_01_conus_shp.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2005_005_01_conus_shp.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2004_005_01_conus_shp.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2003_005_01_conus_shp.zip\ncurl -s --remote-name 
https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2002_005_01_conus_shp.zip\ncurl -s --remote-name https://fsapps.nwcg.gov/afm/data/fireptdata/mcd14ml_2001_005_01_conus_shp.zip\nfind modis*.zip | xargs -I {} unzip {} modis*.dbf\nfind mcd*.zip | xargs -I {} unzip {} mcd*.dbf","user":"anonymous","dateUpdated":"2017-11-10T05:20:42+0000","config":{"colWidth":12,"enabled":true,"results":{},"editorSetting":{"language":"sh","editOnDblClick":false},"editorMode":"ace/mode/sh","editorHide":false,"tableHide":false,"title":true},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"Archive: modis_fire_2009_365_conus_shapefile.zip\n inflating: modis_fire_2009_365_conus.dbf \nArchive: modis_fire_2010_365_conus_shapefile.zip\n inflating: modis_fire_2010_365_conus.dbf \nArchive: modis_fire_2011_365_conus_shapefile.zip\n inflating: modis_fire_2011_365_conus.dbf \nArchive: modis_fire_2012_366_conus_shapefile.zip\n inflating: modis_fire_2012_366_conus.dbf \nArchive: modis_fire_2013_365_conus_shapefile.zip\n inflating: modis_fire_2013_365_conus.dbf \nArchive: modis_fire_2014_365_conus_shapefile.zip\n inflating: modis_fire_2014_365_conus.dbf \nArchive: modis_fire_2015_365_conus_shapefile.zip\n inflating: modis_fire_2015_365_conus.dbf \nArchive: modis_fire_2016_365_conus_shapefile.zip\n inflating: modis_fire_2016_365_conus.dbf \nArchive: mcd14ml_2001_005_01_conus_shp.zip\n inflating: mcd14ml_2001_005_01_conus.dbf \nArchive: mcd14ml_2002_005_01_conus_shp.zip\n inflating: mcd14ml_2002_005_01_conus.dbf \nArchive: mcd14ml_2003_005_01_conus_shp.zip\n inflating: mcd14ml_2003_005_01_conus.dbf \nArchive: mcd14ml_2004_005_01_conus_shp.zip\n inflating: mcd14ml_2004_005_01_conus.dbf \nArchive: mcd14ml_2005_005_01_conus_shp.zip\n inflating: mcd14ml_2005_005_01_conus.dbf \nArchive: mcd14ml_2006_005_01_conus_shp.zip\n inflating: mcd14ml_2006_005_01_conus.dbf \nArchive: mcd14ml_2007_005_01_conus_shp.zip\n inflating: mcd14ml_2007_005_01_conus.dbf \nArchive: 
mcd14ml_2008_005_01_conus_shp.zip\n inflating: mcd14ml_2008_005_01_conus.dbf \n"}]},"apps":[],"jobName":"paragraph_1508875968857_1867429893","id":"20171024-201248_1170429857","dateCreated":"2017-10-24T20:12:48+0000","dateStarted":"2017-11-10T05:20:43+0000","dateFinished":"2017-11-10T05:21:21+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4334"},{"title":"Backup original dataset","text":"%sh\nmaprcli volume snapshot create -cluster my.cluster.com -snapshotname USFS_Experiment-`date +%Y%m%d%H%M%S` -volume mapr_home\nls -la /mapr/my.cluster.com/user/mapr/.snapshot/\n\n","user":"anonymous","dateUpdated":"2017-11-10T05:30:03+0000","config":{"colWidth":12,"enabled":true,"results":{},"editorSetting":{"language":"sh","editOnDblClick":false},"editorMode":"ace/mode/sh","title":true},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"total 3\ndrwxr-xr-x 5 root root 3 Nov 10 05:30 .\ndrwxr-xr-x 9 mapr mapr 14 Nov 9 23:50 ..\ndrwxr-xr-x 9 mapr mapr 14 Nov 9 23:50 USFS_Experiment\ndrwxr-xr-x 9 mapr mapr 14 Nov 9 23:50 USFS_Experiment-20171110052951\ndrwxr-xr-x 9 mapr mapr 14 Nov 9 23:50 USFS_Experiment-20171110053003\n"}]},"apps":[],"jobName":"paragraph_1510291334892_1101351676","id":"20171110-052214_828862000","dateCreated":"2017-11-10T05:22:14+0000","dateStarted":"2017-11-10T05:30:03+0000","dateFinished":"2017-11-10T05:30:07+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4335"},{"title":"Convert shapefiles to CSVs","text":"%python\nimport csv\nfrom dbfpy import dbf\nimport os\nimport sys\nDATADIR='/mapr/my.cluster.com/user/mapr/data/fires/'\n\nfor filename in os.listdir(DATADIR):\n\n if filename.endswith('.dbf'):\n print \"Converting %s to csv\" % filename\n csv_fn = DATADIR+filename[:-4]+ \".csv\"\n with open(csv_fn,'wb') as csvfile:\n in_db = dbf.Dbf(DATADIR+filename)\n out_csv = csv.writer(csvfile)\n names = []\n for field in in_db.header.fields:\n 
names.append(field.name)\n out_csv.writerow(names)\n for rec in in_db:\n out_csv.writerow(rec.fieldData)\n in_db.close()\n print \"Done...\"\n\n","user":"anonymous","dateUpdated":"2017-11-10T05:30:13+0000","config":{"colWidth":12,"enabled":true,"results":{},"editorSetting":{"language":"python","editOnDblClick":false},"editorMode":"ace/mode/python","title":true,"tableHide":true,"editorHide":false},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"Converting mcd14ml_2002_005_01_conus.dbf to csv\nDone...\nConverting modis_fire_2009_365_conus.dbf to csv\nDone...\nConverting modis_fire_2013_365_conus.dbf to csv\nDone...\nConverting mcd14ml_2005_005_01_conus.dbf to csv\nDone...\nConverting modis_fire_2016_365_conus.dbf to csv\nDone...\nConverting modis_fire_2015_365_conus.dbf to csv\nDone...\nConverting mcd14ml_2004_005_01_conus.dbf to csv\nDone...\nConverting modis_fire_2011_365_conus.dbf to csv\nDone...\nConverting mcd14ml_2001_005_01_conus.dbf to csv\nDone...\nConverting modis_fire_2012_366_conus.dbf to csv\nDone...\nConverting mcd14ml_2003_005_01_conus.dbf to csv\nDone...\nConverting modis_fire_2014_365_conus.dbf to csv\nDone...\nConverting mcd14ml_2007_005_01_conus.dbf to csv\nDone...\nConverting mcd14ml_2008_005_01_conus.dbf to csv\nDone...\nConverting mcd14ml_2006_005_01_conus.dbf to csv\nDone...\nConverting modis_fire_2010_365_conus.dbf to csv\nDone...\n"}]},"apps":[],"jobName":"paragraph_1508876509276_-96747179","id":"20171024-202149_2037979424","dateCreated":"2017-10-24T20:21:49+0000","dateStarted":"2017-11-10T05:30:13+0000","dateFinished":"2017-11-10T05:33:04+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4336"},{"title":"Import Spark ML Libraries","text":"import org.apache.spark._\nimport org.apache.spark.rdd.RDD\nimport org.apache.spark.sql.types._\nimport org.apache.spark.sql.functions._\nimport org.apache.spark.sql._\nimport org.apache.spark._\nimport 
org.apache.spark.ml.feature.StringIndexer\nimport org.apache.spark.ml.feature.VectorAssembler\nimport org.apache.spark.ml.clustering.KMeans\nimport org.apache.spark.ml.clustering.KMeansModel\nimport org.apache.spark.mllib.linalg.Vectors","user":"anonymous","dateUpdated":"2017-11-10T05:32:11+0000","config":{"colWidth":12,"editorMode":"ace/mode/scala","results":[{"graph":{"mode":"table","height":300,"optionOpen":false,"keys":[],"values":[],"groups":[],"scatter":{}}}],"enabled":true,"editorSetting":{"language":"scala"},"tableHide":true,"title":true,"editorHide":false},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"\nimport org.apache.spark._\n\nimport org.apache.spark.rdd.RDD\n\nimport org.apache.spark.sql.types._\n\nimport org.apache.spark.sql.functions._\n\nimport org.apache.spark.sql._\n\nimport org.apache.spark._\n\nimport org.apache.spark.ml.feature.StringIndexer\n\nimport org.apache.spark.ml.feature.VectorAssembler\n\nimport org.apache.spark.ml.clustering.KMeans\n\nimport org.apache.spark.ml.clustering.KMeansModel\n\nimport org.apache.spark.mllib.linalg.Vectors\n"}]},"apps":[],"jobName":"paragraph_1508821773190_2078330813","id":"20161030-025214_1655763979","dateCreated":"2017-10-24T05:09:33+0000","dateStarted":"2017-11-10T05:32:11+0000","dateFinished":"2017-11-10T05:32:17+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4337"},{"title":"Define schema for datasets since 2009","text":"import sqlContext.implicits._\nimport sqlContext._\nval modis_schema = StructType(Array(\n StructField(\"area\", DoubleType, true),\n StructField(\"perimeter\", DoubleType, true),\n StructField(\"firenum\", IntegerType, true), \n StructField(\"fire_id\", IntegerType, true), \n StructField(\"latitude\", DoubleType, true),\n StructField(\"longitude\", DoubleType, true),\n StructField(\"date\", TimestampType, true),\n StructField(\"julian\", IntegerType, true),\n StructField(\"gmt\", IntegerType, true),\n 
StructField(\"temp\", DoubleType, true), \n StructField(\"spix\", DoubleType, true), \n StructField(\"tpix\", DoubleType, true), \n StructField(\"src\", StringType, true),\n StructField(\"sat_src\", StringType, true), \n StructField(\"conf\", IntegerType, true),\n StructField(\"frp\", DoubleType, true)\n))","user":"anonymous","dateUpdated":"2017-11-10T05:33:55+0000","config":{"colWidth":12,"results":[{"graph":{"mode":"table","height":300,"optionOpen":false,"keys":[],"values":[],"groups":[],"scatter":{}}}],"enabled":true,"editorSetting":{"language":"scala"},"editorMode":"ace/mode/scala","tableHide":true,"title":true},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"\nimport sqlContext.implicits._\n\nimport sqlContext._\n\nmodis_schema: org.apache.spark.sql.types.StructType = StructType(StructField(area,DoubleType,true), StructField(perimeter,DoubleType,true), StructField(firenum,IntegerType,true), StructField(fire_id,IntegerType,true), StructField(latitude,DoubleType,true), StructField(longitude,DoubleType,true), StructField(date,TimestampType,true), StructField(julian,IntegerType,true), StructField(gmt,IntegerType,true), StructField(temp,DoubleType,true), StructField(spix,DoubleType,true), StructField(tpix,DoubleType,true), StructField(src,StringType,true), StructField(sat_src,StringType,true), StructField(conf,IntegerType,true), StructField(frp,DoubleType,true))\n"}]},"apps":[],"jobName":"paragraph_1508821773191_2077946064","id":"20161030-030543_519944270","dateCreated":"2017-10-24T05:09:33+0000","dateStarted":"2017-11-10T05:33:56+0000","dateFinished":"2017-11-10T05:34:00+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4338"},{"title":"Define schema for datasets before 2009","text":"import sqlContext.implicits._\nimport sqlContext._\nval mcd14ml_schema = StructType(Array(\n StructField(\"area\", DoubleType, true),\n StructField(\"perimeter\", DoubleType, true),\n StructField(\"mcd14ml_\", 
IntegerType, true), \n StructField(\"latitude\", DoubleType, true), \n StructField(\"longitude\", DoubleType, true),\n StructField(\"t21\", DoubleType, true),\n StructField(\"t31\", DoubleType, true),\n StructField(\"spix\", DoubleType, true),\n StructField(\"tpix\", DoubleType, true),\n StructField(\"date\", TimestampType, true), \n StructField(\"jdate\", IntegerType, true), \n StructField(\"utc\", IntegerType, true), \n StructField(\"satellite\", StringType, true),\n StructField(\"frp\", DoubleType, true), \n StructField(\"confidence\", IntegerType, true)\n))","user":"anonymous","dateUpdated":"2017-11-10T05:34:04+0000","config":{"colWidth":12,"results":[{"graph":{"mode":"table","height":300,"optionOpen":false,"keys":[],"values":[],"groups":[],"scatter":{}}}],"enabled":true,"editorSetting":{"language":"scala"},"editorMode":"ace/mode/scala","tableHide":true,"title":true,"editorHide":false},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"\nimport sqlContext.implicits._\n\nimport sqlContext._\n\nmcd14ml_schema: org.apache.spark.sql.types.StructType = StructType(StructField(area,DoubleType,true), StructField(perimeter,DoubleType,true), StructField(mcd14ml_,IntegerType,true), StructField(latitude,DoubleType,true), StructField(longitude,DoubleType,true), StructField(t21,DoubleType,true), StructField(t31,DoubleType,true), StructField(spix,DoubleType,true), StructField(tpix,DoubleType,true), StructField(date,TimestampType,true), StructField(jdate,IntegerType,true), StructField(utc,IntegerType,true), StructField(satellite,StringType,true), StructField(frp,DoubleType,true), StructField(confidence,IntegerType,true))\n"}]},"apps":[],"jobName":"paragraph_1508987801157_642409146","id":"20171026-031641_291214581","dateCreated":"2017-10-26T03:16:41+0000","dateStarted":"2017-11-10T05:34:04+0000","dateFinished":"2017-11-10T05:34:09+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4339"},{"title":"Load Raw 
Data","text":"// Load datasets containing years 2009-2016\nval df_modis_all = sqlContext.read.format(\"com.databricks.spark.csv\").option(\"header\", \"true\").schema(modis_schema).load(\"/user/mapr/data/fires/modis*.csv\")\n// Include only fires with coordinates in Cascadia\nval df_modis = df_modis_all.filter($\"latitude\" > 42).filter($\"latitude\" < 50).filter($\"longitude\" > -124).filter($\"longitude\" < -110)\n// Load datasets containing years 2000-2008 (these files use the mcd14ml column layout)\nval df_mcd14ml_all = sqlContext.read.format(\"com.databricks.spark.csv\").option(\"header\", \"true\").schema(mcd14ml_schema).load(\"/user/mapr/data/fires/mcd14ml*.csv\")\n// Include only fires with coordinates in Cascadia\nval df_mcd14ml = df_mcd14ml_all.filter($\"latitude\" > 42).filter($\"latitude\" < 50).filter($\"longitude\" > -124).filter($\"longitude\" < -110)\n// Combine both datasets; select the coordinates first because the schemas differ and union matches columns by position\nval df = df_modis.select($\"latitude\", $\"longitude\").union(df_mcd14ml.select($\"latitude\", $\"longitude\"))","user":"anonymous","dateUpdated":"2017-11-10T05:34:28+0000","config":{"colWidth":12,"results":[{"graph":{"mode":"table","height":300,"optionOpen":false,"keys":[],"values":[],"groups":[],"scatter":{}}}],"enabled":true,"editorSetting":{"language":"scala"},"editorMode":"ace/mode/scala","title":true},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"\ndf_modis_all: org.apache.spark.sql.DataFrame = [area: double, perimeter: double ... 14 more fields]\n\ndf_modis: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [area: double, perimeter: double ... 14 more fields]\n\ndf_mcd14ml_all: org.apache.spark.sql.DataFrame = [area: double, perimeter: double ... 14 more fields]\n\ndf_mcd14ml: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [area: double, perimeter: double ... 
14 more fields]\n\ndf: org.apache.spark.sql.DataFrame = [latitude: double, longitude: double]\n"}]},"apps":[],"jobName":"paragraph_1508821773191_2077946064","id":"20161030-030618_394385178","dateCreated":"2017-10-24T05:09:33+0000","dateStarted":"2017-11-10T05:34:29+0000","dateFinished":"2017-11-10T05:34:32+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4340"},{"title":"What does this data look like, anyway?","text":"df_modis.show(10)\ndf.count()","user":"anonymous","dateUpdated":"2017-11-10T05:36:34+0000","config":{"colWidth":12,"enabled":true,"results":{},"editorSetting":{"language":"scala"},"editorMode":"ace/mode/scala","title":true},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"+----+---------+-------+-------+--------+---------+--------------------+------+----+-----+----+----+----+-------+----+-----+\n|area|perimeter|firenum|fire_id|latitude|longitude| date|julian| gmt| temp|spix|tpix| src|sat_src|conf| frp|\n+----+---------+-------+-------+--------+---------+--------------------+------+----+-----+----+----+----+-------+----+-----+\n| 0.0| 0.0| 1| 760349| 49.455| -120.222|2012-12-02 00:00:...| 337|1938|327.4| 1.8| 1.3|ssec| T| 80|104.9|\n| 0.0| 0.0| 2| 760350| 49.455| -120.222|2012-12-02 00:00:...| 337|1940|327.7| 1.8| 1.3|gsfc| T| 81|105.8|\n| 0.0| 0.0| 10| 751851| 49.447| -118.341|2012-11-08 00:00:...| 313|1849|341.3| 1.1| 1.0|ssec| T| 91| 69.6|\n| 0.0| 0.0| 11| 751852| 49.447| -118.341|2012-11-08 00:00:...| 313|1849|341.3| 1.1| 1.0|rsac| T| 91| 69.6|\n| 0.0| 0.0| 12| 751853| 49.447| -118.341|2012-11-08 00:00:...| 313|1850|341.0| 1.0| 1.0|gsfc| T| 90| 68.8|\n| 0.0| 0.0| 18| 368270| 49.442| -118.334|2012-11-08 00:00:...| 313|2030|315.6| 1.1| 1.0|rsac| A| 0| 30.1|\n| 0.0| 0.0| 19| 368272| 49.442| -118.334|2012-11-08 00:00:...| 313|2035|315.6| 1.1| 1.0|gsfc| A| 0| 30.5|\n| 0.0| 0.0| 20| 368271| 49.442| -118.334|2012-11-08 00:00:...| 313|2032|315.6| 1.1| 1.0|ssec| A| 0| 30.1|\n| 0.0| 
0.0| 24| 751847| 49.438| -118.343|2012-11-08 00:00:...| 313|1849|310.1| 1.1| 1.0|ssec| T| 29| 24.1|\n| 0.0| 0.0| 25| 751846| 49.438| -118.343|2012-11-08 00:00:...| 313|1849|310.1| 1.1| 1.0|rsac| T| 29| 24.1|\n+----+---------+-------+-------+--------+---------+--------------------+------+----+-----+----+----+----+-------+----+-----+\nonly showing top 10 rows\n\n\nres29: Long = 382059\n"}]},"apps":[],"jobName":"paragraph_1508993481697_820542127","id":"20171026-045121_278676373","dateCreated":"2017-10-26T04:51:21+0000","dateStarted":"2017-11-10T05:36:34+0000","dateFinished":"2017-11-10T05:36:39+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4341"},{"title":"Train the KMeans model for 100 clusters","text":"val featureCols = Array(\"latitude\", \"longitude\")\nval assembler = new VectorAssembler().setInputCols(featureCols).setOutputCol(\"features\")\nval df2 = assembler.transform(df)\nval Array(trainingData, testData) = df2.randomSplit(Array(0.7, 0.3), 5043)\n\nval kmeans = new KMeans().setK(100).setFeaturesCol(\"features\").setMaxIter(4)\nval model = kmeans.fit(trainingData)\nprintln(\"Final Centers: \")\nmodel.clusterCenters.foreach(println)","user":"anonymous","dateUpdated":"2017-11-10T05:37:49+0000","config":{"colWidth":12,"results":[{"graph":{"mode":"table","height":300,"optionOpen":false,"keys":[],"values":[],"groups":[],"scatter":{}}}],"enabled":true,"editorSetting":{"language":"scala"},"editorMode":"ace/mode/scala","title":true},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[{"type":"TEXT","data":"\nfeatureCols: Array[String] = Array(latitude, longitude)\n\nassembler: org.apache.spark.ml.feature.VectorAssembler = vecAssembler_0cd76fa7fa8a\n\ndf2: org.apache.spark.sql.DataFrame = [latitude: double, longitude: double ... 1 more field]\n\n\ntrainingData: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [latitude: double, longitude: double ... 
1 more field]\ntestData: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [latitude: double, longitude: double ... 1 more field]\n\nkmeans: org.apache.spark.ml.clustering.KMeans = kmeans_2935193b4f9b\n\nmodel: org.apache.spark.ml.clustering.KMeansModel = kmeans_2935193b4f9b\nFinal Centers: \n[42.680210355177884,-111.59725937968977]\n[43.46119708737863,-123.4865427184466]\n[42.442929411764744,-117.53071372549019]\n[46.0875555350554,-117.83899317343179]\n[45.41602265641227,-114.0714599875284]\n[47.55513630540607,-113.53880297207272]\n[48.053261769911444,-120.0575677876107]\n[44.56749924698794,-120.14084864457843]\n[44.82837657623227,-115.5686400458541]\n[47.19294624505929,-116.3193355731225]\n[46.0001227998363,-121.42910908718788]\n[44.8225060851927,-121.18407758620698]\n[48.33468846079004,-118.85384316014812]\n[42.190387077294645,-120.67088979468606]\n[44.02496713615028,-110.15051525821596]\n[48.84425513196482,-122.80656598240468]\n[42.31075105485234,-122.05234358047026]\n[47.551169991326965,-110.58471465741545]\n[48.366605830765494,-119.25822801611778]\n[48.64446906398731,-117.12346694870438]\n[43.585575029080985,-114.48100155098888]\n[44.658283501161904,-121.77650561580175]\n[44.2141452775774,-118.80088211968264]\n[44.55109049773757,-111.99945135746607]\n[44.18748645226641,-115.5765583692076]\n[46.32117078916372,-123.24499646643106]\n[42.52315782286375,-123.75475790089729]\n[47.83102030838668,-113.02192553591584]\n[47.88128612303293,-115.86531330472098]\n[47.12880022363031,-120.65368169958985]\n[49.058754511120405,-116.16455518254297]\n[46.380826304691595,-113.73693647865049]\n[42.44345835543764,-115.48029708222808]\n[48.59209729418539,-120.00220245634236]\n[45.832998529411775,-119.65369852941178]\n[42.80554616087751,-118.89474588665446]\n[45.599176134333675,-116.27001750625217]\n[48.933861839804116,-118.36850682056688]\n[49.048490677134495,-114.68882433758593]\n[46.53233677991136,-120.07666395864108]\n[46.81406035665295,-111.96022908093278]\n[47.1773843137
255,-117.76145490196078]\n[43.81896350364965,-111.10441423357665]\n[43.16925459375698,-122.42875027673234]\n[44.98837316293926,-117.31802939297128]\n[45.092147323173165,-116.45970379054316]\n[43.94497546728972,-117.09796144859811]\n[47.73799279279282,-111.90307207207206]\n[46.80794588449294,-115.59445702592082]\n[43.30027756410261,-114.9767320512819]\n[46.09602029064988,-115.33729366602705]\n[48.499308211473554,-113.3735359955005]\n[44.98750756756764,-118.83719081081081]\n[44.40895686591275,-114.98714361874055]\n[47.24852644492911,-116.94616739367501]\n[45.91029123173278,-111.63577557411271]\n[48.003583602584705,-118.1010024232631]\n[47.21107859888938,-112.88779026057243]\n[45.47023435063486,-115.44834172421476]\n[43.238007769145376,-116.86420588235295]\n[46.941102733270526,-114.56711545711585]\n[46.65701529902643,-117.49917941585535]\n[45.94805246913578,-114.46334876543213]\n[48.68818710324091,-119.61178349482135]\n[48.989644295302,-120.55701135776974]\n[46.945085929108494,-119.16613748657356]\n[44.985136440391756,-110.62700236406614]\n[43.04288829787237,-120.98841445035451]\n[44.484553467271546,-122.90281594296823]\n[48.713811594202916,-110.6999927536232]\n[44.41465921945706,-117.99498783936656]\n[47.60024606580833,-120.80637696709574]\n[43.30823627998692,-118.22460795267843]\n[45.525432050701745,-114.76495699411534]\n[45.06997001303781,-123.51793546284232]\n[46.03363036649214,-118.63290209424083]\n[42.411339405560874,-113.01552540747844]\n[47.353657579062165,-122.92253217011991]\n[42.53206368330465,-122.91702237521517]\n[46.33238826102806,-116.22532664965372]\n[48.13757210578842,-120.63054391217578]\n[43.42515586116568,-112.30286116568435]\n[47.41553032015065,-120.28863050847455]\n[43.98151189785259,-116.05173766686019]\n[44.031103838245336,-119.47318060315274]\n[47.7202422171602,-114.89755618830677]\n[43.36918918918921,-110.43353443766351]\n[42.23112643678158,-112.30654844006571]\n[48.53467159450896,-115.09414625131996]\n[42.19413026052105,-118.14942525050081]\n
[45.980087702573876,-116.88747473784561]\n[46.2978552056652,-114.7551792978756]\n[43.361500727096455,-115.6408245273873]\n[42.509375226586116,-110.5823160120845]\n[44.030212504652035,-121.63664123557874]\n[45.64626304464767,-120.59849650349648]\n[43.691489661882756,-115.3556038676723]\n[44.70506696208765,-114.67740915805025]\n[48.59285780346821,-114.0871531791909]\n[42.57220040180812,-114.04146408839789]\n"}]},"apps":[],"jobName":"paragraph_1508821773192_2076022320","id":"20161030-041240_922666463","dateCreated":"2017-10-24T05:09:33+0000","dateStarted":"2017-11-10T05:37:49+0000","dateFinished":"2017-11-10T05:38:44+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4342"},{"title":"Save the Model","text":"model.write.overwrite().save(\"/user/mapr/data/save_fire_model\")","user":"anonymous","dateUpdated":"2017-11-10T05:44:53+0000","config":{"colWidth":12,"enabled":true,"results":{},"editorSetting":{"language":"scala"},"editorMode":"ace/mode/scala","title":true,"editorHide":false},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[]},"apps":[],"jobName":"paragraph_1508825252627_49727175","id":"20171024-060732_541192335","dateCreated":"2017-10-24T06:07:32+0000","dateStarted":"2017-11-10T05:44:53+0000","dateFinished":"2017-11-10T05:44:54+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4343"},{"title":"Setup Map (save centroids)","text":"z.angularBind(\"centroid\", 
model.clusterCenters)","user":"anonymous","dateUpdated":"2017-11-10T05:44:58+0000","config":{"colWidth":12,"graph":{"mode":"table","height":300,"optionOpen":false,"keys":[],"values":[],"groups":[],"scatter":{}},"enabled":true,"results":{},"editorSetting":{"language":"scala"},"editorMode":"ace/mode/scala","title":true,"tableHide":true,"editorHide":false},"settings":{"params":{},"forms":{}},"results":{"code":"SUCCESS","msg":[]},"apps":[],"jobName":"paragraph_1508821773196_2074483324","id":"20161116-075433_1562509402","dateCreated":"2017-10-24T05:09:33+0000","dateStarted":"2017-11-10T05:44:58+0000","dateFinished":"2017-11-10T05:44:59+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4344"},{"title":"Setup Map (initialize angular)","text":"%angular\n\n\n
Operationalizing the model
\n
Which staging area should respond when a new forest fire starts?
We can use our previously saved model to answer that question. The following section shows how the model built above assigns a new fire to its nearest centroid, and how the same lookup can be applied to a live feed of newly detected fires for rapid fire response.
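
As a rough illustration of what that lookup amounts to, the sketch below picks the closest centroid for one fire location in plain Python. It is not the Spark call used in the next cell. The helper name `nearest_centroid` is hypothetical, the three centroids are a small subset copied from the "Final Centers" output above, and straight-line distance in degrees is an assumption that ignores the Earth's curvature.

```python
import math

def nearest_centroid(fire, centroids):
    """Return (index, centroid) of the centroid closest to fire = (lat, lon)."""
    idx = min(
        range(len(centroids)),
        key=lambda i: math.hypot(fire[0] - centroids[i][0], fire[1] - centroids[i][1]),
    )
    return idx, centroids[idx]

# Three of the 100 trained centroids, copied from the "Final Centers" output.
centroids = [
    (42.680210355177884, -111.59725937968977),
    (43.46119708737863, -123.4865427184466),
    (42.23112643678158, -112.30654844006571),
]

new_fire = (42.3, -112.2)  # the same example location used in the next cell
idx, station = nearest_centroid(new_fire, centroids)
print("respond from centroid %d at %s" % (idx, station))
```

With the full set of 100 centroids the winning index would differ, since indices here refer only to this three-element subset.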
\n
"}]},"apps":[],"jobName":"paragraph_1510290203666_-866429867","id":"20171110-050323_189619447","dateCreated":"2017-11-10T05:03:23+0000","dateStarted":"2017-11-10T05:45:09+0000","dateFinished":"2017-11-10T05:45:09+0000","status":"FINISHED","progressUpdateIntervalMs":500,"$$hashKey":"object:4347"},{"title":"Which fire station (centroid) should respond to a new fire?","text":"val featureCols = Array(\"lat\", \"lon\")\nval assembler = new VectorAssembler().setInputCols(featureCols).setOutputCol(\"features\")\nval fire_location = Seq((42.3,-112.2 )).toDF(\"lat\", \"lon\")\nval df3 = assembler.transform(fire_location)\nval categories = model.transform(df3)\nval centroid_id = categories.select(\"prediction\").rdd.map(r => r(0)).collect()(0).asInstanceOf[Int]\nval centroid_coordinate = model.clusterCenters(centroid_id)\nprintln(\"%html