{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Castile and Leon: Crops\n", "\n", "### Description:\n", "TODO\n", "\n", "### Author:\n", "Sergio García Prado ([garciparedes.me](https://garciparedes.me))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setting up environment:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Cleaning up" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "inputHidden": false, "outputHidden": false, "scrolled": false }, "outputs": [], "source": [ "rm(list = ls())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Library Imports" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "scrolled": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "Attaching package: ‘dplyr’\n", "\n", "The following objects are masked from ‘package:stats’:\n", "\n", " filter, lag\n", "\n", "The following objects are masked from ‘package:base’:\n", "\n", " intersect, setdiff, setequal, union\n", "\n", "\n", "Attaching package: ‘reshape2’\n", "\n", "The following object is masked from ‘package:tidyr’:\n", "\n", " smiths\n", "\n", "\n", "Attaching package: ‘lubridate’\n", "\n", "The following object is masked from ‘package:base’:\n", "\n", " date\n", "\n" ] } ], "source": [ "library(readr)\n", "library(ggplot2)\n", "library(dplyr)\n", "library(tidyr)\n", "library(RSocrata)\n", "library(ca)\n", "library(forcats)\n", "library(reshape2)\n", "library(lubridate)\n", "library(repr)\n", "library(stringr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Acquisition:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "inputHidden": false, "outputHidden": false, "scrolled": false }, "outputs": [], "source": [ "my.read.csv <- function(url, filename) {\n", " # Download the dataset once and cache it locally; later runs read the cached file.\n", " if (!file.exists(filename)) {\n", " write_csv(read.socrata(url), filename)\n", " }\n", " return(read_csv(filename))\n", "}" ] }, { "cell_type": "code", 
"execution_count": 4, "metadata": { "inputHidden": false, "outputHidden": false, "scrolled": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Parsed with column specification:\n", "cols(\n", " a_o = col_integer(),\n", " codigo_comarca = col_integer(),\n", " codigo_muncipio = col_integer(),\n", " codigo_producto = col_character(),\n", " codigo_provincia = col_integer(),\n", " comarca = col_character(),\n", " cultivo = col_character(),\n", " grupo_de_cultivo = col_character(),\n", " municipio = col_character(),\n", " ocupaci_n_primera_regad_o = col_integer(),\n", " ocupaci_n_primera_secano = col_integer(),\n", " ocupaciones_asociadas_regad_o = col_character(),\n", " ocupaciones_asociadas_secano = col_character(),\n", " ocupaciones_posteriores_regad_o = col_character(),\n", " ocupaciones_posteriores_secano = col_character()\n", ")\n" ] } ], "source": [ "crops.herbaceous <- my.read.csv(\"https://analisis.datosabiertos.jcyl.es/resource/agu2-cspz.csv\",\n", " \"./data/crops-herbaceous.csv\")" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Parsed with column specification:\n", "cols(\n", " a_o = col_datetime(format = \"\"),\n", " codigo_comarca = col_integer(),\n", " codigo_muncipio = col_integer(),\n", " codigo_producto = col_character(),\n", " codigo_provincia = col_integer(),\n", " comarca = col_character(),\n", " cultivo = col_character(),\n", " grupo_de_cultivo = col_character(),\n", " municipio = col_character(),\n", " n_total_arboles_diseminados = col_integer(),\n", " superficie_regad_o_en_producci_n = col_integer(),\n", " superficie_regad_o_que_no_pruduce = col_integer(),\n", " superficie_secano_en_producci_n = col_integer(),\n", " superficie_secano_que_no_pruduce = col_integer()\n", ")\n" ] } ], "source": [ "crops.woody <- my.read.csv(\"https://analisis.datosabiertos.jcyl.es/resource/2vwa-si9n.csv\",\n", " 
\"./data/crops-woody.csv\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Cleaning:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Remove Unnecessary Columns and Rename Interesting Columns" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "inputHidden": false, "outputHidden": false, "scrolled": false }, "outputs": [], "source": [ "code.province.to.province <- function(code) {\n", " recode(code,\n", " \"47\" = \"Valladolid\",\n", " \"24\" = \"León\",\n", " \"34\" = \"Palencia\",\n", " \"37\" = \"Salamanca\",\n", " \"9\" = \"Burgos\",\n", " \"49\" = \"Zamora\",\n", " \"5\" = \"Ávila\",\n", " \"42\" = \"Soria\",\n", " \"40\" = \"Segovia\")\n", "}" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "inputHidden": false, "outputHidden": false, "scrolled": false }, "outputs": [], "source": [ "crops.herbaceous.use <- crops.herbaceous %>%\n", " select(a_o, codigo_provincia:ocupaci_n_primera_secano) %>%\n", " rename(year = a_o, \n", " code.province = codigo_provincia, \n", " crop = cultivo, \n", " crop.group = grupo_de_cultivo, \n", " region = comarca, \n", " town = municipio, \n", " irrigation = ocupaci_n_primera_regad_o, \n", " dry = ocupaci_n_primera_secano) %>%\n", " gather(c(dry, irrigation), \n", " key = \"crop.technique\", value = \"area\") %>%\n", " filter(!is.na(area) & area > 0 ) %>%\n", " mutate(crop = factor(crop),\n", " crop.group = as.factor(crop.group),\n", " region = as.factor(region),\n", " town = as.factor(town),\n", " province = as.factor(code.province.to.province(code.province))) %>%\n", " select(-code.province)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "inputHidden": false, "outputHidden": false, "scrolled": false }, "outputs": [], "source": [ "crops.woody.use <- crops.woody %>%\n", " select(a_o, codigo_provincia:municipio,\n", " superficie_regad_o_en_producci_n,\n", " superficie_secano_en_producci_n) %>%\n", " rename(year = a_o, \n", " code.province = 
codigo_provincia, \n", " crop = cultivo, \n", " crop.group = grupo_de_cultivo, \n", " region = comarca, \n", " town = municipio, \n", " irrigation = superficie_regad_o_en_producci_n, \n", " dry = superficie_secano_en_producci_n) %>%\n", " gather(c(dry, irrigation), \n", " key = \"crop.technique\", value = \"area\") %>%\n", " filter(!is.na(area) & area > 0 ) %>%\n", " mutate(crop = as.factor(crop),\n", " crop.group = as.factor(crop.group),\n", " region = as.factor(region),\n", " town = as.factor(town),\n", " province = as.factor(code.province.to.province(code.province)),\n", " year = year(round(as.POSIXct(year), \"days\")),\n", " crop.group = recode(crop.group, \n", " \"VIÑEDO OCUPACIÓN PRINCIPAL\" = \"VIÑEDO\",\n", " \"CITRICOS\" = \"FRUTALES\"),\n", " crop.group = replace(crop.group, crop == \"MIMBRERO\" | \n", " crop == \"MORERA Y OTROS\", \"OTROS CULTIVOS LEÑOSOS\"),\n", " crop.group = replace(crop.group, crop == \"OLI ACEITUNA ACEITE\" | \n", " crop == \"OLIVAR ACEIUNA MESA\", \"OLIVAR\")) %>%\n", " select(-code.province)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "inputHidden": false, "outputHidden": false, "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
year | region | crop | crop.group | town | crop.technique | area | province | crop.type |
---|---|---|---|---|---|---|---|---|
2015 | AGUILAR | VALLICO | CULTIVOS FORRAJEROS | BRAÑOSERA | dry | 12 | Palencia | herbaceous |
2012 | BENAVENTE Y LOS VALLES | PATATA MED. ESTACION | TUBERCULOS | SAN CRISTOBAL DE ENTREVIÑAS | irrigation | 5 | Zamora | herbaceous |
2013 | SANABRIA | MANZANO | FRUTALES | PEQUE | dry | 3 | Zamora | woody |
2013 | BENAVENTE Y LOS VALLES | COL Y REPOLLO | HORTALIZAS | PUEBLICA DE VALVERDE | dry | 2 | Zamora | herbaceous |
2014 | BOEDO-OJEDA | ALFALFA | CULTIVOS FORRAJEROS | SANTIBAÑEZ DE ECLA | dry | 6 | Palencia | herbaceous |
2014 | SORIA | OTROS C.INDUSTRIALES | CULTIV. INDUSTRIALES | GARRAY | irrigation | 13 | Soria | herbaceous |
2010 | CAMPOS | GUISANTE SECO | LEGUMINOSAS GRANO | MARCILLA DE CAMPOS | dry | 161 | Palencia | herbaceous |
2010 | CAMPOS | TRIGO | CEREALES GRANO | ESPINOSA DE VILLAGONZALO | dry | 985 | Palencia | herbaceous |
2010 | SEPULVEDA I | CENTENO | CEREALES GRANO | SACRAMENIA | irrigation | 10 | Segovia | herbaceous |
2015 | CAMPOS-PAN | MAIZ | CEREALES GRANO | VEZDEMARBAN | irrigation | 111 | Zamora | herbaceous |