{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Demo material for notebook on \"Digital economy and society statistics\"\n", "\n", "Prepared by [**F. Sheeka**](mailto:fsheeka@gmail.com).\n", "\n", "This notebook illustrates the *Statistics Explained* article on [**Digital economy and society statistics - households and individuals**](https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Digital_economy_and_society_statistics_-_households_and_individuals)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Figure 1: Internet access and broadband internet connections of households, EU-28, 2008-2018 (% of all households)" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "\n", "Attaching package: ‘dplyr’\n", "\n", "The following objects are masked from ‘package:stats’:\n", "\n", " filter, lag\n", "\n", "The following objects are masked from ‘package:base’:\n", "\n", " intersect, setdiff, setequal, union\n", "\n", "Loading required package: usethis\n", "restatapi: - config file with the API version 1 loaded from GitHub (the 'current' API version number is 1).\n", " - 2 from the 4 cores are used for parallel computing.\n", " - 'libcurl' will be used for file download.\n", " - the Table of contents (TOC) was not pre-loaded into the deafult cache ('.restatapi_env').\n" ] } ], "source": [ "library(ggplot2)\n", "library(tidyr)\n", "library(repr)\n", "library(dplyr)\n", "library(devtools)\n", "library(restatapi)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Using get_eurostat_data to pull data from the first dataset we want. 
\n", "I've called it dataset1_1 to distinguish it from the second dataset we pull later. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "dataset1_1 <- get_eurostat_data(id=\"isoc_ci_in_h\", \n", "filters = list(geo = \"EU28\", unit = \"PC_HH\", hhtyp = \"TOTAL\"),\n", "date_filter = \"2008:2018\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The first dataset I pulled does not have the same variables as the second one: it has no indic_is column. To tell the two series apart once they are combined, I add an indic_is column to the first dataset and set it to \"INT_ACCESS\". " ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "dataset1_1$indic_is <- \"INT_ACCESS\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* And now the data from the second dataset, filtered to the H_BROAD indicator" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "dataset1_2 <- get_eurostat_data(id=\"isoc_ci_it_h\", \n", "filters = list(geo = \"EU28\", unit = \"PC_HH\", hhtyp = \"TOTAL\", indic_is = \"H_BROAD\"),\n", "date_filter = \"2008:2018\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Combining the two datasets by row with rbind (row bind) to get a single dataset" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "dataset_fig1 <- rbind(dataset1_1, dataset1_2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Converting the years in the dataset to numeric values. This is necessary for line graphs, which need a continuous x-axis! " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "dataset_fig1$time <- as.numeric(as.character(dataset_fig1$time))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* If you like, you can take a sneak peek at the dataset using head() to make sure everything looks okay before you go ahead with the visualisation. 
You can also type the name of the dataset and run it to see the full dataset. " ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| unit | hhtyp | geo | time | values | indic_is |
|---|---|---|---|---|---|
| <fct> | <fct> | <fct> | <dbl> | <dbl> | <fct> |
| PC_HH | TOTAL | EU28 | 2008 | 60 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2009 | 66 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2010 | 70 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2011 | 73 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2012 | 76 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2013 | 79 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2014 | 81 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2015 | 83 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2016 | 85 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2017 | 87 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2018 | 89 | INT_ACCESS |
| PC_HH | TOTAL | EU28 | 2008 | 48 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2009 | 56 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2010 | 61 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2011 | 67 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2012 | 72 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2013 | 76 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2014 | 78 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2015 | 80 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2016 | 83 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2017 | 85 | H_BROAD |
| PC_HH | TOTAL | EU28 | 2018 | 86 | H_BROAD |