{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Preparació de dades\n",
"\n",
"\n",
"Preparació de dades per a [Innovation and Human Rights](https://ihr.world/ca/projecte-sumarissims).\n",
"\n",
"En aquest notebook, convertim les dades originals a formats que es poden treballar mès fàcilment a Python, i anem mostrant petites anàlisis."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np \n",
"import plydata as ply\n",
"import datetime\n",
"import yaml\n",
"from functools import lru_cache\n",
"\n",
"ply.options.modify_input_data = True\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Pas 1: llegir arxiu"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"original = pd.read_csv(\"data/input/20171101_actualitzat_llistat_Llei_11_2017.csv\",delimiter=\";\",encoding=\"latin-1\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Un primer cop d'ull a les dades"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Codi | \n",
" cognoms nom | \n",
" cognoms | \n",
" nom | \n",
" sexe | \n",
" edat | \n",
" Municipi naixement | \n",
" Pedanies/Agregats naixement | \n",
" Comarca naixement | \n",
" Província naixement | \n",
" ... | \n",
" Tipus procediment 2 | \n",
" num causa | \n",
" any inicial | \n",
" any aprovació sen o altra resol | \n",
" pena | \n",
" commutació/indult(demanat) | \n",
" afusellades | \n",
" ref num arxiu | \n",
" autoria i data de la descripció | \n",
" data correccio registre | \n",
"
\n",
" \n",
" \n",
" \n",
" 17 | \n",
" 135218 | \n",
" ABAD ALFONSO, Erundina | \n",
" ABAD ALFONSO | \n",
" Erundina | \n",
" Dona | \n",
" 43.0 | \n",
" Alcoi | \n",
" -- | \n",
" Alcoià | \n",
" Alacant | \n",
" ... | \n",
" Sumaríssim | \n",
" 030827 | \n",
" 1942.0 | \n",
" 1943.0 | \n",
" Sobreseïment | \n",
" NaN | \n",
" NaN | \n",
" 37729.0 | \n",
" ANC 2017 07 12 | \n",
" NaN | \n",
"
\n",
" \n",
" 18 | \n",
" 25335 | \n",
" ABAD ARBÓS, Ricardo | \n",
" ABAD ARBÓS | \n",
" Ricardo | \n",
" Home | \n",
" 30.0 | \n",
" Barcelona | \n",
" -- | \n",
" Barcelonès | \n",
" Barcelona | \n",
" ... | \n",
" -- | \n",
" -- | \n",
" 1939.0 | \n",
" 1939.0 | \n",
" Sense declaració de responsabilitats | \n",
" NaN | \n",
" NaN | \n",
" 12842.0 | \n",
" ANC 2017 07 12 | \n",
" NaN | \n",
"
\n",
" \n",
" 19 | \n",
" 123192 | \n",
" ABAD BARAS, José | \n",
" ABAD BARAS | \n",
" José | \n",
" Home | \n",
" 30.0 | \n",
" Benavarri | \n",
" -- | \n",
" Ribagorça | \n",
" Osca | \n",
" ... | \n",
" Sumaríssim | \n",
" 000940 | \n",
" 1939.0 | \n",
" 1939.0 | \n",
" Absolt | \n",
" NaN | \n",
" NaN | \n",
" 49476.0 | \n",
" ANC 2017 07 12 | \n",
" NaN | \n",
"
\n",
" \n",
" 20 | \n",
" 171 | \n",
" ABAD BATLLONE, Juan | \n",
" ABAD BATLLONE | \n",
" Juan | \n",
" Home | \n",
" 42.0 | \n",
" Barcelona | \n",
" -- | \n",
" Barcelonès | \n",
" Barcelona | \n",
" ... | \n",
" Diligències prèvies | \n",
" -- | \n",
" 1939.0 | \n",
" 1939.0 | \n",
" Sense declaració de responsabilitats | \n",
" NaN | \n",
" NaN | \n",
" 4063.0 | \n",
" ANC 2017 07 12 | \n",
" NaN | \n",
"
\n",
" \n",
" 21 | \n",
" 24485 | \n",
" ABAD BOIRA, Ricardo | \n",
" ABAD BOIRA | \n",
" Ricardo | \n",
" Home | \n",
" 48.0 | \n",
" Tauste | \n",
" -- | \n",
" -- | \n",
" Saragossa | \n",
" ... | \n",
" Diligències prèvies | \n",
" -- | \n",
" 1941.0 | \n",
" 1942.0 | \n",
" Sense declaració de responsabilitats | \n",
" NaN | \n",
" NaN | \n",
" 12915.0 | \n",
" ANC 2017 07 12 | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
5 rows × 29 columns
\n",
"
"
],
"text/plain": [
" Codi cognoms nom cognoms nom sexe edat \\\n",
"17 135218 ABAD ALFONSO, Erundina ABAD ALFONSO Erundina Dona 43.0 \n",
"18 25335 ABAD ARBÓS, Ricardo ABAD ARBÓS Ricardo Home 30.0 \n",
"19 123192 ABAD BARAS, José ABAD BARAS José Home 30.0 \n",
"20 171 ABAD BATLLONE, Juan ABAD BATLLONE Juan Home 42.0 \n",
"21 24485 ABAD BOIRA, Ricardo ABAD BOIRA Ricardo Home 48.0 \n",
"\n",
" Municipi naixement Pedanies/Agregats naixement Comarca naixement \\\n",
"17 Alcoi -- Alcoià \n",
"18 Barcelona -- Barcelonès \n",
"19 Benavarri -- Ribagorça \n",
"20 Barcelona -- Barcelonès \n",
"21 Tauste -- -- \n",
"\n",
" Província naixement ... Tipus procediment 2 num causa \\\n",
"17 Alacant ... Sumaríssim 030827 \n",
"18 Barcelona ... -- -- \n",
"19 Osca ... Sumaríssim 000940 \n",
"20 Barcelona ... Diligències prèvies -- \n",
"21 Saragossa ... Diligències prèvies -- \n",
"\n",
" any inicial any aprovació sen o altra resol \\\n",
"17 1942.0 1943.0 \n",
"18 1939.0 1939.0 \n",
"19 1939.0 1939.0 \n",
"20 1939.0 1939.0 \n",
"21 1941.0 1942.0 \n",
"\n",
" pena commutació/indult(demanat) \\\n",
"17 Sobreseïment NaN \n",
"18 Sense declaració de responsabilitats NaN \n",
"19 Absolt NaN \n",
"20 Sense declaració de responsabilitats NaN \n",
"21 Sense declaració de responsabilitats NaN \n",
"\n",
" afusellades ref num arxiu autoria i data de la descripció \\\n",
"17 NaN 37729.0 ANC 2017 07 12 \n",
"18 NaN 12842.0 ANC 2017 07 12 \n",
"19 NaN 49476.0 ANC 2017 07 12 \n",
"20 NaN 4063.0 ANC 2017 07 12 \n",
"21 NaN 12915.0 ANC 2017 07 12 \n",
"\n",
" data correccio registre \n",
"17 NaN \n",
"18 NaN \n",
"19 NaN \n",
"20 NaN \n",
"21 NaN \n",
"\n",
"[5 rows x 29 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# sexe \"--\" marca persones jurídiques\n",
"\n",
"original.query(\"sexe != '--'\").head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Pas 2: Canviar noms de columnes a identificadors permesos en *Python*"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"original >> ply.rename(\n",
"{'afusellades': 'afusellades',\n",
" 'any_inicial': 'any inicial',\n",
" 'any_resol': 'any aprovació sen o altra resol',\n",
" 'ca_naix': 'Comunitat autònoma naixement',\n",
" 'ca_res': 'Comunitat autònoma residència',\n",
" 'codi': 'Codi',\n",
" 'cognoms': 'cognoms',\n",
" 'cognoms_nom': 'cognoms nom',\n",
" 'com_naix': 'Comarca naixement',\n",
" 'com_res': 'Comarca residència',\n",
" 'correccio': 'data correccio registre',\n",
" 'descr': 'autoria i data de la descripció',\n",
" 'edat': 'edat',\n",
" 'indult': 'commutació/indult(demanat)',\n",
" 'mun_naix': 'Municipi naixement',\n",
" 'mun_res': 'Municipi residència',\n",
" 'nom': 'nom',\n",
" 'num_causa': 'num causa',\n",
" 'pais_naix': 'País naixement',\n",
" 'pais_res': 'País residència',\n",
" 'ped_naix': 'Pedanies/Agregats naixement',\n",
" 'ped_res': 'Pedanies/Agregats residencia',\n",
" 'pena': 'pena',\n",
" 'proc_1': 'Tipus procediment 1',\n",
" 'proc_2': 'Tipus procediment 2',\n",
" 'prov_naix': 'Província naixement',\n",
" 'prov_res': 'Província residència',\n",
" 'ref_arxiu': 'ref num arxiu',\n",
" 'genere': 'sexe'}\n",
")\n",
"\n",
"pass\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Pas 3: Canviar el format dels anys a data per automatitzar el càlcul dels eixos \n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# l'ús de lru_cache fa tot més ràpid, perquè cada any només ha de ser convertit a data una sola vegada\n",
"\n",
"\n",
"@lru_cache(maxsize=100)\n",
"def pd_year(n: float) :\n",
" try :\n",
" return pd.date_range(start=datetime.datetime(int(n),1,1),end=datetime.datetime(int(n),12,31),freq=\"Y\")[0]\n",
" except (TypeError,ValueError,OverflowError) :\n",
" return n\n",
"\n",
"\n",
"original[\"any_inicial\"] = original[\"any_inicial\"].apply(pd_year)\n",
"original[\"any_resol\"] = original[\"any_resol\"].apply(pd_year)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Primera anàlisi: els sumaríssims de més durada"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" cognoms_nom | \n",
" any_inicial | \n",
" any_resol | \n",
" genere | \n",
" pena | \n",
" pais_naix | \n",
" dura | \n",
"
\n",
" \n",
" \n",
" \n",
" 55258 | \n",
" ROIG ALTADILL, Ramón | \n",
" 1942-12-31 | \n",
" 1974-12-31 | \n",
" Home | \n",
" Arxiu | \n",
" Espanya | \n",
" 11688 days | \n",
"
\n",
" \n",
" 3172 | \n",
" ARCELUS MARTÍNEZ, Aurelio | \n",
" 1942-12-31 | \n",
" 1972-12-31 | \n",
" Home | \n",
" Sobreseïment | \n",
" Espanya | \n",
" 10958 days | \n",
"
\n",
" \n",
" 17551 | \n",
" COT REVERTER, Juan | \n",
" 1940-12-31 | \n",
" 1969-12-31 | \n",
" Home | \n",
" Repatriació | \n",
" Espanya | \n",
" 10592 days | \n",
"
\n",
" \n",
" 19057 | \n",
" DOLZ BES, José María | \n",
" 1938-12-31 | \n",
" 1966-12-31 | \n",
" Home | \n",
" Sobreseïment | \n",
" Espanya | \n",
" 10227 days | \n",
"
\n",
" \n",
" 49033 | \n",
" PIÑOL TORT, León | \n",
" 1941-12-31 | \n",
" 1968-12-31 | \n",
" Home | \n",
" Sobreseïment | \n",
" Espanya | \n",
" 9862 days | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" cognoms_nom any_inicial any_resol genere pena \\\n",
"55258 ROIG ALTADILL, Ramón 1942-12-31 1974-12-31 Home Arxiu \n",
"3172 ARCELUS MARTÍNEZ, Aurelio 1942-12-31 1972-12-31 Home Sobreseïment \n",
"17551 COT REVERTER, Juan 1940-12-31 1969-12-31 Home Repatriació \n",
"19057 DOLZ BES, José María 1938-12-31 1966-12-31 Home Sobreseïment \n",
"49033 PIÑOL TORT, León 1941-12-31 1968-12-31 Home Sobreseïment \n",
"\n",
" pais_naix dura \n",
"55258 Espanya 11688 days \n",
"3172 Espanya 10958 days \n",
"17551 Espanya 10592 days \n",
"19057 Espanya 10227 days \n",
"49033 Espanya 9862 days "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"dura = original.loc[:,[\"cognoms_nom\",\"any_inicial\",\"any_resol\",\"genere\",\"pena\",\"pais_naix\"]]\n",
"\n",
"dura[\"dura\"] = dura[\"any_resol\"] - dura[\"any_inicial\"]\n",
"\n",
"dura.sort_values([\"dura\"],ascending=False).iloc[:5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Pas 4: Categoritzar penes\n",
"\n",
"Aquestes categories Es deriven de l’[*Anexo de la Circular de Creación de las Comisiones de Examen de Penas* del any 1940](https://www.boe.es/datos/pdfs/BOE//1940/026/A00662-00665.pdf). \n",
"\n",
"En alguns casos, les dades publicades són ambigües - per exemple quan la pena és de *\"Sobreseïment provisional, tres mesos d'arrest major i llibertat\"*. En aquests casos, categoritzem com la pena més greu que es va imposar (*\"tres mesos\"*). Si queden dubtes, la categoria es \"altres\" - per exemple amb *\"Pena de multa, remissió o a disposició d'altres autoritats\"*.\n",
"\n",
"Cal tenir en compte també que el “sobreseïment” es podia produir també quan la persona havia mort durant la seva reclusió mentre esperava judici o en altres circumstàncies.\n",
"\n",
"La relació entre les penes que consten a la base de dades original (etiquetes) i les categories analitzades està documentada a l'arxiu [categories.yml](https://github.com/martinvirtel/sumarissims-dades/blob/master/work/data/input/categories.yml). \n",
"\n",
"| categoria | descripció | |\n",
"| ----------| ----------- | --- |\n",
"|mort\t|pena de mort\t||\t\n",
"|30a+\t|30 anys i un dia a perpètua||\t\t\n",
"|20-30a\t|20 anys i un dia a 30 anys |Reclusión\tMayor|\n",
"|12-20a\t|12 anys i un día a 20 anys|Menor|\n",
"|6-12a\t|6 anys i un dia a 12 anys |Prisión\tMayor|\n",
"|6m-6a\t|6 mesos i un dia a 6 anys\t\t|Menor|\n",
"|<6m\t|fins 6 mesos\t||\n",
"|sob\t|sobreseïment\t||\n",
"|lib\t|sin pena\t\t||\n",
"|alt\t|altres\t\t||\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"penes_cat = yaml.load(open(\"data/input/categories.yml\"))\n",
"ord_categories = { a[1][\"nom\"] : \"%02d\" % a[0] for a in enumerate(penes_cat[\"categories\"]) }\n",
"\n",
"categories_penes={ a[\"pena\"] : \"%s %s\" % (ord_categories[a[\"categoria\"]],a[\"categoria\"]) \n",
" for a in penes_cat[\"etiquetes\"] }\n",
"\n",
"original[\"pena_cat\"] = original[\"pena\"].apply(\n",
" lambda a: categories_penes.get(a,None))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Primera anàlisi: relació entre penes imposades i gènere\n",
"\n",
"\n",
"Entre els homes, el grup més gran és el de condemnat a 12 - 20 anys (30.0%). Entre les dones, són les que queden en llibertat (40.8%)."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" \n",
" \n",
" \n",
" genere | \n",
" Dona | \n",
" Home | \n",
" Dones % | \n",
" Homes % | \n",
"
\n",
" pena_cat | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" 00 mort | \n",
" 41 | \n",
" 4376 | \n",
" 0.7% | \n",
" 6.9% | \n",
"
\n",
" 01 30a+ | \n",
" 119 | \n",
" 3628 | \n",
" 2.2% | \n",
" 5.7% | \n",
"
\n",
" 02 20-30a | \n",
" 50 | \n",
" 1155 | \n",
" 0.9% | \n",
" 1.8% | \n",
"
\n",
" 03 12-20a | \n",
" 856 | \n",
" 18935 | \n",
" 15.6% | \n",
" 30.0% | \n",
"
\n",
" 04 6-12a | \n",
" 590 | \n",
" 2378 | \n",
" 10.7% | \n",
" 3.8% | \n",
"
\n",
" 05 6m-6a | \n",
" 95 | \n",
" 1673 | \n",
" 1.7% | \n",
" 2.6% | \n",
"
\n",
" 06 <6m | \n",
" 247 | \n",
" 766 | \n",
" 4.5% | \n",
" 1.2% | \n",
"
\n",
" 07 sob | \n",
" 813 | \n",
" 9382 | \n",
" 14.8% | \n",
" 14.8% | \n",
"
\n",
" 08 lib | \n",
" 2246 | \n",
" 14404 | \n",
" 40.8% | \n",
" 22.8% | \n",
"
\n",
" 09 alt | \n",
" 445 | \n",
" 6525 | \n",
" 8.1% | \n",
" 10.3% | \n",
"
\n",
"
"
],
"text/plain": [
""
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"penes = original.query(\"genere != '--'\").\\\n",
" groupby([\"genere\",\"pena_cat\"])[\"pena_cat\"].count().\\\n",
" unstack(\"genere\")\n",
" \n",
"penes[\"Dones %\"] = penes[\"Dona\"].div(sum(penes[\"Dona\"]))\n",
"penes[\"Homes %\"] = penes[\"Home\"].div(sum(penes[\"Home\"]))\n",
"\n",
"\n",
"def highlight_max(s):\n",
" is_max = s == s.max()\n",
" return ['background-color:yellow' if v else '' for v in is_max]\n",
"\n",
"penes.style.apply(highlight_max,axis=0).\\\n",
" format(\"{:.1%}\",subset=['Dones %','Homes %'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Pas 5: Categoritzar origen i residència\n",
"\n",
"Categories: \n",
"\n",
" - b - Barcelona\n",
" - c - Catalunya (fora de la província de Barcelona)\n",
" - e - Espanya (fora de Catalunya)\n",
" - i - internacional (fora d'Espanya)\n",
" - ? - altres / no consta\n",
" \n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def cat_naix(cas) :\n",
" if cas.prov_naix == '--' and cas.pais_naix == '--' :\n",
" naix=\"?\"\n",
" if cas.prov_naix == \"Barcelona\" :\n",
" naix=\"b\"\n",
" elif cas.ca_naix == \"Catalunya\" :\n",
" naix=\"c\"\n",
" elif cas.pais_naix == \"Espanya\" :\n",
" naix=\"e\"\n",
" else:\n",
" naix=\"i\"\n",
" return naix\n",
" \n",
"def cat_res(cas) :\n",
" if cas.prov_res == '--' and cas.pais_res == '--' :\n",
" return \"?\"\n",
" if cas.prov_res == \"Barcelona\" :\n",
" res=\"b\"\n",
" elif cas.ca_res == \"Catalunya\" :\n",
" res=\"c\"\n",
" elif cas.pais_res.find(\"Espanya\")>-1 :\n",
" res=\"e\"\n",
" elif cas.pais_res.find(\"Sense domicili fix\")>-1 :\n",
" res=\"e\"\n",
" else :\n",
" res=\"i\"\n",
" return res\n",
"\n",
"\n",
"original[\"cat_naix\"]=original.apply(cat_naix,axis=1)\n",
"original[\"cat_res\"]=original.apply(cat_res,axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Primera anàlisi: Relació entre lloc de naixement i de residència\n",
"\n",
"Aplicant les nostres categories, a primer cop d’ull es noten poques diferències entre homes i dones. Queda per investigar."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"naix_res = original.query(\"genere != '--'\").\\\n",
" groupby([\"cat_naix\",\"cat_res\",\"genere\"])[\"nom\"].\\\n",
" count().\\\n",
" unstack(\"cat_res\")\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"La taula mostra el percentatge entre totes les dones o tots els homes d'aquesta categoria de residència. Exemple: **28.2%** de les dones (i **16.0%** dels homes) nascudes a Catalunya fora de la província de Barcelona resideixen a la província de Barcelona.\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" \n",
" \n",
" \n",
" | \n",
" cat_res | \n",
" ? | \n",
" b | \n",
" c | \n",
" e | \n",
" i | \n",
"
\n",
" cat_naix | \n",
" genere | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" b | \n",
" Dona | \n",
" 0.3% | \n",
" 95.9% | \n",
" 3.7% | \n",
" 0.1% | \n",
" nan% | \n",
"
\n",
" Home | \n",
" 0.1% | \n",
" 95.3% | \n",
" 4.2% | \n",
" 0.3% | \n",
" 0.1% | \n",
"
\n",
" c | \n",
" Dona | \n",
" 0.1% | \n",
" 28.2% | \n",
" 71.2% | \n",
" 0.4% | \n",
" 0.1% | \n",
"
\n",
" Home | \n",
" 0.1% | \n",
" 16.0% | \n",
" 83.5% | \n",
" 0.3% | \n",
" 0.1% | \n",
"
\n",
" e | \n",
" Dona | \n",
" 0.2% | \n",
" 87.0% | \n",
" 7.2% | \n",
" 5.6% | \n",
" 0.1% | \n",
"
\n",
" Home | \n",
" 0.3% | \n",
" 73.7% | \n",
" 14.3% | \n",
" 11.3% | \n",
" 0.4% | \n",
"
\n",
" i | \n",
" Dona | \n",
" 11.8% | \n",
" 67.8% | \n",
" 16.9% | \n",
" 2.4% | \n",
" 1.2% | \n",
"
\n",
" Home | \n",
" 19.0% | \n",
" 37.4% | \n",
" 39.2% | \n",
" 1.8% | \n",
" 2.5% | \n",
"
\n",
"
"
],
"text/plain": [
""
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\n",
"naix_res = naix_res.div(naix_res.sum(axis=1),axis=0)\n",
"\n",
"def highlight_max(s):\n",
" is_max = s == s.max()\n",
" return ['font-weight:bold' if v else '' for v in is_max]\n",
"\n",
"naix_res.style.apply(highlight_max,axis=1,\n",
" subset=(pd.IndexSlice['c'],pd.IndexSlice['b'])).\\\n",
" format(\"{:.1%}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Pas 6: executades o no?"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"def cat_execut(a) :\n",
" if str(a).find(\"executa\") > -1 :\n",
" return \"exec\"\n",
" else :\n",
" return \"no exec\"\n",
"\n",
"\n",
"original[\"af_cat\"]=original[\"afusellades\"].apply(cat_execut)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Primera análisi: Pena de mort i execucions\n",
"\n",
"La majoria de les 41 dones condemnades a pena de mort no van ser executades. Entre els homes, una quarta part dels 4376 condemnats a mort no van ser executats segons les dades."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" genere | \n",
" Dona | \n",
" Home | \n",
" sum | \n",
"
\n",
" \n",
" af_cat | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" exec | \n",
" 17 | \n",
" 3345 | \n",
" 3362 | \n",
"
\n",
" \n",
" no exec | \n",
" 24 | \n",
" 1031 | \n",
" 1055 | \n",
"
\n",
" \n",
" sum | \n",
" 41 | \n",
" 4376 | \n",
" 4417 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"genere Dona Home sum\n",
"af_cat \n",
"exec 17 3345 3362\n",
"no exec 24 1031 1055\n",
"sum 41 4376 4417"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pena_mort = original.query(\"pena_cat == '00 mort'\").\\\n",
" groupby([\"genere\",\"af_cat\"])[\"pena_cat\"].count().\\\n",
" unstack(\"genere\")\n",
" \n",
"pena_mort_sum=pena_mort.sum(axis=0)\n",
"pena_mort_sum.name=\"sum\"\n",
"pena_mort = pena_mort.append(pena_mort_sum)\n",
"pena_mort[\"sum\"] = pena_mort.sum(axis=1)\n",
"\n",
"pena_mort"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Pas 7: Guardar dades processats"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"original.to_msgpack(\"data/processat.msg\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Primera anàlisi: Relaciò entre \"Tipus procediment 1\" i \"Tipus procediment 2\"\n",
"\n",
"Trobem aquestes dues columnes de les dades originals difícils d'interpretar."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" proc_1_simple | \n",
" -- | \n",
" Consells de guerra | \n",
" Diligències prèvies | \n",
"
\n",
" \n",
" proc_2 | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" -- | \n",
" 3313 | \n",
" 10696 | \n",
" 4 | \n",
"
\n",
" \n",
" Acord | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Actuacions | \n",
" 9 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Antecedents | \n",
" 8 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Auditoria | \n",
" 2 | \n",
" 3 | \n",
" | \n",
"
\n",
" \n",
" Causa | \n",
" | \n",
" 1100 | \n",
" | \n",
"
\n",
" \n",
" Causa acumulada | \n",
" | \n",
" 965 | \n",
" | \n",
"
\n",
" \n",
" Causa criminal | \n",
" | \n",
" 59 | \n",
" | \n",
"
\n",
" \n",
" Causa d'Oficials Generals | \n",
" | \n",
" 14 | \n",
" | \n",
"
\n",
" \n",
" Causa ordinària | \n",
" 37 | \n",
" 1975 | \n",
" | \n",
"
\n",
" \n",
" Causa sumaríssima | \n",
" | \n",
" 3 | \n",
" | \n",
"
\n",
" \n",
" Certificat | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Declaració jurada | \n",
" 2 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Denúncia | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Diligències | \n",
" 3 | \n",
" | \n",
" 118 | \n",
"
\n",
" \n",
" Diligències d'urgència | \n",
" | \n",
" | \n",
" 1 | \n",
"
\n",
" \n",
" Diligències de guàrdia | \n",
" 2 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Diligències governatives | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Diligències informatives | \n",
" 2 | \n",
" | \n",
" 1 | \n",
"
\n",
" \n",
" Diligències prèvies | \n",
" | \n",
" | \n",
" 4181 | \n",
"
\n",
" \n",
" Documentació | \n",
" 6 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Expedient | \n",
" 236 | \n",
" 55 | \n",
" | \n",
"
\n",
" \n",
" Expedient governatiu | \n",
" 38 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Expedient informatiu | \n",
" 14 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Expedient judicial | \n",
" 15 | \n",
" 29 | \n",
" | \n",
"
\n",
" \n",
" Expedient no judicial | \n",
" 2 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Informació | \n",
" 11 | \n",
" | \n",
" 1 | \n",
"
\n",
" \n",
" Informació continuada en causa | \n",
" | \n",
" 1 | \n",
" | \n",
"
\n",
" \n",
" Informe | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Informe judicial | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Informe no judicial | \n",
" 2 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Informes | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Instància de testimoni | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Liquidacions | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" Procediment | \n",
" 3 | \n",
" 16 | \n",
" | \n",
"
\n",
" \n",
" Procediment acumulat | \n",
" | \n",
" 6 | \n",
" | \n",
"
\n",
" \n",
" Procediment ordinari | \n",
" 1 | \n",
" 7 | \n",
" | \n",
"
\n",
" \n",
" Procediment previ | \n",
" | \n",
" | \n",
" 236 | \n",
"
\n",
" \n",
" Sumari | \n",
" | \n",
" 2 | \n",
" | \n",
"
\n",
" \n",
" Sumaríssim | \n",
" 2 | \n",
" 46598 | \n",
" | \n",
"
\n",
" \n",
" Testimoni | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
" informacions | \n",
" 1 | \n",
" | \n",
" | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
"proc_1_simple -- Consells de guerra Diligències prèvies\n",
"proc_2 \n",
"-- 3313 10696 4\n",
"Acord 1 \n",
"Actuacions 9 \n",
"Antecedents 8 \n",
"Auditoria 2 3 \n",
"Causa 1100 \n",
"Causa acumulada 965 \n",
"Causa criminal 59 \n",
"Causa d'Oficials Generals 14 \n",
"Causa ordinària 37 1975 \n",
"Causa sumaríssima 3 \n",
"Certificat 1 \n",
"Declaració jurada 2 \n",
"Denúncia 1 \n",
"Diligències 3 118\n",
"Diligències d'urgència 1\n",
"Diligències de guàrdia 2 \n",
"Diligències governatives 1 \n",
"Diligències informatives 2 1\n",
"Diligències prèvies 4181\n",
"Documentació 6 \n",
"Expedient 236 55 \n",
"Expedient governatiu 38 \n",
"Expedient informatiu 14 \n",
"Expedient judicial 15 29 \n",
"Expedient no judicial 2 \n",
"Informació 11 1\n",
"Informació continuada en causa 1 \n",
"Informe 1 \n",
"Informe judicial 1 \n",
"Informe no judicial 2 \n",
"Informes 1 \n",
"Instància de testimoni 1 \n",
"Liquidacions 1 \n",
"Procediment 3 16 \n",
"Procediment acumulat 6 \n",
"Procediment ordinari 1 7 \n",
"Procediment previ 236\n",
"Sumari 2 \n",
"Sumaríssim 2 46598 \n",
"Testimoni 1 \n",
"informacions 1 "
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"original[\"proc_1_simple\"]=original[\"proc_1\"].apply(lambda a: a.replace(\" []\",\"\"))\n",
"\n",
"original.groupby([\"proc_2\",\"proc_1_simple\"])[\"nom\"].count().unstack(\"proc_1_simple\").fillna(' ')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Primera anàlisi: Relaciò entre \"Tipus procediment 1\" i \"pena\"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
" \n",
" \n",
" \n",
" proc_1_simple | \n",
" -- | \n",
" Consells de guerra | \n",
" Diligències prèvies | \n",
"
\n",
" genere | \n",
" -- | \n",
" Dona | \n",
" Home | \n",
" -- | \n",
" Dona | \n",
" Home | \n",
" -- | \n",
" Dona | \n",
" Home | \n",
"
\n",
" pena_cat | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" 00 mort | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
" 0 | \n",
" 41 | \n",
" 4371 | \n",
" 0 | \n",
" 0 | \n",
" 4 | \n",
"
\n",
" 01 30a+ | \n",
" 0 | \n",
" 1 | \n",
" 8 | \n",
" 0 | \n",
" 118 | \n",
" 3617 | \n",
" 0 | \n",
" 0 | \n",
" 3 | \n",
"
\n",
" 02 20-30a | \n",
" 0 | \n",
" 0 | \n",
" 2 | \n",
" 0 | \n",
" 50 | \n",
" 1152 | \n",
" 0 | \n",
" 0 | \n",
" 1 | \n",
"
\n",
" 03 12-20a | \n",
" 0 | \n",
" 2 | \n",
" 17 | \n",
" 0 | \n",
" 851 | \n",
" 18881 | \n",
" 0 | \n",
" 3 | \n",
" 37 | \n",
"
\n",
" 04 6-12a | \n",
" 0 | \n",
" 2 | \n",
" 5 | \n",
" 0 | \n",
" 588 | \n",
" 2367 | \n",
" 0 | \n",
" 0 | \n",
" 6 | \n",
"
\n",
" 05 6m-6a | \n",
" 0 | \n",
" 0 | \n",
" 25 | \n",
" 0 | \n",
" 95 | \n",
" 1643 | \n",
" 0 | \n",
" 0 | \n",
" 5 | \n",
"
\n",
" 06 <6m | \n",
" 0 | \n",
" 48 | \n",
" 78 | \n",
" 0 | \n",
" 186 | \n",
" 640 | \n",
" 0 | \n",
" 13 | \n",
" 48 | \n",
"
\n",
" 07 sob | \n",
" 0 | \n",
" 4 | \n",
" 128 | \n",
" 5 | \n",
" 789 | \n",
" 9095 | \n",
" 2 | \n",
" 20 | \n",
" 159 | \n",
"
\n",
" 08 lib | \n",
" 0 | \n",
" 651 | \n",
" 1772 | \n",
" 0 | \n",
" 1095 | \n",
" 9307 | \n",
" 7 | \n",
" 500 | \n",
" 3325 | \n",
"
\n",
" 09 alt | \n",
" 0 | \n",
" 158 | \n",
" 753 | \n",
" 0 | \n",
" 250 | \n",
" 5425 | \n",
" 2 | \n",
" 37 | \n",
" 333 | \n",
"
\n",
"
"
],
"text/plain": [
""
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"original.groupby([\"pena_cat\",\"proc_1_simple\",\"genere\"])[\"nom\"].\\\n",
" count().\\\n",
" unstack(\"proc_1_simple\").\\\n",
" unstack(\"genere\").\\\n",
" fillna(0).\\\n",
" style.format(\"{:0g}\")\n",
" "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}