{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Preparació de dades\n", "\n", "\n", "Preparació de dades per a [Innovation and Human Rights](https://ihr.world/ca/projecte-sumarissims).\n", "\n", "En aquest notebook, convertim les dades originals a formats que es poden treballar mès fàcilment a Python, i anem mostrant petites anàlisis." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np \n", "import plydata as ply\n", "import datetime\n", "import yaml\n", "from functools import lru_cache\n", "\n", "ply.options.modify_input_data = True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pas 1: llegir arxiu" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "original = pd.read_csv(\"data/input/20171101_actualitzat_llistat_Llei_11_2017.csv\",delimiter=\";\",encoding=\"latin-1\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Un primer cop d'ull a les dades" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Codicognoms nomcognomsnomsexeedatMunicipi naixementPedanies/Agregats naixementComarca naixementProvíncia naixement...Tipus procediment 2num causaany inicialany aprovació sen o altra resolpenacommutació/indult(demanat)afuselladesref num arxiuautoria i data de la descripciódata correccio registre
17135218ABAD ALFONSO, ErundinaABAD ALFONSOErundinaDona43.0Alcoi--AlcoiàAlacant...Sumaríssim0308271942.01943.0SobreseïmentNaNNaN37729.0ANC 2017 07 12NaN
1825335ABAD ARBÓS, RicardoABAD ARBÓSRicardoHome30.0Barcelona--BarcelonèsBarcelona...----1939.01939.0Sense declaració de responsabilitatsNaNNaN12842.0ANC 2017 07 12NaN
19123192ABAD BARAS, JoséABAD BARASJoséHome30.0Benavarri--RibagorçaOsca...Sumaríssim0009401939.01939.0AbsoltNaNNaN49476.0ANC 2017 07 12NaN
20171ABAD BATLLONE, JuanABAD BATLLONEJuanHome42.0Barcelona--BarcelonèsBarcelona...Diligències prèvies--1939.01939.0Sense declaració de responsabilitatsNaNNaN4063.0ANC 2017 07 12NaN
2124485ABAD BOIRA, RicardoABAD BOIRARicardoHome48.0Tauste----Saragossa...Diligències prèvies--1941.01942.0Sense declaració de responsabilitatsNaNNaN12915.0ANC 2017 07 12NaN
\n", "

5 rows × 29 columns

\n", "
" ], "text/plain": [ " Codi cognoms nom cognoms nom sexe edat \\\n", "17 135218 ABAD ALFONSO, Erundina ABAD ALFONSO Erundina Dona 43.0 \n", "18 25335 ABAD ARBÓS, Ricardo ABAD ARBÓS Ricardo Home 30.0 \n", "19 123192 ABAD BARAS, José ABAD BARAS José Home 30.0 \n", "20 171 ABAD BATLLONE, Juan ABAD BATLLONE Juan Home 42.0 \n", "21 24485 ABAD BOIRA, Ricardo ABAD BOIRA Ricardo Home 48.0 \n", "\n", " Municipi naixement Pedanies/Agregats naixement Comarca naixement \\\n", "17 Alcoi -- Alcoià \n", "18 Barcelona -- Barcelonès \n", "19 Benavarri -- Ribagorça \n", "20 Barcelona -- Barcelonès \n", "21 Tauste -- -- \n", "\n", " Província naixement ... Tipus procediment 2 num causa \\\n", "17 Alacant ... Sumaríssim 030827 \n", "18 Barcelona ... -- -- \n", "19 Osca ... Sumaríssim 000940 \n", "20 Barcelona ... Diligències prèvies -- \n", "21 Saragossa ... Diligències prèvies -- \n", "\n", " any inicial any aprovació sen o altra resol \\\n", "17 1942.0 1943.0 \n", "18 1939.0 1939.0 \n", "19 1939.0 1939.0 \n", "20 1939.0 1939.0 \n", "21 1941.0 1942.0 \n", "\n", " pena commutació/indult(demanat) \\\n", "17 Sobreseïment NaN \n", "18 Sense declaració de responsabilitats NaN \n", "19 Absolt NaN \n", "20 Sense declaració de responsabilitats NaN \n", "21 Sense declaració de responsabilitats NaN \n", "\n", " afusellades ref num arxiu autoria i data de la descripció \\\n", "17 NaN 37729.0 ANC 2017 07 12 \n", "18 NaN 12842.0 ANC 2017 07 12 \n", "19 NaN 49476.0 ANC 2017 07 12 \n", "20 NaN 4063.0 ANC 2017 07 12 \n", "21 NaN 12915.0 ANC 2017 07 12 \n", "\n", " data correccio registre \n", "17 NaN \n", "18 NaN \n", "19 NaN \n", "20 NaN \n", "21 NaN \n", "\n", "[5 rows x 29 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# sexe \"--\" marca persones jurídiques\n", "\n", "original.query(\"sexe != '--'\").head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pas 2: Canviar noms de columnes a identificadors permesos en *Python*" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "original >> ply.rename(\n", "{'afusellades': 'afusellades',\n", " 'any_inicial': 'any inicial',\n", " 'any_resol': 'any aprovació sen o altra resol',\n", " 'ca_naix': 'Comunitat autònoma naixement',\n", " 'ca_res': 'Comunitat autònoma residència',\n", " 'codi': 'Codi',\n", " 'cognoms': 'cognoms',\n", " 'cognoms_nom': 'cognoms nom',\n", " 'com_naix': 'Comarca naixement',\n", " 'com_res': 'Comarca residència',\n", " 'correccio': 'data correccio registre',\n", " 'descr': 'autoria i data de la descripció',\n", " 'edat': 'edat',\n", " 'indult': 'commutació/indult(demanat)',\n", " 'mun_naix': 'Municipi naixement',\n", " 'mun_res': 'Municipi residència',\n", " 'nom': 'nom',\n", " 'num_causa': 'num causa',\n", " 'pais_naix': 'País naixement',\n", " 'pais_res': 'País residència',\n", " 'ped_naix': 'Pedanies/Agregats naixement',\n", " 'ped_res': 'Pedanies/Agregats residencia',\n", " 'pena': 'pena',\n", " 'proc_1': 'Tipus procediment 1',\n", " 'proc_2': 'Tipus procediment 2',\n", " 'prov_naix': 'Província naixement',\n", " 'prov_res': 'Província residència',\n", " 'ref_arxiu': 'ref num arxiu',\n", " 'genere': 'sexe'}\n", ")\n", "\n", "pass\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pas 3: Canviar el format dels anys a data per automatitzar el càlcul dels eixos \n", "\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# l'ús de lru_cache fa tot més ràpid, perquè cada any només ha de ser convertit a data una sola vegada\n", "\n", "\n", "@lru_cache(maxsize=100)\n", "def pd_year(n: float) :\n", " try :\n", " return pd.date_range(start=datetime.datetime(int(n),1,1),end=datetime.datetime(int(n),12,31),freq=\"Y\")[0]\n", " except (TypeError,ValueError,OverflowError) :\n", " return n\n", "\n", "\n", "original[\"any_inicial\"] = original[\"any_inicial\"].apply(pd_year)\n", "original[\"any_resol\"] = original[\"any_resol\"].apply(pd_year)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primera anàlisi: els sumaríssims de més durada" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cognoms_nomany_inicialany_resolgenerepenapais_naixdura
55258ROIG ALTADILL, Ramón1942-12-311974-12-31HomeArxiuEspanya11688 days
3172ARCELUS MARTÍNEZ, Aurelio1942-12-311972-12-31HomeSobreseïmentEspanya10958 days
17551COT REVERTER, Juan1940-12-311969-12-31HomeRepatriacióEspanya10592 days
19057DOLZ BES, José María1938-12-311966-12-31HomeSobreseïmentEspanya10227 days
49033PIÑOL TORT, León1941-12-311968-12-31HomeSobreseïmentEspanya9862 days
\n", "
" ], "text/plain": [ " cognoms_nom any_inicial any_resol genere pena \\\n", "55258 ROIG ALTADILL, Ramón 1942-12-31 1974-12-31 Home Arxiu \n", "3172 ARCELUS MARTÍNEZ, Aurelio 1942-12-31 1972-12-31 Home Sobreseïment \n", "17551 COT REVERTER, Juan 1940-12-31 1969-12-31 Home Repatriació \n", "19057 DOLZ BES, José María 1938-12-31 1966-12-31 Home Sobreseïment \n", "49033 PIÑOL TORT, León 1941-12-31 1968-12-31 Home Sobreseïment \n", "\n", " pais_naix dura \n", "55258 Espanya 11688 days \n", "3172 Espanya 10958 days \n", "17551 Espanya 10592 days \n", "19057 Espanya 10227 days \n", "49033 Espanya 9862 days " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dura = original.loc[:,[\"cognoms_nom\",\"any_inicial\",\"any_resol\",\"genere\",\"pena\",\"pais_naix\"]]\n", "\n", "dura[\"dura\"] = dura[\"any_resol\"] - dura[\"any_inicial\"]\n", "\n", "dura.sort_values([\"dura\"],ascending=False).iloc[:5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pas 4: Categoritzar penes\n", "\n", "Aquestes categories Es deriven de l’[*Anexo de la Circular de Creación de las Comisiones de Examen de Penas* del any 1940](https://www.boe.es/datos/pdfs/BOE//1940/026/A00662-00665.pdf). \n", "\n", "En alguns casos, les dades publicades són ambigües - per exemple quan la pena és de *\"Sobreseïment provisional, tres mesos d'arrest major i llibertat\"*. En aquests casos, categoritzem com la pena més greu que es va imposar (*\"tres mesos\"*). Si queden dubtes, la categoria es \"altres\" - per exemple amb *\"Pena de multa, remissió o a disposició d'altres autoritats\"*.\n", "\n", "Cal tenir en compte també que el “sobreseïment” es podia produir també quan la persona havia mort durant la seva reclusió mentre esperava judici o en altres circumstàncies.\n", "\n", "La relació entre les penes que consten a la base de dades original (etiquetes) i les categories analitzades està documentada a l'arxiu [categories.yml](https://github.com/martinvirtel/sumarissims-dades/blob/master/work/data/input/categories.yml). \n", "\n", "| categoria | descripció | |\n", "| ----------| ----------- | --- |\n", "|mort\t|pena de mort\t||\t\n", "|30a+\t|30 anys i un dia a perpètua||\t\t\n", "|20-30a\t|20 anys i un dia a 30 anys |Reclusión\tMayor|\n", "|12-20a\t|12 anys i un día a 20 anys|Menor|\n", "|6-12a\t|6 anys i un dia a 12 anys |Prisión\tMayor|\n", "|6m-6a\t|6 mesos i un dia a 6 anys\t\t|Menor|\n", "|<6m\t|fins 6 mesos\t||\n", "|sob\t|sobreseïment\t||\n", "|lib\t|sin pena\t\t||\n", "|alt\t|altres\t\t||\n", "\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "penes_cat = yaml.load(open(\"data/input/categories.yml\"))\n", "ord_categories = { a[1][\"nom\"] : \"%02d\" % a[0] for a in enumerate(penes_cat[\"categories\"]) }\n", "\n", "categories_penes={ a[\"pena\"] : \"%s %s\" % (ord_categories[a[\"categoria\"]],a[\"categoria\"]) \n", " for a in penes_cat[\"etiquetes\"] }\n", "\n", "original[\"pena_cat\"] = original[\"pena\"].apply(\n", " lambda a: categories_penes.get(a,None))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primera anàlisi: relació entre penes imposades i gènere\n", "\n", "\n", "Entre els homes, el grup més gran és el de condemnat a 12 - 20 anys (30.0%). Entre les dones, són les que queden en llibertat (40.8%)." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
genereDonaHomeDones %Homes %
pena_cat
00 mort4143760.7%6.9%
01 30a+11936282.2%5.7%
02 20-30a5011550.9%1.8%
03 12-20a8561893515.6%30.0%
04 6-12a590237810.7%3.8%
05 6m-6a9516731.7%2.6%
06 <6m2477664.5%1.2%
07 sob813938214.8%14.8%
08 lib22461440440.8%22.8%
09 alt44565258.1%10.3%
" ], "text/plain": [ "" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "penes = original.query(\"genere != '--'\").\\\n", " groupby([\"genere\",\"pena_cat\"])[\"pena_cat\"].count().\\\n", " unstack(\"genere\")\n", " \n", "penes[\"Dones %\"] = penes[\"Dona\"].div(sum(penes[\"Dona\"]))\n", "penes[\"Homes %\"] = penes[\"Home\"].div(sum(penes[\"Home\"]))\n", "\n", "\n", "def highlight_max(s):\n", " is_max = s == s.max()\n", " return ['background-color:yellow' if v else '' for v in is_max]\n", "\n", "penes.style.apply(highlight_max,axis=0).\\\n", " format(\"{:.1%}\",subset=['Dones %','Homes %'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pas 5: Categoritzar origen i residència\n", "\n", "Categories: \n", "\n", " - b - Barcelona\n", " - c - Catalunya (fora de la província de Barcelona)\n", " - e - Espanya (fora de Catalunya)\n", " - i - internacional (fora d'Espanya)\n", " - ? - altres / no consta\n", " \n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "def cat_naix(cas) :\n", " if cas.prov_naix == '--' and cas.pais_naix == '--' :\n", " naix=\"?\"\n", " if cas.prov_naix == \"Barcelona\" :\n", " naix=\"b\"\n", " elif cas.ca_naix == \"Catalunya\" :\n", " naix=\"c\"\n", " elif cas.pais_naix == \"Espanya\" :\n", " naix=\"e\"\n", " else:\n", " naix=\"i\"\n", " return naix\n", " \n", "def cat_res(cas) :\n", " if cas.prov_res == '--' and cas.pais_res == '--' :\n", " return \"?\"\n", " if cas.prov_res == \"Barcelona\" :\n", " res=\"b\"\n", " elif cas.ca_res == \"Catalunya\" :\n", " res=\"c\"\n", " elif cas.pais_res.find(\"Espanya\")>-1 :\n", " res=\"e\"\n", " elif cas.pais_res.find(\"Sense domicili fix\")>-1 :\n", " res=\"e\"\n", " else :\n", " res=\"i\"\n", " return res\n", "\n", "\n", "original[\"cat_naix\"]=original.apply(cat_naix,axis=1)\n", "original[\"cat_res\"]=original.apply(cat_res,axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primera anàlisi: Relació entre lloc de naixement i de residència\n", "\n", "Aplicant les nostres categories, a primer cop d’ull es noten poques diferències entre homes i dones. Queda per investigar." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "naix_res = original.query(\"genere != '--'\").\\\n", " groupby([\"cat_naix\",\"cat_res\",\"genere\"])[\"nom\"].\\\n", " count().\\\n", " unstack(\"cat_res\")\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "La taula mostra el percentatge entre totes les dones o tots els homes d'aquesta categoria de residència. Exemple: **28.2%** de les dones (i **16.0%** dels homes) nascudes a Catalunya fora de la província de Barcelona resideixen a la província de Barcelona.\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cat_res?bcei
cat_naixgenere
bDona0.3%95.9%3.7%0.1%nan%
Home0.1%95.3%4.2%0.3%0.1%
cDona0.1%28.2%71.2%0.4%0.1%
Home0.1%16.0%83.5%0.3%0.1%
eDona0.2%87.0%7.2%5.6%0.1%
Home0.3%73.7%14.3%11.3%0.4%
iDona11.8%67.8%16.9%2.4%1.2%
Home19.0%37.4%39.2%1.8%2.5%
" ], "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "\n", "naix_res = naix_res.div(naix_res.sum(axis=1),axis=0)\n", "\n", "def highlight_max(s):\n", " is_max = s == s.max()\n", " return ['font-weight:bold' if v else '' for v in is_max]\n", "\n", "naix_res.style.apply(highlight_max,axis=1,\n", " subset=(pd.IndexSlice['c'],pd.IndexSlice['b'])).\\\n", " format(\"{:.1%}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pas 6: executades o no?" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "def cat_execut(a) :\n", " if str(a).find(\"executa\") > -1 :\n", " return \"exec\"\n", " else :\n", " return \"no exec\"\n", "\n", "\n", "original[\"af_cat\"]=original[\"afusellades\"].apply(cat_execut)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primera análisi: Pena de mort i execucions\n", "\n", "La majoria de les 41 dones condemnades a pena de mort no van ser executades. Entre els homes, una quarta part dels 4376 condemnats a mort no van ser executats segons les dades." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
genereDonaHomesum
af_cat
exec1733453362
no exec2410311055
sum4143764417
\n", "
" ], "text/plain": [ "genere Dona Home sum\n", "af_cat \n", "exec 17 3345 3362\n", "no exec 24 1031 1055\n", "sum 41 4376 4417" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pena_mort = original.query(\"pena_cat == '00 mort'\").\\\n", " groupby([\"genere\",\"af_cat\"])[\"pena_cat\"].count().\\\n", " unstack(\"genere\")\n", " \n", "pena_mort_sum=pena_mort.sum(axis=0)\n", "pena_mort_sum.name=\"sum\"\n", "pena_mort = pena_mort.append(pena_mort_sum)\n", "pena_mort[\"sum\"] = pena_mort.sum(axis=1)\n", "\n", "pena_mort" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pas 7: Guardar dades processats" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "original.to_msgpack(\"data/processat.msg\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primera anàlisi: Relaciò entre \"Tipus procediment 1\" i \"Tipus procediment 2\"\n", "\n", "Trobem aquestes dues columnes de les dades originals difícils d'interpretar." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
proc_1_simple--Consells de guerraDiligències prèvies
proc_2
--3313106964
Acord1
Actuacions9
Antecedents8
Auditoria23
Causa1100
Causa acumulada965
Causa criminal59
Causa d'Oficials Generals14
Causa ordinària371975
Causa sumaríssima3
Certificat1
Declaració jurada2
Denúncia1
Diligències3118
Diligències d'urgència1
Diligències de guàrdia2
Diligències governatives1
Diligències informatives21
Diligències prèvies4181
Documentació6
Expedient23655
Expedient governatiu38
Expedient informatiu14
Expedient judicial1529
Expedient no judicial2
Informació111
Informació continuada en causa1
Informe1
Informe judicial1
Informe no judicial2
Informes1
Instància de testimoni1
Liquidacions1
Procediment316
Procediment acumulat6
Procediment ordinari17
Procediment previ236
Sumari2
Sumaríssim246598
Testimoni1
informacions1
\n", "
" ], "text/plain": [ "proc_1_simple -- Consells de guerra Diligències prèvies\n", "proc_2 \n", "-- 3313 10696 4\n", "Acord 1 \n", "Actuacions 9 \n", "Antecedents 8 \n", "Auditoria 2 3 \n", "Causa 1100 \n", "Causa acumulada 965 \n", "Causa criminal 59 \n", "Causa d'Oficials Generals 14 \n", "Causa ordinària 37 1975 \n", "Causa sumaríssima 3 \n", "Certificat 1 \n", "Declaració jurada 2 \n", "Denúncia 1 \n", "Diligències 3 118\n", "Diligències d'urgència 1\n", "Diligències de guàrdia 2 \n", "Diligències governatives 1 \n", "Diligències informatives 2 1\n", "Diligències prèvies 4181\n", "Documentació 6 \n", "Expedient 236 55 \n", "Expedient governatiu 38 \n", "Expedient informatiu 14 \n", "Expedient judicial 15 29 \n", "Expedient no judicial 2 \n", "Informació 11 1\n", "Informació continuada en causa 1 \n", "Informe 1 \n", "Informe judicial 1 \n", "Informe no judicial 2 \n", "Informes 1 \n", "Instància de testimoni 1 \n", "Liquidacions 1 \n", "Procediment 3 16 \n", "Procediment acumulat 6 \n", "Procediment ordinari 1 7 \n", "Procediment previ 236\n", "Sumari 2 \n", "Sumaríssim 2 46598 \n", "Testimoni 1 \n", "informacions 1 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "original[\"proc_1_simple\"]=original[\"proc_1\"].apply(lambda a: a.replace(\" []\",\"\"))\n", "\n", "original.groupby([\"proc_2\",\"proc_1_simple\"])[\"nom\"].count().unstack(\"proc_1_simple\").fillna(' ')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Primera anàlisi: Relaciò entre \"Tipus procediment 1\" i \"pena\"\n", "\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
proc_1_simple--Consells de guerraDiligències prèvies
genere--DonaHome--DonaHome--DonaHome
pena_cat
00 mort0010414371004
01 30a+01801183617003
02 20-30a0020501152001
03 12-20a02170851188810337
04 6-12a02505882367006
05 6m-6a00250951643005
06 <6m04878018664001348
07 sob0412857899095220159
08 lib0651177201095930775003325
09 alt015875302505425237333
" ], "text/plain": [ "" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "original.groupby([\"pena_cat\",\"proc_1_simple\",\"genere\"])[\"nom\"].\\\n", " count().\\\n", " unstack(\"proc_1_simple\").\\\n", " unstack(\"genere\").\\\n", " fillna(0).\\\n", " style.format(\"{:0g}\")\n", " " ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 2 }