{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 7) Cross-tabulation\n", "\n", "This chapter covers a method that is used very frequently in the social sciences; for now only descriptively, as a kind of two-dimensional frequency analysis. The statistical analyses that usually accompany cross-tabulation are not available in **Pandas** and are covered later in Chapter 14.\n", "\n", "As before, we first import **Pandas** and load the data set we already know." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>sex</th><th>age</th><th>wohnort</th><th>volksmusik</th><th>hardrock</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>50</td><td>2</td><td>2.67</td><td>3.67</td></tr>\n",
"    <tr><th>1</th><td>1</td><td>57</td><td>1</td><td>1.00</td><td>3.33</td></tr>\n",
"    <tr><th>2</th><td>2</td><td>66</td><td>3</td><td>2.00</td><td>4.33</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "   sex  age  wohnort  volksmusik  hardrock\n", "0    1   50        2        2.67      3.67\n", "1    1   57        1        1.00      3.33\n", "2    2   66        3        2.00      4.33" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "daten = pd.read_csv(\"C:\\\\Datenfiles\\\\daten.csv\")\n", "\n", "daten.head(3).round(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 7.1) Cross-tabulations with crosstab()\n", "\n", "With **crosstab()**, **Pandas** provides a function for building two- and three-dimensional cross-tabulations. As a first example, let us look at the relationship between *sex* and *wohnort* (place of residence).\n", "\n", "**crosstab()** has a number of useful options, so it is worth reading through the documentation (see link).\n", "\n", "[pandas.crosstab](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.crosstab.html#pandas.crosstab)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th>sex</th><th>1</th><th>2</th></tr>\n",
"    <tr><th>wohnort</th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>58</td><td>37</td></tr>\n",
"    <tr><th>2</th><td>35</td><td>25</td></tr>\n",
"    <tr><th>3</th><td>73</td><td>66</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "sex       1   2\n", "wohnort        \n", "1        58  37\n", "2        35  25\n", "3        73  66" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kreuztab = pd.crosstab(daten.wohnort, daten.sex) # absolute frequencies\n", "\n", "# the cross-tabulation is assigned to the object 'kreuztab' only because we reuse the data for a plot further below.\n", "\n", "kreuztab" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### With marginal totals\n", "\n", "Our first cross-tabulation contained the absolute frequencies, so we now know, for example, that 58 women live in a rural area, 25 men live in a small town, and so on. If we also request the marginal totals, we can see how many people live in each type of place overall and how many women and men are in the sample." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th>sex</th><th>1</th><th>2</th><th>All</th></tr>\n",
"    <tr><th>wohnort</th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>58</td><td>37</td><td>95</td></tr>\n",
"    <tr><th>2</th><td>35</td><td>25</td><td>60</td></tr>\n",
"    <tr><th>3</th><td>73</td><td>66</td><td>139</td></tr>\n",
"    <tr><th>All</th><td>166</td><td>128</td><td>294</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "sex 1 2 All\n", "wohnort \n", "1 58 37 95\n", "2 35 25 60\n", "3 73 66 139\n", "All 166 128 294" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.crosstab(daten.wohnort, daten.sex, margins = True)\n", "# 'margins=True' gibt uns die Randsummen aus ('False' ist die Standardeinstellung)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Grafische Darstellung\n", "\n", "Kreuztabellen können natürlich auch grafisch dargestellt werden, insbesondere mit einem gestapelten Säulendiagramm." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEGCAYAAACevtWaAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAATMklEQVR4nO3df6zddX3H8eebtrOgBVq4JVcu7namshYYCFd0k1VnbYpALJER24gWaWzM2ChzVduRDEjEYJjAFqdJhY5OsYK/VoIObSqsiEDX8kNaagcTBhcKvRStwFJsy3t/3K94vdxy7z0/7rn30+cjac73+/n+epWTvPj2c35FZiJJKstBrQ4gSWo8y12SCmS5S1KBLHdJKpDlLkkFGt/qAABHHnlkdnZ2tjqGJI0pmzZtei4z2wbaNirKvbOzk40bN7Y6hiSNKRHxv/vb5rSMJBXIcpekAlnuklSgUTHnPpA9e/bQ3d3N7t27Wx3ldU2cOJGOjg4mTJjQ6iiS9KpByz0iVgJnATsy8/h+25YCVwFtmflcNbYcWATsAy7KzB/WEqy7u5tJkybR2dlJRNRyiqbLTHbu3El3dzfTpk1rdRxJetVQpmVuAE7vPxgRxwBzgCf6jM0E5gPHVcd8OSLG1RJs9+7dHHHEEaO22AEigiOOOGLU/+tC0oFn0HLPzPXA8wNsugb4DND3ayXnAd/MzJcz8zHgUeDUWsON5mL/rbGQUdKBp6YXVCPig8BTmflgv01HA0/2We+uxgY6x+KI2BgRG3t6emqJIUnaj2GXe0QcAlwC/MNAmwcYG/AL4zNzRWZ2ZWZXW9uAH7CSJNWolnfLvBWYBjxYTUl0APdFxKn03qkf02ffDuDpekNKKsRlh7U6QXNdtqvVCV417Dv3zHwoM6dmZmdmdtJb6Cdn5jPALcD8iHhDREwDpgMbGpp4EC+99BJnnnkmJ554Iscffzw33XQTmzZt4j3veQ+nnHIKc+fOZfv27ezatYtjjz2Wbdu2AbBgwQK++tWvjmRUSWqaobwVcjXwXuDIiOgGLs3M6wfaNzO3RMTNwMPAXuDCzNzXwLyDuu2223jzm9/M97//fQB27drFBz7wAdasWUNbWxs33XQTl1xyCStXruRLX/oS559/PkuWLOGXv/wln/jEJ0YyqiQ1zaDlnpkLBtne2W/9CuCK+mLV7oQTTmDp0qV89rOf5ayzzmLy5Mls3ryZOXPmALBv3z7a29sBmDNnDt/61re48MILefDB/q8NS9LYNWo/oVqrt73tbWzatIkf/OAHLF++nDlz5nDcccdx9913v2bfV155ha1bt
3LwwQfz/PPP09HR0YLEktR4xX23zNNPP80hhxzCeeedx9KlS7n33nvp6el5tdz37NnDli1bALjmmmuYMWMGq1ev5oILLmDPnj2tjC5JDVPcnftDDz3Epz/9aQ466CAmTJjAV77yFcaPH89FF13Erl272Lt3LxdffDETJkzguuuuY8OGDUyaNIlZs2bxuc99jssvv7zVfwVJqltx5T537lzmzp37mvH169e/Zmzr1q2vLl999dVNzSVJI6m4aRlJkuUuSUWy3CWpQJa7JBXIcpekAlnuklSgMfNWyM5l32/o+R6/8sxB97ngggu49dZbmTp1Kps3b27o9SWpmbxzfx3nn38+t912W6tjSNKwWe6vY9asWUyZMqXVMSRp2Cx3SSqQ5S5JBbLcJalAlrskFWjMvBVyKG9dbLQFCxZwxx138Nxzz9HR0cHll1/OokWLRjyHJA3XmCn3Vli9enWrI0hSTZyWkaQCWe6SVKBByz0iVkbEjojY3Gfsqoj4eUT8LCK+FxGH99m2PCIejYhtEfHan0SSJDXdUO7cbwBO7ze2Fjg+M/8E+G9gOUBEzATmA8dVx3w5IsY1LK0kaUgGLffMXA8832/sR5m5t1q9B+iolucB38zMlzPzMeBR4NQG5pUkDUEj5twvAP6jWj4aeLLPtu5q7DUiYnFEbIyIjT09PQ2IIUn6rbreChkRlwB7gRt/OzTAbjnQsZm5AlgB0NXVNeA+v+eyw2oLud/z7Rp0lyeffJKPfexjPPPMMxx00EEsXryYJUuWNDaHJDVBzeUeEQuBs4DZmfnbcu4GjumzWwfwdO3xWmv8+PF88Ytf5OSTT+aFF17glFNOYc6cOcycObPV0STpddU0LRMRpwOfBT6Ymf/XZ9MtwPyIeENETAOmAxvqj9ka7e3tnHzyyQBMmjSJGTNm8NRTT7U4lSQNbtA794hYDbwXODIiuoFL6X13zBuAtREBcE9mfjIzt0TEzcDD9E7XXJiZ+5oVfiQ9/vjj3H///bzzne9sdRRJGtSg5Z6ZCwYYvv519r8CuKKeUKPNiy++yDnnnMO1117LoYce2uo4kjQoP6E6iD179nDOOefwkY98hA996EOtjiNJQ2K5v47MZNGiRcyYMYNPfepTrY4jSUM2dr4VcghvXWy0u+66i6997WuccMIJnHTSSQB8/vOf54wzzhjxLJI0HGOn3FvgtNNO43fv8pSkscNpGUkqkOUuSQUa1eU+FqZExkJGSQeeUVvuEydOZOfOnaO6PDOTnTt3MnHixFZHkaTfM2pfUO3o6KC7u5vR/o2REydOpKOjY/AdJWkEjdpynzBhAtOmTWt1DEkak0bttIwkqXaWuyQVyHKXpAJZ7pJUIMtdkgpkuUtSgSx3SSqQ5S5JBbLcJalAlrskFchyl6QCDVruEbEyInZExOY+Y1MiYm1EPFI9Tu6zbXlEPBoR2yJibrOCS5L2byh37jcAp/cbWwasy8zpwLpqnYiYCcwHjquO+XJEjGtYWknSkAxa7pm5Hni+3/A8YFW1vAo4u8/4NzPz5cx8DHgUOLUxUSVJQ1XrnPtRmbkdoHqcWo0fDTzZZ7/uauw1ImJxRGyMiI2j/TvbJWmsafQLqjHA2IA/pZSZKzKzKzO72traGhxDkg5stZb7sxHRDlA97qjGu4Fj+uzXATxdezxJUi1qLfdbgIXV8kJgTZ/x+RHxhoiYBkwHNtQXUZI0XIP+zF5ErAbeCxwZEd3ApcCVwM0RsQh4AjgXIDO3RMTNwMPAXuDCzNzXpOySpP0YtNwzc8F+Ns3ez/5XAFfUE0qSVB8/oSpJBRr0zr1Ilx3W6gTNddmuVieQ1GLeuUtSgSx3SSqQ5S5JBbLcJalAlrskFchyl6QCWe6SVCDLXZIKZLlLUoEsd0kqkOUuSQWy3CWpQJa7JBXIcpekAlnuklQgy12SCmS5S1KBLHdJKlBd5R4RfxsRWyJic0SsjoiJETElItZGxCPV4+RGhZUkDU3N5
R4RRwMXAV2ZeTwwDpgPLAPWZeZ0YF21LkkaQfVOy4wHDo6I8cAhwNPAPGBVtX0VcHad15AkDVPN5Z6ZTwH/CDwBbAd2ZeaPgKMyc3u1z3ZgaiOCSpKGrp5pmcn03qVPA94MvDEizhvG8YsjYmNEbOzp6ak1hiRpAPVMy7wfeCwzezJzD/Bd4M+AZyOiHaB63DHQwZm5IjO7MrOrra2tjhiSpP7qKfcngHdFxCEREcBsYCtwC7Cw2mchsKa+iJKk4Rpf64GZeW9EfBu4D9gL3A+sAN4E3BwRi+j9H8C5jQgqSRq6mssdIDMvBS7tN/wyvXfxkqQW8ROqklQgy12SCmS5S1KBLHdJKpDlLkkFquvdMpI0HJ27v9HqCE31eKsD9OGduyQVyHKXpAJZ7pJUIMtdkgpkuUtSgSx3SSqQ5S5JBbLcJalAlrskFchyl6QCWe6SVCDLXZIKZLlLUoEsd0kqkOUuSQWqq9wj4vCI+HZE/DwitkbEn0bElIhYGxGPVI+TGxVWkjQ09d65/xNwW2b+MXAisBVYBqzLzOnAumpdkjSCai73iDgUmAVcD5CZv8nMXwHzgFXVbquAs+uLKEkarnp+Zu+PgB7gXyPiRGATsAQ4KjO3A2Tm9oiYOtDBEbEYWAzwlre8pY4YOuBcdlirEzTXZbtanUAFqGdaZjxwMvCVzHw78BLDmILJzBWZ2ZWZXW1tbXXEkCT1V0+5dwPdmXlvtf5tesv+2YhoB6ged9QXUZI0XDVPy2TmMxHxZEQcm5nbgNnAw9WfhcCV1eOahiRtIH+BXVLp6plzB/gb4MaI+APgF8DH6f3XwM0RsQh4Aji3zmtIkoaprnLPzAeArgE2za7nvJKk+vgJVUkqkOUuSQWy3CWpQJa7JBXIcpekAlnuklQgy12SCmS5S1KBLHdJKpDlLkkFstwlqUCWuyQVyHKXpAJZ7pJUIMtdkgpkuUtSgSx3SSqQ5S5JBbLcJalAlrskFchyl6QC1V3uETEuIu6PiFur9SkRsTYiHqkeJ9cfU5I0HI24c18CbO2zvgxYl5nTgXXVuiRpBNVV7hHRAZwJXNdneB6wqlpeBZxdzzUkScNX7537tcBngFf6jB2VmdsBqsepAx0YEYsjYmNEbOzp6akzhiSpr5rLPSLOAnZk5qZajs/MFZnZlZldbW1ttcaQJA1gfB3Hvhv4YEScAUwEDo2IrwPPRkR7Zm6PiHZgRyOCSpKGruY798xcnpkdmdkJzAd+nJnnAbcAC6vdFgJr6k4pSRqWZrzP/UpgTkQ8Asyp1iVJI6ieaZlXZeYdwB3V8k5gdiPOK0mqjZ9QlaQCWe6SVKCGTMtII6lz9zdaHaGpHm91ABXBO3dJKpDlLkkFstwlqUCWuyQVyHKXpAJZ7pJUIMtdkgpkuUtSgSx3SSqQ5S5JBbLcJalAlrskFchyl6QCWe6SVCDLXZIKZLlLUoEsd0kqkOUuSQWqudwj4piIuD0itkbElohYUo1PiYi1EfFI9Ti5cXElSUNRz537XuDvMnMG8C7gwoiYCSwD1mXmdGBdtS5JGkE1l3tmbs/M+6rlF4CtwNHAPGBVtdsq4Ow6M0qShqkhc+4R0Qm8HbgXOCozt0Pv/wCAqfs5ZnFEbIyIjT09PY2IIUmq1F3uEfEm4DvAxZn566Eel5krMrMrM7va2trqjSFJ6qOuco+ICfQW+42Z+d1q+NmIaK+2twM76osoSRquet4tE8D1wNbMvLrPpluAhdXyQmBN7fEkSbUYX8ex7wY+CjwUEQ9UY38PXAncHBGLgCeAc+tKKEkatprLPTN/AsR+Ns+u9bySpPr5CVVJKpDlLkkFstwlqUCWuyQVyHKXpAJZ7pJUIMtdkgpkuUtSgSx3SSqQ5S5JBbLcJalAlrskFchyl6QCWe6SVCDLXZIKZLlLUoEsd0kqkOUuSQWy3CWpQJa7JBXIcpekAjWt3CPi9IjYFhGPRsSyZl1HkvRaTSn3iBgH/AvwAWAms
CAiZjbjWpKk12rWnfupwKOZ+YvM/A3wTWBek64lSepnfJPOezTwZJ/1buCdfXeIiMXA4mr1xYjY1qQso8GRwHMjdbH4wkhd6YDh8zd2lf7c/eH+NjSr3GOAsfy9lcwVwIomXX9UiYiNmdnV6hyqjc/f2HUgP3fNmpbpBo7ps94BPN2ka0mS+mlWuf8XMD0ipkXEHwDzgVuadC1JUj9NmZbJzL0R8dfAD4FxwMrM3NKMa40RB8T0U8F8/sauA/a5i8wcfC9J0pjiJ1QlqUCWuyQVyHJvoohYGRE7ImJzq7NoeCLimIi4PSK2RsSWiFjS6kwauoiYGBEbIuLB6vm7vNWZRppz7k0UEbOAF4F/y8zjW51HQxcR7UB7Zt4XEZOATcDZmflwi6NpCCIigDdm5osRMQH4CbAkM+9pcbQR4517E2XmeuD5VufQ8GXm9sy8r1p+AdhK7yevNQZkrxer1QnVnwPqTtZylwYREZ3A24F7WxxFwxAR4yLiAWAHsDYzD6jnz3KXXkdEvAn4DnBxZv661Xk0dJm5LzNPovcT8qdGxAE1NWq5S/tRzdV+B7gxM7/b6jyqTWb+CrgDOL21SUaW5S4NoHpB7npga2Ze3eo8Gp6IaIuIw6vlg4H3Az9vaagRZrk3UUSsBu4Gjo2I7ohY1OpMGrJ3Ax8F3hcRD1R/zmh1KA1ZO3B7RPyM3u+6WpuZt7Y404jyrZCSVCDv3CWpQJa7JBXIcpekAlnuklQgy12SCmS5S5WIeHHwvWo67+ER8VfNOLe0P5a71EQRMQ44HLDcNaIsdxUnIj4TERdVy9dExI+r5dkR8fWIWBARD0XE5oj4Qr9jr6i+A/yeiDiqGrshIv45In4aEb+IiL+sxiMirqrO81BEfLgaf2/1XfDfAB4CrgTeWn0Q6qoR/E+hA5jlrhKtB/68Wu4C3lR9T8xpwCPAF4D3AScB74iIs6t93wjck5knVuf4RJ9ztlfHn0VvWQN8qDrHifR+vP2q6nvgAU4FLsnMmcAy4H8y86TM/HRD/6bSfljuKtEm4JTqRzZepvcrILroLfxfAXdkZk9m7gVuBGZVx/0GuLXPOTr7nPPfM/OV6sc6jqrGTgNWV98++Czwn8A7qm0bMvOxZvzlpKGw3FWczNwDPA58HPgpcCfwF8BbgSde59A9+bvv49gHjO+z7eU+y9HvcSAvDSOy1HCWu0q1HlhaPd4JfBJ4ALgHeE9EHFm92LmA3jvuWq/x4epHIdro/RfAhgH2ewGYVOM1pJpY7irVnfTOk99dTZnsBu7MzO3AcuB24EHgvsxcU+M1vgf8rDrPj4HPZOYz/XfKzJ3AXdULr76gqhHht0JKUoG8c5ekAlnuklQgy12SCmS5S1KBLHdJKpDlLkkFstwlqUD/DwND4wn5aKX9AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = kreuztab.plot.bar(rot = 0, stacked = True)\n", "# 'stacked=True' liefert gestapelte Säulen, 'False' ist die Standardeinstellung und liefert getrennte Säulen pro Geschlecht" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Gestapeltes (auf 100% pro Säule) Diagramm inkl. Anzeige der dazugehörigen Kreuztabelle mit Zeilenprozenten\n", "\n", "Absolute Häufigkeiten lassen Zusammenhänge aber nur schwer erkennen; deshalb möchte man in Kreuztabellen zumeist auch relative Häufigkeiten (Prozentwerte) abbilden - und zwar typischerweise in Richtung der unabhängigen Variable. D.h., befindet sich die UV in den Spalten der Kreuztabelle, wird spaltenweise auf 100% aufsummiert, befindet sich die UV in den Zeilen, wird zeilenweise auf 100% aufsummiert. In unserem Beispiel können wir aber nicht unbedingt davon sprechen, eine UV bzw. eine AV zu haben. Nehmen wir an, uns interessiert die Geschlechterverteilung pro Wohnort - in diesem Fall prozentuieren wir nach Wohnort.\n", "\n", "Im folgenden Beispiel berechnen wir - mit den Daten aus der ersten Kreuztabelle von vorhin (Objekt *a*) - mittels einer **lambda** Funktion die Zeilenprozente. Früher musste man dies so durchführen, mittlerweile exisitiert in Pandas aber schon seit längerem auch eine komfortablere Möglichkeit, wie wir weiter unten noch sehen werden.\n", "\n", "[Display percent of 100 in stacked bar plot from crosstab from matplotlib in pandas ](https://stackoverflow.com/questions/57981287/display-percent-of-100-in-stacked-bar-plot-from-crosstab-from-matplotlib-in-pand)\n", "\n", "[pandas.DataFrame.apply](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th>sex</th><th>1</th><th>2</th></tr>\n",
"    <tr><th>wohnort</th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>61.05</td><td>38.95</td></tr>\n",
"    <tr><th>2</th><td>58.33</td><td>41.67</td></tr>\n",
"    <tr><th>3</th><td>52.52</td><td>47.48</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "sex 1 2\n", "wohnort \n", "1 61.05 38.95\n", "2 58.33 41.67\n", "3 52.52 47.48" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEGCAYAAACevtWaAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAREklEQVR4nO3dfZBddXnA8e9DEicgQQhs6MKiixRpooyQrGAVAzVkouA0GSIjGcEgGTJOqYRalSB/KFNUGCugQ+tMBGqKGAG1DUMrNg2kQeSlCS+SsDJBTGFhIUuASGgDCTz9Y0/tmheze1/2bn75fmaYe++595zzJHfmm8PZe89GZiJJKss+rR5AktR4xl2SCmTcJalAxl2SCmTcJalAo1s9AMAhhxySnZ2drR5DkvYoq1evfiEz23b23IiIe2dnJ6tWrWr1GJK0R4mI/9rVc56WkaQCGXdJKpBxl6QCjYhz7pLUKlu3bqWnp4ctW7a0epRdGjt2LB0dHYwZM2bQ6xh3SXu1np4exo0bR2dnJxHR6nF2kJls3LiRnp4ejjzyyEGv52kZSXu1LVu2cPDBB4/IsANEBAcffPCQ/89it3GPiBsiYkNErBmwbHxELIuIddXtQQOeuyQinoiIxyNixpCmkaQWGKlh/z+1zDeYI/fvAR/ZbtlCYHlmHg0srx4TEZOAs4B3V+v8fUSMGvJUkqS67DbumbkSeHG7xTOBxdX9xcCsAct/mJmvZeZvgCeAExozqiRpsGr9geqhmdkLkJm9ETGhWn44cN+A1/VUy3YQEfOB+QBvf/vbaxyjRl952/Dub7h9ZVOrJ2gu378910h872bcAs826JMyhx3fmO00QKN/oLqzE0M7/VVPmbkoM7sys6utbaeXRpCkEenV//4fTj/nQt576id4z4fP5OalP2P1Lx/j5JNPZsqUKcyYMYPe3l42bdrEMcccw+OPPw7AnDlz+O53vzssM9Z65P58RLRXR+3twIZqeQ9wxIDXdQDP1jOgJI00d9z1Cw77ozb+5cZvA7Dpt6/w0bM/y9Kf/jttbW3cfPPNXHrppdxwww1ce+21nHvuuSxYsICXXnqJ888/f1hmrDXutwFzgSuq26UDlv8gIq4CDgOOBh6od0hJGkmO/ZM/5vN/czUXf/VbfOzUD3HQ2w5gzeO/Zvr06QC88cYbtLe3AzB9+nRuvfVWLrjgAh555JFhm3G3cY+IJcApwCER0QN8mf6o3xIR84CngDMBMnNtRNwCPAZsAy7IzDeaNLsktcS7jnoHq396E/9658+55OvXMn3qibz7Xe/k3tUP7/DaN998k+7ubvbdd19efPFFOjo6hmXGwXxaZk5mtmfmmMzsyMzrM3NjZk7LzKOr2xcHvP6rmXlUZh6TmT9t7viSNPyefa6P/fYdy9mzT+fznzmH+x9aQ9+LL3HvvfcC/Zc0WLt2LQBXX301EydOZMmSJZx33nls3bp1WGb08gOSNESP/modX7j8GvaJfRgzZjTf+fqXGD1qFBdefDGbNm1i27ZtXHTRRYwZM4brrruOBx54gHHjxjF16lQuv/xyLrvssqbPaNwlaYhmnPIBZpzygR2Wr1y5codl3d3dv7t/1VVXNXWugby2jCQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoH8KKQkDdD57Xouh7XjuuuvOH23a5133nncfvvtTJgwgTVr1uz29YPhkbsktdi5557LHXfc0dBtGndJarGpU6cyfvz4hm7TuEtSgYy7JBXIuEtSgYy7JBXIj0JK0gDrLzys9pVr/AXZc+bMYcWKFbzwwgt0dHRw2WWXMW
/evNrnwLhLUsstWbKk4dv0tIwkFci4S1KBjLskFci4S1KBjLskFci4S1KB/CikJA206JTGbu8rm3b7kqeffppPfepTPPfcc+yzzz7Mnz+fBQsW1LVb4y5JLTZ69Gi++c1vMnnyZF555RWmTJnC9OnTmTRpUs3b9LSMJLVYe3s7kydPBmDcuHFMnDiRZ555pq5tGndJGkHWr1/PQw89xIknnljXdoy7JI0QmzdvZvbs2VxzzTUccMABdW3LuEvSCLB161Zmz57NJz/5Sc4444y6t2fcJanFMpN58+YxceJEPve5zzVkm35aRpIGmr+i9nVrvOTvPffcw4033sixxx7LcccdB8DXvvY1TjvttJpHMe6S1GInnXQSmdnQbdZ1WiYi/ioi1kbEmohYEhFjI2J8RCyLiHXV7UGNGlaSNDg1xz0iDgcuBLoy8z3AKOAsYCGwPDOPBpZXjyVJw6jeH6iOBvaNiNHAfsCzwExgcfX8YmBWnfuQpCbKhp8SabRa5qs57pn5DPC3wFNAL7ApM/8NODQze6vX9AITdrZ+RMyPiFURsaqvr6/WMSSpLmM3PcnGV7eN2MBnJhs3bmTs2LFDWq/mH6hW59JnAkcCLwO3RsTZg10/MxcBiwC6urpG5t+qpOJ1PHglPVxM39veCUR9G9vU3ZCZtjd27Fg6OjqGtE49n5Y5FfhNZvYBRMRPgA8Az0dEe2b2RkQ7sKGOfUhSU415/WWOvO+SxmxsEFeAHC71nHN/Cnh/ROwXEQFMA7qB24C51WvmAkvrG1GSNFQ1H7ln5v0R8SPgQWAb8BD9p1n2B26JiHn0/wNwZiMGlSQNXl1fYsrMLwNf3m7xa/QfxUuSWsRry0hSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBXIuEtSgYy7JBWorrhHxIER8aOI+FVEdEfEn0bE+IhYFhHrqtuDGjWsJGlwRte5/reAOzLz4xHxFmA/4EvA8sy8IiIWAguBi+vcT0N1bvlBq0doqvWtHkBSy9V85B4RBwBTgesBMvP1zHwZmAksrl62GJhV34iSpKGq57TMO4E+4B8i4qGIuC4i3gocmpm9ANXthAbMKUkagnriPhqYDHwnM48HXqX/FMygRMT8iFgVEav6+vrqGEOStL164t4D9GTm/dXjH9Ef++cjoh2gut2ws5Uzc1FmdmVmV1tbWx1jSJK2V3PcM/M54OmIOKZaNA14DLgNmFstmwssrWtCSdKQ1ftpmc8CN1WflHkS+DT9/2DcEhHzgKeAM+vch/R7/LSTtHt1xT0zHwa6dvLUtHq2K0mqj99QlaQCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKlC915aRpEHzukDDxyN3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAtUd94gYFREPRcTt1ePxEbEsItZVtwfVP6YkaSgaceS+AOge8HghsDwzjwaWV48lScOorrhHRAdwOnDdgMUzgcXV/cXArHr2IUkaunqP3K8Bvgi8OWDZoZnZC1DdTtjZihExPyJWRcSqvr6+OseQJA1Uc9wj4mPAhsxcXcv6mbkoM7sys6utra3WMSRJOzG6jnU/CPx5RJwGjAUOiIjvA89HRHtm9kZEO7ChEYNKkgav5iP3zLwkMzsysxM4C7gzM88GbgPmVi+bCyyte0pJ0pA043PuVwDTI2IdML16LEkaRvWclvmdzFwBrKjubwSmNWK7kqTa+A1VSSqQcZekAh
l3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAtUc94g4IiLuiojuiFgbEQuq5eMjYllErKtuD2rcuJKkwajnyH0b8NeZORF4P3BBREwCFgLLM/NoYHn1WJI0jGqOe2b2ZuaD1f1XgG7gcGAmsLh62WJgVp0zSpKGqCHn3COiEzgeuB84NDN7of8fAGDCLtaZHxGrImJVX19fI8aQJFXqjntE7A/8GLgoM3872PUyc1FmdmVmV1tbW71jSJIGqCvuETGG/rDflJk/qRY/HxHt1fPtwIb6RpQkDVU9n5YJ4HqgOzOvGvDUbcDc6v5cYGnt40mSajG6jnU/CJwDPBoRD1fLvgRcAdwSEfOAp4Az65pQkjRkNcc9M38OxC6enlbrdiVJ9fMbqpJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUIOMuSQUy7pJUoKbFPSI+EhGPR8QTEbGwWfuRJO2oKXGPiFHA3wEfBSYBcyJiUjP2JUnaUbOO3E8AnsjMJzPzdeCHwMwm7UuStJ3RTdru4cDTAx73ACcOfEFEzAfmVw83R8TjTZplJDgEeGG4dhZXDtee9hq+f3uu0t+7d+zqiWbFPXayLH/vQeYiYFGT9j+iRMSqzOxq9Ryqje/fnmtvfu+adVqmBzhiwOMO4Nkm7UuStJ1mxf0/gaMj4siIeAtwFnBbk/YlSdpOU07LZOa2iPhL4GfAKOCGzFzbjH3tIfaK008F8/3bc+21711k5u5fJUnao/gNVUkqkHGXpAIZ9yaKiBsiYkNErGn1LBqaiDgiIu6KiO6IWBsRC1o9kwYvIsZGxAMR8Uj1/l3W6pmGm+fcmygipgKbgX/MzPe0eh4NXkS0A+2Z+WBEjANWA7My87EWj6ZBiIgA3pqZmyNiDPBzYEFm3tfi0YaNR+5NlJkrgRdbPYeGLjN7M/PB6v4rQDf937zWHiD7ba4ejqn+26uOZI27tBsR0QkcD9zf4lE0BBExKiIeBjYAyzJzr3r/jLv0B0TE/sCPgYsy87etnkeDl5lvZOZx9H9D/oSI2KtOjRp3aReqc7U/Bm7KzJ+0eh7VJjNfBlYAH2ntJMPLuEs7Uf1A7nqgOzOvavU8GpqIaIuIA6v7+wKnAr9q6VDDzLg3UUQsAe4FjomInoiY1+qZNGgfBM4BPhwRD1f/ndbqoTRo7cBdEfFL+q91tSwzb2/xTMPKj0JKUoE8cpekAhl3SSqQcZekAhl3SSqQcZekAhl3qRIRm3f/qpq2e2BE/EUzti3tinGXmigiRgEHAsZdw8q4qzgR8cWIuLC6f3VE3FndnxYR34+IORHxaESsiYgrt1v3q9U1wO+LiEOrZd+LiG9HxC8i4smI+Hi1PCLiG9V2Ho2IT1TLT6muBf8D4FHgCuCo6otQ3xjGvwrtxYy7SrQS+FB1vwvYv7pOzEnAOuBK4MPAccD7ImJW9dq3Avdl5nurbZw/YJvt1fofoz/WAGdU23gv/V9v/0Z1HXiAE4BLM3MSsBD4dWYel5lfaOifVNoF464SrQamVL9k4zX6LwHRRX/wXwZWZGZfZm4DbgKmVuu9Dtw+YBudA7b5z5n5ZvXLOg6tlp0ELKmuPvg88B/A+6rnHsjM3zTjDycNhnFXcTJzK7Ae+DTwC+Bu4M+Ao4Cn/sCqW/P/r8fxBjB6wHOvDbgf293uzKtDGFlqOOOuUq0EPl/d3g18BngYuA
84OSIOqX7YOYf+I+5a9/GJ6pdCtNH/fwAP7OR1rwDjatyHVBPjrlLdTf958nurUyZbgLszsxe4BLgLeAR4MDOX1riPfwJ+WW3nTuCLmfnc9i/KzI3APdUPXv2BqoaFV4WUpAJ55C5JBTLuklQg4y5JBTLuklQg4y5JBTLuklQg4y5JBfpfwiBCujx51qAAAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "crosstab = kreuztab.apply(lambda z: z/z.sum()*100, axis = 1).round(2) # Berechnung der Zeilenprozente,\n", "# 'axis = 0' (Spaltenprozente) / 'axis = 1' (Zeilenprozente)\n", "\n", "# Variablen werden aus obiger Kreuztabelle übernommen ('kreuztab' steht für obige Kreuztabelle)\n", "\n", "display(crosstab) # Anzeige der neuen Kreuztabelle mit Zeilenprozenten. Oder nur 'crosstab' in Jupyter...\n", "\n", "ax = crosstab.plot.bar(stacked = True, rot = 0) # Anzeige der Grafik" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Durch diese Darstellung, sowohl tabellarisch als auch grafisch, können nun Zusammenhänge bzw. Unterschiede leichter ausgemacht werden. So geht klar hervor, dass das Geschlechterverhältnis in der Stichprobe bei Wohnort 3 beinahe ausgewogen ist, während bei Wohnort 1 über 61% Frauen und nur knapp 39% Männer befragt werden konnten.\n", "\n", "Einfacher geht die Prozentuierung mit dem Parameter **normalize**. Damit hat man folgende Möglichkeiten:\n", "\n", "|Option||Option||\n", "|-|-|-|-|\n", "|'index'|Prozentuierung entlang der Zeilen|True|Prozentuierung über alle Werte|\n", "|1|Prozentuierung entlang der Zeilen|'all'|Prozentuierung über alle Werte|\n", "|'columns'|Prozentuierung entlang der Spalten|False|keine Prozentuierung, Standardeinstellung|\n", "|0|Prozentuierung entlang der Spalten|" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th>sex</th><th>1</th><th>2</th></tr>\n",
"    <tr><th>wohnort</th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>61.05</td><td>38.95</td></tr>\n",
"    <tr><th>2</th><td>58.33</td><td>41.67</td></tr>\n",
"    <tr><th>3</th><td>52.52</td><td>47.48</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "sex          1      2\n", "wohnort              \n", "1        61.05  38.95\n", "2        58.33  41.67\n", "3        52.52  47.48" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.crosstab(daten.wohnort, daten.sex, normalize = 'index').round(4)*100" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The table above naturally gives the same result as our own calculation with the **lambda** function, only with less effort." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### \"Three-dimensional\" cross-tabulation with crosstab()\n", "\n", "The **crosstab()** function can also take a third variable. Since our small data set currently has no suitable third variable, we simply create one. The variable *hardrock*, which currently has 16 distinct values (various decimal numbers between 1 and 5), is reduced by floor division to just 5 values (1, 2, 3, 4, 5). Because we do not want to overwrite the variable, we add a new one to the DataFrame." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "daten['hardrockpräferenz'] = daten['hardrock']//1 # i.e. all values from 1 to <2 become 1, from 2 to <3 become 2, and so on" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>sex</th><th>age</th><th>wohnort</th><th>volksmusik</th><th>hardrock</th><th>hardrockpräferenz</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>50</td><td>2</td><td>2.67</td><td>3.67</td><td>3.0</td></tr>\n",
"    <tr><th>1</th><td>1</td><td>57</td><td>1</td><td>1.00</td><td>3.33</td><td>3.0</td></tr>\n",
"    <tr><th>2</th><td>2</td><td>66</td><td>3</td><td>2.00</td><td>4.33</td><td>4.0</td></tr>\n",
"    <tr><th>3</th><td>1</td><td>50</td><td>2</td><td>2.33</td><td>2.67</td><td>2.0</td></tr>\n",
"    <tr><th>4</th><td>1</td><td>60</td><td>3</td><td>2.33</td><td>3.00</td><td>3.0</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "   sex  age  wohnort  volksmusik  hardrock  hardrockpräferenz\n", "0    1   50        2        2.67      3.67                3.0\n", "1    1   57        1        1.00      3.33                3.0\n", "2    2   66        3        2.00      4.33                4.0\n", "3    1   50        2        2.33      2.67                2.0\n", "4    1   60        3        2.33      3.00                3.0" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "daten.head().round(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us build a cross-tabulation with three variables. *sex* and *wohnort* go in the rows, *hardrockpräferenz* in the columns. The variables that share an axis are passed together as a list in square brackets, regardless of whether they serve as row or column variables. In addition, we compute row percentages in this example and output the marginal totals (here: marginal percentages)." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th></th><th>hardrockpräferenz</th><th>1.0</th><th>2.0</th><th>3.0</th><th>4.0</th><th>5.0</th></tr>\n",
"    <tr><th>sex</th><th>wohnort</th><th></th><th></th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th rowspan=\"3\" valign=\"top\">1</th><th>1</th><td>8.62</td><td>24.14</td><td>36.21</td><td>31.03</td><td>0.00</td></tr>\n",
"    <tr><th>2</th><td>8.57</td><td>25.71</td><td>40.00</td><td>17.14</td><td>8.57</td></tr>\n",
"    <tr><th>3</th><td>17.81</td><td>17.81</td><td>35.62</td><td>26.03</td><td>2.74</td></tr>\n",
"    <tr><th rowspan=\"3\" valign=\"top\">2</th><th>1</th><td>27.03</td><td>29.73</td><td>29.73</td><td>10.81</td><td>2.70</td></tr>\n",
"    <tr><th>2</th><td>20.00</td><td>36.00</td><td>28.00</td><td>8.00</td><td>8.00</td></tr>\n",
"    <tr><th>3</th><td>21.21</td><td>28.79</td><td>30.30</td><td>15.15</td><td>4.55</td></tr>\n",
"    <tr><th>All</th><th></th><td>17.01</td><td>25.51</td><td>33.67</td><td>20.07</td><td>3.74</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "hardrockpräferenz    1.0    2.0    3.0    4.0   5.0\n", "sex wohnort                                        \n", "1   1               8.62  24.14  36.21  31.03  0.00\n", "    2               8.57  25.71  40.00  17.14  8.57\n", "    3              17.81  17.81  35.62  26.03  2.74\n", "2   1              27.03  29.73  29.73  10.81  2.70\n", "    2              20.00  36.00  28.00   8.00  8.00\n", "    3              21.21  28.79  30.30  15.15  4.55\n", "All                17.01  25.51  33.67  20.07  3.74" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crosstab3D = pd.crosstab([daten.sex, daten.wohnort], daten.hardrockpräferenz, normalize = 'index', margins = True).round(4)*100\n", "# or '(daten.wohnort, [daten.sex, daten.hardrockpräferenz])' if 'sex' should appear as the third variable in the columns\n", "\n", "crosstab3D" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A cross-tabulation like this can also be plotted, although the result is no longer quite as easy to read or interpret." ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = crosstab3D.plot.bar(stacked = True, rot = 0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Further options with cross-tabulations\n", "\n", "Absolute or relative frequencies are not all that **crosstab()** can show. With **aggfunc** (together with **values**), various summary statistics of a third variable can be displayed in the cells. Below we again build a simple cross-tabulation of *sex* and *wohnort*, but each cell now shows the mean of the variable *volksmusik*. The table therefore tells us about the preference for folk music by sex and place of residence.\n", "\n", "[Pandas Crosstab Explained](https://pbpython.com/pandas-crosstab.html)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr><th>wohnort</th><th>1</th><th>2</th><th>3</th></tr>\n",
"    <tr><th>sex</th><th></th><th></th><th></th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>1</th><td>3.38</td><td>3.71</td><td>4.11</td></tr>\n",
"    <tr><th>2</th><td>3.41</td><td>3.64</td><td>4.02</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ "wohnort 1 2 3\n", "sex \n", "1 3.38 3.71 4.11\n", "2 3.41 3.64 4.02" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "crosstabNeu = pd.crosstab(daten.sex, daten.wohnort, values = daten.volksmusik, aggfunc='mean').round(2)\n", "\n", "crosstabNeu # zwecks zuweisung zu grafik unten..." ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXQAAAEGCAYAAAB1iW6ZAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAAU7klEQVR4nO3df3BV5Z3H8c+HEElHqOxCVJaAoR22BYkgZDCoRGTqjlBmsdVpRa2/2M3o2qnVLivb7tjlD6ujbceltDI4tdW2g3Zax2Eo6mgVgW4VAgYBqTvosENqWgNdkFSwIX73j1ydNCS595Kb3OTJ+zVzhnPO89xzvmHix8Nzn3OOI0IAgMFvWLELAAAUBoEOAIkg0AEgEQQ6ACSCQAeARAwv1onHjh0blZWVxTo9AAxK27dvPxgR5V21FS3QKysrVV9fX6zTA8CgZPt/u2tjyAUAEkGgA0AiCHQASASBDgCJINABIBEEOgAkgkAHgEQQ6ACQCAIdABJRtDtFMTBUPVrVr+fbdcOufj0fMJRwhQ4Aicg50G2X2H7V9vou2mx7pe19tl+zPbOwZQIAssnnCv12SXu7aVsgaXJmqZP0UC/rAgDkKacxdNsVkj4r6R5Jd3bRZbGkx6L9jdMv2x5te1xENBWuVABDCd/v5C/XK/QHJf2bpA+6aR8v6UCH7cbMvr9iu852ve365ubmfOoEAGSRNdBtL5L0TkRs76lbF/vipB0RayKiOiKqy8u7fD47AOAU5XKFfpGkf7S9X9Ljkubb/mmnPo2SJnTYrpD0dkEqBADkJGugR8S/R0RFRFRKulrSCxFxXadu6yRdn5ntUiPpCOPnANC/TvnGItu3SFJErJa0QdJCSfskvSfppoJUBwDIWV6BHhEbJW3MrK/usD8k3VbIwgAA+eFOUQBIBIEOAIkg0AEgEQQ6ACSCQAeARBDoAJAIAh0AEkGgA0AiCHQASATvFB1o/vOM/j3fpIn9ez4AfYYrdABIBIEOAIlgyAVAbhgOHPC4QgeARBDoAJAIAh0AEpHLS6LLbG+1vdP2Htsruugzz/YR2w2Z5e6+KRcA0J1cvhR9X9L8iGixXSppi+2nI+LlTv02R8SiwpcIAMhF1kDPvF6uJbNZmlmiL4sCAOQvpzF02yW2GyS9I+m5iHili25zMsMyT9s+t5vj1Nmut13f3Nx86lUDAE6SU6BHRFtEzJBUIWm27WmduuyQdE5ETJf0PUlPdXOcNRFRHRHV5eXlp141AOAkec1yiYjDkjZKurzT/ncjoiWzvkFSqe2xBaoRAJCDXGa5lNsenVn/mKTPSPpdpz5n23ZmfXbmuIcKXi0AoFu5zHIZJ+lR2yVqD+qfR8R627dIUkSslnSVpFttn5B0TNLVmS9TAQD9JJdZLq9JOr+L/as7rK+StKqwpQHoSeXyX/Xr+faX9evpcAp4OFcW/EcDYLDg1n8ASASBDgCJINABIBEEOgAkgkAHgEQQ6ACQCAIdABJBoANAIgh0AEgEgQ4AiSDQASARBDoAJIJAB4BEEOgAkAgCHQA
Skcsr6Mpsb7W90/Ye2yu66GPbK23vs/2a7Zl9Uy4AoDu5vODifUnzI6LFdqmkLbafjoiXO/RZIGlyZrlA0kOZPwEA/STrFXq0a8lslmaWzu8LXSzpsUzflyWNtj2usKUCAHqS0xi67RLbDZLekfRcRLzSqct4SQc6bDdm9gEA+klOgR4RbRExQ1KFpNm2p3Xq4q4+1nmH7Trb9bbrm5ub8y4WANC9vGa5RMRhSRslXd6pqVHShA7bFZLe7uLzayKiOiKqy8vL86sUANCjXGa5lNsenVn/mKTPSPpdp27rJF2fme1SI+lIRDQVulgAQPdymeUyTtKjtkvU/j+An0fEetu3SFJErJa0QdJCSfskvSfppj6qFwDQjayBHhGvSTq/i/2rO6yHpNsKWxoAIB/cKQoAiSDQASARBDoAJIJAB4BEEOgAkAgCHQASQaADQCIIdABIBIEOAIkg0AEgEQQ6ACSCQAeARBDoAJAIAh0AEkGgA0AiCHQASASBDgCJyOWdohNsv2h7r+09tm/vos8820dsN2SWu/umXABAd3J5p+gJSV+LiB22R0nabvu5iHi9U7/NEbGo8CUCAHKR9Qo9IpoiYkdm/aikvZLG93VhAID85DWGbrtS7S+MfqWL5jm2d9p+2va53Xy+zna97frm5ub8qwUAdCvnQLc9UtIvJX01It7t1LxD0jkRMV3S9yQ91dUxImJNRFRHRHV5efkplgwA6EpOgW67VO1h/rOIeLJze0S8GxEtmfUNkkptjy1opQCAHuUyy8WSfihpb0R8t5s+Z2f6yfbszHEPFbJQAEDPcpnlcpGkL0naZbshs+/rkiZKUkSslnSVpFttn5B0TNLVERGFLxcA0J2sgR4RWyQ5S59VklYVqigAQP64UxQAEkGgA0AiCHQASASBDgCJINABIBEEOgAkgkAHgEQQ6ACQCAIdABJBoANAIgh0AEgEgQ4AiSDQASARBDoAJIJAB4BEEOgAkIhcXkE3wfaLtvfa3mP79i762PZK2/tsv2Z7Zt+UCwDoTi6voDsh6WsRscP2KEnbbT8XEa936LNA0uTMcoGkhzJ/AgD6SdYr9IhoiogdmfWjkvZKGt+p22JJj0W7lyWNtj2u4NUCALqV1xi67UpJ50t6pVPTeEkHOmw36uTQl+062/W265ubm/MsFQDQk5wD3fZISb+U9NWIeLdzcxcfiZN2RKyJiOqIqC4vL8+vUgBAj3IKdNulag/zn0XEk110aZQ0ocN2haS3e18eACBXucxysaQfStobEd/tpts6SddnZrvUSDoSEU0FrBMAkEUus1wukvQlSbtsN2T2fV3SREmKiNWSNkhaKGmfpPck3VTwSgEAPcoa6BGxRV2PkXfsE5JuK1RRAID8cacoACSCQAeARBDoAJAIAh0AEkGgA0AiCHQASASBDgCJINABIBEEOgAkgkAHgEQQ6ACQCAIdABJBoANAIgh0AEgEgQ4AiSDQASARubyC7hHb79je3U37PNtHbDdklrsLXyYAIJtcXkH3Y0mrJD3WQ5/NEbGoIBUBAE5J1iv0iNgk6U/9UAsAoBcKNYY+x/ZO20/bPre7TrbrbNfbrm9ubi7QqQEAUmECfYekcyJiuqTvSXqqu44RsSYiqiOiury8vACnBgB8qNeBHhHvRkRLZn2DpFLbY3tdGQAgL70OdNtn23ZmfXbmmId6e1wAQH6yznKxvVbSPEljbTdK+qakUkmKiNWSrpJ0q+0Tko5Jujoios8qBgB0KWugR8SSLO2r1D6tEQBQRNwpCgCJINABIBG53CkKDCitra1qbGzU8ePHi13KKSkrK1NFRYVKS0uLXQoSQ6Bj0GlsbNSoUaNUWVmpzASrQSMidOjQITU2NmrSpEnFLgeJYcgFg87x48c1ZsyYQRfmkmRbY8aMGbT/usDARqBjUBqMYf6hwVw7BjYCHQASQaBjSBs5cmSfHPfw4cP6wQ9+0CfHBrpDoAMF1tbWRqCjKAh0JOH+++/
XypUrJUl33HGH5s+fL0n69a9/reuuu05r165VVVWVpk2bprvuuuuvPvuNb3xD06dPV01Njf74xz9Kkm688UZ95Stf0YUXXqhPfOIT+sUvfiGpfZbKsmXLNG3aNFVVVemJJ56QJG3cuFGXXnqprrnmGlVVVWn58uV68803NWPGDC1btqy//howxBHoSEJtba02b94sSaqvr1dLS4taW1u1ZcsWTZ48WXfddZdeeOEFNTQ0aNu2bXrqqackSX/+859VU1OjnTt3qra2Vg8//PBHx2xqatKWLVu0fv16LV++XJL05JNPqqGhQTt37tTzzz+vZcuWqampSZK0detW3XPPPXr99dd133336ZOf/KQaGhr0wAMP9O9fBoYsAh1JmDVrlrZv366jR49qxIgRmjNnjurr67V582aNHj1a8+bNU3l5uYYPH65rr71WmzZtkiSddtppWrRo0UfH2L9//0fHvOKKKzRs2DBNnTr1oyv3LVu2aMmSJSopKdFZZ52lSy65RNu2bZMkzZ49m7nlKCoCHUkoLS1VZWWlfvSjH+nCCy/U3Llz9eKLL+rNN9/UxIkTe/zch9MIS0pKdOLEiY/aRowY8dH6hw8Q7elBoqeffnpvfwygVwh0JKO2tlbf/va3VVtbq7lz52r16tWaMWOGampq9NJLL+ngwYNqa2vT2rVrdckll5zyOZ544gm1tbWpublZmzZt0uzZs0/qN2rUKB09erS3PxKQFwIdyZg7d66ampo0Z84cnXXWWSorK9PcuXM1btw43Xvvvbr00ks1ffp0zZw5U4sXLz6lc3zuc5/Teeedp+nTp2v+/Pm6//77dfbZZ5/Ub8yYMbrooos0bdo0vhRFv3Gx3kVRXV0d9fX1RTl3PiqX/6pfz7e/7Jp+PV/VpO6HI/rCrht29foYe/fu1ZQpUwpQTfEU4mfgd7OwCvG72R9sb4+I6q7asl6h237E9ju2d3fTbtsrbe+z/Zrtmb0tGACQv1yGXH4s6fIe2hdImpxZ6iQ91PuyAAD5yhroEbFJ0p966LJY0mPR7mVJo22PK1SBAIDcFOJL0fGSDnTYbszsO4ntOtv1tuubm5sLcGoAwIcKEehdPQu0y29aI2JNRFRHRHV5eXkBTg0A+FAhAr1R0oQO2xWS3i7AcQEAeSjEK+jWSfqy7cclXSDpSEQ0FeC4QN4KPZVv/32fzanfzTffrPXr1+vMM8/U7t1dTggD+lwu0xbXSvqtpE/ZbrS91PYttm/JdNkg6S1J+yQ9LOlf+qxaYIC68cYb9cwzzxS7DAxxWa/QI2JJlvaQdFvBKgIGodra2r96sBdQDNz6DwCJINABIBEEOgAkgkAHgEQUYtoiMGDkOs2w0JYsWaKNGzfq4MGDqqio0IoVK7R06dKi1IKhi0AHCmDt2rXFLgFgyAUAUkGgA0AiCHQASASBDgCJINABIBEEOgAkgmmLSMt/nlHg4x3J2uXAgQO6/vrr9Yc//EHDhg1TXV2dbr/99sLWAeSAQAd6afjw4frOd76jmTNn6ujRo5o1a5Yuu+wyTZ06tdilYYhhyAXopXHjxmnmzJmSpFGjRmnKlCn6/e9/X+SqMBQR6EAB7d+/X6+++qouuOCCYpeCISinQLd9ue03bO+zvbyL9nm2j9huyCx3F75UYGBraWnRlVdeqQcffFAf//jHi10OhqCsY+i2SyR9X9Jlan8h9Dbb6yLi9U5dN0fEoj6oERjwWltbdeWVV+raa6/V5z//+WKXgyEqlyv02ZL2RcRbEfEXSY9LWty3ZQGDR0Ro6dKlmjJliu68885il4MhLJdZLuMlHeiw3SipqwHCObZ3Snpb0r9GxJ4C1AfkJ4dphoX2m9/8Rj/5yU9UVVWlGTNmSJK+9a1vaeHChf1eC4a2XALdXeyLTts7JJ0TES22F0p6StLkkw5k10mqk6SJEyfmVykwQF188cVqf1c6UFy5DLk0SprQYbtC7VfhH4mIdyOiJbO+QVKp7bGdDxQ
RayKiOiKqy8vLe1E2AKCzXAJ9m6TJtifZPk3S1ZLWdexg+2zbzqzPzhz3UKGLBQB0L+uQS0ScsP1lSc9KKpH0SETssX1Lpn21pKsk3Wr7hKRjkq4O/g0KAP0qp1v/M8MoGzrtW91hfZWkVYUtDQCQD+4UBYBEEOgAkAietoikVD1aVdDj7bphV9Y+x48fV21trd5//32dOHFCV111lVasWFHQOoBcEOhAL40YMUIvvPCCRo4cqdbWVl188cVasGCBampqil0ahhiGXIBesq2RI0dKan+mS2trqzKzeIF+RaADBdDW1qYZM2bozDPP1GWXXcbjc1EUBDpQACUlJWpoaFBjY6O2bt2q3bt3F7skDEEEOlBAo0eP1rx58/TMM88UuxQMQQQ60EvNzc06fPiwJOnYsWN6/vnn9elPf7q4RWFIYpYLkpLLNMNCa2pq0g033KC2tjZ98MEH+sIXvqBFi3jXC/ofgQ700nnnnadXX3212GUADLkAQCoIdABIBIGOQWkwP515MNeOgY1Ax6BTVlamQ4cODcpgjAgdOnRIZWVlxS4FCeJLUQw6FRUVamxsVHNzc7FLOSVlZWWqqKgodhlIEIGOQae0tFSTJk0qdhnAgJPTkIvty22/YXuf7eVdtNv2ykz7a7ZnFr5UAEBPsga67RJJ35e0QNJUSUtsT+3UbYGkyZmlTtJDBa4TAJBFLlfosyXti4i3IuIvkh6XtLhTn8WSHot2L0sabXtcgWsFAPQglzH08ZIOdNhulNT52aBd9RkvqaljJ9t1ar+Cl6QW22/kVe0Q0IunaI+VdDD/j/XvUwF9I88JH6z43RwwzumuIZdA7+qn7DxfLJc+iog1ktbkcE7kyXZ9RFQXuw6gM343+08uQy6NkiZ02K6Q9PYp9AEA9KFcAn2bpMm2J9k+TdLVktZ16rNO0vWZ2S41ko5ERFPnAwEA+k7WIZeIOGH7y5KelVQi6ZGI2GP7lkz7akkbJC2UtE/Se5Ju6ruS0Q2GsjBQ8bvZTzwYb58GAJyMZ7kAQCIIdABIBIE+yNl+xPY7tnnNPAYU2xNsv2h7r+09tm8vdk2pYwx9kLNdK6lF7XfqTit2PcCHMneLj4uIHbZHSdou6YqIeL3IpSWLK/RBLiI2SfpTsesAOouIpojYkVk/Kmmv2u8gRx8h0AH0OduVks6X9EqRS0kagQ6gT9keKemXkr4aEe8Wu56UEegA+oztUrWH+c8i4sli15M6Ah1An7BtST+UtDcivlvseoYCAn2Qs71W0m8lfcp2o+2lxa4JyLhI0pckzbfdkFkWFruolDFtEQASwRU6ACSCQAeARBDoAJAIAh0AEkGgA0AiCHQASASBDgCJINAxJNk+3favbO+0vdv2F23Psv2S7e22n7U9zvYZtt+w/anM59ba/udi1w90JetLooFEXS7p7Yj4rCTZPkPS05IWR0Sz7S9Kuicibs68JP3Htv9L0t9ExMPFKxvoHneKYkiy/feSnpX0c0nrJf2fpP+W9FamS4mkpoj4h0z/NZKulDQ9Ihr7v2IgO67QMSRFxP/YniVpoaR7JT0naU9EzOnc1/YwSVMkHZP0t5IIdAxIjKFjSLL9d5Lei4ifSvq2pAsklduek2kvtX1upvsdan/bzhJJj2QeCQsMOFyhY6iqkvSA7Q8ktUq6VdIJSSsz4+nDJT1ou1XSP0maHRFHbW+S9B+SvlmkuoFuMYYOAIlgyAUAEkGgA0AiCHQASASBDgCJINABIBEEOgAkgkAHgET8P1ajvNrlFDadAAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = crosstabNeu.plot.bar(rot = 0, stacked = False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The plot shows that, for both sexes, the preference for folk music decreases slightly from rural areas (1) through small towns (2) to large cities (3)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 7.2) Cross-tabulation with groupby()\n", "\n", "**crosstab()** is not the only way to build cross-tabulations with **Pandas**. The **groupby()** method, familiar from the previous chapter, can produce essentially the same tables; only the syntax differs.\n", "\n", "[Aggregation and Grouping](https://jakevdp.github.io/PythonDataScienceHandbook/03.08-aggregation-and-grouping.html)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sex12
wohnort
15837
23525
37366
\n", "
" ], "text/plain": [ "sex 1 2\n", "wohnort \n", "1 58 37\n", "2 35 25\n", "3 73 66" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "kt = daten.groupby(['wohnort', 'sex'])['sex'].count().unstack()\n", "# '.unstack()' für den 'Kreuztabellenlook' nötig. Anderenfalls würden beide Variablen als Zeilenvariablen dargestellt.\n", "\n", "kt" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEDCAYAAADOc0QpAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAATLUlEQVR4nO3dfbCedZ3f8fdHkiUgEXk4MIEDm9RBNwGXLBxx6bKIy6bBwggjdZaMlCCMmZ3igrWoUP4AZ2SGHapIx67TCKypqwGku4VZqS4NUqzLQxMe5CGbwgqFAwFicCNowSR8+8e5wUM44Tzc55w7+eX9+ue6rt/veviGe+Zzfvzu67ruVBWSpLa8o9cFSJImn+EuSQ0y3CWpQYa7JDXIcJekBhnuktSgGb0uAGD//fevuXPn9roMSdqprFmz5mdV1TdS3w4R7nPnzmX16tW9LkOSdipJ/u/2+pyWkaQGGe6S1CDDXZIaNOqce5LrgFOAF6rqiG36LgSuBPqq6medtouBc4GtwPlV9YOJFLZ582YGBwd55ZVXJnL4tJk1axb9/f3MnDmz16VI0hvG8oXqN4GvAf9leGOSQ4BFwFPD2hYAZwCHAwcB/yPJe6tq63gLGxwcZPbs2cydO5ck4z18WlQVGzduZHBwkHnz5vW6HEl6w6jTMlV1J/DiCF1XAZ8Hhr9W8lTg+qp6taqeAB4HjplIYa+88gr77bffDhvsAEnYb7/9dvj/u5C065nQnHuSjwLPVNWD23QdDDw9bHuw0zYhO3Kwv25nqFHSrmfc97kn2RO4BPgXI3WP0DbiC+OTLAOWARx66KHjLUOS9DYm8hDTe4B5wIOdUWs/cF+SYxgaqR8ybN9+4NmRTlJVy4HlAAMDA/5iiLQruGzvXlcwtS7b1OsK3jDuaZmqeqiqDqiquVU1l6FAP6qqngNuAc5IsnuSecBhwL2TWvEofvnLX3LyySdz5JFHcsQRR3DDDTewZs0aPvShD3H00UezePFi1q9fz6ZNm3jf+97HunXrAFiyZAnf+MY3prNUSZoyY7kVciVwArB/kkHg0qq6dqR9q+qRJDcCjwJbgPMmcqdMN77//e9z0EEH8b3vfQ+ATZs28ZGPfISbb76Zvr4+brjhBi655BKuu+46vva1r3H22WdzwQUX8POf/5xPfepT01mqJE2ZUcO9qpaM0j93m+3Lgcu7K2vi3v/+93PhhRfyhS98gVNOOYV99tmHhx9+mEWLFgGwdetW5syZA8CiRYv47ne/y3nnnceDD2773bAk7bx2iBeHTab3vve9rFmzhltvvZWLL76YRYsWcfjhh3PXXXe9Zd/XXnuNtWvXsscee/Diiy/S39/fg4olafI19/qBZ599lj333JMzzzyTCy+8kHvuuYcNGza8Ee6bN2/mkUceAeCqq65i/vz5rFy5knPOOYfNmzf3snRJmjTNjdwfeughPve5z/GOd7yDmTNn8vWvf50ZM2Zw/vnns2nTJrZs2cJnPvMZZs6cyTXXX
MO9997L7NmzOf744/nSl77EF7/4xV7/EySpa82F++LFi1m8ePFb2u+88863tK1du/aN9a985StTWpckTafmpmUkSYa7JDXJcJekBhnuktQgw12SGmS4S1KDDPe3cc4553DAAQdwxBFHjL6zJO1Adpr73Ode9L1JPd+TV5w86j5nn302n/70pznrrLMm9dqSNNUcub+N448/nn333bfXZUjSuBnuktQgw12SGmS4S1KDDHdJapDh/jaWLFnCsccey7p16+jv7+faa0f8dUFJ2uHsNLdCjuXWxcm2cuXKab+mJE0GR+6S1CDDXZIaNGq4J7kuyQtJHh7WdmWSf0jykyR/k+Tdw/ouTvJ4knVJ3vqTSJKkKTeWkfs3gZO2absNOKKqfhf4P8DFAEkWAGcAh3eO+Ysku01atZKkMRk13KvqTuDFbdr+rqq2dDbvBvo766cC11fVq1X1BPA4cMwk1itJGoPJmHM/B/jvnfWDgaeH9Q122iRJ06ircE9yCbAF+PbrTSPsVts5dlmS1UlWb9iwoZsypszTTz/Nhz/8YebPn8/hhx/O1Vdf3euSJGlMJnyfe5KlwCnAiVX1eoAPAocM260feHak46tqObAcYGBgYMQ/AG9y2d4TLXU759s06i4zZszgy1/+MkcddRQvvfQSRx99NIsWLWLBggWTW4skTbIJjdyTnAR8AfhoVf1qWNctwBlJdk8yDzgMuLf7Mntjzpw5HHXUUQDMnj2b+fPn88wzz/S4Kkka3agj9yQrgROA/ZMMApcydHfM7sBtSQDurqo/rapHktwIPMrQdM15VbV1qoqfTk8++ST3338/H/zgB3tdiiSNatRwr6olIzRv9yUrVXU5cHk3Re1oXn75ZU4//XS++tWv8q53vavX5UjSqHxCdRSbN2/m9NNP5xOf+AQf+9jHel2OJI2J4f42qopzzz2X+fPn89nPfrbX5UjSmBnub+PHP/4x3/rWt7j99ttZuHAhCxcu5NZbb+11WZI0qp3mlb9juXVxsh133HH85i5PSdp5OHKXpAYZ7pLUIMNdkhq0Q4f7zjDfvTPUKGnXs8OG+6xZs9i4ceMOHZ5VxcaNG5k1a1avS5GkN9lh75bp7+9ncHCQHfWNka+bNWsW/f39o+8oSdNohw33mTNnMm/evF6XIUk7pR12WkaSNHGGuyQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNWjUcE9yXZIXkjw8rG3fJLcleayz3GdY38VJHk+yLsniqSpckrR9Yxm5fxM4aZu2i4BVVXUYsKqzTZIFwBnA4Z1j/iLJbpNWrSRpTEYN96q6E3hxm+ZTgRWd9RXAacPar6+qV6vqCeBx4JjJKVWSNFYTnXM/sKrWA3SWB3TaDwaeHrbfYKftLZIsS7I6yeod/bW+krSzmewvVDNC24i/tlFVy6tqoKoG+vr6JrkMSdq1TTTcn08yB6CzfKHTPggcMmy/fuDZiZcnSZqIiYb7LcDSzvpS4OZh7Wck2T3JPOAw4N7uSpQkjdeov8SUZCVwArB/kkHgUuAK4MYk5wJPAR8HqKpHktwIPApsAc6rqq1TVLskaTtGDfeqWrKdrhO3s//lwOXdFCVJ6o5PqEpSgwx3SWrQqNMyTbps715XMLUu29TrCiT1mCN3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJalBX4Z7k3yZ5JMnDSVYmmZVk3yS3JXmss9xnsoqVJI3NhMM9ycHA+cBAVR0B7AacAVwErKqqw4BVnW1J0jTqdlpmBrBHkhnAnsCzwKnAik7/CuC0Lq8hSRqnCYd7VT0D/AfgKWA9sKmq/g44sKrWd/ZZDxwwGYVKksaum2mZfRgapc8DDgLemeTMcRy/LMnqJKs3bNgw0TIkSSPoZlrmj4EnqmpDVW0G/hr45
8DzSeYAdJYvjHRwVS2vqoGqGujr6+uiDEnStroJ96eA30+yZ5IAJwJrgVuApZ19lgI3d1eiJGm8Zkz0wKq6J8lNwH3AFuB+YDmwF3BjknMZ+gPw8ckoVJI0dhMOd4CquhS4dJvmVxkaxUuSesQnVCWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDuroVUpLGY+4r3+l1CVPqyV4XMIwjd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUoK7CPcm7k9yU5B+SrE1ybJJ9k9yW5LHOcp/JKlaSNDbdjtyvBr5fVb8DHAmsBS4CVlXVYcCqzrYkaRpNONyTvAs4HrgWoKp+XVX/BJwKrOjstgI4rbsSJUnj1c3I/Z8BG4C/THJ/kmuSvBM4sKrWA3SWB0xCnZKkcejmZ/ZmAEcBf1ZV9yS5mnFMwSRZBiwDOPTQQ7soQ7ucy/budQVT67JNva5ADehm5D4IDFbVPZ3tmxgK++eTzAHoLF8Y6eCqWl5VA1U10NfX10UZkqRtTTjcq+o54Okk7+s0nQg8CtwCLO20LQVu7qpCSdK4dTMtA/BnwLeT/BbwU+CTDP3BuDHJucBTwMe7vMak8xfYJbWuq3CvqgeAgRG6TuzmvJKk7viEqiQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDug73JLsluT/J33a2901yW5LHOst9ui9TkjQekzFyvwBYO2z7ImBVVR0GrOpsS5KmUVfhnqQfOBm4ZljzqcCKzvoK4LRuriFJGr9uR+5fBT4PvDas7cCqWg/QWR7Q5TUkSeM04XBPcgrwQlWtmeDxy5KsTrJ6w4YNEy1DkjSCbkbufwB8NMmTwPXAHyX5K+D5JHMAOssXRjq4qpZX1UBVDfT19XVRhiRpWxMO96q6uKr6q2oucAZwe1WdCdwCLO3sthS4uesqJUnjMhX3uV8BLEryGLCosy1JmkYzJuMkVXUHcEdnfSNw4mScV5I0MT6hKkkNMtwlqUGTMi0jTae5r3yn1yVMqSd7XYCa4MhdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KDJhzuSQ5J8sMka5M8kuSCTvu+SW5L8lhnuc/klStJGotuRu5bgH9XVfOB3wfOS7IAuAhYVVWHAas625KkaTThcK+q9VV1X2f9JWAtcDBwKrCis9sK4LQua5QkjdOkzLknmQv8HnAPcGBVrYehPwDAAZNxDUnS2HUd7kn2Av4r8Jmq+sU4jluWZHWS1Rs2bOi2DEnSMF2Fe5KZDAX7t6vqrzvNzyeZ0+mfA7ww0rFVtbyqBqpqoK+vr5syJEnb6OZumQDXAmur6ivDum4BlnbWlwI3T7w8SdJEzOji2D8A/jXwUJIHOm3/HrgCuDHJucBTwMe7qlCSNG4TDveq+l9AttN94kTPK0nqnk+oSlKDDHdJapDhLkkNMtwlqUGGuyQ1yHCXpAYZ7pLUIMNdkhpkuEtSgwx3SWqQ4S5JDTLcJalBhrskNchwl6QGGe6S1CDDXZIaZLhLUoMMd0lqkOEuSQ0y3CWpQYa7JDVoysI9yUlJ1iV5PMlFU3UdSdJbTUm4J9kN+E/AR4AFwJIkC6biWpKkt5qqkfsxwONV9dOq+jVwPXDqFF1LkrSNGVN03oOBp4dtDwIfHL5DkmXAss7my0nWTVEtO4L9gZ9N18Xy59N1pV2Gn9/Oq/XP7re31zFV4Z4R2upNG1XLgeVTdP0dSpLVVTXQ6zo0MX5+O69d+bObqmmZQeCQYdv9wLNTdC1J0jamKtz/N3BYknlJf
gs4A7hliq4lSdrGlEzLVNWWJJ8GfgDsBlxXVY9MxbV2ErvE9FPD/Px2XrvsZ5eqGn0vSdJOxSdUJalBhrskNchwl6QGGe7SMEl+J8mJSfbapv2kXtWksUtyTJIPdNYXJPlskn/Z67p6wS9Up1GST1bVX/a6Do0syfnAecBaYCFwQVXd3Om7r6qO6mF5GkWSSxl6n9UM4DaGnoq/A/hj4AdVdXnvqpt+hvs0SvJUVR3a6zo0siQPAcdW1ctJ5gI3Ad+qqquT3F9Vv9fbCvV2Op/fQmB34Dmgv6p+kWQP4J6q+t1e1jfdpur1A7usJD/ZXhdw4HTWonHbrapeBqiqJ5OcANyU5LcZ+ZUa2rFsqaqtwK+S/GNV/QKgqv5fktd6XNu0M9wn34HAYuDn27QH+PvpL0fj8FyShVX1AEBnBH8KcB3w/p5WprH4dZI9q+pXwNGvNybZGzDc1bW/BfZ6PSCGS3LHtFej8TgL2DK8oaq2AGcl+c+9KUnjcHxVvQpQVcPDfCawtDcl9Y5z7pLUIG+FlKQGGe6S1CDDXepI8vIUnffdSf7NVJxb2h7DXZpCnR+LfzdguGtaGe5qTpLPd542JclVSW7vrJ+Y5K+SLEnyUJKHkzf/6mWSy5M8mOTuJAd22r6Z5D8m+fskP03yrzrtSXJl5zwPJfmTTvsJSX6Y5DvAQ8AVwHuSPJDkymn8T6FdmOGuFt0J/GFnfQDYK8lM4DjgMeDPgT9i6GnGDyQ5rbPvO4G7q+rIzjk+NeycczrHn8JQWAN8rHOOIxl6xP3KJHM6fccAl1TVAuAi4B+ramFVfW5S/6XSdhjuatEa4Ogks4FXgbsYCvk/BP4JuKOqNnTuYf82cHznuF8z9JzC6+eYO+yc/62qXquqR/nNk8bHASuramtVPQ/8T+ADnb57q+qJqfjHSWNhuKs5VbUZeBL4JENPBf8I+DDwHuCptzl0c/3mwY+tvPkhv1eHrWeb5Uh+OY6SpUlnuKtVdwIXdpY/Av4UeAC4G/hQkv07X3YuYWjEPdFr/EmS3ZL0MfR/APeOsN9LwOwJXkOaEMNdrfoRQ/Pkd3WmTF4BflRV64GLgR8CDwL3vf5a3wn4G+AnnfPcDny+qp7bdqeq2gj8uPPFq1+oalr4+gFJapAjd0lqkOEuSQ0y3CWpQYa7JDXIcJekBhnuktQgw12SGmS4S1KD/j+5EdzWEgHX/QAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = kt.plot.bar(stacked = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With **groupby()** cross-tabulations, relative frequencies have to be computed by hand, for example with the lambda function introduced earlier." ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sex12
wohnort
161.0538.95
258.3341.67
352.5247.48
\n", "
" ], "text/plain": [ "sex 1 2\n", "wohnort \n", "1 61.05 38.95\n", "2 58.33 41.67\n", "3 52.52 47.48" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ktNeu = kt.apply(lambda z: z/z.sum()*100, axis = 1).round(2) # 'axis=1' = zeilenweise Prozentuierung\n", "\n", "ktNeu" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXcAAAEDCAYAAADOc0QpAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8QVMy6AAAACXBIWXMAAAsTAAALEwEAmpwYAAARBUlEQVR4nO3dfZBddXnA8e9jEhuQgAls0oUVF2nEBCwhWUEUQzBkooUpDCkjGZEgGTJOQaBUBeQPdAojjuXNoXUaAc0gRpDahrGopQEKIi9NCMhLzAQxhYUAa8DwooEkPP1jD3XZbEz23r33bn75fv659557zzlPcme+e3L2npvITCRJZXlHqweQJA094y5JBTLuklQg4y5JBTLuklQg4y5JBRrZ6gEA9tprr+zs7Gz1GJK0Q1m+fPlvM7NtoOeGRdw7OztZtmxZq8eQpB1KRPzv1p7ztIwkFci4S1KBjLskFWhYnHOXpFbZuHEj3d3dbNiwodWjbNXo0aPp6Ohg1KhR272OcZe0U+vu7mbMmDF0dnYSEa0eZwuZybp16+ju7ma//fbb7vW2eVomIq6LiBci4tE+y8ZFxG0Rsbq6HdvnuQsi4omIWBURswf9J5GkJtqwYQN77rnnsAw7QESw5557DvpfFttzzv27wCf6LTsfWJqZE4Gl1WMiYjJwEnBgtc4/R8SIQU0kSU02XMP+llrm22bcM/Mu4MV+i48DFlX3FwHH91n+g8x8PTN/AzwBHDroqSRJdan1nPuEzFwLkJlrI2J8tXwf4L4+r+uulm0hIhYACwD23XffGseo0Vf2aO7+mu0r61s9QWP5/u24huN7N/smeHaIfpm69yFDs50hMNQfhRzo3w4D/ldPmbkwM7sys6utbcCrZyVpWHrt93/gmM+cxcFHf4qDPn4iNy75Gct/+ThHHnkk06ZNY/bs2axdu5b169dzwAEHsGrVKgDmzp3Lt7/97abMWOuR+/MR0V4dtbcDL1TLu4H39HldB/BsPQNK0nDz0zt+wd5/3sZ/XP9NANa//AqfPPnzLPnJf9HW1saNN97IhRdeyHXXXcfVV1/Nqaeeytlnn81LL73E6aef3pQZa437LcA84NLqdkmf5d+PiMuBvYGJwAP1DilJw8kHP/AXfOEfruC8S67i2KM/xtg9dufRVb9m1qxZAGzevJn29nYAZs2axQ9/+EPOOOMMHn744abNuM24R8RiYAawV0R0AxfRG/WbImI+8BRwIkBmPhYRNwGPA5uAMzJzc4Nml6SWeP/+72X5T27g1tt/zgVfu5pZ0w/jwPe/j3uXP7TFa998801WrlzJLrvswosvvkhHR0dTZtyeT8vMzcz2zByVmR2ZeW1mrsvMmZk5sbp9sc/rL8nM/TPzgMz8SWPHl6Tme/a5HnbdZTQnzzmGL3zuM9y/4lF6XnyJe++9F+i96vWxxx4D4IorrmDSpEksXryY0047jY0bNzZlRq9QlaRBeuRXq/nixVfyjngHo0aN5Ftf+zIjR4zgrPPOY/369WzatIlzzjmHUaNGcc011/DAAw8wZswYpk+fzsUXX8xXv/rVhs9o3CVpkGb
P+AizZ3xki+V33XXXFstWrlz5//cvv/zyhs7Vl98KKUkFMu6SVCDjLkkFMu6SVCDjLkkFMu6SVCDjLkktdtpppzF+/HgOOuigIdumn3OXpD46v1nPdx1uue6aS4/Z5lqnnnoqZ555Jqecckod+347j9wlqcWmT5/OuHHjhnSbxl2SCmTcJalAxl2SCmTcJalAxl2SWmzu3LkcfvjhrFq1io6ODq699tq6t+lHISWpjzVn7V37ynsfUtNqixcvrn2fW+GRuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyS12NNPP81RRx3FpEmTOPDAA7nqqqvq3qafc5ekvhbOGNrtfWX9Nl8ycuRILrvsMqZOncorr7zCtGnTmDVrFpMnT655tx65S1KLtbe3M3XqVADGjBnDpEmTeOaZZ+rapnGXpGFkzZo1rFixgsMOO6yu7Rh3SRomXn31VebMmcOVV17J7rvvXte2jLskDQMbN25kzpw5fPrTn+aEE06oe3vGXZJaLDOZP38+kyZN4txzzx2SbRp3SWqxe+65h+uvv57bb7+dKVOmMGXKFG699da6tulHISWprwV31r5ujV/5e8QRR5CZte93AHUduUfE30XEYxHxaEQsjojRETEuIm6LiNXV7dihGlaStH1qjntE7AOcBXRl5kHACOAk4HxgaWZOBJZWjyVJTVTvOfeRwC4RMRLYFXgWOA5YVD2/CDi+zn1Ikgap5rhn5jPAPwJPAWuB9Zn5n8CEzFxbvWYtMH4oBpWkxsghP9891GqZr57TMmPpPUrfD9gbeFdEnDyI9RdExLKIWNbT01PrGJJUl9Hrn2Tda5uGbeAzk3Xr1jF69OhBrVfPp2WOBn6TmT0AEfEj4CPA8xHRnplrI6IdeGErAy8EFgJ0dXUNz79VScXrePDrdHMePXu8D4j6NrZ+5ZDM1N/o0aPp6OgY1Dr1xP0p4MMRsSvwB2AmsAx4DZgHXFrdLqljH5LUUKPe+B373XfB0GxsO74Bsllqjntm3h8RNwMPApuAFfQeie8G3BQR8+n9AXDiUAwqSdp+dV3ElJkXARf1W/w6vUfxkqQW8esHJKlAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SCmTcJalAxl2SClRX3CPi3RFxc0T8KiJWRsThETEuIm6LiNXV7dihGlaStH3qPXK/CvhpZn4AOBhYCZwPLM3MicDS6rEkqYlG1rpiROwOTAdOBcjMN4A3IuI4YEb1skXAncB59Qw51Do3fL/VIzTUmlYPIKnl6jlyfx/QA3wnIlZExDUR8S5gQmauBahuxw/BnJKkQagn7iOBqcC3MvMQ4DUGcQomIhZExLKIWNbT01PHGJKk/uqJezfQnZn3V49vpjf2z0dEO0B1+8JAK2fmwszsysyutra2OsaQJPVXc9wz8zng6Yg4oFo0E3gcuAWYVy2bByypa0JJ0qDV/AvVyueBGyLincCTwGfp/YFxU0TMB54CTqxzH5KkQaor7pn5ENA1wFMz69mu9Kf4aSdp27xCVZIKZNwlqUDGXZIKZNwlqUDGXZIKZNwlqUDGXZIKZNwlqUDGXZIKZNwlqUDGXZIKVO8Xh0nSdvN7gZrHI3dJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKpBxl6QCGXdJKlDdcY+IERGxIiJ+XD0eFxG3RcTq6nZs/WNKkgZjKI7czwZW9nl8PrA0MycCS6vHkqQmqivuEdEBHANc02fxccCi6v4i4Ph69iFJGrx6j9yvBL4EvNln2YTMXAt
Q3Y6vcx+SpEGqOe4RcSzwQmYur3H9BRGxLCKW9fT01DqGJGkA9Ry5fxT464hYA/wA+HhEfA94PiLaAarbFwZaOTMXZmZXZna1tbXVMYYkqb+a456ZF2RmR2Z2AicBt2fmycAtwLzqZfOAJXVPKUkalEZ8zv1SYFZErAZmVY8lSU00cig2kpl3AndW99cBM4diu5Kk2niFqiQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoGMuyQVyLhLUoFqjntEvCci7oiIlRHxWEScXS0fFxG3RcTq6nbs0I0rSdoe9Ry5bwL+PjMnAR8GzoiIycD5wNLMnAgsrR5Lkpqo5rhn5trMfLC6/wqwEtgHOA5YVL1sEXB8nTNKkgZpSM65R0QncAhwPzAhM9dC7w8AYPxQ7EOStP3qjntE7Ab8K3BOZr48iPUWRMSyiFjW09NT7xiSpD7qintEjKI37Ddk5o+qxc9HRHv1fDvwwkDrZubCzOzKzK62trZ6xpAk9VPPp2UCuBZYmZmX93nqFmBedX8esKT28SRJtRhZx7ofBT4DPBIRD1XLvgxcCtwUEfOBp4AT65pQkjRoNcc9M38OxFaenlnrdiVJ9fMKVUkqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqkHGXpAIZd0kqUMPiHhGfiIhVEfFERJzfqP1IkrbUkLhHxAjgn4BPApOBuRExuRH7kiRtqVFH7ocCT2Tmk5n5BvAD4LgG7UuS1M/IBm13H+DpPo+7gcP6viAiFgALqoevRsSqBs0yHOwF/LZZO4uvN2tPOw3fvx1X6e/de7f2RKPiHgMsy7c9yFwILGzQ/oeViFiWmV2tnkO18f3bce3M712jTst0A+/p87gDeLZB+5Ik9dOouP8PMDEi9ouIdwInAbc0aF+SpH4aclomMzdFxJnAz4ARwHWZ+Vgj9rWD2ClOPxXM92/HtdO+d5GZ236VJGmH4hWqklQg4y5JBTLuklQg4y71EREfiIiZEbFbv+WfaNVM2n4RcWhEfKi6Pzkizo2Iv2r1XK3gL1SbKCI+m5nfafUcGlhEnAWcAawEpgBnZ+aS6rkHM3NqC8fTNkTERfR+n9VI4DZ6r4q/Ezga+FlmXtK66ZrPuDdRRDyVmfu2eg4NLCIeAQ7PzFcjohO4Gbg+M6+KiBWZeUhrJ9SfUr1/U4A/A54DOjLz5YjYBbg/M/+ylfM1W6O+fmCnFRG/3NpTwIRmzqJBG5GZrwJk5pqImAHcHBHvZeCv1NDwsikzNwO/j4hfZ+bLAJn5h4h4s8WzNZ1xH3oTgNnAS/2WB/CL5o+jQXguIqZk5kMA1RH8scB1wAdbOpm2xxsRsWtm/h6Y9tbCiNgDMO6q24+B3d4KRF8RcWfTp9FgnAJs6rsgMzcBp0TEv7RmJA3C9Mx8HSAz+8Z8FDCvNSO1jufcJalAfhRSkgpk3CWpQMZdqkTEqw3a7rsj4m8bsW1pa4y71EDVfxb/bsC4q6mMu4oTEV+qrjYlIq6IiNur+zMj4nsRMTciHomIRyPe/r9eRsQlEfFwRNwXEROqZd+NiG9GxC8i4smI+JtqeUTEN6rtPBIRn6qWz4iIOyLi+8AjwKXA/hHxUER8o4l/FdqJGXeV6C7gY9X9LmC3iBgFHAGsBr4OfJzeqxk/FBHHV699F3BfZh5cbeP0Pttsr9Y/lt5YA5xQbeNgei9x/0ZEtFfPHQp
cmJmTgfOBX2fmlMz84pD+SaWtMO4q0XJgWkSMAV4H7qU38h8DfgfcmZk91WfYbwCmV+u9Qe91Cm9to7PPNv89M9/MzMf545XGRwCLM3NzZj4P/Dfwoeq5BzLzN434w0nbw7irOJm5EVgDfJbeq4LvBo4C9gee+hOrbsw/Xvixmbdf5Pd6n/vR73Ygrw1iZGnIGXeV6i7gC9Xt3cDngIeA+4AjI2Kv6pedc+k94q51H5+KiBER0UbvvwAeGOB1rwBjatyHVBPjrlLdTe958nurUyYbgLszcy1wAXAH8DDw4Ftf61uDfwN+WW3nduBLmflc/xdl5jrgnuoXr/5CVU3h1w9IUoE8cpekAhl3SSqQcZekAhl3SSqQcZekAhl3SSqQcZekAhl3SSrQ/wETN0H055KwXQAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "ax = ktNeu.plot.bar(stacked = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### \"Three-dimensional\" cross-tabulation with groupby()\n", "\n", "This is also possible with **groupby()**. First, a third categorical variable is created again: age in decades." ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "daten['Lebensjahrzehnt'] = daten['age']//10+1\n", "\n", "# '//' = floor division (e.g. 18//10 = 1), plus 1 (e.g. 18//10+1 = 2, i.e. the 2nd decade of life)" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
sexagewohnortvolksmusikhardrockhardrockpräferenzLebensjahrzehnt
015022.673.673.06
115711.003.333.06
226632.004.334.07
315022.332.672.06
416032.333.003.07
\n", "
" ], "text/plain": [ " sex age wohnort volksmusik hardrock hardrockpräferenz Lebensjahrzehnt\n", "0 1 50 2 2.67 3.67 3.0 6\n", "1 1 57 1 1.00 3.33 3.0 6\n", "2 2 66 3 2.00 4.33 4.0 7\n", "3 1 50 2 2.33 2.67 2.0 6\n", "4 1 60 3 2.33 3.00 3.0 7" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "daten.head().round(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let us build the cross-tabulation with *sex* as the third, outer variable, *wohnort* as the row variable, and *Lebensjahrzehnt* as the column variable." ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Lebensjahrzehnt234567810
sexwohnort
11NaN19.08.016.012.01.01.01.0
2NaN13.06.06.08.02.0NaNNaN
3NaN39.013.06.011.02.02.0NaN
211.07.010.07.09.02.01.0NaN
2NaN8.03.05.05.03.01.0NaN
3NaN23.019.08.010.05.01.0NaN
\n", "
" ], "text/plain": [ "Lebensjahrzehnt 2 3 4 5 6 7 8 10\n", "sex wohnort \n", "1 1 NaN 19.0 8.0 16.0 12.0 1.0 1.0 1.0\n", " 2 NaN 13.0 6.0 6.0 8.0 2.0 NaN NaN\n", " 3 NaN 39.0 13.0 6.0 11.0 2.0 2.0 NaN\n", "2 1 1.0 7.0 10.0 7.0 9.0 2.0 1.0 NaN\n", " 2 NaN 8.0 3.0 5.0 5.0 3.0 1.0 NaN\n", " 3 NaN 23.0 19.0 8.0 10.0 5.0 1.0 NaN" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "daten.groupby(['sex', 'wohnort', 'Lebensjahrzehnt'])['Lebensjahrzehnt'].count().unstack() # '.unstack()' is needed to get the cross-tabulation layout" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The table shows absolute frequencies, i.e. the number of respondents by sex, place of residence, and decade of life.\n", "\n", "Some row/column combinations evidently have no observations. These cells (currently 'NaN') can be filled with '0' if desired: append **fillna()** and pass '0' as the fill value." ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Lebensjahrzehnt234567810
sexwohnort
110.019.08.016.012.01.01.01.0
20.013.06.06.08.02.00.00.0
30.039.013.06.011.02.02.00.0
211.07.010.07.09.02.01.00.0
20.08.03.05.05.03.01.00.0
30.023.019.08.010.05.01.00.0
\n", "
" ], "text/plain": [ "Lebensjahrzehnt 2 3 4 5 6 7 8 10\n", "sex wohnort \n", "1 1 0.0 19.0 8.0 16.0 12.0 1.0 1.0 1.0\n", " 2 0.0 13.0 6.0 6.0 8.0 2.0 0.0 0.0\n", " 3 0.0 39.0 13.0 6.0 11.0 2.0 2.0 0.0\n", "2 1 1.0 7.0 10.0 7.0 9.0 2.0 1.0 0.0\n", " 2 0.0 8.0 3.0 5.0 5.0 3.0 1.0 0.0\n", " 3 0.0 23.0 19.0 8.0 10.0 5.0 1.0 0.0" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "daten.groupby(['sex', 'wohnort', 'Lebensjahrzehnt'])['Lebensjahrzehnt'].count().unstack().fillna(0)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9" } }, "nbformat": 4, "nbformat_minor": 2 }