{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Condorcet-Schulze method\n", "\n", "The Schulze method is a refinemnet of the original Condorcet methods that are used for deciding the winning candidate of an election in which all the voters can express a number of different preferences.\n", "\n", "To read all about the method we have been using, have a look at [Wikipedia](https://en.wikipedia.org/wiki/Schulze_method).\n", "\n", "## How to use the condorcet module\n", "\n", "Create a dataframe with the ordered poreferences for each voter. The first column shall be the first preference for the i-th voter.\n", "\n", "The entries of each cell are the names of the candidates.\n", "\n", "Each voter can vote for all or part of the candidates. The non voted candidates are considered to be less preferred than the voted ones.\n" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import condorcet\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create or import your dataframe\n", "\n", "Make sure that the structure of the dataframe is a in the example below: each row represents the preferences of each voter (there are six voters in the exaple below). The names of the candidates are *a, b, c, d*." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
first_preferencesecond_preferencethird_preference
0bca
1cab
2cab
3bac
4dac
5dac
\n", "
" ], "text/plain": [ " first_preference second_preference third_preference\n", "0 b c a\n", "1 c a b\n", "2 c a b\n", "3 b a c\n", "4 d a c\n", "5 d a c" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create dataframe\n", "table = np.asarray([\n", " [\"b\",\"c\",\"a\"],\n", " [\"c\",\"a\",\"b\"],\n", " [\"c\",\"a\",\"b\"],\n", " [\"b\",\"a\",\"c\"],\n", " [\"d\",\"a\",\"c\"],\n", " [\"d\",\"a\",\"c\"]\n", " ])\n", "df = pd.DataFrame(table, columns = [\"first_preference\",\"second_preference\",\"third_preference\"])\n", "df\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run on the dataframe\n", "\n", "Just run the `condorcet.compute_ranks` method on the dataframe directly and the result will appear as a list of lists.\n", "\n", "In the example below, the output is ```[['a', 'c'], ['b'], ['d']]```, which means that there are two tie candidates in the first place, *a and c*, one candidate in the second place *b and* the least preferred candidate is *d*." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[['a', 'c'], ['b'], ['d']]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# run the classification\n", "condorcet.compute_ranks(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dive deeper\n", "\n", "You can extract the *d[V,W]* and *p[V,W]* matrices with the methods:\n", " - ```condorcet._compute_d(weighted_ranks, candidate_names)```\n", " - ```condorcet._compute_p(dmat, candidate_names)```\n", " \n", "To extract the candidate names and the weighted ranks to be input to the above methods, please use:\n", " - ```weighted_ranks = condorcet.weighted_ranks_from_df(df)```\n", " - ```candidate_names = condorcet.candidate_names_from_df(df)```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test on Wikipedia example\n", "\n", "We test the method on [wikipedia example](https://en.wikipedia.org/wiki/Schulze_method#Example) to make sure that the algorithm works properly." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
first_columnsecond_columnthird_columnfourth_column
0ACBE
1ACBE
2ACBE
3ACBE
4ACBE
5ADEC
6ADEC
7ADEC
8ADEC
9ADEC
10BEDA
11BEDA
12BEDA
13BEDA
14BEDA
15BEDA
16BEDA
17BEDA
18CABE
19CABE
20CABE
21CAEB
22CAEB
23CAEB
24CAEB
25CAEB
26CAEB
27CAEB
28CBAD
29CBAD
30DCEB
31DCEB
32DCEB
33DCEB
34DCEB
35DCEB
36DCEB
37EBAD
38EBAD
39EBAD
40EBAD
41EBAD
42EBAD
43EBAD
44EBAD
\n", "
" ], "text/plain": [ " first_column second_column third_column fourth_column\n", "0 A C B E\n", "1 A C B E\n", "2 A C B E\n", "3 A C B E\n", "4 A C B E\n", "5 A D E C\n", "6 A D E C\n", "7 A D E C\n", "8 A D E C\n", "9 A D E C\n", "10 B E D A\n", "11 B E D A\n", "12 B E D A\n", "13 B E D A\n", "14 B E D A\n", "15 B E D A\n", "16 B E D A\n", "17 B E D A\n", "18 C A B E\n", "19 C A B E\n", "20 C A B E\n", "21 C A E B\n", "22 C A E B\n", "23 C A E B\n", "24 C A E B\n", "25 C A E B\n", "26 C A E B\n", "27 C A E B\n", "28 C B A D\n", "29 C B A D\n", "30 D C E B\n", "31 D C E B\n", "32 D C E B\n", "33 D C E B\n", "34 D C E B\n", "35 D C E B\n", "36 D C E B\n", "37 E B A D\n", "38 E B A D\n", "39 E B A D\n", "40 E B A D\n", "41 E B A D\n", "42 E B A D\n", "43 E B A D\n", "44 E B A D" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create dataframe\n", "df = pd.DataFrame()\n", "\n", "# populate with wikipedia preferences\n", "df[\"first_column\"] = 5*['A']+5*['A']+8*['B']+3*['C']+7*['C']+2*['C']+7*['D']+8*['E']\n", "df[\"second_column\"] = 5*['C']+5*['D']+8*['E']+3*['A']+7*['A']+2*['B']+7*['C']+8*['B']\n", "df[\"third_column\"] = 5*['B']+5*['E']+8*['D']+3*['B']+7*['E']+2*['A']+7*['E']+8*['A']\n", "df[\"fourth_column\"] = 5*['E']+5*['C']+8*['A']+3*['E']+7*['B']+2*['D']+7*['B']+8*['D']\n", "\n", "# we are not putting explicitly the last column, as it will be automatically inferred by the algorithm\n", "df" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[['E'], ['A'], ['C'], ['B'], ['D']]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# run the classification\n", "condorcet.compute_ranks(df)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "d matrix: defaultdict(, {('A', 'C'): 26, ('A', 'B'): 20, ('A', 'E'): 22, ('A', 'D'): 30, ('C', 'B'): 29, ('C', 'E'): 24, ('C', 'D'): 17, ('B', 'E'): 18, ('B', 'D'): 33, ('E', 'D'): 31, ('D', 'E'): 14, ('D', 'C'): 28, ('D', 'B'): 12, ('E', 'C'): 21, ('E', 'B'): 27, ('B', 'A'): 25, ('B', 'C'): 16, ('E', 'A'): 23, ('D', 'A'): 15, ('C', 'A'): 19})\n", "p matrix: {('A', 'B'): 28, ('A', 'C'): 28, ('A', 'D'): 30, ('A', 'E'): 24, ('B', 'A'): 25, ('B', 'C'): 28, ('B', 'D'): 33, ('B', 'E'): 24, ('C', 'A'): 25, ('C', 'B'): 29, ('C', 'D'): 29, ('C', 'E'): 24, ('D', 'A'): 25, ('D', 'B'): 28, ('D', 'C'): 28, ('D', 'E'): 24, ('E', 'A'): 25, ('E', 'B'): 28, ('E', 'C'): 28, ('E', 'D'): 31}\n" ] } ], "source": [ "# checking the d and p matrix\n", "candidate_names = condorcet.candidate_names_from_df(df)\n", "weighted_ranks = condorcet.weighted_ranks_from_df(df)\n", "d = condorcet._compute_d(weighted_ranks, candidate_names)\n", "p = condorcet._compute_p(d, candidate_names)\n", "print(\"d matrix: \", d)\n", "print(\"p matrix: \", p)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test passed!\n", "The algorithmis sound and we were able to reproduce exaclty Wikipedia's example." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Application to the gtda-challenge-2020 \n", "\n", "We first import the csv file and then we run the method" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Who would you choose as winner?Who would you choose for second place?Who would you choose for third place?
0marabernardirorondrelordgrilo
1Antonio-Leitaolordgrilomarabernardi
2marabernardirorondrekazuyamagiwa
3lordgriloAntonio-Leitaomarabernardi
4kazuyamagiwafilco306rorondre
5filco306rorondremarabernardi
6rorondrefilco306marabernardi
7Antonio-Leitaofilco306rorondre
8marabernardifilco306lordgrilo
9rorondremarabernardilordgrilo
10marabernardikazuyamagiwarorondre
11marabernardikazuyamagiwaAntonio-Leitao
12filco306lordgrilororondre
\n", "
" ], "text/plain": [ " Who would you choose as winner? Who would you choose for second place? \\\n", "0 marabernardi rorondre \n", "1 Antonio-Leitao lordgrilo \n", "2 marabernardi rorondre \n", "3 lordgrilo Antonio-Leitao \n", "4 kazuyamagiwa filco306 \n", "5 filco306 rorondre \n", "6 rorondre filco306 \n", "7 Antonio-Leitao filco306 \n", "8 marabernardi filco306 \n", "9 rorondre marabernardi \n", "10 marabernardi kazuyamagiwa \n", "11 marabernardi kazuyamagiwa \n", "12 filco306 lordgrilo \n", "\n", " Who would you choose for third place? \n", "0 lordgrilo \n", "1 marabernardi \n", "2 kazuyamagiwa \n", "3 marabernardi \n", "4 rorondre \n", "5 marabernardi \n", "6 marabernardi \n", "7 rorondre \n", "8 lordgrilo \n", "9 lordgrilo \n", "10 rorondre \n", "11 Antonio-Leitao \n", "12 rorondre " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# import file as dataframe\n", "\n", "df_temp = pd.read_csv(\"data/Evaluation of Jupyter Notebooks.csv\")\n", "df = df_temp[df_temp.columns[-3:]]\n", "df" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[['marabernardi'],\n", " ['filco306', 'rorondre'],\n", " ['lordgrilo'],\n", " ['kazuyamagiwa'],\n", " ['Antonio-Leitao']]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# run the final classification\n", "\n", "condorcet.compute_ranks(df)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Congratulations!!\n", "\n", "Congratulations to all the candidates that submitted a notebook to the challenge! All of the notebooks were original, well structured and clearly explained. Thank you for your efforts!\n", "\n", "I truly hope you enjoyed the challenge and do not hesitate to share your notebooks with your friends and colleagues!\n", "\n", "# The final classification is:\n", " - 1st place, **marabernardi**\n", " - 2nd place, **rorondre**\n", " - 2nd place ex aequo, **filco306**\n", " - 4th place, **lordgrilo**\n", " - 5th place, **kazuyamagiwa**\n", " - 6th place, **Antonio-Leitao**\n", " \n", "Bravo!! ;)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.8" } }, "nbformat": 4, "nbformat_minor": 2 }