{
"metadata": {
"name": "",
"signature": "sha256:bcd7a3ed694c62256fad87d40724fcd1a08a575d9477792087f7053263a14507"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Measuring Editor Collaborativeness With Economic Modelling\n",
"\n",
"##Max Klein [@notconfusing](https://twitter.com/notconfusing)\n",
"##Wikimania 2014"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Audience Poll\n",
"\n",
"+ How many people here are:\n",
" + wikipedia researchers?\n",
" + familiar with network/graph theory?\n",
" + sort of understand pagerank?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#New Editors\n",
"+ Leave when they encounter uncooperative Wikipedians\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Suggestions Need Input\n",
"\n",
"+ I wanted to make an input-less suggester\n",
" + So new editors can browse it."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#What To Show New Editors?\n",
"\n",
"\n",
"+ Can we fit them into the more functional parts of Wikipedia?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Collaboration\n",
"\n",
"\n",
"+ What is collaboration on a Wikipedia page?\n",
"+ How do you measure it?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Collaboration\n",
"\n",
"\n",
"+ How more contributing editor exeperience effect article quality?\n",
" + More experience not neccessarily better.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Collaboration Flipside\n",
"\n",
"+ File this away for later\n",
" + How does editing more articles effect editor expertise?\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Economic modelling?\n",
"\n",
"\n",
"+ These two economics papers got me thinking:\n",
"1. [The building blocks of economic complexity. Hidalgo & Hausman](http://www.pnas.org/content/106/26/10570.full)\n",
"2. [ A Network Analysis of Countries\u2019 Export Flows: Firm Grounds for the Building Blocks of the Economy](http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0047278)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Principle\n",
"\n",
"\n",
"+ They both use an __network science__ algorithm on a _bi-partite_ graph, to __rank countries__ economic perfomance."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Key Insight\n",
"\n",
"+ Lower GDPs\n",
" + Just Agriculture\n",
" + __Ubiquitous products only__\n",
"+ Switzerland\n",
" + Agriculture & Watches\n",
" + __Ubiquitous and Rare products__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#So what?\n",
"\n",
"\n",
"+ Infer the GDP rankings of the world economy just by knowing\n",
" + Which countries \n",
" + export which products\n",
" + not even the quantities of the exports"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#It's notconfusing\n",
"\n",
"\n",
"+ \"Unlike laws and sausages, those who like Wikis and Tofu should inquire into how they are being made.\"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Bipartite Network \n",
"\n",
"+ A bi-partite network is where there are two distinct types of nodes in a graph.\n",
"+ In this case, countries and products.\n",
"\n"
]
},
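{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Bipartite Network - As a Matrix\n",
"\n",
"+ One common way to write this down (the symbols $M$, $k_c$ and $k_p$ are notation assumed here, not taken from these slides):\n",
"\n",
"$$M_{cp} = 1 \\text{ if country } c \\text{ exports product } p, \\text{ else } 0$$\n",
"\n",
"$$k_c = \\sum_p M_{cp} \\qquad k_p = \\sum_c M_{cp}$$\n",
"\n",
"+ $k_c$ counts how many products a country exports\n",
"+ $k_p$ counts how many countries export a product (its ubiquity)"
]
},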
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Basically, it's the Page Rank Algroithm\n",
"\n",
"\n",
"+ __Except__ we have __two node-types__\n",
"+ And an __extra variable__ for improtance of highly connected nodes\n",
" + I'll explain more later"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Lay terms\n",
"\n",
"\n",
"+ If you know __who__ exports __what__\n",
" + Then you can rank Countries (In economics)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Translations\n",
"\n",
"+ Instead of \n",
" + __countries__ exporting __products__\n",
"+ What about\n",
" + __editors__ writing __articles__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Translations\n",
"\n",
"+ Instead of \n",
" + __rich countries__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Translations\n",
"\n",
"+ Instead of \n",
" + __rich countries__\n",
"+ What about\n",
" + __super users__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Translations\n",
"\n",
"+ Instead of \n",
" + __ubiqitous products__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Translations\n",
"\n",
"+ Instead of \n",
" + __ubiqitous products__\n",
"+ What about\n",
" + __highly edited articles__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Translations\n",
"\n",
"+ Instead of \n",
" + __global economy__\n",
"+ What about\n",
" + __a wikipedia category__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Editor Article Matrix\n",
"+ It's Triangular \n",
" + the power users are editing most of the articles in the category. \n",
"![Feminst Writers](paper/Figures/Category_Feminist_writerstriangle_matrix_corrected.png)\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Iterative Algorithm\n",
"\n",
"+ A nonmathematical explanation:\n",
" + Imagine everyone in the room starts with \u00a31\n",
" + Distribute your money evenly to all your friends\n",
" + Round 2, some people may have more or less than \u00a31\n",
" + but again distribute all your money evenly to all your friends.\n",
" + Repeat over and over again.\n",
" + Eventually converges.\n",
"\n",
"http://www.scottaaronson.com/blog/?p=1820\n",
"\n"
]
},
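{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Iterative Algorithm - As a Formula\n",
"\n",
"+ A sketch of the money game, assuming friendships are mutual (notation assumed here: $x_i^{(n)}$ is the money person $i$ holds after round $n$, and $d_j$ is the number of friends of $j$):\n",
"\n",
"$$x_i^{(n+1)} = \\sum_{j \\text{ friend of } i} \\frac{x_j^{(n)}}{d_j}$$\n",
"\n",
"+ No money is created or destroyed, so $\\sum_i x_i^{(n)}$ stays constant\n",
"+ For an ordinary random walk like this one, the fixed point is $x_i \\propto d_i$: well-connected people end up holding more"
]
},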
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Iterative Algorithm - One More Variable\n",
"\n",
"\n",
"+ Same scenario as above except:\n",
" + You don't distribute your money evenly\n",
" + You can give your popular friends __disproportionately larger percentage__:\n",
" + __or__ disproportionately __less__."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Iterative Algorithm - Notation\n",
"\n",
"\n",
"+ In this experiement those are controlled by \n",
" + __$\\alpha$__ (article popularity exponent) and \n",
" + __$\\beta$__ (editor portfolio size exponent) levels."
]
},
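{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Iterative Algorithm - A Possible Formula\n",
"\n",
"+ A sketch, assuming the biased random walk form from the export-flows paper, with editors $e$ and articles $a$ in place of countries and products\n",
"+ Assumed notation: $M_{ea} = 1$ if editor $e$ edited article $a$; $k_e$ is the number of articles an editor touched, $k_a$ the number of editors on an article; $w$ are the scores\n",
"\n",
"$$w_e^{(n+1)} = \\sum_a G_{ea}(\\beta)\\, w_a^{(n)} \\qquad G_{ea}(\\beta) = \\frac{M_{ea}\\, k_e^{-\\beta}}{\\sum_{e'} M_{e'a}\\, k_{e'}^{-\\beta}}$$\n",
"\n",
"$$w_a^{(n+1)} = \\sum_e G_{ae}(\\alpha)\\, w_e^{(n)} \\qquad G_{ae}(\\alpha) = \\frac{M_{ea}\\, k_a^{-\\alpha}}{\\sum_{a'} M_{ea'}\\, k_{a'}^{-\\alpha}}$$\n",
"\n",
"+ With $\\alpha = \\beta = 0$ this reduces to distributing evenly\n",
"+ The sign convention on the exponents is an assumption; check the paper before relying on it"
]
},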
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Editors rise and fall over time\n",
"![convergence](paper/Figures/fem_editors_iter_converge.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#End Result of Algorithm\n",
"\n",
"+ A ranking for Editors\n",
"+ A ranking for Articles"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Exogenous Rankings\n",
"\n",
"+ Getting unrelated metrics for:\n",
" + Editors\n",
" + Articles"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Exogenous Editor Rankings\n",
"+ Edit count bad\n",
"+ Use @halfak and @staeiou [__\"Labour Hours\"__](http://www-users.cs.umn.edu/~halfak/publications/Using_Edit_Sessions_to_Measure_Participation_in_Wikipedia/geiger13using-preprint.pdf)\n",
" + Labour Hours: Sum of Edit Sessions\n",
" + Edit Session: The __start__ and __end__ times of __all__ the __edits__ that occur __within 1 hour__ of another edit. "
]
},
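{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Labour Hours - Worked Example\n",
"\n",
"+ A sketch, assuming a session's length is its last edit time minus its first (any extra per-session allowance in the paper is ignored here)\n",
"+ Sort an editor's edit times $t_1 \\le t_2 \\le \\dots \\le t_n$ and start a new session whenever $t_{i+1} - t_i > 1$ hour\n",
"\n",
"$$\\text{Labour Hours} = \\sum_{\\text{sessions } S} \\left( \\max_{t \\in S} t - \\min_{t \\in S} t \\right)$$\n",
"\n",
"+ Edits at 10:00, 10:20, 11:10, 14:00, 14:30\n",
" + Gaps: 20, 50, 170, 30 minutes\n",
" + Sessions: 10:00 to 11:10 (70 min) and 14:00 to 14:30 (30 min)\n",
" + Total: 100 minutes"
]
},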
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Exogenous Article Rankings\n",
"\n",
"+ Mix of:\n",
" 1. ratio of mark-up to readable text\n",
" 1. number of headings\n",
" 1. article length \n",
" 1. citations per article length \n",
" 1. outgoing intrawiki links."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Calibration\n",
"\n",
"\n",
"+ Find the values of __$\\alpha$__ and __$\\beta$__ which __maximize__:\n",
" + The rank __correlation between model and exogenous__ rankings"
]
},
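{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Calibration - As a Formula\n",
"\n",
"+ A sketch, assuming the rank correlation is Spearman's $\\rho$ (the slides only say rank correlation) and looking at one ranking at a time:\n",
"\n",
"$$(\\alpha^*, \\beta^*) = \\arg\\max_{\\alpha, \\beta}\\; \\rho\\big(r^{\\text{model}}(\\alpha, \\beta),\\; r^{\\text{exo}}\\big)$$\n",
"\n",
"+ For $n$ items without ties, with $d_i$ the difference between an item's two ranks:\n",
"\n",
"$$\\rho = 1 - \\frac{6 \\sum_i d_i^2}{n(n^2 - 1)}$$\n",
"\n",
"+ In practice this could be a grid search over $\\alpha$ and $\\beta$"
]
},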
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Calibration on Feminist Writers\n",
"\n",
"![](paper/Figures/contour_fem_combined.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#High Correlations\n",
"\n",
"\n",
"+ We find correlations around \n",
" + .6 to .9\n",
"+ Even __better than the Economics GDP papers __around\n",
" + .4"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Snapshotting\n",
"+ Took 13 Snapshots of each Category\n",
"\n",
"![](paper/Figures/cumulative_snapshots_Feminist_Writers_thirteen.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Rank Accuracy\n",
"+ This really works...\n",
"+ Increases over time\n",
"![](paper/Figures/rho_combined.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Most collaborative\n",
"+ Question: in which category do power editors improve article quality?\n",
" 1. American male novelists\n",
" 1. 2013 films\n",
" 1. American women novelists\n",
" 1. Nobel Peace Prize laureates\n",
" 1. Sexual acts\n",
" 1. Economic theories\n",
" 1. Feminist writers\n",
" 1. Yoga\n",
" 1. Military history of the US\n",
" 1. Counterculture festivals\n",
" 1. Computability theory\n",
" 1. Bicycle parts"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Most collaborative\n",
"+ Question: in which category do power editors improve article quality?\n",
" 1. __Military history of the US__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Least collaborative\n",
"+ Question: in which category do power editors hurt article quality?\n",
" 1. American male novelists\n",
" 1. 2013 films\n",
" 1. American women novelists\n",
" 1. Nobel Peace Prize laureates\n",
" 1. Sexual acts\n",
" 1. Economic theories\n",
" 1. Feminist writers\n",
" 1. Yoga\n",
" 1. Military history of the US\n",
" 1. Counterculture festivals\n",
" 1. Computability theory\n",
" 1. Bicycle parts"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Least collaborative\n",
"+ Question: in which category do power editors hurt article quality?\n",
" 1. __Sexual acts__"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Full Category Rankings\n",
"![](paper/Figures/beta_ranks.png)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Edit Count or Touches\n",
"![](paper/Figures/bin_comp.png)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Forest Not Trees\n",
"\n",
"\n",
"+ If you accept this $\\beta$ measure as a collaborativeness measure how can we use it?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Detect dysfunction\n",
"\n",
"\n",
"\n",
"+ For learning\n",
" + Arguing is not neccessarily bad.\n",
"+ For intervention?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Detect Where The Wiki is Working\n",
"\n",
"\n",
"+ At least where your time invested relates to article quality\n",
" + Even superlinearly so\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#A Potential Use\n",
"\n",
"\n",
"+ Make a carousel of friendly places for new users"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#Measuring Editor Collaborativeness With Economic Modelling\n",
"\n",
"##Max Klein [@notconfusing](https://twitter.com/notconfusing)\n",
"##Wikimania 2014"
]
}
],
"metadata": {}
}
]
}