{ "metadata": { "name": "", "signature": "sha256:bcd7a3ed694c62256fad87d40724fcd1a08a575d9477792087f7053263a14507" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Measuring Editor Collaborativeness With Economic Modelling\n", "\n", "##Max Klein [@notconfusing](https://twitter.com/notconfusing)\n", "##Wikimania 2014" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Audience Poll\n", "\n", "+ How many people here are:\n", " + Wikipedia researchers?\n", " + familiar with network/graph theory?\n", " + sort of familiar with PageRank?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#New Editors\n", "+ Leave when they encounter uncooperative Wikipedians\n", "" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Suggestions Need Input\n", "\n", "+ I wanted to make an input-less suggester\n", " + So new editors can browse it." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#What To Show New Editors?\n", "\n", "\n", "+ Can we fit them into the more functional parts of Wikipedia?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Collaboration\n", "\n", "\n", "+ What is collaboration on a Wikipedia page?\n", "+ How do you measure it?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Collaboration\n", "\n", "\n", "+ How does more contributing-editor experience affect article quality?\n", " + More experience is not necessarily better.\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Collaboration Flipside\n", "\n", "+ File this away for later\n", " + How does editing more articles affect editor expertise?\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Economic modelling?\n", "\n", "\n", "+ These two economics papers got me thinking:\n", "1. [The building blocks of economic complexity. Hidalgo & Hausmann](http://www.pnas.org/content/106/26/10570.full)\n", "2. [A Network Analysis of Countries\u2019 Export Flows: Firm Grounds for the Building Blocks of the Economy](http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0047278)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Principle\n", "\n", "\n", "+ They both use a __network science__ algorithm on a _bipartite_ graph to __rank countries__ by economic performance." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Key Insight\n", "\n", "+ Lower GDPs\n", " + Just Agriculture\n", " + __Ubiquitous products only__\n", "+ Switzerland\n", " + Agriculture & Watches\n", " + __Ubiquitous and Rare products__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#So what?\n", "\n", "\n", "+ Infer the GDP rankings of the world economy just by knowing\n", " + Which countries \n", " + export which products\n", " + not even the quantities of the exports" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#It's notconfusing\n", "\n", "\n", "+ \"Unlike laws and sausages, those who like Wikis and Tofu should inquire into how they are being made.\"\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Bipartite Network \n", "\n", "+ A bipartite network is a graph with two distinct types of nodes, where edges only link nodes of different types.\n", "+ In this case, countries and products.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Basically, it's the PageRank Algorithm\n", "\n", "\n", "+ __Except__ we have __two node-types__\n", "+ And an __extra variable__ for the importance of highly connected nodes\n", " + I'll explain more later" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Lay terms\n", "\n", "\n", "+ If you know __who__ exports __what__\n", " + Then you can rank countries (in economics)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Translations\n", "\n", "+ Instead of \n", " + __countries__ exporting __products__\n", "+ What about\n", " + __editors__ writing __articles__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Translations\n", "\n", "+ 
Instead of \n", " + __rich countries__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Translations\n", "\n", "+ Instead of \n", " + __rich countries__\n", "+ What about\n", " + __super users__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Translations\n", "\n", "+ Instead of \n", " + __ubiquitous products__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Translations\n", "\n", "+ Instead of \n", " + __ubiquitous products__\n", "+ What about\n", " + __highly edited articles__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Translations\n", "\n", "+ Instead of \n", " + __global economy__\n", "+ What about\n", " + __a Wikipedia category__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Editor Article Matrix\n", "+ It's Triangular \n", " + the power users are editing most of the articles in the category. 
\n", "![Feminist Writers](paper/Figures/Category_Feminist_writerstriangle_matrix_corrected.png)\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Iterative Algorithm\n", "\n", "+ A nonmathematical explanation:\n", " + Imagine everyone in the room starts with \u00a31\n", " + Distribute your money evenly to all your friends\n", " + In round 2, some people may have more or less than \u00a31\n", " + but again distribute all your money evenly to all your friends.\n", " + Repeat over and over again.\n", " + Eventually the amounts converge.\n", "\n", "http://www.scottaaronson.com/blog/?p=1820\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Iterative Algorithm - One More Variable\n", "\n", "\n", "+ Same scenario as above except:\n", " + You don't distribute your money evenly\n", " + You can give your popular friends a __disproportionately larger percentage__:\n", " + __or__ disproportionately __less__." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Iterative Algorithm - Notation\n", "\n", "\n", "+ In this experiment those shares are controlled by the \n", " + __$\\alpha$__ (article popularity exponent) and \n", " + __$\\beta$__ (editor portfolio size exponent) levels." 
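] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Iterative Algorithm - In Symbols\n", "\n", "+ A hedged sketch of the money-passing rounds, modelled on the biased random walk in the second economics paper. The symbols $M$, $k$, $G$ and $w$ and the sign convention are assumptions for illustration, not taken from these slides.\n", "+ $M_{ea} = 1$ if editor $e$ edited article $a$, else $0$; $k_e = \\sum_a M_{ea}$ is the editor's portfolio size and $k_a = \\sum_e M_{ea}$ is the article's popularity.\n", "\n", "$$G_{ea}(\\beta) = \\frac{M_{ea}\\,k_e^{-\\beta}}{\\sum_{e'} M_{e'a}\\,k_{e'}^{-\\beta}}, \\qquad G_{ae}(\\alpha) = \\frac{M_{ea}\\,k_a^{-\\alpha}}{\\sum_{a'} M_{ea'}\\,k_{a'}^{-\\alpha}}$$\n", "\n", "$$w_e^{(n+1)} = \\sum_a G_{ea}(\\beta)\\,w_a^{(n)}, \\qquad w_a^{(n+1)} = \\sum_e G_{ae}(\\alpha)\\,w_e^{(n)}$$\n", "\n", "+ With $\\alpha = \\beta = 0$ every neighbour gets an equal share, as in the even-split rounds; under this assumed convention a positive exponent gives highly connected nodes a smaller share and a negative exponent a larger one." 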
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Editors rise and fall over time\n", "![convergence](paper/Figures/fem_editors_iter_converge.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#End Result of Algorithm\n", "\n", "+ A ranking for Editors\n", "+ A ranking for Articles" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Exogenous Rankings\n", "\n", "+ Getting unrelated metrics for:\n", " + Editors\n", " + Articles" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Exogenous Editor Rankings\n", "+ Edit count bad\n", "+ Use @halfak and @staeiou [__\"Labour Hours\"__](http://www-users.cs.umn.edu/~halfak/publications/Using_Edit_Sessions_to_Measure_Participation_in_Wikipedia/geiger13using-preprint.pdf)\n", " + Labour Hours: Sum of Edit Sessions\n", " + Edit Session: The __start__ and __end__ times of __all__ the __edits__ that occur __within 1 hour__ of another edit. " ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Exogenous Article Rankings\n", "\n", "+ Mix of:\n", " 1. ratio of mark-up to readable text\n", " 1. number of headings\n", " 1. article length \n", " 1. citations per article length \n", " 1. outgoing intrawiki links." 
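] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Labour Hours - In Symbols\n", "\n", "+ A rough sketch of the Labour Hours definition from the Exogenous Editor Rankings slide. The symbols $t_i$, $S$ and $H$ are notation assumed here, and the linked paper may account for session length differently.\n", "+ Sort one editor's edit times as $t_1 \\le t_2 \\le \\dots \\le t_n$ and start a new session whenever the gap $t_{i+1} - t_i$ exceeds 1 hour, giving a set of sessions $S$.\n", "\n", "$$H = \\sum_{s \\in S} \\left( t_{\\mathrm{end}}(s) - t_{\\mathrm{start}}(s) \\right)$$\n", "\n", "+ An edit with no other edit within 1 hour forms a zero-length session and adds nothing to $H$ in this sketch." 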
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Calibration\n", "\n", "\n", "+ Find the values of __$\\alpha$__ and __$\\beta$__ which __maximize__:\n", " + The rank __correlation between model and exogenous__ rankings" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Calibration on Feminist Writers\n", "\n", "![](paper/Figures/contour_fem_combined.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#High Correlations\n", "\n", "\n", "+ We find correlations around \n", " + .6 to .9\n", "+ Even __better than the Economics GDP papers__, at around\n", " + .4" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Snapshotting\n", "+ Took 13 snapshots of each category\n", "\n", "![](paper/Figures/cumulative_snapshots_Feminist_Writers_thirteen.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Rank Accuracy\n", "+ This really works...\n", "+ Increases over time\n", "![](paper/Figures/rho_combined.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Most collaborative\n", "+ Question: in which category do power editors improve article quality?\n", " 1. American male novelists\n", " 1. 2013 films\n", " 1. American women novelists\n", " 1. Nobel Peace Prize laureates\n", " 1. Sexual acts\n", " 1. Economic theories\n", " 1. Feminist writers\n", " 1. Yoga\n", " 1. Military history of the US\n", " 1. Counterculture festivals\n", " 1. Computability theory\n", " 1. Bicycle parts" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Most collaborative\n", "+ Question: in which category do power editors improve article quality?\n", " 1. 
__Military history of the US__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Least collaborative\n", "+ Question: in which category do power editors hurt article quality?\n", " 1. American male novelists\n", " 1. 2013 films\n", " 1. American women novelists\n", " 1. Nobel Peace Prize laureates\n", " 1. Sexual acts\n", " 1. Economic theories\n", " 1. Feminist writers\n", " 1. Yoga\n", " 1. Military history of the US\n", " 1. Counterculture festivals\n", " 1. Computability theory\n", " 1. Bicycle parts" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Least collaborative\n", "+ Question: in which category do power editors hurt article quality?\n", " 1. __Sexual acts__" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Full Category Rankings\n", "![](paper/Figures/beta_ranks.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Edit Count or Touches\n", "![](paper/Figures/bin_comp.png)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Forest Not Trees\n", "\n", "\n", "+ If you accept this $\\beta$ measure as a collaborativeness measure, how can we use it?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Detect dysfunction\n", "\n", "\n", "\n", "+ For learning\n", " + Arguing is not necessarily bad.\n", "+ For intervention?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Detect Where The Wiki is Working\n", "\n", "\n", "+ At least where your time invested relates to article quality\n", " + Even superlinearly so\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#A Potential Use\n", "\n", "\n", "+ Make a carousel of friendly places for new users" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#Measuring Editor Collaborativeness With Economic Modelling\n", "\n", "##Max Klein [@notconfusing](https://twitter.com/notconfusing)\n", "##Wikimania 2014" ] } ], "metadata": {} } ] }