{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Desktop Vector Skin Version User Preferences" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "[Task](https://phabricator.wikimedia.org/T260149)\n", "\n", "Pending resolution of the identified PrefUpdate bugs, I reviewed the MediaWiki [user_properties table](https://www.mediawiki.org/wiki/Manual:User_properties_table) to determine the total number of users with each Vector skin preference set on each of the test wikis. Note: Unlike PrefUpdate, this table does not record every change to a user's preference; it stores only the current non-default state of the user's preference. As a result, this data may include users that enabled and disabled their Vector skin version preference multiple times.\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Contents\n", "1. [Calculate opt out rate among registered users](#Calculate-opt-out-rate-among-all-registered-users)\n", "2. [Calculate opt out rate among active editors](#Calculate-opt-out-rate-among-active-editors)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Data Notes:\n", "\n", "This reflects all current non-default user preferences. If a user has not made any changes to their Vector skin version, there will be no record for that user in this table and their skin preference is the default (i.e. Modern for logged-in users on test wikis).\n", "* Only accounts for logged-in users.\n", "* Data reflects the current state and does not account for users that opted in and out multiple times since deployment.\n", "* Because the new Vector skin was deployed as the default setting for all logged-in users on the wikis below, we can assume that each value means the following:\n", " * Legacy: These are users that have currently opted out of the modern version.\n", " * Modern: These are users that opted out and then opted back in to the modern version. 
(Note: This number does not reflect the total number of users that are using the modern skin, only the users that changed their default preference.)\n", " * Unknown: A few VectorSkinVersion values were set to 0 (instead of 1 [legacy] or 2 [modern]). I need to investigate further what those values indicate." ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Remaining To Dos:\n", " - Investigate what `VectorSkinVersion` set to 0 means.\n", " - Look into options for putting this on a Superset dashboard.\n", " - Pending resolution of PrefUpdate bugs or a new schema, show the opt-out rate over time." ] },
{ "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "import datetime as dt\n", "\n", "from wmfdata import hive, mariadb" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Calculate opt out rate among all registered users" ] },
{ "cell_type": "code", "execution_count": 139, "metadata": {}, "outputs": [], "source": [ "query = \"\"\" \n", "SELECT \n", " up_value as skin,\n", " COUNT(*) as users\n", "FROM user_properties\n", "WHERE \n", " up_property = 'VectorSkinVersion'\n", "GROUP BY up_value \"\"\"" ] },
{ "cell_type": "code", "execution_count": 150, "metadata": {}, "outputs": [], "source": [ "# define list of target wikis\n", "wikis = ['euwiki','frwiktionary', 'ptwikiversity', 'fawiki','hewiki', 'frwiki']" ] },
{ "cell_type": "code", "execution_count": 141, "metadata": {}, "outputs": [], "source": [ "# run the preference query against each target wiki and stack the results\n", "up_skin=list()\n", "for wiki in wikis:\n", "    prefs = mariadb.run(\n", "        query,\n", "        wiki\n", "    )\n", "    up_skin.append(prefs)\n", "\n", "skin= pd.concat(up_skin)" ] },
{ "cell_type": "code", "execution_count": 142, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of users for whom we have vector skin preferences set in the user_properties table: 226605\n" ] } ], "source": [ "skin_users = 
skin['users'].sum()\n", "print('Total number of users for whom we have vector skin preferences set in the user_properties table:' , skin_users)" ] }, { "cell_type": "code", "execution_count": 143, "metadata": {}, "outputs": [], "source": [ "skin_aliases = {\n", " \"0\":\"unknown\",\n", " \"1\":\"legacy\",\n", " \"2\":\"modern\"\n", "}\n", "\n", "skin= skin.replace({\"skin\": skin_aliases})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Number of users for each skin type overall" ] }, { "cell_type": "code", "execution_count": 144, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
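<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>users</th></tr>\n",
" <tr><th>skin</th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>legacy</th><td>128242</td></tr>\n",
" <tr><th>modern</th><td>98334</td></tr>\n",
" <tr><th>unknown</th><td>29</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>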
" ], "text/plain": [ " users\n", "skin \n", "legacy 128242\n", "modern 98334\n", "unknown 29" ] }, "execution_count": 144, "metadata": {}, "output_type": "execute_result" } ], "source": [ "user_skin=skin.groupby('skin').sum()\n", "user_skin" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Number of users for each skin type by wiki" ] }, { "cell_type": "code", "execution_count": 145, "metadata": {}, "outputs": [], "source": [ "#List of wikis to correspond to data values \n", "wikis_list = ['euwiki','euwiki', 'frwiktionary', 'frwiktionary', 'frwiktionary', 'ptwikiversity', 'ptwikiversity','fawiki', 'fawiki', 'fawiki','hewiki', 'hewiki', 'hewiki', 'frwiki', 'frwiki', 'frwiki']" ] }, { "cell_type": "code", "execution_count": 146, "metadata": {}, "outputs": [], "source": [ "skin['wiki'] = wikis_list" ] }, { "cell_type": "code", "execution_count": 148, "metadata": {}, "outputs": [], "source": [ "user_skin_bywiki=pd.pivot_table(skin, index=['wiki','skin'],values=['users'],aggfunc=np.sum)" ] }, { "cell_type": "code", "execution_count": 149, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th></th><th>users</th></tr>\n",
" <tr><th>wiki</th><th>skin</th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th rowspan=\"2\" valign=\"top\">euwiki</th><th>legacy</th><td>2067</td></tr>\n",
" <tr><th>modern</th><td>1852</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">fawiki</th><th>legacy</th><td>21442</td></tr>\n",
" <tr><th>modern</th><td>22046</td></tr>\n",
" <tr><th>unknown</th><td>2</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">frwiki</th><th>legacy</th><td>71535</td></tr>\n",
" <tr><th>modern</th><td>49185</td></tr>\n",
" <tr><th>unknown</th><td>19</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">frwiktionary</th><th>legacy</th><td>4684</td></tr>\n",
" <tr><th>modern</th><td>3899</td></tr>\n",
" <tr><th>unknown</th><td>6</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">hewiki</th><th>legacy</th><td>28034</td></tr>\n",
" <tr><th>modern</th><td>20890</td></tr>\n",
" <tr><th>unknown</th><td>2</td></tr>\n",
" <tr><th rowspan=\"2\" valign=\"top\">ptwikiversity</th><th>legacy</th><td>480</td></tr>\n",
" <tr><th>modern</th><td>462</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " users\n", "wiki skin \n", "euwiki legacy 2067\n", " modern 1852\n", "fawiki legacy 21442\n", " modern 22046\n", " unknown 2\n", "frwiki legacy 71535\n", " modern 49185\n", " unknown 19\n", "frwiktionary legacy 4684\n", " modern 3899\n", " unknown 6\n", "hewiki legacy 28034\n", " modern 20890\n", " unknown 2\n", "ptwikiversity legacy 480\n", " modern 462" ] }, "execution_count": 149, "metadata": {}, "output_type": "execute_result" } ], "source": [ "user_skin_bywiki" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Total Number of Registered Users on Test Wikis" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "To estimate the opt-out rate, we can use the total number of registered users on each test wiki, taken from the [MediaWiki user table](https://www.mediawiki.org/wiki/Manual:User_table)." ] },
{ "cell_type": "code", "execution_count": 155, "metadata": {}, "outputs": [], "source": [ "# collect total number of users on each wiki\n", "\n", "query = \"\"\" \n", "SELECT \n", " COUNT(DISTINCT user_id) AS num_users\n", "FROM user\"\"\"\n" ] },
{ "cell_type": "code", "execution_count": 156, "metadata": {}, "outputs": [], "source": [ "user_count = mariadb.run(commands = query, dbs = wikis, format=\"pandas\")" ] },
{ "cell_type": "code", "execution_count": 157, "metadata": {}, "outputs": [], "source": [ "user_count['wiki'] = wikis" ] },
{ "cell_type": "code", "execution_count": 158, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>num_users</th><th>wiki</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>115786</td><td>euwiki</td></tr>\n",
" <tr><th>1</th><td>291868</td><td>frwiktionary</td></tr>\n",
" <tr><th>2</th><td>29455</td><td>ptwikiversity</td></tr>\n",
" <tr><th>3</th><td>963101</td><td>fawiki</td></tr>\n",
" <tr><th>4</th><td>685306</td><td>hewiki</td></tr>\n",
" <tr><th>5</th><td>3905482</td><td>frwiki</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " num_users wiki\n", "0 115786 euwiki\n", "1 291868 frwiktionary\n", "2 29455 ptwikiversity\n", "3 963101 fawiki\n", "4 685306 hewiki\n", "5 3905482 frwiki" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "user_count" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Opt Out Rate for Registered Users\n", "\n", "The opt-out rate was calculated by dividing the number of users whose Vector version preference is set to 'legacy' by the total number of registered users on the wiki." ] },
{ "cell_type": "code", "execution_count": 166, "metadata": {}, "outputs": [], "source": [ "# Create list of legacy users - these are all users that opted out, assuming modern is the default\n", "\n", "legacy_users = skin[skin['skin']=='legacy']\n", "# rename columns\n", "\n", "legacy_users.columns = ['skin', 'num_legacy_users', 'wiki']\n" ] },
{ "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# join to user_count table to obtain opt-out rate for each wiki\n", "\n", "opt_out_rate = legacy_users.merge(user_count, on = 'wiki')\n" ] },
{ "cell_type": "code", "execution_count": 174, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>skin</th><th>num_legacy_users</th><th>wiki</th><th>num_users</th><th>opt_out_rate</th><th>pct_opt_out_rate</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>legacy</td><td>2067</td><td>euwiki</td><td>115786</td><td>0.017852</td><td>1.785190</td></tr>\n",
" <tr><th>1</th><td>legacy</td><td>4684</td><td>frwiktionary</td><td>291868</td><td>0.016048</td><td>1.604835</td></tr>\n",
" <tr><th>2</th><td>legacy</td><td>480</td><td>ptwikiversity</td><td>29455</td><td>0.016296</td><td>1.629604</td></tr>\n",
" <tr><th>3</th><td>legacy</td><td>21442</td><td>fawiki</td><td>963101</td><td>0.022264</td><td>2.226350</td></tr>\n",
" <tr><th>4</th><td>legacy</td><td>28034</td><td>hewiki</td><td>685306</td><td>0.040907</td><td>4.090727</td></tr>\n",
" <tr><th>5</th><td>legacy</td><td>71535</td><td>frwiki</td><td>3905482</td><td>0.018317</td><td>1.831656</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " skin num_legacy_users wiki num_users opt_out_rate \\\n", "0 legacy 2067 euwiki 115786 0.017852 \n", "1 legacy 4684 frwiktionary 291868 0.016048 \n", "2 legacy 480 ptwikiversity 29455 0.016296 \n", "3 legacy 21442 fawiki 963101 0.022264 \n", "4 legacy 28034 hewiki 685306 0.040907 \n", "5 legacy 71535 frwiki 3905482 0.018317 \n", "\n", " pct_opt_out_rate \n", "0 1.785190 \n", "1 1.604835 \n", "2 1.629604 \n", "3 2.226350 \n", "4 4.090727 \n", "5 1.831656 " ] }, "execution_count": 174, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Calculate opt-out rate as a fraction and as a percentage\n", "\n", "opt_out_rate['opt_out_rate'] = opt_out_rate['num_legacy_users'] / opt_out_rate['num_users']\n", "opt_out_rate['pct_opt_out_rate'] = opt_out_rate['num_legacy_users'] / opt_out_rate['num_users'] * 100\n", "\n", "opt_out_rate" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Calculate opt out rate among active editors" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "I reviewed the opt-out rate among active editors (users that had 5 or more content edits overall in the last year, from September 2019 to September 2020). This was calculated by finding the percentage of active editors on each wiki (obtained using data from the [mediawiki history table](https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Edits/MediaWiki_history)) that have the `VectorSkinVersion` preference set to legacy in the [user properties table](https://www.mediawiki.org/wiki/Manual:User_properties_table/en). 
\n", "\n", "Since the modern Vector version was deployed as the default on all of the test wikis in this analysis, it was assumed that any user with a non-default preference recorded as legacy has opted out.\n" ] },
{ "cell_type": "code", "execution_count": 172, "metadata": {}, "outputs": [], "source": [ "HIVE_SNAPSHOT = \"2020-09\"\n", "START_OF_DATA = \"2019-09-01\"\n", "END_OF_DATA = \"2020-10-01\"" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Collect number of active users" ] },
{ "cell_type": "code", "execution_count": 173, "metadata": {}, "outputs": [], "source": [ "# all active editors from the past year\n", "\n", "active_editor_query = \"\"\"\n", "\n", "WITH yr_proj_edits as (\n", " select\n", " event_user_text as user,\n", " event_user_id as user_id,\n", " wiki_db as proj,\n", " sum(if(wiki_db = \"wikidatawiki\", 0.1, 1)) as content_edits,\n", " max(event_timestamp) as latest_edit\n", " from wmf.mediawiki_history\n", " where\n", " -- review target wikis\n", " wiki_db IN ('euwiki','frwiktionary', 'ptwikiversity', 'fawiki','hewiki', 'frwiki') and\n", " -- REGISTERED\n", " event_user_is_anonymous = false and\n", " \n", " -- NON-BOT\n", " size(event_user_is_bot_by) = 0 and\n", " not array_contains(event_user_groups, \"bot\") and\n", " \n", " -- CONTENT EDITS\n", " event_entity = \"revision\" and\n", " event_type = \"create\" and\n", " page_namespace_is_content = true and\n", " \n", " -- FROM THE LAST YEAR\n", " event_timestamp >= \"{START_OF_DATA}\" and event_timestamp < \"{END_OF_DATA}\" and\n", " \n", " -- FROM THE LATEST SNAPSHOT\n", " snapshot = \"{hive_snapshot}\"\n", " \n", " -- PER USER, PER WIKI\n", " group by event_user_text, event_user_id, wiki_db\n", ")\n", "\n", "-- FINAL SELECT OF\n", "select \n", " user as user_name,\n", " user_id as user_id,\n", " proj as wiki,\n", " global_edits\n", "\n", "from \n", "-- JOINED TO THEIR HOME WIKI AND GLOBAL EDITS\n", "(\n", " select\n", " user,\n", " user_id,\n", " proj,\n", " -- in the 
unlikely event that wikis are tied by edit count and latest edit, \n", " -- row_number() will break it somehow\n", " row_number() over (partition by user order by content_edits desc, latest_edit desc) as rank,\n", " sum(content_edits) over (partition by user) as global_edits\n", " from yr_proj_edits\n", ") yr_edits\n", "where\n", "rank = 1\n", "and global_edits>= 5\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 174, "metadata": {}, "outputs": [], "source": [ "active_editor = hive.run(\n", " active_editor_query.format(\n", " hive_snapshot = HIVE_SNAPSHOT,\n", " START_OF_DATA= START_OF_DATA,\n", " END_OF_DATA=END_OF_DATA\n", " )\n", ")" ] }, { "cell_type": "code", "execution_count": 176, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of editors for whom we will be checking vector skin preferences: user_id\n", "wiki \n", "euwiki 788\n", "fawiki 12421\n", "frwiki 35442\n", "frwiktionary 446\n", "hewiki 5958\n", "ptwikiversity 81\n" ] } ], "source": [ "#Total_active_ed = active_editor['user_id'].count()\n", "\n", "Total_active_ed = active_editor.groupby(['wiki'])[['user_id']].count()\n", "\n", "print('Total number of editors for whom we will be checking vector skin preferences:' , Total_active_ed)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Vector Skin Preferences By Active Users" ] }, { "cell_type": "code", "execution_count": 177, "metadata": {}, "outputs": [], "source": [ "#Querying user_properties for getting the skin preferences set by the active editors we got in the above query\n", "\n", "query='''\n", "SELECT \n", " up_value AS skin, \n", " COUNT(*) AS users\n", "FROM user_properties\n", "WHERE up_user in ({users})\n", "AND up_property = \"VectorSkinVersion\"\n", "GROUP BY up_value\n", "'''\n" ] }, { "cell_type": "code", "execution_count": 178, "metadata": {}, "outputs": [], "source": [ "#define list of target wikis\n", "wikis = ['euwiki','frwiktionary', 'ptwikiversity', 
'fawiki','hewiki', 'frwiki']\n" ] },
{ "cell_type": "code", "execution_count": 179, "metadata": {}, "outputs": [], "source": [ "# Loop through each wiki and count its active editors by skin preference\n", "\n", "up_skin=list()\n", "for wiki in wikis:\n", "    user_ids = active_editor[active_editor['wiki'] == wiki][\"user_id\"]\n", "    user_list = ','.join([str(u) for u in user_ids])\n", "    prefs = mariadb.run(\n", "        query.format(users=user_list),\n", "        wiki\n", "    )\n", "    up_skin.append(prefs)\n", "\n", "skin= pd.concat(up_skin)" ] },
{ "cell_type": "code", "execution_count": 180, "metadata": {}, "outputs": [], "source": [ "# List of wikis corresponding to the rows returned above, in order\n", "wikis_list = ['euwiki','euwiki', 'frwiktionary', 'frwiktionary', 'ptwikiversity', 'ptwikiversity','fawiki', 'fawiki', 'fawiki','hewiki', 'hewiki', 'frwiki', 'frwiki', 'frwiki']" ] },
{ "cell_type": "code", "execution_count": 181, "metadata": {}, "outputs": [], "source": [ "# add wiki column\n", "skin['wiki'] = wikis_list" ] },
{ "cell_type": "code", "execution_count": 182, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total number of users for whom we have vector skin preferences set in the user_properties table: users\n", "wiki \n", "euwiki 67\n", "fawiki 3316\n", "frwiki 6399\n", "frwiktionary 93\n", "hewiki 1439\n", "ptwikiversity 11\n" ] } ], "source": [ "# skin_users = skin['users'].sum()\n", "\n", "skin_users = skin.groupby(['wiki']).sum()\n", "\n", "print('Total number of users for whom we have vector skin preferences set in the user_properties table:' , skin_users)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "Note: A large number of active editors have no Vector skin preference recorded in the user_properties table, either because they are on the default 'Modern' skin or because their record was deleted from the user_properties table. For the analysis below, let's default them to 'Modern'." 
] }, { "cell_type": "code", "execution_count": 183, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>user_id</th></tr>\n",
" <tr><th>wiki</th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>euwiki</th><td>721</td></tr>\n",
" <tr><th>fawiki</th><td>9105</td></tr>\n",
" <tr><th>frwiki</th><td>29043</td></tr>\n",
" <tr><th>frwiktionary</th><td>353</td></tr>\n",
" <tr><th>hewiki</th><td>4519</td></tr>\n",
" <tr><th>ptwikiversity</th><td>70</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>
" ], "text/plain": [ " user_id\n", "wiki \n", "euwiki 721\n", "fawiki 9105\n", "frwiki 29043\n", "frwiktionary 353\n", "hewiki 4519\n", "ptwikiversity 70" ] }, "execution_count": 183, "metadata": {}, "output_type": "execute_result" } ], "source": [ "modern_users=np.subtract(Total_active_ed,skin_users)\n", "modern_users" ] }, { "cell_type": "code", "execution_count": 184, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>skin</th><th>users</th><th>wiki</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>modern</td><td>721</td><td>euwiki</td></tr>\n",
" <tr><th>1</th><td>modern</td><td>9105</td><td>fawiki</td></tr>\n",
" <tr><th>2</th><td>modern</td><td>29043</td><td>frwiki</td></tr>\n",
" <tr><th>3</th><td>modern</td><td>353</td><td>frwiktionary</td></tr>\n",
" <tr><th>4</th><td>modern</td><td>4519</td><td>hewiki</td></tr>\n",
" <tr><th>5</th><td>modern</td><td>70</td><td>ptwikiversity</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " skin users wiki\n", "0 modern 721 euwiki\n", "1 modern 9105 fawiki\n", "2 modern 29043 frwiki\n", "3 modern 353 frwiktionary\n", "4 modern 4519 hewiki\n", "5 modern 70 ptwikiversity" ] }, "execution_count": 184, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# create data frame of modern users (counts copied from the modern_users output above)\n", "modern_users_df = pd.DataFrame([['modern', 721, 'euwiki'], ['modern', 9105, 'fawiki'], \n", " ['modern', 29043, 'frwiki'],['modern', 353, 'frwiktionary'], ['modern', 4519, 'hewiki'],\n", " ['modern', 70, 'ptwikiversity']],\n", " columns=['skin','users','wiki'])\n", "modern_users_df\n" ] },
{ "cell_type": "code", "execution_count": 185, "metadata": {}, "outputs": [], "source": [ "# Define skin type for each property value\n", "skin_aliases = {\n", " \"0\":\"unknown\",\n", " \"1\":\"legacy\",\n", " \"2\":\"modern\"\n", "}\n", "\n", "skin= skin.replace({\"skin\": skin_aliases})" ] },
{ "cell_type": "code", "execution_count": 186, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>skin</th><th>users</th><th>wiki</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>legacy</td><td>40</td><td>euwiki</td></tr>\n",
" <tr><th>1</th><td>modern</td><td>27</td><td>euwiki</td></tr>\n",
" <tr><th>2</th><td>legacy</td><td>52</td><td>frwiktionary</td></tr>\n",
" <tr><th>3</th><td>modern</td><td>41</td><td>frwiktionary</td></tr>\n",
" <tr><th>4</th><td>legacy</td><td>7</td><td>ptwikiversity</td></tr>\n",
" <tr><th>5</th><td>modern</td><td>4</td><td>ptwikiversity</td></tr>\n",
" <tr><th>6</th><td>unknown</td><td>1</td><td>fawiki</td></tr>\n",
" <tr><th>7</th><td>legacy</td><td>1874</td><td>fawiki</td></tr>\n",
" <tr><th>8</th><td>modern</td><td>1441</td><td>fawiki</td></tr>\n",
" <tr><th>9</th><td>legacy</td><td>863</td><td>hewiki</td></tr>\n",
" <tr><th>10</th><td>modern</td><td>576</td><td>hewiki</td></tr>\n",
" <tr><th>11</th><td>unknown</td><td>9</td><td>frwiki</td></tr>\n",
" <tr><th>12</th><td>legacy</td><td>4299</td><td>frwiki</td></tr>\n",
" <tr><th>13</th><td>modern</td><td>2091</td><td>frwiki</td></tr>\n",
" <tr><th>14</th><td>modern</td><td>721</td><td>euwiki</td></tr>\n",
" <tr><th>15</th><td>modern</td><td>9105</td><td>fawiki</td></tr>\n",
" <tr><th>16</th><td>modern</td><td>29043</td><td>frwiki</td></tr>\n",
" <tr><th>17</th><td>modern</td><td>353</td><td>frwiktionary</td></tr>\n",
" <tr><th>18</th><td>modern</td><td>4519</td><td>hewiki</td></tr>\n",
" <tr><th>19</th><td>modern</td><td>70</td><td>ptwikiversity</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " skin users wiki\n", "0 legacy 40 euwiki\n", "1 modern 27 euwiki\n", "2 legacy 52 frwiktionary\n", "3 modern 41 frwiktionary\n", "4 legacy 7 ptwikiversity\n", "5 modern 4 ptwikiversity\n", "6 unknown 1 fawiki\n", "7 legacy 1874 fawiki\n", "8 modern 1441 fawiki\n", "9 legacy 863 hewiki\n", "10 modern 576 hewiki\n", "11 unknown 9 frwiki\n", "12 legacy 4299 frwiki\n", "13 modern 2091 frwiki\n", "14 modern 721 euwiki\n", "15 modern 9105 fawiki\n", "16 modern 29043 frwiki\n", "17 modern 353 frwiktionary\n", "18 modern 4519 hewiki\n", "19 modern 70 ptwikiversity" ] }, "execution_count": 186, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# combine modern_users with skin table\n", "# (pd.concat replaces DataFrame.append, which newer pandas versions no longer provide)\n", "skin = pd.concat([skin, modern_users_df], ignore_index=True)\n", "skin" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Number of Active Editors for Each Skin Type" ] },
{ "cell_type": "code", "execution_count": 187, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
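<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>users</th></tr>\n",
" <tr><th>skin</th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>legacy</th><td>7135</td></tr>\n",
" <tr><th>modern</th><td>47991</td></tr>\n",
" <tr><th>unknown</th><td>10</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>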
" ], "text/plain": [ " users\n", "skin \n", "legacy 7135\n", "modern 47991\n", "unknown 10" ] }, "execution_count": 187, "metadata": {}, "output_type": "execute_result" } ], "source": [ "user_skin=skin.groupby('skin').sum()\n", "user_skin" ] }, { "cell_type": "code", "execution_count": 188, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th></th><th>users</th></tr>\n",
" <tr><th>wiki</th><th>skin</th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th rowspan=\"2\" valign=\"top\">euwiki</th><th>legacy</th><td>40</td></tr>\n",
" <tr><th>modern</th><td>748</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">fawiki</th><th>legacy</th><td>1874</td></tr>\n",
" <tr><th>modern</th><td>10546</td></tr>\n",
" <tr><th>unknown</th><td>1</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">frwiki</th><th>legacy</th><td>4299</td></tr>\n",
" <tr><th>modern</th><td>31134</td></tr>\n",
" <tr><th>unknown</th><td>9</td></tr>\n",
" <tr><th rowspan=\"2\" valign=\"top\">frwiktionary</th><th>legacy</th><td>52</td></tr>\n",
" <tr><th>modern</th><td>394</td></tr>\n",
" <tr><th rowspan=\"2\" valign=\"top\">hewiki</th><th>legacy</th><td>863</td></tr>\n",
" <tr><th>modern</th><td>5095</td></tr>\n",
" <tr><th rowspan=\"2\" valign=\"top\">ptwikiversity</th><th>legacy</th><td>7</td></tr>\n",
" <tr><th>modern</th><td>74</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " users\n", "wiki skin \n", "euwiki legacy 40\n", " modern 748\n", "fawiki legacy 1874\n", " modern 10546\n", " unknown 1\n", "frwiki legacy 4299\n", " modern 31134\n", " unknown 9\n", "frwiktionary legacy 52\n", " modern 394\n", "hewiki legacy 863\n", " modern 5095\n", "ptwikiversity legacy 7\n", " modern 74" ] }, "execution_count": 188, "metadata": {}, "output_type": "execute_result" } ], "source": [ "user_skin_bywiki=pd.pivot_table(skin, index=['wiki','skin'],values=['users'],aggfunc=np.sum)\n", "user_skin_bywiki" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Percentage of Active Editors for Each Skin Type" ] },
{ "cell_type": "code", "execution_count": 189, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
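<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>users</th></tr>\n",
" <tr><th>skin</th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>modern</th><td>87.0%</td></tr>\n",
" <tr><th>legacy</th><td>12.9%</td></tr>\n",
" <tr><th>unknown</th><td>0.0%</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>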
" ], "text/plain": [ " users\n", "skin \n", "modern 87.0%\n", "legacy 12.9%\n", "unknown 0.0%" ] }, "execution_count": 189, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# overall\n", "pct_user_skin=(100. * user_skin / user_skin.sum()).round(1).astype(str) + '%'\n", "pct_user_skin.sort_values(by=['users'],ascending=False)" ] }, { "cell_type": "code", "execution_count": 190, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th></th><th>users</th></tr>\n",
" <tr><th>wiki</th><th>skin</th><th></th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th rowspan=\"2\" valign=\"top\">euwiki</th><th>legacy</th><td>5.076142</td></tr>\n",
" <tr><th>modern</th><td>94.923858</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">fawiki</th><th>legacy</th><td>15.087352</td></tr>\n",
" <tr><th>modern</th><td>84.904597</td></tr>\n",
" <tr><th>unknown</th><td>0.008051</td></tr>\n",
" <tr><th rowspan=\"3\" valign=\"top\">frwiki</th><th>legacy</th><td>12.129677</td></tr>\n",
" <tr><th>modern</th><td>87.844930</td></tr>\n",
" <tr><th>unknown</th><td>0.025394</td></tr>\n",
" <tr><th rowspan=\"2\" valign=\"top\">frwiktionary</th><th>legacy</th><td>11.659193</td></tr>\n",
" <tr><th>modern</th><td>88.340807</td></tr>\n",
" <tr><th rowspan=\"2\" valign=\"top\">hewiki</th><th>legacy</th><td>14.484726</td></tr>\n",
" <tr><th>modern</th><td>85.515274</td></tr>\n",
" <tr><th rowspan=\"2\" valign=\"top\">ptwikiversity</th><th>legacy</th><td>8.641975</td></tr>\n",
" <tr><th>modern</th><td>91.358025</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ " users\n", "wiki skin \n", "euwiki legacy 5.076142\n", " modern 94.923858\n", "fawiki legacy 15.087352\n", " modern 84.904597\n", " unknown 0.008051\n", "frwiki legacy 12.129677\n", " modern 87.844930\n", " unknown 0.025394\n", "frwiktionary legacy 11.659193\n", " modern 88.340807\n", "hewiki legacy 14.484726\n", " modern 85.515274\n", "ptwikiversity legacy 8.641975\n", " modern 91.358025" ] }, "execution_count": 190, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# by target wiki: divide each (wiki, skin) count by that wiki's total\n", "\n", "\n", "pct_user_skin_bywiki = user_skin_bywiki.groupby(['wiki', 'skin']).agg({'users': 'sum'})\n", "wiki_totals = user_skin_bywiki.groupby(['wiki']).agg({'users': 'sum'})\n", "pct_user_skin_bywiki.div(wiki_totals, level='wiki') * 100\n" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "The percentage of legacy users listed above for each wiki reflects the opt-out rate, because the modern Vector skin was presented as the default on all of these wikis.\n", "\n", "The opt-out rates among active editors on each target wiki are still well below 40%. Persian Wikipedia (fawiki) currently has the highest opt-out rate among active editors (15.1%), while Basque Wikipedia (euwiki) has the lowest (5.1%)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }