{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Automatic discovery of community organizations for long-term package maintenance" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python 3.7.4 (default, Jul 8 2019, 18:31:06) \n", "[GCC 7.4.0]\n", "IPython 7.6.1\n", "\n", "Libraries:\n", "\n", "matplotlib 3.1.1\n", "numpy 1.17.0\n", "pandas 0.25.0\n", "requests 2.22.0\n" ] } ], "source": [ "%matplotlib inline\n", "\n", "import sys\n", "print(f'Python {sys.version}')\n", "\n", "import IPython\n", "print(f'IPython {IPython.__version__}')\n", "\n", "print('\\nLibraries:\\n')\n", "\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "print(f'matplotlib {matplotlib.__version__}')\n", "\n", "import numpy as np\n", "print(f'numpy {np.__version__}')\n", "\n", "import pandas as pd\n", "from pandas.plotting import register_matplotlib_converters\n", "print(f'pandas {pd.__version__}')\n", "\n", "import requests\n", "print(f'requests {requests.__version__}')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "api_token = ''\n", "\n", "def send_rest_request(url):\n", "    headers = {'Authorization': f'token {api_token}'}\n", "    r = requests.get(url=url, headers=headers)\n", "    r.raise_for_status() # Abort if unsuccessful request\n", "    return r.json()\n", "\n", "def send_graphql_request(query, variables):\n", "    headers = {'Authorization': f'token {api_token}'}\n", "    url = 'https://api.github.com/graphql'\n", "    json = {'query':query, 'variables':variables}\n", "    r = requests.post(url=url, json=json, headers=headers)\n", "    r.raise_for_status() # Abort if unsuccessful request\n", "    return r.json()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Phase 1: get a preliminary list of organizations\n", "\n", "GitHub provides only two APIs to list organizations: a REST endpoint that returns the full 
list, but requires many requests, because there are more than 2,000,000 organizations on GitHub (https://developer.github.com/changes/2015-06-17-organizations-endpoint/) and each of these requests returns only organization logins and descriptions, nothing more; or the Search API, which is limited to browsing 1000 results per query.\n", "\n", "We choose the second option to limit the number of requests, but this forces us to find ways of querying for no more than 1000 results at a time, using the limited filters that search queries provide.\n", "\n", "Our first restriction is to limit ourselves to organizations with at least 5 public repositories.\n", "We are aware that this is an arbitrary restriction that will exclude community organizations that are just starting and have not yet reached that number.\n", "\n", "Our second restriction is to search by keywords.\n", "We list as many keywords as we could think of that could appear in the names or descriptions of this type of organization:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "75" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keywords = [\n", " # To add next time: 'addon', 'addons',\n", " 'app', 'apps', 'application', 'applications',\n", " 'care', 'caring',\n", " 'collab', 'collaboration', 'collaborative',\n", " 'collection', 'collective',\n", " 'common', 'commons',\n", " 'community',\n", " 'component', 'components',\n", " # To add next time: 'contribs'\n", " 'contrib', 'contribution', 'contributions', 'contributing',\n", " 'distribute', 'distribution', 'distributions',\n", " 'ecosystem', 'ecosystems',\n", " 'extension', 'extensions',\n", " 'gather',\n", " 'give', 'giving',\n", " 'group',\n", " 'help', 'helper', 'helpers',\n", " 'library', 'libraries',\n", " 'maintain', 'maintainer', 'maintainers', 'maintenance', 'maintaining',\n", " 'member', 'members',\n", " 
'module', 'modules',\n", " 'open source',\n", " 'org', 'organization',\n", " 'package', 'packages',\n", " 'participate', 'participant', 'participants', 'participation',\n", " 'people',\n", " 'place',\n", " 'plugin', 'plugins',\n", " 'projects',\n", " # Not project singular because that would give too many results\n", " # and this is not about organizations focused on a single project\n", " 'quality',\n", " 'repository', 'repositories',\n", " 'reuse', 'reusable',\n", " 'share', 'shared', 'sharing',\n", " 'support', 'supporter', 'supporters', 'supporting',\n", " 'together',\n", " # To add next time: tool, tools\n", " 'unofficial',\n", " 'user', 'users'\n", "]\n", "len(keywords)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For some keywords, this still gives too many results so we additionally partition using language filters:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "language_filters = [\n", " 'language:JavaScript',\n", " 'language:Java',\n", " 'language:Python',\n", " 'language:PHP',\n", " 'language:HTML',\n", " 'language:C#',\n", " 'language:C++',\n", " 'language:C',\n", " 'language:CSS',\n", " '-language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS'\n", "]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "query = '''\n", "query searchOrganizations($query: String!,$cursor: String) {\n", " search(type:USER,query:$query, first: 50, after: $cursor) {\n", " userCount\n", " pageInfo {\n", " endCursor\n", " hasNextPage\n", " }\n", " nodes {\n", " ... 
on Organization {\n", " login\n", " name\n", " description\n", " websiteUrl\n", " membersWithRole {\n", " totalCount\n", " }\n", " repositories(first: 1, orderBy: {field: STARGAZERS, direction: DESC}) {\n", " totalCount\n", " nodes {\n", " stargazers {\n", " totalCount\n", " }\n", " assignableUsers {\n", " totalCount\n", " }\n", " }\n", " }\n", " }\n", " }\n", " }\n", "}\n", "'''" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "columns = [\n", " 'name',\n", " 'description',\n", " 'url',\n", " 'members', # Number of public members\n", " 'repositories', # Number of public repositories\n", " 'stars', # Number of stars of the most starred repository\n", " 'collaborators' # Number of assignable users of the most starred repository\n", "]\n", "\n", "keyword_columns = list(map(lambda keyword: f'keyword {keyword}', keywords))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values = pd.DataFrame(columns=columns + keyword_columns).astype({\n", " 'members': 'UInt32',\n", " 'repositories': 'UInt32',\n", " 'stars': 'UInt32',\n", " 'collaborators': 'UInt32'\n", "})" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def paged_query(keyword, language=''):\n", "    if keyword == 'repository' or keyword == 'user':\n", "        exclude = 'NOT aur-archive'\n", "    elif keyword == 'collaborative':\n", "        exclude = 'NOT GITenberg'\n", "    else:\n", "        exclude = ''\n", "    next_page = True\n", "    cursor = None\n", "    while next_page:\n", "        searchQuery = f'type:organization repos:>=5 {keyword} {exclude} {language}'\n", "        print(f'Search query: {searchQuery}')\n", "        json = send_graphql_request(\n", "            query,\n", "            {'query': searchQuery, 'cursor': cursor}\n", "        )\n", "        search_json = json['data']['search']\n", "        nb_results = search_json['userCount']\n", "        if nb_results > 1000:\n", "            raise ValueError('Query not restricted enough: more than 1000 results.')\n", "        page_info 
= search_json['pageInfo']\n", "        next_page = page_info['hasNextPage']\n", "        cursor = page_info['endCursor']\n", "        for node in search_json['nodes']:\n", "            # Index\n", "            login = node['login']\n", "            # Fields\n", "            name = node['name']\n", "            values.loc[login, 'name'] = name\n", "            values.loc[login, 'description'] = node['description']\n", "            values.loc[login, 'url'] = node['websiteUrl']\n", "            values.loc[login, 'members'] = node['membersWithRole']['totalCount']\n", "            repos_json = node['repositories']\n", "            repos_nb = repos_json['totalCount']\n", "            values.loc[login, 'repositories'] = repos_nb\n", "            if repos_nb > 0:\n", "                repo_json = repos_json['nodes'][0]\n", "                values.loc[login, 'stars'] = repo_json['stargazers']['totalCount']\n", "                values.loc[login, 'collaborators'] = repo_json['assignableUsers']['totalCount']\n", "            values.loc[login, f'keyword {keyword}'] = True" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Search query: type:organization repos:>=5 repository NOT aur-archive \n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:JavaScript\n", "Now fetched a total number of 27130 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Java\n", 
"Search query: type:organization repos:>=5 repository NOT aur-archive language:Java\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Java\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Java\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Java\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Java\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Java\n", "Now fetched a total number of 27337 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Python\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Python\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Python\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Python\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Python\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Python\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:Python\n", "Now fetched a total number of 27532 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:PHP\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:PHP\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:PHP\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:PHP\n", "Now fetched a total number of 27632 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:HTML\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:HTML\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:HTML\n", "Search query: type:organization repos:>=5 repository NOT 
aur-archive language:HTML\n", "Now fetched a total number of 27728 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C#\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C#\n", "Now fetched a total number of 27785 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C++\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C++\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C++\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C++\n", "Now fetched a total number of 27894 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:C\n", "Now fetched a total number of 27993 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive language:CSS\n", "Now fetched a total number of 28033 organizations.\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby 
-language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repository NOT aur-archive -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Now fetched a total number of 28390 organizations.\n", "Search query: type:organization repos:>=5 repositories \n", "Search query: type:organization repos:>=5 repositories 
language:JavaScript\n", "Search query: type:organization repos:>=5 repositories language:JavaScript\n", "Search query: type:organization repos:>=5 repositories language:JavaScript\n", "Search query: type:organization repos:>=5 repositories language:JavaScript\n", "Search query: type:organization repos:>=5 repositories language:JavaScript\n", "Now fetched a total number of 28401 organizations.\n", "Search query: type:organization repos:>=5 repositories language:Java\n", "Search query: type:organization repos:>=5 repositories language:Java\n", "Search query: type:organization repos:>=5 repositories language:Java\n", "Search query: type:organization repos:>=5 repositories language:Java\n", "Now fetched a total number of 28421 organizations.\n", "Search query: type:organization repos:>=5 repositories language:Python\n", "Search query: type:organization repos:>=5 repositories language:Python\n", "Search query: type:organization repos:>=5 repositories language:Python\n", "Search query: type:organization repos:>=5 repositories language:Python\n", "Search query: type:organization repos:>=5 repositories language:Python\n", "Now fetched a total number of 28437 organizations.\n", "Search query: type:organization repos:>=5 repositories language:PHP\n", "Search query: type:organization repos:>=5 repositories language:PHP\n", "Now fetched a total number of 28443 organizations.\n", "Search query: type:organization repos:>=5 repositories language:HTML\n", "Search query: type:organization repos:>=5 repositories language:HTML\n", "Now fetched a total number of 28447 organizations.\n", "Search query: type:organization repos:>=5 repositories language:C#\n", "Now fetched a total number of 28447 organizations.\n", "Search query: type:organization repos:>=5 repositories language:C++\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Search query: type:organization repos:>=5 repositories language:C++\n", "Now fetched a total number of 28453 organizations.\n", "Search query: 
type:organization repos:>=5 repositories language:C\n", "Search query: type:organization repos:>=5 repositories language:C\n", "Search query: type:organization repos:>=5 repositories language:C\n", "Now fetched a total number of 28457 organizations.\n", "Search query: type:organization repos:>=5 repositories language:CSS\n", "Now fetched a total number of 28457 organizations.\n", "Search query: type:organization repos:>=5 repositories -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repositories -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repositories -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repositories -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repositories -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 repositories -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Now fetched a total number of 28480 organizations.\n", "Search query: type:organization repos:>=5 reuse \n", "Search query: type:organization repos:>=5 reuse \n", "Now fetched a total number of 28506 organizations.\n", "Search query: type:organization repos:>=5 reusable \n", "Search query: type:organization repos:>=5 reusable \n", "Now fetched a total 
number of 28513 organizations.\n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Search query: type:organization repos:>=5 share \n", "Now fetched a total number of 28926 organizations.\n", "Search query: type:organization repos:>=5 shared \n", "Search query: type:organization repos:>=5 shared \n", "Search query: type:organization repos:>=5 shared \n", "Now fetched a total number of 28934 organizations.\n", "Search query: type:organization repos:>=5 sharing \n", "Search query: type:organization repos:>=5 sharing \n", "Search query: type:organization repos:>=5 sharing \n", "Search query: type:organization repos:>=5 sharing \n", "Search query: type:organization repos:>=5 sharing \n", "Now fetched a total number of 28946 organizations.\n", "Search query: type:organization repos:>=5 support \n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support 
language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Search query: type:organization repos:>=5 support language:JavaScript\n", "Now fetched a total number of 29667 organizations.\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Search query: type:organization repos:>=5 support language:Java\n", "Now fetched a total number of 29957 organizations.\n", "Search query: type:organization repos:>=5 support language:Python\n", "Search query: type:organization repos:>=5 support language:Python\n", "Search query: type:organization repos:>=5 support 
language:Python\n", "Search query: type:organization repos:>=5 support language:Python\n", "Search query: type:organization repos:>=5 support language:Python\n", "Search query: type:organization repos:>=5 support language:Python\n", "Search query: type:organization repos:>=5 support language:Python\n", "Search query: type:organization repos:>=5 support language:Python\n", "Search query: type:organization repos:>=5 support language:Python\n", "Now fetched a total number of 30246 organizations.\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Search query: type:organization repos:>=5 support language:PHP\n", "Now fetched a total number of 30700 organizations.\n", "Search query: type:organization repos:>=5 support language:HTML\n", "Search query: type:organization repos:>=5 support language:HTML\n", "Search query: type:organization repos:>=5 support language:HTML\n", "Search query: type:organization repos:>=5 support language:HTML\n", "Now fetched a total number of 30807 organizations.\n", "Search query: type:organization repos:>=5 support language:C#\n", "Search query: type:organization repos:>=5 support language:C#\n", "Search query: type:organization repos:>=5 support language:C#\n", "Search query: type:organization repos:>=5 support language:C#\n", "Now fetched a total 
number of 30919 organizations.\n", "Search query: type:organization repos:>=5 support language:C++\n", "Search query: type:organization repos:>=5 support language:C++\n", "Search query: type:organization repos:>=5 support language:C++\n", "Search query: type:organization repos:>=5 support language:C++\n", "Now fetched a total number of 31034 organizations.\n", "Search query: type:organization repos:>=5 support language:C\n", "Search query: type:organization repos:>=5 support language:C\n", "Search query: type:organization repos:>=5 support language:C\n", "Now fetched a total number of 31134 organizations.\n", "Search query: type:organization repos:>=5 support language:CSS\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Search query: type:organization repos:>=5 support language:CSS\n", "Search query: type:organization repos:>=5 support language:CSS\n", "Now fetched a total number of 31200 organizations.\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization 
repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby 
-language:CSS\n", "Search query: type:organization repos:>=5 support -language:JavaScript -language:Java -language:Python -language:PHP -language:HTML -language:C# -language:C++ -language:C -language:Ruby -language:CSS\n", "Now fetched a total number of 31751 organizations.\n", "Search query: type:organization repos:>=5 supporter \n", "Now fetched a total number of 31754 organizations.\n", "Search query: type:organization repos:>=5 supporters \n", "Now fetched a total number of 31754 organizations.\n", "Search query: type:organization repos:>=5 supporting \n", "Search query: type:organization repos:>=5 supporting \n", "Search query: type:organization repos:>=5 supporting \n", "Now fetched a total number of 31759 organizations.\n", "Search query: type:organization repos:>=5 together \n", "Search query: type:organization repos:>=5 together \n", "Search query: type:organization repos:>=5 together \n", "Search query: type:organization repos:>=5 together \n", "Search query: type:organization repos:>=5 together \n", "Search query: type:organization repos:>=5 together \n", "Search query: type:organization repos:>=5 together \n", "Search query: type:organization repos:>=5 together \n", "Now fetched a total number of 31941 organizations.\n", "Search query: type:organization repos:>=5 unofficial \n", "Search query: type:organization repos:>=5 unofficial \n", "Search query: type:organization repos:>=5 unofficial \n", "Search query: type:organization repos:>=5 unofficial \n", "Search query: type:organization repos:>=5 unofficial \n", "Now fetched a total number of 32043 organizations.\n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user 
NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Search query: type:organization repos:>=5 user NOT aur-archive \n", "Now fetched a total number of 32382 organizations.\n", "Search query: type:organization repos:>=5 users \n", "Search query: type:organization repos:>=5 users \n", "Search query: type:organization repos:>=5 users \n", "Search query: type:organization repos:>=5 users \n", "Search query: type:organization repos:>=5 users \n", "Search query: type:organization repos:>=5 users \n", "Search query: type:organization repos:>=5 users \n", "Search query: type:organization repos:>=5 users \n", "Now fetched a total number of 32434 organizations.\n" ] } ], "source": [ "for keyword in keywords[60:]:\n", "    try:\n", "        paged_query(keyword)\n", "        print(f'Now fetched a total number of {len(values)} organizations.')\n", "    except ValueError:\n", "        for language in language_filters:\n", "            paged_query(keyword, language)\n", "            print(f'Now fetched a total number of {len(values)} organizations.')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values.to_csv('community-organizations-phase-one.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Phase 2: filter the results and fetch more 
information\n", "\n", "### Filter\n", "\n", "We start with the organizations that we have fetched in phase 1.\n", "We have fetched more than 32,000 organizations, which is close to 15% of all GitHub organizations with at least 5 repositories." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/nix/store/l2drdy46nqd6kqqz3pv3hfmy4c64ixn9-python3.7-ipython-7.6.1/lib/python3.7/site-packages/IPython/core/interactiveshell.py:3057: DtypeWarning: Columns (8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63) have mixed types. Specify dtype option on import or set low_memory=False.\n", " interactivity=interactivity, compiler=compiler, result=result)\n" ] } ], "source": [ "values = pd.read_csv('community-organizations-phase-one.csv', index_col=0, dtype={\n", " 'members': 'UInt32',\n", " 'repositories': 'UInt32',\n", " 'stars': 'UInt32',\n", " 'collaborators': 'UInt32'\n", "})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Apparently, the search filters are not fully reliable: about 1% of our search results have fewer than 5 repositories (four of them even have zero repositories):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.011839427760991552" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(values[values['repositories'] < 5]) / len(values)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values = values[values['repositories'] >= 5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because members can decide to make their membership status private (and in fact this is even the default), the number of public members is only a lower bound on the actual number of members of an 
organization.\n", "To estimate an upper bound on organization membership, we have also retrieved the number of assignable users of the most starred repository.\n", "\n", "Assignable users are organization members with read access to the repository, or collaborators with write access specifically on this repository.\n", "In theory, it is possible for an organization member to not be an assignable user, if the organization owners have changed the default member permissions from \"read\" to \"none\".\n", "In this case, the number of public members of the organization could be larger than the number of assignable users on the most starred repository, but this situation is quite rare, representing less than 3% of our dataset:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.021029641185647426" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(values[values['members'] > values['collaborators']]) / len(values)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In most organizations, there are strictly more collaborators than public members (the median difference is 3):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3.0" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.median(values['collaborators'] - values['members'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most organizations are not community organizations.\n", "Community organizations (at least established ones) should have a strong membership.\n", "Thus we select organizations with at least 10 public members or at least 10 collaborators on the most starred repository.\n", "This represents about 23% of the remaining organizations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.23051482059282372" ] }, "execution_count": null, "metadata": {}, 
"output_type": "execute_result" } ], "source": [ "len(values[(values['members'] >= 10) | (values['collaborators'] >= 10)]) / len(values)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values = values[(values['members'] >= 10) | (values['collaborators'] >= 10)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Most organizations do not maintain any popular projects. Community organizations should host several popular projects. Stars are often used as a proxy for popularity on GitHub. This proxy is especially relevant for libraries, which are mainly targeted at other developers. We set an arbitrary, low threshold of 10 stars on the most starred project. This represents about 60% of the remaining organizations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.5929886302111532" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(values[values['stars'] >= 10]) / len(values)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values = values[values['stars'] >= 10]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4381" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(values)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fetch more information\n", "\n", "For each organization in the remaining list, we fetch the creation date of the organization, and the number of repositories that were created before this date, as an under-approximation of the number of transferred repositories.\n", "The GraphQL API allows us to batch requests (here, 40 organizations per query) and thus to send far fewer requests:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def build_graphql_query(imin):\n", " query = \"\"\"\n", " query {\n", " 
\"\"\"\n", " if imin + 40 < len(values):\n", " next_imin = imin + 40\n", " isup = next_imin\n", " else:\n", " next_imin = None\n", " isup = len(values)\n", " index = values.index[imin:isup]\n", " for i, owner in enumerate(index):\n", " query += \"\"\"\n", " request%d: organization(login: \"%s\") {\n", " createdAt\n", " }\n", " \"\"\" % (i, owner)\n", " query += \"\"\"\n", " }\n", " \"\"\"\n", " return query, index, next_imin\n", "\n", "def save_testorg_result(json, index):\n", " data = json['data']\n", " i = 0\n", " while f'request{i}' in data:\n", " result = data[f'request{i}']\n", " if result is None:\n", " print(f'Warning: {values.loc[index[i]].name} has been deleted')\n", " else:\n", " values.loc[index[i],'creation date'] = result['createdAt']\n", " i += 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Warning: wedeploy has been deleted\n", "Warning: surging-cloud has been deleted\n", "Warning: ruby-gnome2 has been deleted\n", "Warning: BloomSoftware has been deleted\n", "Warning: SchibstedSpain has been deleted\n", "imin: 4360\r" ] } ], "source": [ "imin = 0\n", "while imin is not None:\n", " sys.stdout.write(f'imin: {imin}\\r')\n", " sys.stdout.flush()\n", " query, index, imin = build_graphql_query(imin)\n", " json = send_graphql_request(query, {})\n", " save_testorg_result(json, index)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def build_graphql_query(imin):\n", " query = \"\"\"\n", " query {\n", " \"\"\"\n", " if imin + 40 < len(values):\n", " next_imin = imin + 40\n", " isup = next_imin\n", " else:\n", " next_imin = None\n", " isup = len(values)\n", " index = values.index[imin:isup]\n", " for i, owner in enumerate(index):\n", " query += \"\"\"\n", " request%d: search(query: \"user:%s created:<%s\", type: REPOSITORY) {\n", " repositoryCount\n", " }\n", " \"\"\" % (i, owner, values.loc[owner, 'creation date'])\n", 
" query += \"\"\"\n", " }\n", " \"\"\"\n", " return query, index, next_imin\n", "\n", "def save_testorg_result(json, index):\n", " data = json['data']\n", " i = 0\n", " while f'request{i}' in data:\n", " result = data[f'request{i}']\n", " values.loc[index[i],'transferred repositories'] = result['repositoryCount']\n", " i += 1" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "imin: 4360\r" ] } ], "source": [ "imin = 0\n", "while imin is not None:\n", " sys.stdout.write(f'imin: {imin}\\r')\n", " sys.stdout.flush()\n", " query, index, imin = build_graphql_query(imin)\n", " json = send_graphql_request(query, {})\n", " save_testorg_result(json, index)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values[columns + [\n", " 'creation date',\n", " 'transferred repositories'\n", "] + keyword_columns ].to_csv(\n", " 'community-organizations-phase-two.csv'\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Phase 3: browse through organizations with transferred repos" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "values = pd.read_csv('community-organizations-phase-two.csv', index_col=0, parse_dates=['creation date'], dtype={\n", " 'members': 'UInt32',\n", " 'repositories': 'UInt32',\n", " 'stars': 'UInt32',\n", " 'collaborators': 'UInt32'\n", "}).sort_values('transferred repositories', ascending=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Organizations with at least one transferred repository (that is, a repository created before the organization itself) represent about 35% of the remaining organizations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.35151791828349693" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(values[values['transferred repositories'] > 0]) / 
len(values)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And organizations with at least two transferred repositories from before their creation represent about 21% of the same organizations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.21410636840903904" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(values[values['transferred repositories'] > 1]) / len(values)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "938" ] }, "execution_count": null, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(values[values['transferred repositories'] > 1])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
| | name | description | url | members | repositories | stars | collaborators | creation date | transferred repositories | keyword app | ... | keyword shared | keyword sharing | keyword support | keyword supporter | keyword supporters | keyword supporting | keyword together | keyword unofficial | keyword user | keyword users |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| datadesk | Los Angeles Times Data Desk | Analysis, applications and automation from a t... | https://www.latimes.com | 8 | 184 | 313 | 27 | 2010-07-02 02:04:07+00:00 | 6.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| collective | Collective | Plone add-ons shared code repositories | https://collective.github.io | 268 | 1674 | 569 | 628 | 2010-08-13 00:04:43+00:00 | 7.0 | NaN | ... | True | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| uncopenweb | UNC Open Web Group | NaN | http://sites.google.com/site/uncopenweb/ | 11 | 23 | 15 | 2 | 2010-09-04 01:22:47+00:00 | 6.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| PerlDancer | PerlDancer | The Dancer Developers group | http://perldancer.org | 10 | 71 | 708 | 15 | 2010-09-21 12:27:49+00:00 | 2.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True |
| symphonists | Symphony Community | NaN | https://www.getsymphony.com | 12 | 106 | 47 | 13 | 2010-10-21 15:40:12+00:00 | 56.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| libtom | libtom | libtom projects | http://www.libtom.net | 3 | 7 | 859 | 22 | 2010-10-22 09:12:56+00:00 | 5.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| xcore | XCore open source project | NaN | github.xcore.com | 26 | 119 | 75 | 7 | 2011-01-13 14:16:30+00:00 | 3.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| silverstripe-archive | SilverStripe Archive | Archive of unsupported SilverStripe modules. I... | http://silverstripe.org | 10 | 71 | 72 | 11 | 2011-01-17 00:22:34+00:00 | 4.0 | NaN | ... | NaN | NaN | True | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| mapbox | Mapbox | Mapbox is the location data platform for mobil... | https://www.mapbox.com | 62 | 812 | 4700 | 458 | 2011-02-04 19:02:13+00:00 | 4.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| openstate | Open State Foundation | Open State Foundation promotes digital transpa... | https://openstate.eu | 18 | 107 | 23 | 13 | 2011-03-15 21:42:43+00:00 | 2.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| charlotte-ruby | Charlotte Ruby Group | Charlotte's local Ruby User Group | http://charlotteruby.org | 10 | 18 | 1277 | 7 | 2011-04-07 15:45:24+00:00 | 3.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | NaN |
| pusher | Pusher | Pusher makes communication and collaboration A... | https://pusher.com/ | 19 | 206 | 1488 | 54 | 2011-04-19 17:16:38+00:00 | 4.0 | True | ... | NaN | NaN | True | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| cul | Columbia University Libraries | NaN | http://library.columbia.edu | 5 | 168 | 20 | 16 | 2011-04-29 14:08:07+00:00 | 5.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| silexlabs | Silex Labs | Silex Labs is a foundation dedicated to helpin... | http://www.silexlabs.org/ | 6 | 57 | 688 | 13 | 2011-05-22 17:18:30+00:00 | 2.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| bbc | BBC | Open source code used on public facing service... | http://www.bbc.co.uk/opensource/ | 105 | 624 | 1002 | 2036 | 2011-06-04 01:31:11+00:00 | 7.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| haiku | Haiku | An open-source operating system that specifica... | https://www.haiku-os.org | 10 | 19 | 766 | 9 | 2011-06-18 03:22:05+00:00 | 2.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| XCSoar | XCSoar | ... the open-source glide computer | https://xcsoar.org/ | 7 | 9 | 134 | 11 | 2011-06-20 09:34:21+00:00 | 2.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Kozea | Kozea | We build open source software that you will love. | https://community.kozea.fr/ | 13 | 103 | 2921 | 39 | 2011-06-23 10:59:31+00:00 | 2.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| JetBrains | JetBrains | JetBrains open source projects. | https://www.jetbrains.com | 94 | 410 | 28677 | 94 | 2011-06-27 10:06:52+00:00 | 6.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Automattic | Automattic | We are passionate about making the web a bette... | https://automattic.com | 150 | 548 | 19214 | 892 | 2011-07-01 02:45:15+00:00 | 7.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| mantisbt-plugins | MantisBT Community Plugins | NaN | https://www.mantisbt.org | 25 | 89 | 160 | 54 | 2011-07-12 14:07:02+00:00 | 5.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| neo4j-contrib | Neo4j Contrib | Public, Open Source Contributions to the Neo4j... | http://neo4j.com/developer | 15 | 134 | 915 | 34 | 2011-07-14 23:15:43+00:00 | 10.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| sandstorm | sandstorm | sandstorm - building great web applications | https://sandstorm.de/blog.html | 1 | 84 | 40 | 11 | 2011-08-03 08:20:12+00:00 | 3.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| Kong | Kong | Next-Generation API Platform for Microservices... | https://konghq.com | 25 | 137 | 22961 | 99 | 2011-08-06 02:08:16+00:00 | 6.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| CloudStack-extras | Collection of additional tools that are useful... | NaN | NaN | 18 | 23 | 260 | 18 | 2011-08-26 14:40:35+00:00 | 8.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| hacsoc | HacSoc | The organization for all things done by the CW... | http://hacsoc.org/ | 18 | 42 | 16 | 25 | 2011-09-11 04:47:22+00:00 | 2.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| vim-jp | vim-jp | Vim community for Japanese developers and users | https://vim-jp.org/ | 47 | 33 | 404 | 116 | 2011-09-15 02:44:30+00:00 | 3.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | True | True |
| SitePen | SitePen | Modernizing Apps, Tools & Teams for the Enterp... | http://www.sitepen.com | 6 | 31 | 614 | 39 | 2011-09-21 08:39:10+00:00 | 3.0 | True | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| tikalk | Tikal Knowledge, Ltd. | NaN | www.tikalk.com | 5 | 195 | 745 | 99 | 2011-10-06 07:44:56+00:00 | 6.0 | NaN | ... | NaN | NaN | True | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| sbt | sbt | Community organization for all sbt plugin auth... | https://www.scala-sbt.org | 36 | 129 | 3911 | 24 | 2011-10-28 14:16:17+00:00 | 13.0 | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |

30 rows × 84 columns