{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "gIRXr0NoTJb0" }, "source": [ "# `twitter_blocklist`\n", "\n", "Manage the list of blocked users on Twitter.\n", "\n", "## Install the package\n", "\n", "This should only be necessary the first time you run the notebook, and once in a while to update it." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 577 }, "id": "Yj4dwchFVslV", "outputId": "33c1cd32-7f02-469a-c218-921b314a09aa" }, "outputs": [], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "!pip install --upgrade twitter_blocklist" ] }, { "cell_type": "markdown", "metadata": { "id": "u3JzfmNuV3oY" }, "source": [ "Ignore the `# NBVAL_IGNORE_OUTPUT` lines; they are only needed for automated testing.\n", "\n", "## Setup authentication\n", "\n", "This software uses the Twitter API to connect to Twitter on your behalf to read and optionally modify the list of blocked users. Therefore it needs to authenticate to Twitter; this is achieved by creating an \"App\" and copying special authentication token strings.\n", "\n", "First follow the instructions by the [`python-twitter` project](https://python-twitter.readthedocs.io/en/latest/getting_started.html) on how to create an \"App\". In this phase Twitter might ask you to create a Developer account; in that case, don't worry, it is pretty quick.\n", "\n", "Under \"App permissions\", you need to set \"Read and Write\", otherwise you will get the error \"Read-only app\" cannot POST when trying to import a blocklist.\n", "\n", "Once that is done, copy-paste the \"API Key and Secret\" as consumer key and secret, then click on \"Generate\" under \"Access token & Secret\" and paste them below:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "%rm twitter_keys.toml" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height":
35 }, "id": "Obbr6d3odJif", "outputId": "7464d4ce-d36a-4c83-acc8-07e8d2a33e86" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Writing twitter_keys.toml\n" ] } ], "source": [ "%%file twitter_keys.toml\n", "\n", "consumer_key='xxxxxxxxxxxxxxxxxxxx'\n", "consumer_secret='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'\n", "access_token_key='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'\n", "access_token_secret='xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Alternatively, you can define the equivalent environment variables (they take precedence over the toml file):\n", "\n", "    export TWITTER_CONSUMER_KEY=''\n", "    export TWITTER_CONSUMER_SECRET=''\n", "    export TWITTER_ACCESS_TOKEN_KEY=''\n", "    export TWITTER_ACCESS_TOKEN_SECRET=''" ] }, { "cell_type": "markdown", "metadata": { "id": "oZM0urcIoG3w" }, "source": [ "We can now test whether the authentication is working by getting a list of tweets by the authenticated user and the accounts they follow:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 55 }, "id": "fVaT_LDYdP3x", "outputId": "6ef2ee75-0a5e-44bc-8c8d-b33f211e0751" }, "outputs": [ { "data": { "text/plain": [ "[Status(ID=1315688655298523136, ScreenName=blocklistbot, Created=Mon Oct 12 16:19:54 +0000 2020, Text='first tweeting test')]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "from twitter_blocklist import authenticate\n", "api = authenticate()\n", "api.GetHomeTimeline()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Notice about rate limits\n", "\n", "Twitter rate-limits its APIs. I have set up the client to sleep automatically when it hits a rate-limiting error; if that happens, just leave the script running and it will complete at some point. 
For example, exporting the blocks requires 1 request for every 5000 blocked IDs, so you could hit the limit of 15 requests every 15 minutes; in that case the script will sleep for 15 minutes and then resume." ] }, { "cell_type": "markdown", "metadata": { "id": "oltwYSRZo8rj" }, "source": [ "## Export the current list of blocks to a `csv` file\n", "\n", "The standard format is one block per line: first the integer Twitter profile ID, then the screen name:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "id": "5suX9xQRpKCa" }, "outputs": [], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "!twitter_blocklist --export my_blocks.csv" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 35 }, "id": "nRud_yH5pPYZ", "outputId": "198e1be6-7067-47d8-fa13-a9f2a98ba443" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['6253282,\"TwitterAPI\"\\n', '436266454,\"TwitterMovies\"\\n', '783214,\"Twitter\"\\n']\n" ] } ], "source": [ "with open(\"my_blocks.csv\", \"r\") as f:\n", "    print(f.readlines())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Unblock users from a `csv` file\n", "\n", "If we need to undo a list of blocks made with `twitter_blocklist`, we can perform the opposite operation:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "id": "BPtyMO3DsWbW" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100% (3 of 3) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00\n" ] } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "!twitter_blocklist --unblock my_blocks.csv" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "api.GetBlocksIDs()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import blocks from a `csv` file\n", "\n", "We can share 
our blocks with other users, collaborate on maintaining a list, or import a list previously exported from BlockTogether:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100% (3 of 3) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00\n" ] } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "!twitter_blocklist my_blocks.csv" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[6253282, 436266454, 783214]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "api.GetBlocksIDs()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Share a list of blocks with Github or Pastebin\n", "\n", "The easiest way to publish a list of blocks is to create an account on and paste the content of the `csv` file there, see for example and then share the \"RAW\" url (right click on the \"RAW\" button and copy the link address), in this case .\n", "\n", "The same works with Gist, or by creating a dedicated Github repository where multiple people can contribute, see for example .\n", "On Github too, we need to share the \"RAW\" url with `twitter_blocklist`; in the example repository above, it would be:\n", "\n", "* \n", "\n", "We could download those files, edit them locally and then import them using the command above, but `twitter_blocklist` also supports downloading them on the fly, so that we can always grab the most up-to-date version:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "blocklists_urls = [\n", "    \"https://pastebin.com/raw/NndFiwU3\",\n", "    \"https://raw.githubusercontent.com/zonca/example_blocklist/main/a_blocklist.csv\"\n", "]" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100% (2 of 2) |##########################| Elapsed Time: 0:00:00 
Time: 0:00:00\n", "100% (2 of 2) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00\n" ] } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "for url in blocklists_urls:\n", "    !twitter_blocklist $url" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The same can be executed in the terminal as:\n", "\n", "    > twitter_blocklist https://pastebin.com/raw/NndFiwU3\n", "    > twitter_blocklist https://raw.githubusercontent.com/zonca/example_blocklist/main/a_blocklist.csv" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(api.GetBlocksIDs())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As before, we can undo the operation by adding the `--unblock` option:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100% (2 of 2) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00\n", "100% (2 of 2) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00\n" ] } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "for url in blocklists_urls:\n", "    !twitter_blocklist --unblock $url" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(api.GetBlocksIDs())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Handling lists\n", "\n", "**Notice about lists**: if you create a list of users and then block them, they are removed from the list. 
So this technique is only useful if you want to block a list maintained by someone else.\n", "\n", "Another possibility for sharing blocks is to use Twitter lists. You can create a list using the Twitter web interface or any app; see this example list:\n", "\n", "https://twitter.com/i/lists/1315715236456853504\n", "\n", "Notice that the URL contains a long string of numbers: that is the List ID, and we can use it to block/unblock all the accounts in the list:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100% (2 of 2) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00\n" ] } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "!twitter_blocklist --list 1315715236456853504" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(api.GetBlocksIDs())" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "100% (2 of 2) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00\n" ] } ], "source": [ "# NBVAL_IGNORE_OUTPUT\n", "!twitter_blocklist --unblock --list 1315715236456853504" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(api.GetBlocksIDs())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "colab": { "name": "run-twitter-blocklist.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python [conda env:py36blocklist]", "language": "python", "name": "conda-env-py36blocklist-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": 
"text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.12" } }, "nbformat": 4, "nbformat_minor": 4 }