{ "cells": [ { "cell_type": "markdown", "id": "bec25f1a", "metadata": {}, "source": [ "# Increase the GitHub rate limit (generic)" ] }, { "cell_type": "markdown", "id": "fa7f85d5", "metadata": { "tags": [] }, "source": [ "## Table of contents (TOC)\n", "* 1 - Introduction\n", "* 2 - The High-Level procedure\n", "    * 2.1 - Create a personal access token\n", "        * 2.1.1 - Login on GitHub\n", "        * 2.1.2 - On GitHub create the personal access token\n", "    * 2.2 - Install the personal token in your environment\n", "* 3 - Notebook version" ] }, { "cell_type": "markdown", "id": "c2c48614-5571-47f8-b70c-28f1ea58f97b", "metadata": { "tags": [] }, "source": [ "# 1 - Introduction \n", "##### [Back to TOC](#TOC)\n", "\n", "GitHub enforces a rate limit of [60 API calls per hour](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#primary-rate-limit-for-unauthenticated-users) for unauthenticated users. When working with Text-Fabric datasets that consist of a large number of files (such as the [CenterBLC/N1904 dataset](https://github.com/CenterBLC/N1904), which has at least 60 files), this rate limit is easily reached during the loading process, effectively blocking any unauthenticated user from loading the full dataset.\n", "\n", "There are two possible solutions to this problem:\n", "\n", "* During the creation of our dataset we produced a zip file containing the complete dataset and attached it to the release published on GitHub. Whenever a user downloads the [**latest** release](https://github.com/CenterBLC/N1904/releases/latest), the Text-Fabric code downloads this zip file without running the risk of hitting the rate limit. 
**Hence, most users will not hit this rate limit when using the latest dataset, provided the release includes a complete.zip file.** See also the Text-Fabric documentation on how the data provider does this [using A.zipall()](https://annotation.github.io/text-fabric/tf/advanced/repo.html#github).\n", "\n", "* If, however, you need to load another (previous) version, you need to raise the rate limit. This can be done by adding authentication, which raises the rate limit to [5000 API calls per hour](https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#primary-rate-limit-for-authenticated-users).\n", "\n", "The procedure in this notebook provides guidance on how to authenticate as a user on GitHub by installing a personal access token." ] }, { "cell_type": "markdown", "id": "76d1044b-e3b6-4a51-b9fa-5f3234a6d08b", "metadata": {}, "source": [ "# 2 - The High-Level procedure\n", "##### [Back to TOC](#TOC)\n", "\n", "The process basically consists of the following two high-level steps:\n", "\n", "* Create a personal access token\n", "* Install the personal access token in your environment\n", "\n", "After you have performed these steps, the Text-Fabric code finds this personal token and adds it to every GitHub API call it makes, which allows for the increased rate limit.\n", "\n", "**As this personal token functions as your personal credential, do not pass it on to others; let them obtain their own!** \n", "\n", "See also [Keeping your personal access tokens secure](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#keeping-your-personal-access-tokens-secure)." 
] }, { "cell_type": "markdown", "id": "2e01787a-7480-43df-8ae4-6b73e0805f72", "metadata": {}, "source": [ "## 2.1 - Create a personal access token\n", "\n", "This step consists of the following actions:" ] }, { "cell_type": "markdown", "id": "c4045bf3-ea5f-4fda-a0c5-3b34489c67b7", "metadata": {}, "source": [ "### 2.1.1 - Login on GitHub\n", "\n", "To obtain a personal access token, you must log in to GitHub with your personal account.\n", "\n", "Go to [https://github.com/login](https://github.com/login). This will bring up the following login screen:\n", "\n", "\n", "\n", "Log in using your credentials or create a new account." ] }, { "cell_type": "markdown", "id": "11b94ac1-7d43-4177-a6dd-da27e37cfb36", "metadata": {}, "source": [ "### 2.1.2 - On GitHub create the personal access token\n", "\n", "After logging in to GitHub, create the personal access token (classic) using [this procedure](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens#creating-a-personal-access-token-classic). The image below provides a screenshot of the procedure as of August 2024:\n", "\n", "\n", "\n", "**The last step (copy the token) is important. Store the token locally, as you will need it for the next part of the procedure.**" ] }, { "cell_type": "markdown", "id": "5952f103-49a4-4628-8816-a3c911769002", "metadata": {}, "source": [ "## 2.2 - Install the personal token in your environment\n", "\n", "After you have obtained your personal token, put it in an environment variable named GHPERS on the system where your app runs. \n", "\n", "The method used to install this personal access token depends on your operating system. 
A good starting point is to consult [the official Text-Fabric documentation](https://annotation.github.io/text-fabric/tf/advanced/repo.html#token-in-environment-variables).\n", "\n", "To set the variable on Windows using the command prompt, see [this guide](https://www.howtogeek.com/789660/how-to-use-windows-cmd-environment-variables/#autotoc_anchor_2)." ] }, { "cell_type": "markdown", "id": "a5705e69-0488-4dc4-a5fc-d1a3d2c44641", "metadata": {}, "source": [ "# 3 - Notebook version\n", "##### [Back to TOC](#TOC)\n", "\n", "
| | |\n", "| --- | --- |\n", "| Author | Tony Jurg |\n", "| Version | 1.1 |\n", "| Date | 9 October 2024 |\n", "