{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tutorial 2: Getting Dataframes (DataView)\n", "\n", "The WikiWho class also provides the `dv` attribute (`DataView` class), which returns the data in pandas dataFrames." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Create an instance of WikiWho" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from wikiwho_wrapper import WikiWho\n", "ww = WikiWho(lng='en')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Retrieving pandas DataFrames\n", "\n", "The DataView has exactly the same functions (**and parameters**) as the WikiWhoAPI and they are mapped one to one:\n", "\n", "- `all_content()`\n", "- `rev_ids_of_article()`\n", "- `last_rev_content()`\n", "- `specific_rev_content_by_rev_id()`\n", "- `range_rev_content_by_article_title()`\n", "\n", "This is how you use them." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.1 All Content" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# by article title\n", "df = ww.dv.all_content(\n", " article='Bioglass 45S5') \n", "\n", "# or, by article id\n", "# ww.dv.all_content(2161298)\n", "\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.2 Revision IDS" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# by article title\n", "df = ww.dv.rev_ids_of_article(\n", " article='Bioglass 45S5') \n", "\n", "# or, by article id\n", "# ww.dv.rev_ids_of_article(2161298)\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.3 Revision Content" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# by article title\n", "df = ww.dv.last_rev_content(\n", " article='Bioglass 45S5') \n", "\n", "# or, by article id\n", "# ww.dv.last_rev_content(2161298)\n", "\n", "df.head(5)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# by revision id directly\n", "df = ww.dv.specific_rev_content_by_rev_id(\n", " rev_id=189370281) \n", "\n", "# or by also specifying the article_title (it is not documented in Wikipedia that revisions ids are unique\n", "# and don't depend on the article)\n", "# df = ww.dv.specific_rev_content_by_rev_id(article='bioglass', rev_id=189370281)\n", "\n", "df.head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# providing a range (two revision ids)\n", "df = ww.dv.range_rev_content_by_article_title(\n", " article='Bioglass 45S5', start_rev_id=18064039, end_rev_id=207995408)\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 2.4 Editor Content" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# by page id\n", "df = ww.dv.edit_persistence(page_id=2161298)\n", "df.head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# by editor id\n", "df = ww.dv.edit_persistence(editor_id=101935)\n", "df.head(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# by page id and editor id\n", "df = ww.dv.edit_persistence(page_id=15745, editor_id=101935)\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Summary\n", "\n", "You just learn how to use the functions of the APIWrapper. The next tutorial will show simple examples how to extract some information of the `all_content()` function" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from utils.notebooks import get_next_notebook\n", "from IPython.display import HTML\n", "try:\n", " display(HTML(f'Go to next workbook'))\n", "except:\n", " HTML('Go to next workbook')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 2 }