{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Open Access versions of articles in Australian HASS journals\n", "\n", "Previously I [attempted some analysis](finding-oa-versions-of-AHS-articles.ipynb) of the open access status of research articles published in _Australian Historical Studies_. I thought it would be interesting to try some comparisons with other Australian HASS subscription-based journals.\n", "\n", "I've simplified the process here to make it easier to run an analysis of any journal. The steps are:\n", "\n", "1. Get a list of articles published in the journal by querying the [CrossRef API](https://www.crossref.org/education/retrieve-metadata/rest-api/) with the journal's ISSN.\n", "2. Remove recurring sections such as 'Editorial' and 'Book reviews' from the list of articles.\n", "3. Look up the OA status of each remaining article by querying the [Unpaywall API](https://unpaywall.org/products/api) with the article's DOI.\n", "\n", "I then do some simple analysis of the OA status, and visualise the results over time.\n", "\n", "## Understanding the OA status\n", "\n", "The Unpaywall API returns one of five values for the OA status of an article – 'Gold', 'Hybrid', 'Green', 'Bronze', and 'Closed'. There's some [more information on how these are determined](https://support.unpaywall.org/support/solutions/articles/44001777288) on the Unpaywall site. 
Put simply:\n", "\n", "* **Gold** – the article is freely available, openly licensed, and published in an open access journal\n", "* **Hybrid** – the article is freely available, openly licensed, and published in a subscription journal\n", "* **Green** – a version of the article (usually the Author's Accepted Manuscript) is freely available from a public repository\n", "* **Bronze** – the article is published in a subscription journal, but is freely available from the journal's website \n", "* **Closed** – the article is behind a paywall\n", "\n", "## Caveats\n", "\n", "* The data might not be up-to-date. In particular, I've noticed that some 'bronze' status articles are reported as 'closed'. Presumably this is because the Unpaywall database is running a bit behind changes in the publishers' websites.\n", "* The definition of an 'article' is not consistent. In earlier issues of some journals it seems that things like book reviews are grouped together under a single DOI, while recent issues have a DOI for each review. \n", "\n", "## Journals\n", "\n", "So far I've looked at the following journals (more suggestions welcome):\n", "\n", "* [Australian Historical Studies](#Australian-Historical-Studies)\n", "* [History Australia](#History-Australia)\n", "* [Australian Journal of Politics and History](#Australian-Journal-of-Politics-and-History)\n", "* [Journal of Australian Studies](#Journal-of-Australian-Studies)\n", "* [Australian Archaeology](#Australian-Archaeology)\n", "* [Archives and Manuscripts](#Archives-and-Manuscripts)\n", "* [Journal of the Australian Library and Information Association](#Journal-of-the-Australian-Library-and-Information-Association)\n", "* [Labour History](#Labour-History)\n", "\n", "Of course, this analysis is focused on subscription journals. 
There are also open access journals like the [Public History Review](https://epress.lib.uts.edu.au/journals/index.php/phrj) where all the articles would be 'Gold'!\n", "\n", "## Results (12 January 2021)\n", "\n", "The results are not good. Articles published in Australia's main subscription history journals are about **94% closed**. This is despite the fact that Green OA policies allow authors to deposit versions of their articles in public repositories (often after an embargo period). \n", "\n", "| Journal | Closed | \n", "|----|----|\n", "|Australian Historical Studies|94.6%|\n", "|History Australia|94.9%|\n", "|Australian Journal of Politics and History|95.7%*|\n", "|Journal of Australian Studies|94.2%|\n", "|Australian Archaeology|83.4%|\n", "|Archives and Manuscripts (2012-)|24.8%|\n", "|Journal of the Australian Library and Information Association|52.5%*|\n", "|Labour History|93.9%|\n", "|* Problems with data noted below|\n", "\n", "**This can be fixed!** If you're in a university, talk to your librarians about depositing a Green OA version of your article in an institutional repository. If not, you can use the [Share your paper](https://shareyourpaper.org/) service to upload a Green OA version to Zenodo. 
Your research will be easier to find, easier to access, easier to use, and available to everyone – not just those with the luxury of an institutional subscription.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import what we need" ] },
{ "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "import requests\n", "from requests.adapters import HTTPAdapter\n", "from urllib3.util.retry import Retry\n", "import requests_cache\n", "from tqdm.auto import tqdm\n", "import pandas as pd\n", "import altair as alt\n", "import collections\n", "\n", "# Cache responses and retry automatically on temporary server errors\n", "s = requests_cache.CachedSession()\n", "retries = Retry(total=5, backoff_factor=1, status_forcelist=[ 502, 503, 504 ])\n", "s.mount('https://', HTTPAdapter(max_retries=retries))\n", "s.mount('http://', HTTPAdapter(max_retries=retries))\n", "\n", "tqdm.pandas(desc=\"records\")" ] },
{ "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [], "source": [ "# the APIs are open, but it's polite to let the APIs know who you are\n", "email = 'tim@discontents.com.au'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Define some functions to do the work" ] },
{ "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "def get_total_results(issn):\n", "    '''\n", "    Get the total number of articles in CrossRef for this journal.\n", "    '''\n", "    response = s.get(f'https://api.crossref.org/journals/{issn}/works/', params={'rows': 0})\n", "    data = response.json()\n", "    try:\n", "        total_works = data['message']['total-results']\n", "    except KeyError:\n", "        total_works = 0\n", "    return total_works\n", "\n", "def get_title(record):\n", "    '''\n", "    Titles are in a list – join any values\n", "    '''\n", "    title = record.get('title')\n", "    if isinstance(title, list):\n", "        title = ' – '.join(title)\n", "    return title\n", "\n", "def harvest_works(issn):\n", "    '''\n", "    Harvest basic details (DOI, title, date) of articles from the journal with the supplied ISSN from CrossRef.\n", "    '''\n", "    harvested = 0\n", "    works = []\n", "    total_results = get_total_results(issn)\n", "    params = {\n", "        'rows': 100,\n", "        'offset': 0\n", "    }\n", "    headers = {\n", "        'User-Agent': f'Jupyter Notebook (mailto:{email})'\n", "    }\n", "    with tqdm(total=total_results) as pbar:\n", "        # Page through the results 100 records at a time\n", "        while harvested < total_results:\n", "            params['offset'] = harvested\n", "            response = s.get(f'https://api.crossref.org/journals/{issn}/works/', params=params, headers=headers)\n", "            data = response.json()\n", "            try:\n", "                records = data['message']['items']\n", "            except TypeError:\n", "                print('TYPEERROR')\n", "                print(data)\n", "                # Nothing usable on this page, move on to the next one\n", "                records = []\n", "            else:\n", "                for record in records:\n", "                    try:\n", "                        works.append({'doi': record.get('DOI'), 'title': get_title(record), 'year': record['issued']['date-parts'][0][0]})\n", "                    except KeyError:\n", "                        print('KEYERROR')\n", "                        print(record)\n", "            harvested += 100\n", "            pbar.update(len(records))\n", "    return works\n", "\n", "def get_oa_status(doi):\n", "    '''\n", "    Get OA status of DOI from the Unpaywall API.\n", "    '''\n", "    response = s.get(f'https://api.unpaywall.org/v2/{doi}?email={email}')\n", "    data = response.json()\n", "    # Returns None if the response has no oa_status (eg the DOI isn't in Unpaywall)\n", "    return data.get('oa_status')\n", "\n", "def create_scale(df):\n", "    '''\n", "    Set colour range to match the OA status types.\n", "    '''\n", "    scale = []\n", "    colours = collections.OrderedDict()\n", "    colours['hybrid'] = 'gold'\n", "    colours['green'] = 'green'\n", "    colours['bronze'] = 'brown'\n", "    colours['closed'] = 'lightgrey'\n", "    status_values = list(df['oa_status'].unique())\n", "    # Only include colours for the status values present in this dataset\n", "    for status, colour in colours.items():\n", "        if status in status_values:\n", "            scale.append(colour)\n", "    return scale\n", "\n", "def chart_oa_status(df, title):\n", "    # Adding a numeric order column makes it easy to sort by oa_status\n", "    df['order'] = df['oa_status'].replace({val: i for i, val in enumerate(['closed', 'bronze', 'green', 'hybrid'])})\n", "    # Get colour values\n", "    scale = create_scale(df)\n", "    chart = alt.Chart(df).mark_bar().encode(\n", "        x=alt.X('year:O', title='Year'),\n", "        y=alt.Y('count():Q', title='Number of articles', axis=alt.Axis(tickMinStep=1)),\n", "        color=alt.Color('oa_status:N', scale=alt.Scale(range=scale), legend=alt.Legend(title='OA type'), sort=alt.EncodingSortField('order', order='descending')),\n", "        order='order',\n", "        tooltip=[alt.Tooltip('count():Q', title='Number of articles'), alt.Tooltip('oa_status', title='OA type')]\n", "    ).properties(title=title)\n", "    display(chart)" ] },
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Australian Historical Studies\n", "\n", "* ISSN: 1031-461X\n", "* [Website](https://www.tandfonline.com/toc/rahs20/current)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "e2d5cc8d19db4aceb74155bca37b5757", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=1548.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_ahs = harvest_works('1031-461X')" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1548, 3)" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ahs = pd.DataFrame(works_ahs)\n", "df_ahs.shape" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1548, 3)" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Make sure there are no duplicates\n", "df_ahs.drop_duplicates(inplace=True)\n", "df_ahs.shape" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Editorial board 36\n", "Books 35\n", "Book notes 30\n", "In this issue 20\n", 
"Notes on Contributors 16\n", "In This Issue 16\n", "Exhibitions 12\n", "Book reviews 12\n", "Book Notes 10\n", "Exhibition 8\n", "Communications 7\n", "Reviews 6\n", "Exhibition review 6\n", "Editorial Board 6\n", "Exhibition reviews 4\n", "Introduction 4\n", "Editorial 4\n", "Book Note 4\n", "Notes on contributors 3\n", "BOOKS 2\n", "Communication 2\n", "‘A study corner in the kitchen’: Australian graduate women negotiate family, nation and work in the 1950s and early 1960s1 1\n", "The Snub: Robert Menzies and the Melbourne Club 1\n", "Historical Thinking for History Teachers: A New Approach to Engaging Students and Developing Historical Consciousness 1\n", "Australian Soldiers in Asia-Pacific in World War II. 1\n", "Name: title, dtype: int64" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Show repeated titles\n", "df_ahs['title'].value_counts()[:25]" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1305, 3)" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get rid of titles that appear more than once\n", "df_ahs_unique = df_ahs.copy().drop_duplicates(subset='title', keep=False)\n", "df_ahs_unique.shape" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7242ee09cebd456eb7ceadae379826d0", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=1305.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_ahs_unique['oa_status'] = df_ahs_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Results" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": 
{ "text/plain": [ "closed 1235\n", "green 37\n", "bronze 28\n", "hybrid 5\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ahs_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 94.6%\n", "green 2.8%\n", "bronze 2.1%\n", "hybrid 0.4%\n", "Name: oa_status, dtype: object" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ahs_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_ahs_unique, title='Australian Historical Studies')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## History Australia\n", "\n", "* ISSN: 1449-0854\n", "* [Website](https://www.tandfonline.com/toc/raha20/current)\n", "* [Archived issues in Trove](https://webarchive.nla.gov.au/tep/46522)" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "71e2e8cc8d274a80a3a36c18db0c7596", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=1249.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_ha = harvest_works('1449-0854')" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1249, 3)" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha = pd.DataFrame(works_ha)\n", "df_ha.shape" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1249, 3)" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha.drop_duplicates(inplace=True)\n", "df_ha.shape" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
doi title year
107 10.2104/ha.2007.4.issue-2 None 2007
220 10.2104/ha.2006.3.issue-1 None 2006
764 10.2104/ha.2008.5.issue-2 None 2008
873 10.2104/ha.2008.5.issue-3 None 2008
1000 10.2104/ha.2006.3.issue-2 None 2006
1046 10.2104/ha.2007.4.issue-1 None 2007
\n", "
" ], "text/plain": [ " doi title year\n", "107 10.2104/ha.2007.4.issue-2 None 2007\n", "220 10.2104/ha.2006.3.issue-1 None 2006\n", "764 10.2104/ha.2008.5.issue-2 None 2008\n", "873 10.2104/ha.2008.5.issue-3 None 2008\n", "1000 10.2104/ha.2006.3.issue-2 None 2006\n", "1046 10.2104/ha.2007.4.issue-1 None 2007" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha.loc[df_ha['title'].isnull()]" ] }, { "cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1243, 3)" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha.dropna(subset=['title'], inplace=True)\n", "df_ha.shape" ] }, { "cell_type": "code", "execution_count": 51, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "From the President 46\n", "From the Editors 35\n", "AHA Honour Roll 15\n", "Exhibition Reviews 14\n", "AHA Calendar of Events 12\n", "Book Reviews 11\n", "From the Editor 10\n", "AHA Prize and Award Winners 9\n", "Australian Historical Association (AHA) 5\n", "Film and Radio Reviews 4\n", "AHA Code of Conduct 4\n", "AHA Affiliates/Network 4\n", "From the editors 3\n", "Imprint information 3\n", "From the Guest Editors 3\n", "Review Policy for History Australia 3\n", "Film, Television, Radio and Theatre Reviews 3\n", "AHA Prizes and Awards 3\n", "AboutHistory Australia 3\n", "Introduction 2\n", "AHA Prizes 2008 2\n", "AHA Prizes 2009–10 in Brief 2\n", "Submitting a manuscript toHistory Australia 2\n", "Prizes and Awards 2\n", "Submitting a manuscript to History Australia 2\n", "AHA Prizes 2006 and Beyond 2\n", "Historical Novels Challenging the National Story 1\n", "Parallels on the Periphery: The Exploration of Aboriginal History by Local Historical Societies in New South Wales, 1960s-1970s 1\n", "Review of Australianscreen and Moving History: 60 Years of Film Australia 1\n", "The spoils of opportunity: Janet Mitchell and Australian internationalism in 
the interwar Pacific 1\n", "Name: title, dtype: int64" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha['title'].value_counts()[:30]" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1039, 3)" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha_unique = df_ha.copy().drop_duplicates(subset='title', keep=False)\n", "df_ha_unique.shape" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1b8655ae800b455ab316c1b3bba92964", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=1039.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_ha_unique['oa_status'] = df_ha_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Results" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 986\n", "green 27\n", "bronze 25\n", "hybrid 1\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 55, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 94.9%\n", "green 2.6%\n", "bronze 2.4%\n", "hybrid 0.1%\n", "Name: oa_status, dtype: object" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ha_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_ha_unique, title='History Australia')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Australian Journal of Politics and History\n", "\n", "* ISSN: 1467-8497\n", "* [Website](https://onlinelibrary.wiley.com/journal/14678497)\n", "\n", "There are clearly some problems with dates in the CrossRef data." ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1022a8bd5bb949ee97a83e917cc677f8", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=1944.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_ajph = harvest_works('1467-8497')" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1944, 3)" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ajph = pd.DataFrame(works_ajph)\n", "df_ajph.shape" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1944, 3)" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ajph.drop_duplicates(inplace=True)\n", "df_ajph.shape" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
doi title year
58 10.1111/ajph.2008.54.issue-4 None 2008
65 10.1111/ajph.2009.55.issue-1 None 2009
82 10.1111/ajph.2008.54.issue-3 None 2008
88 10.1111/ajph.2000.46.issue-1 None 2000
89 10.1111/ajph.2002.48.issue-1 None 2002
... ... ... ...
1819 10.1111/ajph.v66.1 None 2020
1869 10.1111/ajph.v66.4 None 2020
1870 10.1111/ajph.v65.4 None 2019
1908 10.1111/ajph.v66.2 None 2020
1928 10.1111/ajph.v66.3 None 2020
\n", "

157 rows × 3 columns

\n", "
" ], "text/plain": [ " doi title year\n", "58 10.1111/ajph.2008.54.issue-4 None 2008\n", "65 10.1111/ajph.2009.55.issue-1 None 2009\n", "82 10.1111/ajph.2008.54.issue-3 None 2008\n", "88 10.1111/ajph.2000.46.issue-1 None 2000\n", "89 10.1111/ajph.2002.48.issue-1 None 2002\n", "... ... ... ...\n", "1819 10.1111/ajph.v66.1 None 2020\n", "1869 10.1111/ajph.v66.4 None 2020\n", "1870 10.1111/ajph.v65.4 None 2019\n", "1908 10.1111/ajph.v66.2 None 2020\n", "1928 10.1111/ajph.v66.3 None 2020\n", "\n", "[157 rows x 3 columns]" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ajph.loc[df_ajph['title'].isnull()]" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1787, 3)" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ajph.dropna(subset=['title'], inplace=True)\n", "df_ajph.shape" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Book Reviews 106\n", "Book Notes 52\n", "QUEENSLAND 18\n", "TASMANIA 17\n", "VICTORIA 17\n", "SOUTH AUSTRALIA 17\n", "Political Chronicles 16\n", "NEW SOUTH WALES 15\n", "WESTERN AUSTRALIA 15\n", "BOOK REVIEWS 14\n", "Journal Notes 11\n", "Issues in Australian Foreign Policy 10\n", "Issue Information 8\n", "Review Article 8\n", "Australian Political Chronicle 5\n", "Introduction 4\n", "Political Chronicle: Australia and Papua New Guinea 4\n", "Problems of Australian Foreign Policy 4\n", "THE COMMONWEALTH 4\n", "Queensland 3\n", "Books Received 3\n", "THE TERRITORY OF PAPUA AND NEW GUINEA 3\n", "Western Australia 3\n", "Northern Territory 3\n", "ERRATA 3\n", "Other Books Received 2\n", "Commonwealth 2\n", "NORTHERN TERRITORY 2\n", "Victoria 2\n", "Tasmania 2\n", "PAPUA NEW GUINEA 2\n", "Rejoinder 2\n", "Foreword 2\n", "BOOK NOTES 2\n", "Volume Index 2\n", "Problems in Australian Foreign Policy, July-December 1994 2\n", "South Australia 
2\n", "Reflections on the Role of the Military in Civilian Politics: the Case of Sierra Leone* 1\n", "South Australia July to December 1997 1\n", "HITLER AND THE SPANISH CIVIL WAR. A CASE STUDY OF NAZI FOREIGN POLICY. 1\n", "Name: title, dtype: int64" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ajph['title'].value_counts()[:40]" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1400, 3)" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ajph_unique = df_ajph.copy().drop_duplicates(subset='title', keep=False)\n", "df_ajph_unique.shape" ] }, { "cell_type": "code", "execution_count": 64, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "3814d0974b304490ad3fc247bcec816b", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=1400.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_ajph_unique['oa_status'] = df_ajph_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Results" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 1340\n", "bronze 36\n", "green 22\n", "hybrid 2\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_ajph_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 95.7%\n", "bronze 2.6%\n", "green 1.6%\n", "hybrid 0.1%\n", "Name: oa_status, dtype: object" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ 
"df_ajph_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_ajph_unique, title='Australian Journal of Politics and History')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Journal of Australian Studies\n", "\n", "* ISSN: 1444-3058\n", "* [Website](https://www.tandfonline.com/toc/rjau20/current)" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "303e47252b5d46dbbe382247b619995d", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=2113.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_jas = harvest_works('1444-3058')" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2113, 3)" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas = pd.DataFrame(works_jas)\n", "df_jas.shape" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2113, 3)" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas.drop_duplicates(inplace=True)\n", "df_jas.shape" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
doi title year
\n", "
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [doi, title, year]\n", "Index: []" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas.loc[df_jas['title'].isnull()]" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2113, 3)" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas.dropna(subset=['title'], inplace=True)\n", "df_jas.shape" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Editorial board 49\n", "Notes on contributors 40\n", "Notes 32\n", "Contributors 31\n", "Reviews 28\n", "Notes on Contributors 28\n", "Book reviews 27\n", "NOTES ON CONTRIBUTORS 19\n", "Introduction 16\n", "BOOK REVIEWS 16\n", "Short reviews and notices 12\n", "Editorial 12\n", "JAS review of books 11\n", "John Barrett prize in Australian studies 10\n", "Erratum 7\n", "The John Barrett Award for Australian Studies 6\n", "Shorter notices and reviews 5\n", "Editorial Board 5\n", "Acknowledgements 5\n", "Book Reviews 5\n", "Foreword 3\n", "The John Barrett prize 2\n", "Book review 2\n", "Acknowledgments 2\n", "Shorter reviews and notices 2\n", "Australian studies report 2\n", "Short notices and reviews 2\n", "Biographical notes on contributors 2\n", "Health, Medicine and the Sea: Australian Voyages c.1815–1860 1\n", "‘O Brave new social order’: The controversy over planning in Australia and Britain in the 1940s 1\n", "Name: title, dtype: int64" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas['title'].value_counts()[:30]" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1732, 3)" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas_unique = df_jas.copy().drop_duplicates(subset='title', keep=False)\n", 
"df_jas_unique.shape" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d1bd7f30daa844c597483e0763a0fd25", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=1732.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_jas_unique['oa_status'] = df_jas_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Results" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 1632\n", "green 71\n", "bronze 26\n", "hybrid 3\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 94.2%\n", "green 4.1%\n", "bronze 1.5%\n", "hybrid 0.2%\n", "Name: oa_status, dtype: object" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jas_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_jas_unique, title='Journal of Australian Studies')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Australian Archaeology\n", "\n", "* ISSN: 0312-2417\n", "* [Website](https://www.tandfonline.com/toc/raaa20/current)" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "016f5e2527c544cd80a3c265df8b4533", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=1485.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_aa = harvest_works('0312-2417')" ] }, { "cell_type": "code", "execution_count": 80, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1485, 3)" ] }, "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa = pd.DataFrame(works_aa)\n", "df_aa.shape" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1485, 3)" ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa.drop_duplicates(inplace=True)\n", "df_aa.shape" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table><thead><tr><th>doi</th><th>title</th><th>year</th></tr></thead><tbody></tbody></table>
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [doi, title, year]\n", "Index: []" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa.loc[df_aa['title'].isnull()]" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1485, 3)" ] }, "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa.dropna(subset=['title'], inplace=True)\n", "df_aa.shape" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Editorial 57\n", "Book Reviews 34\n", "Front Matter 27\n", "Thesis Abstracts 26\n", "Backfill 23\n", "editorial 8\n", "debitage 7\n", "Debitage 4\n", "Forthcoming Fieldwork 3\n", "Obituary 3\n", "Fieldwork Calendar 3\n", "Archaeologists and Aborigines 3\n", "Excavation Calendar 2\n", "Front matter 2\n", "The Aboriginal People of Tasmania, by Julia Clark 2\n", "Obituaries 2\n", "Honours Theses in Prehistory 2\n", "backfill 2\n", "A Technological Analysis Of Stone Artefacts From Big Foot Art Site, Cania Gorge, Central Queensland 2\n", "Thesis Abstract 2\n", "Trench Shoring For Archaeologists And The Randwick Grave Digging Course 2\n", "Useless graduates?: Why do we all think that something has gone wrong with Australian archaeological training? 1\n", "Broadcasting, listening and the mysteries of public engagement: an investigation of the AAA online audience 1\n", "Gendered Archaeology 1\n", "Bottles For Jam? 
An Example Of Recycling From A Post-Contact Archaeological Site 1\n", "The Patina of Nostalgia 1\n", "Department of Archaeology La Trobe University 1\n", "Colonial Archaeology in Australia 1\n", "Birriwilk rockshelter: A mid- to late Holocene site in ManilikarrCountry, southwest Arnhem Land, Northern Territory 1\n", "Apology from UNSW Press to Don Ranson 1\n", "Name: title, dtype: int64" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa['title'].value_counts()[:30]" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1269, 3)" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa_unique = df_aa.copy().drop_duplicates(subset='title', keep=False)\n", "df_aa_unique.shape" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "164c6e3f62084ab294d4825f022101fa", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=1269.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_aa_unique['oa_status'] = df_aa_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Results" ] }, { "cell_type": "code", "execution_count": 87, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 1058\n", "green 188\n", "bronze 21\n", "hybrid 2\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 88, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 83.4%\n", "green 14.8%\n", "bronze 1.7%\n", "hybrid 0.2%\n", "Name: oa_status, dtype: 
object" ] }, "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_aa_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 89, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_aa_unique, title='Australian Archaeology')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Archives and Manuscripts\n", "\n", "* ISSN: 0157-6895\n", "* [Website](https://www.tandfonline.com/toc/raam20/current)\n", "\n", "Note that articles published before 2012 are available through an [open access repository](https://publications.archivists.org.au/index.php/asa)." ] }, { "cell_type": "code", "execution_count": 90, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "4faaae0faf7644c8921c325173f17b0c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=341.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_am = harvest_works('0157-6895')" ] }, { "cell_type": "code", "execution_count": 91, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(341, 3)" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am = pd.DataFrame(works_am)\n", "df_am.shape" ] }, { "cell_type": "code", "execution_count": 92, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(341, 3)" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am.drop_duplicates(inplace=True)\n", "df_am.shape" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table><thead><tr><th>doi</th><th>title</th><th>year</th></tr></thead><tbody></tbody></table>
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [doi, title, year]\n", "Index: []" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am.loc[df_am['title'].isnull()]" ] }, { "cell_type": "code", "execution_count": 94, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(341, 3)" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am.dropna(subset=['title'], inplace=True)\n", "df_am.shape" ] }, { "cell_type": "code", "execution_count": 95, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Editorial 14\n", "Editorial Board 3\n", "Archival Anxiety and the Vocational Calling 2\n", "Corrigendum 2\n", "Records and Information Management 2\n", "Negotiating the born-digital: a problem of search 1\n", "The Australian Register: UNESCO Memory of the World Program 1\n", "Taking archives to the people: an examination of public programs in the National Archives of the Eastern and Southern Africa Regional Branch of the International Council on Archives 1\n", "Provocations on the pleasures of archived paper 1\n", "Recordkeeping issues arising from the public hearings of the Royal Commission into Institutional Responses to Child Sexual Abuse 1\n", "Decolonising the archives: languages as enablers and barriers to accessing public archives in South Africa 1\n", "Elizabeth H Dow 1\n", "Note: ASA submissions to Royal Commission into Institutional Responses to Child Sexual Abuse, 2012–2016 1\n", "Living Traces – an archive of place: Parramatta Girls Home 1\n", "Here, there and everywhere: an analysis of reference services in academic archives 1\n", "Sigrid McCausland, 1953–2016 1\n", "Innovation Study: Challenges and Opportunities for Australia’s Galleries, Libraries, Archives and Museums 1\n", "Australian War Memorial, ANZAC Voices, Canberra, November 2013 - November 2014. 
1\n", "Full docs or it didn’t happen 1\n", "Preserving Archives 1\n", "The No-Nonsense Guide to Archives and Recordkeeping 1\n", "The development of recordkeeping systems in the British Empire and Commonwealth, 1870s–1960s 1\n", "Indigenous archives: the making and unmaking of Aboriginal art 1\n", "Observing the author–editor relationship: recordkeeping and literary scholarship in dialogue 1\n", "Factors influencing the integration of digital archival resources: a constructivist grounded theory approach 1\n", "Shaping and reshaping cultural identity and memory: maximising human rights through a participatory archive 1\n", "Victorian Women’s Liberation and Lesbian Feminist Archives Inc 1\n", "Linked Data for Libraries, Archives and Museums: How to Clean, Link and Publish Your Metadata 1\n", "Give me a serve of data with that 1\n", "Perspectives on Women’s Archives, Chicago. Society of American Archivists 1\n", "Name: title, dtype: int64" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am['title'].value_counts()[:30]" ] }, { "cell_type": "code", "execution_count": 96, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(318, 3)" ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am_unique = df_am.copy().drop_duplicates(subset='title', keep=False)\n", "df_am_unique.shape" ] }, { "cell_type": "code", "execution_count": 97, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "23b424df8b2149498b06e8af9ebcbc78", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=318.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_am_unique['oa_status'] = df_am_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": 
[ "### Results" ] }, { "cell_type": "code", "execution_count": 98, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bronze 205\n", "closed 79\n", "green 19\n", "hybrid 15\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 99, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "bronze 64.5%\n", "closed 24.8%\n", "green 6.0%\n", "hybrid 4.7%\n", "Name: oa_status, dtype: object" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_am_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 100, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_am_unique, title='Archives and Manuscripts')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Journal of the Australian Library and Information Association\n", "\n", "* ISSN: 2475-0158\n", "* [Website](https://www.tandfonline.com/toc/ualj21/current)\n", "\n", "Previously _Australian Academic and Research Libraries_ \n", "\n", "* ISSN: 0004-8623\n", "* [Website](https://www.tandfonline.com/toc/uarl20/current)\n", "* [Archived issues available in Trove](https://webarchive.nla.gov.au/awa/20130209041025/http://pandora.nla.gov.au/pan/128690/20130208-0850/www.alia.org.au/publishing/aarl/index.html)\n", "\n", "Note that most of AARL seems to be 'bronze', but is not being accurately reported by the Unpaywall API." ] }, { "cell_type": "code", "execution_count": 101, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "9c680d89915c44e8be6ae81911f6e4ac", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=1335.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_aarn = harvest_works('0004-8623')" ] }, { "cell_type": "code", "execution_count": 102, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "81a5cc0244964081af8c2666b048c9e7", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=334.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_jalia = harvest_works('2475-0158')" ] }, { "cell_type": "code", "execution_count": 103, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1669, 
3)" ] }, "execution_count": 103, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia = pd.concat([pd.DataFrame(works_aarn), pd.DataFrame(works_jalia)]) \n", "df_jalia.shape" ] }, { "cell_type": "code", "execution_count": 104, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1669, 3)" ] }, "execution_count": 104, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia.drop_duplicates(inplace=True)\n", "df_jalia.shape" ] }, { "cell_type": "code", "execution_count": 105, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table><thead><tr><th>doi</th><th>title</th><th>year</th></tr></thead><tbody></tbody></table>
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [doi, title, year]\n", "Index: []" ] }, "execution_count": 105, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia.loc[df_jalia['title'].isnull()]" ] }, { "cell_type": "code", "execution_count": 106, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1669, 3)" ] }, "execution_count": 106, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia.dropna(subset=['title'], inplace=True)\n", "df_jalia.shape" ] }, { "cell_type": "code", "execution_count": 107, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Publications Received 66\n", "Book Reviews 55\n", "Editorial 50\n", "Reviews 26\n", "Front Matter 26\n", "Review Article 14\n", "Conference Report 8\n", "Conference Reports 8\n", "Book reviews 6\n", "Foreword 5\n", "New Australian Reference Books 5\n", "Obituary 5\n", "Obituaries 5\n", "Publications received 5\n", "Letters 5\n", "Review Articles 4\n", "News 4\n", "Book Review 4\n", "Letter 4\n", "Letter to the Editor 3\n", "Reports 2\n", "CAUL annual survey of electronic retrieval systems 2\n", "Converging Technologies, Divergent Applications: The Future of Information Services to the Academic Community 2\n", "The Special Collections Handbook 2\n", "Collection evaluation and the conspectus 2\n", "Corrigendum 2\n", "Council of Australian University Librarians 2\n", "CAUL Report 2\n", "Editorial Board 2\n", "Preface 2\n", "Teaching and Learning Spaces; Refurbishment of the W K Hancock Science Library at the Australian National University 2011 2\n", "Processing the past: Contesting authority in history and the archives 2\n", "Building the Sustainable Library at Macquarie University 2\n", "Facilitating access to the web of data: A guide for librarians 2\n", "Information users and usability in the digital age 2\n", "Expert Internet Searching 2\n", "Conferences 2\n", "Getting started with cloud computing 2\n", "Information Literacy Research: Dimensions of the 
Emerging Collective Consciousness 2\n", "63 Ready-to-Use Maker Projects 1\n", "Name: title, dtype: int64" ] }, "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia['title'].value_counts()[:40]" ] }, { "cell_type": "code", "execution_count": 108, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1323, 3)" ] }, "execution_count": 108, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia_unique = df_jalia.copy().drop_duplicates(subset='title', keep=False)\n", "df_jalia_unique.shape" ] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ce40c4c38ac744378ddc00b316b5a336", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=1323.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_jalia_unique['oa_status'] = df_jalia_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Results" ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 695\n", "bronze 561\n", "green 66\n", "hybrid 1\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 110, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 52.5%\n", "bronze 42.4%\n", "green 5.0%\n", "hybrid 0.1%\n", "Name: oa_status, dtype: object" ] }, "execution_count": 111, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_jalia_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 112, 
"metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_jalia_unique, title='Journal of the Australian Library and Information Association')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Labour History\n", "\n", "* ISSN: 0023-6942\n", "* [Website](https://www.liverpooluniversitypress.co.uk/journals/id/55)" ] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "920d296de5c241fcaa93144abf703fd2", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=2792.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "works_lh = harvest_works('0023-6942')" ] }, { "cell_type": "code", "execution_count": 114, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2792, 3)" ] }, "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh = pd.DataFrame(works_lh)\n", "df_lh.shape" ] }, { "cell_type": "code", "execution_count": 115, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2792, 3)" ] }, "execution_count": 115, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh.drop_duplicates(inplace=True)\n", "df_lh.shape" ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table><thead><tr><th>doi</th><th>title</th><th>year</th></tr></thead><tbody></tbody></table>
" ], "text/plain": [ "Empty DataFrame\n", "Columns: [doi, title, year]\n", "Index: []" ] }, "execution_count": 116, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh.loc[df_lh['title'].isnull()]" ] }, { "cell_type": "code", "execution_count": 117, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2792, 3)" ] }, "execution_count": 117, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh.dropna(subset=['title'], inplace=True)\n", "df_lh.shape" ] }, { "cell_type": "code", "execution_count": 118, "metadata": {}, "outputs": [ { "data": { "text/plain": [ " 280\n", "Review 54\n", "Front Matter 19\n", "Back Matter 17\n", "EDITORIAL 8\n", "Editorial 8\n", "Introduction 4\n", "The Labor Government in the Second World War: A Memoir 3\n", "Notice Board 3\n", "Book Note 3\n", "The Emigration to Valparaiso in 1843 2\n", "The Workers' Union 2\n", "Masters and Servants 2\n", "Special Notice 2\n", "Keep Moving 2\n", "Australian Popular Culture 2\n", "The Australian Society for the Study of Labour History: International Links 2\n", "Rejoinder 2\n", "The Eureka Stockade 2\n", "Paradise Mislaid: In Search of the Australian Tribe of Paraguay 1\n", "Company Boats, Sailing Dinghies and Passenger Fish: Fathoming Torres Strait Islander Participation in the Maritime Economy 1\n", "The Unlucky Australians 1\n", "'& So We are \"Slave owners\"!': Employers and the NSW Aborigines Protection Board Trust Funds 1\n", "James Duhig 1\n", "Joseph Symes and the Australasian Secular Association 1\n", "Voluntary Work and Labour History 1\n", "The Communist Party of Australia and the Palestinian Revolution, 1967-1976 1\n", "Blood on the Rails: The Cairns-Kuranda Railway Construction and the Queensland Employers' Liability Act 1\n", "Two Lives, One Sheet of Paper and the “Great War”: A Moment in the Lives of Doris and Maurice Blackburn 1\n", "'A Song for the Future': A Response to Paul Pickering 1\n", "Name: title, dtype: int64" ] }, "execution_count": 
118, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh['title'].value_counts()[:30]" ] }, { "cell_type": "code", "execution_count": 119, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2375, 3)" ] }, "execution_count": 119, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh_unique = df_lh.copy().drop_duplicates(subset='title', keep=False)\n", "df_lh_unique.shape" ] }, { "cell_type": "code", "execution_count": 120, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "ef269c6b3b114329824f6cb44482c0a6", "version_major": 2, "version_minor": 0 }, "text/plain": [ "HBox(children=(HTML(value='records'), FloatProgress(value=0.0, max=2375.0), HTML(value='')))" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "\n" ] } ], "source": [ "df_lh_unique['oa_status'] = df_lh_unique['doi'].progress_apply(get_oa_status)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Results" ] }, { "cell_type": "code", "execution_count": 121, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 2229\n", "green 146\n", "Name: oa_status, dtype: int64" ] }, "execution_count": 121, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh_unique['oa_status'].value_counts()" ] }, { "cell_type": "code", "execution_count": 122, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "closed 93.9%\n", "green 6.1%\n", "Name: oa_status, dtype: object" ] }, "execution_count": 122, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df_lh_unique['oa_status'].value_counts(normalize=True).mul(100).round(1).astype(str) + '%'" ] }, { "cell_type": "code", "execution_count": 123, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "chart_oa_status(df_lh_unique, title='Labour History')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "----\n", "\n", "Created by [Tim Sherratt](https://timsherratt.org) \n", "This work is licensed under a Creative Commons Attribution 4.0 International License." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }