{ "cells": [ { "cell_type": "markdown", "id": "75a6920e", "metadata": {}, "source": [ "# 5. Journals indexed in Scopus by their disciplinary distribution" ] }, { "cell_type": "markdown", "id": "36d5a09d", "metadata": {}, "source": [ "### Notebook objectives:\n", "1. Determine the disciplinary distribution of Scopus journals for comparison with OJS.\n", "*Updated 9/22/2022" ] }, { "cell_type": "code", "execution_count": 1, "id": "950a5897", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Python 3.10.5\r\n" ] } ], "source": [ "!python --version" ] }, { "cell_type": "code", "execution_count": 2, "id": "b48bbcdc", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import pandas as pd\n", "import requests\n", "import json\n", "import os\n", "import re\n", "from easynmt import EasyNMT\n", "from tqdm import tqdm" ] }, { "cell_type": "code", "execution_count": 3, "id": "5b396012", "metadata": {}, "outputs": [], "source": [ "scopus = pd.read_excel(os.path.join('data', 'scopus_jan2021.xlsx'))\n", "scopus = scopus.drop_duplicates(subset=[\"Print-ISSN\", \"E-ISSN\"])" ] }, { "cell_type": "markdown", "id": "82907140", "metadata": {}, "source": [ "### Education isn't a defined subject area in the Scopus data, so I approximate the number of Education journals using a multilingual string search for \"education,\" \"teach,\" and \"learn\" in journal titles:" ] }, { "cell_type": "code", "execution_count": 4, "id": "cc3e4750", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "49 \n", " ['AFR', 'ARA', 'ARM', 'AZE', 'BAQ', 'BOS', 'BUL', 'CAT', 'CHI', 'CHN', 'CZE', 'DAN', 'DUT', 'ENF', 'ENG', 'EST', 'FIN', 'FRE', 'GER', 'GLE', 'GLG', 'GRE', 'HEB', 'HUN', 'ICE', 'IND', 'ITA', 'JPN', 'KOR', 'LAV', 'LIT', 'MAC', 'MAO', 'MAY', 'NOR', 'PER', 'POL', 'POR', 'RUM', 'RUS', 'SCC', 'SCR', 'SLO', 'SLV', 'SPA', 'SWE', 'THA', 'TUR', 'UKR']\n" ] } ], "source": [
"langs = scopus['Article language in source (three-letter ISO language codes)'].unique().tolist()\n", "langs = [re.split(r\"[^a-zA-Z]+\", l) for l in langs if isinstance(l, str)]\n", "langs = sorted(list(set([l for subl in langs for l in subl])))\n", "print(len(langs), \"\\n\", langs)" ] }, { "cell_type": "code", "execution_count": 5, "id": "0d91bc16", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['AFR: 13', 'ARA: 24', 'ARM: 1', 'AZE: 3', 'BAQ: 6', 'BOS: 9', 'BUL: 18', 'CAT: 33', 'CHI: 562', 'CHN: 1', 'CZE: 131', 'DAN: 14', 'DUT: 71', 'ENF: 1', 'ENG: 27125', 'EST: 19', 'FIN: 18', 'FRE: 1197', 'GER: 997', 'GLE: 5', 'GLG: 2', 'GRE: 38', 'HEB: 7', 'HUN: 44', 'ICE: 3', 'IND: 4', 'ITA: 535', 'JPN: 199', 'KOR: 69', 'LAV: 8', 'LIT: 19', 'MAC: 2', 'MAO: 1', 'MAY: 12', 'NOR: 20', 'PER: 46', 'POL: 190', 'POR: 474', 'RUM: 53', 'RUS: 412', 'SCC: 15', 'SCR: 109', 'SLO: 64', 'SLV: 61', 'SPA: 1348', 'SWE: 23', 'THA: 3', 'TUR: 134', 'UKR: 26']\n" ] } ], "source": [ "print([f\"{lang}: {scopus.iloc[:, 7].str.contains(lang).sum()}\" for lang in langs])" ] }, { "cell_type": "markdown", "id": "c54d7503", "metadata": {}, "source": [ "#### Save a list of ISO-639-1 language codes with Latin scripts, because the Scopus data only feature titles written in Latin scripts:\n", "(Transliteration is used by Scopus, but transliterating \"Education\" to Chinese \"Jiaoyu\" returns no titles. 
I will skip transliteration because the success rate seems so low.)" ] }, { "cell_type": "code", "execution_count": 6, "id": "9c723f0f", "metadata": {}, "outputs": [], "source": [ "iso639_1 = [\"af\", \"ca\", \"cs\", \"da\", \"nl\", \"et\", \"fi\", \"fr\", \"de\", \"hu\", \"id\", \"it\", \"lv\", \"lt\", \"no\", \"pl\",\n", " \"pt\", \"ro\", \"sr\", \"sk\", \"sl\", \"es\", \"sv\", \"tr\"]" ] }, { "cell_type": "code", "execution_count": 7, "id": "e9cc392b", "metadata": {}, "outputs": [], "source": [ "model = EasyNMT(\"opus-mt\")" ] }, { "cell_type": "code", "execution_count": 8, "id": "a31a362e", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/jball/opt/anaconda3/envs/tmp/lib/python3.10/site-packages/transformers/models/marian/tokenization_marian.py:194: UserWarning: Recommended: pip install sacremoses.\n", " warnings.warn(\"Recommended: pip install sacremoses.\")\n", "/Users/jball/opt/anaconda3/envs/tmp/lib/python3.10/site-packages/transformers/generation_utils.py:1227: UserWarning: Neither `max_length` nor `max_new_tokens` has been set, `max_length` will default to 512 (`self.config.max_length`). 
Controlling `max_length` via the config is deprecated and `max_length` will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.\n", " warnings.warn(\n", "Exception: Helsinki-NLP/opus-mt-en-lv is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n", "Exception: Helsinki-NLP/opus-mt-en-lt is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n", "Exception: Helsinki-NLP/opus-mt-en-no is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n", "Exception: Helsinki-NLP/opus-mt-en-pl is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n", "Exception: Helsinki-NLP/opus-mt-en-pt is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n", "Exception: Helsinki-NLP/opus-mt-en-sr is not a local folder and is not a valid model identifier listed on 
'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n", "Exception: Helsinki-NLP/opus-mt-en-sl is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n", "Exception: Helsinki-NLP/opus-mt-en-tr is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\n", "If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "['Opvoeding', 'leer', 'onderrig', 'educació', 'learn', 'ensenyeu', 'vzdělávání', 'učit se', 'Učit', 'uddannelse', 'lær', 'underviser', 'onderwijs', 'leren', 'lesgeven', 'haridus', 'õpi', 'õpetamine', 'koulutus', 'opi', 'opettaa', 'éducation', 'apprendre', 'enseigner', 'Bildung', 'lernen', 'Unterricht', 'oktatás', 'tanulj!', 'Tanárnő!', 'pendidikan', 'belajar', 'mengajar', 'istruzione', 'imparare', 'Insegna', 'educaţie', 'Învaţă', 'Predă', 'vzdelávanie', 'učiť sa', 'vyučovať', 'Educación', 'aprender', 'enseñar', 'utbildning', 'lära dig', 'lära ut']\n" ] } ], "source": [ "doc = [\"education\",\n", "       \"learn\",\n", "       \"teach\"]\n", "edu = []\n", "\n", "for code in iso639_1:\n", "    try:\n", "        edu.extend(model.translate(doc, target_lang=code))\n", "    except OSError:\n", "        # Skip target languages for which no opus-mt model could be loaded\n", "        continue\n", "print(edu)" ] }, { "cell_type": "code", "execution_count": 9, "id": "73aaa63c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1097" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scopus.iloc[:,
1].str.contains(\n", " \"ducat|teach|learn|opvoeding|onderrig|educaci|enseny|vzdela|ucit se|uddanel|undervis|onderwijs|leren|lesgev|haridus|opetami|koulutus|opettaa|apprend|enseign|bildung|lernen|unterricht|oktatas|tanulj|tanarno|pendidikan|belajar|mengajar|istruzione|imparare|insenga|invata|vyuco|aprend|ensena|utbild|lara dig|lara ut|jiaoyu\", \n", " regex=True, case=False).sum()" ] }, { "cell_type": "code", "execution_count": 10, "id": "21091c5c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Education journals: 2.6%\n" ] } ], "source": [ "ed = 1097 / 41957\n", "print(f\"Education journals: {round(ed*100, 1)}%\")" ] }, { "cell_type": "code", "execution_count": 11, "id": "48676a34", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['Top level:\\n\\nLife Sciences', 'Top level:\\n\\nSocial Sciences',\n", " 'Top level:\\n\\nPhysical Sciences', 'Top level:\\n\\nHealth Sciences',\n", " '1000 \\nGeneral', '1100\\nAgricultural and Biological Sciences',\n", " '1200\\nArts and Humanities',\n", " '1300\\nBiochemistry, Genetics and Molecular Biology',\n", " '1400\\nBusiness, Management and Accounting',\n", " '1500\\nChemical Engineering', '1600\\nChemistry',\n", " '1700\\nComputer Science', '1800\\nDecision Sciences',\n", " '1900\\nEarth and Planetary Sciences',\n", " '2000\\nEconomics, Econometrics and Finance', '2100\\nEnergy',\n", " '2200\\nEngineering', '2300\\nEnvironmental Science',\n", " '2400\\nImmunology and Microbiology', '2500\\nMaterials Science',\n", " '2600\\nMathematics', '2700\\nMedicine', '2800\\nNeuroscience',\n", " '2900\\nNursing', '3000\\nPharmacology, Toxicology and Pharmaceutics',\n", " '3100\\nPhysics and Astronomy', '3200\\nPsychology',\n", " '3300\\nSocial Sciences', '3400\\nVeterinary', '3500\\nDentistry',\n", " '3600\\nHealth Professions'],\n", " dtype='object')\n", "\n", "Int64Index: 41958 entries, 0 to 42473\n", "Data columns (total 31 columns):\n", " # Column 
Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Top level:\n", "\n", "Life Sciences 7666 non-null object\n", " 1 Top level:\n", "\n", "Social Sciences 13940 non-null object\n", " 2 Top level:\n", "\n", "Physical Sciences 14039 non-null object\n", " 3 Top level:\n", "\n", "Health Sciences 14744 non-null object\n", " 4 1000 \n", "General 148 non-null object\n", " 5 1100\n", "Agricultural and Biological Sciences 3031 non-null object\n", " 6 1200\n", "Arts and Humanities 5424 non-null object\n", " 7 1300\n", "Biochemistry, Genetics and Molecular Biology 3105 non-null object\n", " 8 1400\n", "Business, Management and Accounting 2007 non-null object\n", " 9 1500\n", "Chemical Engineering 1081 non-null object\n", " 10 1600\n", "Chemistry 1312 non-null object\n", " 11 1700\n", "Computer Science 2274 non-null object\n", " 12 1800\n", "Decision Sciences 513 non-null object\n", " 13 1900\n", "Earth and Planetary Sciences 2250 non-null object\n", " 14 2000\n", "Economics, Econometrics and Finance 1412 non-null object\n", " 15 2100\n", "Energy 717 non-null object\n", " 16 2200\n", "Engineering 5236 non-null object\n", " 17 2300\n", "Environmental Science 2636 non-null object\n", " 18 2400\n", "Immunology and Microbiology 902 non-null object\n", " 19 2500\n", "Materials Science 1917 non-null object\n", " 20 2600\n", "Mathematics 1929 non-null object\n", " 21 2700\n", "Medicine 13779 non-null object\n", " 22 2800\n", "Neuroscience 793 non-null object\n", " 23 2900\n", "Nursing 897 non-null object\n", " 24 3000\n", "Pharmacology, Toxicology and Pharmaceutics 1231 non-null object\n", " 25 3100\n", "Physics and Astronomy 1542 non-null object\n", " 26 3200\n", "Psychology 1566 non-null object\n", " 27 3300\n", "Social Sciences 8704 non-null object\n", " 28 3400\n", "Veterinary 317 non-null object\n", " 29 3500\n", "Dentistry 257 non-null object\n", " 30 3600\n", "Health Professions 702 non-null object\n", "dtypes: object(31)\n", "memory usage: 10.2+ MB\n" ] } ], 
"source": [ "print(scopus.columns[23:])\n", "scopus.iloc[:, 23:].info()" ] }, { "cell_type": "code", "execution_count": 12, "id": "d543d992", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(6752, 54)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scopus[(scopus[\"1700\\nComputer Science\"].notnull()) | \n", " (scopus[\"2200\\nEngineering\"].notnull())].shape" ] }, { "cell_type": "code", "execution_count": 13, "id": "a4b176db", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "% CS & Engineering journals: 16.1%\n" ] } ], "source": [ "csen = 6752 / 41957\n", "print(f\"% CS & Engineering journals: {round(csen*100, 1)}%\")" ] }, { "cell_type": "code", "execution_count": 14, "id": "f071aeb3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1929, 54)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "scopus[scopus[\"2600\\nMathematics\"].notnull()].shape" ] }, { "cell_type": "code", "execution_count": 15, "id": "1139f13f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "% Math journals: 4.6%\n" ] } ], "source": [ "math = 1929 / 41957\n", "print(f\"% Math journals: {round(math*100, 1)}%\")" ] }, { "cell_type": "code", "execution_count": 16, "id": "6523d922", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "% Med-Health journals: 35.1%\n" ] } ], "source": [ "medh = scopus[scopus[\"Top level:\\n\\nHealth Sciences\"].notnull()].shape[0] / 41957\n", "print(f\"% Med-Health journals: {round(medh*100, 1)}%\")" ] }, { "cell_type": "markdown", "id": "e087c8b1", "metadata": {}, "source": [ "### Use OpenAlex to try and disaggregate the \"Social Sciences\" journals:" ] }, { "cell_type": "code", "execution_count": 17, "id": "d16348d8", "metadata": {}, "outputs": [], "source": [ "e_socsci = scopus[(scopus[\"Top level:\\n\\nSocial Sciences\"].notnull()) &\n", " 
(scopus[\"E-ISSN\"].notnull())][\"E-ISSN\"]\n", "e_socsci = list(zip(e_socsci.index, [str(issn)[:4] + \"-\" + str(issn)[4:] for issn in e_socsci]))" ] }, { "cell_type": "code", "execution_count": 18, "id": "b5b2a84c", "metadata": {}, "outputs": [], "source": [ "print_socsci = scopus[(scopus[\"Top level:\\n\\nSocial Sciences\"].notnull()) &\n", "                      (scopus[\"Print-ISSN\"].notnull())][\"Print-ISSN\"]\n", "print_socsci = list(zip(print_socsci.index, [str(issn)[:4] + \"-\" + str(issn)[4:] for issn in print_socsci]))" ] }, { "cell_type": "code", "execution_count": 19, "id": "d6855385", "metadata": {}, "outputs": [], "source": [ "ss_issns = e_socsci + print_socsci" ] }, { "cell_type": "code", "execution_count": 20, "id": "620495d5", "metadata": {}, "outputs": [], "source": [ "def get_subjects(list_of_tuples):\n", "    \"\"\"Look up the top OpenAlex concept for each (index, ISSN) pair.\"\"\"\n", "    idx2subject = []\n", "    error_issns = []\n", "\n", "    for i, v in tqdm(list_of_tuples):\n", "        query = \"https://api.openalex.org/venues/issn:\" + v\n", "\n", "        try:\n", "            response = json.loads(\n", "                requests.get(query).content.decode()\n", "            )\n", "            subject = response[\"x_concepts\"][0][\"display_name\"]\n", "        except (requests.RequestException, ValueError, KeyError, IndexError, TypeError):\n", "            # Record the failed ISSN and skip it, so that a stale subject from the\n", "            # previous iteration is never stored for this journal\n", "            error_issns.append(v)\n", "            continue\n", "\n", "        idx2subject.append(\n", "            (i, subject)\n", "        )\n", "\n", "    return idx2subject, error_issns" ] }, { "cell_type": "code", "execution_count": 21, "id": "0acdd037", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "100%|███████████████████████████████████| 19908/19908 [2:39:52<00:00, 2.08it/s]\n" ] } ], "source": [ "idx2subject, errors = get_subjects(ss_issns)" ] }, { "cell_type": "code", "execution_count": 22, "id": "e5273ea4", "metadata": {}, "outputs": [], "source": [ "with open(os.path.join(\"data\", \"idx2subject_ss.json\"), \"w\") as outfile:\n", "    json.dump(idx2subject, outfile)" ] }, { "cell_type": "code", "execution_count": 23, "id": "a06bc5ee", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "{'History', 'Ecology', 'Art',
'Linguistics', 'Genetics', 'Humanities', 'Common value auction', 'Business', 'Macroeconomics', 'Economics', 'Chemistry', 'Visual arts', 'Biology', 'Physics', 'Demographic economics', 'Finance', 'Materials science', 'Economic geography', 'Population', 'Computer science', 'Psychology', 'Law', 'Nanotechnology', 'Thermodynamics', 'Outbreak', 'Geology', 'Environmental science', 'Astronomy', 'Poison control', 'Geophysics', 'Political science', 'Medicine', 'Nursing', 'Mathematics', 'Geography', 'Archaeology', 'Philosophy', 'Engineering', 'Monetary policy', 'Sociology'}\n" ] } ], "source": [ "print(set([t[1] for t in idx2subject]))" ] }, { "cell_type": "code", "execution_count": 25, "id": "9f60d919", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "% Philosophy journals: 5.6%\n", "% Linguistics journals: 0.0%\n", "% Humanities journals: 0.1%\n", "Phil: 2365, Ling: 3, Hum: 23\n" ] } ], "source": [ "d = {}\n", "phil = 0\n", "ling = 0\n", "hum = 0\n", "\n", "for idx, subject in idx2subject:\n", "\n", "    if idx not in d:\n", "        d[idx] = subject\n", "\n", "        match subject:\n", "\n", "            case \"Philosophy\":\n", "                phil += 1\n", "\n", "            case \"Linguistics\":\n", "                ling += 1\n", "\n", "            case \"Humanities\":\n", "                hum += 1\n", "\n", "print(f\"% Philosophy journals: {round(phil / 41957 * 100, 1)}%\")\n", "print(f\"% Linguistics journals: {round(ling / 41957 * 100, 1)}%\")\n", "print(f\"% Humanities journals: {round(hum / 41957 * 100, 1)}%\")\n", "print(f\"Phil: {phil}, Ling: {ling}, Hum: {hum}\")" ] }, { "cell_type": "code", "execution_count": 26, "id": "1bad2e92", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "13940\n" ] } ], "source": [ "print(len(d))" ] }, { "cell_type": "code", "execution_count": 28, "id": "7f03481d", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "% History journals: 1.9%\n", "% Art + Visual Arts journals: 3.4%\n", "% Sociology journals: 1.6%\n",
"Hist: 792, Art: 1437, Vis Art: 1, Soc: 656\n" ] } ], "source": [ "hist = 0\n", "art = 0\n", "visa = 0\n", "soc = 0\n", "\n", "for idx, subject in d.items():\n", "\n", "    match subject:\n", "        case \"History\":\n", "            hist += 1\n", "        case \"Art\":\n", "            art += 1\n", "        case \"Visual arts\":\n", "            visa += 1\n", "        case \"Sociology\":\n", "            soc += 1\n", "\n", "print(f\"% History journals: {round(hist / 41957 * 100, 1)}%\")\n", "print(f\"% Art + Visual Arts journals: {round((art+visa) / 41957 * 100, 1)}%\")\n", "print(f\"% Sociology journals: {round(soc / 41957 * 100, 1)}%\")\n", "print(f\"Hist: {hist}, Art: {art}, Vis Art: {visa}, Soc: {soc}\")" ] }, { "cell_type": "markdown", "id": "d2819b41", "metadata": {}, "source": [ "### Final step: I need to give a rough estimate of the proportions of Scopus-indexed journals falling under the rubrics of \"Language, communication, and culture\" and \"Philosophy and religion.\"" ] }, { "cell_type": "markdown", "id": "1be088cc", "metadata": {}, "source": [ "First, I will assume that a \"Philosophy\" classification from OpenAlex genuinely indicates a philosophy journal only if the journal is also classified by Scopus as \"Arts and Humanities\":" ] }, { "cell_type": "code", "execution_count": 30, "id": "017e7339", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(13940, 54)\n" ] } ], "source": [ "alex = scopus[scopus.index.isin(\n", "    list(set([t[0] for t in idx2subject]))\n", ")]\n", "print(alex.shape)" ] }, { "cell_type": "code", "execution_count": 32, "id": "72d0bc7e", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2365\n" ] } ], "source": [ "phil_indices = [idx for idx, subject in d.items() if subject == \"Philosophy\"]\n", "print(len(phil_indices))" ] }, { "cell_type": "code", "execution_count": 34, "id": "8d810e9f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1694\n", "% Philosophy journals, double checked: 4.0\n" ]
} ], "source": [ "philn = alex[\n", "    (alex.index.isin(phil_indices)) & (alex[\"1200\\nArts and Humanities\"].notnull())\n", "].shape[0]\n", "print(philn)\n", "print(f\"% Philosophy journals, double checked: {round(philn / 41957 * 100, 1)}\")" ] }, { "cell_type": "markdown", "id": "d079275f", "metadata": {}, "source": [ "The Scopus \"Arts and Humanities\" journals that OpenAlex does not label \"Philosophy,\" \"Art,\" or \"Visual arts\" provide a rough estimate of the number of journals in \"Language, communication, and culture.\" That is to say, these are the remaining \"Humanities\" journals:" ] }, { "cell_type": "code", "execution_count": 41, "id": "0aff6dcd", "metadata": {}, "outputs": [], "source": [ "art_indices = [idx for idx, subject in d.items() if subject == \"Art\" or subject == \"Visual arts\"]" ] }, { "cell_type": "code", "execution_count": 42, "id": "aa86535f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1438\n" ] } ], "source": [ "print(len(art_indices))" ] }, { "cell_type": "code", "execution_count": 45, "id": "36df1cc7", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2493\n", "% Language, communication, and culture journals, double checked: 5.9\n" ] } ], "source": [ "lcc = alex[\n", "    (alex[\"1200\\nArts and Humanities\"].notnull()) &\n", "    (~alex.index.isin(art_indices)) & \n", "    (~alex.index.isin(phil_indices))\n", "].shape[0]\n", "print(lcc)\n", "print(f\"% Language, communication, and culture journals, double checked: {round(lcc / 41957 * 100, 1)}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Py10 Temp", "language": "python", "name": "tmp" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" } }, "nbformat": 4, "nbformat_minor": 5 }