{ "metadata": { "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5-final" }, "orig_nbformat": 2, "kernelspec": { "name": "Python 3.8.5 64-bit", "display_name": "Python 3.8.5 64-bit", "metadata": { "interpreter": { "hash": "1ee38ef4a5a9feb55287fd749643f13d043cb0a7addaab2a9c224cbe137c0062" } } } }, "nbformat": 4, "nbformat_minor": 2, "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# DataCite Corpus Descriptive Statistics\n", "\n", "Bibliometricians need an overall picture of the DataCite data corpus before they can analyse it. To provide this, we will create a notebook with descriptive statistics about the DataCite data corpus, broken down by the dimensions of interest (viz. discipline, career status, usage, and citations). This document is a first step towards that notebook: it looks directly at the DOI index and explores the descriptive statistics that will be used.\n", "\n", "I have split the descriptive statistics into two main sections, one per use case: discipline and career status. Each section then breaks the data down by two dimensions: citations and usage.\n", "\n", "I found that only a limited number of datasets with disciplinary information have citations, views, and downloads. We must implement methods to enrich our metadata in order to obtain a significant sample. The proxy approach (using a repository's discipline as a proxy for the discipline of its datasets) is a good way to enrich the metadata in terms of discipline, and it would give the bibliometricians a larger data corpus. However, none of the disciplinary repositories sends usage reports about its datasets, which might limit the usefulness of that data.\n" ] }, { "source": [ "## Set up\n", "\n", "Installing and importing packages." 
], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%%capture\n", "# Install required Python packages\n", "!pip install dfply altair altair_saver vega altair_viewer dash==1.16.3 " ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "import json\n", "import numpy as np\n", "from dfply import *\n", "import altair.vega.v5 as alt\n", "from altair_saver import save\n", "import altair.vegalite.v4 as lite\n", "import pandas as pd\n", "import plotly.graph_objects as go\n" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "# Prepare the GraphQL client\n", "import requests\n", "from IPython.display import display, Markdown\n", "from gql import gql, Client\n", "from gql.transport.requests import RequestsHTTPTransport\n", "\n", "_transport = RequestsHTTPTransport(\n", "    url='https://api.datacite.org/graphql',\n", "    use_json=True,\n", ")\n", "\n", "client = Client(\n", "    transport=_transport,\n", "    fetch_schema_from_transport=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fetching Data\n", "\n", "We obtain all the data from the DataCite GraphQL API. All the queries are for dataset DOIs that include Field of Science information in their metadata. 
We have three different queries:\n", "\n", "- DOIs with citations\n", "- DOIs with views\n", "- DOIs with downloads\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "# GraphQL queries for dataset DOIs whose metadata includes Fields of Science and Technology (FOS) subjects.\n", "\n", "query_params = {\n", "    \"query\" : \"subjects.subjectScheme:\\\"Fields of Science and Technology (FOS)\\\"\",\n", "}\n", "\n", "datasetsQuery = gql(\"\"\"query \n", "{\n", "  datasets {\n", "    totalCount\n", "  }\n", "}\n", "\"\"\")\n", "\n", "\n", "fOSQuery = gql(\"\"\"query getOutputs($query: String)\n", "{\n", "  datasets(query: $query) {\n", "    totalCount\n", "  }\n", "}\n", "\"\"\")\n", "\n", "\n", "citationsQuery = gql(\"\"\"query getOutputs($query: String)\n", "{\n", "  datasets(query: $query, hasCitations:1) {\n", "    totalCount\n", "    fieldsOfScience{\n", "      title\n", "      count\n", "    }\n", "    published{\n", "      title\n", "      count\n", "    }\n", "    licenses{\n", "      title\n", "      count\n", "    }\n", "    affiliations{\n", "      title\n", "      count\n", "    }\n", "  }\n", "}\n", "\"\"\")\n", "\n", "viewsQuery = gql(\"\"\"query getOutputs($query: String)\n", "{\n", "  datasets(query:$query, hasViews:1) {\n", "    totalCount\n", "    fieldsOfScience{\n", "      title\n", "      count\n", "    }\n", "    published{\n", "      title\n", "      count\n", "    }\n", "    licenses{\n", "      title\n", "      count\n", "    }\n", "    affiliations{\n", "      title\n", "      count\n", "    }\n", "  }\n", "}\n", "\n", "\"\"\")\n", "\n", "downloadsQuery = gql(\"\"\"query getOutputs($query: String)\n", "{\n", "  datasets(query:$query, hasDownloads:1) {\n", "    totalCount\n", "    fieldsOfScience{\n", "      title\n", "      count\n", "    }\n", "    published{\n", "      title\n", "      count\n", "    }\n", "    licenses{\n", "      title\n", "      count\n", "    }\n", "    affiliations{\n", "      title\n", "      count\n", "    }\n", "  }\n", "}\n", "\n", "\"\"\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "def get_data(type):\n", "    \"\"\"Gets the data from the GraphQL API into an object\n", "\n", "    Parameters:\n", "    type (string): Controlled vocabulary for the type of data (\"citations\", \"views\", \"downloads\", \"fos\"; any other value returns all datasets)\n", "\n", "    Returns:\n", "    object: The \"datasets\" object of the GraphQL response\n", "\n", "    \"\"\"\n", "    # The variables are passed to gql as a dict rather than as a JSON string\n", "    if type == \"citations\":\n", "        return client.execute(citationsQuery, variable_values=query_params)[\"datasets\"]\n", "    elif type == \"views\":\n", "        return client.execute(viewsQuery, variable_values=query_params)[\"datasets\"]\n", "    elif type == \"downloads\":\n", "        return client.execute(downloadsQuery, variable_values=query_params)[\"datasets\"]\n", "    elif type == \"fos\":\n", "        return client.execute(fOSQuery, variable_values=query_params)[\"datasets\"]\n", "    else:\n", "        return client.execute(datasetsQuery, variable_values=query_params)[\"datasets\"]\n" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "usage = get_data(\"views\")\n", "citations = get_data(\"citations\")\n", "datasets = get_data(\"datasets\")\n", "fos = get_data(\"fos\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Transformation\n", "\n", "Simple transformations convert the GraphQL response into a dataframe that can be used in visualisations and tables." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "def transform_distributions(dataframe, total):\n", "    \"\"\"Adds to each item its percentage share of the total\n", "\n", "    Parameters:\n", "    dataframe (dataframe): A dataframe with all the items (one row per title, with its count)\n", "    total (int): The total count used as the denominator of the percentage\n", "\n", "    Returns:\n", "    dataframe: The same dataframe with a new perc attribute, or an empty dataframe if the input is None\n", "\n", "    \"\"\"\n", "    # dataframe = {title: \"Other\", count: total}\n", "    if dataframe is None:\n", "        return pd.DataFrame()\n", "    else:\n", "        return (dataframe >>\n", "            mutate(\n", "                perc = (X['count']/total)*100\n", "            )\n", "        )\n", "    " ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "def processTable(data, type):\n", "    # data = get_data(\"citations\")\n", "    # Return None when the requested facet is empty, otherwise a dataframe with percentages\n", "    if len(data[type]) == 0:\n", "        return None\n", "    else:\n", "        table = pd.DataFrame(data[type],columns=data[type][0].keys())\n", "        return transform_distributions(table, data['totalCount'])   " ] }, { "source": [ "## Descriptive Statistics Visualisation\n" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "output_type": "display_data", "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "domain": { "x": [ 0, 1 ], "y": [ 0, 1 ] }, "mode": "number+delta", "title": { "text": "Datasets" }, "type": "indicator", "value": 7627926 } ], "layout": { "paper_bgcolor": "lightgray", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" 
}, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], 
[ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 
0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", 
"polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } } } } }, "metadata": {} } ], "source": [ "fig = go.Figure(go.Indicator(\n", " mode = \"number+delta\",\n", " value = datasets[\"totalCount\"],\n", " title= {'text': \"Datasets\"},\n", " domain = {'x': [0, 1], 'y': [0, 1]}))\n", "fig.update_layout(paper_bgcolor = \"lightgray\")\n", "fig.show()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "output_type": "display_data", "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "domain": { "x": [ 0, 1 ], "y": [ 0, 1 ] }, "mode": "number+delta", "title": { "text": "Datasets with FOS" }, "type": 
"indicator", "value": 455944 } ], "layout": { "paper_bgcolor": "lightgray", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 
0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { 
"outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" 
], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } } } } }, "metadata": {} } ], "source": [ "fig = go.Figure(go.Indicator(\n", " mode = \"number+delta\",\n", " 
value = fos[\"totalCount\"],\n", " title= {'text': \"Datasets with FOS\"},\n", " domain = {'x': [0, 1], 'y': [0, 1]}))\n", "fig.update_layout(paper_bgcolor = \"lightgray\")\n", "fig.show()" ] }, { "source": [ "Questions:\n", "\n", "- How can we enrich the metadata of the dataset DOIs to include disciplinary information?\n", "    - As a best estimate (viz. using the repository's discipline as a proxy for the dataset's discipline), we can increase the sample size from ~455K to ~1.8M DOIs with disciplinary metadata. Unfortunately, none of those 1.8M DOIs has “usage” information (none of the disciplinary repositories is sending usage reports), and only ~9K of the 1.8M have at least one citation.\n" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "The descriptive statistics below are split into two main sections, one per type of data: citations and usage." ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Citations stats with disciplinary information" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "output_type": "display_data", "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "domain": { "x": [ 0, 1 ], "y": [ 0, 1 ] }, "mode": "number+delta", "title": { "text": "Cited Datasets (0.18%)" }, "type": "indicator", "value": 831 } ], "layout": { "paper_bgcolor": "lightgray", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", 
"startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, 
"#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 
0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { "fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": 
"#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } } } } }, "metadata": {} } ], "source": [ "\n", "perc = 100*(citations[\"totalCount\"]/fos[\"totalCount\"])\n", "\n", "\n", "fig = go.Figure(go.Indicator(\n", " mode = \"number+delta\",\n", " value = citations[\"totalCount\"],\n", " title= {'text': f\"Cited Datasets ({perc:.2f}%)\"},\n", " domain = {'x': [0, 1], 'y': [0, 1]}))\n", "fig.update_layout(paper_bgcolor = \"lightgray\")\n", "fig.show()" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 Rice University 5 0.601685\n", "1 University of California, Berkeley 5 0.601685\n", "2 University of Melbourne 5 0.601685\n", "3 Utah State University 4 0.481348\n", "4 University 
of California System 4 0.481348\n", "5 French National Centre for Scientific Research 4 0.481348\n", "6 University of Florida 4 0.481348\n", "7 University of Arizona 4 0.481348\n", "8 Cornell University 4 0.481348\n", "9 University of Sheffield 4 0.481348" ], "text/html": "
" }, "metadata": {}, "execution_count": 20 } ], "source": [ "processTable(citations, \"affiliations\")" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 Earth and related environmental sciences 349 41.997593\n", "1 Sociology 146 17.569194\n", "2 Biological sciences 142 17.087846\n", "3 Social sciences 55 6.618532\n", "4 Clinical medicine 30 3.610108\n", "5 Computer and information sciences 23 2.767750\n", "6 Health sciences 21 2.527076\n", "7 Languages and literature 16 1.925391\n", "8 Psychology 15 1.805054\n", "9 Physical sciences 12 1.444043" ], "text/html": "
" }, "metadata": {}, "execution_count": 21 } ], "source": [ "processTable(citations, \"fieldsOfScience\") " ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 2020 150 18.050542\n", "1 2019 67 8.062575\n", "2 2018 78 9.386282\n", "3 2017 45 5.415162\n", "4 2016 65 7.821901\n", "5 2015 51 6.137184\n", "6 2014 29 3.489771\n", "7 2013 21 2.527076\n", "8 2012 19 2.286402\n", "9 2011 105 12.635379\n", "10 2010 17 2.045728" ], "text/html": "
" }, "metadata": {}, "execution_count": 22 } ], "source": [ "processTable(citations, \"published\") \n" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 CC0-1.0 210 25.270758\n", "1 CC-BY-4.0 129 15.523466\n", "2 CC-BY-3.0 4 0.481348\n", "3 cc-by-nd-2.0 3 0.361011\n", "4 MIT 2 0.240674\n", "5 CC-BY-NC-4.0 1 0.120337\n", "6 CC-BY-NC-ND-4.0 1 0.120337\n", "7 CC-BY-NC-SA-4.0 1 0.120337\n", "8 CC-BY-SA-4.0 1 0.120337\n", "9 GPL-3.0 1 0.120337" ], "text/html": "
" }, "metadata": {}, "execution_count": 26 } ], "source": [ "processTable(citations, \"licenses\") \n" ] }, { "source": [ "Questions:\n", "\n", "- Why are there so few dataset DOIs with citations?\n", " - There are obviously social reasons, but we can focus on the technical reasons here:\n", " - The citation events in EventData still need to reach our DOI index. [Link](https://github.com/datacite/datacite/issues/1082) \n", " - Citation counts from/to Crossref DOIs can be too low if we have not indexed the Crossref metadata. [Link](https://github.com/datacite/datacite/issues/1082) \n" ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Usage stats with disciplinary information" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "output_type": "display_data", "data": { "application/vnd.plotly.v1+json": { "config": { "plotlyServerURL": "https://plot.ly" }, "data": [ { "domain": { "x": [ 0, 1 ], "y": [ 0, 1 ] }, "mode": "number+delta", "title": { "text": "Viewed Datasets (0.05%)" }, "type": "indicator", "value": 244 } ], "layout": { "paper_bgcolor": "lightgray", "template": { "data": { "bar": [ { "error_x": { "color": "#2a3f5f" }, "error_y": { "color": "#2a3f5f" }, "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "bar" } ], "barpolar": [ { "marker": { "line": { "color": "#E5ECF6", "width": 0.5 } }, "type": "barpolar" } ], "carpet": [ { "aaxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "baxis": { "endlinecolor": "#2a3f5f", "gridcolor": "white", "linecolor": "white", "minorgridcolor": "white", "startlinecolor": "#2a3f5f" }, "type": "carpet" } ], "choropleth": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "choropleth" } ], "contour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 
0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "contour" } ], "contourcarpet": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "contourcarpet" } ], "heatmap": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmap" } ], "heatmapgl": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "heatmapgl" } ], "histogram": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "histogram" } ], "histogram2d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2d" } ], "histogram2dcontour": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, 
"#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "histogram2dcontour" } ], "mesh3d": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "type": "mesh3d" } ], "parcoords": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "parcoords" } ], "pie": [ { "automargin": true, "type": "pie" } ], "scatter": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter" } ], "scatter3d": [ { "line": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatter3d" } ], "scattercarpet": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattercarpet" } ], "scattergeo": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergeo" } ], "scattergl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattergl" } ], "scattermapbox": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scattermapbox" } ], "scatterpolar": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolar" } ], "scatterpolargl": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterpolargl" } ], "scatterternary": [ { "marker": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "type": "scatterternary" } ], "surface": [ { "colorbar": { "outlinewidth": 0, "ticks": "" }, "colorscale": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "type": "surface" } ], "table": [ { "cells": { "fill": { "color": "#EBF0F8" }, "line": { "color": "white" } }, "header": { 
"fill": { "color": "#C8D4E3" }, "line": { "color": "white" } }, "type": "table" } ] }, "layout": { "annotationdefaults": { "arrowcolor": "#2a3f5f", "arrowhead": 0, "arrowwidth": 1 }, "coloraxis": { "colorbar": { "outlinewidth": 0, "ticks": "" } }, "colorscale": { "diverging": [ [ 0, "#8e0152" ], [ 0.1, "#c51b7d" ], [ 0.2, "#de77ae" ], [ 0.3, "#f1b6da" ], [ 0.4, "#fde0ef" ], [ 0.5, "#f7f7f7" ], [ 0.6, "#e6f5d0" ], [ 0.7, "#b8e186" ], [ 0.8, "#7fbc41" ], [ 0.9, "#4d9221" ], [ 1, "#276419" ] ], "sequential": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ], "sequentialminus": [ [ 0, "#0d0887" ], [ 0.1111111111111111, "#46039f" ], [ 0.2222222222222222, "#7201a8" ], [ 0.3333333333333333, "#9c179e" ], [ 0.4444444444444444, "#bd3786" ], [ 0.5555555555555556, "#d8576b" ], [ 0.6666666666666666, "#ed7953" ], [ 0.7777777777777778, "#fb9f3a" ], [ 0.8888888888888888, "#fdca26" ], [ 1, "#f0f921" ] ] }, "colorway": [ "#636efa", "#EF553B", "#00cc96", "#ab63fa", "#FFA15A", "#19d3f3", "#FF6692", "#B6E880", "#FF97FF", "#FECB52" ], "font": { "color": "#2a3f5f" }, "geo": { "bgcolor": "white", "lakecolor": "white", "landcolor": "#E5ECF6", "showlakes": true, "showland": true, "subunitcolor": "white" }, "hoverlabel": { "align": "left" }, "hovermode": "closest", "mapbox": { "style": "light" }, "paper_bgcolor": "white", "plot_bgcolor": "#E5ECF6", "polar": { "angularaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "radialaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "scene": { "xaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "yaxis": { 
"backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" }, "zaxis": { "backgroundcolor": "#E5ECF6", "gridcolor": "white", "gridwidth": 2, "linecolor": "white", "showbackground": true, "ticks": "", "zerolinecolor": "white" } }, "shapedefaults": { "line": { "color": "#2a3f5f" } }, "ternary": { "aaxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "baxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" }, "bgcolor": "#E5ECF6", "caxis": { "gridcolor": "white", "linecolor": "white", "ticks": "" } }, "title": { "x": 0.05 }, "xaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 }, "yaxis": { "automargin": true, "gridcolor": "white", "linecolor": "white", "ticks": "", "title": { "standoff": 15 }, "zerolinecolor": "white", "zerolinewidth": 2 } } } } } }, "metadata": {} } ], "source": [ "perc = 100*(usage[\"totalCount\"]/fos[\"totalCount\"])\n", "\n", "\n", "fig = go.Figure(go.Indicator(\n", " mode = \"number+delta\",\n", " value = usage[\"totalCount\"],\n", " title= {'text': f\"Viewed Datasets ({perc:.2f}%)\"},\n", " # delta = {'position': \"top\", 'reference': usage[\"published\"][0][\"count\"]},\n", " domain = {'x': [0, 1], 'y': [0, 1]}))\n", "\n", "fig.update_layout(paper_bgcolor = \"lightgray\")\n", "\n", "fig.show()" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 Sociology 166 68.032787\n", "1 Biological sciences 39 15.983607\n", "2 Clinical medicine 13 5.327869\n", "3 Health sciences 9 3.688525\n", "4 Computer and information sciences 4 1.639344\n", "5 Chemical engineering 2 0.819672\n", "6 Earth and related environmental sciences 2 0.819672\n", "7 Languages and literature 2 0.819672\n", "8 Medical biotechnology 2 
0.819672\n", "9 Chemical sciences 1 0.409836" ], "text/html": "
" }, "metadata": {}, "execution_count": 28 } ], "source": [ "processTable(usage, \"fieldsOfScience\") " ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 University of California, Berkeley 7 2.868852\n", "1 Rice University 5 2.049180\n", "2 Utah State University 5 2.049180\n", "3 University of Melbourne 5 2.049180\n", "4 Harvard University 5 2.049180\n", "5 University of Helsinki 5 2.049180\n", "6 University of Sheffield 5 2.049180\n", "7 University of Montana 4 1.639344\n", "8 Princeton University 4 1.639344\n", "9 University of California System 4 1.639344" ], "text/html": "
" }, "metadata": {}, "execution_count": 29 } ], "source": [ "processTable(usage, \"affiliations\")" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 2020 23 9.426230\n", "1 2019 53 21.721311\n", "2 2018 31 12.704918\n", "3 2017 21 8.606557\n", "4 2016 37 15.163934\n", "5 2015 26 10.655738\n", "6 2014 19 7.786885\n", "7 2013 11 4.508197\n", "8 2012 13 5.327869\n", "9 2011 4 1.639344\n", "10 2010 2 0.819672" ], "text/html": "
" }, "metadata": {}, "execution_count": 30 } ], "source": [ "processTable(usage, \"published\")" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ " title count perc\n", "0 CC0-1.0 226 92.622951\n", "1 CC-BY-4.0 2 0.819672" ], "text/html": "
" }, "metadata": {}, "execution_count": 31 } ], "source": [ "processTable(usage, \"licenses\")" ] }, { "source": [ "Questions:\n", "\n", "- Why are there so few dataset DOIs with usage information?\n", " - Currently, only a handful of repositories send usage statistics." ], "cell_type": "markdown", "metadata": {} }, { "source": [ "## Further Visualisation [WIP]\n" ], "cell_type": "markdown", "metadata": {} }, { "cell_type": "code", "execution_count": 32, "metadata": { "tags": [] }, "outputs": [], "source": [ "def vega_donut_template(data):\n", " \"\"\"Injects data into the Vega-Lite donut chart specification\n", "\n", " Parameters:\n", " data (str): JSON-encoded array of records with title and count fields\n", "\n", " Returns:\n", " str: Vega-Lite specification (JSON text) with the data embedded\n", "\n", " \"\"\"\n", " return \"\"\"\n", "{\n", " \"$schema\": \"https://vega.github.io/schema/vega-lite/v4.json\",\n", " \"description\": \"A simple donut chart with embedded data.\",\n", " \"padding\": {\"left\": 55, \"top\": 10, \"right\": 10, \"bottom\": 10},\n", " \"width\": 200,\n", " \"height\": 200,\n", " \"data\": {\n", " \"values\": \"\"\" + data + \"\"\"\n", " },\n", " \"layer\": [\n", " {\n", " \"mark\": {\n", " \"type\": \"arc\",\n", " \"innerRadius\": 68,\n", " \"outerRadius\": 90,\n", " \"cursor\": \"pointer\",\n", " \"tooltip\": true\n", " },\n", " \"encoding\": {\n", " \"theta\": {\n", " \"field\": \"count\",\n", " \"type\": \"quantitative\",\n", " \"sort\": \"descending\"\n", " },\n", " \"color\": {\n", " \"field\": \"title\",\n", " \"type\": \"nominal\",\n", " \"title\": \"type\",\n", " \"scale\": {\n", " \"range\": [\n", " \"#fccde5\",\n", " \"#fdb462\",\n", " \"#fb8072\",\n", " \"#fb8072\",\n", " \"#b3de69\",\n", " \"#bc80bd\",\n", " \"#fccde5\",\n", " \"#8dd3c7\",\n", " \"#ffed6f\",\n", " \"#d9d9d9\",\n", " \"#ffffb3\",\n", " \"#bebada\",\n", " \"#80b1d3\",\n", " \"#ccebc5\",\n", " \"#d9d9d9\"\n", " ],\n", " \"domain\": [\n", " \"2020\",\n", " \"2019\",\n", " 
\"2018\",\n", " \"2017\",\n", " \"2016\",\n", " \"2015\",\n", " 
\"2014\",\n", " \"Model\",\n", " \"Physical Object\",\n", " \"Service\",\n", " \"Sound\",\n", " \"Software\",\n", " \"Text\",\n", " \"Workflow\",\n", " \"Other\"\n", " ]\n", " }\n", " }\n", " }\n", " },\n", " {\n", " \"mark\": {\n", " \"type\": \"text\",\n", " \"fill\": \"#767676\",\n", " \"align\": \"center\",\n", " \"baseline\": \"middle\",\n", " \"fontSize\": 27\n", " },\n", " \"encoding\": {\"text\": {\"value\": \"33\"}}\n", " }\n", " ],\n", " \"view\": {\"stroke\": \"none\"}\n", "}\n", " \"\"\"" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "tags": [] }, "outputs": [], "source": [ "def vega_grid_template(total, discipline_distribution, affiliation_distribution):\n", " \"\"\"Injects three chart bodies into a vertically concatenated Vega-Lite specification [WIP]\n", "\n", " Parameters:\n", " total (str): Body of the view showing the total count\n", " discipline_distribution (str): Body of the view showing the distribution by discipline\n", " affiliation_distribution (str): Body of the view showing the distribution by affiliation\n", "\n", " Returns:\n", " str: Vega-Lite specification (JSON text) with the three views embedded\n", "\n", " \"\"\"\n", " return \"\"\"\n", "{\n", " \"$schema\": \"https://vega.github.io/schema/vega-lite/v4.json\",\n", " \"description\": \"Three vertically concatenated charts: total count, distribution by discipline, and distribution by affiliation.\",\n", " \"data\": {\n", " \"url\": \"data/weather.csv\"\n", " },\n", "\n", " \"vconcat\": [\n", " {\n", " \"\"\" + total + \"\"\"\n", " },\n", " {\n", " \"\"\" + discipline_distribution + \"\"\"\n", " },\n", " {\n", " \"\"\" + affiliation_distribution + \"\"\"\n", " }\n", " ]\n", "}\n", "\n", " \"\"\"" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "output_type": "error", "ename": "KeyError", "evalue": "'published'", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 
"\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mchart\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mlite\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mVegaLite\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mjson\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mloads\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mvega_donut_template\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mjson\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdumps\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mget_data\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"published\"\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mchart\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mKeyError\u001b[0m: 'published'" ] } ], "source": [ "chart = lite.VegaLite(json.loads(vega_donut_template(json.dumps(get_data(\"\")[\"published\"]))))\n", "chart" ] }, { "cell_type": "code", "execution_count": 195, "metadata": {}, "outputs": [], "source": [ "def vega_hist_template(data):\n", " \"\"\"Injects data into the vega specification\n", "\n", " Parameters:\n", " data (array): Array of nodes\n", "\n", " Returns:\n", " VegaSpec:Specification with data\n", "\n", " \"\"\"\n", " return \"\"\"\n", "{\n", " \"$schema\": \"https://vega.github.io/schema/vega-lite/v4.json\",\n", " \"data\": {\"values\": \"\"\" + data + \"\"\"},\n", " \"padding\": {\"left\": 5, \"top\": 5, \"right\": 5, \"bottom\": 5},\n", " \"transform\": [\n", " {\"calculate\": \"toNumber(datum.title)\", \"as\": \"period\"},\n", " {\"calculate\": \"toNumber(datum.title)+1\", \"as\": \"bin_end\"},\n", " {\"filter\": \"toNumber(datum.title) >= 2010\"}\n", " ],\n", " \"width\": 242,\n", " \"mark\": {\"type\": \"bar\", \"cursor\": \"pointer\", \"tooltip\": true},\n", " \"selection\": {\n", " \"highlight\": {\"type\": \"single\", \"empty\": \"none\", \"on\": \"mouseover\"}\n", " },\n", " \"encoding\": {\n", " \"x\": {\n", " \"field\": \"period\",\n", " 
\"bin\": {\"binned\": true, \"step\": 1, \"maxbins\": 11},\n", " \"type\": \"quantitative\",\n", " \"axis\": {\n", " \"format\": \"1\"\n", " },\n", " \"scale\": {\"domain\": [2010, 2021]}\n", " },\n", " \"x2\": {\"field\": \"bin_end\"},\n", " \"y\": {\n", " \"field\": \"count\",\n", " \"type\": \"quantitative\",\n", " \"axis\": {\"format\": \",f\", \"tickMinStep\": 1}\n", " },\n", " \"color\": {\n", " \"field\": \"count\",\n", " \"scale\": {\"range\": [\"#1abc9c\"]},\n", " \"type\": \"nominal\",\n", " \"legend\": null,\n", " \"condition\": [{\"selection\": \"highlight\", \"value\": \"#34495e\"}]\n", " }\n", " },\n", " \"config\": {\n", " \"view\": {\"stroke\": null},\n", " \"axis\": {\"grid\": false, \"title\": \"donut\", \"labelFlush\": false}\n", " }\n", "}\n", "\n", "\"\"\"" ] }, { "cell_type": "code", "execution_count": 197, "metadata": {}, "outputs": [ { "output_type": "execute_result", "data": { "text/html": "\n
\n", "text/plain": [ "" ] }, "metadata": {}, "execution_count": 197 } ], "source": [ "chart = lite.VegaLite(json.loads(vega_hist_template(json.dumps(get_data(\"\")[\"published\"]))))\n", "chart" ] } ] }