{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this lab exercise, you will learn how to perform scientometric network analysis in Python. We will start with the practicalities of basic data import and handling. We then move on to creating a network and cover some basic analysis. In the next session, we will use more advanced techniques." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Ctrl-Enter` while selecting the code cell below. Alternatively, you can press the \"Play\" button at the top of the screen. This also moves to the next cell at the same time. Using `Shift-Enter` instead of `Ctrl-Enter` will also execute the code and move to the next cell at the same time.\n",
"1
. While the code in a cell is being executed it is marked by an asterisk *
. Each cell of executed code will be numbered in the order in which you execute it. If you execute it again, it will be numbered 2
, et cetera.\n",
"data_files/wos
. At the end of this notebook, you will be asked to download your own data. If you want to load that data instead, use the path to that data.\n",
"\\
to separate directories, in Python you can also use the forward slash /
, which is usually more convenient for a number of reasons.\n",
"TI
), abstract (AB
), journal (SO
) and publication year (PY
) for rows 200-210.\n",
"Enter
\n",
"publications_df
data frame, and see how many rows it has.\n",
"PY
) and count the number of paper from each year.\n",
"Tab
. For example, you can type publications_df.
, including the .
and then press Tab
(make sure the cursor is located after the .
). If you then start typing the name of the function you are looking for and press Tab
again, Python will automatically finish it as much as possible. This is something general: whenever you press Tab
Python will try to autocomplete whatever you are typing.\n",
"\n",
"One other trick: if you have selected a function and press Shift-Tab
you will get documentation of what this function does. You can press the +
to find out more.\n",
"igraph
and call it ig
.\n",
"coupling
set it to 1
and then sum this attribute when simplifying the network.\n",
"Tab
and Shift-Tab
to find out more about possible functions.\n",
"'Van Marck, E'
to 'Migchelsen, S'
. Who is in between?\n",
"'Van Marck, E'
?\n",
"n_joint_papers_frac
over all co-authors? Then shouldn't the strength sum up to a whole number? Why isn't that the case here? (Hint: look at the authors of publication 'WOS:000242241600004'
"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"publications_df.loc['WOS:000242241600004', 'AU']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Betweenness centrality"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Betweenness centrality is much more elaborate, and gives an indication of the number of times a node is on the shortest path from one node to another node.\n",
"\n",
"As you can imagine, this can take quite some time to calculate for all nodes. We will therefore use the somewhat smaller bibliographic coupling network of journals."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"G_coauthorship
? (Hint: checkout the degree.)\n",
"pandas
\n",
"