{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this lab exercise, you will learn how to perform scientometric network analysis in Python. We start with some practicalities of basic data import and handling, then create a network and cover some basic analysis. In the next session, we will use more advanced techniques." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
Ctrl-Enter while selecting the code cell below. Alternatively, you can press the \"Play\" button at the top of the screen. This also moves to the next cell at the same time. Using Shift-Enter instead of Ctrl-Enter will also execute the code and move to the next cell at the same time.\n",
"1. While the code in a cell is being executed, it is marked by an asterisk *. Each executed cell is numbered in the order in which you execute it. If you execute it again, it will be numbered 2, et cetera.\n",
"data_files/wos. At the end of this notebook, you will be asked to download your own data. If you want to load that data instead, use the path to that data.\n",
"\\ to separate directories, in Python you can also use the forward slash /, which is usually more convenient: it needs no escaping inside strings and works on every operating system.\n",
"TI), abstract (AB), journal (SO) and publication year (PY) for rows 200-210.\n",
"Enter\n",
"publications_df data frame, and see how many rows it has.\n",
"PY) and count the number of papers from each year.\n",
"Tab. For example, you can type publications_df., including the ., and then press Tab (make sure the cursor is located after the .). If you then start typing the name of the function you are looking for and press Tab again, Python will complete it as far as possible. This works in general: whenever you press Tab, Python will try to autocomplete whatever you are typing.\n",
"\n",
"One other trick: if you have selected a function and press Shift-Tab, you will get documentation of what this function does. You can press the + to find out more.\n",
"igraph and call it ig.\n",
"coupling set it to 1 and then sum this attribute when simplifying the network.\n",
"Tab and Shift-Tab to find out more about possible functions.\n",
"'Van Marck, E' to 'Migchelsen, S'. Who is in between?\n",
"'Van Marck, E'?\n",
"n_joint_papers_frac over all co-authors? Then shouldn't the strength sum up to a whole number? Why isn't that the case here? (Hint: look at the authors of publication 'WOS:000242241600004'.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"publications_df.loc['WOS:000242241600004', 'AU']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Betweenness centrality"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Betweenness centrality is more elaborate: it indicates how often a node lies on a shortest path between two other nodes.\n",
"\n",
"Because this requires the shortest paths between all pairs of nodes, it can take quite some time to calculate. We will therefore use the somewhat smaller bibliographic coupling network of journals."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"G_coauthorship? (Hint: check out the degree.)\n",
"pandas\n",
"