{
"cells": [
{
"cell_type": "markdown",
"id": "cloudy-monaco",
"metadata": {
"papermill": {
"duration": 0.009396,
"end_time": "2021-03-22T10:46:32.894106",
"exception": false,
"start_time": "2021-03-22T10:46:32.884710",
"status": "completed"
},
"tags": []
},
"source": [
"# Case study 2: Identification of potential drug targets"
]
},
{
"cell_type": "markdown",
"id": "dress-lucas",
"metadata": {
"papermill": {
"duration": 0.010342,
"end_time": "2021-03-22T10:46:32.915194",
"exception": false,
"start_time": "2021-03-22T10:46:32.904852",
"status": "completed"
},
"tags": []
},
"source": [
"Systematic MR of molecular phenotypes such as proteins and expression of transcript levels offer enormous potential to prioritise drug targets for further investigation. \n",
"However, many genes and gene products are not easily druggable, so some potentially important causal genes may not offer an obvious route to intervention. \n",
"\n",
"A parallel problem is that current GWASes of molecular phenotypes have limited sample sizes and limited protein coverages. \n",
"A potential way to address both these problems is to use protein-protein interaction information to identify druggable targets which are linked to a non-druggable, but robustly causal target. \n",
"Their relationship to the causal target increases our confidence in their potential causal role even if the initial evidence of effect is below our multiple-testing threshold. \n",
"\n",
"Here in case study 2 we demonstrate an approach to use data in EpiGraphDB to\n",
"prioritise potential alternative drug targets in the same PPI network, as follows:\n",
"\n",
"- For an existing drug target of interests, we use PPI networks to search for its directly interacting genes that are evidenced to be druggable.\n",
"- We then examine the causal evidence of these candidate genes on the disease. \n",
"- We also examine the literature evidence of these candidate genes on the disease.\n",
"\n",
"The triangulation of MR evidence and literature evidence as available from EpiGraphDB regarding these candidate genes will greatly enhance our confidence in identifying potential viable drug targets."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "regulation-filename",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:32.938782Z",
"iopub.status.busy": "2021-03-22T10:46:32.938420Z",
"iopub.status.idle": "2021-03-22T10:46:33.179123Z",
"shell.execute_reply": "2021-03-22T10:46:33.178730Z"
},
"papermill": {
"duration": 0.254028,
"end_time": "2021-03-22T10:46:33.179216",
"exception": false,
"start_time": "2021-03-22T10:46:32.925188",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"import pandas as pd\n",
"import requests"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "anonymous-fault",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:33.201346Z",
"iopub.status.busy": "2021-03-22T10:46:33.200984Z",
"iopub.status.idle": "2021-03-22T10:46:33.202780Z",
"shell.execute_reply": "2021-03-22T10:46:33.203117Z"
},
"papermill": {
"duration": 0.014102,
"end_time": "2021-03-22T10:46:33.203216",
"exception": false,
"start_time": "2021-03-22T10:46:33.189114",
"status": "completed"
},
"tags": [
"parameters"
]
},
"outputs": [],
"source": [
"# Default parameters\n",
"API_URL = \"https://api.epigraphdb.org\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "every-investor",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:33.225298Z",
"iopub.status.busy": "2021-03-22T10:46:33.224940Z",
"iopub.status.idle": "2021-03-22T10:46:33.226689Z",
"shell.execute_reply": "2021-03-22T10:46:33.227008Z"
},
"papermill": {
"duration": 0.014499,
"end_time": "2021-03-22T10:46:33.227101",
"exception": false,
"start_time": "2021-03-22T10:46:33.212602",
"status": "completed"
},
"tags": [
"injected-parameters"
]
},
"outputs": [],
"source": [
"# Parameters\n",
"API_URL = \"https://api.epigraphdb.org\"\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "institutional-terminal",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:33.252062Z",
"iopub.status.busy": "2021-03-22T10:46:33.251686Z",
"iopub.status.idle": "2021-03-22T10:46:33.403181Z",
"shell.execute_reply": "2021-03-22T10:46:33.403728Z"
},
"papermill": {
"duration": 0.166887,
"end_time": "2021-03-22T10:46:33.403915",
"exception": false,
"start_time": "2021-03-22T10:46:33.237028",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"https://api.epigraphdb.org\n"
]
},
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(API_URL)\n",
"requests.get(f\"{API_URL}/ping\").json()"
]
},
{
"cell_type": "markdown",
"id": "abroad-print",
"metadata": {
"papermill": {
"duration": 0.022752,
"end_time": "2021-03-22T10:46:33.445566",
"exception": false,
"start_time": "2021-03-22T10:46:33.422814",
"status": "completed"
},
"tags": []
},
"source": [
"## Parameters\n",
"\n",
"We illustrate this approach using IL23R, an established drug target for inflammatory bowel disease (IBD) (Duerr et al., 2006; Momozawa et al., 2011).\n",
"\n",
"While specific IL23R interventions are still undergoing trials, there is a possibility that these therapies may not be effective for all or even the majority of patients. This case study therefore explores potential alternative drug targets. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "adapted-venue",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:33.494475Z",
"iopub.status.busy": "2021-03-22T10:46:33.493903Z",
"iopub.status.idle": "2021-03-22T10:46:33.496301Z",
"shell.execute_reply": "2021-03-22T10:46:33.495825Z"
},
"papermill": {
"duration": 0.0241,
"end_time": "2021-03-22T10:46:33.496404",
"exception": false,
"start_time": "2021-03-22T10:46:33.472304",
"status": "completed"
},
"tags": []
},
"outputs": [],
"source": [
"GENE_NAME = \"IL23R\"\n",
"OUTCOME_TRAIT = \"Inflammatory bowel disease\""
]
},
{
"cell_type": "markdown",
"id": "northern-easter",
"metadata": {
"papermill": {
"duration": 0.012387,
"end_time": "2021-03-22T10:46:33.522017",
"exception": false,
"start_time": "2021-03-22T10:46:33.509630",
"status": "completed"
},
"tags": []
},
"source": [
"## Using PPI networks for alternative drug targets search"
]
},
{
"cell_type": "markdown",
"id": "flexible-compilation",
"metadata": {
"papermill": {
"duration": 0.015417,
"end_time": "2021-03-22T10:46:33.549590",
"exception": false,
"start_time": "2021-03-22T10:46:33.534173",
"status": "completed"
},
"tags": []
},
"source": [
"The assumption here is that the most likely alternative targets are either directly interacting with IL23R or somewhere in the PPI network. \n",
"In this example, we consider only genes that were found to interact with IL23R via direct protein-protein interactions, and require that those **interacting proteins** should also be **druggable**. \n",
"\n",
"The thousands of genes are classified with regard to their druggability by Finan et al. 2017, where the **Tier 1** category refers to approved drugs or those in clinical testing while for other tier categories the druggability confidence drops in order **Tier 2** and then **Tier 3**.\n",
"\n",
"Here we use the \n",
"[GET /gene/druggability/ppi](http://docs.epigraphdb.org/api/api-endpoints/#get-genedruggabilityppi)\n",
"endpoint to get data on the druggable alternative genes."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "allied-verse",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:33.580826Z",
"iopub.status.busy": "2021-03-22T10:46:33.580381Z",
"iopub.status.idle": "2021-03-22T10:46:33.776182Z",
"shell.execute_reply": "2021-03-22T10:46:33.776480Z"
},
"papermill": {
"duration": 0.214088,
"end_time": "2021-03-22T10:46:33.776582",
"exception": false,
"start_time": "2021-03-22T10:46:33.562494",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" g1.name | \n",
" p1.uniprot_id | \n",
" p2.uniprot_id | \n",
" g2.name | \n",
" g2.druggability_tier | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P04141 | \n",
" CSF2 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 1 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P01562 | \n",
" IFNA1 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 2 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P01579 | \n",
" IFNG | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 3 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P22301 | \n",
" IL10 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 4 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P29460 | \n",
" IL12B | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 5 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P42701 | \n",
" IL12RB1 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 6 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P35225 | \n",
" IL13 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 7 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P40933 | \n",
" IL15 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 8 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q16552 | \n",
" IL17A | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 9 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q96PD4 | \n",
" IL17F | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 10 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P60568 | \n",
" IL2 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 11 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q9GZX6 | \n",
" IL22 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 12 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q9NPF7 | \n",
" IL23A | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 13 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P05112 | \n",
" IL4 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 14 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P05113 | \n",
" IL5 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 15 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P05231 | \n",
" IL6 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 16 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P15248 | \n",
" IL9 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 17 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P23458 | \n",
" JAK1 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 18 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" O60674 | \n",
" JAK2 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 19 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P19838 | \n",
" NFKB1 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 20 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P42336 | \n",
" PIK3CA | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 21 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P51449 | \n",
" RORC | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 22 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P40763 | \n",
" STAT3 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 23 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q969D9 | \n",
" TSLP | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 24 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P29597 | \n",
" TYK2 | \n",
" Tier 1 | \n",
"
\n",
" \n",
" 25 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P51684 | \n",
" CCR6 | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 26 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P25963 | \n",
" NFKBIA | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 27 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q9HC29 | \n",
" NOD2 | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 28 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P27986 | \n",
" PIK3R1 | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 29 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q04206 | \n",
" RELA | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 30 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P42224 | \n",
" STAT1 | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 31 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P42229 | \n",
" STAT5A | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 32 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P42226 | \n",
" STAT6 | \n",
" Tier 2 | \n",
"
\n",
" \n",
" 33 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P09919 | \n",
" CSF3 | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 34 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q9NZ08 | \n",
" ERAP1 | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 35 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P29459 | \n",
" IL12A | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 36 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q8TAD2 | \n",
" IL17D | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 37 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q9UHD0 | \n",
" IL19 | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 38 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q9HBE4 | \n",
" IL21 | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 39 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" Q13007 | \n",
" IL24 | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 40 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" P13232 | \n",
" IL7 | \n",
" Tier 3A | \n",
"
\n",
" \n",
" 41 | \n",
" IL23R | \n",
" Q5VWK5 | \n",
" O00421 | \n",
" CCRL2 | \n",
" Tier 3B | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" g1.name p1.uniprot_id p2.uniprot_id g2.name g2.druggability_tier\n",
"0 IL23R Q5VWK5 P04141 CSF2 Tier 1\n",
"1 IL23R Q5VWK5 P01562 IFNA1 Tier 1\n",
"2 IL23R Q5VWK5 P01579 IFNG Tier 1\n",
"3 IL23R Q5VWK5 P22301 IL10 Tier 1\n",
"4 IL23R Q5VWK5 P29460 IL12B Tier 1\n",
"5 IL23R Q5VWK5 P42701 IL12RB1 Tier 1\n",
"6 IL23R Q5VWK5 P35225 IL13 Tier 1\n",
"7 IL23R Q5VWK5 P40933 IL15 Tier 1\n",
"8 IL23R Q5VWK5 Q16552 IL17A Tier 1\n",
"9 IL23R Q5VWK5 Q96PD4 IL17F Tier 1\n",
"10 IL23R Q5VWK5 P60568 IL2 Tier 1\n",
"11 IL23R Q5VWK5 Q9GZX6 IL22 Tier 1\n",
"12 IL23R Q5VWK5 Q9NPF7 IL23A Tier 1\n",
"13 IL23R Q5VWK5 P05112 IL4 Tier 1\n",
"14 IL23R Q5VWK5 P05113 IL5 Tier 1\n",
"15 IL23R Q5VWK5 P05231 IL6 Tier 1\n",
"16 IL23R Q5VWK5 P15248 IL9 Tier 1\n",
"17 IL23R Q5VWK5 P23458 JAK1 Tier 1\n",
"18 IL23R Q5VWK5 O60674 JAK2 Tier 1\n",
"19 IL23R Q5VWK5 P19838 NFKB1 Tier 1\n",
"20 IL23R Q5VWK5 P42336 PIK3CA Tier 1\n",
"21 IL23R Q5VWK5 P51449 RORC Tier 1\n",
"22 IL23R Q5VWK5 P40763 STAT3 Tier 1\n",
"23 IL23R Q5VWK5 Q969D9 TSLP Tier 1\n",
"24 IL23R Q5VWK5 P29597 TYK2 Tier 1\n",
"25 IL23R Q5VWK5 P51684 CCR6 Tier 2\n",
"26 IL23R Q5VWK5 P25963 NFKBIA Tier 2\n",
"27 IL23R Q5VWK5 Q9HC29 NOD2 Tier 2\n",
"28 IL23R Q5VWK5 P27986 PIK3R1 Tier 2\n",
"29 IL23R Q5VWK5 Q04206 RELA Tier 2\n",
"30 IL23R Q5VWK5 P42224 STAT1 Tier 2\n",
"31 IL23R Q5VWK5 P42229 STAT5A Tier 2\n",
"32 IL23R Q5VWK5 P42226 STAT6 Tier 2\n",
"33 IL23R Q5VWK5 P09919 CSF3 Tier 3A\n",
"34 IL23R Q5VWK5 Q9NZ08 ERAP1 Tier 3A\n",
"35 IL23R Q5VWK5 P29459 IL12A Tier 3A\n",
"36 IL23R Q5VWK5 Q8TAD2 IL17D Tier 3A\n",
"37 IL23R Q5VWK5 Q9UHD0 IL19 Tier 3A\n",
"38 IL23R Q5VWK5 Q9HBE4 IL21 Tier 3A\n",
"39 IL23R Q5VWK5 Q13007 IL24 Tier 3A\n",
"40 IL23R Q5VWK5 P13232 IL7 Tier 3A\n",
"41 IL23R Q5VWK5 O00421 CCRL2 Tier 3B"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def get_drug_targets_ppi(gene_name):\n",
" endpoint = \"/gene/druggability/ppi\"\n",
" url = f\"{API_URL}{endpoint}\"\n",
" params = {\"gene_name\": gene_name}\n",
" r = requests.get(url, params=params)\n",
" r.raise_for_status()\n",
" df = pd.json_normalize(r.json()[\"results\"])\n",
" return df\n",
"\n",
"\n",
"ppi_df = get_drug_targets_ppi(gene_name=GENE_NAME)\n",
"ppi_df"
]
},
{
"cell_type": "markdown",
"id": "powered-asbestos",
"metadata": {
"papermill": {
"duration": 0.009914,
"end_time": "2021-03-22T10:46:33.797280",
"exception": false,
"start_time": "2021-03-22T10:46:33.787366",
"status": "completed"
},
"tags": []
},
"source": [
"For further analysis we select the gene of interest (**IL23R**)\n",
"as well as its interacting genes with Tier 1 druggability."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "medical-belle",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:33.830408Z",
"iopub.status.busy": "2021-03-22T10:46:33.829934Z",
"iopub.status.idle": "2021-03-22T10:46:33.837508Z",
"shell.execute_reply": "2021-03-22T10:46:33.837814Z"
},
"papermill": {
"duration": 0.025579,
"end_time": "2021-03-22T10:46:33.837917",
"exception": false,
"start_time": "2021-03-22T10:46:33.812338",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"['IL23R',\n",
" 'CSF2',\n",
" 'IFNA1',\n",
" 'IFNG',\n",
" 'IL10',\n",
" 'IL12B',\n",
" 'IL12RB1',\n",
" 'IL13',\n",
" 'IL15',\n",
" 'IL17A',\n",
" 'IL17F',\n",
" 'IL2',\n",
" 'IL22',\n",
" 'IL23A',\n",
" 'IL4',\n",
" 'IL5',\n",
" 'IL6',\n",
" 'IL9',\n",
" 'JAK1',\n",
" 'JAK2',\n",
" 'NFKB1',\n",
" 'PIK3CA',\n",
" 'RORC',\n",
" 'STAT3',\n",
" 'TSLP',\n",
" 'TYK2']"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def get_gene_list(ppi_df, include_primary_gene: bool = True):\n",
" if include_primary_gene:\n",
" gene_list = list(ppi_df[\"g1.name\"].drop_duplicates()) + list(\n",
" ppi_df.query(\"`g2.druggability_tier` == 'Tier 1'\")[\"g2.name\"]\n",
" )\n",
" else:\n",
" gene_list = list(ppi_df.query(\"`g2.druggability_tier` == 'Tier 1'\")[\"g2.name\"])\n",
" return gene_list\n",
"\n",
"\n",
"gene_list = get_gene_list(ppi_df)\n",
"gene_list"
]
},
{
"cell_type": "markdown",
"id": "heated-while",
"metadata": {
"papermill": {
"duration": 0.010155,
"end_time": "2021-03-22T10:46:33.858476",
"exception": false,
"start_time": "2021-03-22T10:46:33.848321",
"status": "completed"
},
"tags": []
},
"source": [
"## Using Mendelian randomization results for causal effect estimation"
]
},
{
"cell_type": "markdown",
"id": "fundamental-dance",
"metadata": {
"papermill": {
"duration": 0.009993,
"end_time": "2021-03-22T10:46:33.879510",
"exception": false,
"start_time": "2021-03-22T10:46:33.869517",
"status": "completed"
},
"tags": []
},
"source": [
"The next step is to find out whether any of these genes have a comparable and statistically plausable effect on IBD. \n",
"\n",
"Here we search EpiGraphDB for the Mendelian randomization (MR) results for these genes and IBD from the recent study by Zheng et al, 2019 (http://epigraphdb.org/xqtl/) via the\n",
"[GET /xqtl/single-snp-mr](http://docs.epigraphdb.org/api/api-endpoints/#get-xqtlsingle-snp-mr)\n",
"endpoint."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "rising-support",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:33.912519Z",
"iopub.status.busy": "2021-03-22T10:46:33.912157Z",
"iopub.status.idle": "2021-03-22T10:46:41.671795Z",
"shell.execute_reply": "2021-03-22T10:46:41.672664Z"
},
"papermill": {
"duration": 7.783167,
"end_time": "2021-03-22T10:46:41.672887",
"exception": false,
"start_time": "2021-03-22T10:46:33.889720",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" gene.ensembl_id | \n",
" gene.name | \n",
" gwas.id | \n",
" gwas.trait | \n",
" r.beta | \n",
" r.se | \n",
" r.p | \n",
" r.rsid | \n",
" qtl_type | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" ENSG00000162594 | \n",
" IL23R | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" 1.500821 | \n",
" 0.054592 | \n",
" 2.212578e-166 | \n",
" rs11581607 | \n",
" pQTL | \n",
"
\n",
" \n",
" 1 | \n",
" ENSG00000113302 | \n",
" IL12B | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" 0.417605 | \n",
" 0.034490 | \n",
" 9.590000e-34 | \n",
" rs4921484 | \n",
" pQTL | \n",
"
\n",
" \n",
" 2 | \n",
" ENSG00000162594 | \n",
" IL23R | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" 0.886712 | \n",
" 0.064420 | \n",
" 4.165652e-43 | \n",
" rs2064689 | \n",
" eQTL | \n",
"
\n",
" \n",
" 3 | \n",
" ENSG00000164136 | \n",
" IL15 | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" -1.421625 | \n",
" 0.197131 | \n",
" 5.530616e-13 | \n",
" rs75301646 | \n",
" eQTL | \n",
"
\n",
" \n",
" 4 | \n",
" ENSG00000113520 | \n",
" IL4 | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" 0.459848 | \n",
" 0.084050 | \n",
" 4.471537e-08 | \n",
" rs2070874 | \n",
" eQTL | \n",
"
\n",
" \n",
" 5 | \n",
" ENSG00000096968 | \n",
" JAK2 | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" -1.896710 | \n",
" 0.203808 | \n",
" 1.322967e-20 | \n",
" rs4788084 | \n",
" eQTL | \n",
"
\n",
" \n",
" 6 | \n",
" ENSG00000109320 | \n",
" NFKB1 | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" 0.973556 | \n",
" 0.173893 | \n",
" 2.160849e-08 | \n",
" rs4766578 | \n",
" eQTL | \n",
"
\n",
" \n",
" 7 | \n",
" ENSG00000143365 | \n",
" RORC | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" -0.994991 | \n",
" 0.116343 | \n",
" 1.207271e-17 | \n",
" rs4845604 | \n",
" eQTL | \n",
"
\n",
" \n",
" 8 | \n",
" ENSG00000168610 | \n",
" STAT3 | \n",
" ieu-a-294 | \n",
" Inflammatory bowel disease | \n",
" 0.597473 | \n",
" 0.075700 | \n",
" 2.958269e-15 | \n",
" rs1053004 | \n",
" eQTL | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" gene.ensembl_id gene.name gwas.id gwas.trait r.beta \\\n",
"0 ENSG00000162594 IL23R ieu-a-294 Inflammatory bowel disease 1.500821 \n",
"1 ENSG00000113302 IL12B ieu-a-294 Inflammatory bowel disease 0.417605 \n",
"2 ENSG00000162594 IL23R ieu-a-294 Inflammatory bowel disease 0.886712 \n",
"3 ENSG00000164136 IL15 ieu-a-294 Inflammatory bowel disease -1.421625 \n",
"4 ENSG00000113520 IL4 ieu-a-294 Inflammatory bowel disease 0.459848 \n",
"5 ENSG00000096968 JAK2 ieu-a-294 Inflammatory bowel disease -1.896710 \n",
"6 ENSG00000109320 NFKB1 ieu-a-294 Inflammatory bowel disease 0.973556 \n",
"7 ENSG00000143365 RORC ieu-a-294 Inflammatory bowel disease -0.994991 \n",
"8 ENSG00000168610 STAT3 ieu-a-294 Inflammatory bowel disease 0.597473 \n",
"\n",
" r.se r.p r.rsid qtl_type \n",
"0 0.054592 2.212578e-166 rs11581607 pQTL \n",
"1 0.034490 9.590000e-34 rs4921484 pQTL \n",
"2 0.064420 4.165652e-43 rs2064689 eQTL \n",
"3 0.197131 5.530616e-13 rs75301646 eQTL \n",
"4 0.084050 4.471537e-08 rs2070874 eQTL \n",
"5 0.203808 1.322967e-20 rs4788084 eQTL \n",
"6 0.173893 2.160849e-08 rs4766578 eQTL \n",
"7 0.116343 1.207271e-17 rs4845604 eQTL \n",
"8 0.075700 2.958269e-15 rs1053004 eQTL "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def extract_mr(outcome_trait, gene_list, qtl_type):\n",
" endpoint = \"/xqtl/single-snp-mr\"\n",
" url = f\"{API_URL}{endpoint}\"\n",
"\n",
" def per_gene(gene_name):\n",
" params = {\n",
" \"exposure_gene\": gene_name,\n",
" \"outcome_trait\": outcome_trait,\n",
" \"qtl_type\": qtl_type,\n",
" \"pval_threshold\": 1e-5,\n",
" }\n",
" r = requests.get(url, params=params)\n",
" try:\n",
" r.raise_for_status()\n",
" df = pd.json_normalize(r.json()[\"results\"])\n",
" return df\n",
" except:\n",
" return None\n",
"\n",
" res_df = pd.concat(\n",
" [per_gene(gene_name=gene_name) for gene_name in gene_list]\n",
" ).reset_index(drop=True)\n",
" return res_df\n",
"\n",
"\n",
"# Search for both pqtl and eqtl\n",
"xqtl_df = pd.concat(\n",
" [\n",
" extract_mr(\n",
" outcome_trait=OUTCOME_TRAIT, gene_list=gene_list, qtl_type=qtl_type\n",
" ).assign(qtl_type=qtl_type)\n",
" for qtl_type in [\"pQTL\", \"eQTL\"]\n",
" ]\n",
").reset_index(drop=True)\n",
"xqtl_df"
]
},
{
"cell_type": "markdown",
"id": "collectible-banking",
"metadata": {
"papermill": {
"duration": 0.019627,
"end_time": "2021-03-22T10:46:41.713000",
"exception": false,
"start_time": "2021-03-22T10:46:41.693373",
"status": "completed"
},
"tags": []
},
"source": [
"## Using literature evidence for results enrichment and triangulation"
]
},
{
"cell_type": "markdown",
"id": "restricted-lottery",
"metadata": {
"papermill": {
"duration": 0.019326,
"end_time": "2021-03-22T10:46:41.752229",
"exception": false,
"start_time": "2021-03-22T10:46:41.732903",
"status": "completed"
},
"tags": []
},
"source": [
"Can we find evidence in the literature where these genes are found to be associated with IBD to increase our level of confidence in MR results or to provide alternative evidence where MR results to not exist?\n",
"\n",
"We can use the\n",
"[GET /gene/literature](http://docs.epigraphdb.org/api/api-endpoints/#get-geneliterature)\n",
"endpoint to get data on the literature evidence for the set of genes."
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "weird-monkey",
"metadata": {
"execution": {
"iopub.execute_input": "2021-03-22T10:46:41.781434Z",
"iopub.status.busy": "2021-03-22T10:46:41.781072Z",
"iopub.status.idle": "2021-03-22T10:46:54.985312Z",
"shell.execute_reply": "2021-03-22T10:46:54.984858Z"
},
"papermill": {
"duration": 13.221524,
"end_time": "2021-03-22T10:46:54.985425",
"exception": false,
"start_time": "2021-03-22T10:46:41.763901",
"status": "completed"
},
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" pubmed_id | \n",
" gene.name | \n",
" lt.id | \n",
" lt.name | \n",
" lt.type | \n",
" st.predicate | \n",
" literature_count | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" [23131344] | \n",
" IL23R | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" PREDISPOSES | \n",
" 1 | \n",
"
\n",
" \n",
" 1 | \n",
" [21155887, 17484863] | \n",
" IL23R | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" NEG_ASSOCIATED_WITH | \n",
" 2 | \n",
"
\n",
" \n",
" 2 | \n",
" [31728561] | \n",
" IL23R | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" CAUSES | \n",
" 1 | \n",
"
\n",
" \n",
" 3 | \n",
" [21155887, 18383521, 18383363, 25159710, 18341... | \n",
" IL23R | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 21 | \n",
"
\n",
" \n",
" 4 | \n",
" [27852544] | \n",
" IL23R | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" AFFECTS | \n",
" 1 | \n",
"
\n",
" \n",
" 5 | \n",
" [21557945, 19030026] | \n",
" CSF2 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 2 | \n",
"
\n",
" \n",
" 6 | \n",
" [17206685] | \n",
" CSF2 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" AFFECTS | \n",
" 1 | \n",
"
\n",
" \n",
" 7 | \n",
" [23891915] | \n",
" IFNA1 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" TREATS | \n",
" 1 | \n",
"
\n",
" \n",
" 8 | \n",
" [24975266] | \n",
" IFNA1 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" PREVENTS | \n",
" 1 | \n",
"
\n",
" \n",
" 9 | \n",
" [20951137, 28174758, 9836081] | \n",
" IFNA1 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 3 | \n",
"
\n",
" \n",
" 10 | \n",
" [19519446] | \n",
" IFNG | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" TREATS | \n",
" 1 | \n",
"
\n",
" \n",
" 11 | \n",
" [3139380] | \n",
" IFNG | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" CAUSES | \n",
" 1 | \n",
"
\n",
" \n",
" 12 | \n",
" [19740775, 18452147] | \n",
" IFNG | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 2 | \n",
"
\n",
" \n",
" 13 | \n",
" [10403730] | \n",
" IFNG | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" AFFECTS | \n",
" 1 | \n",
"
\n",
" \n",
" 14 | \n",
" [16573780, 27917223, 19184348, 28551707, 25999... | \n",
" IL10 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 13 | \n",
"
\n",
" \n",
" 15 | \n",
" [27468578, 25296012] | \n",
" IL10 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" AFFECTS | \n",
" 2 | \n",
"
\n",
" \n",
" 16 | \n",
" [27468578] | \n",
" IL10 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" PREDISPOSES | \n",
" 1 | \n",
"
\n",
" \n",
" 17 | \n",
" [11271474] | \n",
" IL10 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" NEG_PREDISPOSES | \n",
" 1 | \n",
"
\n",
" \n",
" 18 | \n",
" [24519095, 29023267, 17628614] | \n",
" IL10 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" CAUSES | \n",
" 3 | \n",
"
\n",
" \n",
" 19 | \n",
" [18383521, 23573954, 30541240, 22479607, 19817... | \n",
" IL12B | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 6 | \n",
"
\n",
" \n",
" 20 | \n",
" [11023669, 22741617] | \n",
" IL13 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 2 | \n",
"
\n",
" \n",
" 21 | \n",
" [9609761, 11023669] | \n",
" IL15 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 2 | \n",
"
\n",
" \n",
" 22 | \n",
" [30193869, 21576383] | \n",
" IL17A | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 2 | \n",
"
\n",
" \n",
" 23 | \n",
" [30193869, 21994045, 18088064, 21576383] | \n",
" IL17F | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 4 | \n",
"
\n",
" \n",
" 24 | \n",
" [6607860, 6237813, 1587419] | \n",
" IL2 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 3 | \n",
"
\n",
" \n",
" 25 | \n",
" [19201773] | \n",
" IL22 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 1 | \n",
"
\n",
" \n",
" 26 | \n",
" [30193869, 27029486, 18753178, 18499066] | \n",
" IL23A | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 4 | \n",
"
\n",
" \n",
" 27 | \n",
" [10477546] | \n",
" IL4 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" DISRUPTS | \n",
" 1 | \n",
"
\n",
" \n",
" 28 | \n",
" [7806044, 8964392, 9389741] | \n",
" IL4 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 3 | \n",
"
\n",
" \n",
" 29 | \n",
" [15766556] | \n",
" IL6 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" TREATS | \n",
" 1 | \n",
"
\n",
" \n",
" 30 | \n",
" [11204808] | \n",
" IL6 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" NEG_ASSOCIATED_WITH | \n",
" 1 | \n",
"
\n",
" \n",
" 31 | \n",
" [25145003] | \n",
" IL6 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" CAUSES | \n",
" 1 | \n",
"
\n",
" \n",
" 32 | \n",
" [11204808, 7683293] | \n",
" IL6 | \n",
" C0021390 | \n",
" Inflammatory Bowel Diseases | \n",
" [dsyn] | \n",
" ASSOCIATED_WITH | \n",
" 2 | \n",
"
\n",
    " 33 | [24120915] | IL6 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | AFFECTS | 1 | \n",
    " 34 | [29788053] | IL9 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | CAUSES | 1 | \n",
    " 35 | [28652656, 11515847] | IL9 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | ASSOCIATED_WITH | 2 | \n",
    " 36 | [31158699] | IL9 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | AFFECTS | 1 | \n",
    " 37 | [31069840, 19817673, 20627814, 22269120] | JAK2 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | ASSOCIATED_WITH | 4 | \n",
    " 38 | [27852544] | JAK2 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | AFFECTS | 1 | \n",
    " 39 | [17600378, 9882195] | NFKB1 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | ASSOCIATED_WITH | 2 | \n",
    " 40 | [21637825, 20004201] | PIK3CA | C0021390 | Inflammatory Bowel Diseases | [dsyn] | ASSOCIATED_WITH | 2 | \n",
    " 41 | [30006408] | RORC | C0021390 | Inflammatory Bowel Diseases | [dsyn] | ASSOCIATED_WITH | 1 | \n",
    " 42 | [28770550] | STAT3 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | CAUSES | 1 | \n",
    " 43 | [21733838] | STAT3 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | AUGMENTS | 1 | \n",
    " 44 | [21631466, 25132422, 28785144, 19817673, 20627... | STAT3 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | ASSOCIATED_WITH | 9 | \n",
    " 45 | [27852544, 21994179] | STAT3 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | AFFECTS | 2 | \n",
    " 46 | [27697608] | TSLP | C0021390 | Inflammatory Bowel Diseases | [dsyn] | CAUSES | 1 | \n",
    " 47 | [21318591, 27697608] | TSLP | C0021390 | Inflammatory Bowel Diseases | [dsyn] | ASSOCIATED_WITH | 2 | \n",
    " 48 | [26432894, 26432894] | TYK2 | C0021390 | Inflammatory Bowel Diseases | [dsyn] | AFFECTS | 2 | \n",
"
"
],
"text/plain": [
" pubmed_id gene.name lt.id \\\n",
"0 [23131344] IL23R C0021390 \n",
"1 [21155887, 17484863] IL23R C0021390 \n",
"2 [31728561] IL23R C0021390 \n",
"3 [21155887, 18383521, 18383363, 25159710, 18341... IL23R C0021390 \n",
"4 [27852544] IL23R C0021390 \n",
"5 [21557945, 19030026] CSF2 C0021390 \n",
"6 [17206685] CSF2 C0021390 \n",
"7 [23891915] IFNA1 C0021390 \n",
"8 [24975266] IFNA1 C0021390 \n",
"9 [20951137, 28174758, 9836081] IFNA1 C0021390 \n",
"10 [19519446] IFNG C0021390 \n",
"11 [3139380] IFNG C0021390 \n",
"12 [19740775, 18452147] IFNG C0021390 \n",
"13 [10403730] IFNG C0021390 \n",
"14 [16573780, 27917223, 19184348, 28551707, 25999... IL10 C0021390 \n",
"15 [27468578, 25296012] IL10 C0021390 \n",
"16 [27468578] IL10 C0021390 \n",
"17 [11271474] IL10 C0021390 \n",
"18 [24519095, 29023267, 17628614] IL10 C0021390 \n",
"19 [18383521, 23573954, 30541240, 22479607, 19817... IL12B C0021390 \n",
"20 [11023669, 22741617] IL13 C0021390 \n",
"21 [9609761, 11023669] IL15 C0021390 \n",
"22 [30193869, 21576383] IL17A C0021390 \n",
"23 [30193869, 21994045, 18088064, 21576383] IL17F C0021390 \n",
"24 [6607860, 6237813, 1587419] IL2 C0021390 \n",
"25 [19201773] IL22 C0021390 \n",
"26 [30193869, 27029486, 18753178, 18499066] IL23A C0021390 \n",
"27 [10477546] IL4 C0021390 \n",
"28 [7806044, 8964392, 9389741] IL4 C0021390 \n",
"29 [15766556] IL6 C0021390 \n",
"30 [11204808] IL6 C0021390 \n",
"31 [25145003] IL6 C0021390 \n",
"32 [11204808, 7683293] IL6 C0021390 \n",
"33 [24120915] IL6 C0021390 \n",
"34 [29788053] IL9 C0021390 \n",
"35 [28652656, 11515847] IL9 C0021390 \n",
"36 [31158699] IL9 C0021390 \n",
"37 [31069840, 19817673, 20627814, 22269120] JAK2 C0021390 \n",
"38 [27852544] JAK2 C0021390 \n",
"39 [17600378, 9882195] NFKB1 C0021390 \n",
"40 [21637825, 20004201] PIK3CA C0021390 \n",
"41 [30006408] RORC C0021390 \n",
"42 [28770550] STAT3 C0021390 \n",
"43 [21733838] STAT3 C0021390 \n",
"44 [21631466, 25132422, 28785144, 19817673, 20627... STAT3 C0021390 \n",
"45 [27852544, 21994179] STAT3 C0021390 \n",
"46 [27697608] TSLP C0021390 \n",
"47 [21318591, 27697608] TSLP C0021390 \n",
"48 [26432894, 26432894] TYK2 C0021390 \n",
"\n",
" lt.name lt.type st.predicate literature_count \n",
"0 Inflammatory Bowel Diseases [dsyn] PREDISPOSES 1 \n",
"1 Inflammatory Bowel Diseases [dsyn] NEG_ASSOCIATED_WITH 2 \n",
"2 Inflammatory Bowel Diseases [dsyn] CAUSES 1 \n",
"3 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 21 \n",
"4 Inflammatory Bowel Diseases [dsyn] AFFECTS 1 \n",
"5 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"6 Inflammatory Bowel Diseases [dsyn] AFFECTS 1 \n",
"7 Inflammatory Bowel Diseases [dsyn] TREATS 1 \n",
"8 Inflammatory Bowel Diseases [dsyn] PREVENTS 1 \n",
"9 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 3 \n",
"10 Inflammatory Bowel Diseases [dsyn] TREATS 1 \n",
"11 Inflammatory Bowel Diseases [dsyn] CAUSES 1 \n",
"12 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"13 Inflammatory Bowel Diseases [dsyn] AFFECTS 1 \n",
"14 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 13 \n",
"15 Inflammatory Bowel Diseases [dsyn] AFFECTS 2 \n",
"16 Inflammatory Bowel Diseases [dsyn] PREDISPOSES 1 \n",
"17 Inflammatory Bowel Diseases [dsyn] NEG_PREDISPOSES 1 \n",
"18 Inflammatory Bowel Diseases [dsyn] CAUSES 3 \n",
"19 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 6 \n",
"20 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"21 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"22 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"23 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 4 \n",
"24 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 3 \n",
"25 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 1 \n",
"26 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 4 \n",
"27 Inflammatory Bowel Diseases [dsyn] DISRUPTS 1 \n",
"28 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 3 \n",
"29 Inflammatory Bowel Diseases [dsyn] TREATS 1 \n",
"30 Inflammatory Bowel Diseases [dsyn] NEG_ASSOCIATED_WITH 1 \n",
"31 Inflammatory Bowel Diseases [dsyn] CAUSES 1 \n",
"32 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"33 Inflammatory Bowel Diseases [dsyn] AFFECTS 1 \n",
"34 Inflammatory Bowel Diseases [dsyn] CAUSES 1 \n",
"35 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"36 Inflammatory Bowel Diseases [dsyn] AFFECTS 1 \n",
"37 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 4 \n",
"38 Inflammatory Bowel Diseases [dsyn] AFFECTS 1 \n",
"39 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"40 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"41 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 1 \n",
"42 Inflammatory Bowel Diseases [dsyn] CAUSES 1 \n",
"43 Inflammatory Bowel Diseases [dsyn] AUGMENTS 1 \n",
"44 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 9 \n",
"45 Inflammatory Bowel Diseases [dsyn] AFFECTS 2 \n",
"46 Inflammatory Bowel Diseases [dsyn] CAUSES 1 \n",
"47 Inflammatory Bowel Diseases [dsyn] ASSOCIATED_WITH 2 \n",
"48 Inflammatory Bowel Diseases [dsyn] AFFECTS 2 "
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
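   ],
   "source": [
    "def extract_literature(outcome_trait, gene_list):\n",
    "    def per_gene(gene_name):\n",
    "        endpoint = \"/gene/literature\"\n",
    "        url = f\"{API_URL}{endpoint}\"\n",
    "        params = {\"gene_name\": gene_name, \"object_name\": outcome_trait.lower()}\n",
    "        r = requests.get(url, params=params)\n",
    "        try:\n",
    "            r.raise_for_status()\n",
    "            res_df = pd.json_normalize(r.json()[\"results\"])\n",
    "        except (requests.RequestException, KeyError, ValueError):\n",
    "            # Skip genes whose request fails or whose response is malformed\n",
    "            return None\n",
    "        if len(res_df) > 0:\n",
    "            # Count the PubMed IDs supporting each gene-predicate-trait triple\n",
    "            res_df = res_df.assign(literature_count=res_df[\"pubmed_id\"].apply(len))\n",
    "        return res_df\n",
    "\n",
    "    # Drop failed requests and empty results before concatenating\n",
    "    per_gene_dfs = [per_gene(gene_name=gene_name) for gene_name in gene_list]\n",
    "    per_gene_dfs = [df for df in per_gene_dfs if df is not None and len(df) > 0]\n",
    "    if not per_gene_dfs:\n",
    "        # No literature evidence was found for any gene\n",
    "        return pd.DataFrame()\n",
    "    return pd.concat(per_gene_dfs).reset_index(drop=True)\n",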
"\n",
"\n",
"literature_df = extract_literature(outcome_trait=OUTCOME_TRAIT, gene_list=gene_list)\n",
"\n",
"literature_df"
]
},
{
"cell_type": "markdown",
"id": "traditional-poison",
"metadata": {
"papermill": {
"duration": 0.018152,
"end_time": "2021-03-22T10:46:55.019486",
"exception": false,
"start_time": "2021-03-22T10:46:55.001334",
"status": "completed"
},
"tags": []
   },
   "source": [
    "## References\n",
    "\n",
    "- Duerr RH, Taylor KD, Brant SR, Rioux JD, Silverberg MS, Daly MJ, Steinhart AH, Abraham C, Regueiro M, Griffiths A, et al. 2006. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science 314:1461–1463.\n",
    "\n",
    "- Finan C, Gaulton A, Kruger FA, Lumbers RT, Shah T, Engmann J, Galver L, Kelley R, Karlsson A, Santos R, et al. 2017. The druggable genome and support for target identification and validation in drug development. Science Translational Medicine 9:eaag1166.\n",
    "\n",
    "- Momozawa Y, Mni M, Nakamura K, Coppieters W, Almer S, Amininejad L, Cleynen I, Colombel J-F, De Rijk P, Dewit O, et al. 2011. Resequencing of positional candidates identifies low frequency IL23R coding variants protecting against inflammatory bowel disease. Nature Genetics 43:43–47.\n",
    "\n",
    "- Zheng J, Brumpton BM, Bronson PG, Liu Y, Haycock P, Elsworth B, Haberland V, Baird D, Walker V, Robinson JW, John S, Prins B, Runz H, Nelson MR, Hurle M, Hemani G, Asvold BO, Butterworth A, Smith GD, Scott RA, Gaunt TR. 2019. Systematic Mendelian randomization and colocalization analyses of the plasma proteome and blood transcriptome to prioritize drug targets for complex disease."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.7"
},
"papermill": {
"default_parameters": {},
"duration": 24.193169,
"end_time": "2021-03-22T10:46:56.306581",
"environment_variables": {},
"exception": null,
"input_path": "paper-case-studies/case-2-alt-drug-target.ipynb",
"output_path": "paper-case-studies/case-2-alt-drug-target.ipynb",
"parameters": {
"API_URL": "https://api.epigraphdb.org"
},
"start_time": "2021-03-22T10:46:32.113412",
"version": "2.3.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}