{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# X2K API Tutorial Notebook\n",
"February 25st, 2019\n",
"\n",
"This Jupyter Notebook contains an interactive tutorial for **running the Expression2Kinases (X2K) API** using Python 3.\n",
"\n",
"### Table of Contents\n",
"The notebook contains the following sections:\n",
"1. **API Documentation** - shows how to programmatically analyze your gene list in Python.\n",
"2. **Using the X2K API** - overview of the input parameters and output of the API.\n",
"3. **Interpreting the results** - gives an overview of the structure and meaning of the analysis results.\n",
" * **Transcription Factor Enrichment Analysis** (ChEA)\n",
" * **Protein-Protein Interaction Expansion** (G2N)\n",
" * **Kinase Enrichment Analysis** (KEA)\n",
" * **Expression2Kinases** (X2K)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Using the X2K API\n",
"The X2K API allows for programmatic analysis of an input gene list.\n",
"\n",
"The `run_X2K()` function displayed below can be used to analyze a gene list and load the results in a Python dictionary by performing a **POST request**.\n",
"\n",
"The function requires only one input, `input_genes`, **a list of gene symbols ** to be analyzed. Additional optional parameters can be specified with the `options` parameters."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# Import modules\n",
"import requests\n",
"import json\n",
"\n",
"\n",
"##### Function to run X2K\n",
"### Input: a Python list of gene symbols\n",
"### Output: a dictionary containing the results of X2K, ChEA, G2N, KEA.\n",
"\n",
"def run_X2K(input_genes, options={}):\n",
" # Set default options\n",
" all_options = {'included_organisms': 'both',\n",
" 'TF-target gene background database used for enrichment': 'ChEA & ENCODE Consensus',\n",
" 'sort transcription factors by': 'p-value',\n",
" 'min_network_size': 10,\n",
" 'number of top TFs': 10,\n",
" 'path_length': 2,\n",
" 'min_number_of_articles_supporting_interaction': 0,\n",
" 'max_number_of_interactions_per_protein': 200,\n",
" 'max_number_of_interactions_per_article': 100,\n",
" 'enable_BioGRID': True,\n",
" 'enable_IntAct': True,\n",
" 'enable_MINT': True,\n",
" 'enable_ppid': True,\n",
" 'enable_Stelzl': True,\n",
" 'kinase interactions to include': 'kea 2018',\n",
" 'sort kinases by': 'p-value'}\n",
"\n",
" # Override defaults with options\n",
" all_options.update(options)\n",
" all_options['text-genes'] = '\\n'.join(input_genes)\n",
"\n",
" # Perform request & get response\n",
" res = requests.post(\n",
" 'https://maayanlab.cloud/X2K/api',\n",
" files=[(k, (None, v)) for k, v in default_options.items()],\n",
" )\n",
"\n",
" # Read response\n",
" data = res.json()\n",
"\n",
" # Convert to dictionary\n",
" x2k_results = {key: json.loads(value) if key != 'input' else value for key, value in data.items()}\n",
"\n",
" # Clean results\n",
" x2k_results['ChEA'] = x2k_results['ChEA']['tfs']\n",
" x2k_results['G2N'] = x2k_results['G2N']['network']\n",
" x2k_results['KEA'] = x2k_results['KEA']['kinases']\n",
" x2k_results['X2K'] = x2k_results['X2K']['network']\n",
"\n",
" # Return results\n",
" return x2k_results\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"dict_keys(['X2K', 'ChEA', 'KEA', 'G2N', 'input'])"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Get input genes\n",
"input_genes = ['Nsun3', 'Polrmt', 'Nlrx1', 'Sfxn5', 'Zc3h12c', 'Slc25a39', 'Arsg', 'Defb29', 'Ndufb6', 'Zfand1',\n",
" 'Tmem77', '5730403B10Rik', 'Tlcd1', 'Psmc6', 'Slc30a6', 'LOC100047292', 'Lrrc40', 'Orc5l', 'Mpp7',\n",
" 'Unc119b', 'Prkaca', 'Tcn2', 'Psmc3ip', 'Pcmtd2', 'Acaa1a', 'Lrrc1', '2810432D09Rik', 'Sephs2', 'Sac3d1',\n",
" 'Tmlhe', 'LOC623451', 'Tsr2', 'Plekha7', 'Gys2', 'Arhgef12', 'Hibch', 'Lyrm2', 'Zbtb44', 'Entpd5',\n",
" 'Rab11fip2', 'Lipt1', 'Intu', 'Anxa13', 'Klf12', 'Sat2', 'Gal3st2', 'Vamp8', 'Fkbpl', 'Aqp11', 'Trap1',\n",
" 'Pmpcb', 'Tm7sf3', 'Rbm39', 'Bri3', 'Kdr', 'Zfp748', 'Nap1l1', 'Dhrs1', 'Lrrc56', 'Wdr20a', 'Stxbp2',\n",
" 'Klf1', 'Ufc1', 'Ccdc16', '9230114K14Rik', 'Rwdd3', '2610528K11Rik', 'Aco1', 'Cables1', 'LOC100047214',\n",
" 'Yars2', 'Lypla1', 'Kalrn', 'Gyk', 'Zfp787', 'Zfp655', 'Rabepk', 'Zfp650', '4732466D17Rik', 'Exosc4',\n",
" 'Wdr42a', 'Gphn', '2610528J11Rik', '1110003E01Rik', 'Mdh1', '1200014M14Rik', 'AW209491', 'Mut',\n",
" '1700123L14Rik', '2610036D13Rik', 'Cox15', 'Tmem30a', 'Nsmce4a', 'Tm2d2', 'Rhbdd3', 'Atxn2', 'Nfs1',\n",
" '3110001I20Rik', 'BC038156', 'LOC100047782', '2410012H22Rik', 'Rilp', 'A230062G08Rik', 'Pttg1ip', 'Rab1',\n",
" 'Afap1l1', 'Lyrm5', '2310026E23Rik', 'C330002I19Rik', 'Zfyve20', 'Poli', 'Tomm70a', 'Slc7a6os', 'Mat2b',\n",
" '4932438A13Rik', 'Lrrc8a', 'Smo', 'Nupl2', 'Trpc2', 'Arsk', 'D630023B12Rik', 'Mtfr1', '5730414N17Rik',\n",
" 'Scp2', 'Zrsr1', 'Nol7', 'C330018D20Rik', 'Ift122', 'LOC100046168', 'D730039F16Rik', 'Scyl1',\n",
" '1700023B02Rik', '1700034H14Rik', 'Fbxo8', 'Paip1', 'Tmem186', 'Atpaf1', 'LOC100046254', 'LOC100047604',\n",
" 'Coq10a', 'Fn3k', 'Sipa1l1', 'Slc25a16', 'Slc25a40', 'Rps6ka5', 'Trim37', 'Lrrc61', 'Abhd3', 'Gbe1',\n",
" 'Parp16', 'Hsd3b2', 'Esm1', 'Dnajc18', 'Dolpp1', 'Lass2', 'Wdr34', 'Rfesd', 'Cacnb4', '2310042D19Rik',\n",
" 'Srr', 'Bpnt1', '6530415H11Rik', 'Clcc1', 'Tfb1m', '4632404H12Rik', 'D4Bwg0951e', 'Med14', 'Adhfe1',\n",
" 'Thtpa', 'Cat', 'Ell3', 'Akr7a5', 'Mtmr14', 'Timm44', 'Sf1', 'Ipp', 'Iah1', 'Trim23', 'Wdr89', 'Gstz1',\n",
" 'Cradd', '2510006D16Rik', 'Fbxl6', 'LOC100044400', 'Zfp106', 'Cd55', '0610013E23Rik', 'Afmid', 'Tmem86a',\n",
" 'Aldh6a1', 'Dalrd3', 'Smyd4', 'Nme7', 'Fars2', 'Tasp1', 'Cldn10', 'A930005H10Rik', 'Slc9a6', 'Adk',\n",
" 'Rbks', '2210016F16Rik', 'Vwce', '4732435N03Rik', 'Zfp11', 'Vldlr', '9630013D21Rik', '4933407N01Rik',\n",
" 'Fahd1', 'Mipol1', '1810019D21Rik', '1810049H13Rik', 'Tfam', 'Paics', '1110032A03Rik', 'LOC100044139',\n",
" 'Dnajc19', 'BC016495', 'A930041I02Rik', 'Rqcd1', 'Usp34', 'Zcchc3', 'H2afj', 'Phf7', '4921508D12Rik',\n",
" 'Kmo', 'Prpf18', 'Mcat', 'Txndc4', '4921530L18Rik', 'Vps13b', 'Scrn3', 'Tor1a', 'AI316807', 'Acbd4',\n",
" 'Fah', 'Apool', 'Col4a4', 'Lrrc19', 'Gnmt', 'Nr3c1', 'Sip1', 'Ascc1', 'Fech', 'Abhd14a', 'Arhgap18',\n",
" '2700046G09Rik', 'Yme1l1', 'Gk5', 'Glo1', 'Sbk1', 'Cisd1', '2210011C24Rik', 'Nxt2', 'Notum', 'Ankrd42',\n",
" 'Ube2e1', 'Ndufv1', 'Slc33a1', 'Cep68', 'Rps6kb1', 'Hyi', 'Aldh1a3', 'Mynn', '3110048L19Rik', 'Rdh14',\n",
" 'Proz', 'Gorasp1', 'LOC674449', 'Zfp775', '5430437P03Rik', 'Npy', 'Adh5', 'Sybl1', '4930432O21Rik',\n",
" 'Nat9', 'LOC100048387', 'Mettl8', 'Eny2', '2410018G20Rik', 'Pgm2', 'Fgfr4', 'Mobkl2b', 'Atad3a',\n",
" '4932432K03Rik', 'Dhtkd1', 'Ubox5', 'A530050D06Rik', 'Zdhhc5', 'Mgat1', 'Nudt6', 'Tpmt', 'Wbscr18',\n",
" 'LOC100041586', 'Cdk5rap1', '4833426J09Rik', 'Myo6', 'Cpt1a', 'Gadd45gip1', 'Tmbim4', '2010309E21Rik',\n",
" 'Asb9', '2610019F03Rik', '7530414M10Rik', 'Atp6v1b2', '2310068J16Rik', 'Ddt', 'Klhdc4', 'Hpn', 'Lifr',\n",
" 'Ovol1', 'Nudt12', 'Cdan1', 'Fbxo9', 'Fbxl3', 'Hoxa7', 'Aldh8a1', '3110057O12Rik', 'Abhd11', 'Psmb1',\n",
" 'ENSMUSG00000074286', 'Chpt1', 'Oxsm', '2310009A05Rik', '1700001L05Rik', 'Zfp148', '39509', 'Mrpl9',\n",
" 'Tmem80', '9030420J04Rik', 'Naglu', 'Plscr2', 'Agbl3', 'Pex1', 'Cno', 'Neo1', 'Asf1a', 'Tnfsf5ip1',\n",
" 'Pkig', 'AI931714', 'D130020L05Rik', 'Cntd1', 'Clec2h', 'Zkscan1', '1810044D09Rik', 'Mettl7a', 'Siae',\n",
" 'Fbxo3', 'Fzd5', 'Tmem166', 'Tmed4', 'Gpr155', 'Rnf167', 'Sptlc1', 'Riok2', 'Tgds', 'Pms1', 'Pitpnc1',\n",
" 'Pcsk7', '4933403G14Rik', 'Ei24', 'Crebl2', 'Tln1', 'Mrpl35', '2700038C09Rik', 'Ubie', 'Osgepl1',\n",
" '2410166I05Rik', 'Wdr24', 'Ap4s1', 'Lrrc44', 'B3bp', 'Itfg1', 'Dmxl1', 'C1d']\n",
"\n",
"# Run X2K results\n",
"x2k_results = run_X2K(input_genes)\n",
"x2k_results.keys()"
]
},
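{
"cell_type": "markdown",
"metadata": {},
"source": [
"The API receives the gene list as newline-separated text (the `text-genes` field set inside `run_X2K()`). The sketch below shows one way a list could be tidied before submission. `prepare_gene_text` is a hypothetical helper, not part of the X2K API, and whether the server itself tolerates duplicates or blank entries is not documented in this notebook.\n",
"\n",
"```python\n",
"def prepare_gene_text(genes):\n",
"    # Hypothetical helper (not part of the X2K API): strip whitespace,\n",
"    # drop empty entries and duplicates, then join with newlines.\n",
"    seen = set()\n",
"    cleaned = []\n",
"    for gene in genes:\n",
"        symbol = str(gene).strip()\n",
"        if symbol and symbol not in seen:\n",
"            seen.add(symbol)\n",
"            cleaned.append(symbol)\n",
"    return '\\n'.join(cleaned)\n",
"\n",
"\n",
"# Small made-up example list\n",
"gene_text = prepare_gene_text(['Nsun3', ' Polrmt ', '', 'Nsun3', 'Nlrx1'])\n",
"print(gene_text.splitlines())\n",
"```\n",
"\n",
"The first occurrence of each symbol is kept, so the submitted text preserves the original ordering of the list."
]
},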
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. X2K API Documentation\n",
"\n",
"### 2.1 API Inputs\n",
"A **full list of the input parameters** for the `run_X2K()` function is available below.\n",
"\n",
"The optional parameters can provided to the function in the `options` dictionary."
]
},
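{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of how the `options` dictionary interacts with the defaults, the snippet below reproduces the merge performed inside `run_X2K()` on a small subset of parameters. The override values are made up for illustration, and no request is sent.\n",
"\n",
"```python\n",
"# Illustrative subset of the defaults defined in run_X2K()\n",
"defaults = {'path_length': 2, 'number of top TFs': 10, 'enable_BioGRID': True}\n",
"\n",
"# Hypothetical user choices, as they would be passed in the options argument\n",
"options = {'path_length': 3, 'enable_BioGRID': False}\n",
"\n",
"merged = dict(defaults)   # copy, so the defaults stay untouched\n",
"merged.update(options)    # same merge that run_X2K() applies\n",
"print(merged)\n",
"```\n",
"\n",
"Only the keys present in `options` change. With the function defined earlier, the equivalent call would be `run_X2K(input_genes, options={'path_length': 3, 'enable_BioGRID': False})`."
]
},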
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
" \n",
" Parameter | \n",
" Step | \n",
" Description | \n",
" Notes | \n",
"
\n",
"\n",
" \n",
" **input_genes** (required) | \n",
" X2K | \n",
" Contains the input gene set for the X2K analysis. | \n",
" A list of strings representing the input gene symbols. | \n",
"
\n",
" \n",
" *organism* (optional) | \n",
" ChEA | \n",
" The organism from which TF-target interaction data should be integrated. | \n",
" One of `('human_only', 'mouse_only', 'both')`. Default `'both'`. | \n",
"
\n",
" \n",
" *TF-target gene background database used for enrichment* (optional) | \n",
" ChEA | \n",
" The database from which TF-target interaction data should be integrated, | \n",
" One of `('ChEA 2015', 'ENCODE 2015', 'ChEA & ENCODE Consensus', 'Transfac and Jaspar', 'ChEA 2016', 'ARCHS4 TFs Coexp', 'CREEDS', 'Enrichr Submissions TF-Gene Coocurrence')` Default `'ChEA & ENCODE Consensus')`. | \n",
"
\n",
" \n",
" *sort transcription factors by* (optional) \n",
" | ChEA | \n",
" The method used to sort the top Transcription Factors identified by ChEA. | \n",
" One of `('p-value', 'rank', 'combined score')`. Default `'p-value'`. | \n",
"
\n",
" \n",
" *path_length* (optional) | \n",
" G2N | \n",
" The maximum Protein-Protein Interaction path length for the network expansion. | \n",
" Integer, default `2`. | \n",
"
\n",
" \n",
" *minimum_network_size* (optional)\n",
" | G2N | \n",
" The minimum size of the Protein-Protein interaction network generated using Genes2Networks. | \n",
" Integer, default `50`. | \n",
"
\n",
" \n",
" *min_number_of_articles_supporting_interaction* (optional) \n",
" | G2N | \n",
" The minimum number of published articles supporting a Protein-Protein Interaction for the expanded subnetwork. | \n",
" Integer, default `2`. | \n",
"
\n",
" \n",
" *max_number_of_interactions_per_protein* (optional) \n",
" | G2N | \n",
" The maximum number of physical interactions allowed for the proteins in the expanded subnetwork. | \n",
" Integer, default `200`. | \n",
"
\n",
" \n",
" *max_number_of_interactions_per_article* (optional) \n",
" | G2N | \n",
" The maximum number of physical interactions reported in each published article | \n",
" Integer, default `100`. | \n",
"
\n",
" \n",
" enable_Biocarta (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_BioGRID (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'true'`. | \n",
"
\n",
" \n",
" enable_BioPlex (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_DIP (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_huMAP (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_InnateDB (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_IntAct (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'true'`. | \n",
"
\n",
" \n",
" enable_KEGG (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_MINT (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'true'`. | \n",
"
\n",
" \n",
" enable_ppid (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'true'`. | \n",
"
\n",
" \n",
" enable_SNAVI (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_iREF (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_Stelzl (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'true'`. | \n",
"
\n",
" \n",
" enable_vidal (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_BIND (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_figeys (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" enable_HPRD (optional) \n",
" | G2N | \n",
" The Protein-Protein Interaction databases to integrate for generation of the expanded subnetwork. | \n",
" Either `'true'` or `'false'`. Default `'false'`. | \n",
"
\n",
" \n",
" *number_of_results* (optional) \n",
" | G2N | \n",
" The maximum network size of the expanded network generated using Genes2Networks. | \n",
" Integer, default `50`. | \n",
"
\n",
" \n",
" *kinase interactions to include* \n",
" | KEA | \n",
" Kinase interactions databases to include. | \n",
" One of `('p-value', 'rank', 'combined score')`. Default `'p-value'`. | \n",
"
\n",
" \n",
" *sort_kinases_by* (optional) \n",
" | KEA | \n",
" The method used to sort the top Transcription Factors identified by KEA. | \n",
" One of `('kea 2018', 'ARCHS4', 'iPTMnet', 'NetworkIN', 'Phospho.ELM', 'Phosphopoint', 'PhosphoPlus', 'MINT')`. Default `'kea 2018'`. | \n",
"
\n",
"
\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.2 API Output\n",
"The `run_X2K()` function returns results as `dict` containing **four keys**, whose contents are described below."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
" \n",
" Key | \n",
" Notes | \n",
" Contents | \n",
"
\n",
"\n",
" \n",
" **ChEA** | \n",
" Contains the results of the **Transcription Factor Enrichment Analysis**, generated using ChEA. | \n",
" A `list` of `dict`s containing information on the top TFs predicted to regulate the input genes. | \n",
"
\n",
" \n",
" \n",
" **G2N** | \n",
" Contains the results of the **Protein-Protein Interaction Expansion**, generated using Genes2Networks (G2N). | \n",
" A `dict` containing two keys:\n",
" \n",
" - nodes: A `list` containing information on the nodes of the expanded subnetwork.
\n",
" - interactions: A `list` containing information on the edges of the expanded subnetwork.
\n",
" \n",
" | \n",
"
\n",
" \n",
" \n",
" **KEA** | \n",
" Contains the results of the **Kinase Enrichment Analysis**, generated using KEA. | \n",
" A `list` of `dict`s containing information on the top kinases predicted to regulate the subnetwork identified by G2N. | \n",
"
\n",
" \n",
" \n",
" **X2K** | \n",
" Contains the **Expression2Kinases network**, generated by integrating the results of ChEA, G2N and KEA. | \n",
" A `dict` containing two keys:\n",
" \n",
" - nodes: A `list` containing information on the nodes of the final X2K network.
\n",
" - interactions: A `list` containing information on the edges of the final X2K network.
\n",
" \n",
" | \n",
"
\n",
" \n",
"
\n"
]
},
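{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the structure described above concrete, the sketch below counts the entries under each key. `summarize_results` is a hypothetical helper, and `mock_results` is a made-up stand-in for `x2k_results` that only mirrors the documented shape (lists for ChEA and KEA, `nodes` and `interactions` for G2N and X2K).\n",
"\n",
"```python\n",
"def summarize_results(results):\n",
"    # Hypothetical helper: report the size of each part of the result dict.\n",
"    summary = {}\n",
"    for key in ('ChEA', 'KEA'):\n",
"        summary[key] = len(results[key])\n",
"    for key in ('G2N', 'X2K'):\n",
"        summary[key] = {'nodes': len(results[key]['nodes']),\n",
"                        'interactions': len(results[key]['interactions'])}\n",
"    return summary\n",
"\n",
"\n",
"# Made-up data, only to illustrate the expected shape\n",
"mock_results = {\n",
"    'ChEA': [{'name': 'TF_A'}, {'name': 'TF_B'}],\n",
"    'KEA': [{'name': 'KINASE_A'}],\n",
"    'G2N': {'nodes': [{'name': 'TF_A'}, {'name': 'P1'}],\n",
"            'interactions': [{'source': 0, 'target': 1}]},\n",
"    'X2K': {'nodes': [{'name': 'TF_A'}, {'name': 'P1'}, {'name': 'KINASE_A'}],\n",
"            'interactions': [{'source': 0, 'target': 1}, {'source': 2, 'target': 1}]},\n",
"}\n",
"print(summarize_results(mock_results))\n",
"```\n",
"\n",
"The same function could be applied to the real `x2k_results` dictionary, assuming the response follows the structure in the table."
]
},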
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Interpreting the Results\n",
"\n",
"### 3.1 ChEA results\n",
"The results for the ChEA analysis can be accessed in x2k_results['ChEA']
. Here, the results are converted to a pandas DataFrame for easier interpretation."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" combinedScore | \n",
" enrichedTargets | \n",
" meta | \n",
" name | \n",
" pvalue | \n",
" simpleName | \n",
" zscore | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0.0 | \n",
" [NUDT6, ZFYVE20, YME1L1, ARSK, NOL7, TSR2, PMP... | \n",
" {} | \n",
" NR2C2_ENCODE | \n",
" 0.000643 | \n",
" NR2C2 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 1 | \n",
" 0.0 | \n",
" [MTMR14, MED14, POLRMT, ZKSCAN1, VPS13B, NUDT6... | \n",
" {} | \n",
" GABPA_ENCODE | \n",
" 0.000643 | \n",
" GABPA | \n",
" 0.0 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.0 | \n",
" [SF1, ENY2, ACO1, LYPLA1, MTFR1, TLCD1, ZBTB44... | \n",
" {} | \n",
" ERG_CHEA | \n",
" 0.000679 | \n",
" ERG | \n",
" 0.0 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.0 | \n",
" [SF1, CREBL2, AP4S1, ZKSCAN1, VPS13B, C1D, TGD... | \n",
" {} | \n",
" TAF1_ENCODE | \n",
" 0.002002 | \n",
" TAF1 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.0 | \n",
" [SF1, MTMR14, TSR2, ZKSCAN1, PGM2, VPS13B, TMB... | \n",
" {} | \n",
" ELF1_ENCODE | \n",
" 0.003928 | \n",
" ELF1 | \n",
" 0.0 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" combinedScore enrichedTargets meta \\\n",
"0 0.0 [NUDT6, ZFYVE20, YME1L1, ARSK, NOL7, TSR2, PMP... {} \n",
"1 0.0 [MTMR14, MED14, POLRMT, ZKSCAN1, VPS13B, NUDT6... {} \n",
"2 0.0 [SF1, ENY2, ACO1, LYPLA1, MTFR1, TLCD1, ZBTB44... {} \n",
"3 0.0 [SF1, CREBL2, AP4S1, ZKSCAN1, VPS13B, C1D, TGD... {} \n",
"4 0.0 [SF1, MTMR14, TSR2, ZKSCAN1, PGM2, VPS13B, TMB... {} \n",
"\n",
" name pvalue simpleName zscore \n",
"0 NR2C2_ENCODE 0.000643 NR2C2 0.0 \n",
"1 GABPA_ENCODE 0.000643 GABPA 0.0 \n",
"2 ERG_CHEA 0.000679 ERG 0.0 \n",
"3 TAF1_ENCODE 0.002002 TAF1 0.0 \n",
"4 ELF1_ENCODE 0.003928 ELF1 0.0 "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Import pandas\n",
"import pandas as pd\n",
"\n",
"# Read results\n",
"chea_dataframe = pd.DataFrame(x2k_results['ChEA'])\n",
"chea_dataframe.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"** Table 1 | Results of the ChEA analysis. ** Each row represents a transcription factor predicted to regulate the input gene list.\n",
"\n",
"### 3.2 G2N Results\n",
"The results for the G2N analysis can be accessed in x2k_results['G2N']
.\n",
"\n",
"The results are stored in a dictionary containing two keys:\n",
"* `edges`\n",
"* `interactions`"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" type | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" NR2C2_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
" 1 | \n",
" GABPA_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
" 2 | \n",
" ERG_CHEA | \n",
" tf | \n",
"
\n",
" \n",
" 3 | \n",
" TAF1_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
" 4 | \n",
" ELF1_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name type\n",
"0 NR2C2_ENCODE tf\n",
"1 GABPA_ENCODE tf\n",
"2 ERG_CHEA tf\n",
"3 TAF1_ENCODE tf\n",
"4 ELF1_ENCODE tf"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# G2N nodes dataframe\n",
"g2n_nodes_dataframe = pd.DataFrame(x2k_results['G2N']['nodes']).drop('pvalue', axis=1)\n",
"g2n_nodes_dataframe.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"** Table 2 | Nodes of the Genes2Networks expanded subnetwork. ** Each row represents a node in the expanded subnetwork. The type column indicates whether the node is a Transcription Factor identified by ChEA, or an intermediate protein."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" source | \n",
" target | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 10 | \n",
" 6 | \n",
"
\n",
" \n",
" 1 | \n",
" 10 | \n",
" 7 | \n",
"
\n",
" \n",
" 2 | \n",
" 10 | \n",
" 10 | \n",
"
\n",
" \n",
" 3 | \n",
" 10 | \n",
" 14 | \n",
"
\n",
" \n",
" 4 | \n",
" 10 | \n",
" 15 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" source target\n",
"0 10 6\n",
"1 10 7\n",
"2 10 10\n",
"3 10 14\n",
"4 10 15"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# G2N edges dataframe\n",
"g2n_edges_dataframe = pd.DataFrame(x2k_results['G2N']['interactions'])\n",
"g2n_edges_dataframe.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"** Table 3 | Edges of the Genes2Networks expanded subnetwork. ** Each row represents an edge in the expanded subnetwork generated by G2N on the top transcription factors identified by ChEA.\n",
"\n",
"### 3.3 KEA Results\n",
"The results for the KEA analysis can be accessed in x2k_results['KEA']
."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" combinedScore | \n",
" enrichedSubstrates | \n",
" name | \n",
" pvalue | \n",
" zscore | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0.0 | \n",
" [RB1, SUB1, JUN, RCOR1, YY1, SP1, HDAC4, HDAC2... | \n",
" CK2ALPHA | \n",
" 2.373774e-09 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 1 | \n",
" 0.0 | \n",
" [RB1, JUN, SMAD4, SMAD3, SP1, SREBF1, HDAC4] | \n",
" ERK2 | \n",
" 5.720848e-07 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 2 | \n",
" 0.0 | \n",
" [JUN, SP1, SP3, SREBF1, ZNF384, HDAC2, GABPA, ... | \n",
" CDK2 | \n",
" 2.061995e-06 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 3 | \n",
" 0.0 | \n",
" [RB1, MED1, JUN, EZH2, ERG, SP1, SP3, SREBF1, ... | \n",
" MAPK1 | \n",
" 7.049956e-06 | \n",
" 0.0 | \n",
"
\n",
" \n",
" 4 | \n",
" 0.0 | \n",
" [JUN, SMAD4, SMAD3, SP1, SREBF1, HDAC4] | \n",
" ERK1 | \n",
" 1.296236e-05 | \n",
" 0.0 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" combinedScore enrichedSubstrates name \\\n",
"0 0.0 [RB1, SUB1, JUN, RCOR1, YY1, SP1, HDAC4, HDAC2... CK2ALPHA \n",
"1 0.0 [RB1, JUN, SMAD4, SMAD3, SP1, SREBF1, HDAC4] ERK2 \n",
"2 0.0 [JUN, SP1, SP3, SREBF1, ZNF384, HDAC2, GABPA, ... CDK2 \n",
"3 0.0 [RB1, MED1, JUN, EZH2, ERG, SP1, SP3, SREBF1, ... MAPK1 \n",
"4 0.0 [JUN, SMAD4, SMAD3, SP1, SREBF1, HDAC4] ERK1 \n",
"\n",
" pvalue zscore \n",
"0 2.373774e-09 0.0 \n",
"1 5.720848e-07 0.0 \n",
"2 2.061995e-06 0.0 \n",
"3 7.049956e-06 0.0 \n",
"4 1.296236e-05 0.0 "
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# KEA Results\n",
"kea_dataframe = pd.DataFrame(x2k_results['KEA'])\n",
"kea_dataframe.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"** Table 4 | Results of the KEA analysis. ** Each row represents a protein kinase predicted to regulate the expanded subnetwork generated by G2N.\n",
"\n",
"### 3.4 X2K Results\n",
"The results for the X2K analysis can be accessed in x2k_results['X2K']
.\n",
"\n",
"The results are stored in a dictionary containing two keys:\n",
"* `nodes`\n",
"* `interactions`"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" name | \n",
" type | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" NR2C2_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
" 1 | \n",
" GABPA_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
" 2 | \n",
" ERG_CHEA | \n",
" tf | \n",
"
\n",
" \n",
" 3 | \n",
" TAF1_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
" 4 | \n",
" ELF1_ENCODE | \n",
" tf | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" name type\n",
"0 NR2C2_ENCODE tf\n",
"1 GABPA_ENCODE tf\n",
"2 ERG_CHEA tf\n",
"3 TAF1_ENCODE tf\n",
"4 ELF1_ENCODE tf"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# X2K nodes dataframe\n",
"x2k_nodes_dataframe = pd.DataFrame(x2k_results['X2K']['nodes']).drop('pvalue', axis=1)\n",
"x2k_nodes_dataframe.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"** Table 5 | Nodes of the final Expression2Kinases network. ** Each row represents a node in the final X2K network network. The type column indicates whether the node is a Transcription Factor identified by ChEA, an intermediate protein identified by G2N, or a protein kinase identified by KEA."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" source | \n",
" target | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 10 | \n",
" 33 | \n",
"
\n",
" \n",
" 1 | \n",
" 10 | \n",
" 30 | \n",
"
\n",
" \n",
" 2 | \n",
" 10 | \n",
" 25 | \n",
"
\n",
" \n",
" 3 | \n",
" 10 | \n",
" 9 | \n",
"
\n",
" \n",
" 4 | \n",
" 10 | \n",
" 6 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" source target\n",
"0 10 33\n",
"1 10 30\n",
"2 10 25\n",
"3 10 9\n",
"4 10 6"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# X2K edges dataframe\n",
"x2k_edges_dataframe = pd.DataFrame(x2k_results['X2K']['interactions'])\n",
"x2k_edges_dataframe.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"** Table 6 | Edges of the final Expression2Kinases subnetwork. ** Each row represents an edge in the final network identified by integrating the results of ChEA, G2N, and KEA."
]
},
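{
"cell_type": "markdown",
"metadata": {},
"source": [
"The edge tables list integer `source` and `target` values rather than names. The sketch below shows how they could be translated to node names, under the assumption (not confirmed in this notebook) that each value is a zero-based position in the corresponding `nodes` list. `label_edges` is a hypothetical helper, and `mock_network` is made-up data with invented `type` values.\n",
"\n",
"```python\n",
"def label_edges(network):\n",
"    # Assumption: 'source' and 'target' are zero-based positions in network['nodes'].\n",
"    names = [node['name'] for node in network['nodes']]\n",
"    return [(names[edge['source']], names[edge['target']])\n",
"            for edge in network['interactions']]\n",
"\n",
"\n",
"# Made-up network with the same layout as x2k_results['X2K']\n",
"mock_network = {\n",
"    'nodes': [{'name': 'TF_A', 'type': 'tf'},\n",
"              {'name': 'P1', 'type': 'other'},\n",
"              {'name': 'KINASE_A', 'type': 'kinase'}],\n",
"    'interactions': [{'source': 0, 'target': 1}, {'source': 2, 'target': 1}],\n",
"}\n",
"print(label_edges(mock_network))\n",
"```\n",
"\n",
"If the assumption holds, `label_edges(x2k_results['X2K'])` would return the final network as a list of name pairs. An `IndexError` would indicate that the indices follow a different convention."
]
},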
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"hide_input": false,
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.2"
}
},
"nbformat": 4,
"nbformat_minor": 2
}