{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "V_oh9DspLamP"
},
"source": [
"# ISB-CGC Community Notebooks\n",
"\n",
"Check out more notebooks at our [Community Notebooks Repository](https://github.com/isb-cgc/Community-Notebooks)!\n",
"\n",
"```\n",
"Title: Quick Start Guide to ISB-CGC\n",
"Author: Lauren Hagen\n",
"Created: 2019-06-20\n",
"Purpose: Painless intro to working in the cloud\n",
"URL: https://github.com/isb-cgc/Community-Notebooks/blob/master/Notebooks/Quick_Start_Guide_to_ISB_CGC.ipynb\n",
"Notes: \n",
"```\n",
"***"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "35Ew4aYCspKn"
},
"source": [
"# Quick Start Guide to ISB-CGC\n",
"[ISB-CGC](https://isb-cgc.appspot.com/)\n",
"\n",
"This Quick Start Guide is intended give an overview of the data available, to walk you though the steps of setting up your accounts, and get started with a basic example in python. If you have read the R version, you can skip to the Example section.\n",
"\n",
"## Access Requirements\n",
"* Google Account to access ISB-CGC\n",
"* [Google Cloud Account](https://console.cloud.google.com)\n",
"* Some knowledge of SQL\n",
"\n",
"## Access Suggestions\n",
"* Favored Programming Language (R or Python)\n",
"* Favored IDE (RStudio or Jupyter)\n",
"\n",
"## Outline for this Notebook\n",
"* Quick Overview of ISB-CGC\n",
"* About the Data on ISB-CGC\n",
"* Overview How to Access Data\n",
"* Account Set up\n",
"* ISB-CGC Web Interface\n",
"* Google Cloud Platform (GCP) and BigQuery Overview\n",
"* Example of Accessing Data with Python\n",
"* Where to go next"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "OAgOtUGpvR5h"
},
"source": [
"## Overview of ISB-CGC\n",
"The ISB-CGC provides both interactive and programmatic access to\n",
"data hosted by institutes such as the [Genomic Data Commons (GDC)](https://gdc.cancer.gov/) of the [National Cancer Institute (NCI)](https://www.cancer.gov/) and the [Wellcome Trust Sanger Institute](https://www.sanger.ac.uk/) while leveraging many aspects of the Google Cloud Platform. You can also import your own data to analyze it side by side with the datasets and share your data when you see fit."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"cellView": "form",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 421
},
"colab_type": "code",
"id": "C8FMZWeWZhzQ",
"outputId": "2fc93d29-f09a-4d8b-c1cd-115d655ad593"
},
"outputs": [
{
"data": {
"image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkz\nODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2Nj\nY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQED\nEQH/xAAbAAEAAgMBAQAAAAAAAAAAAAAAAwUBBAYCB//EAEoQAAEDAgIECQgIBQQBAwUAAAEAAgME\nERIhBTFSkhMUFkFRU1SR0QYiIzRhcXKxFTIzNYGhweFiY3OCskKTwvDxJETSNkNkg6L/xAAZAQEA\nAwEBAAAAAAAAAAAAAAAAAQIDBAX/xAAqEQEAAgIBBAEDAwUBAAAAAAAAAQIDERIEEyFRMTJBYRRS\nkTNxgaHwIv/aAAwDAQACEQMRAD8A+foiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIi\nAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIi\nICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAi\nIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiICIiAiIgIiIC\nIiAiIgIiICIiAiIgIiICIiAiIgIpuJ1PZ5twpxOp7PNuFBCim4nU9nm3CnE6ns824UEKKbidT2eb\ncKcTqezzbhQQopuJ1PZ5twpxOp7PNuFBCim4nU9nm3CnE6ns824UEKKbidT2ebcKcTqezzbhQQop\nuJ1PZ5twpxOp7PNuFBCim4nU9nm3CnE6ns824UEKKbidT2ebcKcTqezzbhQQopuJ1PZ5twpxOp7P\nNuFBCim4nU9nm3CnE6ns824UEKKbidT2ebcKcTqezzbhQQopuJ1PZ5twpxOp7PNuFBCim4nU9nm3\nCnE6ns824UEKKbidT2ebcKcTqezzbhQQopuJ1PZ5twpxOp7PNuFBCim4nU9nm3CnE6ns824UEKKb\nidT2ebcKcTqezzbhQQopuJ1PZ5twpxOp7PNuFBCim4nU9nm3CnE6ns824UEKKbidT2ebcKcTqezz\nbhQQopuJ1PZ5twpxOp7PNuFBCim4nU9nm3CnE6ns824UEKKbidT2ebcKcTqezzbhQQopuJ1PZ5tw\npxOp7PNuFBCim4nU9nm3CnE6ns824UEKKbidT2ebcKcTqezzbhQQopuJ1PZ5twpxOp7PNuFBCim4\nnU9nm3CnE6ns824UEKKbidT2ebcKcTqezzbhQfeRqCysDUFlAREQYVJpXynpNGVhppIpZHtAJLbW\nF/erxfOPK/8A+oZ/hb8kHU6P8rKCtqOBcH05IydLYA/irP6ToO2U/wDuBfKUsg+vRyMlYHxPa9h1\nOabgr2ue8iPuM/1XfouhQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBE\nRAREQEREBERAREQEREBERAREQFg6isrB1FAGoLKwNQWUBERBhfOvLAEeUExIIBa23tyX0VQ44JZp\nGOYC6K2IuAtnmg+Sovqom0eWtcH0+FxsDlmU4WiL2sY2J5c/B5oBsbE59xQU/kbE92gweEkjvK4j\nCBnq6Qr3gH9pm/8A58FM0Bos0AAcwRBFSuc6Hz3FxDnC557EhTKCk+xP9R/+RU6AiIgwtWr0lS0Z\nwzSgO2RmV7rpzTUUsw1saSFR0Oh+HDpdIPs+YeYMWd+lbY6VmOV58M72mJ1Vb02laOqfgjlGM6mu\nFrrcXKSaAqWzlkMsTy3P61iPwV/omd9RQMd
LnI0ljj0kK2XHSscqTuEUvaZ1aG3JIyJhfI9rGjW5\nxsAvEdRDLh4OWN+K9sLgb21rU03Sy1ui5aeH67yy2rKzgTry1BatLoY0WlYJ43PlZhlMsj8IOJ2A\nDJoA1N5gudqtnzRRsL3ysa1psXOcAAUjnhlDTHKx+K9sLgb21qm0ho6qm0GYWMLqh1SJnBpbf7TF\nliyuB0rSg0XpSNsxiDoy8yOGJ7WvNxHrw+aCcLgCNWXOg6jG3HgxDFa9r52WVzDtG6VPnsErRhLW\nsM95AzG0lmO+sgOzvlfWteqbpOnNPHK+qc52ULWTOJjHCankfW82wzvqPvQdc17XtxNcHN6QbhZB\nDgCCCDqIXL/RmlzWU/nScBwb2SN4ezRcu5gdeY5j+CvNDwPptE0sErHMkjjDXNc/EQR7UG6iIgwt\nGs0vR0b8Ekhc8a2sFyFLpKd1No+aVn1mty95yVNQ6Ea+Jxr34ZZh5gxZj2+9VtM/EMr2tE8arSj0\nvR1j8Echa86mvFiVvLlH6BqmSuEM0T8Gdw6xH4LotGTuqdHwyv8ArFufvGSiszPiUY72nxaGvpip\nkhNLG2o4qyaXA+ew8wYSRa+QuRbNaLtOOpuBjbJDWNwNcZi7AZbvLfMAFiRbP8Olbmm9Jx6OjgEs\ncLmzvwXnlEbBkTmSD0KKDTej3ikbOGwzSj0bC24bmWgg2yBtkcrq7ZqxeUNU6ZgkpIRG5zBiEpJs\n55YMra7i6xR+Uzpmwumjp2sk4MlzJriMPJFnZCxy/Nbf07oc0nGMdosWAHgiLm2LLLMazdYbpHQ1\nBE6jb5scUQfbg3ODmi2YNvO1jpQV48pp8UktqYxuiY6GMyEEkueCb21WasR+UlS6QTuY3gnRtlbB\nG67yDC59tWeY1qyqdL6Pjhmc2nc90DMRaYHNA1HDciwOYNtakk0jSt0dLXU9OZDAcBaYyxzSMs7i\n4AvfVqQaXKCqeMEMFLJIC67mzExkBmPI218ygqPKeSR8kMDGM9Dja8OuWus02II/iVlSaVopIaa8\nQY+YEtbE3hGgXw3xNFgD0my8R6b0S7AJDGyV8XCFoYSLYb2vbXYHJBcrKpZ/KSjjYDEyZ98V7xOb\nhszFncXsRzr3UeUNHCyfCJZZIMONjI3HWQMjbPWgt0WAbgHPPpWUBa1XXU1E0GolDb6hrJ/BTuOF\npd0C65ijoXaSqHV+kHBsDz5oc61+gKlrTHiG+HHW27XnUQuINN0FRIGNmwuOrGLXVguTqvJyZk+C\nnmieHZta51nWV1oKWY0r4Kg3kp38GTe6rW1t6tDTNhxxXljnabTFYaHRs0zL8JYNjAFyXE2GXPmV\nRR6fqYoouEJkkjY+OVsreDu8SMaHuyu0Wdc+9dJUyxQxB831cQAyvmTYfmtCtqaOannY6GV2QEmB\nrmOztlcZg2stXI0B5TSYQeAp8i/Fab64bJgtHl5x5+7pUdXpyvwcJG2GEHHhxPuLNlawl2WRzKu9\nH09HHRxtpYrRxudbHcuDrm+Zzve6jZpCglayzTaT6odERcEE31asifwQVEXlDWRMwyxQzOkfJwTu\nEte0wjscv4h06lY6I0tPXVD4p6eOKzXEFkhdfC8sPMOcKY19A1mLAfNGLCIjcN13tbUvQ0hRMbwg\nBa3FhxcGQNfiUFLHpeohdw5rDUkzzsfSBrbtYzHYiwuPqjXfWpoPKOeWmZK6jYOEcYmYZQ4OkIBa\nMum57lYt0hQMkc5rCw3s5/BEX87Drt0iyzFpGhc1oYxzW/WAMJAvbELZa7ZoKoeUVRUVD4mwcCI6\nhrMQcCSOEwkFusX5llnlNK+JjhDTNMhjILpjhYH3yebZOFtXtVmzSFI+cBkY883Ly23nB2Gx9oKl\nZFR18bXCG7A8StuwtDjbI+3WgpovKiWSoiiFG2zow9zuFFr2cbtvmW+br9qvKCaWooIpp2MY+Rgc\nWscSBf2
qfgo7g8G24Fgbah0LNgG2AsAEGRqCysDUFlAREQeXuaxpc9wa0ayTYBaT6XjD5yydhhqW\nYXANubWtkb/opa+N0tOA1jnkPa6zXAEWOsXyv7Cq5tJVtl4TgngEWOBwYXNxXN7Gwcb3uEG3Ho+R\nlRxgVA4Y5OPB5Ws0ZC+R80ZrxTaJ4CoErpy8gtP1ddg4dP8AF+S13U2keLuzlL3EAWlN2twn+IZ3\ntc371BIK+SRsTHzCqscR4TzQOD1EX14ufp50HQotXRzJo4CJi8kuJAfrA6PrO+a2kENJ9if6j/8A\nIqdQUn2J/qP/AMip0IEREENXAKmlkhOQe211W0tTB6Kn0i1sdTT5NL8gfaCrhRTU8M4tNEyQfxC6\n0raIjUq2rudw0KqtpYZHPpg2askGEBmZPvWzo2mNJRsjcbvzc4+0qWGlgp78DCyO+y2ylS1o1qER\nWd7lr19Q6kpHzgAiOxcDs3z/ACVUNM1hY28EbXXc12IgAuba4uXDWSenVqKvUsOhZrqvRNbNPIY5\nHskHpCcI85lnkAO/D3alDHpmWSRzLwsGJtpHDJgOLWL/AMNubXqV1YdCWHQEFRJpSZrHua6BxDww\nMAzAuPPzcMs783NmvP0rU+bcU4LmEtGIEE2OZIdkMvaParmw6EsOhBr0FQamm4RxBOIg2bYZfifm\nVsrGrUsoCIiDXrqfjVHLDexe3I+3mWjR1VNMY21gbHWU+Vn5fiFaqGoo6eptw8LJLaiRmqzH3UtW\nd7ho1lVTQOkbSBslZUDDZmZPtK3aCn4rRxQ3uWtzPt51mno6emvwELI76yBmplMR9ytZ3uWpXR0r\npad9TiJY88G0AnES0g5AZ5ErSh0To1ggqWSPwRgMYS/IgOOFp58ibBb9XSmomp3B5aInlxsSCfNI\n/VaM+isdQ5sTW8C0NcGvvYyYrk39w/NSu8ReT2jHQl0ZkON/CcIH53sRr9x96kPk9QOqJJvShzwQ\nRjyF7f8AxC8DQ72YS1sDgGgOjdk1zvOzOWvzh3KWk0fJRzMl82R/nNe4ZFwOHM+6xQYq6CibJLws\ntQ0VVy6JjnFpOV3WHuCn4nSOo5nNle2Kd3DmRr7WORuDzagvVbRvqponNldFga8YmGzgSBYhZhhk\n4gaZ0cbC2PgxliYcug8yDQhodGw4ZIp52OY7zrOc10hc7FmNZubnL2pBobRb3BsL3m8OAtD9YsWX\nPttkp2UVU17JQ5gMZBZEZHPbqIPnEXGvV7PalHS1sM/Dytgc9+IPDZCALuvll0IE+gqScOxGUYtZ\na+2WDBbuXl/k9QyTzTu4XhJWYC7Hm0XBy/EBWqygw0WaBcm3OVlEQYIDgQdRVHA6CjB0bpJoEbX4\noXu+q4axn0q9XiWKOZmCVjXt6HC6rMbaUvx8T8K+rrtHwvbPiZLUAYYwzznH2ZKTRFPJDTvknFpp\nnmR46L8ynhoqWndihp4mO6WtAKnSIne5Ta8ceNUc8Qmicxzi0HnFv1yWoNFUw4MXfaNoawXGQHtt\nfm1alPXwyVFMY48NyQfO1EA+4/Iqui0TOynMTuBc4w8GJCTeOwI83L2+xWZLOGEQl4Y+7HkuDTzE\nkk/NatLouGGKESyOlfG0MxOOsBpFvd5xULtFyvnjkDIomtAGCN1g2xvcHDrPstqUf0RMI8LWwCx8\ny5Btlz+b5345/wASCefRIcy0EzmvLSwyONzgtaw9n/bqR+iKZ8bGEvAbexBHObnm6Qt8allBpO0Z\nTuYWHHYgA59DsXzXl+iaaSPA7HbLn6G4R+S30QaEeiqeMtIdIbG9iRY536OlbNNTimjEbXvc0ABo\ncb4QNQUyICwdRWVg6iggifO+Jjw2MYmg6yvfp+iPvKUvqsPwD5KVShF6foj7ynp+iPvKlRQlF6fo\nj7ynp+iPvKlRBF6foj7ynp+iPvKlRBF6foj7ynp+iPvKlWEGrS8NwRsI/
rv5ztFTen6I+8rzSfYn\n43/5FVWl9NzaPreAjijc3CHXddWiJmdQRC39P0R95T0/RH3lc1yoquoh/NOVFV1EP5q3bsnjLpfT\n9EfeU9P0R95XNcqKrqIfzTlRVdRD+aduxxl0vp+iPvKen6I+8rmuVFV1EP5pyoquoh/NO3Y4y6X0\n/RH3lPT9EfeVzXKiq6iH805UVXUQ/mnbscZdL6foj7ynp+iPvK5rlRVdRD+aDyoqSQOAh/NO3Y4y\n6X0/RH3lPT9EfeVIsrNCL0/RH3lPT9EfeVKiCL0/RH3lPT9EfeVKiCL0/RH3lPT9EfeVKiCL0/RH\n3lPT9EfeVKiCL0/RH3lPT9EfeVKiCL0/RH3lPT9EfeVKiCL0/RH3lPT9EfeVKiCL0/RH3lPT9Efe\nVKiCL0/RH3lPT9EfeVKiCL0/RH3lPT9EfeVKiCL0/RH3lPT9EfeVIuNm0xpBs0jRUuADiBkOn3LS\nlJv8KWtFfl13p+iPvKen6I+8rjvpnSPandw8E+mdI9qd3DwWnYsp3Ydj6foj7ynp+iPvK476Z0j2\np3cPBPpnSPandw8E7Fjuw7H0/RH3lPT9EfeVx30zpHtTu4eCfTOke1O7h4J2LHdh2Pp+iPvKen6I\n+8rjvpnSPandw8E+mdI9qd3DwTsWO7DsfT9EfeU9P0R95XHfTOke1O7h4J9M6R7U7uHgnYsd2HY+\nn6I+8p6foj7yuO+mdI9qd3DwT6Z0j2p3cPBOxY7sOx9P0R95WGPeXvY8NBDQcj038FSeT1fVVdXK\nyomMjQy4BA13V2PWJfgb83LK1eM6lpW242zS+qxfAPkpVFS+qxfAPkpVRYREQEREBERAREQQUn2J\n+N/+RXLeVH3t/wDrb+q6mk+xPxv/AMiqHT2jKyr0hwtPDjZgAviAzz6StKTqyaqqmhY6lDzGXOxO\nzAxW+rzXz1lTiljAAdG27nPuQ0kDzQRn/p1rwNB6TGqnI/vb4rP0JpSxHF3WP8bfFbbj2s14obQ4\nmxtmkLwC0edYW9ikFOMIa2Jtixxc43NiL5XHRYL2NB6TGqmI/vb4p9B6Ttbixt0Y2+Kbj2IamLAJ\nAyECNpGGTp8bqaGmhlZH5oBlaCM9WH636p9B6TtbixsP42+KfQek+zHfb4puPYzHRwuIfZzg5pcG\ntBtm0mwy5veooqeKOoe6X7JjQSH5Zkajb8e5S/Qmk8v/AE5y/jb4rH0HpPsx32+Kjcexn6NZia0l\n/nE2ePq2xADvBWrUQiGSO2IYs8L9Yzt+i3BonS4wWhd6M3b57cvzXn6E0mX4nU5JvmcbfFN/kdoN\nSysDUsrmUEREBERAREQEREBERARYRBlERAREQEREBERAXz+o9Yl+M/NfQFzcnk1M+V7xUxjE4n6p\nW+G0V3tlkrM60qKZ4bG4NfgkxA3xYbjouprUZdd7mlhJu7U6+Lo6LLf5MTdpj3SscmJu0x7pW03p\n7ZcLempDDDIW+ZG43HC2JwgZ6j3LxVGldE5zC1zyNd87+FlvjyZnGqqZn/CU5MTdpj3So509p429\nKsPvAwMmbGwMIe085z5ufmWw/ij5nOcYy1x13zxXP5WW5yYm7THulOTE3aY90pzp7ONvTRbxeOPz\nTFi4Ih5xZ4sPN+N1FC/0MYZM2KxPCX5/w5/crPkxN2mPdKcmJu0x7pTnT2jjb0r/AP0rYybRkhvm\nC+ZNs7/ivTm0d8uDADjbO99dvcNS3uTE3aY90pyYm7THulOdPaeNvTTbxQxhrizEM7X829m35/eq\n9+HG7B9W+XuV5yYm7THulOTE3aY90qYyUj7omlp+zx5K+vTf0/1C6MesS/A35uVdojQ8mjqh8j5W\nvDm4bAW51Yj1iX4G/Ny58sxNtw2pExHlml9Vh+AfJSqKl9Vh+AfJSrJoKKSohiNpJWMPQXWWKuoZ\nS075n6mjV0lcZNK6eZ8shu5xuV0Yc
Hc8/ZlkycHY8epe0RbwTj1L2iLeC4orC6P0dfbL9RPp23Hq\nTtMW8FM1wc0OaQQdRC4JXnk5XYXmkkOTs2e/nCzy9LwryiVqZuU6l0aLCyuN0IKT7E/G/wDyKqtO\naXqdH1MccDYi1zMRxgnn96taT7E/G/8AyK5vys9eh/pfqVekbt5TV45TV+xT7p8U5TV+xT7p8VTI\nt+FfS+oXPKav2KfdPinKau2KfdPiqZE4V9GoXPKau2KfdPinKau2KfcPiqZE4V9GoXPKau2KfdPi\nnKau2KfdPiquka11XC14BaXgEH3reigY50V42huJmtgs73O5/cVE1rH2R4Tcpq7Yp9w+Kcpq7Yp9\n0+K054o7wF9o7/XDmYHWv0D2L1wRMpD4Y2tBdweVsWWQ9o1Zpqvo8NrlNX7FPunxTlNX7FPunxWq\n6KzMQjaZy0EM4Ox1m5w9y8QsY2pkknbGGxj6uZbiOoZX/wChONfR4bvKav2KfdPinKav2KfdPite\nRlPTMbiwvYXut5gJc2wtnza15raaJjZJI2kWd02Gvmyz701X0eG1ymr9in3T4pymr9in3T4qmRTw\nr6NQueU1fsU+6fFOU1fsU+6fFUyJwr6NQueU1fsU+6fFOU1fsU+6fFUyJwr6TqFzymr9in3T4qei\n8oayeshheyDC94abNN/mufW1ov7zpf6rfmk0rr4RqHUadr5qCKF0GG7yQcQuqblFX/ytz91veVn2\nVN8Tv0XNqcVKzXcw5Mlpi3hbcoq/+VufunKKv/lbn7qpRaduvpTnb2tuUVf/ACtz905RV/8AK3P3\nVSiduvo529rblFX/AMrc/dOUVf8Aytz91Uonbr6Odva25RV/8rc/dOUVf/K3P3VSiduvo529rblF\nX/ytz905RV/8rc/dVKJ26+jnb2tT5RV554txOUOkNqPcVUiduvo529rXlDpDaj3E5Q6Q2o9xVSJ2\n6+kc7e1ryh0htR7icodIbUe4qpE7dfRzt7WvKHSG1HuJyh0htR7iqkTt19HO3ta8odIbUe4nKHSG\n1HuKqRO3X0c7e1ryh0htR7icodIbUe4qpE7dfRzt7dRoLSlTXVMkc5aWtZcWbbnVsPWJfgb83LnP\nJX16b+n+oXRj1iX4G/Ny5csRFtQ6KTM18s0vqsXwD5KRR0vqsPwD5LFRLwURPOcgsb2isTMtIjfh\nzvlDX8NU8VYfMi+t7XLQo6d1VUMibz6z0DnWxpmnNxVNHsf+hVnoOj4Cl4V4tJLn7hzLtwdTSemi\n9P8Apc+TFbu6lsDRlFa3F2/mn0ZRdmZ+a3mMvmV7MYOpc3cv7ltxr6c3p3RjIqUVFIzBwf12t5x0\n/gufZLLG9sjHkOabgrvXtBBa4XBFiDzritJUZoKx8X+g+cw+xYZb5I8xaf5eh0cYrf8Ai1Y3/Z2m\nja1tdRMmbkTk4dB51trjfJ2v4pWcE82imyPsPMV2KmluUMOoxdq+vshpPsT8b/8AIqn0/ouqrquO\nSnY1zWssbuAzurik+xPxv/yKp9OaUqqKsbHA5oaWA5tut6RM28OebcY2q+T2keqZvhOT2keqZvhe\n+UGkNtm4s8oa/ai3FvxyfhXvwj5PaR6pm+E5PaR6pm+FJyhr9qLcWeUVf/K3P3TjkO/CLk9pHqmb\n4Tk9pHqmb4UvKKv/AJW5+6zyjruiHdPinHId+EPJ7SPVM3wnJ7SPVM3wpuUdd0Q7p8VkeUlbsQ7p\n8VHHId+EHJ7SPVM3wnJ7SPVM3wp+UlbsQ7p8VkeUtZ1cB/tPinHId6Gvye0j1TN8Jye0j1TN8LY5\nS1nVQbp8VkeU1XzxQdx8U45DvQ1uT2keqZvhOT2keqZvhbXKaq6mH8/FB5TVXPBD+fimsh3oavJ7\nSPVM3wnJ7SPVM3wtvlPU9RF+aDynqOeni7ymsnpPehqcntI9UzfCcntI9UzfC3OU8/Z494pynn56\na
PeKayejvVafJ7SPVs3wnJ7SPVM3wt3lPL2Vm+U5Ty9lZvlRrL6O9VpcntI9UzfCnodB18NdBLJG\n0MY8OPnjVdT8qJOyt3/2TlRJ2Vu/+yayejvVSeVh82lHtd+i5xWGldKHSXBXiEfB3/1XvdV62xxN\na6lzXndtwIiK6giIgIiIJGwSOZja3LM6xc212HOvFjllrWwyePAzGwlzGFoFgQdfdrUprmEEekGK\n+Y1s1ZN9mSruVtQ02Mc9wa1pJJsF5VoK6MNMpcRfLggf4r3961p6qOcx4mHCHXc23uyGfgkTPo1C\nHi02INLLEtx5kDLpWHwSxgFzLA6iDe//AG62TXhzXl8YxkOAsLix6brMNfhY1smLIm+EWyysBYi2\npN2NQ0UWTmSVhWVEREBERAREQEREF35K+vTf0/1C6MesS/A35uXOeSvr039P9QujHrEvwN+blx5v\nrl04/pZpfVYfgHyUNfGXMDx/p1qal9Vh+AfJSEAgg6iubLTuVmras8Z2pXBr2lrgCDrB51vxPD2A\njuWpPEYpS3m5l6p5MD7HUV5fTZJxZOFnVkryruFlGbtC9LXBI1LJe4869dyDzdxXK+UdW2erbAwA\niG93e08yvNK1ooaJ0g+0d5rB7VxhJJJcbk5knnWOW3jTv6LFuectigpX1lXHBHrccz0DpX0ACwA6\nFS+TWj+L0pqZBaSYZexqu1OKuo2z6vLzvqPiEFJ9ifjf/kVzXlP95t/pj5ldLSfYn43/AORXOeUk\nUr9JAsje4cGMw0npXXh+t5+T6VKil4tP1Mm4U4vP1Mm4V2bhz6RIpeLT9RLuFOLz9TJuFNwaRIpO\nLz9RLuFZ4tP1Eu4U3BpEil4tP1Mm4U4vP1Mm4U3BpEil4vP1Eu4U4tP1Mm4U3BpEil4vP1Mm4U4v\nP1Eu4U3BpEil4tP1Mm4U4vP1Mm4U3BpEil4vP1Eu4U4tP1Mm4U3BpEil4vP1Mm4U4vP1Eu4U3BpE\nil4tP1Mm4U4vP1Mm4U3BpEil4vP1Eu4U4tP1Mm4U3BpEil4vP1Mm4U4vP1Eu4U3BpEil4tP1Mm4U\n4vP1Mm4U3BpEil4vP1Eu4U4tP1Mm4U3BpEil4vP1Mm4U4vP1Eu4U3BpEil4tP1Mm4U4vP1Mm4U3B\npEil4vP1Eu4U4tP1Mm4U3BpEil4vP1Mm4U4vP1Eu4U3BpEil4tP1Mm4U4vP1Mm4U3BpEil4vP1Eu\n4U4tP1Mm4U3BpEil4vP1Mm4U4vP1Eu4U3BpEil4tP1Mm4U4vP1Mm4U3BpEil4vP1Eu4U4tP1Mm4U\n3BpbeSvr039P9QujHrEvwN+blz/kxFJHWyl8b2jg9bmkc4XQD1iX4G/Ny4831OjH9LNL6rD8A+Sl\nUVL6rD8A+SlWLVr1kPCR3H1mqsV0tOWhxPJY4AHmXn9X09rzzpDfFkiPEtF0sw1Sut+CjdUTj/7z\nvyW/xB+21a9VomeWFzI5mNLsrkFYRj6n8/y1i2LbmdI1klXPd7y9rMm3UmhqA19c1jh6JnnP93Qr\nDkrUdpi3SrzRWjmaNpuDacb3G7nWtddtMdt/+nTk6jHTHxxz5boAAAGQCyiLqeUgpPsT8b/8ivTp\nJOFLI2NNgCS51td/Z7F5pPsT8b/8ivTfWpPgb8ypQYp+rj/3D4Jin6uP/cPgpEUJR4p+rj/3D4Ji\nn6uP/cPgpVhBqVlXLR0r6h8LHNZrDZM9fuVXynHYZN79lv6f+5qj3D5hUUX2LPhCzy5YxVidbb4c\nUZN7b3Kb/wDBk3v2Q+U4AuaKQD4v2WooKz1V/uWVOqi1orx/22t00RG9rDlWzsjt/wDZOVbOyO3x\n4LmUV7XmJmGtenxzETp3GidKDSbJHNiMeAgZm91uyP4ONzyL4ReyofJH7Cp+IfJXlV6tJ8K1r5hw\nZYitpiDhJOpO8E4STqTvBSIpUR8JL1J3gnCS9Sd4KREEfCSdSd4
KtqdPQ01S+B9PMXstfCARqv0q\n2XKV337Wf2/IJNorWba+BY8pafs1TujxTlLT9mqd0eKrUXN+pj9qOSwPlRSjXBP3DxTlTSdRP3Dx\nXM1XrEnvUS2vbUxp3YcNL13Lr6fyjpqiojhbDMHSODQSBbP8VbSPwMxWvmBZcJoz70pf6rfmu5qP\ns/7m/MKaTuPLHqKRS2qnCSdSd4JwknUneCkWVdgi4STqTvBOEk6k7wUqIIuEk6k7wVXUeUVPTzvh\nkhmxMNja1vmrhclM0HStdcA+k5wq2vFKzaY+F8dOdtLHlTSdRP3DxTlTSdRP3DxVfgZsN7kwM2G9\ny5/1lP2/7dP6T8rDlTSdRP3DxTlTSdRP3DxXLzfbP+IrwtsltT4WxYKWruXY0nlDTVdVHAyKVrnm\nwLgLfNWckhZhs0uLjYD/AL7lxOg/vim+I/IrtZfrw/H/AMSppO48ufPSKW1BwknUneCcJJ1J3gpE\nVmKPhJOpO8E4SXqTvBSIgj4SXqTvBOEk6k7wUiII+El6k7wThJepO8FKsII+Ek6k7wThJOpO8FIi\nCISuD2tdGW4jYG4PNf8ARB6xL8Dfm5Zl+0h+P/iVgesS/A35uUoZpfVYfgHyUqipfVYfgHyUqhIi\nIgIiICIiAiLCCul0jT6NpWSVTi1j5nMxAXscR1rZppoqiV0kMjZGOY2zmm41uWrpXQtNpNjGSDgw\nH43FjQHO9l1Lo+hptHvfDSxNjZhaTbWTc5kq3jSvnb3pAPMLMIcWYxwgZe5bz6vw/BaD3SMnYaVl\nTFB5wc7ATYXGbWm/y6VY1sssUTeBAL3OAsbXtz2uRcrSh0nKZuCwMkyDQc2HEXOFiDqsGqqyE1uk\nQxreDkEhcCbRamYMzq135lvaKxmGYyOldeZ2F0jcJI5srBQO00OAfK2nJa02tizJw4jkAVmXTLYg\nH8A4xElocHZlwF7W/JB70/8Ac1R7h8wqKL7FnwhW2lZzUaBqXOjdGQcJBB2hmLgZKijq42xtBD8g\nP9K5+ppa1I4xt1dNaKzO20oKz1V/uWOOR9D91RVFTG+BzQHXPS1c2LDki8TMfd1XyUms+Veic6Lp\nv9UtKfTDp/JH7Cp+IfJXlV6tJ8Ko/JH7Cp+IfJXlV6tJ8K3r8Q8vP9coqiujp5CxzJHYWY3FoFmt\nva//AIXt1ZTsIEk0bC5xaA54Fze3zWtX6MFZM55dGA6PgzijxFuZN2m+RzWtPoubhC2EhzJXNMjn\nDmEhf0+09P4KWbfjr4Xlt8TGvF2PeLB49n7qQ1VOHhhmjDjezcQvlr+RWhJoqZ/BxmqbwMTMMbeD\nzB5iTfPoWBoY8I9758RkOJ4s4C9yRYB1ufnvqQWMVRDMAYpWPBNhhcDdcxXfftZ/b8gr2moeBrTN\nc4WxNjb7TznuAH4Kirvv2s/t+QVcn9Ox9mERF5rNU1XrEnvUSlqvWJPeol3ZPt/aHrdN/TbWjPvS\nl/qt+a7mo+z/ALm/MLhtGfelL/Vb813NR9n/AHN+YV8fw5+r+p4NYwTGMMkcGuDHPa24a4835hDX\nUjWlxqYQ0HCTjGvoUEuj3PkdabDG6QSEYfOByvY3yBt0dKhh0MIw0cLfC3ADYk4cLmi9ydrmsruV\nvOrKZhcHVETSy2K7xlfp71hlbTSEcHPG++y8G2RP6FabdEYTEDPdkL8bBgzzcHG5v0j/AMrDtCtf\nhvMbBhYbN13x/wDz/JBvNraV5aG1ERxmzbPGa5iX71rv6i6A0Uz5opXTtD2WuWMLcQB1fW1e+/ss\nudqJWR6Vrcbg28nOs80TOO0Q36edXhKih4zB1rU4zD1rV5nbv6l6PKvtWT/bP+IqNe5iDK8jMFxX\nhell+VMP0t7Qf3xTfEfkV2sv14fj/wCJXFaD++Kb4j8iu1l+vD8f/Eq2P4cfVfW8VwlNHKIb47ZW\n1+23tWg+Tg/U2VMURcMb+DJ
tkdTSDz2ubc63tITPp6GWWKxe0XFxda8ldLSsBlhe82c84sLSGttf\nUTfWruZp8NpUxukc6Rrgx54MRC1wGkDVfMkr1S11W4MkLpZI8XpDwP1fOIsLDMaulT/S93s9AQx7\nS9jsQNxnrHNqXlumA0iMUjw8DEWM86zbA5WGvztXvzQacNRpQMBvKHPJcTJGcnYW2bYNOWvxXQjU\ntGtrJKWQ5Nc18Z4IWzMl8h+Nx3FaLtK1bBJGREZW1LWA4Tbg7gOOvXr7wgvUWrR1ZqcYdHwbm2u0\nnMe8W/b2raQEREEUv2kPx/8AErA9Yl+Bvzcsy/aQ/H/xKwPWJfgb83KUM0vqsXwD5KVRUvqsPwD5\nKVQkREQEREBERAREQFE31qT4G/NylULmyCYvYGkFoGZtqv4oPcsUczMErGvbrsQouJU1rcBHa1vq\n+2/zzXu8+zHvHwS8+zHvHwQeHUVK5mB1PGW67YRbVb5ZKIaLpuMcKWYsrBhAwgWt0dA51sXn2I94\n+CXn2Y94+CDQ01EyHQdQyNoa3I2HxBUcX2TPhC6Ovpp62jkp/Rsx2865Ns79CpuTVUNVa3uKrkx9\nyut6PlAoav1Z/uW7yaqu2t7ih8makixrGke4rOnTRW0TyREOdRdByVm7VHulOSs3ao90qbUmZmXp\n16jHERG0/kj9hU/EPkryq9Wk+FaOhdGP0ZHK18jZMZByFrLfnaXwva21yLC60r4cGWYtaZh7WVFi\nm6pm/wDsmKbqmb/7K2lEiyosU/VM3/2TFN1TN/8AZNCRcpXfftZ/b8gunxTdUzf/AGVFX6Frqmvl\nqIpIoxJbLGb6gOhRNeVZqNRFJ9AaT7TFvHwTk/pPtMW+fBc36Sf3QrxUtV6xJ71Erx3kzXOJJlgJ\nPPiPgvPJit62DePgtslJ3GvT0MGWlaamVfoz70pf6rfmu5qPs/7m/MLnaPydq4KyCZ8kJbG8ONib\n5H3Lo5ml0dm2vcHM9BuppExHlj1F63tur2ijxTdUzf8A2TFN1TN/9lbTnSLKixTdUzf/AGTFN1TN\n/wDZBIuQna12la3E0H0nOF1eKbqmb/7KgqdCaQlrJp4pYWCV2KxcfBLVm1ZiJPlp8GzYb3LPBs2G\n9y2PoLSnaIO8+CfQWlO0Qd58Fz/pr/uhHGfahmylfbaK8K7PkzXOJJlguf4j4LHJet62DePgtclZ\nm3h6GHLStNTLT0H98U3xH5FdrL9eH4/+JVBo3QFVSV8M8kkJaw3IaTfV7lfytecBYAS117E25iP1\nU0jUeXP1F4vbdXqSNksbo5GhzHZEHUVGKSnAI4JpBBBuL5HWvWKbqmb/AOyYpuqZv/sraYNeHRlN\nFM+XDjc7aAyzJ5h7VM6jpnG7oIz/AG+wD9B3L1im6pm/+yYpuqZv/smhmSKOUsMjGuwOxNuL2PSF\n4NHTl+IwRl2eeEXzIJ/MA/gvWKbqmb/7Jim6pm/+yaCKCKG/BRtZfXYKVRYpuqZv/smKbqmb/wCy\naEqKLFN1TN/9kxTdUzf/AGTQS/aQ/H/xKwPWJfgb83LBEr5Iy5jWhrrmzr8xHR7VkesS/A35uUoe\nYhOyJjMMZwtAviPgvd59iPfPgpBqCyiUV59iPfPgl59iPfPgpUUCK8+xHvnwS8+xHvnwUqIIrz7E\ne+fBLz7Ee+fBSogivPsR758EvPsR758FKiCK8+xHvnwS8+xHvnwUqIIrz7Ee+fBLz7Ee+fBSogiv\nPsR758EvPsR758FKiCK8+xHvnwS8+xHvnwUqIIrz7Ee+fBLz7Ee+fBSogivPsR758EvPsR758FKi\nCK8+xHvnwS8+xHvnwUqIIrz7Ee+fBLz7Ee+fBSogivPsR758EvPsR758FKiCK8+xHvnwS8+xHvnw\nUqIIrz7Ee+fBLz7Ee+fBSogivPsR758EvPsR758FKiCK8+xHvnwS8+xHvnwUqIIrz7Ee+fBLz
7Ee\n+fBSogivPsR758EvPsR758FKiCK8+xHvnwS8+xHvnwUqIIrz7Ee+fBLz7Ee+fBSogivPsR758EvP\nsR758FKiCK8+xHvnwS8+xHvnwUqIIrz7Ee+fBLz7Ee+fBSogivPsR758EvPsR758FKiCK8+xHvnw\nS8+xHvnwUqIIrz7Ee+fBLz7Ee+fBSogivPsR758EvPsR758FKiCK8+xHvnwWGNfje94aLtAABvqv\n4qZYOooA1BZWBqCygIiICIiAiIgIiICIiAiIgIiICLxKSInkaw0qnfJUvjjkEkz2iBhPAPbia4i5\nLgdamI2mIXaKinnqJDUTQOqHMY1rmva9rWjzQblp19K3amsL2siiMkeORrHSmMgWOyTlzW/FTxk0\nsEVZMZKf0DKxx4SRrbusXRA35/bbK/SopJ5aZ00LZZxfBZ0wBwguDS4H8edOJpbrQqtMUVJO6GaR\nwe21wGkqanjME7ozVOlBaHYJCC4e2/Qua0zG2TyglY7UQ3/EJ4jcz8QmteU6XXKHR3Wv3CrGGVk8\nLJYzdj2hzT7CuKqaWKKBz2g3FuddXo55bouia0ZuiaBf3KkXpevKq+THNJ1LeRQ8NhyeM+e2oZ5L\nzxoXHmOsRfwUcoZ8ZbCLXfUhoyab3sL8+eaNqmHLn6BzlOUJ4y2EUHGLQOkIAIJFifbZGVAe1psS\nbXNuZOUI4ynRQOqWNALgRcXF7aulDUsF8jkbcycoOMp0UT3uBYG2u48/uXkVLcg4ecSRl0qeUGpT\nooY6hr3NaQQ5wvZTJE7JjQiIpQIipvKDSVRo/gOLlox4r4m31WUTOlqVm86hcouN5SaQ2otxdfGS\n6NrjrIBURaJXyYrY/l7REVmQiIgIiICIiAiIgIiICIiAsHUVlYOooA1BZWBqCygIiICIiAiIgIiI\nCIiAiIg8ve2Nhe9wa0Zkk2ATG3GGYhiIuBfOy19JMdJo6oYwEucwgAC5WjUU9XHKXNnnnPBWxYWg\ngYm3AsBna6tEbTELcgEEHMFa0lFRylrXwxuLGhoHQ3mHuVXM2c07uCdVthEnmB4kLiLZ3scYF/8A\ntkmjqryysZOyaSGLFm51gCcQytn7rHWpiv5TpaS0FJI8ySQsJsL31ZL1UOpTSF1Q6Li7gLl5GG3M\nqkMqeLsErqp8Xn4eDa9rr5Yb3JdbXr/FbkzHt0NTgxvLmCIuaGknItvkkx+TSWkj0fNTvZSiCWJx\n88NIcCfapGUdJCHMbFGOFyIIvi9metaFS98rpaiCnnDeDDDk6NzvO92LIX5ufJa9PHOXQySsqXcF\nO/BfGDYsyvc3tfpU6/Jpc09LBTX4CJrMWsjWVy+mA/6fm4K2INbr+EK20Qajjb+EbO2N0YNpMZAd\nfPN36ABVOmJWxafmc/VhaMvhCraJ1P38HmJ8NSp4zwLuELMOV7LrNH4Poqkx3+yba2vUuSqKuKSF\nzG3uekLrdHNcdG0bm2uIm5H3BZ0iYx+Y15LWtbzZsNiiIBGf46/enARZZahbWohTvxu1Dnxc/PqW\nRTHDY4dRH5a9Sr/hXlL2+OEEY/8AUcs+deuBjAIzt0X1LzJA6Q5vOQsPevPAvtmGE3uT0+xPv8HK\nUghjbaw1G+acFHixDWddjrUfF3FxJtY8wNrZatSyyFzXtd5uWv8A7ZT/AIOUmGEOw3cCMtZy9l16\ndHHfMm7svrHuWHRPLnjLC4g3vnzeCj4s+97sv/58VH+DlKcxtcGjMYdVjayxwEdwbHL2pBG6Nrg6\n2ZvkpVbUT8wRMomwRscC0G41Zn3KRZRTEaN7ERFILmvK/wD9r/d+i6Vc35X/APtf7v0Vb/Dbp/6k\nObOpd5VSvh0bwkZs4Nbn3LgzqX0IRsmpBHI0OY9liDziyzp93T1Xia7albWyUtRJhY6VrYmnAOku\ntc5E2UL9ONEMLo4myPlDnWY/E0AGxzAPyH4LaZoymZG9g
Epx2u4zPLstVnXuEOi6UxsZhkGAkhzZ\nXBxvru4G5v7Vp5c0Tj+8M0ta6qnc1kBbE1rSXudY3IBAw2WtKap762WOsdGIHWZGWtLDZoOeV+fp\nVhFBHC5zo22L7Xz6BYLXl0XSyzPleJCZDd7eFdgdlbNt7HV0JMSitqxLXdpi1TFFwTCHlrTaW7ml\nwuLttkPeVCdMzU9DHLPDE6Qtc8tE1iQDzC1z8vat52i6V9Tw5Y/HiD7CRwbiHPhva68SaGopG4XM\nkw2IIEzwCCbkHPMexRqy0WxeNx/38t5jg9jXDU4XXpeWNDGNa3U0WC9K7AREQEREBERAWDqKysHU\nUAHIJdfAsb9o96Y37Tu9B99ul18Cxv2nd6Y37Tu9B99ul18Cxv2nd6Y37Tu9B99ul18Cxv2nd6Y3\n7Tu9B99ul18Cxv2nd6Y37Tu9B99ul18Cxv2nd6Y37Tu9B99ul18Cxv2nd6Y37Tu9B99ul18Cxv2n\nd6Y37Tu9B99ul18Cxv2nd6Y37Tu9B99ul18Cxv2nd6Y37Tu9B99ul18Cxv2nd6Y37Tu9B99uvDoo\nnm7o2E9JAXwXG/ad3pjftO70H3ngIOqj3QpAA0ACwA1AL4FjftO70xv2nd6D77dLr4FjftO70xv2\nnd6D77dLr4FjftO70xv2nd6D77dLr4FjftO70xv2nd6D77dLr4FjftO70xv2nd6D77dLr4FjftO7\n0xv2nd6D77dLr4FjftO70xv2nd6D77dLr4FjftO70xv2nd6D77dRTU8FRbh4o5MOrG0Gy+D437Tu\n9Mb9p3eh8Pun0dQ9kg/2wtkWAsNQXwLG/ad3pjftO70TMzPy++3S6+BY37Tu9Mb9p3eiH326XXwL\nG/ad3pjftO70H326XXwLG/ad3pjftO70H326XXwLG/ad3pjftO70H326XXwLG/ad3pjftO70H326\nXXwLG/ad3pjftO70H326XXwLG/ad3pjftO70H326E5FfAsb9p3emN+07vQeUREBERAREQEREBERA\nREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQERE\nBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERARE\nQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBE\nRAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQE\nREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERAREQEREBERA\nREQf/9k=\n",
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"execution_count": 1,
"metadata": {
"tags": []
},
"output_type": "execute_result"
}
],
"source": [
"#@title Introduction to ISB-CGC Video\n",
"#@markdown This 12 minute video goes over an introduction to ISB-CGC\n",
"from IPython.display import YouTubeVideo\n",
"YouTubeVideo('RQsLKDTciWk', width=600, height=400)\n",
"#@markdown For more videos check out: [ISB-CGC Video Tutorial Series](https://isb-cgc.appspot.com/videotutorials/)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Ue2bsjIWu7o3"
},
"source": [
"## About the Data in the Cloud\n",
"The main data that is hosted on the cloud is [The Cancer Genome Atlas (TCGA)](https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga) data which was a large-scale multi-disciplinary collaboration started by the [National Cancer Institute (NCI)](https://www.cancer.gov/) and the [National Human Genome Research Institute (NHGRI)](https://www.genome.gov/). Some of the hosted data types and files include RNA-Seq FASTQ, DNA-Seq and RNA-Seq BAM Files, Genome-Wide SNP6 array CEL files, and Variant-calls in VCF files along with a number of other datasets including data from [Therapeutically Applicable Research to Generate Effective Treatments (TARGET)](https://ocg.cancer.gov/programs/target) and [Cancer Cell Line Encyclopedia (CCLE)](https://depmap.org/portal/ccle/) programs. ISB-CGC hosts several tables in BigQuery with data from the TCGA, TARGET, and CCLE along with reference tables and [Catalogue Of Somatic Mutations In Cancer (COSMIC)](https://cancer.sanger.ac.uk/cosmic) data sets from the [Wellcome Trust Sanger Institute](https://www.sanger.ac.uk/). ISB-CGC is adding more data sets all the time, so if you have suggestions for a datasets to be added please email: [feedback@isb-cgc.org](mailto:feedback@isb-cgc.org)\n",
"\n",
"For more information, please visit: [Programs and Data Sets](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/Hosted-Data.html) and [Data in BigQuery](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/BigQuery/data_in_BQ.html)"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "4rbvEDj6xfMT"
},
"source": [
"## Overview of How to Access Data\n",
"There are several ways to access the Data that is hosted by ISB-CGC. \n",
"\n",
"* [ISB-CGC WebApp](https://isb-cgc.appspot.com/)\n",
" * Provides a graphical interface to metadata\n",
" * Does not require knowledge of programming languages\n",
"* [ISB-CGC BigQuery Table Search](https://isb-cgc.appspot.com/bq_meta_search/)\n",
" * Provides a table search for available ISB-CGC BigQuery Tables\n",
" * Does NOT require a login for Google or ISB-CGC to access\n",
"* [ISB-CGC APIs](https://api-dot-isb-cgc.appspot.com/v4/swagger/)\n",
" * Provides programmatic access to metadata\n",
"* [Google Cloud Platform](https://cloud.google.com/)\n",
" * Allows you to use GCP APIs such as BigQuery, Cloud Datalab, Colaboratory\n",
" * Allows you to host your own data on the Cloud\n",
"* [BigQuery](https://cloud.google.com/bigquery/)\n",
" * A GCP Allows you to use SQL to access some data\n",
"* Supported Programming Languages\n",
" * SQL\n",
" * Can be used directly in BigQuery\n",
" * [Python](https://www.python.org/)\n",
" * [gsutil tool](https://cloud.google.com/storage/docs/gsutil) is a Python tool to access data via the command line\n",
" * [Jupyter Notebooks](https://jupyter.org/)\n",
" * [Google Colabratory](https://colab.research.google.com/)\n",
" * [Cloud Datalab](https://cloud.google.com/datalab/)\n",
" * [R](https://www.r-project.org/)\n",
" * [RStudio](https://rstudio.com/)\n",
" * [RStudio.Cloud](https://rstudio.cloud/)\n",
"* Command Line Interfaces\n",
" * Cloud Shell via Project Console\n",
" * [CLOUD SDK](https://cloud.google.com/sdk/)"
]
},
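{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"As a rough illustration of the SQL route listed above, BigQuery tables are addressed as `project.dataset.table`. The sketch below only builds a query string; it does not contact BigQuery. The project `isb-cgc` and the dataset `TCGA_bioclin_v0` are taken from this notebook, while the table name `Clinical`, the column names, and the helper `build_preview_query` are assumptions made for illustration.\n",
"\n",
"```python\n",
"def build_preview_query(project, dataset, table, columns, limit=10):\n",
"    # Return a standard-SQL query that previews a few rows of one table.\n",
"    if limit < 1:\n",
"        raise ValueError('limit must be a positive integer')\n",
"    # Backticks are needed because the project ID contains a hyphen.\n",
"    table_id = '`{}.{}.{}`'.format(project, dataset, table)\n",
"    column_list = ', '.join(columns) if columns else '*'\n",
"    return 'SELECT {} FROM {} LIMIT {}'.format(column_list, table_id, int(limit))\n",
"\n",
"\n",
"# 'Clinical', 'case_barcode' and 'project_short_name' are assumed names.\n",
"query = build_preview_query(\n",
"    'isb-cgc', 'TCGA_bioclin_v0', 'Clinical',\n",
"    ['case_barcode', 'project_short_name'], limit=5)\n",
"print(query)\n",
"```\n",
"\n",
"The resulting string could then be pasted into the BigQuery console or passed to a client library once your account is set up."
]
},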
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "Xlx2svEHfLfN"
},
"source": [
"## Account Set-up\n",
"*If not completed prior to reading this guide*\n",
"1. Log in or [create](https://accounts.google.com/signup/v2/webcreateaccount?dsh=308321458437252901&continue=https%3A%2F%2Faccounts.google.com%2FManageAccount&flowName=GlifWebSignIn&flowEntry=SignUp#FirstName=&LastName=) a Gmail account\n",
"* Can be use your institutional email if it is a Google Identity\n",
"2. Create a GCP Project using a GMail account\n",
"* Required to use all of the data, tools and the Google Cloud\n",
"* New accounts recieve a one-time allotment of [$300 in Google Credit](https://cloud.google.com/free/)\n",
" * Google also offers a [Free Tier](https://cloud.google.com/free/) which grants 1 TB of queries a month\n",
" * Additionally, ISB-CGC offers [$300 in free Cloud Credits](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/HowtoRequestCloudCredits.html)\n",
"3. Authorize your account for dbGaP in the ISB-CGC WebApp (required for viewing controlled access data)\n",
"* To access controlled data, users must first be authenticated by NIH (via the ISB-CGC\n",
"web-app). Upon successful authentication, user dbGaP authorization will be verified.\n",
"These two steps are required before the user’s Google identity is added to the access\n",
"control list (ACL) for the controlled data. At this time, this access must be renewed every\n",
"24 hours.\n",
"* Please view [Accessing Controlled-Access Data](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/Gaining-Access-To-Controlled-Access-Data.html) if you need help with this step.\n",
"4. Register your GCP project in the ISB-CGC WebApp\n",
"* Please view [Registering your Google Cloud Project Service Account](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/webapp/Gaining-Access-To-Contolled-Access-Data.html#requirements-for-registering-a-google-cloud-project-service-account) if you need help with this step.\n",
"5. Enable the following required Google Cloud APIs:\n",
" * Google Compute Engine\n",
" * Google Genomics\n",
" * Google BigQuery\n",
" * Google Cloud Logging\n",
" * Google Cloud Pub/Sub\n",
"* [Google Tutorial on Enabling/Disabling GC APIs](https://cloud.google.com/apis/docs/enable-disable-apis)\n",
"6. Install optional software such as:\n",
" * [Cloud SDK](https://cloud.google.com/sdk/)\n",
" * [Anaconda Python](https://www.anaconda.com/distribution/)\n",
" * [Jupyter Notebook](https://jupyter.org/)\n",
" * [R](https://cran.r-project.org/)\n",
" * [RStudio](https://www.rstudio.com/)\n",
" * [Chrome](https://www.google.com/chrome/)\n",
" * [Docker](https://www.docker.com/)\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "UI5UpdyaeS2v"
},
"source": [
"## ISB-CGC Web Interface\n",
"The ISB-CGC Web Interface is an [interactive web-based application](https://isb-cgc.appspot.com/) to access and explore the rich TCGA, TARGET, and CCLE datasets with more datasets being added regularly. Through the WebApp you can create Cohorts, lists of Favorite Genes, miRNA, and Variables. The Cohorts and Variables can be used in Workbooks to allow you to quickly analyze and export datasets by mixing and matching the selections. The ISB-CGC Web Interface also allows you to view and analyze available pathology and radiology images associated with selected cohort data."
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "c1bk7F_-uuYp"
},
"source": [
"## Google Cloud Platform and BigQuery Overview\n",
"The [Google Cloud Platform Console](https://console.cloud.google.com/) is the web-based interface to your GCP Project. From the Console, you can check the overall status of your project, create and delete Cloud Storage buckets, upload and download files, spin up and shut down VMs, add members to your project, acces the [Cloud Shell command line](https://cloud.google.com/shell/docs/), etc. Click [here](https://raw.githubusercontent.com/isb-cgc/readthedocs/master/docs/include/intro_to_Console.pdf) to download a quick tour from ISB-CGC of the GCP Console. You'll want to remember that any costs that you incur are charged under your *current* project, so you will want to make sure you are on the correct one if you are part of multiple projects. [Here](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/DIYWorkshop.html#google-cloud-platform-console) is how to check which project is your *current* project.\n",
"\n",
"\"BigQuery is a serverless, highly-scalable, and cost-effective cloud data warehouse with an in-memory BI Engine and machine learning built in.\" [*Source*](https://cloud.google.com/bigquery/) ISB-CGC has uploaded multiple cancer genomic datasets into BigQuery tables that are open-source such as TCGA and TARGET Clinical, Biospecimen and Molecular Data along with dataset megadata. This data can be accessed from the Google Cloud Platform Console web-UI, programmatically with R, and programmatically with python through Cloud Datalab or Colab. Check out our [Community Notebook Repository](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/HowTos.html) for example notebooks."
]
},
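{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"Because query costs are charged to your current project, it can help to see how the size of a query compares with the 1 TB monthly free tier mentioned in the account set-up section. The following is a minimal sketch using plain arithmetic. It assumes that 1 TB is counted as 2**40 bytes and that you already know how many bytes a query would process (for example, from the estimate the BigQuery console shows before a query runs). The names `free_tier_fraction` and `queries_per_month` are hypothetical.\n",
"\n",
"```python\n",
"# Assumption: the monthly free tier of 1 TB is counted as 2**40 bytes.\n",
"FREE_TIER_BYTES = 2 ** 40\n",
"\n",
"\n",
"def free_tier_fraction(bytes_processed):\n",
"    # Fraction of the monthly free tier that one query would use.\n",
"    if bytes_processed < 0:\n",
"        raise ValueError('bytes_processed must not be negative')\n",
"    return bytes_processed / FREE_TIER_BYTES\n",
"\n",
"\n",
"def queries_per_month(bytes_processed):\n",
"    # How many identical queries fit within the free tier.\n",
"    if bytes_processed <= 0:\n",
"        raise ValueError('bytes_processed must be positive')\n",
"    return FREE_TIER_BYTES // bytes_processed\n",
"\n",
"\n",
"# Example: a query that scans 5 GiB.\n",
"scan = 5 * 2 ** 30\n",
"print('{:.2%} of the free tier'.format(free_tier_fraction(scan)))\n",
"print(queries_per_month(scan), 'such queries per month')\n",
"```\n",
"\n",
"Selecting only the columns you need is the usual way to keep the number of bytes processed low, since BigQuery stores tables by column."
]
},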
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "FLkFocDF265b"
},
"source": [
"## Example of Accessing Data with Python\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "uwUePpm5uLNS"
},
"source": [
"### Log into Google Cloud Storage and Authenticate ourselves\n",
"1. Authenticate yourself with your Google Cloud Login\n",
"2. A second tab will open or follow the link provided\n",
"3. Follow prompts to Authorize your account to use Google Cloud SDK\n",
"4. Copy code provided and paste into the box under the Command\n",
"5. Press Enter\n",
"\n",
"Alternatives for Authentication can be found [here](https://googleapis.github.io/google-cloud-python/latest/core/auth.html)"
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 360
},
"colab_type": "code",
"id": "CzjAwn9w3Ghc",
"outputId": "7b61026c-a807-4356-f124-e1a406e6c91a"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Go to the following link in your browser:\n",
"\n",
" https://accounts.google.com/o/oauth2/auth?redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&prompt=select_account&response_type=code&client_id=764086051850-6qr4p6gpi6hn506pt8ejuq83di341hur.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&access_type=offline\n",
"\n",
"\n",
"Enter verification code: 4/XQGk8wtHV404M8mfwbkdcZjmj-DpxkeKCnUvD3hh4y8XCWa00jfNoww\n",
"\n",
"Credentials saved to file: [/content/.config/application_default_credentials.json]\n",
"\n",
"These credentials will be used by any library that requests\n",
"Application Default Credentials.\n",
"\n",
"To generate an access token for other uses, run:\n",
" gcloud auth application-default print-access-token\n",
"\n",
"\n",
"To take a quick anonymous survey, run:\n",
" $ gcloud alpha survey\n",
"\n"
]
}
],
"source": [
"# Run a command line command with the bang (!) and gcloud\n",
"!gcloud auth application-default login "
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "lqtslyr2zNTA"
},
"source": [
"### View Datasets and Tables in BigQuery\n",
"Let us look at the datasets that ISB-CGC makes available in BigQuery. You will need to load the BigQuery client library and create a client [(click here for more information)](https://googleapis.github.io/google-cloud-python/latest/bigquery/usage/client.html)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 364
},
"colab_type": "code",
"id": "usDiQye0PWPF",
"outputId": "15ceedeb-6275-4623-cf37-bbd6da3a7650"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Datasets in project isb-cgc:\n",
"\tCCLE_bioclin_v0\n",
"\tGDC_metadata\n",
"\tGTEx_v7\n",
"\tQotM\n",
"\tTARGET_bioclin_v0\n",
"\tTARGET_hg38_data_v0\n",
"\tTCGA_bioclin_v0\n",
"\tTCGA_hg19_data_v0\n",
"\tTCGA_hg38_data_v0\n",
"\tToil_recompute\n",
"\tccle_201602_alpha\n",
"\tgenome_reference\n",
"\thg19_data_previews\n",
"\thg38_data_previews\n",
"\tmetadata\n",
"\tplatform_reference\n",
"\ttcga_201607_beta\n",
"\ttcga_cohorts\n",
"\ttcga_seq_metadata\n"
]
}
],
"source": [
"\n",
"# Load BigQuery API\n",
"from google.cloud import bigquery\n",
"\n",
"# Create a client to access the data within BigQuery\n",
"client = bigquery.Client('isb-cgc')\n",
"\n",
"# List the datasets in the project (this makes the API request)\n",
"datasets = list(client.list_datasets())\n",
"# Create a variable for the name of the project\n",
"project = client.project\n",
"\n",
"# If there are datasets available then print their names,\n",
"# else print that there are no datasets available\n",
"if datasets:\n",
"    print(\"Datasets in project {}:\".format(project))\n",
"    for dataset in datasets:\n",
"        print(\"\\t{}\".format(dataset.dataset_id))\n",
"else:\n",
"    print(\"{} project does not contain any datasets.\".format(project))"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "dcKkXP0H8W6I"
},
"source": [
"Let us see which tables are under the TCGA_bioclin_v0 dataset."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 85
},
"colab_type": "code",
"id": "IZ_Odw0z1fOn",
"outputId": "60e16a24-836e-4a7d-87a3-7e1251b24356"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tables:\n",
"\tAnnotations\n",
"\tBiospecimen\n",
"\tClinical\n"
]
}
],
"source": [
"print(\"Tables:\")\n",
"# Create a variable with the list of tables in the dataset\n",
"tables = list(client.list_tables('isb-cgc.TCGA_bioclin_v0'))\n",
"\n",
"# If there are tables then print their names,\n",
"# else print that there are no tables\n",
"if tables:\n",
"    for table in tables:\n",
"        print(\"\\t{}\".format(table.table_id))\n",
"else:\n",
"    print(\"\\tThis dataset does not contain any tables.\")"
]
},
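{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"Before querying a table, it can help to know which columns it contains. The sketch below is one possible way to check, not part of the original walkthrough: `describe_table` is a hypothetical helper name, and it assumes the `client` created above along with the `get_table` method of the BigQuery Python client, which returns a table object with a `schema` list.\n",
"\n",
"```python\n",
"def describe_table(client, table_id):\n",
"    # client: a google.cloud.bigquery.Client (assumed to be authenticated)\n",
"    # table_id: fully qualified ID such as 'isb-cgc.TCGA_bioclin_v0.Clinical'\n",
"    table = client.get_table(table_id)  # API request for the table metadata\n",
"    # Return (column name, column type) pairs from the table schema\n",
"    return [(field.name, field.field_type) for field in table.schema]\n",
"```\n",
"\n",
"With the client from the earlier cell, `describe_table(client, 'isb-cgc.TCGA_bioclin_v0.Clinical')` should return the column names and types of the Clinical table."
]
},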
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "mjXF-Vg4z5mv"
},
"source": [
"### Query a Table in BigQuery\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "GjCNNGweyyCE"
},
"source": [
"First, call BigQuery with a cell magic command, then write your query in Standard SQL. Click [here](https://googleapis.github.io/google-cloud-python/latest/bigquery/magics.html) for more on IPython Magic Commands for BigQuery. The result will be a [pandas DataFrame](https://pandas.pydata.org/)."
]
},
{
"cell_type": "code",
"execution_count": 0,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"colab_type": "code",
"id": "TYf4eBOKnKEz",
"outputId": "fd105f87-1aec-4b95-af4c-3d1e75b99b90"
},
"outputs": [
{
"data": {
"text/html": [
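"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>program_name</th>\n",
"      <th>case_barcode</th>\n",
"      <th>project_short_name</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>TCGA</td>\n",
"      <td>TCGA-01-0628</td>\n",
"      <td>TCGA-OV</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>TCGA</td>\n",
"      <td>TCGA-01-0630</td>\n",
"      <td>TCGA-OV</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2</th>\n",
"      <td>TCGA</td>\n",
"      <td>TCGA-01-0631</td>\n",
"      <td>TCGA-OV</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>3</th>\n",
"      <td>TCGA</td>\n",
"      <td>TCGA-01-0633</td>\n",
"      <td>TCGA-OV</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>4</th>\n",
"      <td>TCGA</td>\n",
"      <td>TCGA-01-0636</td>\n",
"      <td>TCGA-OV</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"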
],
"text/plain": [
" program_name case_barcode project_short_name\n",
"0 TCGA TCGA-01-0628 TCGA-OV\n",
"1 TCGA TCGA-01-0630 TCGA-OV\n",
"2 TCGA TCGA-01-0631 TCGA-OV\n",
"3 TCGA TCGA-01-0633 TCGA-OV\n",
"4 TCGA TCGA-01-0636 TCGA-OV"
]
},
"execution_count": 5,
"metadata": {
"tags": []
},
"output_type": "execute_result"
}
],
"source": [
"%%bigquery --project PROJECT_ID\n",
"# The %%bigquery cell magic must be the first line of the cell.\n",
"# Replace PROJECT_ID above with your own Google Cloud project ID.\n",
"SELECT # Select a few columns to view\n",
"  program_name,\n",
"  case_barcode,\n",
"  project_short_name\n",
"FROM # From the TCGA Clinical table\n",
"  `isb-cgc.TCGA_bioclin_v0.Clinical`\n",
"LIMIT # Limit to 5 rows; the table is very large and we only want to see a few results\n",
"  5\n",
"\n",
"# General syntax for the above query:\n",
"# SELECT column_1, column_2\n",
"# FROM `project_name.dataset_name.table_name`\n",
"# LIMIT number_of_rows"
]
},
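{
"cell_type": "markdown",
"metadata": {
"colab_type": "text"
},
"source": [
"The `%%bigquery` magic only works inside a notebook. In a plain Python script, a similar query can be sent through a client object instead. The following is a minimal sketch under that assumption, not the method used in this guide: `preview_table` is a hypothetical helper, `PROJECT_ID` stands for your own project, and it relies on the `query` method of the BigQuery Python client and the `to_dataframe` method of the resulting job (which needs pandas installed).\n",
"\n",
"```python\n",
"def preview_table(client, table_id, columns, limit=5):\n",
"    # Build a Standard SQL query for the requested columns\n",
"    sql = 'SELECT {} FROM `{}` LIMIT {}'.format(\n",
"        ', '.join(columns), table_id, int(limit))\n",
"    # Run the query and return the result as a pandas DataFrame\n",
"    return client.query(sql).to_dataframe()\n",
"\n",
"# Example usage (requires authentication and a project to bill):\n",
"# from google.cloud import bigquery\n",
"# client = bigquery.Client('PROJECT_ID')\n",
"# preview_table(client, 'isb-cgc.TCGA_bioclin_v0.Clinical',\n",
"#               ['program_name', 'case_barcode', 'project_short_name'])\n",
"```\n",
"\n",
"Passing the client in as an argument keeps the helper independent of how you authenticate, so the same function can be reused in Colab, Cloud Datalab, or a local script."
]
},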
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "XBHrxYK_1v6f"
},
"source": [
"Now that wasn't so difficult! Have fun exploring and analyzing the ISB-CGC Data!"
]
},
{
"cell_type": "markdown",
"metadata": {
"colab_type": "text",
"id": "vcRchKFAuiz5"
},
"source": [
"## Where to Go Next\n",
"\n",
"Explore, discover, and analyze the data provided by ISB-CGC side by side with your own! :)\n",
"\n",
"ISB-CGC Links:\n",
"\n",
"* [ISB-CGC Landing Page](https://isb-cgc.appspot.com/)\n",
"* [ISB-CGC Documentation](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/)\n",
"* [How to Get Started on ISB-CGC](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/HowToGetStartedonISB-CGC.html)\n",
"* [How to access Google BigQuery](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/progapi/bigqueryGUI/HowToAccessBigQueryFromTheGoogleCloudPlatform.html)\n",
"* [Community Notebook Repository](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/HowTos.html)\n",
"* [Query of the Month](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/QueryOfTheMonthClub.html)\n",
"* [Quick Links](https://isb-cancer-genomics-cloud.readthedocs.io/en/latest/sections/QuicklinksOneTable.html)\n",
"\n",
"Google Tutorials:\n",
"\n",
"* [Google's What is BigQuery?](https://cloud.google.com/bigquery/what-is-bigquery)\n",
"* [Google Cloud Client Library for Python](https://googleapis.github.io/google-cloud-python/latest/index.html)"
]
}
],
"metadata": {
"colab": {
"collapsed_sections": [],
"name": "Quick Start Guide to ISB-CGC.ipynb",
"provenance": []
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 1
}