{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to CyberGIS-Compute" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Authors:** Rebecca Vandewalle rcv3@illinois.edu, Zimo Xiao zimox2@illinois.edu, Furqan Baig fbaig@illinois.edu, and Anand Padmanabhan apadmana@illinois.edu\n", "
**Last Updated:** 10-10-21\n", "\n", "CyberGIS-Compute enables students and researchers from diverse backgrounds to take advantage of High Performance Computing (HPC) resources without having to delve into the details of system setup, maintenance and management. \n", "\n", "CyberGIS-Compute is designed to run on the CyberGISX platform, which uses Virtual ROGER (Resourcing Open Geospatial Education and Research), a geospatial supercomputer with access to a number of readily available popular geospatial libraries. A major goal of CyberGISX is to provide its users with straightforward access to software tools, libraries and computational resources for reproducible geospatial research and education, so they may focus their efforts on research and application development activities instead of software installation and system engineering. CyberGIS-Compute provides a bridge between the regular capabilities of CyberGISX and powerful HPC resources so its users can leverage these computational resources for geospatial problem solving with minimal effort and a low learning curve.\n", "\n", "In Example 1, the hello world example, we will learn\n", "- The basics of the CyberGIS-Compute environment\n", "- The life cycle of a typical job in CyberGIS-Compute\n", "- How to run a simple predefined `Hello World` example on HPC using CyberGIS-Compute\n", "\n", "In Example 2, the custom code example, we will learn\n", "- How to connect custom code to CyberGIS-Compute\n", "- How to run a job with custom (user supplied) code on HPC using CyberGIS-Compute\n", "\n", "In Example 3, the user interface example, we will learn\n", "- How to run a job using the user interface\n", "\n", "In Example 4, the evacuation example, we will learn\n", "- How to run a more complex job using CyberGIS-Compute" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "- [Prerequisites](#prereqs)\n", "- [Setup](#setup)\n", "- [CyberGIS-Compute terminologies](#terminologies)\n", "- [Example 1: Hello World - Running Prepackaged Code using CyberGIS-Compute](#example1)\n", "    - [Import the CyberGIS-Compute client](#import_client)\n", "    - [Create a CyberGIS-Compute object](#create_object)\n", "    - [List GitHub repositories](#list_git)\n", "    - [Create an HPC job](#create_job)\n", "    - [Submit the HPC job](#submit_job)\n", "    - [View job events](#view_events)\n", "    - [View job logs](#view_logs)\n", "- [Example 2: Setting up Custom Code using CyberGIS-Compute](#example2)\n", "    - [Creating a job](#create_comm_job)\n", "    - [List HPC resources](#list_hpc)\n", "    - [Accessing custom code](#accessing)\n", "    - [Stages of execution](#stages)\n", "    - [Configuring the manifest file](#manifest)\n", "    - [Specifying the custom code repository](#specifying)\n", "    - [Submitting the job and tracking progress](#submit_and_track)\n", "- [Example 3: Using the Job Submission User Interface](#example3)\n", "- [Example 4: Using CyberGIS-Compute to run an Evacuation Computation](#example4)\n", "    - [Review available resources](#review)\n", "    - [Submit the job](#submit_accessibility)\n", "    - [Download results](#download_results)\n", "    - [Display results](#display_results)\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Prerequisites\n", "To best understand this notebook, you should ideally have:\n", "- Familiarity with the Python programming language\n", "- Some experience working with Jupyter Notebooks\n", "- Slight familiarity with Git repositories" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## 
Setup\n", "Run the following code cell to **load CyberGIS-Compute**, the backend software development kit (SDK). This will allow us to be able to access and work with High Performance Computing (HPC) resources within CyberGISX.\n", "\n", "Note: This cell will generate considerable text output." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load the CyberGIS-Compute client\n", "\n", "import sys\n", "!{sys.executable} -m pip install --ignore-installed git+https://github.com/cybergis/job-supervisor-python-sdk.git@v2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Important!\n", "\n", "After installing the client using the previous cell you must restart the Kernel! Then, run each code cell individually! The notebook will not work if you select `Restart & Run All`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## CyberGIS-Compute terminologies\n", "\n", "Before we get to the example, it is helpful to introduce some key terms.\n", "\n", "A typical High Performance Computing (HPC) job in CyberGISX using CyberGIS-Compute consists of the following major components:\n", "\n", "- CyberGIS-Compute\n", " - This is an entry point to the CyberGISX environment from a Python/Jupyter notebook. \n", " All interactions with the High Performance Computing (HPC) backend are performed using this object.\n", "- High Performance Computing (HPC) Resources\n", " - These are backend resources which typically require considerable effort to setup and maintain\n", " - The details of working with these resources are abstracted from users\n", " - These include a number of popular geospatial libraries\n", "- CyberGIS-Compute SDK (Software Development Kit)\n", " - The CyberGIS-Compute SDK provide an application programming interface (API) for creating the CyberGIS-compute objects for submitting computational tasks to HPC resources, monitoring such tasks and downloading results after the execution of the tasks on remote HPC resources\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Example 1: Hello World - Running Prepackaged Code using CyberGIS-Compute" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To gain familiarity with the CyberGIS-Compute environment, we first present a simple `Hello World` example. We will learn how to write code to initialize and work with some of the CyberGIS-Compute components mentioned above." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Import the CyberGIS-Compute client\n", "As mentioned earlier, every notebook using CyberGIS-Compute has to create a `CyberGISCompute` object to interact with the broader system. The following cell imports the required library from the Python Software Development Kit (SDK) that was downloaded and installed \n", "in the [Setup](#setup) section." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# Load CyberGIS-Compute client\n", "from cybergis_compute_client import CyberGISCompute" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Create a CyberGIS-Compute object\n", "After importing CyberGIS-Compute, we first need to initialize a `CyberGISCompute` object. `v2` specifies that we want to use the 2nd version." 
] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "application/javascript": [ "IPython.notebook.kernel.execute(`CyberGISCompute.jupyterhubHost = \"${window.location.host}\"`);" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Create CyberGIS-Compute object\n", "cybergis = CyberGISCompute(suffix='v2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To work with CyberGIS-Compute we need to login. This helps track use of computing resources." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "๐Ÿ’ป Found system token\n", "๐ŸŽฏ Logged in as beckvalle@cybergisx.cigi.illinois.edu\n" ] } ], "source": [ "# Login\n", "cybergis.login()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### List GitHub repositories\n", "In this example we are running a prebuilt job. You can see what jobs are available in the current environment using the `list_git()` function for the `CyberGISCompute` object we created above. Prebuilt jobs are stored using [GitHub repositories](https://github.com/). GitHub is common online storage place for code." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
link name container repository commit
git://uncertainty_in_spatial_accessibilityUncertainty_in_Spatial_Accessibilitycybergisx-0.4https://github.com/JinwooParkGeographer/Uncertainty-in-Spatial-Accessibility.gitNONE
git://spatial_access_covid-19 COVID-19 spatial accessibility cybergisx-0.4https://github.com/cybergis/cybergis-compute-spatial-access-covid-19.git NONE
git://mpi_hello_world MPI Hello World mpich https://github.com/cybergis/cybergis-compute-mpi-helloworld.git NONE
git://hello_world hello world python https://github.com/cybergis/cybergis-compute-hello-world.git NONE
git://fireabm hello FireABM cybergisx-0.4https://github.com/cybergis/cybergis-compute-fireabm.git NONE
git://data_fusion data fusion python https://github.com/CarnivalBug/data_fusion.git NONE
git://cybergis-compute-modules-test modules test cjw-eb https://github.com/alexandermichels/cybergis-compute-modules-test.git NONE
git://bridge_hello_world hello world python https://github.com/cybergis/CyberGIS-Compute-Bridges-2.git NONE
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Show available jobs\n", "cybergis.list_git()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Create an HPC job\n", "\n", "In the next line, you will create a simple job object using the `create_job()` function. The variable that this function result is assigned to will be used for further interactions with the job." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "๐ŸŽฏ Logged in as beckvalle@cybergisx.cigi.illinois.edu\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346193d8oumkeeling_community {} null beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:49:52.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Create a job\n", "demo_job = cybergis.create_job()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the next line, you will create a simple `Hello World` job by setting the `executableFolder` value to the GitHub folder short link for the job. This job also expects a variable named `a` to be set. We will discuss the process of setting additional options more later." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346193d8oumkeeling_communitygit://hello_world {"a": 32}null beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:49:52.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Set job options\n", "demo_job.set(executableFolder=\"git://hello_world\", param={\"a\": 32})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Submit the HPC job\n", "\n", "The next step is to submit the job created by the maintainer using the `submit()` function. Once a job is submitted, the code will be sent to and run on the selected High Performance Computing backend resources." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "โœ… job submitted\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346193d8oumkeeling_communitygit://hello_world {"a": 32}null beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:49:52.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Submit the job\n", "demo_job.submit()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### View job events\n", "\n", "Once you have submitted a job, there is a lot going on in the backend. Since CyberGIS-Compute is implemented as a public shared service, a submitted job may not necessarily be executed right away. Instead, all jobs are added to a queue and then scheduled one by one based on available resources. Once a job starts executing, it goes through several\n", "steps, such as:\n", "\n", "- Uploading the executable to the HPC backend\n", "- Creating directories for writing results (None for this example)\n", "- Running the executable\n", "- Getting the job results\n", "\n", "All of these job stages can be observed in real time once a job is submitted using the `events()` function, with the `liveOutput` parameter set to `True`, which you can see in the following cell." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[H\u001b[2J๐Ÿ“‹ Job events:\n", "๐Ÿ“ฎ Job ID: 1635346193d8oum\n", "๐Ÿ–ฅ HPC: keeling_community\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
types message time
JOB_QUEUED job [1635346193d8oum] is queued, waiting for registration 2021-10-27T14:49:52.000Z
JOB_REGISTEREDjob [1635346193d8oum] is registered with the supervisor, waiting for initialization2021-10-27T14:49:54.000Z
JOB_INIT job [1635346193d8oum] is initialized, waiting for job completion 2021-10-27T14:49:57.000Z
JOB_ENDED job [1635346193d8oum] finished 2021-10-27T14:50:00.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# View job events\n", "demo_job.events(liveOutput=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### View job logs\n", "A submitted job usually goes through a regular life cycle, as mentioned in previous step. The status of a submitted job provides limited information about the current stage a job is in. Additional useful information can be retrieved through logs. Logs are very important when you debugging code for jobs that execute on remote resources. You can access logs for a job using the `logs()` function, with the `liveOutput` parameter set to `True`, as shown below.\n", "\n", "For this `Hello World` example, when the job has been successfully submitted and executed, the log will display a message about what scripts are running and print the job parameters." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[H\u001b[2J๐Ÿ“‹ Job logs:\n", "๐Ÿ“ฎ Job ID: 1635346193d8oum\n", "๐Ÿ–ฅ HPC: keeling_community\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
message time
running setup...\n", "running main...\n", "\n", "./job.json\n", "\n", "SLURM_NODEID\n", "\n", "0\n", "SLURM_PROCID\n", "\n", "0\n", "job_id\n", "1635346193d8oum\n", "param_a\n", "32\n", "{'job_id': '1635346193d8oum', 'user_id': 'beckvalle@cybergisx.cigi.illinois.edu', 'maintainer': 'community_contribution', 'hpc': 'keeling_community', 'param': {'a': 32}, 'env': {}, 'executable_folder': '/1635346193d8oum/executable', 'data_folder': '/1635346193d8oum/data', 'result_folder': '/1635346193d8oum/result'}\n", "running cleanup... 2021-10-27T14:50:00.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# View job logs\n", "demo_job.logs(liveOutput=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example illustrated the necessary steps to get started with CyberGIS-Compute on the CyberGISX environment. We went through the setup, basic terminologies and life cycle of a simple `Hello World` job.\n", "\n", "In our next step, we will dive more into the details of customizing maintainers to be able to create and execute user specified code in CyberGISX." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Example 2: Setting up Custom Code using CyberGIS-Compute" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the previous example, we ran some code already written by a developer. However, you may want to change how the job runs by setting job parameters and variables.\n", "\n", "In this example, we will further discuss how to set parameters and variables to customize how the job uses HPC resources on the CyberGISX environment using CyberGIS-Compute." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Creating a job\n", "\n", "As above, first we need to we will create a job using the `create_job()` function." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "๐ŸŽฏ Logged in as beckvalle@cybergisx.cigi.illinois.edu\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346203fTD7Mkeeling_community {} null beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:50:03.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Create a custom job\n", "custom_job = cybergis.create_job()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### List HPC resources\n", "You can list the HPC resources available in the current environment using the `list_hpc()` function for the `CyberGISCompute` object we created above. Right now we will be using the default `keeling_community` HPC backend, but you can change the `hpc` variable within the `create_job()` function in order to run the code on different computing resources." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
hpc ip port is_community_account
keeling_communitykeeling.earth.illinois.edu22 True
expanse_communitylogin.expanse.sdsc.edu 22 True
bridges_communitybridges2.psc.edu 22 True
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# List available HPC resources\n", "cybergis.list_hpc()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Accessing custom code\n", "Again we will need to specify a Git repository containing the code. For security reasons, only approved git repositories are allowed. The list of approved and available repositories can be listed by using the `list_git()` function, as shown in the following cell." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
link name container repository commit
git://uncertainty_in_spatial_accessibilityUncertainty_in_Spatial_Accessibilitycybergisx-0.4https://github.com/JinwooParkGeographer/Uncertainty-in-Spatial-Accessibility.gitNONE
git://spatial_access_covid-19 COVID-19 spatial accessibility cybergisx-0.4https://github.com/cybergis/cybergis-compute-spatial-access-covid-19.git NONE
git://mpi_hello_world MPI Hello World mpich https://github.com/cybergis/cybergis-compute-mpi-helloworld.git NONE
git://hello_world hello world python https://github.com/cybergis/cybergis-compute-hello-world.git NONE
git://fireabm hello FireABM cybergisx-0.4https://github.com/cybergis/cybergis-compute-fireabm.git NONE
git://data_fusion data fusion python https://github.com/CarnivalBug/data_fusion.git NONE
git://cybergis-compute-modules-test modules test cjw-eb https://github.com/alexandermichels/cybergis-compute-modules-test.git NONE
git://bridge_hello_world hello world python https://github.com/cybergis/CyberGIS-Compute-Bridges-2.git NONE
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# List approved GitHub repositories\n", "cybergis.list_git()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As above, we will use `git://hello_world` which points to the [`https://github.com/cybergis/cybergis-compute-hello-world.git` git repository](https://github.com/cybergis/cybergis-compute-hello-world.git). " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Stages of execution\n", "\n", "A job that runs using remote HPC resources can be divided into three primary stages: initialize, execute, and finalize. When we create custom code, we can specify what files should be run at each of these stages. The `git://hello_world` repository contains three python files, one for each stage.\n", "\n", "- Initialization Stage\n", "\n", " - In general the initialization stage will specify and setup a one time initial configuration for the job. For instance, code for this stage will set global variables, parse logic for data, etc. In the current example, code for this stage is specified in `setup.py` which simply prints a message `running setup`.\n", "\n", "- Execution Stage\n", "\n", " - The execution stage contains main logic for the job. Generally this will include the parallel and distributed logic to take advantage of the CyberGISX High Performance Computing backend. For instance, multi-threaded code, distributed logic, etc. In this example, this is specified in `main.py` which simply prints a message `running main`.\n", "\n", "- Finalization Stage\n", "\n", " - As the name suggests, the code in this stage is intended for tasks like clearing up job specific configurations etc. In this example, this is specified in `setup.py` which simply prints a message `running cleanup`.\n", " \n", "**Note that the only stage that needs to be set is the Execution State.**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Configuring the manifest file\n", "\n", "A code repository intended for use with CyberGIS-Compute needs to have a `manifest.json` file, which specifies which code to execute in the different job stages.\n", "\n", "A typical `manifest.json` will have the following format\n", "\n", "\n", "```\n", "{\n", " \"name\": \"hello world\",\n", " \"container\": \"python\",\n", " \"execution_stage\": \"python {{JOB_EXECUTABLE_PATH}}/main.py\"\n", "}\n", "```\n", "\n", "Note that `{{JOB_EXECUTABLE_PATH}}` will be replaced during the remote job submission process by the actual path needed to run the job.\n", "\n", "This example is setup to run a Python script, however other commands can also be used such as `\"execution_stage\": \"bash run_job.sh\"`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Specifying the custom code repository\n", "\n", "Now we need to specify the git repository containing the code for the job object. You can do this using the `set()` function and the `executableFolder` parameter. This job also needs a custom variable passed to this. You do this by passing a dictionary called `param` to the `set()` function. Note that we are passing a variable called `a` to CyberGISCompute. Within the job, this variable will be accessed using the name `param_a`." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346203fTD7Mkeeling_communitygit://hello_world {"a": "param a is set"}null beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:50:03.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Specify GitHub repository\n", "custom_job.set(executableFolder='git://hello_world', param={\"a\": \"param a is set\"})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Submitting the job and tracking progress\n", "\n", "Now we have everything setup to submit our new job with custom code specified at the configured Git repository. These last processes of submitting and tracking the job status work the same as in the first example." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "โœ… job submitted\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346203fTD7Mkeeling_communitygit://hello_world {"a": "param a is set"}null beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:50:03.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Submit custom job\n", "custom_job.submit()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can get information about the job using the `status()` function as shown below." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'id': '1635346203fTD7M',\n", " 'userId': 'beckvalle@cybergisx.cigi.illinois.edu',\n", " 'secretToken': 'tkeJM0NOxIbcLiC7GMdUb0gYEevuz1FqkhK3okCGjf8CO',\n", " 'maintainer': 'community_contribution',\n", " 'hpc': 'keeling_community',\n", " 'executableFolder': 'git://hello_world',\n", " 'dataFolder': None,\n", " 'resultFolder': None,\n", " 'param': {'a': 'param a is set'},\n", " 'env': {},\n", " 'slurm': None,\n", " 'createdAt': '2021-10-27T14:50:03.000Z',\n", " 'updatedAt': '2021-10-27T14:50:03.000Z',\n", " 'deletedAt': None,\n", " 'initializedAt': None,\n", " 'finishedAt': None,\n", " 'isFailed': False,\n", " 'logs': [],\n", " 'events': [{'id': 492,\n", " 'jobId': '1635346203fTD7M',\n", " 'type': 'JOB_QUEUED',\n", " 'message': 'job [1635346203fTD7M] is queued, waiting for registration',\n", " 'createdAt': '2021-10-27T14:50:07.000Z',\n", " 'updatedAt': '2021-10-27T14:50:07.000Z',\n", " 'deletedAt': None}]}" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Check custom job status\n", "custom_job.status()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next you can track the job progress." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[H\u001b[2J๐Ÿ“‹ Job events:\n", "๐Ÿ“ฎ Job ID: 1635346203fTD7M\n", "๐Ÿ–ฅ HPC: keeling_community\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
types message time
JOB_QUEUED job [1635346203fTD7M] is queued, waiting for registration 2021-10-27T14:50:07.000Z
JOB_REGISTEREDjob [1635346203fTD7M] is registered with the supervisor, waiting for initialization2021-10-27T14:50:09.000Z
JOB_INIT job [1635346203fTD7M] is initialized, waiting for job completion 2021-10-27T14:50:11.000Z
JOB_ENDED job [1635346203fTD7M] finished 2021-10-27T14:50:15.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Check custom job events\n", "custom_job.events(liveOutput=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As above, you can check the job output using the `logs()` function." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[H\u001b[2J๐Ÿ“‹ Job logs:\n", "๐Ÿ“ฎ Job ID: 1635346203fTD7M\n", "๐Ÿ–ฅ HPC: keeling_community\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
message time
running setup...\n", "running main...\n", "\n", "./job.json\n", "\n", "SLURM_NODEID\n", "\n", "0\n", "SLURM_PROCID\n", "\n", "0\n", "job_id\n", "1635346203fTD7M\n", "param_a\n", "param a is set\n", "{'job_id': '1635346203fTD7M', 'user_id': 'beckvalle@cybergisx.cigi.illinois.edu', 'maintainer': 'community_contribution', 'hpc': 'keeling_community', 'param': {'a': 'param a is set'}, 'env': {}, 'executable_folder': '/1635346203fTD7M/executable', 'data_folder': '/1635346203fTD7M/data', 'result_folder': '/1635346203fTD7M/result'}\n", "running cleanup... 2021-10-27T14:50:15.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Check custom job logs\n", "custom_job.logs(liveOutput=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, you can download the output and error messages using the `downloadResultFolder()` function. The zip folder will be placed in the same location as this notebook." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "# create results folder to store downloaded results\n", "import os\n", "if not os.path.isdir(\"custom_result\"):\n", " os.mkdir(\"custom_result\")" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "file successfully downloaded under: ./custom_result/1635346209ZGWs.zip\n" ] }, { "data": { "text/plain": [ "'./custom_result/1635346209ZGWs.zip'" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Download results from custom job\n", "custom_job.downloadResultFolder('./custom_result')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example illustrated the necessary steps to run custom code on the CyberGISX environment using CyberGIS-Compute. We went quickly through how to set up and specify a git repository that contains code and ran the custom job by specifying where to access the code.\n", "\n", "Future examples will demonstrate how to run existing maintainers with custom data and how to run custom code that uses custom data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Example 3: Using the Job Submission User Interface" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We are working on a graphical job submission user interface to simplify the job submission process and help make sure the submitted options make sense for the submitted job. Although this interface is under active development, it still can be used to select and run jobs.\n", "\n", "Try setting the Git Repository to `git://cybergis-compute-modules-test`." ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "b990f349bd434371b2a00db61aa936e9", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Dropdown(description='๐Ÿ“ฆ Git Repository:', options=('git://uncertainty_in_spatial_accessibility', 'git://spatiaโ€ฆ" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "27014e92f2b8448c80df8676aecd292c", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Dropdown(description='๐Ÿ–ฅ HPC Endpoint:', options=('keeling_community', 'expanse_community', 'bridges_community'โ€ฆ" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "#### Slurm Options:" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "Click checkboxs to enable option and overwrite default config value. All configs are optional. 
Please refer to [Slurm official documentation](https://slurm.schedmd.com/sbatch.html)" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "04e7052e79934e4e9b76adf3a506cb86", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "8cd7f0cb3831491eacc4e1d35d6da976", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/markdown": [ "#### Globus File Upload/Download:" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "fcb485c525344d5b9ef54d5291cd4686", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "29bb26d3428a4b05963c805a300c4536", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "467422b3b3f14aeca502af93f9895a15", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "09e877a7b53e4e8ab64e4a566bdff56f", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "1f7c1ff06dd84f53baeccb5bd9dd8e8a", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "616536d2851845008ed53956886aff76", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "d9b534ae939f4748b62c117aaff42565", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Output()" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "cybergis.create_job_by_UI()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "## Example 4: Using CyberGIS-Compute to run an Evacuation Computation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this final example, we will use CyberGIS-Compute to run a more complex example based on an agent based model of evacuation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Review available resources" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the commands described above, we can review the available resources, maintainers, and custom code repositories." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
link name container repository commit
git://uncertainty_in_spatial_accessibilityUncertainty_in_Spatial_Accessibilitycybergisx-0.4https://github.com/JinwooParkGeographer/Uncertainty-in-Spatial-Accessibility.gitNONE
git://spatial_access_covid-19 COVID-19 spatial accessibility cybergisx-0.4https://github.com/cybergis/cybergis-compute-spatial-access-covid-19.git NONE
git://mpi_hello_world MPI Hello World mpich https://github.com/cybergis/cybergis-compute-mpi-helloworld.git NONE
git://hello_world hello world python https://github.com/cybergis/cybergis-compute-hello-world.git NONE
git://fireabm hello FireABM cybergisx-0.4https://github.com/cybergis/cybergis-compute-fireabm.git NONE
git://data_fusion data fusion python https://github.com/CarnivalBug/data_fusion.git NONE
git://cybergis-compute-modules-test modules test cjw-eb https://github.com/alexandermichels/cybergis-compute-modules-test.git NONE
git://bridge_hello_world hello world python https://github.com/cybergis/CyberGIS-Compute-Bridges-2.git NONE
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Review CyberGIS-Compute resources\n", "cybergis.list_git()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, as above, we create a job, set the GitHub repository to the evacuation repo, and submit the job." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Submit the job" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "๐ŸŽฏ Logged in as beckvalle@cybergisx.cigi.illinois.edu\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346224qotAjkeeling_community {} null beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:50:24.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346224qotAjkeeling_communitygit://fireabm {"start_value": 20}{"num_of_task": 2, "walltime": "10:00"}beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:50:24.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "โœ… job submitted\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
id hpc executableFolder dataFolder resultFolder param slurm userId maintainer createdAt
1635346224qotAjkeeling_communitygit://fireabm {"start_value": 20}{"num_of_task": 2, "walltime": "10:00"}beckvalle@cybergisx.cigi.illinois.educommunity_contribution2021-10-27T14:50:24.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create job, set GitHub repo, and submit job\n", "demo_job = cybergis.create_job()\n", "demo_job.set(executableFolder=\"git://fireabm\", param={\"start_value\": 20}, \n", " slurm = {\"num_of_task\": 2, \"walltime\": \"10:00\"})\n", "demo_job.submit()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now view the job events as it is running." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[H\u001b[2J๐Ÿ“‹ Job events:\n", "๐Ÿ“ฎ Job ID: 1635346224qotAj\n", "๐Ÿ–ฅ HPC: keeling_community\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
types message time
JOB_QUEUED job [1635346224qotAj] is queued, waiting for registration 2021-10-27T14:50:24.000Z
JOB_REGISTEREDjob [1635346224qotAj] is registered with the supervisor, waiting for initialization2021-10-27T14:50:27.000Z
JOB_INIT job [1635346224qotAj] is initialized, waiting for job completion 2021-10-27T14:50:30.000Z
JOB_ENDED job [1635346224qotAj] finished 2021-10-27T14:51:17.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# View job events\n", "demo_job.events(liveOutput=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once the job has finished, you can view logs. In this case, information about the running software is displayed." ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[H\u001b[2J๐Ÿ“‹ Job logs:\n", "๐Ÿ“ฎ Job ID: 1635346224qotAj\n", "๐Ÿ–ฅ HPC: keeling_community\n" ] }, { "data": { "text/html": [ "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
message time
node id: 0, task id: 1, start number: 20, SEED: 21, result folder: /1635346224qotAj/result\n", "\n", "/1635346224qotAj/executable\n", "node id: 0, task id: 0, start number: 20, SEED: 20, result folder: /1635346224qotAj/result\n", "\n", "/1635346224qotAj/executable\n", "copying over files\n", "using FireABM_opt\n", "\n", "!! starting file parse at: 09:50:45\n", "\n", "!! Working Directory: /1635346224qotAj/executable\n", "\n", "!! checking input parameters\n", "!! input parameters OK\n", "\n", "!! starting full run at 09:50:45 \n", "\n", "!! run simulation\n", "\n", "run params: 100% shortest ...[download for full log] 2021-10-27T14:51:17.000Z
" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# View job logs\n", "demo_job.logs(liveOutput=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Download results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can save the completed evacuation computation results to a local folder and extract them from the zip file." ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "# create results folder to store downloaded results\n", "import os\n", "if not os.path.isdir(\"fireabm_result/result/\"):\n", " os.makedirs(\"fireabm_result/result/\")" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "file successfully downloaded under: ./fireabm_result/1635346227u7qn.zip\n" ] } ], "source": [ "# Save results to zip\n", "result_zip = demo_job.downloadResultFolder('./fireabm_result')" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "# Extract results\n", "import os, zipfile\n", "extract_results_to = \"fireabm_result/result/\"\n", "with zipfile.ZipFile(result_zip, 'r') as zip_ref:\n", " zip_ref.extractall(extract_results_to)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "### Display results" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we can visualize the results of the computation and plot the result." ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Display results from the job\n", "import glob\n", "from IPython.display import Video, HTML\n", "rfile = glob.glob(\"./fireabm_result/result/demo_quick_start20/1videos/*.mp4\")[0]\n", "HTML(''%rfile)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This example illustrated how to run a more complex custom program on the CyberGISX environment using CyberGIS-Compute.\n", "\n", "Future examples will demonstrate how to run existing maintainers with custom data and how to run custom code that uses custom data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3-0.8.0" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.5" } }, "nbformat": 4, "nbformat_minor": 4 }