{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# A Python Tour of Data Science\n", "\n", "[Michaël Defferrard](http://deff.ch), *PhD student*, [EPFL](http://epfl.ch) [LTS2](http://lts2.epfl.ch)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This short primer is an introduction to the scientific [Python](https://www.python.org) stack for [Data Science](https://en.wikipedia.org/wiki/Data_science). It is designed as a tour around the major Python packages used for the main computational tasks encountered in [the sexiest job of the 21st century](https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century). At the end of this tour, you'll have a broad overview of the available libraries as well as why and how they are used for each task. This notebook aims at answering the following question: **which tool should I use for which task and how**. Before starting, two remarks:\n", "1. There exist better / faster ways to accomplish the presented computations. The goal is to present the packages and get a sense of which problems they solve.\n", "1. It is not meant to teach you (scientific) Python. I however tried to include the main constructions and idioms of the language and packages. A good resource to learn scientific Python is a [set of lectures](https://github.com/jrjohansson/scientific-python-lectures) from J.R. Johansson." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "\n", "This notebook will walk you through a typical Data Science **process**:\n", "1. **Data acquisition**\n", " 1. Importation\n", " 1. Cleaning\n", " 1. Exploration\n", "1. **Data exploitation**\n", " 1. Pre-processing\n", " 1. (Feature extraction)\n", " 1. Modeling\n", " 1. (Algorithm design)\n", " 1. 
Evaluation\n", "\n", "Our **motivating example**: predict whether a credit card client will default.\n", "* It is a binary classification task: client will default or not ($y=1$ if yes; $y=0$ if no).\n", "* We have data for 30'000 real clients from Taiwan.\n", "* There are 23 numerical & categorical explanatory variables:\n", " 1. $x_1$: amount of the given credit.\n", " 2. $x_2$: gender (1 = male; 2 = female).\n", " 3. $x_3$: education (1 = graduate school; 2 = university; 3 = high school; 4 = others).\n", " 4. $x_4$: marital status (1 = married; 2 = single; 3 = others).\n", " 5. $x_5$: age (year).\n", " 6. $x_6$ to $x_{11}$: history of past payment (monthly from September to April, 2005) (-1 = pay duly; 1 = payment delay for one month; ...; 9 = payment delay for nine months and above).\n", " 7. $x_{12}$ to $x_{17}$: amount of bill statement (monthly from September to April, 2005).\n", " 8. $x_{18}$ to $x_{23}$: amount of previous payment (monthly from September to April, 2005).\n", "* The data comes from the [UCI ML repository](https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "During this tour, we'll need the packages shown below, which are best installed from [PyPI](https://pypi.python.org) in a [virtual environment](https://docs.python.org/3/library/venv.html):\n", "```\n", "$ python3 -m venv /path/to/new/virtual/env\n", "$ . 
/path/to/new/virtual/env/bin/activate\n", "$ pip install -r requirements.txt\n", "$ jupyter-notebook\n", "```" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "numpy\n", "scipy\n", "matplotlib\n", "scikit-learn\n", "\n", "pandas\n", "xlrd\n", "xlwt\n", "tables\n", "sqlalchemy\n", "\n", "statsmodels\n", "sympy\n", "autograd\n", "bokeh\n", "numba\n", "Cython\n", "\n", "keras\n", "#theano\n", "tensorflow\n", "\n", "jupyter\n", "ipython\n" ] } ], "source": [ "%%script sh\n", "cat requirements.txt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The statements starting with `%` or `%%` are [built-in magic commands](http://ipython.readthedocs.io/en/stable/interactive/magics.html), i.e. commands interpreted by the IPython kernel. E.g. `%%script sh` tells IPython to run the cell with `sh` (like the `#!` line at the beginning of a script)." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# While packages are usually imported at the top, they can\n", "# be imported wherever you prefer, in whatever scope.\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0 Python\n", "\n", "Before taking our tour, let's briefly talk about Python.\n", "First things first, the general characteristics of the language:\n", "* **General purpose**: not built for a particular usage, it works as well for scientific computing as for web and application development. 
It features high-level data structures and supports multiple paradigms: procedural, object-oriented and functional.\n", "* **Elegant syntax**: easy-to-read and intuitive code, easy-to-learn minimalistic syntax, quick to write (low boilerplate / verbosity), maintainability scales well with size of projects.\n", "* **Expressive language**: fewer lines of code, fewer bugs, easier to maintain.\n", "\n", "Technical details:\n", "* **Dynamically typed**: no need to define the type of variables, function arguments or return types. Everything is an object and can be modified at runtime.\n", "* **Automatic memory management** (garbage collector): no need to explicitly allocate and deallocate memory for variables and data arrays. No memory leak bugs.\n", "* **Interpreted** (JIT is coming): No need to compile the code. The Python interpreter reads and executes the Python code directly. It also means that a single Python source runs anywhere a runtime is available, like on Windows, Mac, Linux and in the Cloud.\n", "\n", "From those characteristics emerge the following advantages:\n", "* The main advantage is ease of programming, minimizing the time required to develop, debug and maintain the code.\n", "* The well designed language encourages many good programming practices:\n", " * Modular and object-oriented programming, good system for packaging and re-use of code. This often results in more transparent, maintainable and bug-free code.\n", " * Documentation tightly integrated with the code.\n", "* A large community geared toward open-source, an extensive standard library and a large collection of add-on packages and development tools.\n", "\n", "And the following disadvantages:\n", "* There are two versions of Python in general use: 2 and 3. While Python 3 has been around since 2008, there are still libraries which only support Python 2. 
While you should generally go for Python 3, a specific library or legacy code can hold you on Python 2.\n", "* Due to its interpreted and dynamic nature, the execution of Python code can be slow compared to compiled statically typed programming languages, such as C and Fortran. That is however almost solved, see the available solutions at the end of this notebook.\n", "* There is no compiler to catch your errors. Solutions include unit / integration tests or the use of a [linter](https://en.wikipedia.org/wiki/Lint_%28software%29) such as [pyflakes](https://pypi.python.org/pypi/pyflakes), [Pylint](https://www.pylint.org) or [PyChecker](http://pychecker.sourceforge.net). [Flake8](https://pypi.python.org/pypi/flake8) combines static analysis with style checking.\n", "\n", "### 0.1 Why Python for Data Science\n", "\n", "Let's state [why is Python a language of choice for Data Scientists](https://www.quora.com/Why-is-Python-a-language-of-choice-for-data-scientists). Viable alternatives include [MATLAB](http://ch.mathworks.com/products/matlab), [R](https://www.r-project.org) and [Julia](http://julialang.org), and, for more statistical jobs, the SAS and SPSS statistical packages. The strengths of Python are:\n", "* Minimal development time.\n", " * Rapid prototyping for data exploration.\n", " * Same language and framework for R&D and production.\n", "* A strong position in scientific computing.\n", " * Large community of users, easy to find help and documentation.\n", " * Extensive ecosystem of open-source scientific libraries and environments.\n", "* Easy integration.\n", " * Many libraries to access data from files, databases or web scraping.\n", " * Many wrappers to legacy code, e.g. 
C, Fortran or MATLAB.\n", "* Available and suitable for High Performance Computing (HPC)\n", " * Close integration with time-tested and highly optimized libraries for fast numerical mathematics like BLAS, LAPACK, ATLAS, OpenBLAS, ARPACK, MKL, etc.\n", " * JIT and AOT compilers.\n", " * Good support for parallel processing with processes and threads, interprocess communication (MPI) and GPU computing (OpenCL and CUDA).\n", "\n", "### 0.2 Why Jupyter \n", "\n", "[Jupyter](http://jupyter.org) notebook is an HTML-based notebook which allows you to create and share documents that contain live code, equations, visualizations and explanatory text. It allows a clean presentation of computational results as HTML or PDF reports and is well suited for interactive tasks such as data cleaning, transformation and exploration, numerical simulation, statistical modeling, machine learning and more. It runs everywhere (Windows, Mac, Linux, Cloud) and supports multiple languages through various kernels, e.g. [Python](https://ipython.org), [R](https://irkernel.github.io), [Julia](https://github.com/JuliaLang/IJulia.jl), [Matlab](https://github.com/Calysto/matlab_kernel).\n", "\n", "While Jupyter is itself becoming an Integrated Development Environment (IDE), alternative scientific IDEs include [Spyder](https://pythonhosted.org/spyder) and [Rodeo](http://rodeo.yhat.com). Non-scientific IDEs include [IDLE](https://docs.python.org/3/library/idle.html) and [PyCharm](https://www.jetbrains.com/pycharm). Vim and Emacs lovers (or more recently Atom and Sublime Text) will find full support of Python in their editor of choice. An interactive prompt, useful for experimentation or as a calculator, is offered by Python itself or by [IPython](https://ipython.org), the Jupyter kernel for Python." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1 Data Importation\n", "\n", "* The world is messy, we have data in CSV, [JSON](http://www.json.org), Excel, [HDF5](https://www.hdfgroup.org/HDF5) files and an SQL database.\n", "* Could also have been MATLAB, HTML, XML files or from the web via scraping and APIs (e.g. [Twitter Firehose](https://dev.twitter.com/streaming/firehose)) or noSQL data stores, etc." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Download the data.\n", "import utils\n", "utils.get_data('data/')" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "bills.hdf5 demographics.csv payments.sqlite\r\n", "delays.xls original.xls target.json\r\n" ] } ], "source": [ "!ls data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.1 Importing from an SQL Database\n", "\n", "[SQLAlchemy](http://www.sqlalchemy.org/) to the rescue.\n", "* Abstraction between DBAPIs.\n", " * Supported databases: SQLite, Postgresql, MySQL, Oracle, MS-SQL, Firebird, Sybase and others.\n", "* [SQL Expression Language](http://docs.sqlalchemy.org/en/rel_1_0/core/tutorial.html).\n", "* [Object Relational Mapper (ORM)](http://docs.sqlalchemy.org/en/rel_1_0/orm/tutorial.html)." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2017-12-03 18:24:19,043 INFO sqlalchemy.engine.base.Engine SELECT payments.\"ID\", payments.\"PAY1\", payments.\"PAY2\", payments.\"PAY3\", payments.\"PAY4\", payments.\"PAY5\", payments.\"PAY6\" \n", "FROM payments\n", "2017-12-03 18:24:19,043 INFO sqlalchemy.engine.base.Engine ()\n" ] } ], "source": [ "import sqlalchemy\n", "engine = sqlalchemy.create_engine('sqlite:///data/payments.sqlite', echo=False)\n", "\n", "# Infer from existing DB.\n", "metadata = sqlalchemy.MetaData()\n", "metadata.reflect(engine)\n", "\n", "# An SQL SELECT statement.\n", "table = metadata.tables.get('payments')\n", "op = sqlalchemy.sql.select([table])\n", "engine.echo = True\n", "result = engine.execute(op)\n", "engine.echo = False" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ID: 1, payments: (0, 689, 0, 0, 0, 0)\n", "ID: 2, payments: (0, 1000, 1000, 1000, 0, 2000)\n", "ID: 3, payments: (1518, 1500, 1000, 1000, 1000, 5000)\n", "ID: 4, payments: (2000, 2019, 1200, 1100, 1069, 1000)\n", "ID: 5, payments: (2000, 36681, 10000, 9000, 689, 679)\n", "ID: 6, payments: (2500, 1815, 657, 1000, 1000, 800)\n", "ID: 7, payments: (55000, 40000, 38000, 20239, 13750, 13770)\n", "ID: 8, payments: (380, 601, 0, 581, 1687, 1542)\n", "ID: 9, payments: (3329, 0, 432, 1000, 1000, 1000)\n", "ID: 10, payments: (0, 0, 0, 13007, 1122, 0)\n" ] } ], "source": [ "# Show some lines, i.e. 
clients.\n", "for row in result.fetchmany(size=10):\n", " print('ID: {:2d}, payments: {}'.format(row[0], row[1:]))\n", "result.close()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1299 clients paid 1000 in April 2005\n" ] } ], "source": [ "# Execute some raw SQL.\n", "paid = 1000\n", "op = sqlalchemy.sql.text('SELECT payments.\"ID\", payments.\"PAY6\" FROM payments WHERE payments.\"PAY6\" = {}'.format(paid))\n", "result = engine.execute(op).fetchall()\n", "print('{} clients paid {} in April 2005'.format(len(result), paid))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.2 Merging Data Sources\n", "\n", "Put some [pandas](http://pandas.pydata.org/) in our Python !\n", "* Import / export data from / to various sources.\n", "* Data frames manipulations: slicing, dicing, grouping.\n", "* And many more !" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def get_data(directory):\n", " demographics = pd.read_csv(directory + 'demographics.csv', index_col=0)\n", " delays = pd.read_excel(directory + 'delays.xls', index_col=0)\n", " bills = pd.read_hdf(directory + 'bills.hdf5', 'bills')\n", " payments = pd.read_sql('payments', engine, index_col='ID')\n", " target = pd.read_json(directory + 'target.json')\n", " return pd.concat([demographics, delays, bills, payments, target], axis=1)\n", "\n", "import pandas as pd\n", "data = get_data('data/')\n", "attributes = data.columns.tolist()\n", "\n", "# Transform from numerical to categorical variable.\n", "data['SEX'] = data['SEX'].astype('category')\n", "data['SEX'].cat.categories = ['MALE', 'FEMALE']\n", "data['MARRIAGE'] = data['MARRIAGE'].astype('category')\n", "data['MARRIAGE'].cat.categories = ['UNK', 'MARRIED', 'SINGLE', 'OTHERS']\n", "data['EDUCATION'] = data['EDUCATION'].astype('category')\n", "data['EDUCATION'].cat.categories = ['UNK', 'GRAD SCHOOL', 'UNIVERSITY', 'HIGH 
SCHOOL', 'OTHERS', 'UNK1', 'UNK2']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 1.3 Looking at the Data" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " LIMIT SEX EDUCATION MARRIAGE AGE DEFAULT\n", "1 20000 FEMALE UNIVERSITY MARRIED 24 1\n", "2 120000 FEMALE UNIVERSITY SINGLE 26 1\n", "3 90000 FEMALE UNIVERSITY SINGLE 34 0\n", "4 50000 FEMALE UNIVERSITY MARRIED 37 0\n", "5 50000 MALE UNIVERSITY MARRIED 57 0\n", "6 50000 MALE GRAD SCHOOL SINGLE 37 0" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.loc[:6, ['LIMIT', 'SEX', 'EDUCATION', 'MARRIAGE', 'AGE', 'DEFAULT']]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " AGE DELAY1 DELAY2 DELAY3 DELAY4 DELAY5\n", "1 24 2 2 -1 -1 -2\n", "2 26 -1 2 0 0 0\n", "3 34 0 0 0 0 0\n", "4 37 0 0 0 0 0\n", "5 57 -1 0 -1 0 0" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.iloc[:5, 4:10]" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " BILL1 BILL2 BILL3 BILL4 BILL5 BILL6 PAY1 PAY2 PAY3 PAY4 PAY5 \\\n", "1 3913 3102 689 0 0 0 0 689 0 0 0 \n", "2 2682 1725 2682 3272 3455 3261 0 1000 1000 1000 0 \n", "3 29239 14027 13559 14331 14948 15549 1518 1500 1000 1000 1000 \n", "4 46990 48233 49291 28314 28959 29547 2000 2019 1200 1100 1069 \n", "5 8617 5670 35835 20940 19146 19131 2000 36681 10000 9000 689 \n", "\n", " PAY6 \n", "1 0 \n", "2 2000 \n", "3 5000 \n", "4 1000 \n", "5 679 " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data.iloc[:5, 11:23]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Export as an [HTML table](./subset.html) for manual inspection." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "data[:1000].to_html('subset.html')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2 Data Cleaning\n", "\n", "While cleaning data is the [most time-consuming, least enjoyable Data Science task](http://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says), it should be performed nonetheless. Problems come in two flavours:\n", "\n", "1. Missing data, i.e. unknown values.\n", "1. Errors in data, i.e. wrong values.\n", "\n", "The actions to be taken in each case are highly **data and problem specific**." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Example: marital status\n", "1. According to dataset description, it should either be 1 (married), 2 (single) or 3 (others).\n", "1. But we find some 0 (previously transformed to `UNK`).\n", "1. Let's *assume* that 0 represents errors when collecting the data and that we should remove those clients." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "SINGLE 15964\n", "MARRIED 13659\n", "OTHERS 323\n", "UNK 54\n", "Name: MARRIAGE, dtype: int64\n", "\n", "We are left with (29946, 24) clients\n", "\n", "[MARRIED, SINGLE, OTHERS]\n", "Categories (3, object): [MARRIED, SINGLE, OTHERS]\n" ] } ], "source": [ "print(data['MARRIAGE'].value_counts())\n", "data = data[data['MARRIAGE'] != 'UNK']\n", "data['MARRIAGE'] = data['MARRIAGE'].cat.remove_unused_categories()\n", "print('\\nWe are left with {} clients\\n'.format(data.shape))\n", "print(data['MARRIAGE'].unique())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Example: education\n", "1. It should either be 1 (graduate school), 2 (university), 3 (high school) or 4 (others).\n", "1. But we find some 0, 5 and 6 (previously transformed to `UNK`, `UNK1` and `UNK2`).\n", "1. Let's *assume* these values are dubious, but do not invalidate the data and keep them as they may have some predictive power." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "UNIVERSITY 14024\n", "GRAD SCHOOL 10581\n", "HIGH SCHOOL 4873\n", "UNK1 280\n", "OTHERS 123\n", "UNK2 51\n", "UNK 14\n", "Name: EDUCATION, dtype: int64\n", "UNIVERSITY 14024\n", "GRAD SCHOOL 10581\n", "HIGH SCHOOL 4873\n", "UNK 345\n", "OTHERS 123\n", "Name: EDUCATION, dtype: int64\n" ] } ], "source": [ "print(data['EDUCATION'].value_counts())\n", "data.loc[data['EDUCATION']=='UNK1', 'EDUCATION'] = 'UNK'\n", "data.loc[data['EDUCATION']=='UNK2', 'EDUCATION'] = 'UNK'\n", "data['EDUCATION'] = data['EDUCATION'].cat.remove_unused_categories()\n", "print(data['EDUCATION'].value_counts())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3 Data Exploration\n", "\n", "* Get descriptive statistics.\n", "* Plot informative figures.\n", "* Verify some intuitive correlations.\n", "\n", "Let's first get some descriptive statistics of our numerical variables." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " LIMIT AGE BILL1 BILL2 BILL3 BILL4 BILL5 BILL6 \\\n", "count 29946 29946 29946 29946 29946 29946 29946 29946 \n", "mean 167546 35 51278 49224 47063 43306 40352 38911 \n", "std 129807 9 73682 71219 69393 64374 60836 59592 \n", "min 10000 21 -165580 -69777 -157264 -170000 -81334 -339603 \n", "25% 50000 28 3570 2988 2684 2335 1770 1261 \n", "50% 140000 34 22400 21221 20108 19066 18121 17098 \n", "75% 240000 41 67263 64108 60240 54601 50244 49248 \n", "max 1000000 79 964511 983931 1664089 891586 927171 961664 \n", "\n", " PAY1 PAY2 PAY3 PAY4 PAY5 PAY6 \n", "count 29946 29946 29946 29946 29946 29946 \n", "mean 5659 5926 5227 4829 4804 5220 \n", "std 16552 23060 17618 15677 15290 17791 \n", "min 0 0 0 0 0 0 \n", "25% 1000 836 390 298 255 122 \n", "50% 2100 2010 1800 1500 1500 1500 \n", "75% 5007 5000 4511 4015 4040 4000 \n", "max 873552 1684259 896040 621000 426529 528666 " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "attributes_numerical = ['LIMIT', 'AGE']\n", "attributes_numerical.extend(attributes[11:23])\n", "data.loc[:, attributes_numerical].describe().astype(int)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's plot a histogram of the ages, so that we get a better impression of who our clients are. That may even be an end goal, e.g. if your marketing team asks which customer groups to target.\n", "\n", "Then a boxplot of the bills, which may serve as a verification of the quality of the acquired data." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "data.loc[:, 'AGE'].plot.hist(bins=20, figsize=(15,5))\n", "ax = data.iloc[:, 11:17].plot.box(logy=True, figsize=(15,5))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Simple **question**: which proportion of our clients default ?" 
] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Percentage of defaults: 22.14%\n" ] } ], "source": [ "percentage = data['DEFAULT'].value_counts()[1] / data.shape[0] * 100\n", "print('Percentage of defaults: {:.2f}%'.format(percentage))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Another **question**: who's more susceptible to default, males or females ?" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ "DEFAULT 0 1 All\n", "SEX \n", "MALE 9003 2871 11874\n", "FEMALE 14312 3760 18072\n", "All 23315 6631 29946" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "observed = pd.crosstab(data['SEX'], data['DEFAULT'], margins=True)\n", "observed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Seems like females are a better risk. Let's verify with a Chi-Squared test of independence, using [scipy.stats](http://docs.scipy.org/doc/scipy/reference/stats.html)." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "p-value = 6.75e-12\n", "expected values:\n", "[[ 9244.71749148 2629.28250852]\n", " [ 14070.28250852 4001.71749148]]\n" ] } ], "source": [ "import scipy.stats as stats\n", "_, p, _, expected = stats.chi2_contingency(observed.iloc[:2,:2])\n", "print('p-value = {:.2e}'.format(p))\n", "print('expected values:\\n{}'.format(expected))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Intuition**: people who pay late present a higher risk of defaulting. Let's verify !\n", "Verifying some intuitions will also help you to identify mistakes. E.g. it would be suspicious if that intuition is not verified in the data: did we select the right column, or did we miscompute a result ?" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "group = data.groupby('DELAY1').mean()\n", "corr = data['DEFAULT'].corr(data['DELAY1'], method='pearson')\n", "group['DEFAULT'].plot(grid=True, title='Pearson correlation: {:.4f}'.format(corr), figsize=(15,5));" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4 Interactive Visualization\n", "\n", "[Bokeh](http://bokeh.pydata.org) is a Python interactive visualization library that targets modern web browsers for presentation, in the style of [D3.js](https://d3js.org). 
Alternatively, [matplotlib.widgets](http://matplotlib.org/api/widgets_api.html) could be used. Those interactive visualizations are very helpful to explore the data at hand in search of anomalies or patterns. Try with the plots below !" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", " \n", " Loading BokehJS ...\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "\n", "(function(root) {\n", " function now() {\n", " return new Date();\n", " }\n", "\n", " var force = true;\n", "\n", " if (typeof (root._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n", " root._bokeh_onload_callbacks = [];\n", " root._bokeh_is_loading = undefined;\n", " }\n", "\n", " var JS_MIME_TYPE = 'application/javascript';\n", " var HTML_MIME_TYPE = 'text/html';\n", " var EXEC_MIME_TYPE = 'application/vnd.bokehjs_exec.v0+json';\n", " var CLASS_NAME = 'output_bokeh rendered_html';\n", "\n", " /**\n", " * Render data to the DOM node\n", " */\n", " function render(props, node) {\n", " var script = document.createElement(\"script\");\n", " node.appendChild(script);\n", " }\n", "\n", " /**\n", " * Handle when an output is cleared or removed\n", " */\n", " function handleClearOutput(event, handle) {\n", " var cell = handle.cell;\n", "\n", " var id = cell.output_area._bokeh_element_id;\n", " var server_id = cell.output_area._bokeh_server_id;\n", " // Clean up Bokeh references\n", " if (id !== undefined) {\n", " Bokeh.index[id].model.document.clear();\n", " delete Bokeh.index[id];\n", " }\n", "\n", " if (server_id !== undefined) {\n", " // Clean up Bokeh references\n", " var cmd = \"from bokeh.io.state import curstate; print(curstate().uuid_to_server['\" + server_id + \"'].get_sessions()[0].document.roots[0]._id)\";\n", " cell.notebook.kernel.execute(cmd, {\n", " iopub: {\n", " output: function(msg) {\n", " var element_id = msg.content.text.trim();\n", " Bokeh.index[element_id].model.document.clear();\n", " delete Bokeh.index[element_id];\n", " }\n", " }\n", " });\n", " // Destroy server and session\n", " var cmd = \"import bokeh.io.notebook as ion; ion.destroy_server('\" + server_id + \"')\";\n", " cell.notebook.kernel.execute(cmd);\n", " }\n", " }\n", "\n", " /**\n", " * Handle when a new output is added\n", " */\n", " function handleAddOutput(event, 
handle) {\n", " var output_area = handle.output_area;\n", " var output = handle.output;\n", "\n", " // limit handleAddOutput to display_data with EXEC_MIME_TYPE content only\n", " if ((output.output_type != \"display_data\") || (!output.data.hasOwnProperty(EXEC_MIME_TYPE))) {\n", " return\n", " }\n", "\n", " var toinsert = output_area.element.find(\".\" + CLASS_NAME.split(' ')[0]);\n", "\n", " if (output.metadata[EXEC_MIME_TYPE][\"id\"] !== undefined) {\n", " toinsert[0].firstChild.textContent = output.data[JS_MIME_TYPE];\n", " // store reference to embed id on output_area\n", " output_area._bokeh_element_id = output.metadata[EXEC_MIME_TYPE][\"id\"];\n", " }\n", " if (output.metadata[EXEC_MIME_TYPE][\"server_id\"] !== undefined) {\n", " var bk_div = document.createElement(\"div\");\n", " bk_div.innerHTML = output.data[HTML_MIME_TYPE];\n", " var script_attrs = bk_div.children[0].attributes;\n", " for (var i = 0; i < script_attrs.length; i++) {\n", " toinsert[0].firstChild.setAttribute(script_attrs[i].name, script_attrs[i].value);\n", " }\n", " // store reference to server id on output_area\n", " output_area._bokeh_server_id = output.metadata[EXEC_MIME_TYPE][\"server_id\"];\n", " }\n", " }\n", "\n", " function register_renderer(events, OutputArea) {\n", "\n", " function append_mime(data, metadata, element) {\n", " // create a DOM node to render to\n", " var toinsert = this.create_output_subarea(\n", " metadata,\n", " CLASS_NAME,\n", " EXEC_MIME_TYPE\n", " );\n", " this.keyboard_manager.register_events(toinsert);\n", " // Render to node\n", " var props = {data: data, metadata: metadata[EXEC_MIME_TYPE]};\n", " render(props, toinsert[0]);\n", " element.append(toinsert);\n", " return toinsert\n", " }\n", "\n", " /* Handle when an output is cleared or removed */\n", " events.on('clear_output.CodeCell', handleClearOutput);\n", " events.on('delete.Cell', handleClearOutput);\n", "\n", " /* Handle when a new output is added */\n", " events.on('output_added.OutputArea', 
handleAddOutput);\n", "\n", " /**\n", " * Register the mime type and append_mime function with output_area\n", " */\n", " OutputArea.prototype.register_mime_type(EXEC_MIME_TYPE, append_mime, {\n", " /* Is output safe? */\n", " safe: true,\n", " /* Index of renderer in `output_area.display_order` */\n", " index: 0\n", " });\n", " }\n", "\n", " // register the mime type if in Jupyter Notebook environment and previously unregistered\n", " if (root.Jupyter !== undefined) {\n", " var events = require('base/js/events');\n", " var OutputArea = require('notebook/js/outputarea').OutputArea;\n", "\n", " if (OutputArea.prototype.mime_types().indexOf(EXEC_MIME_TYPE) == -1) {\n", " register_renderer(events, OutputArea);\n", " }\n", " }\n", "\n", " \n", " if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n", " root._bokeh_timeout = Date.now() + 5000;\n", " root._bokeh_failed_load = false;\n", " }\n", "\n", " var NB_LOAD_WARNING = {'data': {'text/html':\n", " \"
\\n\"+\n", " \"

\\n\"+\n", " \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n", " \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n", " \"

\\n\"+\n", " \"\\n\"+\n", " \"\\n\"+\n", " \"from bokeh.resources import INLINE\\n\"+\n", " \"output_notebook(resources=INLINE)\\n\"+\n", " \"\\n\"+\n", " \"
\"}};\n", "\n", " function display_loaded() {\n", " var el = document.getElementById(\"604c0e84-2e00-4577-8ec2-a5e34ea9a29a\");\n", " if (el != null) {\n", " el.textContent = \"BokehJS is loading...\";\n", " }\n", " if (root.Bokeh !== undefined) {\n", " if (el != null) {\n", " el.textContent = \"BokehJS \" + root.Bokeh.version + \" successfully loaded.\";\n", " }\n", " } else if (Date.now() < root._bokeh_timeout) {\n", " setTimeout(display_loaded, 100)\n", " }\n", " }\n", "\n", "\n", " function run_callbacks() {\n", " try {\n", " root._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n", " }\n", " finally {\n", " delete root._bokeh_onload_callbacks\n", " }\n", " console.info(\"Bokeh: all callbacks have finished\");\n", " }\n", "\n", " function load_libs(js_urls, callback) {\n", " root._bokeh_onload_callbacks.push(callback);\n", " if (root._bokeh_is_loading > 0) {\n", " console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n", " return null;\n", " }\n", " if (js_urls == null || js_urls.length === 0) {\n", " run_callbacks();\n", " return null;\n", " }\n", " console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n", " root._bokeh_is_loading = js_urls.length;\n", " for (var i = 0; i < js_urls.length; i++) {\n", " var url = js_urls[i];\n", " var s = document.createElement('script');\n", " s.src = url;\n", " s.async = false;\n", " s.onreadystatechange = s.onload = function() {\n", " root._bokeh_is_loading--;\n", " if (root._bokeh_is_loading === 0) {\n", " console.log(\"Bokeh: all BokehJS libraries loaded\");\n", " run_callbacks()\n", " }\n", " };\n", " s.onerror = function() {\n", " console.warn(\"failed to load library \" + url);\n", " };\n", " console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n", " document.getElementsByTagName(\"head\")[0].appendChild(s);\n", " }\n", " };var element = document.getElementById(\"604c0e84-2e00-4577-8ec2-a5e34ea9a29a\");\n", " if (element == 
null) {\n", " console.log(\"Bokeh: ERROR: autoload.js configured with elementid '604c0e84-2e00-4577-8ec2-a5e34ea9a29a' but no matching script tag was found. \")\n", " return false;\n", " }\n", "\n", " var js_urls = [\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.11.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.11.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-tables-0.12.11.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-gl-0.12.11.min.js\"];\n", "\n", " var inline_js = [\n", " function(Bokeh) {\n", " Bokeh.set_log_level(\"info\");\n", " },\n", " \n", " function(Bokeh) {\n", " \n", " },\n", " function(Bokeh) {\n", " console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-0.12.11.min.css\");\n", " Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.11.min.css\");\n", " console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.11.min.css\");\n", " Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.11.min.css\");\n", " console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-tables-0.12.11.min.css\");\n", " Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-tables-0.12.11.min.css\");\n", " }\n", " ];\n", "\n", " function run_inline_js() {\n", " \n", " if ((root.Bokeh !== undefined) || (force === true)) {\n", " for (var i = 0; i < inline_js.length; i++) {\n", " inline_js[i].call(root, root.Bokeh);\n", " }if (force === true) {\n", " display_loaded();\n", " }} else if (Date.now() < root._bokeh_timeout) {\n", " setTimeout(run_inline_js, 100);\n", " } else if (!root._bokeh_failed_load) {\n", " console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n", " root._bokeh_failed_load = true;\n", " } else if (force !== true) {\n", " var cell = $(document.getElementById(\"604c0e84-2e00-4577-8ec2-a5e34ea9a29a\")).parents('.cell').data().cell;\n", " 
cell.output_area.append_execute_result(NB_LOAD_WARNING)\n", " }\n", "\n", " }\n", "\n", " if (root._bokeh_is_loading === 0) {\n", " console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n", " run_inline_js();\n", " } else {\n", " load_libs(js_urls, function() {\n", " console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n", " run_inline_js();\n", " });\n", " }\n", "}(window));" ], "application/vnd.bokehjs_load.v0+json": "\n(function(root) {\n function now() {\n return new Date();\n }\n\n var force = true;\n\n if (typeof (root._bokeh_onload_callbacks) === \"undefined\" || force === true) {\n root._bokeh_onload_callbacks = [];\n root._bokeh_is_loading = undefined;\n }\n\n \n\n \n if (typeof (root._bokeh_timeout) === \"undefined\" || force === true) {\n root._bokeh_timeout = Date.now() + 5000;\n root._bokeh_failed_load = false;\n }\n\n var NB_LOAD_WARNING = {'data': {'text/html':\n \"
\\n\"+\n \"

\\n\"+\n \"BokehJS does not appear to have successfully loaded. If loading BokehJS from CDN, this \\n\"+\n \"may be due to a slow or bad network connection. Possible fixes:\\n\"+\n \"

\\n\"+\n \"\\n\"+\n \"\\n\"+\n \"from bokeh.resources import INLINE\\n\"+\n \"output_notebook(resources=INLINE)\\n\"+\n \"\\n\"+\n \"
\"}};\n\n function display_loaded() {\n var el = document.getElementById(\"604c0e84-2e00-4577-8ec2-a5e34ea9a29a\");\n if (el != null) {\n el.textContent = \"BokehJS is loading...\";\n }\n if (root.Bokeh !== undefined) {\n if (el != null) {\n el.textContent = \"BokehJS \" + root.Bokeh.version + \" successfully loaded.\";\n }\n } else if (Date.now() < root._bokeh_timeout) {\n setTimeout(display_loaded, 100)\n }\n }\n\n\n function run_callbacks() {\n try {\n root._bokeh_onload_callbacks.forEach(function(callback) { callback() });\n }\n finally {\n delete root._bokeh_onload_callbacks\n }\n console.info(\"Bokeh: all callbacks have finished\");\n }\n\n function load_libs(js_urls, callback) {\n root._bokeh_onload_callbacks.push(callback);\n if (root._bokeh_is_loading > 0) {\n console.log(\"Bokeh: BokehJS is being loaded, scheduling callback at\", now());\n return null;\n }\n if (js_urls == null || js_urls.length === 0) {\n run_callbacks();\n return null;\n }\n console.log(\"Bokeh: BokehJS not loaded, scheduling load and callback at\", now());\n root._bokeh_is_loading = js_urls.length;\n for (var i = 0; i < js_urls.length; i++) {\n var url = js_urls[i];\n var s = document.createElement('script');\n s.src = url;\n s.async = false;\n s.onreadystatechange = s.onload = function() {\n root._bokeh_is_loading--;\n if (root._bokeh_is_loading === 0) {\n console.log(\"Bokeh: all BokehJS libraries loaded\");\n run_callbacks()\n }\n };\n s.onerror = function() {\n console.warn(\"failed to load library \" + url);\n };\n console.log(\"Bokeh: injecting script tag for BokehJS library: \", url);\n document.getElementsByTagName(\"head\")[0].appendChild(s);\n }\n };var element = document.getElementById(\"604c0e84-2e00-4577-8ec2-a5e34ea9a29a\");\n if (element == null) {\n console.log(\"Bokeh: ERROR: autoload.js configured with elementid '604c0e84-2e00-4577-8ec2-a5e34ea9a29a' but no matching script tag was found. 
\")\n return false;\n }\n\n var js_urls = [\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.11.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.11.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-tables-0.12.11.min.js\", \"https://cdn.pydata.org/bokeh/release/bokeh-gl-0.12.11.min.js\"];\n\n var inline_js = [\n function(Bokeh) {\n Bokeh.set_log_level(\"info\");\n },\n \n function(Bokeh) {\n \n },\n function(Bokeh) {\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-0.12.11.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-0.12.11.min.css\");\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.11.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-widgets-0.12.11.min.css\");\n console.log(\"Bokeh: injecting CSS: https://cdn.pydata.org/bokeh/release/bokeh-tables-0.12.11.min.css\");\n Bokeh.embed.inject_css(\"https://cdn.pydata.org/bokeh/release/bokeh-tables-0.12.11.min.css\");\n }\n ];\n\n function run_inline_js() {\n \n if ((root.Bokeh !== undefined) || (force === true)) {\n for (var i = 0; i < inline_js.length; i++) {\n inline_js[i].call(root, root.Bokeh);\n }if (force === true) {\n display_loaded();\n }} else if (Date.now() < root._bokeh_timeout) {\n setTimeout(run_inline_js, 100);\n } else if (!root._bokeh_failed_load) {\n console.log(\"Bokeh: BokehJS failed to load within specified timeout.\");\n root._bokeh_failed_load = true;\n } else if (force !== true) {\n var cell = $(document.getElementById(\"604c0e84-2e00-4577-8ec2-a5e34ea9a29a\")).parents('.cell').data().cell;\n cell.output_area.append_execute_result(NB_LOAD_WARNING)\n }\n\n }\n\n if (root._bokeh_is_loading === 0) {\n console.log(\"Bokeh: BokehJS loaded, going straight to plotting\");\n run_inline_js();\n } else {\n load_libs(js_urls, function() {\n console.log(\"Bokeh: BokehJS plotting callback run at\", now());\n run_inline_js();\n });\n 
}\n}(window));" }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
\n", "
\n", "
" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/javascript": [ "(function(root) {\n", " function embed_document(root) {\n", " var docs_json = {\"6e9c6347-e469-4a06-88ac-65da94081480\":{\"roots\":{\"references\":[{\"attributes\":{\"callback\":null,\"overlay\":{\"id\":\"2604681c-10df-4bc2-b28d-5b24c8d09aae\",\"type\":\"PolyAnnotation\"}},\"id\":\"a37508dd-448a-4045-aa4f-89c900039c97\",\"type\":\"LassoSelectTool\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"cd4885d7-3b88-4bdf-9d4b-ea25c39aa964\",\"type\":\"BoxAnnotation\"},{\"attributes\":{},\"id\":\"fc2ea073-a979-4be5-a55e-e41d776af867\",\"type\":\"CrosshairTool\"},{\"attributes\":{},\"id\":\"4ef6c0c4-8fb5-4db9-a19a-28774669a16b\",\"type\":\"ResetTool\"},{\"attributes\":{},\"id\":\"86820640-0351-4aac-a374-e2ddea515824\",\"type\":\"SaveTool\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"xs_units\":\"screen\",\"ys_units\":\"screen\"},\"id\":\"2604681c-10df-4bc2-b28d-5b24c8d09aae\",\"type\":\"PolyAnnotation\"},{\"attributes\":{\"plot\":{\"id\":\"aaad1d4f-5860-4030-94a1-719eb6693dd9\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"a29cedec-bf37-4009-9634-7c1ddf63da23\",\"type\":\"LogTicker\"}},\"id\":\"6df1bbc9-b156-411a-ad18-801d4a9a2378\",\"type\":\"Grid\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\
":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"a38efdfa-869c-448e-bc27-eba5b503663a\",\"type\":\"BoxAnnotation\"},{\"attributes\":{},\"id\":\"aa2753fb-9522-4ca7-b44b-3ceb1f85f5b7\",\"type\":\"LogScale\"},{\"attributes\":{\"ticker\":null},\"id\":\"5fa32b5e-1729-4a49-8bb2-d789bce2b1c9\",\"type\":\"LogTickFormatter\"},{\"attributes\":{},\"id\":\"c268a842-a5fa-4aba-b8ad-26b26891efee\",\"type\":\"LogScale\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.1},\"fill_color\":{\"value\":\"#1f77b4\"},\"line_alpha\":{\"value\":0.1},\"line_color\":{\"value\":\"#1f77b4\"},\"size\":{\"field\":\"size\",\"units\":\"screen\"},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y1\"}},\"id\":\"40e302f2-d6f2-44bf-9a5e-0549d5ddc3df\",\"type\":\"Circle\"},{\"attributes\":{\"num_minor_ticks\":10},\"id\":\"a29cedec-bf37-4009-9634-7c1ddf63da23\",\"type\":\"LogTicker\"},{\"attributes\":{\"num_minor_ticks\":10},\"id\":\"354290d4-182c-47f6-b1cc-a94435f14472\",\"type\":\"LogTicker\"},{\"attributes\":{\"dimension\":1,\"plot\":{\"id\":\"aaad1d4f-5860-4030-94a1-719eb6693dd9\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"354290d4-182c-47f6-b1cc-a94435f14472\",\"type\":\"LogTicker\"}},\"id\":\"c4fa9393-820d-450e-a7b9-0c93a34390f3\",\"type\":\"Grid\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units\":\"screen\"},\"id\":\"40178a8b-bd81-4be1-8f1b-022bfddbfd3f\",\"type\":\"BoxAnnotation\"},{\"attributes\":{\"overlay\":{\"id\":\"cd4885d7-3b88-4bdf-9d4b-ea25c39aa964\",\"type\":\"BoxAnnotation\"}},\"id\":\"7bbd03d1-85ab-4270-9c59-a7844d1c3939\",\"typ
e\":\"BoxZoomTool\"},{\"attributes\":{},\"id\":\"7bf74ae5-4bfe-4c1d-986a-a7fd1093e41c\",\"type\":\"PanTool\"},{\"attributes\":{\"overlay\":{\"id\":\"40178a8b-bd81-4be1-8f1b-022bfddbfd3f\",\"type\":\"BoxAnnotation\"}},\"id\":\"81ac4395-7046-498c-8293-7d2addeb431d\",\"type\":\"BoxZoomTool\"},{\"attributes\":{},\"id\":\"71dcb45e-1d41-4337-838f-ee3904afb254\",\"type\":\"WheelZoomTool\"},{\"attributes\":{\"callback\":null,\"overlay\":{\"id\":\"b7f20fe7-2526-41b5-91a2-4bfee1baecbc\",\"type\":\"BoxAnnotation\"},\"renderers\":[{\"id\":\"a75abc21-47f1-4af2-8952-622bdf51f6d5\",\"type\":\"GlyphRenderer\"}]},\"id\":\"68a3cdf8-fedd-4646-bccb-e006fd30564b\",\"type\":\"BoxSelectTool\"},{\"attributes\":{\"callback\":null,\"overlay\":{\"id\":\"53f8a857-b23a-45c2-9062-4dff07864663\",\"type\":\"PolyAnnotation\"}},\"id\":\"80ae923f-6b49-476a-b6c0-5517a8dceddc\",\"type\":\"LassoSelectTool\"},{\"attributes\":{},\"id\":\"ab3b5ebd-fbc0-4fef-a35f-e39cfb4e3751\",\"type\":\"CrosshairTool\"},{\"attributes\":{},\"id\":\"f02b0937-e895-4bcb-a185-c8dd880e3b5b\",\"type\":\"ResetTool\"},{\"attributes\":{},\"id\":\"b8f4d50d-2e5d-492a-ae31-52a95dd4fa7b\",\"type\":\"SaveTool\"},{\"attributes\":{\"plot\":null,\"text\":\"\"},\"id\":\"184ba2b5-7ab7-4722-ad7e-488413e36539\",\"type\":\"Title\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"xs_units\":\"screen\",\"ys_units\":\"screen\"},\"id\":\"53f8a857-b23a-45c2-9062-4dff07864663\",\"type\":\"PolyAnnotation\"},{\"attributes\":{\"bottom_units\":\"screen\",\"fill_alpha\":{\"value\":0.5},\"fill_color\":{\"value\":\"lightgrey\"},\"left_units\":\"screen\",\"level\":\"overlay\",\"line_alpha\":{\"value\":1.0},\"line_color\":{\"value\":\"black\"},\"line_dash\":[4,4],\"line_width\":{\"value\":2},\"plot\":null,\"render_mode\":\"css\",\"right_units\":\"screen\",\"top_units
\":\"screen\"},\"id\":\"b7f20fe7-2526-41b5-91a2-4bfee1baecbc\",\"type\":\"BoxAnnotation\"},{\"attributes\":{\"source\":{\"id\":\"d6a1bda6-724c-4c32-aabd-ec77423c723b\",\"type\":\"ColumnDataSource\"}},\"id\":\"3f8d51c5-5683-4385-a88e-dc965643c119\",\"type\":\"CDSView\"},{\"attributes\":{\"below\":[{\"id\":\"988f62f2-9d50-48e2-8184-9b9062681792\",\"type\":\"LogAxis\"}],\"left\":[{\"id\":\"8cb84659-d2cb-419a-ac88-fdb26cb9c651\",\"type\":\"LogAxis\"}],\"plot_height\":400,\"plot_width\":400,\"renderers\":[{\"id\":\"988f62f2-9d50-48e2-8184-9b9062681792\",\"type\":\"LogAxis\"},{\"id\":\"6df1bbc9-b156-411a-ad18-801d4a9a2378\",\"type\":\"Grid\"},{\"id\":\"8cb84659-d2cb-419a-ac88-fdb26cb9c651\",\"type\":\"LogAxis\"},{\"id\":\"c4fa9393-820d-450e-a7b9-0c93a34390f3\",\"type\":\"Grid\"},{\"id\":\"40178a8b-bd81-4be1-8f1b-022bfddbfd3f\",\"type\":\"BoxAnnotation\"},{\"id\":\"b7f20fe7-2526-41b5-91a2-4bfee1baecbc\",\"type\":\"BoxAnnotation\"},{\"id\":\"53f8a857-b23a-45c2-9062-4dff07864663\",\"type\":\"PolyAnnotation\"},{\"id\":\"a75abc21-47f1-4af2-8952-622bdf51f6d5\",\"type\":\"GlyphRenderer\"}],\"title\":{\"id\":\"2fef3447-de5f-4620-8117-c97d58e4e8bf\",\"type\":\"Title\"},\"toolbar\":{\"id\":\"45d8578e-58de-4607-9a36-a23cf9b25899\",\"type\":\"Toolbar\"},\"toolbar_location\":null,\"x_range\":{\"id\":\"7980708b-5492-4198-a282-c3f63a03f7fa\",\"type\":\"Range1d\"},\"x_scale\":{\"id\":\"aa2753fb-9522-4ca7-b44b-3ceb1f85f5b7\",\"type\":\"LogScale\"},\"y_range\":{\"id\":\"32c72423-1413-4563-ba02-8a1ab1e1b997\",\"type\":\"DataRange1d\"},\"y_scale\":{\"id\":\"c268a842-a5fa-4aba-b8ad-26b26891efee\",\"type\":\"LogScale\"}},\"id\":\"aaad1d4f-5860-4030-94a1-719eb6693dd9\",\"subtype\":\"Figure\",\"type\":\"Plot\"},{\"attributes\":{\"data_source\":{\"id\":\"d6a1bda6-724c-4c32-aabd-ec77423c723b\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"ef3e4db0-1140-4c7a-9a08-2ba189b78f8a\",\"type\":\"Circle\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"40e302f2-d6f2-44bf-9a
5e-0549d5ddc3df\",\"type\":\"Circle\"},\"selection_glyph\":null,\"view\":{\"id\":\"3f8d51c5-5683-4385-a88e-dc965643c119\",\"type\":\"CDSView\"}},\"id\":\"6f59364f-3461-4d7f-bdd3-663ca7a912f0\",\"type\":\"GlyphRenderer\"},{\"attributes\":{\"callback\":null,\"column_names\":[\"x\",\"y1\",\"y2\",\"size\",\"color\"],\"data\":{\"color\":[\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000
\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#00
9600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\
"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600
\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#00
9600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\
"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#009600\",\"#960000\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#960000\",\"#960000\",\"#009600\",\"#009600\",\"#009600\",\"#009600\"],\"size\":{\"__ndarray__\":\"\",\"dtype\":\"float64\",\"shape\":[1000]},\"x\":[20000,120000,90000,50000,50000,50000,500000,100000,140000,20000,200000,260000,630000,70000,250000,50000,20000,320000,360000,180000,130000,120000,70000,450000,90000,50000,60000,50000,50000,50000,230000,50000,100000,500000,500000,160000,280000,60000,50000,280000,360000,70000,10000,140000,40000,210000,20000,150000,380000,20000,70000,100000,310000,180000,150000,500000,180000,180000,200000,400000,500000,70000,50000,50000,130000,200000,10000,210000,130000,20000,80000,320000,200000,290000,340000,20000,50000,300000,30000,240000,470000,360000,60000,400000,50000,160000,360000,160000,130000,20000,200000,280000,100000,160000,60000,90000,360000,150000,50000,20000,140000,380000,480000,50000,60000,70000,80000,350000,130000,360000,330000,50000,280000,100000,50000,30000,240000,80000,400000,240000,50000,450000,110000,310000,20000,20000,200000,180000,50000,60000,30000,240000,420000,330000,30000,240000,150000,210000,50000,50000,240000,180000,50000,170000,20000,50000,170000,200000,80000,260000,140000,80000,350000,280000,30000,140000,200000,200000,210000,50000,30000,50000,290000,250000,60000,110000,370000,100000,90000,50000,270000,300000,50000,50000,360000,130000,80000,50000,20000,80000,240000,80000,500000,60000,20000,100000,360000,200000,130000,20000,310000,60000,180000,180000,50000,50000,150000,20000,500000,30000,180000,140000,140000,120000,360000,20000,100000,210000,80000,330000,220000,210000,40000,30000,470000,30000,240000,80000,310000,360000,330000,300000,320000,50000,170000,350000,20000,50000,20000,50000,20000,50000,190000,60000,80000,150000,210000,240000,140000,60000,50000,5
0000,180000,30000,20000,250000,100000,330000,50000,50000,30000,140000,160000,400000,50000,140000,160000,100000,220000,510000,50000,160000,230000,80000,150000,10000,130000,40000,50000,100000,120000,260000,200000,360000,70000,460000,50000,50000,250000,30000,270000,180000,100000,230000,210000,210000,440000,100000,240000,280000,50000,30000,10000,130000,200000,120000,210000,280000,300000,100000,440000,50000,20000,200000,110000,500000,300000,30000,60000,400000,180000,20000,200000,100000,60000,110000,260000,50000,180000,110000,340000,50000,230000,360000,210000,220000,60000,340000,150000,200000,130000,60000,400000,20000,190000,260000,140000,50000,120000,240000,50000,200000,180000,180000,160000,100000,50000,140000,30000,90000,200000,380000,80000,50000,180000,240000,380000,110000,260000,500000,320000,50000,110000,230000,330000,50000,10000,300000,20000,180000,160000,90000,30000,180000,30000,30000,140000,210000,50000,130000,50000,140000,170000,80000,410000,80000,80000,150000,260000,350000,280000,310000,140000,360000,140000,100000,50000,210000,120000,240000,60000,150000,30000,160000,200000,120000,500000,320000,280000,60000,200000,230000,230000,480000,50000,200000,50000,50000,260000,30000,440000,230000,200000,490000,80000,20000,170000,70000,210000,90000,390000,110000,580000,360000,270000,60000,110000,50000,80000,50000,100000,30000,200000,240000,220000,160000,180000,200000,140000,380000,600000,260000,200000,80000,40000,130000,50000,230000,70000,180000,80000,290000,230000,170000,230000,220000,230000,500000,90000,390000,400000,180000,60000,120000,60000,50000,70000,500000,200000,250000,140000,70000,90000,100000,190000,230000,320000,220000,260000,90000,30000,260000,170000,80000,50000,200000,20000,250000,20000,30000,90000,230000,130000,30000,160000,500000,180000,80000,120000,130000,120000,360000,50000,20000,100000,70000,50000,10000,290000,30000,410000,360000,140000,210000,220000,20000,620000,360000,20000,50000,50000,10000,100000,50000,60000,440000,120000,50000,110000,200000,150000,3000
0,100000,360000,390000,220000,160000,230000,120000,100000,190000,50000,240000,40000,630000,160000,110000,20000,210000,270000,360000,60000,360000,390000,250000,90000,360000,50000,270000,200000,50000,350000,160000,80000,290000,270000,80000,50000,30000,30000,90000,160000,80000,50000,20000,220000,210000,130000,260000,230000,160000,50000,80000,110000,200000,360000,210000,80000,300000,310000,30000,170000,140000,180000,50000,50000,110000,450000,200000,220000,360000,50000,40000,190000,150000,200000,270000,60000,200000,150000,180000,90000,360000,90000,20000,70000,230000,30000,200000,100000,50000,20000,210000,110000,20000,50000,50000,400000,170000,280000,130000,210000,370000,30000,150000,60000,80000,280000,20000,270000,240000,20000,450000,10000,280000,290000,190000,80000,280000,160000,90000,140000,50000,40000,130000,70000,30000,100000,20000,30000,20000,100000,30000,320000,90000,190000,200000,300000,30000,110000,230000,230000,310000,320000,360000,20000,170000,140000,80000,150000,170000,200000,70000,320000,150000,50000,90000,280000,300000,400000,200000,10000,50000,30000,50000,50000,30000,20000,20000,50000,30000,360000,170000,400000,50000,160000,70000,150000,80000,30000,10000,180000,320000,80000,10000,360000,450000,330000,60000,50000,200000,120000,160000,110000,70000,150000,610000,110000,80000,170000,110000,140000,70000,260000,170000,50000,300000,150000,50000,50000,90000,50000,500000,470000,230000,140000,230000,420000,30000,50000,140000,260000,90000,240000,510000,110000,210000,220000,200000,360000,260000,20000,120000,50000,60000,200000,20000,50000,50000,230000,50000,100000,210000,230000,500000,370000,230000,80000,60000,30000,70000,410000,280000,50000,100000,230000,50000,100000,210000,400000,20000,50000,230000,220000,300000,80000,20000,190000,120000,700000,140000,300000,120000,280000,70000,20000,150000,60000,50000,150000,20000,40000,170000,50000,10000,200000,50000,320000,20000,470000,230000,20000,30000,80000,110000,420000,500000,80000,350000,30000,110000,500000,490000,300000,5000
0,150000,80000,40000,30000,50000,140000,50000,30000,40000,100000,130000,20000,50000,260000,30000,10000,200000,30000,180000,100000,140000,290000,20000,80000,500000,180000,20000,30000,150000,280000,50000,20000,200000,210000,230000,50000,50000,70000,310000,180000,90000,390000,150000,400000,80000,320000,240000,200000,110000,170000,90000,200000,360000,140000,220000,100000,80000,30000,50000,120000,50000,360000,90000,500000,140000,30000,70000,200000,350000,120000,240000,90000,260000,50000,110000,20000,30000,180000,20000,50000,180000,200000,260000,320000,160000,30000,50000,50000,90000,240000,180000,200000,50000,50000,330000,80000,50000,360000,240000,50000,470000,160000,30000,80000,500000,180000,110000,160000,210000,200000,350000,250000,380000,30000,200000,30000,50000,500000,30000,220000,20000,180000,300000,130000,180000,150000,140000,50000,20000,180000,180000,150000,70000,30000,500000,60000,50000,130000,260000,380000,360000,50000,50000,260000,80000,360000,290000,200000,140000,360000,50000,120000,100000,200000,90000],\"y1\":[0,0,1518,2000,2000,2500,55000,380,3329,0,2306,21818,1000,3200,3000,0,3200,10358,0,0,3000,316,2007,19428,5757,1973,0,1300,3415,1500,17270,1718,3023,4152,5006,131,8026,1500,780,9075,10000,3000,1500,3000,3000,0,3000,1013,21540,1318,0,2000,7875,1300,1600,3640,8500,8083,0,17000,1516,4025,0,1000,0,0,2300,300,0,0,0,2500,5818,0,5713,2850,1759,291,1686,0,6400,4796,1576,10700,1676,4400,1170,1147,40000,0,6300,0,7555,1602,1937,3621,8339,4031,1411,1699,10212,223,16078,1767,0,0,2861,2272,10908,0,9260,2000,6500,2000,1340,0,326,2000,9677,2000,1000,0,0,13019,1404,0,3568,4655,1000,2504,5645,2622,9744,8210,2000,1500,2500,1500,300,2360,0,0,1000,10000,15586,1000,1661,8000,1650,0,3455,30000,2906,8042,1700,5000,4000,7300,10478,0,3500,390,396,17994,3000,5500,15383,3166,3500,1510,5000,3288,15000,0,0,3000,504,0,1212,5800,15000,12500,70010,2500,1200,0,57087,8214,5396,0,10020,1342,199,37867,2000,1900,0,1500,138,1000,0,25000,10000,5000,9100,22359,1473,1586,3200,4345,5000,0,1700,3036
,13000,390,0,3100,4542,12000,12388,12019,10042,0,6530,2686,1609,2401,0,2090,0,0,5000,1506,3983,4500,0,10116,3000,0,1800,1800,2821,355,1601,1631,1788,2504,1561,1710,5006,5366,7042,3127,1565,1233,2453,0,10000,40010,1602,4664,51315,258,0,1134,4100,1600,1700,2035,0,0,2000,0,4483,2728,316,2007,4000,1000,10063,7722,4173,5000,41986,3276,8610,3472,1295,898,0,1100,3507,0,9000,657,6483,2755,0,3490,0,2400,2800,665,3220,23962,5215,1200,13809,1000,7000,3000,0,1631,1970,2200,165,3754,0,2500,21105,1070,1710,10000,0,325,3800,44665,5200,1000,3000,0,0,1323,5795,10000,0,1974,6900,92,1212,0,11000,5440,0,4210,3000,6600,1411,1541,10118,0,3177,3400,1481,6349,9010,3000,291,10000,2589,0,0,300,1529,3000,0,0,0,0,2977,1640,2500,8100,1640,0,14000,3500,2000,4200,1500,3000,1105,10010,15000,80000,3500,7900,291,0,38621,4400,3000,25000,10000,2502,0,4002,4007,5233,1488,2091,1774,7500,2690,6000,325,11404,10000,0,0,0,0,26734,5000,17507,2000,3629,0,1500,8071,2795,4500,10000,0,3000,199,3600,6000,3200,13007,2500,6422,6015,10000,0,2300,2500,3212,350,4000,0,5704,12026,1600,4500,1260,5000,4505,0,0,14000,179,2500,2027,0,1800,18163,2100,5657,8638,80000,120093,0,7536,8909,1610,10026,3276,13802,14020,2790,1800,2682,1900,1800,2518,0,0,120041,17000,2112,1700,1709,7600,15111,3317,1648,6277,157,1300,4000,4000,1203,1000,7521,7403,0,1623,2000,0,10000,2000,0,396,0,2500,1700,0,1261,240,53228,2000,2900,2385,3060,3000,0,6000,1597,33049,390,6022,3200,13250,0,0,0,14410,2836,2454,2400,2050,3068,1675,13100,5417,1000,600,0,0,0,1553,10000,5000,0,0,7357,5024,1500,0,2500,6370,2000,38187,5401,11000,2055,2800,12006,2231,2790,8005,1,8514,1600,1510,1537,1000,8394,2040,10530,5156,4641,13000,475,13647,1300,2687,780,15155,3757,4000,2200,1600,0,23776,6700,23194,5000,4000,1127,2145,2681,5090,9429,0,3249,10000,1426,390,1323,3007,0,0,1351,2000,9120,390,1006,0,2000,0,6000,2100,0,5000,2018,3315,3000,0,0,0,0,1237,2600,660,1465,7193,0,2000,1155,0,4086,228,1107,1362,1981,3000,6900,2500,5000,20000,3200,5180,1027,3100,0,0,0,1000,1300,0,2500,3597,2
000,2500,4500,0,8106,3114,40000,1300,2000,5540,3011,1500,8200,2000,33000,1541,3600,0,4520,0,15000,52129,1087,1283,1532,0,1250,8000,10000,0,0,7500,4310,1853,20000,0,6163,5800,500,0,1267,0,41346,1655,8235,105,2000,0,4600,1500,2500,0,1000,1441,1611,1755,0,1690,770,1613,4000,2382,0,3000,5300,2000,3338,13942,2000,1400,4484,2582,90000,0,1234,6000,0,316,2306,1041,4000,6540,4346,3000,4380,2007,2115,3000,10000,2000,1221,11000,390,0,0,4000,2100,4006,1005,5600,2000,1222,1509,1564,1300,0,67650,0,10000,1117,2460,3738,0,4266,15206,316,1174,5498,39,0,16244,1261,2500,0,5000,2184,3790,0,2630,22157,328,200,1284,5000,1800,600,5400,0,2320,2800,0,1372,3164,2000,13032,0,780,6666,3050,1713,6861,2400,1750,8100,5000,0,2000,3000,20300,0,507,2605,4147,2000,5085,1410,2000,7042,0,1167,199646,2360,2000,0,3001,0,2000,2487,1600,1600,30000,0,2000,3039,0,0,25000,4738,0,3700,6500,0,2000,2500,500,2100,2000,1333,3400,2200,7000,1482,0,3552,0,2330,1739,1484,6000,0,3000,12012,1265,2000,8351,3310,0,1800,6000,5174,1800,1300,0,0,8750,1988,2930,3000,1300,13476,1454,8046,340,3019,1259,6048,0,100,4500,5600,1212,11496,2500,0,10000,0,2200,3136,2300,1500,3000,27465,8300,25016,3000,7900,1352,0,2087,0,16044,6000,0,1800,4980,796,1767,3500,1122,1896,21197,4034,2500,3465,0,1500,1314,2000,10000,0,0,10190,1900,1261,3523,0,3955,1000,8131,2169,40012,0,2011,3700,29987,2833,2151,15315,1325,1120,33891,0,39,396,3000,1600,3027,20007,0,4102,3000,2000,20979,7010,2710,8589,5000,2100,2000,662,1126,5500,3000,0,81690,3000,2500,6433,6067,0,10792,2100,7876,0,2200,7004,0,0,1600,0,0,0,3320,5000,0],\"y2\":[689,1000,1500,2019,36681,1815,40000,601,0,0,12,9966,6500,0,3000,1500,0,10000,0,0,1537,316,3582,1473,0,1426,1000,1300,3421,1500,13281,1500,3511,22827,31178,396,8060,1518,0,0,7000,4500,2927,3000,5000,0,0,1170,15138,1315,3100,1606,7600,2010,1718,162,1500,7296,0,15029,1300,2095,2700,0,0,0,0,1159,0,1200,390,2500,15,3155,23453,0,1779,291,1400,0,7566,3400,1213,3,1302,3547,0,2000,5000,1651,5500,3968,0,840,1301,3597,3394,10006,1194,1460,850,3317
8,55693,1362,5000,0,3279,722,0,0,0,1000,0,2677,1305,1312,326,1700,11867,3000,4035,0,0,11128,1130,0,3585,2690,1066,7,3508,3301,9553,8095,1400,1800,3000,1500,5880,1700,0,416,10000,10000,344,1500,1200,5000,0,22500,3110,3000,1000,6700,1500,5000,3250,7108,10478,1600,0,780,396,0,1500,3900,8204,0,3500,1442,2000,0,1000,0,77,3000,500,1500,1201,1000,0,6500,30357,1600,1593,1600,5295,7000,5000,1000,6031,1664,0,0,1000,1400,0,1032,2299,85,0,0,13001,5000,7300,1305,390,343,3000,2854,5050,250,1504,2309,11001,390,0,3500,126,0,12378,9006,18832,24,4860,1864,3000,2254,1333,2002,1200,0,5000,1298,6853,1745,7422,9150,2000,0,2000,1750,0,120,2203,1536,1799,4007,1268,136,1244,5087,2000,1469,1393,1075,9880,0,9020,20094,1400,5969,0,5995,0,1298,3095,1600,0,2004,1350,0,1322,0,12496,0,316,2199,4500,664,5162,6091,3000,5000,21874,2026,6907,2092,1000,325,0,0,0,0,7465,2319,5082,3463,13001,2332,0,1206,0,960,5904,0,0,1200,14583,1354,5000,3100,6000,1286,2000,2500,526,1,0,0,5461,1500,1100,7000,38013,0,0,8321,0,0,0,1483,0,1600,7408,8000,0,4406,5037,92,1200,4872,145000,5051,0,4159,2000,6500,1522,4200,47015,0,2600,0,2000,5834,35000,2900,51058,0,2568,0,2066,399,1507,2237,1500,0,1168,0,15086,9000,2000,5299,1700,1342,3855,3000,3000,2575,400,30000,1903,1209,14000,0,4330,6500,582,0,2000,4013,3000,0,10000,1200,0,5000,0,3289,1394,384,1710,0,3487,0,15296,8232,8000,2709,0,2809,1064,199982,0,1102,1600,3332,0,1800,9021,1350,0,7143,0,1500,199,1646,4500,3300,10612,2500,6565,17454,8000,0,2200,3008,1600,400,4037,945,5930,9010,1500,4400,29366,1485,12906,0,18000,5022,1443,51432,1207,6000,1900,5654,1217,10372,5723,3000,8034,1991,7277,7500,1224,7241,6246,16525,13595,1140,1500,0,1400,1500,5950,501,0,4435,0,1200,4000,574,6303,13165,3506,792,4663,0,2,11900,308,1047,390,7000,0,0,3916,1500,6000,9000,3000,0,396,285138,2000,1400,0,1261,240,3997,1500,10,88678,3300,5000,2126,7000,1606,16000,75720,3021,3200,433,1594,6200,0,0,1200,1588,0,7500,2440,2500,11000,2847,0,7300,0,0,1000,5000,13912,3000,0,0,2566,5000,1000,16500,1500,0,1600,1207,1
829,5800,0,2753,5736,0,1758,8004,1693,1751,1500,3575,1947,0,4927,2766,0,1089,2290,10200,5691,8314,1100,8765,0,5485,30348,2374,1600,1300,0,0,6000,50000,5000,3924,1074,1844,2724,5014,8738,0,3000,0,1379,390,1196,0,0,2700,1500,5000,8528,390,32906,0,1684,1435,5000,1200,0,5000,1900,3088,3000,1000,0,0,0,1249,1800,660,2000,3855,0,2000,1000,0,1428,0,1000,1294,4521,3000,6867,1000,5000,13000,0,5000,0,2586,15000,0,1100,7816,1301,200,1145,5672,1300,2000,2011,1179,12431,0,0,1000,4110,4403,3095,1387,0,500,33000,1248,3800,0,5050,2900,6600,30000,4355,7024,2000,1024,1000,6000,10000,177671,732,6000,5913,1700,30000,0,4530,0,431,0,1553,0,52110,1797,0,105,1036,0,0,2000,2500,1970,688,1400,2000,1500,3828,0,667,2000,4000,2025,0,3413,0,1000,3333,4022,2338,1700,19623,0,3000,500,2000,5000,621,0,2111,18862,4032,2100,4075,4790,3000,3796,2004,4000,3000,2000,2000,12000,390,0,1756,3258,1227,2383,1000,5700,16025,1837,1067,1800,1400,0,6000,3509,11000,4560,2000,7526,0,1667,15032,316,1149,5500,32013,8000,1963,451,3000,190,0,5,3200,2490,2614,0,0,1950,1047,1033,0,19,0,0,1282,2700,0,1275,800,2000,1290,0,0,0,0,1000,11,1600,1720,0,10000,0,3000,2700,0,2100,3042,4590,19435,2233,4000,1200,1500,2986,1400,1022,720,49325,1000,0,4005,13013,2000,0,1100,1300,10575,0,2201,2047,0,0,50000,2803,0,3700,0,15400,4000,0,0,2027,3000,2256,1500,2000,5151,1400,0,38836,0,2200,1756,1436,0,0,6000,12012,1277,2000,2340,7228,390,1512,6000,1078,1775,1300,11663,0,7470,1989,1447,2700,1000,3173,1384,5150,618,5014,1035,6371,0,100,0,0,1221,1011,0,0,3900,0,2200,0,0,1000,1500,50154,2400,20018,3000,0,1600,0,1989,0,15087,5000,0,4400,5124,0,1700,0,1500,1656,11936,945,4000,84440,874,3000,1147,1000,2600,6938,3000,10190,1428,1261,0,0,2124,1004,8123,2210,10020,2125,2211,3400,104279,2015,2161,15000,1074,1024,16267,8348,0,396,2000,1500,3858,20026,1310,1111,5000,1000,5000,6657,0,2500,3500,2000,28,0,4081,5500,2190,0,18225,4506,1289,2,10000,0,0,0,161,291,2200,1793,0,0,1600,0,780,5000,5000,2000,2806]}},\"id\":\"d6a1bda6-724c-4c32-aabd-ec77423c723b\",\"ty
pe\":\"ColumnDataSource\"},{\"attributes\":{\"below\":[{\"id\":\"67137116-6965-42c2-be31-a9f2809714ff\",\"type\":\"LogAxis\"}],\"left\":[{\"id\":\"dc812cd4-64a5-4b7b-affd-bc9d4467c005\",\"type\":\"LogAxis\"}],\"plot_height\":400,\"plot_width\":400,\"renderers\":[{\"id\":\"67137116-6965-42c2-be31-a9f2809714ff\",\"type\":\"LogAxis\"},{\"id\":\"a01219dc-8571-4df9-b26c-6e025094a7fb\",\"type\":\"Grid\"},{\"id\":\"dc812cd4-64a5-4b7b-affd-bc9d4467c005\",\"type\":\"LogAxis\"},{\"id\":\"b9ed573c-b878-459d-8fb8-d6cbec51330b\",\"type\":\"Grid\"},{\"id\":\"cd4885d7-3b88-4bdf-9d4b-ea25c39aa964\",\"type\":\"BoxAnnotation\"},{\"id\":\"a38efdfa-869c-448e-bc27-eba5b503663a\",\"type\":\"BoxAnnotation\"},{\"id\":\"2604681c-10df-4bc2-b28d-5b24c8d09aae\",\"type\":\"PolyAnnotation\"},{\"id\":\"6f59364f-3461-4d7f-bdd3-663ca7a912f0\",\"type\":\"GlyphRenderer\"}],\"title\":{\"id\":\"184ba2b5-7ab7-4722-ad7e-488413e36539\",\"type\":\"Title\"},\"toolbar\":{\"id\":\"db7d3345-bcec-494b-939a-28c1320c4e68\",\"type\":\"Toolbar\"},\"toolbar_location\":null,\"x_range\":{\"id\":\"7980708b-5492-4198-a282-c3f63a03f7fa\",\"type\":\"Range1d\"},\"x_scale\":{\"id\":\"9963f6d2-de6b-43c9-b7a2-8e4420f506a6\",\"type\":\"LogScale\"},\"y_range\":{\"id\":\"32c72423-1413-4563-ba02-8a1ab1e1b997\",\"type\":\"DataRange1d\"},\"y_scale\":{\"id\":\"dc56abbb-f0b1-478e-af01-bce9dcc034e4\",\"type\":\"LogScale\"}},\"id\":\"0b7aad8e-fe84-4329-8eef-5876deca9682\",\"subtype\":\"Figure\",\"type\":\"Plot\"},{\"attributes\":{\"axis_label\":\"LIMIT\",\"formatter\":{\"id\":\"e717cb5d-f0cd-478d-b0aa-34b6ed86790e\",\"type\":\"LogTickFormatter\"},\"plot\":{\"id\":\"aaad1d4f-5860-4030-94a1-719eb6693dd9\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"a29cedec-bf37-4009-9634-7c1ddf63da23\",\"type\":\"LogTicker\"}},\"id\":\"988f62f2-9d50-48e2-8184-9b9062681792\",\"type\":\"LogAxis\"},{\"attributes\":{\"ticker\":null},\"id\":\"4dc97348-2a51-4518-8717-14b45726ac25\",\"type\":\"LogTickFormatter\"},{\"attributes\":{\"num_minor
_ticks\":10},\"id\":\"fa66d785-02b2-4d17-990f-6af43bb56067\",\"type\":\"LogTicker\"},{\"attributes\":{\"active_drag\":\"auto\",\"active_inspect\":\"auto\",\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"9d7c665c-1aaf-49f9-9af4-3a0f195ae195\",\"type\":\"PanTool\"},{\"id\":\"7bbd03d1-85ab-4270-9c59-a7844d1c3939\",\"type\":\"BoxZoomTool\"},{\"id\":\"d3c39665-9c69-4732-bc3c-c877afe25ca7\",\"type\":\"WheelZoomTool\"},{\"id\":\"11d41a20-194e-47b4-978c-d6780a1b4db2\",\"type\":\"BoxSelectTool\"},{\"id\":\"a37508dd-448a-4045-aa4f-89c900039c97\",\"type\":\"LassoSelectTool\"},{\"id\":\"fc2ea073-a979-4be5-a55e-e41d776af867\",\"type\":\"CrosshairTool\"},{\"id\":\"4ef6c0c4-8fb5-4db9-a19a-28774669a16b\",\"type\":\"ResetTool\"},{\"id\":\"86820640-0351-4aac-a374-e2ddea515824\",\"type\":\"SaveTool\"}]},\"id\":\"db7d3345-bcec-494b-939a-28c1320c4e68\",\"type\":\"Toolbar\"},{\"attributes\":{\"callback\":null},\"id\":\"32c72423-1413-4563-ba02-8a1ab1e1b997\",\"type\":\"DataRange1d\"},{\"attributes\":{\"callback\":null,\"end\":1000000.0,\"start\":10000.0},\"id\":\"7980708b-5492-4198-a282-c3f63a03f7fa\",\"type\":\"Range1d\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.6},\"fill_color\":{\"field\":\"color\"},\"line_alpha\":{\"value\":0.6},\"line_color\":{\"field\":\"color\"},\"size\":{\"field\":\"size\",\"units\":\"screen\"},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y1\"}},\"id\":\"ef3e4db0-1140-4c7a-9a08-2ba189b78f8a\",\"type\":\"Circle\"},{\"attributes\":{\"ticker\":null},\"id\":\"d3ff6fb9-aab2-4649-b7ba-5e303347a780\",\"type\":\"LogTickFormatter\"},{\"attributes\":{\"num_minor_ticks\":10},\"id\":\"65e6ec8b-a3eb-43a2-af8d-4215ec44e003\",\"type\":\"LogTicker\"},{\"attributes\":{},\"id\":\"9963f6d2-de6b-43c9-b7a2-8e4420f506a6\",\"type\":\"LogScale\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.6},\"fill_color\":{\"field\":\"color\"},\"line_alpha\":{\"value\":0.6},\"line_color\":{\"field\":\"color\"},\"size\":{\"field\":\"size\",\"units\":\"screen\"},\"x\":{\"field\
":\"x\"},\"y\":{\"field\":\"y2\"}},\"id\":\"db6e59ee-8598-4a18-a6b5-e53af562233c\",\"type\":\"Circle\"},{\"attributes\":{\"children\":[{\"id\":\"eaf04842-f1a1-4d3a-9631-f1db0d3b3199\",\"type\":\"Row\"}]},\"id\":\"567d42ab-f60b-4265-90e0-cca674270140\",\"type\":\"Column\"},{\"attributes\":{\"axis_label\":\"LIMIT\",\"formatter\":{\"id\":\"4dc97348-2a51-4518-8717-14b45726ac25\",\"type\":\"LogTickFormatter\"},\"plot\":{\"id\":\"0b7aad8e-fe84-4329-8eef-5876deca9682\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"fa66d785-02b2-4d17-990f-6af43bb56067\",\"type\":\"LogTicker\"}},\"id\":\"67137116-6965-42c2-be31-a9f2809714ff\",\"type\":\"LogAxis\"},{\"attributes\":{\"ticker\":null},\"id\":\"e717cb5d-f0cd-478d-b0aa-34b6ed86790e\",\"type\":\"LogTickFormatter\"},{\"attributes\":{\"children\":[{\"id\":\"0b7aad8e-fe84-4329-8eef-5876deca9682\",\"subtype\":\"Figure\",\"type\":\"Plot\"},{\"id\":\"aaad1d4f-5860-4030-94a1-719eb6693dd9\",\"subtype\":\"Figure\",\"type\":\"Plot\"}]},\"id\":\"eaf04842-f1a1-4d3a-9631-f1db0d3b3199\",\"type\":\"Row\"},{\"attributes\":{\"plot\":{\"id\":\"0b7aad8e-fe84-4329-8eef-5876deca9682\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"fa66d785-02b2-4d17-990f-6af43bb56067\",\"type\":\"LogTicker\"}},\"id\":\"a01219dc-8571-4df9-b26c-6e025094a7fb\",\"type\":\"Grid\"},{\"attributes\":{\"tools\":[{\"id\":\"9d7c665c-1aaf-49f9-9af4-3a0f195ae195\",\"type\":\"PanTool\"},{\"id\":\"7bbd03d1-85ab-4270-9c59-a7844d1c3939\",\"type\":\"BoxZoomTool\"},{\"id\":\"d3c39665-9c69-4732-bc3c-c877afe25ca7\",\"type\":\"WheelZoomTool\"},{\"id\":\"11d41a20-194e-47b4-978c-d6780a1b4db2\",\"type\":\"BoxSelectTool\"},{\"id\":\"a37508dd-448a-4045-aa4f-89c900039c97\",\"type\":\"LassoSelectTool\"},{\"id\":\"fc2ea073-a979-4be5-a55e-e41d776af867\",\"type\":\"CrosshairTool\"},{\"id\":\"4ef6c0c4-8fb5-4db9-a19a-28774669a16b\",\"type\":\"ResetTool\"},{\"id\":\"86820640-0351-4aac-a374-e2ddea515824\",\"type\":\"SaveTool\"},{\"id\":\"7bf74ae5-4bfe-4c1d-986a-a7fd1093e4
1c\",\"type\":\"PanTool\"},{\"id\":\"81ac4395-7046-498c-8293-7d2addeb431d\",\"type\":\"BoxZoomTool\"},{\"id\":\"71dcb45e-1d41-4337-838f-ee3904afb254\",\"type\":\"WheelZoomTool\"},{\"id\":\"68a3cdf8-fedd-4646-bccb-e006fd30564b\",\"type\":\"BoxSelectTool\"},{\"id\":\"80ae923f-6b49-476a-b6c0-5517a8dceddc\",\"type\":\"LassoSelectTool\"},{\"id\":\"ab3b5ebd-fbc0-4fef-a35f-e39cfb4e3751\",\"type\":\"CrosshairTool\"},{\"id\":\"f02b0937-e895-4bcb-a185-c8dd880e3b5b\",\"type\":\"ResetTool\"},{\"id\":\"b8f4d50d-2e5d-492a-ae31-52a95dd4fa7b\",\"type\":\"SaveTool\"}]},\"id\":\"4878a2dc-5911-4446-a91c-7f7c9560dbad\",\"type\":\"ProxyToolbar\"},{\"attributes\":{},\"id\":\"dc56abbb-f0b1-478e-af01-bce9dcc034e4\",\"type\":\"LogScale\"},{\"attributes\":{\"axis_label\":\"PAY1\",\"formatter\":{\"id\":\"5fa32b5e-1729-4a49-8bb2-d789bce2b1c9\",\"type\":\"LogTickFormatter\"},\"plot\":{\"id\":\"0b7aad8e-fe84-4329-8eef-5876deca9682\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"65e6ec8b-a3eb-43a2-af8d-4215ec44e003\",\"type\":\"LogTicker\"}},\"id\":\"dc812cd4-64a5-4b7b-affd-bc9d4467c005\",\"type\":\"LogAxis\"},{\"attributes\":{\"fill_alpha\":{\"value\":0.1},\"fill_color\":{\"value\":\"#1f77b4\"},\"line_alpha\":{\"value\":0.1},\"line_color\":{\"value\":\"#1f77b4\"},\"size\":{\"field\":\"size\",\"units\":\"screen\"},\"x\":{\"field\":\"x\"},\"y\":{\"field\":\"y2\"}},\"id\":\"548208c7-f15d-4a76-a92d-8996267efc0a\",\"type\":\"Circle\"},{\"attributes\":{\"dimension\":1,\"plot\":{\"id\":\"0b7aad8e-fe84-4329-8eef-5876deca9682\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"65e6ec8b-a3eb-43a2-af8d-4215ec44e003\",\"type\":\"LogTicker\"}},\"id\":\"b9ed573c-b878-459d-8fb8-d6cbec51330b\",\"type\":\"Grid\"},{\"attributes\":{\"plot\":null,\"text\":\"\"},\"id\":\"2fef3447-de5f-4620-8117-c97d58e4e8bf\",\"type\":\"Title\"},{\"attributes\":{},\"id\":\"9d7c665c-1aaf-49f9-9af4-3a0f195ae195\",\"type\":\"PanTool\"},{\"attributes\":{\"data_source\":{\"id\":\"d6a1bda6-724c-4c32-aabd-ec77
423c723b\",\"type\":\"ColumnDataSource\"},\"glyph\":{\"id\":\"db6e59ee-8598-4a18-a6b5-e53af562233c\",\"type\":\"Circle\"},\"hover_glyph\":null,\"muted_glyph\":null,\"nonselection_glyph\":{\"id\":\"548208c7-f15d-4a76-a92d-8996267efc0a\",\"type\":\"Circle\"},\"selection_glyph\":null,\"view\":{\"id\":\"d18bdddd-9e12-4fee-a8d0-f8e51f5c5aca\",\"type\":\"CDSView\"}},\"id\":\"a75abc21-47f1-4af2-8952-622bdf51f6d5\",\"type\":\"GlyphRenderer\"},{\"attributes\":{\"source\":{\"id\":\"d6a1bda6-724c-4c32-aabd-ec77423c723b\",\"type\":\"ColumnDataSource\"}},\"id\":\"d18bdddd-9e12-4fee-a8d0-f8e51f5c5aca\",\"type\":\"CDSView\"},{\"attributes\":{\"active_drag\":\"auto\",\"active_inspect\":\"auto\",\"active_scroll\":\"auto\",\"active_tap\":\"auto\",\"tools\":[{\"id\":\"7bf74ae5-4bfe-4c1d-986a-a7fd1093e41c\",\"type\":\"PanTool\"},{\"id\":\"81ac4395-7046-498c-8293-7d2addeb431d\",\"type\":\"BoxZoomTool\"},{\"id\":\"71dcb45e-1d41-4337-838f-ee3904afb254\",\"type\":\"WheelZoomTool\"},{\"id\":\"68a3cdf8-fedd-4646-bccb-e006fd30564b\",\"type\":\"BoxSelectTool\"},{\"id\":\"80ae923f-6b49-476a-b6c0-5517a8dceddc\",\"type\":\"LassoSelectTool\"},{\"id\":\"ab3b5ebd-fbc0-4fef-a35f-e39cfb4e3751\",\"type\":\"CrosshairTool\"},{\"id\":\"f02b0937-e895-4bcb-a185-c8dd880e3b5b\",\"type\":\"ResetTool\"},{\"id\":\"b8f4d50d-2e5d-492a-ae31-52a95dd4fa7b\",\"type\":\"SaveTool\"}]},\"id\":\"45d8578e-58de-4607-9a36-a23cf9b25899\",\"type\":\"Toolbar\"},{\"attributes\":{\"axis_label\":\"PAY2\",\"formatter\":{\"id\":\"d3ff6fb9-aab2-4649-b7ba-5e303347a780\",\"type\":\"LogTickFormatter\"},\"plot\":{\"id\":\"aaad1d4f-5860-4030-94a1-719eb6693dd9\",\"subtype\":\"Figure\",\"type\":\"Plot\"},\"ticker\":{\"id\":\"354290d4-182c-47f6-b1cc-a94435f14472\",\"type\":\"LogTicker\"}},\"id\":\"8cb84659-d2cb-419a-ac88-fdb26cb9c651\",\"type\":\"LogAxis\"},{\"attributes\":{\"toolbar\":{\"id\":\"4878a2dc-5911-4446-a91c-7f7c9560dbad\",\"type\":\"ProxyToolbar\"}},\"id\":\"6291dd49-fd20-452a-8c9e-391cd041c354\",\"type\":\"ToolbarBox\"},{\"attri
butes\":{\"callback\":null,\"overlay\":{\"id\":\"a38efdfa-869c-448e-bc27-eba5b503663a\",\"type\":\"BoxAnnotation\"},\"renderers\":[{\"id\":\"6f59364f-3461-4d7f-bdd3-663ca7a912f0\",\"type\":\"GlyphRenderer\"}]},\"id\":\"11d41a20-194e-47b4-978c-d6780a1b4db2\",\"type\":\"BoxSelectTool\"},{\"attributes\":{\"children\":[{\"id\":\"567d42ab-f60b-4265-90e0-cca674270140\",\"type\":\"Column\"},{\"id\":\"6291dd49-fd20-452a-8c9e-391cd041c354\",\"type\":\"ToolbarBox\"}]},\"id\":\"75b101e3-cfdd-4481-866d-d22ae054f8fe\",\"type\":\"Row\"},{\"attributes\":{},\"id\":\"d3c39665-9c69-4732-bc3c-c877afe25ca7\",\"type\":\"WheelZoomTool\"}],\"root_ids\":[\"75b101e3-cfdd-4481-866d-d22ae054f8fe\"]},\"title\":\"Bokeh Application\",\"version\":\"0.12.11\"}};\n", " var render_items = [{\"docid\":\"6e9c6347-e469-4a06-88ac-65da94081480\",\"elementid\":\"6529d229-62c9-4949-825d-3bbb2a9335fe\",\"modelid\":\"75b101e3-cfdd-4481-866d-d22ae054f8fe\"}];\n", " root.Bokeh.embed.embed_items(docs_json, render_items);\n", " }\n", " if (root.Bokeh !== undefined) {\n", " embed_document(root);\n", " } else {\n", " var attempts = 0;\n", " var timer = setInterval(function(root) {\n", " if (root.Bokeh !== undefined) {\n", " embed_document(root);\n", " clearInterval(timer);\n", " }\n", " attempts++;\n", " if (attempts > 100) {\n", " console.log(\"Bokeh: ERROR: Unable to embed document because BokehJS library is missing\")\n", " clearInterval(timer);\n", " }\n", " }, 10, root)\n", " }\n", "})(window);" ], "application/vnd.bokehjs_exec.v0+json": "" }, "metadata": { "application/vnd.bokehjs_exec.v0+json": { "id": "75b101e3-cfdd-4481-866d-d22ae054f8fe" } }, "output_type": "display_data" } ], "source": [ "from bokeh.plotting import output_notebook, figure, show\n", "from bokeh.layouts import gridplot\n", "from bokeh.models import ColumnDataSource\n", "\n", "output_notebook()\n", "\n", "x, y1, y2 = 'LIMIT', 'PAY1', 'PAY2'\n", "n = 1000 # Less intensive for the browser.\n", "\n", "options = dict(\n", " 
tools='pan,box_zoom,wheel_zoom,box_select,lasso_select,crosshair,reset,save',\n", " x_axis_type='log', y_axis_type='log',\n", ")\n", "plot1 = figure(\n", " x_range=[1e4,1e6],\n", " x_axis_label=x, y_axis_label=y1,\n", " **options\n", ")\n", "plot2 = figure(\n", " x_range=plot1.x_range, y_range=plot1.y_range,\n", " x_axis_label=x, y_axis_label=y2,\n", " **options\n", ")\n", "\n", "html_color = lambda r,g,b: '#{:02x}{:02x}{:02x}'.format(r,g,b)\n", "colors = [html_color(150,0,0) if default == 1 else html_color(0,150,0) for default in data['DEFAULT'][:n]]\n", "# The above line is a list comprehension.\n", "\n", "radii = data['AGE'][:n] / 5\n", "\n", "# To link brushing (where a selection on one plot causes a selection to update on other plots).\n", "source = ColumnDataSource(data=dict(x=data[x][:n], y1=data[y1][:n], y2=data[y2][:n], size=radii, color=colors))\n", "\n", "plot1.scatter('x', 'y1', size='size', color='color', source=source, alpha=0.6)\n", "plot2.scatter('x', 'y2', size='size', color='color', source=source, alpha=0.6)\n", "\n", "plot = gridplot([[plot1, plot2]], toolbar_location='right', plot_width=400, plot_height=400, title='adsf')\n", "\n", "show(plot)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 5 Pre-Processing\n", "\n", "Back to [NumPy](http://www.numpy.org/), the fundamental package for scientific computing with Python. It provides multi-dimensional arrays, data types and linear algebra routines. Note that [scikit-learn](http://scikit-learn.org) provides many helpers for these tasks.\n", "\n", "Pre-processing usually consists of:\n", "1. Data type transformation. The data is not necessarily in the format the chosen learning algorithm expects.\n", "1. Data normalization. Some algorithms expect data to be centered and scaled. Some will train faster.\n", "1. Data randomization. If the samples are presented in sequence, the model will train faster if consecutive samples are not correlated.\n", "1. Train / test splitting. You may have to be careful here, e.g. 
not including future events in the training set." ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": [ "# Back to numeric values.\n", "# Note: in a serious project, these should be treated as categories.\n", "data['SEX'] = data['SEX'].cat.rename_categories([-1, 1])\n", "data['SEX'] = data['SEX'].astype(int)\n", "data['MARRIAGE'] = data['MARRIAGE'].cat.rename_categories([-1, 1, 0])\n", "data['MARRIAGE'] = data['MARRIAGE'].astype(int)\n", "data['EDUCATION'] = data['EDUCATION'].cat.rename_categories([-2, 2, 1, 0, -1])\n", "data['EDUCATION'] = data['EDUCATION'].astype(int)\n", "\n", "data['DEFAULT'] = data['DEFAULT'] * 2 - 1 # [0,1] --> [-1,1]" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The data is a <class 'numpy.ndarray'> with 29946 samples of dimensionality 23.\n" ] } ], "source": [ "# Observations and targets.\n", "X = data.values[:,:23]\n", "y = data.values[:,23]\n", "n, d = X.shape\n", "print('The data is a {} with {} samples of dimensionality {}.'.format(type(X), n, d))" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "# Center and scale.\n", "# Note: in a serious project, this should be done after the train / test split.\n", "X = X.astype(float)\n", "X -= X.mean(axis=0)\n", "X /= X.std(axis=0)" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Split: 10000 testing and 19946 training samples\n" ] } ], "source": [ "# Training and testing sets.\n", "test_size = 10000\n", "print('Split: {} testing and {} training samples'.format(test_size, y.size - test_size))\n", "perm = np.random.permutation(y.size)\n", "X_test = X[perm[:test_size]]\n", "X_train = X[perm[test_size:]]\n", "y_test = y[perm[:test_size]]\n", "y_train = y[perm[test_size:]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 6 A first Predictive Model\n", "\n", "The ingredients of a Machine Learning (ML) model are:\n", 
"1. A predictive function, e.g. the linear transformation $f(x) = x^Tw + b$.\n", "1. An error function, e.g. the least squares $E = \\sum_{i=1}^n \\left( f(x_i) - y_i \\right)^2 = \\| f(X) - y \\|_2^2$.\n", "1. An optional regularization, e.g. the Thikonov regularization $R = \\|w\\|_2^2$.\n", "1. Which makes up the loss / objective function $L = E + \\alpha R$.\n", "\n", "Our model has a sole hyper-parameter, $\\alpha \\geq 0$, which controls the shrinkage.\n", "\n", "A Machine Learning (ML) problem can often be cast as a (convex or smooth) optimization problem which objective is to find the parameters (here $w$ and $b$) who minimize the loss, e.g.\n", "$$\\hat{w}, \\hat{b} = \\operatorname*{arg min}_{w,b} L = \\operatorname*{arg min}_{w,b} \\| Xw + b - y \\|_2^2 + \\alpha \\|w\\|_2^2.$$\n", "\n", "If the problem is convex and smooth, one can compute the gradients\n", "$$\\frac{\\partial L}{\\partial{w}} = 2 X^T (Xw+b-y) + 2\\alpha w,$$\n", "$$\\frac{\\partial L}{\\partial{b}} = 2 \\sum_{i=1}^n (x_i^Tw+b-y_i) = 2 \\sum_{i=1}^n (x_i^Tw-y_i) + 2n \\cdot b,$$\n", "\n", "which can be used in a [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent) scheme or to form closed-form solutions:\n", "$$\\frac{\\partial L}{\\partial{w}} = 0 \\ \\rightarrow \\ 2 X^T X\\hat{w} + 2\\alpha \\hat{w} = 2 X^T y - 2 X^T b \\ \\rightarrow \\ \\hat{w} = (X^T X + \\alpha I)^{-1} X^T (y-b),$$\n", "$$\\frac{\\partial L}{\\partial{b}} = 0 \\ \\rightarrow \\ 2n\\hat{b} = 2\\sum_{i=1}^n (y_i) - \\underbrace{2\\sum_{i=1}^n (x_i^Tw)}_{=0 \\text{ if centered}} \\ \\rightarrow \\ \\hat{b} = \\frac1n I^T y = \\operatorname{mean}(y).$$\n", "\n", "What if the resulting problem is non-smooth ? See the [PyUNLocBoX](http://pyunlocbox.readthedocs.io), a convex optimization toolbox which implements [proximal splitting methods](https://en.wikipedia.org/wiki/Proximal_gradient_method)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.1 Take a *symbolic* Derivative\n", "\n", "Let's verify our manually derived gradients ! [SymPy](http://www.sympy.org/) is our computer algebra system (CAS) (like [Mathematica](https://www.wolfram.com/mathematica), [Maple](https://www.maplesoft.com/products/Maple)) of choice." ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAALcAAAAbBAMAAADMngM5AAAAMFBMVEX///8AAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAv3aB7AAAAD3RSTlMAIpmJdu8QRM1mu90y\nVKvMIHo8AAAACXBIWXMAAA7EAAAOxAGVKw4bAAACuElEQVRIDbWVT2sTQRjGn2ST3WQ32+5FxIMa\ntEiUorH4AZY24kVsD1L8c3CRFkShCYg3oUsPElCh+gliDx481BQL3iQ3vTUevAilPfRQbylKi6DE\nd3Z2ZncnLY0tfSGz8/7meZ/MZGcmwBGFduHcETmT7WVsquYjC56KDpgvYcVJlmrOgJ8kB84+4nQt\nWWw5ha0kOUS25iWLrVpqJ0kOkS331NrbPeiAQCv2FFquik6oYP/8K5O87tWNqEhv4OFdFe6TD9A+\nMcpGS5HliwpAjoRzKtwn133gy+p3T5E9xlNObDccYav7G/YlDPM9H1eAe92uMlx4ufqWI+mzRCsU\nL1lCpawnne4hBLLdbodz4ZNqAhp9ghAwTPd+5GrhWOrs0CSmhmD79mQkFz6ZIpDbGG0HIwIK9dSN\n42NuVEM9fdTFMQ9yGz5xch2Uv8FcN2IHSPiY80D1mc7PrYChWi/nly0/Ya4ZExgH8iHVbyPdSLVL\nGJzHm0gofCyaM53jX8FICIVa87Rtw4lqqHcy7eMdYDc5TW/DcnXcZJdYMRRq9frzW/X6OqXZGlCC\n/pN+eQmFWofZCCoe1Fm8oL6XddlECp2AI9tAtQVjC1eBMkesFTNn5nMw+GYUUKoH3aiC92a9wp/I\nvFrGGkCrWUSKjEQIH/az/AYtj4WAUr3SCnisKSEzQZu3yVG1jU3bMdfxAZmYSPiwF7qDaWopJORq\n2xtHxgtGZLNIGyN6oaab/6Qh4xt38FlKIh+2qX5ggY8I81A9234P9Vq7iFmaSLrMC/SN66euAWcq\nM2N8ekkftsD7YkSYh+qZytT5eA2rnKlccuhktLjLHq30oeMvQ0JJdumUiD3ahceQ7oVJ/FqWMCZM\ndgd8nZ1Eurj6CpNW2X9Y86ZLl4DfZwX9WfxHGJVhUvc/IXU/9PFVr0LNP14Ymb5bUNKwAAAAAElF\nTkSuQmCC\n", "text/latex": [ "$$a w^{2} + \\left(b + w x - y\\right)^{2}$$" ], "text/plain": [ " 2 2\n", "a⋅w + (b + w⋅x - y) " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAMcAAAAUBAMAAADLgTR0AAAAMFBMVEX///8AAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAv3aB7AAAAD3RSTlMAEJm7MquJRO/dIs12\nVGbfGimAAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAC5UlEQVQ4EZWVT0gUURzHv7O7zrjrrD6KDSMh\nszqtpCReImhvQRf3Wiw5eegU5TFCWMNL/w6eOpTg2MU6FJN0iT04ConQgtZJ7NDgzQ6hEEqS2u/3\nZt7OzrqK+4N98+b3/bzf+/3ee/MWaMg+NERLeJVarasvd8TIqKz14uTTI+h6UpMNZKBv19MCX1Ru\noQEzR9D1JM0BzgFP6mhp13dG5TVy7ga0IoLXwx/3AKp+0D5IqBBR+RVgbAawIg6OrfG0A7MCg7ka\nN72qEBE5NQ7o9JOmiOD18EfLMGuTAm+GFoU27+IHVoTEq0JMCq18a4SKhtkJtBTmlyNERzfSTvox\nOr5ML7hSUQ0HfCugd5LD2ILxDLPQ06PoR1HOW6lEyqew7r4mMu4BZz9qO9QNa7VWEM8b25rVOpF0\npKIa3RjFEtDK3mQe7R7O433MwR7itmTCSki+gwfiEZNUw0+Bf9VEajmLNg9XdKFvGv5YKVPDAfso\nm3HqdwE9AhMQzW5iBzHy6KXS5+elUl7SJAtmyZqHgSy0rWpCwwsM2OjUEO+V0JkS2yfqU0BOKLHh\nL/QYtD9A0TZHoUsyXC7eBwTJ8yQzMPxDrGo1drAIWECby2S1FUWCwvIk35Ga2kKaellaXEz7lArB\nsg1myXi59hDzD7EiaE36kaLpB3ISqmqyoKxhjCNhwZzaRZOzSmzSgutDKgTL9jdzA3Pk543f5g1k\nUwQl9hIm0mIJppBKpennrHnj390duomrWM9b+IqHvQlKiU2FYNnYjW+kPHLyYbyNy9QjU4TpGGO4\njuLyNdRenxdR9ICYhdn9/b/IlG8UhhHrnivclwHCECxrhbn53+ynwnFigcaxqUnQVc6QL1PuuORJ\nIWwy5R6bvqxc6In2KiEibrpWKlafqMh+J0uP0zW+8FWrWV5fWQsB1CeqANpnjS94visaseBLPeaQ\npBd36Q/LOSauMPrTasCM8gWiG0uMw9een2NM+Qv4D+MLyjhmTGMqAAAAAElFTkSuQmCC\n", "text/latex": [ "$$2 a w + 2 x \\left(b + w x - y\\right)$$" ], "text/plain": [ "2⋅a⋅w + 2⋅x⋅(b + w⋅x - y)" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "image/png": 
"iVBORw0KGgoAAAANSUhEUgAAAIAAAAASBAMAAABhvsuwAAAAMFBMVEX///8AAAAAAAAAAAAAAAAA\nAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAv3aB7AAAAD3RSTlMAEJm7MquJRO/dIs12\nVGbfGimAAAAACXBIWXMAAA7EAAAOxAGVKw4bAAAB7ElEQVQ4EYVSv2sTURz/vEt6l0su+qwUHAKN\nEacWKWo3wZtcvVUM9nDoItiMHYTETcEhi+Ig9raioLQFpxt6HlhEBYOToGD/BAPSYtFcv/d+JN4Z\nku/w3vt+fvF+gTUuuTj9ELmaPXc5h2Ra4ZLIHMxDYCdDA8zDWiuH/dtKl0DOAg+Av5r9LhcGR6mj\nsTGzdAmCNr8SWH0tUgGlLmaONDZmFi6J73KsuGZXi1RApY+ZXxobMwvXEN/glWbck60KoMbuszjC\nN3zlqL3b3IskPxo3eG0R5fsEWAeY32Zqw6OAtm9WO1hGu8X8E8/srZFVrMjlr8P5SY3t4QfHHykY\nBTzGK2MLAxQCk5t9K5D8cLS9cm8BhSUCGsAC2AGt7oTh0zB8I0RGHbwUFY9ggEkdWCjKFXyD4Ec4\nGQFOPf0GlnrI4Q5ikrUDpwOTFqkuW6mL3u66C3xBORjAUA+pA6p1vKSNFTxskjPVZSt10QnfA0Uf\nTnCIM/tSoAOuAR/oBm0fEar8MxyeCRAuSl8GXqzevoFbuKh4FVB+snq+i49YXyq20O5dxeuMX7oM\nzxoAu0nyG7N7+9mASpIkXRiLb5t3gblPtQuaVzLhYs34/9+qj6CEUya6hXydygMT+puYb02gp1PP\ncWW6aJLiXuwCx0P4hQA8BKLoAAAAAElFTkSuQmCC\n", "text/latex": [ "$$2 b + 2 w x - 2 y$$" ], "text/plain": [ "2⋅b + 2⋅w⋅x - 2⋅y" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import sympy as sp\n", "sp.init_printing()\n", "\n", "X, y, w, b, a = sp.symbols('x y w b a')\n", "L = (X*w + b - y)**2 + a*w**2\n", "\n", "dLdw = sp.diff(L, w)\n", "dLdb = sp.diff(L, b)\n", "\n", "from IPython.display import display\n", "display(L)\n", "display(dLdw)\n", "display(dLdb)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.2 Build the Classifier\n", "\n", "Relying on the derived equations, we can implement our model relying only on the [NumPy](http://www.numpy.org/) linear algebra capabilities (really wrappers to [BLAS](http://www.netlib.org/blas) / [LAPACK](http://www.netlib.org/lapack) implementations such as [ATLAS](http://math-atlas.sourceforge.net), [OpenBLAS](http://www.openblas.net) or [MKL](https://software.intel.com/intel-mkl)).\n", "\n", "A ML model is best represented as a class, with hyper-parameters and parameters stored as attributes, and is composed of two essential methods:\n", "1. 
`y_pred = model.predict(X_test)`: return the predictions $y$ given the features $X$.\n", "1. `model.fit(X_train, y_train)`: learn the model parameters so as to predict $y$ given $X$." ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "class RidgeRegression(object):\n", "    \"\"\"Our ML model.\"\"\"\n", "    \n", "    def __init__(self, alpha=0):\n", "        \"The class' constructor. Initialize the hyper-parameters.\"\n", "        self.a = alpha\n", "    \n", "    def predict(self, X):\n", "        \"\"\"Return the predicted class given the features.\"\"\"\n", "        return np.sign(X.dot(self.w) + self.b)\n", "    \n", "    def fit(self, X, y):\n", "        \"\"\"Learn the model's parameters given the training data, the closed-form way.\"\"\"\n", "        n, d = X.shape\n", "        self.b = np.mean(y)\n", "        Ainv = np.linalg.inv(X.T.dot(X) + self.a * np.identity(d))\n", "        self.w = Ainv.dot(X.T).dot(y - self.b)\n", "\n", "    def loss(self, X, y, w=None, b=None):\n", "        \"\"\"Return the current loss.\n", "        This method is not strictly necessary, but it provides\n", "        information on the convergence of the learning process.\"\"\"\n", "        w = self.w if w is None else w  # The ternary conditional operator\n", "        b = self.b if b is None else b  # makes those tests concise.\n", "        import autograd.numpy as np  # See below for autograd.\n", "        return np.linalg.norm(np.dot(X, w) + b - y)**2 + self.a * np.linalg.norm(w, 2)**2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that our model can learn its parameters and predict targets, it's time to evaluate it. Our metric for binary classification is the accuracy, which gives the percentage of correctly classified test samples. Depending on the application, the time spent for inference or training might also be important."
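, "\n", "\n", "As a small worked definition (a sketch in our own notation, assuming labels $y_i \\in \\{-1, +1\\}$, which is what the `np.sign` in `predict` suggests), the accuracy over $n$ test samples can be written as\n", "$$\\operatorname{accuracy} = \\frac1n \\left| \\{ i : \\hat{y}_i = y_i \\} \\right|, \\quad \\hat{y}_i = \\operatorname{sign}(x_i^T \\hat{w} + \\hat{b}),$$\n", "i.e. the number of test samples whose predicted label matches the true one, divided by $n$."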
] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "accuracy: 80.16%, loss: 5979.51, time: 32.32ms\n" ] } ], "source": [ "def accuracy(y_pred, y_true):\n", "    \"\"\"Our evaluation metric, the classification accuracy.\"\"\"\n", "    return np.sum(y_pred == y_true) / y_true.size\n", "\n", "def evaluate(model):\n", "    \"\"\"Helper function to train and evaluate the model.\n", "    It prints the classification accuracy, the loss and the execution time, and returns the model.\"\"\"\n", "    import time\n", "    t = time.process_time()\n", "    model.fit(X_train, y_train)\n", "    y_pred = model.predict(X_test)\n", "    acc = accuracy(y_pred, y_test)\n", "    loss = model.loss(X_test, y_test)\n", "    t = time.process_time() - t\n", "    print('accuracy: {:.2f}%, loss: {:.2f}, time: {:.2f}ms'.format(acc*100, loss, t*1000))\n", "    return model\n", "\n", "alpha = 1e-2*n\n", "model = RidgeRegression(alpha)\n", "evaluate(model)\n", "\n", "models = []\n", "models.append(model)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, we got around 80% accuracy with such a simple model! Inference and training times look good.\n", "\n", "For those of you who don't know about numerical mathematics, solving a linear system of equations by inverting a matrix can be numerically unstable. Let's do it the proper way and use a dedicated solver."
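, "\n", "\n", "Concretely (a short sketch in the notation of the closed-form solution above, not an additional result), instead of forming the inverse explicitly we solve the linear system\n", "$$(X^T X + \\alpha I) \\, \\hat{w} = X^T (y - \\hat{b})$$\n", "for $\\hat{w}$. For $\\alpha > 0$ the matrix $X^T X + \\alpha I$ is symmetric positive definite, so the system has a unique solution."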
] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "accuracy: 80.16%, loss: 5979.51, time: 9.17ms\n" ] } ], "source": [ "def fit_lapack(self, X, y):\n", "    \"\"\"Better way (numerical stability): solve the linear system with LAPACK.\"\"\"\n", "    n, d = X.shape\n", "    self.b = np.mean(y)\n", "    A = X.T.dot(X) + self.a * np.identity(d)\n", "    b = X.T.dot(y - self.b)\n", "    self.w = np.linalg.solve(A, b)\n", "\n", "# Let's monkey patch our class (Python is a dynamic language).\n", "RidgeRegression.fit = fit_lapack\n", "\n", "# Check that both ways of fitting give the same parameters.\n", "models.append(evaluate(RidgeRegression(alpha)))\n", "assert np.allclose(models[-1].w, models[0].w)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.3 Learning as Gradient Descent\n", "\n", "Descending the gradient of our objective, with a small enough learning rate, will lead us to a local minimum. If the objective is convex, that minimum will be global. Let's implement the gradient computed above and a simple gradient descent algorithm\n", "$$w^{(t+1)} = w^{(t)} - \\gamma \\frac{\\partial L}{\\partial w}$$\n", "where $\\gamma$ is the learning rate, another hyper-parameter." ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "loss at iteration 0: 529018.00\n", "loss at iteration 100: 23032.07\n", "loss at iteration 200: 14060.74\n", "loss at iteration 300: 12889.46\n", "loss at iteration 400: 12614.25\n", "loss at iteration 500: 12479.30\n", "loss at iteration 600: 12388.20\n", "loss at iteration 700: 12322.33\n", "loss at iteration 800: 12274.12\n", "loss at iteration 900: 12238.75\n", "accuracy: 80.25%, loss: 6027.15, time: 1190.39ms\n" ] } ], "source": [ "class RidgeRegressionGradient(RidgeRegression):\n", "    \"\"\"This model inherits from `RidgeRegression`. 
We override the constructor, add a gradient\n", "    function and replace the learning algorithm, but don't touch the prediction and loss functions.\"\"\"\n", "    \n", "    def __init__(self, alpha=0, rate=0.1, niter=1000):\n", "        \"\"\"Here are new hyper-parameters: the learning rate and the number of iterations.\"\"\"\n", "        super().__init__(alpha)\n", "        self.rate = rate\n", "        self.niter = niter\n", "    \n", "    def grad(self, X, y, w):\n", "        \"\"\"Gradient of the loss with respect to w (b is kept fixed).\"\"\"\n", "        A = X.dot(w) + self.b - y\n", "        return 2 * X.T.dot(A) + 2 * self.a * w\n", "    \n", "    def fit(self, X, y):\n", "        n, d = X.shape\n", "        self.b = np.mean(y)\n", "        \n", "        self.w = np.random.normal(size=d)\n", "        for i in range(self.niter):\n", "            self.w -= self.rate * self.grad(X, y, self.w)\n", "            \n", "            # Show convergence.\n", "            if i % (self.niter//10) == 0:\n", "                print('loss at iteration {}: {:.2f}'.format(i, self.loss(X, y)))\n", "    \n", "models.append(evaluate(RidgeRegressionGradient(alpha, 1e-6)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tired of deriving gradients by hand? Welcome [autograd](https://github.com/HIPS/autograd/), our tool of choice for [automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation). Alternatives are [Theano](http://deeplearning.net/software/theano/) and [TensorFlow](https://www.tensorflow.org/)."
] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "loss at iteration 0: 162693.94\n", "loss at iteration 100: 20286.79\n", "loss at iteration 200: 14282.81\n", "loss at iteration 300: 13169.71\n", "loss at iteration 400: 12788.21\n", "loss at iteration 500: 12586.18\n", "loss at iteration 600: 12456.27\n", "loss at iteration 700: 12366.82\n", "loss at iteration 800: 12303.66\n", "loss at iteration 900: 12258.57\n", "accuracy: 80.20%, loss: 6035.20, time: 2338.81ms\n" ] } ], "source": [ "class RidgeRegressionAutograd(RidgeRegressionGradient):\n", "    \"\"\"Here we derive the gradient during construction and update the gradient function.\"\"\"\n", "    def __init__(self, *args):\n", "        super().__init__(*args)\n", "        from autograd import grad\n", "        # Differentiate the loss w.r.t. its third argument (X, y, w), i.e. w.\n", "        self.grad = grad(self.loss, argnum=2)\n", "\n", "models.append(evaluate(RidgeRegressionAutograd(alpha, 1e-6)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 6.4 Learning as generic Optimization\n", "\n", "Sometimes we don't want to implement the optimization by hand and would prefer a generic optimization algorithm. Let's make use of [SciPy](https://www.scipy.org/), which provides high-level algorithms for, e.g., [optimization](http://docs.scipy.org/doc/scipy/reference/optimize.html), [statistics](http://docs.scipy.org/doc/scipy/reference/stats.html), [interpolation](http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html), [signal processing](http://docs.scipy.org/doc/scipy/reference/tutorial/signal.html), [sparse matrices](http://docs.scipy.org/doc/scipy/reference/sparse.html), [advanced linear algebra](http://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html)."
] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/usr/lib/python3.6/site-packages/scipy/optimize/_minimize.py:415: RuntimeWarning: Method Nelder-Mead does not use gradient information (jac).\n", " RuntimeWarning)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "accuracy: 70.40%, loss: 15561.85, time: 3161.56ms\n", "accuracy: 80.16%, loss: 5979.51, time: 81.34ms\n" ] } ], "source": [ "class RidgeRegressionOptimize(RidgeRegressionGradient):\n", "    \n", "    def __init__(self, alpha=0, method=None):\n", "        \"\"\"Here's a new hyper-parameter: the optimization algorithm.\"\"\"\n", "        super().__init__(alpha)\n", "        self.method = method\n", "    \n", "    def fit(self, X, y):\n", "        \"\"\"Fitted with a general purpose optimization algorithm.\"\"\"\n", "        n, d = X.shape\n", "        self.b = np.mean(y)\n", "        \n", "        # Objective and gradient w.r.t. the variable to be optimized.\n", "        f = lambda w: self.loss(X, y, w)\n", "        jac = lambda w: self.grad(X, y, w)\n", "        \n", "        # Solve the problem!\n", "        from scipy.optimize import minimize\n", "        w0 = np.random.normal(size=d)\n", "        res = minimize(f, w0, method=self.method, jac=jac)\n", "        self.w = res.x\n", "\n", "models.append(evaluate(RidgeRegressionOptimize(alpha, method='Nelder-Mead')))\n", "models.append(evaluate(RidgeRegressionOptimize(alpha, method='BFGS')))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Accuracy may be lower (depending on the random initialization) as the optimization may not have converged to the global minimum. Training time is, however, much longer, especially for gradient-free optimizers such as Nelder-Mead." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 7 More interactivity\n", "\n", "Interlude: the interactivity of Jupyter notebooks can be pushed further with [IPython widgets](https://ipywidgets.readthedocs.io). 
Below, we construct a slider for the model hyper-parameter $\\alpha$, which will train the model and print its performance at each change of the value. Handy when exploring the effects of hyper-parameters, although it's less useful if the required computations are long." ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "alpha = 2.99e+05\n", "accuracy: 78.24%, loss: 6773.09, time: 6.75ms\n" ] } ], "source": [ "import ipywidgets\n", "from IPython.display import clear_output\n", "\n", "slider = ipywidgets.widgets.FloatSlider(\n", "    value=-2,\n", "    min=-4,\n", "    max=2,\n", "    step=1,\n", "    description='log10(alpha / n)',\n", ")\n", "\n", "def handle(change):\n", "    \"\"\"Handler for value change: fit model and print performance.\"\"\"\n", "    value = change['new']\n", "    alpha = np.power(10, value) * n\n", "    clear_output()\n", "    print('alpha = {:.2e}'.format(alpha))\n", "    evaluate(RidgeRegression(alpha))\n", "\n", "slider.observe(handle, names='value')\n", "display(slider)\n", "\n", "slider.value = 1  # As if someone moved the slider." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 8 Machine Learning made easier\n", "\n", "Tired of writing algorithms? Try [scikit-learn](http://scikit-learn.org), which provides many ML algorithms and related tools, e.g. metrics, cross-validation, model selection, feature extraction, pre-processing, for [predictive modeling](https://en.wikipedia.org/wiki/Predictive_modelling)."
] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "accuracy: 80.16%\n" ] } ], "source": [ "from sklearn import linear_model, metrics\n", "\n", "# The previously developed model: Ridge Regression.\n", "model = linear_model.RidgeClassifier(alpha=alpha)\n", "model.fit(X_train, y_train)\n", "y_pred = model.predict(X_test)\n", "models.append(model)\n", "\n", "# Evaluate the predictions with a metric: the classification accuracy.\n", "acc = metrics.accuracy_score(y_test, y_pred)\n", "print('accuracy: {:.2f}%'.format(acc*100))\n", "\n", "# It does indeed learn approximately the same parameters (note the loose tolerance).\n", "assert np.allclose(models[-1].coef_, models[0].w, rtol=1e-1)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "accuracy: 81.52%\n" ] } ], "source": [ "# Let's try another model!\n", "models.append(linear_model.LogisticRegression())\n", "models[-1].fit(X_train, y_train)\n", "acc = models[-1].score(X_test, y_test)\n", "print('accuracy: {:.2f}%'.format(acc*100))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 9 Traditional Statistics\n", "\n", "[Statsmodels](http://statsmodels.sourceforge.net/) is similar to scikit-learn, with a much stronger emphasis on parameter estimation and (statistical) testing. It is similar in spirit to other statistical packages such as [R](https://www.r-project.org), [SPSS](http://www.ibm.com/analytics/us/en/technology/spss), [SAS](http://www.sas.com/de_ch/home.html) and [Stata](http://www.stata.com). That split reflects the [two statistical modeling cultures](http://projecteuclid.org/euclid.ss/1009213726): (1) Statistics, which wants to know how well a given model fits the data, and which variables \"explain\" or affect the outcome, and (2) Machine Learning, where the main supported task is choosing the \"best\" model for prediction."
] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " OLS Regression Results \n", "==============================================================================\n", "Dep. Variable: y R-squared: 0.085\n", "Model: OLS Adj. R-squared: 0.084\n", "Method: Least Squares F-statistic: 80.98\n", "Date: Sun, 03 Dec 2017 Prob (F-statistic): 0.00\n", "Time: 18:24:26 Log-Likelihood: -27411.\n", "No. Observations: 19946 AIC: 5.487e+04\n", "Df Residuals: 19923 BIC: 5.505e+04\n", "Df Model: 23 \n", "Covariance Type: nonrobust \n", "==============================================================================\n", " coef std err t P>|t| [0.025 0.975]\n", "------------------------------------------------------------------------------\n", "x1 -0.0216 0.008 -2.574 0.010 -0.038 -0.005\n", "x2 -0.0170 0.007 -2.469 0.014 -0.030 -0.004\n", "x3 0.0348 0.007 4.843 0.000 0.021 0.049\n", "x4 -0.0263 0.008 -3.404 0.001 -0.041 -0.011\n", "x5 0.0335 0.008 4.263 0.000 0.018 0.049\n", "x6 0.2179 0.009 23.179 0.000 0.199 0.236\n", "x7 0.0434 0.012 3.596 0.000 0.020 0.067\n", "x8 0.0202 0.013 1.568 0.117 -0.005 0.045\n", "x9 0.0120 0.014 0.860 0.390 -0.015 0.039\n", "x10 0.0114 0.015 0.776 0.438 -0.017 0.040\n", "x11 0.0072 0.012 0.589 0.556 -0.017 0.031\n", "x12 -0.0967 0.025 -3.896 0.000 -0.145 -0.048\n", "x13 0.0225 0.034 0.663 0.508 -0.044 0.089\n", "x14 -0.0068 0.031 -0.217 0.829 -0.068 0.054\n", "x15 -0.0105 0.030 -0.347 0.729 -0.070 0.049\n", "x16 -0.0093 0.034 -0.271 0.786 -0.077 0.058\n", "x17 0.0351 0.027 1.313 0.189 -0.017 0.087\n", "x18 -0.0247 0.009 -2.859 0.004 -0.042 -0.008\n", "x19 -0.0057 0.010 -0.541 0.588 -0.026 0.015\n", "x20 9.242e-05 0.009 0.010 0.992 -0.017 0.018\n", "x21 -0.0075 0.009 -0.849 0.396 -0.025 0.010\n", "x22 -0.0166 0.009 -1.886 0.059 -0.034 0.001\n", "x23 0.0003 0.007 0.047 0.962 -0.014 0.014\n", "==============================================================================\n", 
"Omnibus: 3072.392 Durbin-Watson: 1.335\n", "Prob(Omnibus): 0.000 Jarque-Bera (JB): 4759.053\n", "Skew: 1.194 Prob(JB): 0.00\n", "Kurtosis: 3.141 Cond. No. 16.6\n", "==============================================================================\n", "\n", "Warnings:\n", "[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/usr/lib/python3.6/site-packages/statsmodels/compat/pandas.py:56: FutureWarning: The pandas.core.datetools module is deprecated and will be removed in a future version. Please use the pandas.tseries module instead.\n", " from pandas.core import datetools\n" ] } ], "source": [ "import statsmodels.api as sm\n", "\n", "# Fit the Ordinary Least Square regression model.\n", "results = sm.OLS(y_train, X_train).fit()\n", "\n", "# Inspect the results.\n", "print(results.summary())\n", "\n", "# Yeah, it's the same as scikit-learn.\n", "assert np.allclose(results.params, linear_model.Ridge(fit_intercept=False).fit(X_train, y_train).coef_, atol=1e-3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 10 Deep Learning (DL)\n", "\n", "Of course ! We got two low-level Python libraries: (1) [TensorFlow](https://www.tensorflow.org/) and (2) [Theano](http://deeplearning.net/software/theano/). Both of them treat data as tensors and construct a computational graph ([dataflow paradigm](https://en.wikipedia.org/wiki/Dataflow_programming)), composed of any mathematical expressions, that get evaluated on CPUs or GPUs. Theano is the pioneer and features an optimizing compiler which will turn the computational graph into efficient code. 
TensorFlow has a cleaner API (no need to define expressions as strings) and does not require a compilation step (which is painful when developing models).\n", "\n", "While you'll only use Theano / TensorFlow directly to develop DL models, the following higher-level libraries are what you'll use to define and test DL architectures on your problem:\n", "* [Keras](https://keras.io/): TensorFlow & Theano backends\n", "* [Lasagne](http://lasagne.readthedocs.io): Theano backend\n", "* [nolearn](https://github.com/dnouri/nolearn): sklearn-like abstraction of Lasagne\n", "* [Blocks](http://blocks.readthedocs.io): Theano backend\n", "* [TFLearn](http://tflearn.org): TensorFlow backend" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using TensorFlow backend.\n", "/usr/lib/python3.6/site-packages/ipykernel_launcher.py:8: UserWarning: Update your `Dense` call to the Keras 2 API: `Dense(input_dim=23, activation=\"relu\", units=46)`\n", " \n", "/usr/lib/python3.6/site-packages/ipykernel_launcher.py:9: UserWarning: Update your `Dense` call to the Keras 2 API: `Dense(activation=\"sigmoid\", units=1)`\n", " if __name__ == '__main__':\n", "/home/michael/venv/tmp/lib/python3.6/site-packages/keras/models.py:939: UserWarning: The `nb_epoch` argument in `fit` has been renamed `epochs`.\n", " warnings.warn('The `nb_epoch` argument in `fit` '\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Epoch 1/5\n", "19946/19946 [==============================] - 1s 38us/step - loss: 0.5352 - acc: 0.7831\n", "Epoch 2/5\n", "19946/19946 [==============================] - 1s 28us/step - loss: 0.4893 - acc: 0.8018\n", "Epoch 3/5\n", "19946/19946 [==============================] - 1s 29us/step - loss: 0.4740 - acc: 0.8056\n", "Epoch 4/5\n", "19946/19946 [==============================] - 1s 27us/step - loss: 0.4661 - acc: 0.8083\n", "Epoch 5/5\n", "19946/19946 [==============================] - 1s 29us/step - loss: 
0.4611 - acc: 0.8097\n", "10000/10000 [==============================] - 0s 19us/step\n", "\n", "\n", "Testing set: [0.45307849888801577, 0.81489999999999996]\n" ] } ], "source": [ "import keras\n", "\n", "class NeuralNet(object):\n", "    \n", "    def __init__(self):\n", "        \"\"\"Define Neural Network architecture.\"\"\"\n", "        self.model = keras.models.Sequential()\n", "        self.model.add(keras.layers.Dense(output_dim=46, input_dim=23, activation='relu'))\n", "        self.model.add(keras.layers.Dense(output_dim=1, activation='sigmoid'))\n", "        self.model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])\n", "\n", "    def fit(self, X, y):\n", "        y = y / 2 + 0.5  # {-1, 1} -> {0, 1}\n", "        self.model.fit(X, y, nb_epoch=5, batch_size=32)\n", "\n", "    def predict(self, X):\n", "        classes = self.model.predict_classes(X, batch_size=32)\n", "        return classes[:,0] * 2 - 1  # {0, 1} -> {-1, 1}\n", "    \n", "models.append(NeuralNet())\n", "models[-1].fit(X_train, y_train)\n", "\n", "loss_acc = models[-1].model.evaluate(X_test, y_test/2+0.5, batch_size=32)\n", "print('\\n\\nTesting set: {}'.format(loss_acc))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 11 Evaluation\n", "\n", "Now that we have tried several predictive models, it is time to evaluate them with our chosen metrics and choose the one best suited to our particular problem. Let's plot the *classification accuracy* and the *prediction time* for each classifier with [matplotlib](http://matplotlib.org), the go-to 2D plotting library for scientific Python. Its API is similar to MATLAB's.\n", "\n", "Result: In the run recorded above, the NeuralNet and the much simpler logistic regression reach nearly the same accuracy (about 81.5%), but the NeuralNet is the slowest method. Which to choose? Again, it depends on your priorities."
] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA2oAAAG4CAYAAAA0QhQGAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4wLCBo\ndHRwOi8vbWF0cGxvdGxpYi5vcmcvpW3flQAAIABJREFUeJzs3XmYJWV59/HvM9OyuUZblMEFUDTB\nDUUFtwRFVHzzAka9gytuGYkLblFBo2gSDUmMWxbJGKO4extBTGIiBoOGV8EAatAoCSIqMoJjEFFg\ncGbq/aOqnUPT63Sfrnro7+e6ztVdderU+c2ZmXOf+9RTT5WmaZAkSZIkDceavgNIkiRJkm7IRk2S\nJEmSBsZGTZIkSZIGxkZNkiRJkgbGRk2SJEmSBsZGTZIkSZIGxkZNqlgp5VmllKa7/f0y7O/ckf09\nfDkySpJWRillr9H37+nLS9jvG0opFy1Pyh16/oO7P8edenr+S0Zq473H/FwvGnmuvx3nc2n4bNRU\nlVLKHUsp15VSflhKuVnfeQZiK7AH8LypFaWUR5VSvllKubqU8slSym1GH1BK+ftSyqtn2NdjgQeP\nN64krQ6llPeNfOjeUkr5binlpFLK7VYowvdp68M5C9m4lPLwLute0+56C3DQ8kabNcOWUsqzpq3+\nIu2f47KVyDCLP+kyfGvMz/N33fN8aczPowrYqKk2zwH+CfgxcETPWQAopezUd4amaX7YNM1PujwF\n+CjwXuCBwCTw2qltSylPAu5KW3in7+fHwI9WIrMkrRL/TvvBey/gWOCJwPtn23g5a0rTNFu7+vCL\nJe7nZ03TbFquXDvw/Nd3f45tfWUAftZl2DLOJ2ma5pqmaX4IXD/O51EdbNRUjVLKGuB3gJO72/oZ\ntpkopby+lPLtUsrmUsoPSil/MXL/LUopby+lfL+7/5JSymu6+2YcIlJKuaiU8oaR5aaUcmwp5cOl\nlKuAD3Xr39Qdxbqm2/9JpZRbT9vXAaWUfyml/LSU8rNSypdLKQeWUvYppWwrpTx02va/0a3fZxEv\n1SRwe+CdTdNc2OXbr9vf7YC3As9pmmbrIvYpSdoxU03GpU3TnAa8HXhcKWXXkbrztFLKp0spPwfe\nDFBKuXsp5ROllJ+UUq4spZxeSrnP6I5LKdHVqOtKKV8E7jvt/hvVtVLK7qWU95ZSLu8ed2Ep5Tnd\nUbR/7zb7Tve4M7vH3GjoYynl6FLKf3W19NJSyh+VUiZG7j+zlPK3pZTXdaNg/rc7wnjz2V6oUsol\nwFrgvVNHIrv1Nxj6OLL8+FLKl0op15ZSziul3Ku7ndXV4i+XUvab9hwHdK/lz0opPyqlnFJKuet8\nf4kzZJ16bZ9aSvlM93zf6ur2nlN/n91r9IiRx92slPLW7jXbXErZWEr56GKfX6uDjZpq8hjg5sCn\ngQ8AB8/QwLwHeBHwBtrm5InAxfDLI03/CBwOvBj4NeCZ7NgRpBNohyU8gO1Hq66lbR73A54FHAy8\nc+oBpZR7AV8ArgQeBdwfeBuwpmmai4HP0jaio54HnNHdv1CbaIeHPL60w0MPBb7S3fdO4G+bprlg\nEfuTJC2fa2k/f02MrPsT4MPAfYC/KqXcATgLuAJ4BO2wwwuBM0sptwcopdyfdvTEx4H70Y6SeMdc\nT1xK2RX4fLf902jr1YuBa2iHSU6NVHkw7VHA35plP/+HdojeB7rMrwBeSFsbRz0JuC1tPXwqcCTw\nqjkiPoh2OP9Lu+ffY64/D/Am2hp8AO0RqI8A7+pyTK1770ju/Wj//F+iHXHyqO75PltK2WWe55rN\nH3bPuT/wzS7DycC7aev8N4
EPl+2na7wYCODpwL60n0nO3sHn1k1d0zTevFVxA04F3jay/GngzSPL\ndwca4EmzPP6Q7v4HznL/Xt39D5+2/iLgDSPLDfCeBeR9ArCZthGDtqB9bWp5hu1/C/g5cOtu+Ta0\nxfPJczzHs4AtM6x/KO0b/3dpC8YtgN8EvkpbNN9H28CeBtxhIa+DN2/evHlb3K17r/3XkeX9gG8D\nZ3fLU++3r5v2uDdMbTOyrnSPfWm3/EHgi9O2edHo+/f093PgucB1wJ1myfvwbvu9Zshz0cjyvwM5\nbZuX0DahO3XLZwL/OW2bk4AvzfOabQGeNW3dwV2uO01bPnJkmyd36544su4J3bpbjPx9fHTavnfu\nau2Rc2S6BPj9aeumXtuXjqx7ULfuFSPr7t+tu3e3/A7gc0CZ53U4k/aL1d7/HXvr7+YRNVWhlLIH\nbaNx8sjq9wHPHhlq8YDu5+mz7OYA4Mqmac5dhkhfniHjb5VSvlBKuayU8jPaIYc7AXccef4zmtnH\n2H8KuIr2W0dov237GW0ztShN03yxaZqDmqa5a9M0R9MOJflL4NnAccDNgHvQfkP7ztn3JElaooO7\nYXbXAl+n/ZLsqdO2mV5THgQc0D3uZ11NuZq2Odi322Y/4P9Ne9xZ82Q5APivpmkuXeSfYbqpESKj\nPg/sAtxtZN1Xp23zA+AOS3zuUV8b+f2H3c//nGHd7t3PBwFPmPa6/pg2977smMVmeC/tUciLulMk\nnlgGcK67hmli/k2kQXgu7b/Xc9sRjL+0lnbYwCkL3E8zx31TDVSZtn6m2SV/PrpQSjmQdvjJHwOv\npB3eeBBtYzn6Bjzr8zdNs6WU8h7a4Y/voh32+L6maZbjhOK3Ah9qmuYr3XO8rnu+99MWV0nSeJwD\nHE17pGhj0zSbZ9jm59OW1wBn0B4hm+6q7mdh7po2mx15zEL2U2ZYP71+NSzvaTejk6Q0c6xbM/Lz\nA8CJM+zrxyuRoWmar5ZS9qY9LeGRtEfY/rCUclDTND/dwQy6ifKImgavtJOIPI/2BOv9p90+yPZJ\nRc7vfj5mll2dB9y2lPLAWe6fOldt3chz7w7suYCYDwc2NU3z+03TnNM0zX8D06/3ch7w6O7PM5t3\nA/crpRxDew7Bkq+hUko5FDgQ+INu1Rq2N5874fuAJI3TtU3TXNQ0zSWzNGkzOZf2qNUPuseO3qZq\n1TeAh0173PTl6c4D7lVmvx7ZVGO1dp79fAP4jWnrfp126ONizqmeLcN8z7+jzqWdcOXbM7yuV47p\nOW+kaWfRPLVpmmNpz5X7NW78ekp+QFMVHgfcBfibpmm+PnqjHUJwaCllr6ZpLqIdbvjXpZSnl1Lu\nVkp5UCnlJd1+Pkc7rv5jpZQjSil7l1IeVkp5HkDTNNfSDiN5VSnlfqWUA2inUF5IYb0QuH0p5bml\nncHxmcALpm3zp7RDKz5USnlgl+/JpZSHTG3QNM33gH+h/YbtzK7h22GllFvQnhPw3JEPCF8AXlRK\nuSfwcjyiJklD85e0zconSymP6GYYfHhpZxeemh34bcBDunX3KKU8gXZSj7l8hPbc5U+VUh7d1cFD\nSim/3d3/XdrRJY8v7eyQt55lP38MPLGUclz33EF7HtufL8MokO8AjyylrCulTC5xX9O9mbYp+mAp\n5cHdn/+RpZR3zDA52ViUUl5Z2lk+79UdWXsO7YQmS6r3ummyUVMNng+c0zUx032e9kjY1MWenw38\nDfBHtDMtnQrsDdA0TQP8H9pJSE6iba4+SDud/ZTn0J4X9kXa2bQ2ABvnC9g0zT/Szj71ZuAC4Cja\nIZCj21xAewL07bvcXwV+j/YNetQG2iNdG+Z73gU4ETi1aZrRi52+kfZbz3NpG+Bjl+F5JEnLpGma\ny4GH0M7iewptvfoQ7TUwN3bbnEd7rttRtHXnOOBl8+z3GtojN1+nrXHfBP4K2HXkeY/v9rWR
Wc6R\nbprm07T18uhuX28D/pq2vizVK2jPpfsOy3xdz6Zpvkk72dYtgM8A/0U7kmVX4CfL+Vxz+Cntl6Rf\nov17ewLtBCgXrtDzqyKl/ewqaShKKS+gHaa453zDZEopz6KdFWrZzjct7bV0vgM8omma+U5MlyTp\nJq2013f726Zp/mgFn/NM2pk2nzfftrrp8oiaNBClvRj3/rRH2f5yEecyrO1mr/rAMmT4Au25B5Ik\nabvXdbX218b5JKWU9d1slI+Yd2Pd5HlETRqIUsr7aIexfJb2WnDXLuAxt2T7VMc/a5rmh3Ntv4D9\n3Yl2mmKAS5umuW4p+5MkqXallLuyfRKu7y3TbMyzPdetaU+RALhqZPIYrUI2apIkSZI0MAs6ryUi\nXkY7WUNDe+Ljs4H30E4p+gvaCzU+PzN/MetOJEmSJEkLMu85ahGxJ+2scA/MzHvTThd7FO3sQ79K\ne3X1Xdk+654kSZIkaQkWOlPcBLBrRPwC2A24LDNPn7ozIr7MjS/uOxPHWUrS6lH6DlAR66MkrS7z\n1sh5G7XM/EFEvAX4Hu21l06f1qTdDHgG8JJZdnEDl1122UI2m9Xk5CSbNm1a0j5WSk1Zoa68Zh2P\nmrJCXXlXW9Z169YtU5rVY6n1EVbfv7OVYtbxqSmvWcejpqywsjVy3kYtIn4FOIL2osE/AT4eEU/P\nzA92m/w18IXM/PdZHr8eWA+QmUxOLu0i8xMTE0vex0qpKSvUldes41FTVqgrr1klSdJiLGTo46OB\n72TmjwAi4hTaq7p/MCJOoJ1C9PmzPTgzNwAbusVmqR1oTV13TVmhrrxmHY+askJdeVdbVo+oSZK0\nNAtp1L4HHBQRu9EOfTwEODcingc8FjgkM7eNMaMkSZIkrSrzzvqYmecAfw+cTzs1/xraI2Qn0V5o\n90sR8dWIeP04g0qSJEnSarGgWR8z8wTghB15rCRJkiRpcWy2JGlMmis2su30T8I5Z3L55utg513g\nwINZ85gjKbvv0Xc8SZK0AH3Vcxs1SRqD5oLz2HbSibB1C2zd2q687lo463S2felzrDnmOMp9Dug3\n5AibSkmSbqzPej7vOWqSpMVprtjYvqlfv3n7m/qUrVvh+s1sO+lEmis29hNwmuaC89j2xmPhrNPb\n4tM024vQG4+lueC8viNKkrTi+q7nNmqStMy2nf7J9pu3uWzdwrbPnrYygebQdxGSJGmo+q7nNmqS\ntNzOOfPGTc90W7fC2WeuRJo59V2EJEkarJ7ruY2aJC23665b2Habrx1vjoWoqKmUJGlF9VzPbdQk\nabntssvCttt51/HmWIiamkpJklZSz/XcRk2SltuBB8PatXNvs3YtHHTwSqSZW01NpSRJK6nnem6j\nJknLbM1jjoS181z9ZO0Eaw49YmUCzaWmplKSpBXUdz23UZOkZVZ234M1xxwHO+184yZo7VrYaef2\nuisDuD5Z30VIkqSh6rue26hJ0hiU+xzAmhPeCY94LOyyG5TS/nzEY1lzwjsHc7HrvouQJElD1mc9\nn+drVEnSjiq778Hapx0DTzuGyclJNm3a1HekGU0VoW2fPa2d3XHzte05aQcdzJpDj1j1TVpE3Bl4\nP3BHYBuwITPfERFvAH4H+FG36Wsy89PdY44HngtsBY7NzM+seHBJ0rLoq57bqEmSqmkqe7IFeEVm\nnh8RtwTOi4jPdve9LTPfMrpxROwHHAXcC1gH/GtE3CMz57kOgiRJ2zn0UZKkOWTmxsw8v/v9auCb\nwJ5zPOQI4KOZuTkzvwNcBDx4/EklSTclHlGTJGmBImIv4P7AOcDDgBdFxDOBc2mPul1J28SdPfKw\nS5mhsYuI9cB6gMxkcnJyyfkmJiaWZT8rwazjUVNWqCuvWcejpqywsnlt1CRJWoCIuAXwCeClmfnT\niHgX8IdA0/38c+A5QJnh4c30FZm5Adgwdf9yDDetadiq
WcejpqxQV16zjkdNWWF58q5bt25B29mo\nSZI0j4i4GW2T9qHMPAUgMy8fuf/dwD92i5cCdx55+J2Ay1YoqiTpJsJz1CRJmkNEFOA9wDcz860j\n60enw3wC8PXu908BR0XEzhGxN7Av8OWVyitJumnwiJokSXN7GPAM4IKI+Gq37jXAUyJif9phjZcA\nzwfIzG9ERAL/RTtj5Aud8VGStFg2apIkzSEzz2Lm884+Pcdj3gS8aWyhJEk3eQ59lCRJkqSBsVGT\nJEmSpIGxUZMkSZKkgbFRkyRJkqSBsVGTJEmSpIGxUZMkSZKkgbFRkyRJkqSBsVGTJEmSpIHxgter\nXHPFRrad/kk450wu33wd7LwLHHgwax5zJGX3PfqOJ0mSJK1KNmqrWHPBeWw76UTYugW2bm1XXnct\nnHU62770OdYccxzlPgf0G1KSJElahapo1Go66lNL1uaKjW2Tdv3mG9+5dSts3cq2k05kzQnvHEzu\nWl7b2tT2utaWV5IkaUcM/hy15oLz2PbGY+Gs09ujPU2z/ajPG4+lueC8viP+Uk1Zt53+yfZI2ly2\nbmHbZ09bmUDzqOm1rUltr2tteSVJknbUoBu1Gxz1mRqaN2XrVrh+M9tOOpHmio39BBxRU1YAzjnz\nxjmn27oVzj5zJdLMqbrXthK1va615ZUkSVqKQTdqNR31qSkrANddt7DtNl873hwLUN1rW4naXtfa\n8kqSJC3FoBu1mo76VJUVYJddFrbdzruON8dC1Pba1qK217W2vJIkSUuwoMlEIuJlwPOABrgAeDaw\nB/BR4LbA+cAzMvP6ZU1X0VGfqrICHHhwe57PXB98166Fgw5eqUSzq+21rUVtr2tteSVJkpZg3iNq\nEbEncCzwwMy8N7AWOAr4E+BtmbkvcCXw3GVPV9NRn5qyAmsecySsnadPXzvBmkOPWJlAc6nsta1G\nba9rbXklSZKWYKFDHyeAXSNiAtgN2Ag8Cvj77v6TgSOXPd2BB7dHdeYylKM+NWUFyu57sOaY42Cn\nnW+ce+1a2Gnn9jpqQ5juvLLXthq1va615ZUkSVqCeRu1zPwB8Bbge7QN2lXAecBPMnPqzP5LgT2X\nPVxFR31qyjql3OcA1pzwTnjEY2GX3aCU9ucjHtteP20gF7uu8bWtQW2va215JUmSlmLec9Qi4leA\nI4C9gZ8AHwcOm2HTZpbHrwfWA2Qmk5OTC083OcnmV72Zn/zZa2HLlhvO+LZ2AiYmuM0r38TO+91n\n4fscl5qyjpqchP3uAy/5fSYmJtiyZZ5Z9fpQ62vbmZiYWNy/+5VS2+taW95pBvvvYAY1ZZUk6aZq\nIZOJPBr4Tmb+CCAiTgEeCtwmIia6o2p3Ai6b6cGZuQHY0C02mzZtWlzCu+7Lmte/o51y++wz24kC\ndt4VDjqYNYcewdW778HVi93nuNSUdQaTk5Ms+u9npVT82vq6LqPa8o4Y9L+DaZYj67p165YpjSRJ\nq9NCGrXvAQdFxG7AtcAhwLnAvwFPop358WhgbBcvKrvvwdqnHQNPO2bwH3ZqylobX9vxqO11rS2v\nJEnSjljIOWrn0E4acj7t1PxraI+QvRp4eURcBNwOeM8Yc0qSJEnSqrGg66hl5gnACdNWXww8eNkT\nSZIkSdIqt9Dp+SVJkiRJK8RGTZIkSZIGxkZNkiRJkgbGRk2SJEmSBsZGTZIkSZIGxkZNkiRJkgbG\nRk2SJEmSBsZGTZIkSZIGxkZNkiRJkgbGRk2SJEmSBsZGTZIkSZIGxkZNkiRJkgbGRk2SJEmSBsZG\nTZIkSZIGZqLvAJIkDVlE3Bl4P3BHYBuwITPfERG3BT4G7AVcAkRmXhkRBXgH8HjgGuBZmXl+H9kl\nSfXyiJokSXPbArwiM38NOAh4YUTsBxwHnJGZ+wJndMsAhwH7drf1wLtWPrIkqXY2apIkzSEzN04d\nEcvMq4FvAnsCRwAn
d5udDBzZ/X4E8P7MbDLzbOA2EbHHCseWJFXORk2SpAWKiL2A+wPnAHfIzI3Q\nNnPA7t1mewLfH3nYpd06SZIWzHPUJElagIi4BfAJ4KWZ+dOImG3TMsO6Zob9racdGklmMjk5ueSM\nExMTy7KflWDW8agpK9SV16zjUVNWWNm8NmqSJM0jIm5G26R9KDNP6VZfHhF7ZObGbmjjFd36S4E7\njzz8TsBl0/eZmRuADd1is2nTpiXnnJycZDn2sxLMOh41ZYW68pp1PGrKCsuTd926dQvazkZNkqQ5\ndLM4vgf4Zma+deSuTwFHAyd2P08bWf+iiPgocCBw1dQQSUmSFspGTZKkuT0MeAZwQUR8tVv3GtoG\nLSPiucD3gCd3932admr+i2in53/2ysaVJN0U2KhJkjSHzDyLmc87Azhkhu0b4IVjDSVJuslz1kdJ\nkiRJGhgbNUmSJEkaGBs1SZIkSRoYGzVJkiRJGhgbNUmSJEkaGBs1SZIkSRoYGzVJkiRJGhgbNUmS\nJEkaGBs1SZIkSRoYGzVJkiRJGhgbNUmSJEkaGBs1SZIkSRoYGzVJkiRJGpiJ+TaIiHsCHxtZtQ/w\neuBM4CRgF2AL8ILM/PIYMkqSJEnSqjJvo5aZFwL7A0TEWuAHwKnAu4E3ZuY/R8TjgT8FDh5fVEmS\nJElaHRY79PEQ4NuZ+V2gAW7Vrb81cNlyBpMkSZKk1WreI2rTHAV8pPv9pcBnIuIttA3fQ2d6QESs\nB9YDZCaTk5M7GLU1MTGx5H2slJqyQl15zToeNWWFuvKaVZIkLcaCG7WI2Ak4HDi+W/W7wMsy8xMR\nEcB7gEdPf1xmbgA2dIvNpk2blhR4cnKSpe5jpdSUFerKa9bxqCkr1JV3tWVdt27dMqWRJGl1WszQ\nx8OA8zPz8m75aOCU7vePAw9ezmCSJEmStFotplF7CtuHPUJ7TtpvdL8/Cvif5QolSZIkSavZgoY+\nRsRuwKHA80dW/w7wjoiYAK6jOw9NkiRJkrQ0C2rUMvMa4HbT1p0FHDCOUJIkSZK0mi12en5JkiRJ\n0pjZqEmSJEnSwNioSZIkSdLA2KhJkiRJ0sDYqEmSJEnSwNioSZIkSdLA2KhJkiRJ0sDYqEmSJEnS\nwNioSZIkSdLA2KhJkiRJ0sDYqEmSJEnSwNioSZIkSdLA2KhJkiRJ0sDYqEmSJEnSwNioSZIkSdLA\n2KhJkiRJ0sDYqEmSJEnSwEz0HUCSpKGLiL8DfhO4IjPv3a17A/A7wI+6zV6TmZ/u7jseeC6wFTg2\nMz+z4qElSVWzUZMkaX7vA/4SeP+09W/LzLeMroiI/YCjgHsB64B/jYh7ZObWlQgqSbppcOijJEnz\nyMwvAP+7wM2PAD6amZsz8zvARcCDxxZOknST5BE1SZJ23Isi4pnAucArMvNKYE/g7JFtLu3WSZK0\nYDZqkiTtmHcBfwg03c8/B54DlBm2baaviIj1wHqAzGRycnLJgSYmJpZlPyvBrONRU1aoK69Zx6Om\nrLCyeW3UJEnaAZl5+dTvEfFu4B+7xUuBO49seifgshkevwHY0C02mzZtWnKmyclJlmM/K8Gs41FT\nVqgrr1nHo6assDx5161bt6DtPEdNkqQdEBF7jCw+Afh69/ungKMiYueI2BvYF/jySueTJNXNI2qS\nJM0jIj4CHAxMRsSlwAnAwRGxP+2wxkuA5wNk5jciIoH/ArYAL3TGR0nSYtmoSZI0j8x8ygyr3zPH\n9m8C3jS+RJKkmzqHPkqSJEnSwNioSZIkSdLA2KhJkiRJ0sDYqEmSJEnSwNioSZIkSdLA2KhJkiRJ\n0sDYqEmSJEnSwNioSZIkSdLA2KhJkiRJ0sBMzLdBRNwT+NjIqn2A12fm2yPixcCLgC3AP2Xmq8YT\nU5IkSZJWj3kbtcy8ENgfICLWAj8ATo2IRwJHAPfNzM0RsftYk0qSJEnSKrHYoY+HAN
/OzO8Cvwuc\nmJmbATLziuUOJ0mSJEmr0bxH1KY5CvhI9/s9gEdExJuA64Dfy8z/WM5wkiRJkrQaLbhRi4idgMOB\n40ce+yvAQcCDgIyIfTKzmfa49cB6gMxkcnJyaYEnJpa8j5VSU1aoK69Zx6OmrFBXXrNKkqTFWMwR\ntcOA8zPz8m75UuCUrjH7ckRsAyaBH40+KDM3ABu6xWbTpk1LCjw5OclS97FSasoKdeU163jUlBXq\nyrvasq5bt26Z0kiStDot5hy1p7B92CPAJ4FHAUTEPYCdgDo+hUiSJEnSgC2oUYuI3YBDgVNGVv8d\nsE9EfB34KHD09GGPkiRJkqTFW9DQx8y8BrjdtHXXA08fRyhJkiRJWs0WOz2/JEmSJGnMbNQkSZIk\naWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRp\nYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlg\nbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYCb6DiBJ0tBFxN8BvwlckZn3\n7tbdFvgYsBdwCRCZeWVEFOAdwOOBa4BnZeb5feSWJNXLI2qSJM3vfcDjpq07DjgjM/cFzuiWAQ4D\n9u1u64F3rVBGSdJNiI2aJEnzyMwvAP87bfURwMnd7ycDR46sf39mNpl5NnCbiNhjZZJKkm4qHPoo\nSdKOuUNmbgTIzI0RsXu3fk/g+yPbXdqt2zj64IhYT3vEjcxkcnJyyYEmJiaWZT8rwazjUVNWqCuv\nWcejpqywsnlt1CRJWl5lhnXN9BWZuQHYMHX/pk2blvzEk5OTLMd+VoJZx6OmrFBXXrOOR01ZYXny\nrlu3bkHbOfRRkqQdc/nUkMbu5xXd+kuBO49sdyfgshXOJkmqnEfUJEnaMZ8CjgZO7H6eNrL+RRHx\nUeBA4KqpIZKSJC2UjZokSfOIiI8ABwOTEXEpcAJtg5YR8Vzge8CTu80/TTs1/0W00/M/e8UDS5Kq\nZ6MmSdI8MvMps9x1yAzbNsALx5tIknRT5zlqkiRJkjQwNmqSJEmSNDA2apIkSZI0MDZqkiRJkjQw\nNmqSJEmSNDA2apIkSZI0MPNOzx8R9wQ+NrJqH+D1mfn27v7fA/4MuH1mbhpLSkmSJElaReZt1DLz\nQmB/gIhYC/wAOLVbvjNwKO2FPiVJkiRJy2CxQx8PAb6dmd/tlt8GvApoljWVJEmSJK1i8x5Rm+Yo\n4CMAEXE48IPM/FpEzPqAiFgPrAfITCYnJ3cwamtiYmLJ+1gpNWWFuvKadTxqygp15TWrJElajAU3\nahGxE3A4cHxE7Aa8FnjMfI/LzA3Ahm6x2bRpaaexTU5OstR9rJSaskJdec06HjVlhbryrras69at\nW6Y0kiStTosZ+ngYcH5mXg70pHpUAAAgAElEQVTcDdgb+FpEXALcCTg/Iu64/BElSZIkaXVZzNDH\np9ANe8zMC4Ddp+7omrUHOuujJEmSJC3dgo6odUMdDwVOGW8cSZIkSdKCjqhl5jXA7ea4f6/lCiRJ\nkiRJq91ip+eXJEmSJI2ZjZokSZIkDYyNmiRJkiQNjI2aJEmSJA2MjZokSZIkDYyNmiRJkiQNjI2a\nJEmSJA2MjZokSZIkDYyNmiRJkiQNjI2aJEmSJA2MjZokSZIkDYyNmiRJkiQNjI2aJEmSJA2MjZok\nSZIkDYyNmiRJkiQNjI2aJEmSJA2MjZokSZIkDYyNmiRJkiQNjI2aJEmSJA2MjZokSZIkDYyNmiRJ\nkiQNjI2aJEmSJA2MjZokSZIkDcxE3wEkSapZRFwCXA1sBbZk5gMj4rbAx4C9gEuAyMwr+8ooSaqP\nR9QkSVq6R2bm/pn5wG75OOCMzNwXOKNbliRpwW
zUJElafkcAJ3e/nwwc2WMWSVKFbNQkSVqaBjg9\nIs6LiPXdujtk5kaA7ufuvaWTJFXJc9QkSVqah2XmZRGxO/DZiPjWQh7UNXXrATKTycnJJQeZmJhY\nlv2sBLOOR01Zoa68Zh2PmrLCyua1UZMkaQky87Lu5xURcSrwYODyiNgjMzdGxB7AFTM8bgOwoVts\nNm3atOQsk5OTLMd+VoJZx6OmrFBXXrOOR01ZYXnyrlu3bkHbOfRRkqQdFBE3j4hbTv0OPAb4OvAp\n4Ohus6OB0/pJKEmqlY2aJEk77g7AWRHxNeDLwD9l5r8AJwKHRsT/AId2y5IkLZhDHyVJ2kGZeTFw\nvxnW/xg4ZOUTSZJuKjyiJkmSJEkDY6MmSZIkSQNjoyZJkiRJAzPvOWoRcU/gYyOr9gFeD+wJ/F/g\neuDbwLMz8yfjCClJkiRJq8m8R9Qy88LM3D8z9wcOAK4BTgU+C9w7M+8L/Ddw/FiTSpIkSdIqsdhZ\nHw8Bvp2Z3wW+O7L+bOBJy5ZKkiRJklaxxZ6jdhTwkRnWPwf456XHkSRJkiQt+IhaROwEHM60IY4R\n8VpgC/ChWR63HlgPkJlMTk7ucFiAiYmJJe9jpdSUFerKa9bxqCkr1JXXrJIkaTEWM/TxMOD8zLx8\nakVEHA38JnBIZjYzPSgzNwAbusVm06ZNO5oVgMnJSZa6j5VSU1aoK69Zx6OmrFBX3tWWdd26dcuU\nRpKk1WkxjdpTGBn2GBGPA14N/EZmXrPcwSRJkiRptVrQOWoRsRtwKHDKyOq/BG4JfDYivhoRJ40h\nnyRJkiStOgs6otYdMbvdtHV3H0siSZIkSVrlFjvroyRJkiRpzGzUJEmSJGlgbNQkSZIkaWBs1CRJ\nkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmS\nJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIk\naWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRpYGzUJEmSJGlgbNQkSZIkaWBs1CRJkiRp\nYCb6DiDdFDVXbGTb6Z+Ec87k8s3Xwc67wIEHs+YxR1J236PveJJuQmp6vzHreNSUFerKa9bxqClr\nn0rTNCv5fM1ll122pB1MTk6yadOmZYozXjVlhbryDjlrc8F5bDvpRNi6BbZu3X7H2rWwdoI1xxxH\nuc8B/QWcw5Bf15nUlHe1ZV23bh1AWZZAq8MO1cea3m/MOh41ZYW68pp1PGrKOpOVrJEOfZSWUXPF\nxvbN5/rNN3zzgXb5+s1sO+lEmis29hNQ0oqJiMdFxIURcVFEHLfc+6/p/cas41FTVqgrr1nHo6as\nQ2CjJi2jbad/sv2GaC5bt7Dts6etTCBJvYiItcBfAYcB+wFPiYj9lvM5anq/Met41JQV6spr1vGo\nKesQ2KhJy+mcM2/8DdF0W7fC2WeuRBpJ/XkwcFFmXpyZ1wMfBY5Y1meo6f3GrONRU1aoK69Zx6Om\nrAPgZCLScrruuoVtt/na8eaQ1Lc9ge+PLF8KHDi6QUSsB9YDZCaTk5OLeoLLNy/8/Wax+15uZh2P\nmrJCXXnNOh41ZZ3NxMTEimWzUZOW0y67wHULaMJ23nX8WST1aaaTxG8we1dmbgA2TN236JPTd174\n+03vE9mYdTxqygp15TXreNSUdRbLOJnIvBz6KC2nAw9uZy2ay9q1cNDBK5FGUn8uBe48snwnYGnT\nHk9X0/uNWcejpqxQV16zjkdNWQfARk1aRmsecySsnedA9doJ1hy6vKeqSBqc/wD2jYi9I2In4Cjg\nU8v5BDW935h1PGrKCnXlNet41JR1CGzUpGVUdt+DNcccBzvtfONvjNauhZ12bq8P4sUcpZu0zNwC\nvAj4DPDNdl
V+Yzmfo6b3G7OOR01Zoa68Zh2PmrIOgRe8HqOaskJdeYeetbliYzu17NlnthOH7Lwr\nHHQwaw49YtBvPkN/XaerKe9qy+oFrxdth+tjTe83Zh2PmrJCXXnNOh41ZZ1uJWvkvI1aRNwT+NjI\nqn2A1wPv79bvBVwCRGZeOc/z2agNWE15zToeNWWFuvKutqw2aou25PoIq+/f2Uox6/jUlNes41FT\nVljZGjnv0MfMvDAz98/M/YEDgGuAU4HjgDMyc1/gjG5ZkiRJkrREiz1H7RDg25n5XdoLd57crT8Z\nOHI5g0mSJEnSarXY66gdBXyk+/0OmbkRIDM3RsTuMz1gqRf0nG4lLzK3VDVlhbrymnU8asoKdeU1\nqyRJWowFN2rd9MKHA8cv5gmWfEHPaWoax1pTVqgrr1nHo6asUFfe1ZZ1oRfzlCRJM1vM0MfDgPMz\n8/Ju+fKI2AOg+3nFcoeTJEmSpNVoMY3aU9g+7BHaC3ce3f1+NHDacoWSJEmSpNVsQddRi4jdgO8D\n+2TmVd262wEJ3AX4HvDkzPzfeXa1ohdtkyT1yun5F876KEmry/w1smmaqm5PfvKTz+07w00xa215\nzWrW2vKa1Zt/d2Y1a915zWrWlc672On5JUmSJEljZqMmSZIkSQNTY6O2Yf5NBqOmrFBXXrOOR01Z\noa68ZtVKqOnvzqzjUVNWqCuvWcejpqywgnkXNJmIJEmSJGnl1HhETZIkSZJu0mzUJEmSJGlgbNQk\nSZIkaWAm+g6wUBGxFrgDI5kz83v9JZpdZVmfnJkfn29dnyLiAXPdn5nnr1QW9av7v3ViZr6y7yw3\nNTW8F2h2ldWdKrLW8H/C+qgp1sfx6uv9oIpGLSJeDJwAXA5s61Y3wH17CzWLmrJ2jgem/yObaV2f\n/rz7uQvwQOBrtFdzvy9wDvDwnnLNKiLuAbwLuENm3jsi7gscnpl/1HO0G6kpa2ZujYgDIqJk5uBn\nQupe21cCd+WGH0of1Vuo2dXwXqAZ1FR3aspKHf8nqquPUFfdqSWr9XHsenk/qKJRA14C3DMzf9x3\nkAWoImtEHAY8HtgzIt45ctetgC39pJpZZj4SICI+CqzPzAu65XsDv9dntjm8m/YN6G8AMvM/I+LD\nwKDe2Ds1ZQX4CnBaRHwc+PnUysw8pb9Is/o4cBLta7y15ywzqum9QLOqou50Bp+1pv8TldZHqKvu\n1JTV+rjM+n4/qKVR+z5wVd8hFqiWrJcB5wKHA+eNrL8aeFkvieb3q1NFCCAzvx4R+/cZaA67ZeaX\nI2J03aAK/IiasgLcFvgxMPqtWwMMsRBtycx39R1iHjW+F+iGaqk7UEfWGv9P1FQfoa66U1NW6+Py\n6/X9oJZG7WLgzIj4J2Dz1MrMfGt/kWZVRdbM/BrwtYj4cGb+ou88C/TNiPhb4IO0bzxPB77Zb6RZ\nbYqIu9HmJCKeBGzsN9KsaspKZj677wyL8A8R8QLgVG74fvC//UW6odH3AtqacJfMvLDnWFqcKupO\nZ/BZrY8roqa6U01W6+Py67tG1tKofa+77dTdhqymrAAPjog3sH2McAGazNyn11Qzezbwu7RDZwC+\nQDtufIheSHvl+l+NiB8A36EtnENUU9ZqzhfoHN39HD25uwGG+P/rccBbaN+39u6+jf+DzDy831ha\ngJrqTk1ZrY/jU1PdqSar9XGseqmRpWkGf77hL0XELWnfJH/Wd5b51JI1Ir5Fe+j2PEbGCA/5/IGa\nRMTNgTWZeXXfWeZTS9aI+Dzd+QKZef9u3dcz8979JqtbRJxHO1zmzJHX9T8zc4iTPGgGtdQdqCOr\n9XH8aqk7UEdW6+P49FUjqzii1p0U+wHasbdExCbgmZn5jV6DzaCmrJ2rMvOf+w6xEBGxL/DHwH60\nM1wBMMRvNyNiK/BnwPFTsy9FxPmZOedUyn2oKWunmvMFIuJmtN9y/3q36kza
AjrE4VRbMvOqaa+r\nKlBT3akpK9bHsamp7tSUFevjOPVSI6to1GgPOb88M/8NICIOpp0l5qF9hppFTVkB/i0i/oz2RNPR\nMcJDvPbKe2mndX4b8EjaoR6l10Sz+wbtBeVPj4jf7sZcm3V5VHO+AO0QlJsBf90tP6Nb97zeEs3u\n6xHxVGBt96HvWOCLPWfSwtRUd2rKan0cn5rqTk1ZrY/j00uNXDPuJ1gmN596UwfIzDOBm/cXZ041\nZQU4kPbaK2+mvR7Ln9OOwR2iXTPzDKBk5ncz8w3ccGajIdmSma+i/QDy7xFxAN0b5wDVlBXa8wX+\nhu3nC7yU9lu5IXpQZh6dmZ/rbs8GHtR3qFm8GLgX7QfSjwA/pX1tNXw11Z2aslofx6emulNTVuvj\n+PRSI2s5onZxRLyOdrgEtCdxfqfHPHOpKesvr8FSiesiYg3wPxHxIuAHwO49Z5pNAcjMjIhv0P6n\nvku/kWZVU1Yy82Lg0TWcLwBsjYi7Zea3ASJiHwZ6vZjMvAZ4bXdTXWqqO9VktT6OVU11p5qs1sfx\n6atG1tKoPQd4I+3wg0I7m9FQpyCtKSsRcQfabwvXZeZhEbEf8JDMfE/P0WbyUmA32sPNf0g7vOPo\nOR/Rn18eus/Mb0TEw4Eje8wzlyqyRsTTM/ODEfHyaeuBYU3vPeKVtMOnLqZ9P7grA3s/iIj3Mvu3\nw01mPncl82iH1FR3qslqfRyrKupOZ/BZrY/j03eNrKJRy8wrad98Bq+mrJ330Y5tn/qG4L+BjwGD\nKkQRsRaIzHwl8DMG+J8ZICIelZmfA+4aEXeddvegZjerKWtnt+7nLXtNsQiZeUY3lv2etIXoW5m5\neZ6HrbR/nGHdXWg/+K1d4SzaATXVnZqyYn1cdjXVnZqyYn0cp15r5KAbtYh4e2a+NCL+gRm62XFf\nu2Axaso6zWR3OP94gMzc0s1wNCiZuTUiDoiIMjXr0kD9BvA54P/OcF9D+y3yUNSUFeBu3c//ysyP\n95pkHlMFPiJ+a9pdd4sIMnMwr21mfmLq927oyWtoZ+E6kYF9INUN1VR3aso6wvq4/GqqOzVltT6O\nSd81ctCNGtvHsA/15N1RNWUd9fOIuB3bZwg6CLiq30iz+gpwWkR8HPj51Moh/afOzBO6n4P9RnNK\nTVk7j4+I3weOBwZdiKirwBMRv0Z71OD+tNNQH5OZg5zSWTdQU92pKesU6+Myq6nu1JQV6+NY9Vkj\nq7rgNUBE/Apw58z8z76zzKeGrBHxAOAvgHsDXwduDzxpiJm7ccLTNZn5nBUPM4+IuA3wTGAvRr4Q\nyczBDfmpJWs3TfZ62hnirhm5q9D+O7hVL8Eq132weyDtB+hk2snc3VTUqkQNdWfK0LNaH8enlroD\ndWS1Po5P3zWyikYtIs4EDqf9D/JV4EfA5zPz5XM9rg81ZZ0SERNsHyN8YQ73YoPViIgvAmcDFwDb\nptZn5sm9hZpFTVkBIuK0zDyi7xwLEREvoT3H5WraqZ0fAByXmaf3GmxERFzC9qFoDTe8PlCTA71g\nrrarqe7UlBWsj+NSU92pLKv1cZn1XSOHPvRxyq0z86cR8TzgvZl5QkQM7hutThVZ5xgjfI8hjhEG\niIh3zrD6KuDczDxtpfPMY5ehfvCYQU1ZqaUIdZ6Tme+IiMfSTpX9bNrCNJhClJl79Z1BS1ZF3ekM\nPqv1cUXUVHeqyWp9XH5918haGrWJiNgDCIZ/jZ9aslY3RhjYBfhVto+/fiLwDeC5EfHIzBzSxXk/\nEBG/Qztb0C9nMRroMLIqskbEWZn58Ii4mpm/1Rri0I6pjI+n/VD6tYgocz1gpXXDu2aVmeevVBbt\nsFrqDtSR1fo4flXUnc7gs1ofx6fvGllLo/YHwGeAszLzP7pZV/6n50yzqSJrZSfJTrk78KipEzgj\n4l2037wcSjskYUiupz3h9LXc8JD5EIeR
VZE1Mx/e/axm+mHgvIg4HdgbOD4ibsnI0JmB+PM57muA\nR61UEO2wKupOZ/BZrY8rooq60xl8VuvjWPVaI6to1LqpRj8+snwx7bdFg1NL1ukXRZwuh3lxxD1p\nT5SdmnXr5rQXIt0aEUO79sbLgbtn5qa+gyxATVmJiLsBl2bm5og4GLgv8P7M/Em/yWb0XGB/4OLM\nvCYibsvArnGUmY/sO4OWppa6A3VktT6uiJrqTjVZrY/Lr+8aWUWjFhF/CvwRcC3wL8D9gJdm5gd7\nDTaDirJOfetyT+BBwKe65f8LfKGXRPP7U+Cr3cnohfY6Fm+OiJsD/9pnsBl8gxvOvDRkNWUF+ATw\nwIi4O+01TD4FfJh2+MTQPAT4amb+PCKeTnuy9Dt6zjSriLg3sB/tMCoAMvP9/SXSQlRUd2rJan0c\nv5rqTk1ZrY9j1EeNrKJRAx6Tma+KiCcAlwJPBv4NGNIb+5QqsmbmGwG6w84PyMyru+U3MNBrcGTm\neyLi08CDaQvRazLzsu7uV/aXbEZbaYvmv3HDMe2Dmc53RE1ZAbZle+HZJwBvz8y/iIiv9B1qFu8C\n7hcR9wNeRVs43097DsygRMQJwMG0RejTwGHAWbR5NWxV1J3O4LNaH1dETXWnpqzWxzHpq0auGefO\nl9HNup+PBz4ypBM4Z1BTVoC70I6/nnI97bVChupBwCOAhwMH9JxlLp8E3gR8EThv5DZENWUF+EVE\nPAU4mvbkbtj+/25otmRmAxwBvCMz38H2b+uH5knAIcAPu3Nz7gfs3G8kLVBNdaemrNbH8amp7tSU\n1fo4Pr3UyFqOqP1DRHyLdqjECyLi9sB1PWeaTU1ZAT4AfDkiTqU9KfIJDPQb9Ig4kbYQfahbdWxE\nPDQzj+8x1oyGeH2V2dSUtfNs4BjgTZn5nYjYmwF9Gz/N1RFxPPAM4BERsZbhFs1rM3NbRGyJiFsB\nVzCgk+U1p5rqTk1ZrY9jUlPdqSkr1sdx6qVGVnHBa4CI+BXgp92JsTcHbpmZP+w710xqygoQEQfQ\nfgMH8IXMHORh8u5aO/tn5rZueS3wlcy8b7/JtouIzMyIiAvYPjvUL5l1eXX/1+6cmYO6DtOUiLgj\n8FTgPzLz3yPiLsDBQzzvKyL+GngNcBTwCuBntOcPDO7kbt1YTXWnsqzWx2VUU92pKetMrI/Lq68a\nWcURtYjYDXgh7TCE9cA62pN8/3Gux/WhpqxTMvO8iPg+3cmREXGXzPxez7FmcxtgaqjMrfsMMouX\ndD9/s9cUC1NT1l/qTpY/nPb966vAjyLi8znAC5Jm5g8j4hPAvt2qTcCpPUaaUbTXrvnjbmawkyLi\nX4BbDbXA64Zqqjs1ZQXr4xjUVHdqygpYH8elzxpZRaNGe6Xy84CHdsuX0p7QO8Q39pqyEhGH014j\nYh3tYdy7AN8C7tVnrln8MfCV7oTeqVmtXtNvpBvKzI3dry/IzFeP3hcRfwK8+saP6kdNWae5dWb+\nNCKeR3uRzBO6b5MHJ9qLpK4HbgvcjXYK7ZNox7kPRmY2EfFJuvNaMvOSfhNpkWqqO9VktT4uv5rq\nTk1ZR1gfx6DPGlnLZCJ3y8w/BX4BkJnXcsOrrg9JTVkB/hA4CPjvzNwbeDTw//qNNLPM/Aht1lO6\n20O6dUN06AzrDlvxFAtTU1aAiYjYAwgG+OFumhcCDwN+CpCZ/wPs3mui2Z0dEQ/qO4R2SE11p6as\n1sfxqanu1JTV+jg+vdTIWo6oXR8Ru9KNEY72gn5DvIAj1JUV4BeZ+eOIWBMRazLz37pvigYnIs7I\nzEPYfk2b0XWDEBG/C7wA2Gfat1i3ZGAFvqas0/wB8BngrMz8j4jYB/ifnjPNZnNmXh8RAETEBDOc\n6zAQjwSeHxHfBX5O++G5Gfp5GALqqjs1ZbU+LrOa6k5NWUdYH8enlxpZS6N2Au2FMe8cER+i7cCf\n1Wui
2dWUFeAnEXEL2ot4figirgC29JzpBiJiF2A3YLI7OXbq29db0Q5JGZIPA/9MOwzluJH1Vw9w\nGuqasv5SZn6ckWsZZebFwBP7SzSnz0fEa4BdI+JQ2qL/Dz1nms1QvyHW/GqqOzVltT4uv5rqTk1Z\nAevjmPVSIwffqHUn8H0L+C3aw/oFeElmbuo12AxqyjriCNppkl8GPI32BOQ/6DXRjT0feClt0Tl/\nZP1Pgb/qJdEsMvMq4CrgKRHxANrZwhrab98G9cZeU9ZR3QeT59KeJ7LL1PrMfE5voWZ3HG3WC2j/\nHX8a+NteE81uyN9kahY11Z2asnasj8usprpTU9Yp1sex6qVGVjE9f0Scl5lDv3gjUF3WtcBnMvPR\nfWdZiIh4cWb+Rd85FiIiXkc7RvyUbtWRwMcz84/6SzWzmrICRMTHaT/sPZX2Q9PTgG9m5kvmfOAK\n6/5/nZyZT+87y0KMTEFdaAv83sCFmTnEiRM0orK6U0VW6+N41VR3KstqfRyTvmpkLZOJ1HSSezVZ\nM3MrcE1EDHUa3+muiohnTr/1HWoWTwUelJknZOYJtN8eP63nTLOpKSvA3TPzdcDPs70Q6f8B7tNz\nphvp/n/dPiJ26jvLQmTmfTLzvt3PfYEHA2f1nUsLUk3doZKs1sexq6nu1JTV+jgmfdXIwQ997NR0\nkntNWQGuAy6IiM/S5gUgM4/tL9KsRov7LrRTuJ4PDPHiiJfQZryuW94Z+HZvaeZ2CfVkhW62ONrz\nR+4N/BDYq784c7oE+H8R8Slu+P/rrb0lWqDMPL+GD9QC6qo7NWW1Po7PJdRTdy6hnqzWxxWyUjWy\nlkatppPca8oK8E/dbfAy88Wjy903nR/oKc58NgPf6Ap8Qzu971kR8U4YXKGvKSvAhu6k+dfRznB2\nC+D1/Uaa1WXdbQ3tTGGDFRGjF0RdAzwA+FFPcbQ4NdWdmrJaH8enprpTU1br45j0VSNradSuXuC6\nIagpK92h8Vpdw/Yr2g/Nqd1typk95ViImrKSmVMnG38e2KfPLPPJzDf2nWERRgvlFtoPqJ/oKYsW\np6a6U01W6+NY1VR3qslqfRyrXmpkLY3a+cCdgStph0ncBtjYTZX7O5l5Xp/hpqkia0QcAdwpM/+q\nWz4HuH1396sy8+97CzeLiPgHts+6swbYj5FpaAfmY8DdafN+OzOvm2f7PlWRddq3WTcyxOES0/7N\nTrkKOBf4myG91lNFMyJunpk/n297DUoVdacz+KzWxxVRRd3pDD6r9XH8+qqRtTRq/wKcmpmfAYiI\nxwCPAxL4a+DAHrNNV0vWVwFHjSzvTDvG/ebAe4HBFSLgLSO/bwG+m5mX9hVmJt0FG98MPAf4Lm3B\nvFNEvBd4bWb+Yq7Hr6SasnYGPzRiBhfTfsD7SLf828DlwD2AdwPP6CnXjUTEQ4D30A6VuUtE3A94\nfma+oN9kWoBa6g7UkdX6OCY11Z2asmJ9HLu+amQtjdoDM/OYqYXMPD0i3pyZL4+InfsMNoNasu6U\nmd8fWT4rM38M/Dgibt5XqLlk5udHlyPiYRFxfGa+sK9MM/gz2jfMvTPzaoCIuBVtEX0LMKQpcmvK\nWuMwCYD7Z+avjyz/Q0R8ITN/PSK+0Vuqmb0deCzteQ1k5tci4tfnfogGopa6A3VktT6OT011p5qs\n1scV0UuNrKVR+9+IeDXw0W75t4Eru+swbOsv1oxqyforowuZ+aKRxdszUBGxP+1UuQF8h+3XNRmK\n3wTukZm/PJyfmT+NiN+lvbbJYN7YqSsrEfGnwMWZedK09S8D7piZr+4n2ZxuHxF3yczvAUTEXYDJ\n7r7r+4s1s8z8fkSMrtraVxb9//buPEizqj7j+LdBERxFQ1zYjaJCKjgojhtaxKHcC3fzUwQNLgii\nAiKoERNi4pIYEdAoBjC4odbPRHFBFDc0RnGBICgYISAOUUtZAogoiJ
0/zn2Zt3ve7umRue85z+3n\nU/VW9709UA9M93n63PfeczaISu+ARlb3Y3+Uekcmq/txOmp0pMpE7bnA0cBp3fHXu3ObUgaklqhk\n/VZEHJCZJ42fjIgDgW9XyjRRRNyfchvKPsBVlPvFZzJzddVgk82OD+ojmXlLRLS2u7xSViilueuE\n88cD5wMtFtGrKKuD/Q/leZx7Awd3V+VbW6hgTUTsAcx2e9scAlxUOZMtjUrvgEZW92N/lHpHKav7\nsX9VOlJiopaZVwKviIg7Zeav5n35khqZFiKU9ZXAaRHxXMrD3QAPptyL/7RqqSb7IfAfwJMz8xK4\n9SpRiy6MiOdn5py9ayJiP8p/R0uUskIpzXWuuGfm7yNipkag9cnMz0bE/YBdKEX0w7EHpI+rl2yi\ngyilvh1wBXAm0NptUzaBUO+oZHU/9kepd5Syuh/7V6UjJSZq3Qz2ZAQeclfJmpm/APaIiL2AP+tO\nn56ZX64YayHPpFwx/EpEfI5yy0yTAw/lh/bjEfFC4BzKikYPAbYAnl4z2ARKWQF+HRH3y8yLx092\nA/2NlTItKiLuCBwO3CszD4iI+0XEzpn5mdrZ5ut+gd63dg7bcCq9AxpZ3Y+9Uuodpazux57V6kiJ\niRpwLDoPuStlJTO/HBFfBe4J3K67R5jRPcMtyMxPAJ/o3g5/GuVq5z0j4gTK6mFnVg04JjP/F3jY\nWMHPAGdk5pfqJluXUsd0s6EAABa0SURBVNbO3wBnRMQbKaUJsAr4K+CwaqkWdwol6yO64ysoS2Y3\nU0QRsdhmqLOZ+fdTC2N/KKXekcnqftz4lHpHKSvux97U7shN+vyXb0zzVmCChh9yV8oaEa+gLIf6\nBcrmfafT2A/JSGbekJmnZubewPbAecBrK8eaqLvy+m7KMs4XR8SOo5JvjUrWzDyD8ovIauB93evR\nwDMz87PVgi1up8x8K3AzQGbeSHtXu2+Y8AJ4EW0+12ATKPWOSlb3Y39Uegc0srofe1W1I1XeUVN6\nyF0pK5RVi3bulh5uXrcy2D0p37tndK/mdAV/NKXkR/eNzwIrq4VagFLWzPw+8JcACzzj0pqbImIL\nuk09I2In4Ld1I82VmceMPo+IO1PGhBdQbqE6ZqF/zpqi1DtKWd2PPVHqHZWs7sd+1O5IlYma0kPu\nSlkB1lB2gm+eymDZUSp4paxqGzMfTdngd4eIOBV4JLB/1UQTRMRWlGcF9qWstrV7Zl5TN5VtAKXe\nUcrqfuyPUu/IZHU/9qNmR0pM1CY9wBftbjopk7VzKXBWRJzO2JWMzHx7vUgLkhksESp4tLKC0MbM\nmfmFiDgXeDjllo5DuzGiGRHxT8AzgBOBBwhchbV5lHpHKSvuxz4p9Y5SVvfjRla7I5ufqEXEdsA2\nwPmZeVNE3IPyYOT+wLY1s82nlHXMT7rXZt2rZUqDpVLBK2UFtDZm7n5xOh0gInaOiLdk5gGVY417\nFeXv/fXAUWP/X2coD0pvWSuYrZ9S7yhl7bgf+6PUO0pZ3Y8bX9WObHqiFhGHAUdR9la5Q0QcD7wd\n+ABlT5NmKGUdl5lvgFvvu51t/Gq60mCpVPBKWUHgGZeIWAm8jfLL52nAOykPoz+Mxp77ykyZRaVs\nLqXeUco64n7slVLvKGV1P25ktTuy6Yka8BLKW/lXdyvsXALsmZlnV841iVLWW0XErsAHga264yuB\n52fmD6oGm0xmsFQqeKWsHYVnXE4CTgC+CTyBsmnuh4F9xzb0NLutlHpHKSvgfuyTUu8oZcX9ODgz\ns7OztTMsKCLOzczdx46/n5m71sy0EKWs4yLiG8BRmfmV7vjRwJszc4+qwRahMFjOL3ig2YJXyqoi\nIs7LzAeOHa8B/iQzm70FxfQo9Y5S1hH3Y3+UekcpqwL344Zp/R217SPiHWPH9xg/zsxDKmRaiFLW\ncStGJQSQmWe1+mC32NXNE4HD5x
X8SUCLBa+UlXk/ZyPXAt/NzE9OO88CNo+IB7F2T5hfASsjYgYg\nM8+tlsyGRKl3lLKOuB/7o9Q7Mlndj8PT+kTtyHnH50z8U21Qyjru0oj4a8oAD7AfcFnFPIuRGSwR\nKni0sgJsDuwCfKw7fibwA+BFEbE6Mw+rlmytn1Gevxn5+djxLLDX1BPZECn1jlLWEfdjf5R6Rymr\n+3Fgmp6oZeb7x48jYkVm3rDQn69JKes8LwTeAHyccnXja5SN/FqkNFgqFbxSVoD7Antl5u8AIuIE\nyn34jwUuqBlsJDNX185gw6fUO0pZx7gf+6PUO0pZ3Y8D0/REbSSENvBTygqQZcO+Fm85mURpsFQq\neKWsUB6SXsHapahXANtm5i0R8duF/7Hpi4iXAadm5v91x38E7JOZ766bzIZEqXeUsrofe6XUO0pZ\n3Y8DIzFRQ2gDP0SyRsRxmXlYRHya8lbzHJn5lAqx1kdmsFQqeKWsnbcC50XEWZTvgz2BN3dXj79Y\nM9gEB2Tmu0YHmXlNRBxAWYrYbGOR6J1O81ndj/1T6h2lrLgfB0dloqa2gZ9C1tFVt7dVTbEBFAZL\npYJXyjouM98bEZ8FHkopotdl5k+7L89/Dqa2TSJiJjNnASJiUwSWzjY9Ir0DSGR1P/ZEqXeUso64\nH4dHZaLW/AZ+YySyZuY53cevjs51bzvvkJnnVws2gdhgqVTwSlmJiN3nnVrTfdw6IrZudKWozwMZ\nEe+hfO8eBHyubiQbIIne6TSf1f3YK6XekcnqfhwulYmawgZ+I0pZ6d4efwrle+E84JcR8dXMPLxq\nsLlkBkulglfK2jmm+7g5sAr4HuWK4UrgW8CjKuVazGuAA4GXUrKeCZxcNZENkVLvyGR1P258Sr2j\nlBX342A1veG19S8i/iszHxQRL6YMPkdHxPmZubJ2tsU0PFgCkwseaK3gAa2sABHxUeBNmXlBd7wr\ncERm7l81mJkNivuxP0q9I5bV/TgwEu+oiWzgB2hl7dwuIrYBAjiqdpjFiFzdHLlLZl7XFfwpo4Kv\nHWoBSlkBdhmVEEBmfj8iHlgz0HwRkZkZEXEBk29HavoXPdOi1DtKWXE/9kmpd5Syuh8HZpPaAZZo\nc+CBwMXdayWwFWUDv+NqBptAKSvA31HuE74kM78TEfeh5G7RXTLzOuAZlMHywcBjKmdayHjBf6Z2\nmPVQygpwUUScHBGPjog/j4iTaOwZF+DQ7uPewJMnvMw2JqXeUcrqfuyPUu8oZXU/DozEO2oIbOA3\nRikrmfkx1u5gT2ZeStnJvkUyVzdZW/BfFyh4paxQlpx+KWsH+68BJ9SLs67M/Fn36cGZ+Zrxr0XE\nP1LuzTfbWJR6Ryar+7FXSr2jlNX9ODAqEzWZDfzQykpEvBV4I3AjZbWd3YDDMvNDVYNNJjNYKhW8\nUlaAzPwNcGz3at1jWbd0njjhnNltodQ7Mlndj/1R6h2xrO7HgVGZqClt4KeUFeBxmfnqiHg6ZQWu\nvwC+AjRXREqDpVLBq2RVuq89Il4KHAzsNO9ZhjsD/1knlQ2YUu8oZXU/9kSld0Ajq/txuCQmakob\n+Cll7dy++/gk4COZeXXM3Yi0GQqD5RiZgkcn6/h97a37MHAG8BbgtWPnr8/Mq+tEsqFS6h2lrLgf\n+6TSO6CR1f04UE0vJhIRu49ewDaUDfx+QtnAb/7mflUpZZ3n0xHxQ8q+G1+KiLsDv6mcaSGP6x6W\n3psyWN6f9op9ZJ2CrxlmPSSyju5rz8zLx1+U74Wm9ojJzGsz88fA64GfdznvDewXEXetGs4GQ6l3\nlLKOcT/2R6J3Os1ndT8OV+vvqClt4KeU9VaZ+dru4c3rumcEfg08tXauBchc3WRtwd8IHNx4wUtk\njYgtKRvjbgd8CvgC8HLgCMpy1KfWS7egfwdWRcR9gfdScn+Y8j1sdlsp9Y5SVsD92DOJ3uk0n9X9
\nOFxNv6OWmaszczVwObB7Zq7qlpx9EHBJ3XRzKWUdFxF3pPxwj1YF2pZSoi2SubqZma8FHgGsysyb\ngWYLXijrB4GdKSvDvZiyUtyzgKdmZot5AX7frW73DOC4zHwl5d0Es9tMqXeUso64H/sj1DsqWd2P\nA9X0RG3MOhv4UfZhaZFSVoBTgJuAPbrjKyj3uTdHZLAEtApeKOt9MnP/zPwXYB9Kxr0z87zKuRZz\nc0TsAzyftfvv3H6RP2/2h1DqHaWs7seeCPWOSlb340C1fuvjyEURcTLlwc1ZYD/a28BvRCkrwE6Z\n+ezuh4XMvDEiZmqHmmRssNwReAllsNyZNjegPAU4h7kF/zGc9ba4efRJdxvSZZl5fc1AS/AC4CDg\nTZl5WUTcm7YeQLdhUOodpazux/6o9A5oZHU/DpTKRK35DfzGKGUFuCkitqBbzjUidgKa2stmjMJg\nOSJT8Ohk3S0irus+nwG26I5ngNnM3LJetMky80LgkLHjy4B/qJfIBkqpd5Syuh/7o9I7oJHV/ThQ\nEhO1FNrATylr52jKUr47RMSpwCOB/asmWpjCYDmiVPASWTNz09oZlkppTxvTp9Q7SllxP/ZJonc6\nzWd1Pw5X0xM1pb9Mpawj3SD+Q8qDnA+nXHk5NDOvrBpsYc0PlmOUCl4pqwqlPW1MlFLvKGUF9+MU\nKPWOUlYF7scN0PREDa2/TKWsAGTmbESc1q28dXrtPEsgMVgqFbxSViXje9rUzmKDptQ7Slndjz1S\n6h2lrCrcjxtmZnZ2nQtbzYuITYHnZGaL+0LM0XrWiHgX8L7M/E7tLIvpBsvtKStZjQbLs1sdLCPi\nnK7gm6eUVU1EXM+67x5cC3wXeFVmXjr9VDZ0rffOuJazuh/7o9Q7SlmVuB+Xpul31JQ28FPKOs9q\n4MCIuBy4gbUPnjZ1G4rg1c2zI+IhrRd8RymrmrcDP6Vs4jkDPAfYGvhv4F+BR1dLZvKUekcp6xj3\nY3+UekcpqxL34xI0PVGjbOB3DfBNygZ+RwKbUTbwa21vCKWs455YO8AGUBosJQq+o5RVzRMy82Fj\nxydGxNmZ+XcR8bpqqWwolHpHKeuI+7E/Sr2jlFWJ+3EJWp+o3SczHwDQ7btyJbBjo3tDKGUdNylf\nq5mVBkulglfKqub3ERHAv3XHzxr7mt5959Yapd5RyjrifuyPUu8oZVXiflyCTWoHWI85G/gBLW/g\np5R13LnAL4EfARd3n18WEedGRGv3ZD8R2AnYC3gy5aH0J1dNtLDrJ7x+WjXRwpSyqtkXeB7wi+71\nPGC/bnW2l9cMZoOg1DtKWUfcj/1R6h2lrErcj0vQ9GIiEXEL5coQdBv4UR6WbW4DP6Ws4yLiPcAn\nMvPz3fHjgCcACRw/723pqiJiqwmnr8/MmyecryoifgzsQLnVZwa4K/AzymB0QGaeUy/dXEpZzWwt\npd5RyjrifuyPUu8oZbXhafrWR6UN/JSyzrMqMw8aHWTmmRHx5sw8PCLuUDPYBOcyYbCMiBYHy8+x\ncMG/G2im4NHKKiUitgfeSVkqexb4OmVp5yuqBrNBUOodpaxj3I/9Ueodpawy3I9L0/qtj9a/qyPi\nNRFxr+71auCabsnk39cON8/ngCdl5t0y848pt3okcDBlsGzJqtGgDqXggT0z82ygtYJXyqrmFMoK\nd9tSVrv7dHfOzNrnfuyPUu8oZVXiflyCpt9Rs6l4LmWjzNO646935zYFolaoBShd3bw6Il4DfLQ7\nfjbtFrxSVjV3z8zx4nlfRBxWLY2ZbQj3Y3+UekcpqxL34xJ4orbMdRtiviIi7pSZv5r35UtqZFqE\n0mCpVPBKWdVcGRH7AR/pjvcBrqqYx8yWyP3YK6XeUcqqxP24BJ6oLXMRsQdwMnAnYMeI2A04MDMP\nrptsIpnBUqnglbIKeiHwz8CxlHvwvwG8oGoiM1sS92N/lHpH
KasY9+MSeKJmxwKPp9wnTGZ+LyL2\nrBtpMqXBUqnglbKqycyfAE8ZP9fd2nFcnURmtgHcjz1R6h2lrErcj0vjxUSMzFwz79QtVYKsR0Ts\nEREXAhd2x7tFRIsPScPagr8KSsEDTRY8WlmH4PDaAcxsadyPvVHqHaWs6tyP83iiZmu6q0WzEbFZ\nRBwBXFQ71AKkBkuVggetrAMwUzuAmS2J+7FHSr2jlFWc+3Ee3/poBwHHU5ZGvQI4E3hZ1USLyMw1\nEXNut291sJxT8MAhtFvwSlmHYLZ2ADNbEvdjf5R6RymrOvfjPJ6oLXPdfe37jp+LiBWV4qyP0mCp\nVPBKWSVExPVMLpwZYIspxzGzP4D7sVdKvaOUtXnuxw3jidoyFhHbAdsA52fmTRFxD+AwYH/KBoSt\nkRkslQpeKauKzLxz7Qxm9odzP/ZLqXeUsipwP24YT9SWqW5lnaMoq0HdISKOB94OfAB4cM1sC1EZ\nLJUKXimrmdk0uB/7pdQ7SlltmDxRW75eAuycmVdHxI6UQtozM8+unGsilcFSqeCVspqZTZH7sSdK\nvaOU1YbLE7Xl6zeZeTWUvSwi4kcNl5DSYKlU8EpZzcymxf3YH6XeUcpqA+WJ2vK1fUS8Y+z4HuPH\nmXlIhUwLURosZQoeraxmZtPifuyPUu8oZbWB8kRt+Tpy3vE5VVIsjdJgqVTwSlnNzKbF/dgfpd5R\nymoD5YnaMpWZ7x8/jogVmXlDrTzroTRYKhW8UlYzs6lwP/ZKqXeUstpAeaK2zEXEI4D3AncCdoyI\n3YADM/PgusnmkBkslQpeKauZ2bS5Hzc+pd5RymrD5YmaHQc8HvgUQGZ+LyL2rBtpLsXBUqTgAa2s\nZmZT5H7siVLvKGW14dmkdgCrLzPXzDt1S5Ug6xERj4iIC4GLuuPdIuLdlWMtZFTwV0EpeKCpgh+j\nlNXMbGrcj71R6h2lrDYwnqjZmojYA5iNiM0i4gi6gb5BUoOlSsGDVlYzsylxP/ZIqXeUstqw+NZH\nOwg4HtgOuAI4E3hZ1USLyMw1ETF+qtXBck7BA4fQbsErZTUzmxb3Y3+Uekcpqw2MJ2rLXGZeCexb\nO8cSKQ2WSgWvlNXMbCrcj71S6h2lrDYwM7Ozs7UzWEXzlvUduRb4bmZ+ctp5FhMRd6MMlo8BZiiD\n5aGZeVXVYGZmNjjuRzOrzRO1ZS4iTgR2AT7WnXom8ANgB+DSzDysVjZlYgUvk9XMbFrcj/1R6h2l\nrDY8vvXR7gvslZm/A4iIEyhX4h4LXFAz2Hxig+XmTC74F0XE6sYKXimrmdm0uB/7o9Q7SlltYLzq\no20HrBg7XgFsm5m3AL+tE2lBmwMPBC7uXiuBrSiD5XE1g00wKvh3ZuY7Kbej/CnwdOBxVZOtSymr\nmdm0uB/7o9Q7SlltYPyOmr0VOC8izqLc174n8OaIWAF8sWawCWSubrK24K/tjm8t+IhoreCVspqZ\nTYv7sT9KvaOU1QbGE7VlLjPfGxGfBR5KKaLXZeZPuy8fWS/ZREqDpVLBK2U1M5sK92OvlHpHKasN\njBcTWaYiYvfFvp6Z504ry1JFxIuA1wNnMTZYAh8B/jYzmyrOiNiGtQX/7bGCb45SVjOzPrkfp0Op\nd5Sy2rB4orZMRcRXuk83B1YB36MMQCuBb2Xmo2plW0zrg6VSwStlNTObFvdjf5R6RymrDZdvfVym\nMnM1QER8FHhJZl7QHe8KHFEz23wTBss13cetI2LrxgbLY7qPEwseaKnglbKamU2F+7FXSr2jlNUG\nyqs+2i6jEgLIzO9TVo5qyTHd612UwfFE4KTu80lLEleTmau7kr8c2D0zV2Xmg4EHAZfUTTeXUlYz\nswrcjxuZUu8oZbXh8jtqdlFEnAx8CJgF9gMuqhtpLqWrm2PWKfiIaK3gR5SymplNi/uxP0q9o5TV\nBsYTNXsB8FLg0O74a8AJ
9eIsSmmwbL7gxyhlNTObFvdjf5R6RymrDYwnastcZv4GOLZ7tU5psFQq\neKWsZmZT4X7slVLvKGW1gfGqj8tURGRmRkRcQBnU58jMlRViLSoiNqcMlnt2p74GnNCVqZmZ2W3m\nfjSzVvgdteVrdGVo76opNoDC1U2lglfKamY2Re7Hnij1jlJWGy5P1JapzPxZ9/Hy8fMRsSnwHMoq\nR00QGyyVCl4pq5nZVLgfe6XUO0pZbaA8UVumImJL4GXAdsCngC8AL6esEnUecGq9dOuQGSyVCl4p\nq5nZtLgf+6PUO0pZbbg8UVu+PghcA3wTeDFwJLAZ8NTMPK9msPmUBkulglfKamY2Re7Hnij1jlJW\nGy5P1Jav+2TmAwC6laKuBHbMzOvrxlqX2GApU/BoZTUzmxb3Y3+Uekcpqw2UJ2rL182jTzLzloi4\nrMUS6igNljIFj1ZWM7NpcT/2R6l3lLLaQHmitnztFhHXdZ/PAFt0xzPAbGZuWS/aOpQGS6WCV8pq\nZjYt7sf+KPWOUlYbKO+jZs2LiHMzc/eFjlsSEbcAN3SHM8AWwK9psOCVspqZ2bqU+hG0ekcpqw2X\nJ2rWPA+WZmZm63I/mg2bJ2pmZmZmZmaN2aR2ADMzMzMzM5vLEzUzMzMzM7PGeKJmZmZmZmbWGE/U\nzMzMzMzMGuOJmpmZmZmZWWP+H7TqyQc2qdM+AAAAAElFTkSuQmCC\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from matplotlib import pyplot as plt\n", "plt.style.use('ggplot')\n", "%matplotlib inline\n", "# Or notebook for interaction.\n", "\n", "names, acc, times = [], [], []\n", "for model in models:\n", " import time\n", " t = time.process_time()\n", " y_pred = model.predict(X_test)\n", " times.append((time.process_time()-t) * 1000)\n", " acc.append(accuracy(y_pred, y_test) * 100)\n", " names.append(type(model).__name__)\n", "\n", "plt.figure(figsize=(15,5))\n", "plt.subplot(121)\n", "plt.plot(acc, '.', markersize=20)\n", "plt.title('Accuracy [%]')\n", "plt.xticks(range(len(names)), names, rotation=90)\n", "\n", "plt.subplot(122)\n", "plt.plot(times, '.', markersize=20)\n", "plt.title('Prediction time [ms]')\n", "plt.xticks(range(len(names)), names, rotation=90)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 12 Is Python slow ?\n", "\n", "That is one of the most heard complain about Python. Because [CPython](https://en.wikipedia.org/wiki/CPython), the Python reference implementation, [interprets the language](https://en.wikipedia.org/wiki/Interpreted_language) (i.e. 
it compiles Python code to intermediate bytecode which is then interpreted by a virtual machine), it is inherently slower than [compiled languages](https://en.wikipedia.org/wiki/Compiled_language), especially for computation-heavy tasks such as number crunching.\n", "There are three ways around it:\n", "1. Use specialized libraries which provide efficient compiled implementations of the heavy computations. NumPy is one example: it uses efficient BLAS and LAPACK implementations as a backend. SciPy and scikit-learn fall in the same category.\n", "1. Compile Python to machine code. [Numba](http://numba.pydata.org) is a [just-in-time (JIT)](https://en.wikipedia.org/wiki/Just-in-time_compilation) compiler for Python, using the [LLVM](http://llvm.org) compiler infrastructure. [Cython](http://cython.org), which benefits from static type declarations, [transpiles](https://en.wikipedia.org/wiki/Source-to-source_compiler) Python to C then compiles the generated C code. While these two approaches offer maximal compatibility with the CPython and NumPy ecosystems, another approach is to use another Python implementation such as [PyPy](http://pypy.org), which features a just-in-time compiler and supports multiple back-ends (C, CLI, JVM). Alternatives are [Jython](http://www.jython.org), which runs Python on the Java platform, and [IronPython](http://ironpython.net) / [PythonNet](http://pythonnet.github.io) for the .NET platform.\n", "1. Call compiled code from Python.\n", "\n", "Let's compare the execution time of the function which computes the accuracy of a classifier. We test seven implementations: a pure Python loop, the implementations provided by NumPy and scikit-learn (first option above), versions of the Python loop compiled by Numba and Cython (second option above), and compiled C and Fortran (third option above)." 
] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [], "source": [ "def accuracy_numpy(y_pred, y_true):\n", "    \"\"\"Using NumPy, implemented in C.\"\"\"\n", "    return accuracy(y_pred, y_true)\n", "\n", "def accuracy_sklearn(y_pred, y_true):\n", "    \"\"\"Using scikit-learn.\"\"\"\n", "    return metrics.accuracy_score(y_true, y_pred)\n", "\n", "def accuracy_python(y_pred, y_true):\n", "    \"\"\"Plain Python implementation.\"\"\"\n", "    num_total = 0\n", "    num_correct = 0\n", "    for y_pred_i, y_true_i in zip(y_pred, y_true):\n", "        num_total += 1\n", "        if y_pred_i == y_true_i:\n", "            num_correct += 1\n", "    return num_correct / num_total" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 12.1 Compiled Python" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [], "source": [ "from numba import jit\n", "\n", "# The @jit decorator is equivalent to accuracy_numba = jit(accuracy_python).\n", "@jit\n", "def accuracy_numba(y_pred, y_true):\n", "    \"\"\"Plain Python implementation, compiled by LLVM through Numba.\"\"\"\n", "    num_total = 0\n", "    num_correct = 0\n", "    for y_pred_i, y_true_i in zip(y_pred, y_true):\n", "        num_total += 1\n", "        if y_pred_i == y_true_i:\n", "            num_correct += 1\n", "    return num_correct / num_total" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "%load_ext Cython" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [], "source": [ "%%cython\n", "cimport numpy as np\n", "cimport cython\n", "\n", "@cython.boundscheck(False)  # Turn off bounds-checking for entire function.\n", "@cython.wraparound(False)  # Turn off negative index wrapping for entire function.\n", "def accuracy_cython(np.ndarray[long, ndim=1] y_pred, np.ndarray[long, ndim=1] y_true):\n", "    cdef int num_total = 0\n", "    cdef int num_correct = 0\n", "    cdef int n = y_pred.size\n", "    cdef int i\n", "    for i in range(n):\n", "        num_total += 1\n", "        if y_pred[i] == y_true[i]:\n", "            num_correct += 1\n", "    # Cast to double so that the two C ints are never divided as integers.\n", "    return <double> num_correct / num_total" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "### 12.2 Using C from Python" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting function.c\n" ] } ], "source": [ "%%file function.c\n", "\n", "double accuracy(long* y_pred, long* y_true, int n) {\n", "    int num_total = 0;\n", "    int num_correct = 0;\n", "\n", "    for(int i = 0; i < n; i++) {\n", "        num_total++;\n", "        if(y_pred[i] == y_true[i])\n", "            num_correct++;\n", "    }\n", "\n", "    return (double) num_correct / num_total;\n", "}" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "libfunction.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=cc99fc70e60c3c31e300acbd6840b1f6de8589bb, not stripped\n" ] } ], "source": [ "%%script sh\n", "FILE=function\n", "gcc -c -O3 -Wall -std=c11 -pedantic -fPIC -o $FILE.o $FILE.c\n", "gcc -o lib$FILE.so -shared $FILE.o\n", "file lib$FILE.so" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [], "source": [ "import ctypes\n", "\n", "libfunction = np.ctypeslib.load_library('libfunction', './')\n", "libfunction.accuracy.restype = ctypes.c_double\n", "# The array arguments use the C long type, matching the long* parameters of accuracy().\n", "libfunction.accuracy.argtypes = [\n", "    np.ctypeslib.ndpointer(dtype=ctypes.c_long),\n", "    np.ctypeslib.ndpointer(dtype=ctypes.c_long),\n", "    ctypes.c_int\n", "]\n", "\n", "def accuracy_c(y_pred, y_true):\n", "    n = y_pred.size\n", "    return libfunction.accuracy(y_pred, y_true, n)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 12.3 Using Fortran from Python" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Overwriting function.f\n" ] } ], "source": [ "%%file function.f\n", "\n", "      SUBROUTINE DACCURACY(YPRED, YTRUE, ACC, N)\n", "\n", "CF2PY INTENT(OUT) :: ACC\n", "CF2PY INTENT(HIDE) :: N\n", 
"      INTEGER*4 YPRED(N)\n", "      INTEGER*4 YTRUE(N)\n", "      DOUBLE PRECISION ACC\n", "      INTEGER N, NTOTAL, NCORRECT\n", "\n", "      NTOTAL = 0\n", "      NCORRECT = 0\n", "\n", "      DO 10 J = 1, N\n", "          NTOTAL = NTOTAL + 1\n", "          IF (YPRED(J) == YTRUE(J)) THEN\n", "              NCORRECT = NCORRECT + 1\n", "          END IF\n", " 10   CONTINUE\n", "\n", "      ACC = DBLE(NCORRECT) / NTOTAL\n", "      END" ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [], "source": [ "!f2py -c -m function function.f >> /dev/null" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [], "source": [ "import function\n", "def accuracy_fortran(y_pred, y_true):\n", "    return function.daccuracy(y_pred, y_true)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 12.4 Measurements\n", "\n", "It turns out that the versions compiled by Numba and Cython are about as fast as C, although they are much easier to write. In this case, they are even faster than Fortran, NumPy and scikit-learn. This gives us the best of both worlds: an interpreted language for rapid development, which can then be compiled for efficient execution in production." ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2.16 s ± 147 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n", "1.13 s ± 110 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n", "35.1 ms ± 2.07 ms per loop (mean ± std. dev. 
of 7 runs, 100 loops each)\n" ] } ], "source": [ "n = int(1e7)\n", "y_pred = np.random.randint(2, size=n)\n", "y_true = np.random.randint(2, size=n)\n", "\n", "%timeit accuracy_python(y_pred, y_true)\n", "%timeit accuracy_sklearn(y_pred, y_true)\n", "%timeit accuracy_fortran(y_pred, y_true)\n", "%timeit accuracy_numpy(y_pred, y_true)\n", "%timeit accuracy_numba(y_pred, y_true)\n", "%timeit accuracy_cython(y_pred, y_true)\n", "%timeit accuracy_c(y_pred, y_true)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Todo\n", "\n", "* Introduce Data Science\n", "* Show some SQL ORM\n", "* High Performance Computing (HPC)\n", "    * multiprocessing, IPython Parallel, MPI, OpenMP, OpenCL, CUDA\n", "* Big Data\n", "    * MapReduce, Cluster Computing (Hadoop, Spark)\n", "* Graphs and Networks\n", "    * Tools: networkx, graph-tool, gephi, graphviz, pygsp" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.3" } }, "nbformat": 4, "nbformat_minor": 1 }