{ "metadata": { "name": "", "signature": "sha256:8f94d037b8b415d618faa18cc783a5bdf6c8d4830a454dd9ee2686367f9d24b9" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\"Blaze\n", "\n", "# Getting started with `into`\n", "\n", "* Full tutorial available at http://github.com/ContinuumIO/blaze-tutorial\n", "* Install software with `conda install -c blaze blaze`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Datatypes for Performance tuning\n", "\n", "* *Do we store these as integers or floats? `int32` or `int64`? `int8`?*\n", "* *Do we store times as datetimes or as strings?*\n", "* *Do we store these strings as variable length or fixed length*?\n", "* *Do we know how large this array will be?*\n", "\n", "As we encode values as bits we make choices; those choices can affect performance. We encode how to convert values to bits and back as a *datatype*. You've seen data types before in many forms including C types like `long`, `double` and `double[100]`, numpy dtypes like `i4` and `f8` or Python types like `int`, and `float`. Other systems like SQL, HDF5, etc. have similar datatype systems with different names.\n", "\n", "To manage datatypes across different systems we use `datashape` a datatype system that maps cleanly on to all systems with which `into` interacts. This one system can translate into any of the others.\n", "\n", "In this section we'll talk about the following\n", "\n", "1. How to discover the datatype of your data, no matter how it is stored\n", "2. Minor tweaks you can do to that datatype to improve performance in certain storage systems\n", "3. How to create new datasets easily" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.1 DataShape and `discover`\n", "\n", "We introduce datashape, an all-encompassing datatype language, and `discover`, a function that does all of the work for you.\n", "\n", "The discover function returns the datashape of an object. Lets look at a few examples." ] }, { "cell_type": "code", "collapsed": false, "input": [ "from into import discover, into, resource\n", "discover(1)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "ctype(\"int64\")" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "discover([1, 2, 3])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "dshape(\"3 * int64\")" ] } ], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "discover([[1, 2, 3],\n", " [4, 5, 6]])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "dshape(\"2 * 3 * int64\")" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "discover([{'x': 1, 'y': 1.0},\n", " {'x': 2, 'y': 2.0},\n", " {'x': 3, 'y': 3.0}])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "dshape(\"3 * {x: int64, y: float64}\")" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "import pandas as pd\n", "df = pd.DataFrame([['Alice', 100],\n", " ['Bob', 200],\n", " ['Charlie', 300]], columns=['name', 'balance'])\n", "discover(df)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "dshape(\"3 * {name: string, balance: int64}\")" ] } ], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": [ "By looking closely at these examples we see the structure of datashape. Elements have types like `int64` or `string`. Records/structs/groups of elements have record dtypes like `{x: int64, y: float64}`. Lengths of collections are encoded by numbers like `3 * ` for \"three of\" or `2 * 3 * ` for \"a two-by-three grid of\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Construct data with the following datashapes. Use any container type (e.g. `list`, `pd.DataFrame`, `np.ndarray`).\n", "\n", " 2 * int64\n", " 2 * string\n", " {name: string, id: int}\n", " datetime\n", " {name: string, id: int, payments: 2 * datetime}\n", " 2 * {name: string, id: int, payments: 2 * datetime}\n", " 5 * 5 * 5 * float32\n", " \n", "Use `discover` to verify your answers." ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Should be 2 * int64\n", "discover([1, 2]) " ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "dshape(\"2 * int64\")" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Discover doesn't care about storage format\n", "\n", "The `discover` function doesn't care if your data lives in a Python list, Pandas DataFrame, NumPy Array, CSV file, PySpark RDD, or SQL database. \n", "\n", "In other words, using `into` preserves datashape." ] }, { "cell_type": "code", "collapsed": false, "input": [ "discover(df)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "dshape(\"3 * {name: string, balance: int64}\")" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "import numpy as np\n", "x = into(np.ndarray, df)\n", "discover(x) # different container, same datashape" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "dshape(\"3 * {name: string, balance: int64}\")" ] } ], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "t = into('sqlite:///:memory:::mydf', df)\n", "discover(t) # different container, mostly the same datashape" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "dshape(\"var * {name: string, balance: int64}\")" ] } ], "prompt_number": 9 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Discover works on nested structures\n", "\n", "Call discover on a single table of our baseball statistics database." ] }, { "cell_type": "code", "collapsed": false, "input": [ "salaries = resource('sqlite:///data/lahman2013.sqlite::Salaries')\n", "discover(salaries)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "dshape(\"\"\"var * {\n", " yearID: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " playerID: ?string,\n", " salary: ?float64\n", " }\"\"\")" ] } ], "prompt_number": 10 }, { "cell_type": "markdown", "metadata": {}, "source": [ "And then call it on the entire database" ] }, { "cell_type": "code", "collapsed": false, "input": [ "db = resource('sqlite:///data/lahman2013.sqlite')\n", "discover(db)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "dshape(\"\"\"{\n", " AllstarFull: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " gameNum: ?int32,\n", " gameID: ?string,\n", " teamID: ?string,\n", " lgID: ?string,\n", " GP: ?int32,\n", " startingPos: ?int32\n", " },\n", " Appearances: var * {\n", " yearID: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " playerID: ?string,\n", " G_all: ?int32,\n", " GS: ?int32,\n", " G_batting: ?int32,\n", " G_defense: ?int32,\n", " G_p: ?int32,\n", " G_c: ?int32,\n", " G_1b: ?int32,\n", " G_2b: ?int32,\n", " G_3b: ?int32,\n", " G_ss: ?int32,\n", " G_lf: ?int32,\n", " G_cf: ?int32,\n", " G_rf: ?int32,\n", " G_of: ?int32,\n", " G_dh: ?int32,\n", " G_ph: ?int32,\n", " G_pr: ?int32\n", " },\n", " AwardsManagers: var * {\n", " playerID: ?string,\n", " awardID: ?string,\n", " yearID: ?int32,\n", " lgID: ?string,\n", " tie: ?string,\n", " notes: ?string\n", " },\n", " AwardsPlayers: var * {\n", " playerID: ?string,\n", " awardID: ?string,\n", " yearID: ?int32,\n", " lgID: ?string,\n", " tie: ?string,\n", " notes: ?string\n", " },\n", " AwardsShareManagers: var * {\n", " awardID: ?string,\n", " yearID: ?int32,\n", " lgID: ?string,\n", " playerID: ?string,\n", " pointsWon: ?int32,\n", " pointsMax: ?int32,\n", " votesFirst: ?int32\n", " },\n", " AwardsSharePlayers: var * {\n", " awardID: ?string,\n", " yearID: ?int32,\n", " lgID: ?string,\n", " playerID: ?string,\n", " pointsWon: ?float64,\n", " pointsMax: ?int32,\n", " votesFirst: ?float64\n", " },\n", " Batting: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " stint: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " G: ?int32,\n", " G_batting: ?int32,\n", " AB: ?int32,\n", " R: ?int32,\n", " H: ?int32,\n", " 2B: ?int32,\n", " 3B: ?int32,\n", " HR: ?int32,\n", " RBI: ?int32,\n", " SB: ?int32,\n", " CS: ?int32,\n", " BB: ?int32,\n", " SO: ?int32,\n", " IBB: ?int32,\n", " HBP: ?int32,\n", " SH: ?int32,\n", " SF: ?int32,\n", " GIDP: ?int32,\n", " G_old: ?int32\n", " },\n", " BattingPost: var * {\n", " yearID: ?int32,\n", " round: ?string,\n", " playerID: ?string,\n", " teamID: ?string,\n", " lgID: ?string,\n", " G: ?int32,\n", " AB: ?int32,\n", " R: ?int32,\n", " H: ?int32,\n", " 2B: ?int32,\n", " 3B: ?int32,\n", " HR: ?int32,\n", " RBI: ?int32,\n", " SB: ?int32,\n", " CS: ?int32,\n", " BB: ?int32,\n", " SO: ?int32,\n", " IBB: ?int32,\n", " HBP: ?int32,\n", " SH: ?int32,\n", " SF: ?int32,\n", " GIDP: ?int32\n", " },\n", " Fielding: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " stint: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " POS: ?string,\n", " G: ?int32,\n", " GS: ?int32,\n", " InnOuts: ?int32,\n", " PO: ?int32,\n", " A: ?int32,\n", " E: ?int32,\n", " DP: ?int32,\n", " PB: ?int32,\n", " WP: ?int32,\n", " SB: ?int32,\n", " CS: ?int32,\n", " ZR: ?float64\n", " },\n", " FieldingOF: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " stint: ?int32,\n", " Glf: ?int32,\n", " Gcf: ?int32,\n", " Grf: ?int32\n", " },\n", " FieldingPost: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " round: ?string,\n", " POS: ?string,\n", " G: ?int32,\n", " GS: ?int32,\n", " InnOuts: ?int32,\n", " PO: ?int32,\n", " A: ?int32,\n", " E: ?int32,\n", " DP: ?int32,\n", " TP: ?int32,\n", " PB: ?int32,\n", " SB: ?int32,\n", " CS: ?int32\n", " },\n", " HallOfFame: var * {\n", " playerID: ?string,\n", " yearid: ?int32,\n", " votedBy: ?string,\n", " ballots: ?int32,\n", " needed: ?int32,\n", " votes: ?int32,\n", " inducted: ?string,\n", " category: ?string,\n", " needed_note: ?string\n", " },\n", " Managers: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " inseason: ?int32,\n", " G: ?int32,\n", " W: ?int32,\n", " L: ?int32,\n", " rank: ?int32,\n", " plyrMgr: ?string\n", " },\n", " ManagersHalf: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " inseason: ?int32,\n", " half: ?int32,\n", " G: ?int32,\n", " W: ?int32,\n", " L: ?int32,\n", " rank: ?int32\n", " },\n", " Master: var * {\n", " playerID: ?string,\n", " birthYear: ?int32,\n", " birthMonth: ?int32,\n", " birthDay: ?int32,\n", " birthCountry: ?string,\n", " birthState: ?string,\n", " birthCity: ?string,\n", " deathYear: ?int32,\n", " deathMonth: ?int32,\n", " deathDay: ?int32,\n", " deathCountry: ?string,\n", " deathState: ?string,\n", " deathCity: ?string,\n", " nameFirst: ?string,\n", " nameLast: ?string,\n", " nameGiven: ?string,\n", " weight: ?int32,\n", " height: ?float64,\n", " bats: ?string,\n", " throws: ?string,\n", " debut: ?float64,\n", " finalGame: ?float64,\n", " retroID: ?string,\n", " bbrefID: ?string\n", " },\n", " Pitching: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " stint: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " W: ?int32,\n", " L: ?int32,\n", " G: ?int32,\n", " GS: ?int32,\n", " CG: ?int32,\n", " SHO: ?int32,\n", " SV: ?int32,\n", " IPouts: ?int32,\n", " H: ?int32,\n", " ER: ?int32,\n", " HR: ?int32,\n", " BB: ?int32,\n", " SO: ?int32,\n", " BAOpp: ?float64,\n", " ERA: ?float64,\n", " IBB: ?int32,\n", " WP: ?int32,\n", " HBP: ?int32,\n", " BK: ?int32,\n", " BFP: ?int32,\n", " GF: ?int32,\n", " R: ?int32,\n", " SH: ?int32,\n", " SF: ?int32,\n", " GIDP: ?int32\n", " },\n", " PitchingPost: var * {\n", " playerID: ?string,\n", " yearID: ?int32,\n", " round: ?string,\n", " teamID: ?string,\n", " lgID: ?string,\n", " W: ?int32,\n", " L: ?int32,\n", " G: ?int32,\n", " GS: ?int32,\n", " CG: ?int32,\n", " SHO: ?int32,\n", " SV: ?int32,\n", " IPouts: ?int32,\n", " H: ?int32,\n", " ER: ?int32,\n", " HR: ?int32,\n", " BB: ?int32,\n", " SO: ?int32,\n", " BAOpp: ?float64,\n", " ERA: ?float64,\n", " IBB: ?int32,\n", " WP: ?int32,\n", " HBP: ?int32,\n", " BK: ?int32,\n", " BFP: ?int32,\n", " GF: ?int32,\n", " R: ?int32,\n", " SH: ?int32,\n", " SF: ?int32,\n", " GIDP: ?int32\n", " },\n", " Salaries: var * {\n", " yearID: ?int32,\n", " teamID: ?string,\n", " lgID: ?string,\n", " playerID: ?string,\n", " salary: ?float64\n", " },\n", " Schools: var * {\n", " schoolID: ?string,\n", " schoolName: ?string,\n", " schoolCity: ?string,\n", " schoolState: ?string,\n", " schoolNick: ?string\n", " },\n", " SchoolsPlayers: var * {\n", " playerID: ?string,\n", " schoolID: ?string,\n", " yearMin: ?int32,\n", " yearMax: ?int32\n", " },\n", " SeriesPost: var * {\n", " yearID: ?int32,\n", " round: ?string,\n", " teamIDwinner: ?string,\n", " lgIDwinner: ?string,\n", " teamIDloser: ?string,\n", " lgIDloser: ?string,\n", " wins: ?int32,\n", " losses: ?int32,\n", " ties: ?int32\n", " },\n", " Teams: var * {\n", " yearID: ?int32,\n", " lgID: ?string,\n", " teamID: ?string,\n", " franchID: ?string,\n", " divID: ?string,\n", " Rank: ?int32,\n", " G: ?int32,\n", " Ghome: ?int32,\n", " W: ?int32,\n", " L: ?int32,\n", " DivWin: ?string,\n", " WCWin: ?string,\n", " LgWin: ?string,\n", " WSWin: ?string,\n", " R: ?int32,\n", " AB: ?int32,\n", " H: ?int32,\n", " 2B: ?int32,\n", " 3B: ?int32,\n", " HR: ?int32,\n", " BB: ?int32,\n", " SO: ?int32,\n", " SB: ?int32,\n", " CS: ?int32,\n", " HBP: ?int32,\n", " SF: ?int32,\n", " RA: ?int32,\n", " ER: ?int32,\n", " ERA: ?float64,\n", " CG: ?int32,\n", " SHO: ?int32,\n", " SV: ?int32,\n", " IPouts: ?int32,\n", " HA: ?int32,\n", " HRA: ?int32,\n", " BBA: ?int32,\n", " SOA: ?int32,\n", " E: ?int32,\n", " DP: ?int32,\n", " FP: ?float64,\n", " name: ?string,\n", " park: ?string,\n", " attendance: ?int32,\n", " BPF: ?int32,\n", " PPF: ?int32,\n", " teamIDBR: ?string,\n", " teamIDlahman45: ?string,\n", " teamIDretro: ?string\n", " },\n", " TeamsFranchises: var * {\n", " franchID: ?string,\n", " franchName: ?string,\n", " active: ?string,\n", " NAassoc: ?string\n", " },\n", " TeamsHalf: var * {\n", " yearID: ?int32,\n", " lgID: ?string,\n", " teamID: ?string,\n", " Half: ?string,\n", " divID: ?string,\n", " DivWin: ?string,\n", " Rank: ?int32,\n", " G: ?int32,\n", " W: ?int32,\n", " L: ?int32\n", " },\n", " temp: var * {ID: ?int32, namefull: ?string, born: ?float64}\n", " }\"\"\")" ] } ], "prompt_number": 11 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.2 Tweaking Datashapes for Performance\n", "\n", "Some storage systems don't cleanly support some datashapes. For example\n", "\n", "1. SQL doesn't support nested data\n", "2. HDF5 doesn't cleanly support datetimes\n", "3. Variable length strings are often costly in binary stores\n", "4. NumPy has poor missing value support\n", "\n", "Because of this we sometimes want to slightly change a datashape during migration." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fixed Length Strings\n", "\n", "A common example is the use of strings with a known maximum length, called \"fixed length strings,\" and strings with particular character encodings, like ASCII or UTF-8.\n", "\n", "Consider moving the following data, including strings, into a numpy array." ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = [{'name': 'Alice', 'balance': 100},\n", " {'name': 'Bob', 'balance': 200}]\n", "into(np.ndarray, data)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "array([(100, 'Alice'), (200, 'Bob')], \n", " dtype=[('balance', '\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
01
0 33.1 -89.2
1 37.0-141.5
2 41.0-120.5
\n", "" ], "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ " 0 1\n", "0 33.1 -89.2\n", "1 37.0 -141.5\n", "2 41.0 -120.5" ] } ], "prompt_number": 17 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that, because our original data was stored in a format that didn't include the column names, the output lacks them as well. We complement our data with a datashape." ] }, { "cell_type": "code", "collapsed": false, "input": [ "ds = dshape('var * {lat: float64, long: float64}')\n", "into(pd.DataFrame, data, dshape=ds)" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "
\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
latlong
0 33.1 -89.2
1 37.0-141.5
2 41.0-120.5
\n", "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ " lat long\n", "0 33.1 -89.2\n", "1 37.0 -141.5\n", "2 41.0 -120.5" ] } ], "prompt_number": 18 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create new datasets with `resource` and datashape\n", "\n", "In the last section we used `resource` to acquire existing datasets from string URIs. \n", "\n", "We also use the `resource` function to create new datasets given a URI and a datashape." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Example\n", "\n", "We create a new HDF5 dataset to store 100 by 100 integers." ] }, { "cell_type": "code", "collapsed": false, "input": [ "ds = dshape('100 * 100 * int64')\n", "dset = resource('myfile.hdf5::/x', dshape=ds)\n", "dset" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "" ] } ], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Exercise\n", "\n", "Create a new SQLite table named `transactions` in `'data/my.db'` with the following datashape\n", "\n", " var * {name: string, balance: int, timestamp: datetime}\n", " \n", "Here `var` stands for \"variable length\" or generally \"a dimension to which we don't know a fixed size.\"" ] }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 19 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 19 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that you could also have built this table using raw SQLAlchemy code. Knowing datashape lets you skip learning many libraries like SQLAlchemy and H5Py for simple tasks. `into` serves as a single interface over many useful libraries." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import sqlalchemy as sa\n", "\n", "engine = sa.create_engine('sqlite:///data/my.db')\n", "metadata = sa.MetaData(engine)\n", "transactions = sa.Table('transactions2', metadata,\n", " sa.Column('name', sa.String),\n", " sa.Column('balance', sa.Integer),\n", " sa.Column('timestamp', sa.DateTime))\n", "transactions.create()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 20 } ], "metadata": {} } ] }