{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# HIDDEN\n", "from datascience import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While there are many kinds of collections in Python, we will work primarily with arrays in this class. We've already seen that the `make_array` function can be used to create arrays of numbers.\n", "\n", "Arrays can also contain strings or other types of values, but a single array can contain only a single kind of data. (It usually doesn't make sense to group together unlike data anyway.) For example:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array(['noun', 'pronoun', 'verb', 'adverb', 'adjective', 'conjunction',\n", " 'preposition', 'interjection'], \n", " dtype='<U12')" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "make_array('noun', 'pronoun', 'verb', 'adverb', 'adjective', 'conjunction', 'preposition', 'interjection')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Arrays also have *methods*, which are functions that operate on the array values. The `mean` of a collection of numbers is its average value: the sum divided by the length. Each pair of parentheses in the examples below is part of a call expression; it's calling a function with no arguments to perform a computation on the array called `highs`. The `size` attribute is not a method, so it is written without parentheses." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "highs.size" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "57.736000000000004" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "highs.sum()" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "14.434000000000001" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "highs.mean()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Functions on Arrays\n", "The `numpy` package, abbreviated `np` in programs, provides Python programmers with convenient and powerful functions for creating and manipulating arrays." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For example, the `diff` function computes the difference between each adjacent pair of elements in an array. The first element of the result is the second element of the array minus the first, so the result has one fewer element than the original array." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 0.787, 0.198, 0.579])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.diff(highs)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [full NumPy reference](http://docs.scipy.org/doc/numpy/reference/) lists these functions exhaustively, but only a small subset is commonly used in data processing applications. These are grouped into different packages within `np`. 
Learning this vocabulary is an important part of learning the Python language, so refer back to this list often as you work through examples and problems.\n", "\n", "However, you **don't need to memorize these**. Use this list as a reference.\n", "\n", "Each of these functions takes an array as an argument and returns a single value.\n", "\n", "| **Function** | **Description** |\n", "|--------------------|----------------------------------------------------------------------|\n", "| `np.prod` | Multiply all elements together |\n", "| `np.sum` | Add all elements together |\n", "| `np.all` | Test whether all elements are true values (non-zero numbers are true)|\n", "| `np.any` | Test whether any elements are true values (non-zero numbers are true)|\n", "| `np.count_nonzero` | Count the number of non-zero elements |\n", "\n", "Each of these functions takes an array as an argument and returns an array of values.\n", "\n", "| **Function** | **Description** |\n", "|--------------------|----------------------------------------------------------------------|\n", "| `np.diff` | Difference between adjacent elements |\n", "| `np.round` | Round each number to the nearest integer (whole number) |\n", "| `np.cumprod` | A cumulative product: for each element, multiply all elements so far |\n", "| `np.cumsum` | A cumulative sum: for each element, add all elements so far |\n", "| `np.exp` | Exponentiate each element (raise *e* to the power of the element) |\n", "| `np.log` | Take the natural logarithm of each element |\n", "| `np.sqrt` | Take the square root of each element |\n", "| `np.sort` | Sort the elements |\n", "\n", "Each of these functions takes an array of strings and returns an array.\n", "\n", "| **Function** | **Description** |\n", "|---------------------|--------------------------------------------------------------|\n", "| `np.char.lower` | Lowercase each element |\n", "| `np.char.upper` | Uppercase each element |\n", "| `np.char.strip` | Remove spaces at the beginning or end of each element |\n", "| `np.char.isalpha` 
| Whether each element consists only of letters (no numbers or symbols) |\n", "| `np.char.isnumeric` | Whether each element consists only of numeric characters (no letters) |\n", "\n", "Each of these functions takes both an array of strings and a *search string*; each returns an array.\n", "\n", "| **Function** | **Description** |\n", "|----------------------|----------------------------------------------------------------------------------|\n", "| `np.char.count` | The number of times the search string appears within each element |\n", "| `np.char.find` | The position within each element at which the search string first appears (-1 if absent) |\n", "| `np.char.rfind` | The position within each element at which the search string last appears (-1 if absent) |\n", "| `np.char.startswith` | Whether each element starts with the search string |\n", "\n" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }