{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# NumPy Arrays and Indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, import NumPy:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "from numpy import *" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create an array from a list:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3])" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lst = [0, 1, 2, 3]\n", "a = array(lst)\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or pass the list literal in directly:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = array([1, 2, 3, 4])\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Array attributes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the object's type:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "numpy.ndarray" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the data type of the array's elements:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dtype('int32')" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 32-bit integers (the default integer dtype is platform-dependent)\n", "a.dtype" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Check the number of bytes each element occupies:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4" ] }, 
"execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.itemsize" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "查看形状,会返回一个元组,每个元素代表这一维的元素数目:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(4L,)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 1维数组,返回一个元组\n", "a.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "或者使用:" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(4L,)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "shape(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`shape` 的使用历史要比 `a.shape` 久,而且还可以作用于别的类型:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(4L,)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "lst = [1,2,3,4]\n", "shape(lst)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "查看元素数目:" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.size" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "size(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "查看所有元素所占的空间:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "16" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.nbytes" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "但事实上,数组所占的存储空间要比这个数字大,因为要用一个header来保存shape,dtype这样的信息。\n", "\n", "查看数组维数:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.ndim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 使用fill方法设定初始值" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可以使用 `fill` 方法将数组设为指定值:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([-4, -4, -4, -4])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.fill(-4.8)\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "但是与列表不同,数组中要求所有元素的 `dtype` 是一样的,如果传入参数的类型与数组类型不一样,需要按照已有的类型进行转换。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 索引与切片" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "和列表相似,数组也支持索引和切片操作。\n", "\n", "索引第一个元素:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = array([0, 1, 2, 3])\n", "a[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "修改第一个元素的值:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([10, 1, 2, 3])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[0] = 10\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "切片,支持负索引:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([12, 13])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = 
array([11,12,13,14,15])\n", "a[1:3]" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([12, 13])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[1:-2]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([12, 13])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[-4:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Omitting slice parameters:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([11, 13, 15])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[::2]" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([14, 15])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[-2:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Suppose we record the odometer reading of a car each day:" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [], "source": [ "od = array([21000, 21180, 21240, 22100, 22400])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The distance traveled each day can then be computed like this:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([180, 60, 860, 300])" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "dist = od[1:] - od[:-1]\n", "dist" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Under the hood, **NumPy** carries out array computations in compiled loops that resemble this **C** code:\n", "\n", "```c\n", "int compute_sum(int *arr, int N) {\n", "    int sum = 0;\n", "    int i;\n", "    for (i = 0; i < N; i++) {\n", "        sum += arr[i];\n", "    }\n", "    return sum;\n", "}\n", 
"```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 多维数组及其属性" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`array` 还可以用来生成多维数组:" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [10, 11, 12, 13]])" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = array([[ 0, 1, 2, 3],\n", " [10,11,12,13]])\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "事实上我们传入的是一个以列表为元素的列表,最终得到一个二维数组。\n", "\n", "甚至可以扩展到3D或者4D的情景。\n", "\n", "查看形状:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(2L, 4L)" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "这里2代表行数,4代表列数。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "查看总的元素个数:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 2 * 4 = 8\n", "a.size" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "查看维数:" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.ndim" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 多维数组索引" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "对于二维数组,可以传入两个数字来索引:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "13" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[1, 3]" ] }, { "cell_type": "markdown", 
"metadata": {}, "source": [ "其中,1是行索引,3是列索引,中间用逗号隔开,事实上,**Python**会将它们看成一个元组(1,3),然后按照顺序进行对应。\n", "\n", "可以利用索引给它赋值:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [10, 11, 12, -1]])" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[1, 3] = -1\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "事实上,我们还可以使用单个索引来索引一整行内容:" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([10, 11, 12, -1])" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 返回第二行元组组成的array\n", "a[1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Python**会将这单个元组当成对第一维的索引,然后返回对应的内容。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 多维数组切片" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "多维数组,也支持切片操作:" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4, 5],\n", " [10, 11, 12, 13, 14, 15],\n", " [20, 21, 22, 23, 24, 25],\n", " [30, 31, 32, 33, 34, 35],\n", " [40, 41, 42, 43, 44, 45],\n", " [50, 51, 52, 53, 54, 55]])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = array([[ 0, 1, 2, 3, 4, 5],\n", " [10,11,12,13,14,15],\n", " [20,21,22,23,24,25],\n", " [30,31,32,33,34,35],\n", " [40,41,42,43,44,45],\n", " [50,51,52,53,54,55]])\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "想得到第一行的第 4 和第 5 两个元素:" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([3, 4])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[0, 3:5]" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "得到最后两行的最后两列:" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[44, 45],\n", " [54, 55]])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[4:, 4:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "得到第三列:" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 2, 12, 22, 32, 42, 52])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[:, 2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "每一维都支持切片的规则,包括负索引,省略:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " [lower:upper:step]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "例如,取出3,5行的奇数列:" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[20, 22, 24],\n", " [40, 42, 44]])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[2::2, ::2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 切片是引用" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "切片在内存中使用的是引用机制。" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 3]\n" ] } ], "source": [ "a = array([0,1,2,3,4])\n", "b = a[2:4]\n", "print b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "引用机制意味着,**Python**并没有为 `b` 分配新的空间来存储它的值,而是让 `b` 指向了 `a` 所分配的内存空间,因此,改变 `b` 会改变 `a` 的值:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 10, 3, 4])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b[0] = 10\n", "a" ] }, { "cell_type": "markdown", "metadata": 
{}, "source": [ "This does not happen with lists, because slicing a list creates a new list:" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 3, 4, 5]\n" ] } ], "source": [ "a = [1,2,3,4,5]\n", "b = a[2:3]\n", "b[0] = 13234\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The advantage is that, for very large arrays, no redundant values are copied, which saves memory.\n", "\n", "The drawback is that modifying one array can unexpectedly modify another.\n", "\n", "One solution is to call the `copy()` method, which produces a copy backed by newly allocated memory:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = array([0,1,2,3,4])\n", "b = a[2:4].copy()\n", "b[0] = 10\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fancy indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Slicing can only select contiguous or evenly spaced elements. To select elements at arbitrary positions, use fancy indexing." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fancy indexing in one dimension" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Similar to the built-in `range` function, the `arange` function produces an array of evenly spaced values." ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "array([ 0, 10, 20, 30, 40, 50, 60, 70])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = arange(0, 80, 10)\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Fancy indexing takes a list of index positions:" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[10 20 50]\n" ] } ], "source": [ "indices = [1, 2, -3]\n", "y = a[indices]\n", "print(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A Boolean array can also be used for fancy indexing:" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "collapsed": true }, "outputs": [], "source": [ "mask = 
array([0,1,1,0,0,1,0,0],\n", " dtype=bool)" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([10, 20, 50])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Or build `mask` from a Boolean expression, here selecting all values greater than 0.5:" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 0.37214708, 0.48594733, 0.73365131, 0.15769295, 0.30786017,\n", " 0.62068734, 0.36940654, 0.09424167, 0.53085308, 0.12248951])" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from numpy.random import rand\n", "a = rand(10)\n", "a" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 0.73365131, 0.62068734, 0.53085308])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mask = a > 0.5\n", "a[mask]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`mask` must be a Boolean array; an integer array of 0s and 1s would be interpreted as index positions instead." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Fancy indexing in two dimensions" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4, 5],\n", " [10, 11, 12, 13, 14, 15],\n", " [20, 21, 22, 23, 24, 25],\n", " [30, 31, 32, 33, 34, 35],\n", " [40, 41, 42, 43, 44, 45],\n", " [50, 51, 52, 53, 54, 55]])" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = array([[ 0, 1, 2, 3, 4, 5],\n", " [10,11,12,13,14,15],\n", " [20,21,22,23,24,25],\n", " [30,31,32,33,34,35],\n", " [40,41,42,43,44,45],\n", " [50,51,52,53,54,55]])\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For two-dimensional fancy indexing, we supply both the row and the column indices:" ] }, { "cell_type": "code", "execution_count": 47, 
"metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 1, 12, 23, 34, 45])" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[(0,1,2,3,4), (1,2,3,4,5)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "返回的是一条次对角线上的5个值。" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[30, 32, 35],\n", " [40, 42, 45],\n", " [50, 52, 55]])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[3:, [0,2,5]]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "返回的是最后三行的第1,3,5列。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "也可以使用mask进行索引:" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([ 2, 22, 52])" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mask = array([1,0,1,0,0,1],\n", " dtype=bool)\n", "a[mask, 2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "与切片不同,花式索引返回的是原对象的一个复制而不是引用。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### “不完全”索引" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "只给定行索引的时候,返回整行:" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4, 5],\n", " [10, 11, 12, 13, 14, 15],\n", " [20, 21, 22, 23, 24, 25]])" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y = a[:3]\n", "y" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "这时候也可以使用花式索引取出第2,3,5行:" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[10, 11, 12, 13, 14, 15],\n", " [20, 21, 22, 23, 24, 25],\n", " [40, 41, 42, 43, 44, 45]])" ] }, 
"execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "condition = array([0,1,1,0,1],\n", " dtype=bool)\n", "a[condition]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 三维花式索引" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11],\n", " [12, 13, 14, 15]],\n", "\n", " [[16, 17, 18, 19],\n", " [20, 21, 22, 23],\n", " [24, 25, 26, 27],\n", " [28, 29, 30, 31]],\n", "\n", " [[32, 33, 34, 35],\n", " [36, 37, 38, 39],\n", " [40, 41, 42, 43],\n", " [44, 45, 46, 47]],\n", "\n", " [[48, 49, 50, 51],\n", " [52, 53, 54, 55],\n", " [56, 57, 58, 59],\n", " [60, 61, 62, 63]]])" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = arange(64)\n", "a.shape = 4,4,4\n", "a" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[[ 2, 3],\n", " [ 6, 7],\n", " [10, 11],\n", " [14, 15]],\n", "\n", " [[18, 19],\n", " [22, 23],\n", " [26, 27],\n", " [30, 31]],\n", "\n", " [[34, 35],\n", " [38, 39],\n", " [42, 43],\n", " [46, 47]],\n", "\n", " [[50, 51],\n", " [54, 55],\n", " [58, 59],\n", " [62, 63]]])" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y = a[:,:,[2, -1]]\n", "y" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## where语句" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " where(array)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`where` 函数会返回所有非零元素的索引。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 一维数组" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "先看一维的例子:" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = array([0, 12, 5, 20])" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "判断数组中的元素是不是大于10:" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([False, True, False, True], dtype=bool)" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a > 10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "数组中所有大于10的元素的索引位置:" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(array([1, 3], dtype=int64),)" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "where(a > 10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "注意到 `where` 的返回值是一个元组。\n", "\n", "使用元组是由于 where 可以对多维数组使用,此时返回值就是多维的。\n", "\n", "在使用的时候,我们可以这样:" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 3], dtype=int64)" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "indices = where(a > 10)\n", "indices = indices[0]\n", "indices" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "或者:" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 3], dtype=int64)" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "indices = where(a>10)[0]\n", "indices" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可以直接用 `where` 的返回值进行索引:" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([12, 20])" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loc = where(a > 10)\n", "a[loc]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 多维数组" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "考虑二维数组:" ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = array([[0, 12, 5, 20],\n", " [1, 2, 11, 15]])\n", "loc = where(a > 10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "返回结果是一个二维的元组,每一维代表这一维的索引值:" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "collapsed": false, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "(array([0, 0, 1, 1], dtype=int64), array([1, 3, 2, 3], dtype=int64))" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "loc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "也可以直接用来索引a:" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([12, 20, 11, 15])" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[loc]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "或者可以这样:" ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "collapsed": true }, "outputs": [], "source": [ "rows, cols = where(a>10)" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([0, 0, 1, 1], dtype=int64)" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rows" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([1, 3, 2, 3], dtype=int64)" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "cols" ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([12, 20, 11, 15])" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[rows, cols]" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "再看另一个例子:" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4],\n", " [ 5, 6, 7, 8, 9],\n", " [10, 11, 12, 13, 14],\n", " [15, 16, 17, 18, 19],\n", " [20, 21, 22, 23, 24]])" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = arange(25)\n", "a.shape = 5,5\n", "a" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "array([[False, False, False, False, False],\n", " [False, False, False, False, False],\n", " [False, False, False, True, True],\n", " [ True, True, True, True, True],\n", " [ True, True, True, True, True]], dtype=bool)" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a > 12" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "(array([2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4], dtype=int64),\n", " array([3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4], dtype=int64))" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "where(a > 12)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }