{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "### Transposing Arrays and Swapping Axes" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* An array can be transposed with the transpose() method or with the T attribute" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3, 4],\n", " [ 5, 6, 7, 8, 9],\n", " [10, 11, 12, 13, 14]])" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.arange(15).reshape((3,5))\n", "arr" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 5, 10],\n", " [ 1, 6, 11],\n", " [ 2, 7, 12],\n", " [ 3, 8, 13],\n", " [ 4, 9, 14]])" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.transpose()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 5, 10],\n", " [ 1, 6, 11],\n", " [ 2, 7, 12],\n", " [ 3, 8, 13],\n", " [ 4, 9, 14]])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.T" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Transposition comes up often in matrix computations, for example when computing the inner matrix product with np.dot()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1.24971097, -1.15270121, 1.41855366],\n", " [-1.21579815, 0.67722079, -0.63992388],\n", " [-1.14607563, 0.59604011, 1.46181227],\n", " [ 0.39345074, 1.47096921, 0.55652236]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.random.randn(4,3)\n", "arr" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 4.50823549, -2.36826025, 1.09441706],\n", " [-2.36826025, 4.3063623 , -0.37861226],\n", " [ 1.09441706, -0.37861226, 
4.86840931]])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(arr.T,arr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* For higher-dimensional arrays (three or more dimensions), transpose takes a tuple of axis numbers that specifies how the axes are permuted" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7]],\n", "\n", " [[ 8, 9, 10, 11],\n", " [12, 13, 14, 15]]])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.arange(16).reshape((2,2,4))\n", "arr" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[[ 0, 1, 2, 3],\n", " [ 8, 9, 10, 11]],\n", "\n", " [[ 4, 5, 6, 7],\n", " [12, 13, 14, 15]]])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.transpose((1,0,2))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The swapaxes method takes a pair of axis numbers and swaps those two axes. Like transpose, it returns a view of the source data (no data is copied)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7]],\n", "\n", " [[ 8, 9, 10, 11],\n", " [12, 13, 14, 15]]])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[[ 0, 4],\n", " [ 1, 5],\n", " [ 2, 6],\n", " [ 3, 7]],\n", "\n", " [[ 8, 12],\n", " [ 9, 13],\n", " [10, 14],\n", " [11, 15]]])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.swapaxes(1,2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.2 Universal Functions: Fast Element-Wise Array Functions\n", "* A universal function (ufunc) is a function that performs element-wise operations on the data in an ndarray. You can think of it as a vectorized wrapper around a simple function that takes one or more scalar values and produces one or more scalar results" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4, 5, 
6, 7, 8, 9])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.arange(10)\n", "arr" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0. , 1. , 1.41421356, 1.73205081, 2. ,\n", " 2.23606798, 2.44948974, 2.64575131, 2.82842712, 3. ])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.sqrt(arr)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,\n", " 5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,\n", " 2.98095799e+03, 8.10308393e+03])" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.exp(arr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* The functions above are unary ufuncs, which take a single array. The next examples show binary ufuncs, which take two arrays" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-1.7266767 , 1.2232217 , 0.13346529, 0.20789797, 0.05153854,\n", " -1.39971978, 0.11192313, 0.00818701])" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = np.random.randn(8)\n", "y = np.random.randn(8)\n", "x" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-0.58431702, 1.18389044, 0.23911619, -0.20529655, -0.22218649,\n", " -0.23151421, -1.83659486, 0.91349072])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-0.58431702, 1.2232217 , 0.23911619, 0.20789797, 0.05153854,\n", " -0.23151421, 0.11192313, 0.91349072])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.maximum(x,y)  # element-wise maximum of x and y" ] }, { 
"cell_type": "markdown", "metadata": {}, "source": [ "* modf可以返回多个数组,它会返回浮点数的整数部分和小数部分" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 2.11214818, 5.9728119 , -0.25401792, -10.69556551,\n", " -2.15374894, -0.32658578, -2.63945653])" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.random.randn(7)*5\n", "arr" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0.11214818, 0.9728119 , -0.25401792, -0.69556551, -0.15374894,\n", " -0.32658578, -0.63945653])" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "remainder, whole_part = np.modf(arr)\n", "remainder" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 2., 5., -0., -10., -2., -0., -2.])" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "whole_part" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 下表是一些常见的函数\n", "![一元1](https://upload-images.jianshu.io/upload_images/7178691-1d494e73b61c7ced.png)\n", "![一元2](https://upload-images.jianshu.io/upload_images/7178691-2be79faf68ab6ff8.png)\n", "![一元3](https://upload-images.jianshu.io/upload_images/7178691-4e38d02a66481530.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 二元函数\n", "![二元1](https://upload-images.jianshu.io/upload_images/7178691-eff1e61e5464159f.png)\n", "![二元2](https://upload-images.jianshu.io/upload_images/7178691-eff1e61e5464159f.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.3 利用数组进行数据处理\n", "* 用数组表达式代替循环的做法,通常被称为矢量化" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* np.meshgrid函数接受两个一维数组,并产生两个二维矩阵(对应于两个数组中所有的(x,y)对)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-5. , -5. 
, -5. , ..., -5. , -5. , -5. ],\n", " [-4.99, -4.99, -4.99, ..., -4.99, -4.99, -4.99],\n", " [-4.98, -4.98, -4.98, ..., -4.98, -4.98, -4.98],\n", " ...,\n", " [ 4.97, 4.97, 4.97, ..., 4.97, 4.97, 4.97],\n", " [ 4.98, 4.98, 4.98, ..., 4.98, 4.98, 4.98],\n", " [ 4.99, 4.99, 4.99, ..., 4.99, 4.99, 4.99]])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "points = np.arange(-5,5,0.01)\n", "xs,ys = np.meshgrid(points,points)\n", "ys" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[7.07106781, 7.06400028, 7.05693985, ..., 7.04988652, 7.05693985,\n", " 7.06400028],\n", " [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,\n", " 7.05692568],\n", " [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,\n", " 7.04985815],\n", " ...,\n", " [7.04988652, 7.04279774, 7.03571603, ..., 7.0286414 , 7.03571603,\n", " 7.04279774],\n", " [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,\n", " 7.04985815],\n", " [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,\n", " 7.05692568]])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "z = np.sqrt(xs**2 + ys**2)\n", "z" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "import matplotlib.pyplot as plt\n", "plt.imshow(z,cmap=plt.cm.gray)\n", "plt.colorbar()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 将条件逻辑表述为数组运算\n", "* numpy.where函数是三元表达式x if condition else y的矢量化版本。" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1.2, 2.2, 1.3, 1.4, 2.5]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "xarr = np.array([1.2,1.2,1.3,1.4,1.5])\n", "yarr = np.array([2.1,2.2,2.3,2.4,2.5])\n", "cond = np.array([True,False,True,True,False])\n", "result = [(x if c else y) for x,y,c in zip(xarr, yarr, cond)]\n", "result" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1.2, 2.2, 1.3, 1.4, 2.5])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# 使用where()改进\n", "result = np.where(cond,xarr,yarr)\n", "result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* np.where的第二个和第三个参数不必是数组,它们都可以是标量值。在数据分析工作中,where通常用于根据另一个数组而产生一个新的数组。" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[False, False, False, False],\n", " [False, True, True, False],\n", " [ True, True, False, True],\n", " [ True, False, True, False]])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.random.randn(4,4)\n", "arr > 0" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-0.02459259, -0.6428239 , -0.76112065, -2.14235712],\n", " [-0.10120677, 0.14290032, 0.56280688, -1.39125445],\n", " [ 0.39828175, 0.07024995, -1.22357901, 0.90811459],\n", " [ 0.83422951, -0.10858237, 0.90184316, -2.57063453]])" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": 
[ "arr" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-2, -2, -2, -2],\n", " [-2, 2, 2, -2],\n", " [ 2, 2, -2, 2],\n", " [ 2, -2, 2, -2]])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.where(arr>0,2,-2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Scalars and arrays can also be combined, for example to set only the positive values to 2 and keep the rest unchanged" ] }, { "cell_type": "code", "execution_count": 44, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-0.02459259, -0.6428239 , -0.76112065, -2.14235712],\n", " [-0.10120677, 2. , 2. , -1.39125445],\n", " [ 2. , 2. , -1.22357901, 2. ],\n", " [ 2. , -0.10858237, 2. , -2.57063453]])" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.where(arr>0,2,arr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Mathematical and Statistical Methods\n", "* Aggregations (often called reductions) such as sum, mean, and the standard deviation std can be called either as methods of the array instance or as top-level NumPy functions." ] }, { "cell_type": "code", "execution_count": 47, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0.31996205, 0.69908917, 1.60390967, -0.41760093],\n", " [-0.36385331, -0.2618905 , 1.15295921, -0.10163846],\n", " [ 1.28356236, 0.45163684, 0.3690968 , -0.81384686],\n", " [ 0.14891047, 0.75320511, -1.49658404, 0.61913003],\n", " [ 0.0395804 , 0.97813093, 1.96525592, -1.07168954]])" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.random.randn(5,4)\n", "arr" ] }, { "cell_type": "code", "execution_count": 48, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.2928662657820542" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.mean()" ] }, { "cell_type": "code", "execution_count": 49, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.2928662657820542" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.mean(arr)" ] }, { 
"cell_type": "code", "execution_count": 50, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5.857325315641084" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.sum()" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "5.857325315641084" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.sum(arr)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* mean和sum这类的函数可以接受一个axis选项参数,用于计算该轴向上的统计值,最终结果是一个少一维的数组\n", "* axis = 1,按行计算;axis=0,按列计算" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.55133999, 0.10639423, 0.32261228, 0.00616539, 0.47781943])" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.mean(axis=1)" ] }, { "cell_type": "code", "execution_count": 54, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 1.42816197, 2.62017154, 3.59463756, -1.78564575])" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.sum(axis=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* cumsum和cumprod之类的方法则不聚合,而是产生一个由中间结果组成的数组" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4, 5, 6, 7])" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.arange(8)\n", "arr" ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 3, 6, 10, 15, 21, 28], dtype=int32)" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.cumsum()# 后一个元素值是其位置前面元素的和" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 在多维数组中,累加函数(如cumsum)返回的是同样大小的数组,但是会根据每个低维的切片沿着标记轴计算部分聚类:" ] }, { "cell_type": "code", 
"execution_count": 61, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0, 1, 2],\n", " [3, 4, 5],\n", " [6, 7, 8]])" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.arange(9).reshape(3,3)\n", "arr" ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2],\n", " [ 3, 5, 7],\n", " [ 9, 12, 15]], dtype=int32)" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.cumsum(axis=0)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 表4-5列出了全部的基本数组统计方法" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "![1](https://upload-images.jianshu.io/upload_images/7178691-a6c6df3ca8e0b98e.png)\n", "![2](https://upload-images.jianshu.io/upload_images/7178691-866fcde885b1d357.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 用于布尔型数组的方法\n", "* 在上面这些方法中,布尔值会被强制转换为1(True)和0(False)。因此,sum经常被用来对布尔型数组中的True值计数:" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "47" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.random.randn(100)\n", "(arr > 0).sum()# 用于计算arr中大于零的值的个数" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* any用于测试数组中是否存在一个或多个True,而all则检查数组中所有值是否都是True:\n", "* 这两个方法也能用于非布尔型数组,所有非0元素将会被当做True。" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bools = np.array([True,False,True,True,False])\n", "bools.any()" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bools.all()" ] }, { "cell_type": "markdown", "metadata": {}, 
"source": [ "### 排序" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 1.02405122, -0.6697276 , -0.21968547, -0.72406011, 0.1925428 ,\n", " 0.07636505])" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.random.randn(6)\n", "arr" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([-0.72406011, -0.6697276 , -0.21968547, 0.07636505, 0.1925428 ,\n", " 1.02405122])" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr.sort()\n", "arr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 多维数组可以在任何一个轴向上进行排序,只需将轴编号传给sort即可:" ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 0.71087653, 0.07317549, -0.5169106 ],\n", " [-0.79285818, 0.2562815 , 0.52803003],\n", " [ 0.32807348, -0.15266984, -0.65285013],\n", " [-0.06786031, 2.69566713, -0.44730785],\n", " [ 0.42299494, 0.95989471, -0.51781003]])" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.random.randn(5,3)\n", "arr" ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [], "source": [ "arr.sort(axis=1)# 按行排序" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-0.5169106 , 0.07317549, 0.71087653],\n", " [-0.79285818, 0.2562815 , 0.52803003],\n", " [-0.65285013, -0.15266984, 0.32807348],\n", " [-0.44730785, -0.06786031, 2.69566713],\n", " [-0.51781003, 0.42299494, 0.95989471]])" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* 顶级方法np.sort返回的是数组的已排序副本,而就地排序则会修改数组本身。计算数组分位数最简单的办法是对其进行排序,然后选取特定位置的值:" ] }, { "cell_type": "code", "execution_count": 77, "metadata": {}, 
"outputs": [ { "data": { "text/plain": [ "-1.7547949961699647" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "large_arr = np.random.randn(1000)\n", "large_arr.sort()\n", "large_arr[int(0.05*len(large_arr))]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 唯一化以及其它的集合逻辑\n", "* np.unique,它用于找出数组中的唯一值并返回已排序的结果:" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['Bob', 'Joe', 'Will'], dtype='