{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Indexing and Slicing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The elements of an ordered sequence can be accessed by position through indexing. A string is one example of an ordered sequence, and **Python** uses `[]` to index ordered sequences." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'h'" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = \"hello world\"\n", "s[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Indexing in **Python** starts at `0`, so index `0` corresponds to the first element of the sequence. To get the fifth element, use index `4`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'o'" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Besides forward indexing, **Python** also supports negative indices, which count from the end of the sequence. For example, index `-2` refers to the second-to-last element:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'l'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[-2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A single index that is greater than or equal to the length of the string raises an `IndexError`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "ename": "IndexError", "evalue": "string index out of range", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0ms\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m11\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mIndexError\u001b[0m: string index out 
of range" ] } ], "source": [ "s[11]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Slicing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Slicing extracts a subsequence from a sequence. Its syntax is:\n", "\n", "    var[lower:upper:step]\n", "\n", "The range includes `lower` but excludes `upper`, i.e. `[lower, upper)`. `step` is the stride between selected elements and defaults to `1` when omitted." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'hello world'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'el'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[1:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The slice contains `3-1=2` elements." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Negative indices can also be used to specify the range of a slice:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'ello wor'" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[1:-2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This includes index `1` but excludes index `-2`.\n", "\n", "Both `lower` and `upper` can be omitted: omitting `lower` starts the slice at the beginning, and omitting `upper` continues it to the end." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'hel'" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[:3]" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'rld'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[-3:]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'hello world'" ] }, "execution_count": 10, "metadata": 
{}, "output_type": "execute_result" } ], "source": [ "s[:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take every second character:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'hlowrd'" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[::2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When `step` is negative, omitting `lower` starts the slice at the end, and omitting `upper` continues it to the beginning." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'dlrow olleh'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[::-1]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the given `upper` exceeds the length of the string (note: because `upper` is excluded, it may equal the length), Python does not raise an error; the slice simply stops at the end." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "'hello world'" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s[:100]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Why indexing starts at 0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why the `[low, up)` form is used" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Suppose we want to represent the inner substring `el` of the string `hello`:\n", "\n", "|Notation|`[low, up)`|`(low, up]`|`(low, up)`|`[low, up]`|\n", "|--|--|--|--|--|\n", "|Representation|`[1,3)`|`(0,2]`|`(0,3)`|`[1,2]`|\n", "|Sequence length|`up - low`|`up - low`|`up - low - 1`|`up - low + 1`|\n", "\n", "In terms of length, the first two forms are preferable because they need no awkward plus-one or minus-one adjustment.\n", "\n", "Now consider only the first two forms, and suppose we want to represent the substring `hel`, which starts at the beginning of `hello`:\n", "\n", "|Notation|`[low, up)`|`(low, up]`|\n", "|--|--|--|\n", "|Representation|`[0,3)`|`(-1,2]`|\n", "|Sequence length|`up - low`|`up - low`|\n", "\n", "The second form has to start from `-1`, which is awkward, so the first form, `[low, up)`, is chosen." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Using 0-based indexing" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> Just too beautiful to ignore. 
\n", "----Guido van Rossum" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Two simple cases:\n", "\n", "- The first `n` elements:\n", "    - 0-based: `[0, n)`\n", "    - 1-based: `[1, n+1)`\n", "\n", "- Element `i+1` through element `i+n`:\n", "    - 0-based: `[i, n+i)`\n", "    - 1-based: `[i+1, n+i+1)`\n", "\n", "The 1-based form carries an extra `+1` term, so it is not recommended.\n", "\n", "For these two reasons, **Python** uses 0-based indexing." ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.11" } }, "nbformat": 4, "nbformat_minor": 0 }