{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 集合" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "之前看到的列表和字符串都是一种有序序列,而集合 `set` 是一种无序的序列。\n", "\n", "因为集合是无序的,所以当集合中存在两个同样的元素的时候,Python只会保存其中的一个(唯一性);同时为了确保其中不包含同样的元素,集合中放入的元素只能是不可变的对象(确定性)。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 集合生成" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可以用`set()`函数来显示的生成空集合:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "set" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = set()\n", "type(a)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "也可以使用一个列表来初始化一个集合:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3}" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = set([1, 2, 3, 1])\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "集合会自动去除重复元素 `1`。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "可以看到,集合中的元素是用大括号`{}`包含起来的,这意味着可以用`{}`的形式来创建集合:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = {1, 2, 3, 1}\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "但是创建空集合的时候只能用`set`来创建,因为在Python中`{}`创建的是一个空的字典:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "dict" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s = {}\n", "type(s)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 集合操作" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "假设有这样两个集合:" ] }, { "cell_type": "code", 
"execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = {1, 2, 3, 4}\n", "b = {3, 4, 5, 6}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 并" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "两个集合的并,返回包含两个集合所有元素的集合(去除重复)。\n", "\n", "可以用方法 `a.union(b)` 或者操作 `a | b` 实现。" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.union(b)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6}" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b.union(a)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a | b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 交" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "两个集合的交,返回包含两个集合共有元素的集合。\n", "\n", "可以用方法 `a.intersection(b)` 或者操作 `a & b` 实现。" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{3, 4}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.intersection(b)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{3, 4}" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b.intersection(a)" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{3, 4}" ] }, "execution_count": 11, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "a & b" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "set([3, 4])\n" ] } ], "source": [ "print(a & b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "注意:一般使用print打印set的结果与表示方法并不一致。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 差" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`a` 和 `b` 的差集,返回只在 `a` 不在 `b` 的元素组成的集合。\n", "\n", "可以用方法 `a.difference(b)` 或者操作 `a - b` 实现。" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2}" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.difference(b)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2}" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a - b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "注意,`a - b` 与 `b - a`并不一样,`b - a` 返回的是返回 b 不在 a 的元素组成的集合:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{5, 6}" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b.difference(a)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{5, 6}" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b - a " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 对称差" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`a` 和`b` 的对称差集,返回在 `a` 或在 `b` 中,但是不同时在 `a` 和 `b` 中的元素组成的集合。\n", "\n", "可以用方法 `a.symmetric_difference(b)` 或者操作 `a ^ b` 实现(异或操作符)。" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false 
}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 5, 6}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.symmetric_difference(b)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 5, 6}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b.symmetric_difference(a)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 5, 6}" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a ^ b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Subset and superset relations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now suppose we have the following two sets:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": true }, "outputs": [], "source": [ "a = {1, 2, 3}\n", "b = {1, 2}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To test whether `b` is a subset of `a`, use the method `b.issubset(a)` or, more concisely, the operator `b <= a`:" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b.issubset(a)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b <= a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Correspondingly, `a.issuperset(b)` or `a >= b` tests whether `a` is a superset of `b`:" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.issuperset(b)" ] }, { "cell_type": "code", "execution_count": 24, 
"metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a >= b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "方法只能用来测试子集,但是操作符可以用来判断真子集:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a <= a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "自己不是自己的真子集:" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a < a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 集合方法" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `add` 方法向集合添加单个元素" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "跟列表的 `append` 方法类似,用来向集合添加单个元素。\n", "\n", " s.add(a)\n", "\n", "将元素 `a` 加入集合 `s` 中。" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 5}" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t = {1, 2, 3}\n", "t.add(5)\n", "t" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "如果添加的是已有元素,集合不改变:" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 5}" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.add(3)\n", "t" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `update` 方法向集合添加多个元素" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "跟列表的`extend`方法类似,用来向集合添加多个元素。\n", "\n", " s.update(seq)\n", "\n", "将`seq`中的元素添加到`s`中。" ] }, { "cell_type": "code", 
"execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 5, 6, 7}" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.update([5, 6, 7])\n", "t" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `remove` 方法移除单个元素" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " s.remove(ob)\n", "\n", "从集合`s`中移除元素`ob`,如果不存在会报错。" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{2, 3, 5, 6, 7}" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.remove(1)\n", "t" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [ { "ename": "KeyError", "evalue": "10", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mt\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mremove\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m10\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mKeyError\u001b[0m: 10" ] } ], "source": [ "t.remove(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### pop方法弹出元素" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "由于集合没有顺序,不能像列表一样按照位置弹出元素,所以`pop` 方法删除并返回集合中任意一个元素,如果集合中没有元素会报错。" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{3, 5, 6, 7}" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t.pop()" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "set([3, 
5, 6, 7])\n" ] } ], "source": [ "print(t)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "ename": "KeyError", "evalue": "'pop from an empty set'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[0ms\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mset\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;31m# raises KeyError\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[0ms\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mpop\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mKeyError\u001b[0m: 'pop from an empty set'" ] } ], "source": [ "s = set()\n", "# raises KeyError\n", "s.pop()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### `discard`: removing an element without raising an error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`discard` works like `remove`, but does not raise an error when the element is not in the set." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": true }, "outputs": [], "source": [ "t.discard(3)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{5, 6, 7}" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Discarding an element that is not present raises no error:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": true }, "outputs": [], "source": [ "t.discard(20)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "{5, 6, 7}" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ "t" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### difference_update方法" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ " a.difference_update(b)\n", "\n", "从a中去除所有属于b的元素:" ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.10" } }, "nbformat": 4, "nbformat_minor": 0 }