{ "cells": [ { "cell_type": "markdown", "id": "917503c7", "metadata": {}, "source": [ "# Sets" ] }, { "cell_type": "markdown", "id": "8e16c824", "metadata": {}, "source": [ "The lists and strings covered earlier are ordered sequences, whereas a set is an unordered collection.\n", "\n", "A set keeps at most one copy of each element: when the same value is added more than once, Python stores it only once (uniqueness). To make this check possible, every element of a set must be hashable, which in practice means immutable objects such as numbers, strings, and tuples of hashable items; mutable objects such as lists cannot be set elements." ] }, { "cell_type": "markdown", "id": "a3fe86ff", "metadata": {}, "source": [ "## Creating sets" ] }, { "cell_type": "markdown", "id": "7f76bfed", "metadata": {}, "source": [ "An empty set can be created explicitly with the `set()` function:" ] }, { "cell_type": "code", "execution_count": 1, "id": "9a36d941", "metadata": {}, "outputs": [], "source": [ "a = set()" ] }, { "cell_type": "code", "execution_count": 2, "id": "09823fe0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "set" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(a)" ] }, { "cell_type": "markdown", "id": "f7041dbc", "metadata": {}, "source": [ "A set can also be initialized from a list:" ] }, { "cell_type": "code", "execution_count": 3, "id": "5abd2789", "metadata": {}, "outputs": [], "source": [ "a = set([1, 2, 3, 1])" ] }, { "cell_type": "code", "execution_count": 4, "id": "2bbf546c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "id": "e944847d", "metadata": {}, "source": [ "The set automatically drops the duplicate element 1.\n", "\n", "Set elements are displayed inside braces `{}`, and the same brace syntax can be used to create a set:" ] }, { "cell_type": "code", "execution_count": 5, "id": "36f0069a", "metadata": {}, "outputs": [], "source": [ "a = {1, 2, 3, 1}" ] }, { "cell_type": "code", "execution_count": 6, "id": "7aa135d8", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "id": "4c7a4bd6", "metadata": {}, "source": [ "An empty set can only be created with `set()`, because in Python `{}` creates an empty dictionary:" ] }, { "cell_type": "code", 
"execution_count": 7, "id": "3c931eaf", "metadata": {}, "outputs": [], "source": [ "s = {}" ] }, { "cell_type": "code", "execution_count": 8, "id": "b5d36119", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(s)" ] }, { "cell_type": "markdown", "id": "62582fc6", "metadata": {}, "source": [ "## 集合操作" ] }, { "cell_type": "code", "execution_count": 9, "id": "001afdaa", "metadata": {}, "outputs": [], "source": [ "a = {1, 2, 3, 4}" ] }, { "cell_type": "code", "execution_count": 10, "id": "d2c03739", "metadata": {}, "outputs": [], "source": [ "b = {3, 4, 5, 6}" ] }, { "cell_type": "markdown", "id": "07586eb4", "metadata": {}, "source": [ "两个集合的并,返回包含两个集合所有元素的集合(去除重复)。可以用方法 `a.union(b)` 或者操作 `a | b` 实现:" ] }, { "cell_type": "code", "execution_count": 11, "id": "5d750f30", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6}" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.union(b)" ] }, { "cell_type": "code", "execution_count": 12, "id": "b0e4cd8b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a | b" ] }, { "cell_type": "markdown", "id": "026a0ad7", "metadata": {}, "source": [ "两个集合的交,返回包含两个集合共有元素的集合。可以用方法 `a.intersection(b)` 或者操作 `a & b` 实现:" ] }, { "cell_type": "code", "execution_count": 13, "id": "17bf0d05", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{3, 4}" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.intersection(b)" ] }, { "cell_type": "code", "execution_count": 14, "id": "069791d6", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{3, 4}" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a & b" ] }, { "cell_type": "code", "execution_count": 
15, "id": "78788eb3", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{3, 4}" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b & a" ] }, { "cell_type": "markdown", "id": "3efbcd55", "metadata": {}, "source": [ "The difference of a and b is the set of elements that are in a but not in b. It can be computed with the method `a.difference(b)` or the operator `a - b`:" ] }, { "cell_type": "code", "execution_count": 16, "id": "70bd8d5c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2}" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a - b" ] }, { "cell_type": "code", "execution_count": 17, "id": "58ec24a2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{5, 6}" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b - a" ] }, { "cell_type": "code", "execution_count": 18, "id": "e1db9069", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2}" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.difference(b)" ] }, { "cell_type": "code", "execution_count": 19, "id": "dfa41597", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{5, 6}" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b.difference(a)" ] }, { "cell_type": "markdown", "id": "b3327b1d", "metadata": {}, "source": [ "The symmetric difference of a and b is the set of elements that are in a or in b but not in both. It can be computed with the method `a.symmetric_difference(b)` or the operator `a ^ b` (the XOR operator):" ] }, { "cell_type": "code", "execution_count": 20, "id": "86f598cb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 5, 6}" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.symmetric_difference(b)" ] }, { "cell_type": "code", "execution_count": 21, "id": "7da81797", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 5, 6}" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ 
"b.symmetric_difference(a)" ] }, { "cell_type": "code", "execution_count": 22, "id": "7fad485b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 5, 6}" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b ^ a" ] }, { "cell_type": "code", "execution_count": 23, "id": "d26e579c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 5, 6}" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a ^ b" ] }, { "cell_type": "markdown", "id": "5958a36a", "metadata": {}, "source": [ "## 包含关系" ] }, { "cell_type": "code", "execution_count": 24, "id": "6d6041f1", "metadata": {}, "outputs": [], "source": [ "a = {1, 2, 3}" ] }, { "cell_type": "code", "execution_count": 25, "id": "fc550b35", "metadata": {}, "outputs": [], "source": [ "b = {1, 2}" ] }, { "cell_type": "markdown", "id": "60555390", "metadata": {}, "source": [ "`.issubset()` 方法或者`b <= a`判断子集:" ] }, { "cell_type": "code", "execution_count": 26, "id": "53ca6daa", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b.issubset(a)" ] }, { "cell_type": "code", "execution_count": 27, "id": "0e8e5e42", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b <= a" ] }, { "cell_type": "markdown", "id": "14d162c3", "metadata": {}, "source": [ "与之对应,也可以用`.issuperset()`方法或者`a >= b`来判断:" ] }, { "cell_type": "code", "execution_count": 28, "id": "925e8b5d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.issuperset(b)" ] }, { "cell_type": "code", "execution_count": 29, "id": "ed202f5e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 29, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "a >= b" ] }, { "cell_type": "markdown", "id": "c66308cf", "metadata": {}, "source": [ "操作符可以用来判断真子集:" ] }, { "cell_type": "code", "execution_count": 30, "id": "791a6235", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b > a" ] }, { "cell_type": "code", "execution_count": 31, "id": "eaf27b96", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a > b" ] }, { "cell_type": "markdown", "id": "d1346d70", "metadata": {}, "source": [ "## 集合方法" ] }, { "cell_type": "markdown", "id": "580a76b9", "metadata": {}, "source": [ "`.add()`方法添加单个元素:" ] }, { "cell_type": "code", "execution_count": 32, "id": "1e7c39cc", "metadata": {}, "outputs": [], "source": [ "t = {1, 2, 3}" ] }, { "cell_type": "code", "execution_count": 33, "id": "52472d19", "metadata": {}, "outputs": [], "source": [ "t.add(5)" ] }, { "cell_type": "code", "execution_count": 34, "id": "296d57d2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 5}" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t" ] }, { "cell_type": "markdown", "id": "030366cb", "metadata": {}, "source": [ "如果添加的是已有元素,集合不改变:" ] }, { "cell_type": "code", "execution_count": 35, "id": "65edef32", "metadata": {}, "outputs": [], "source": [ "t.add(3)" ] }, { "cell_type": "code", "execution_count": 36, "id": "c95cc4cd", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 5}" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t" ] }, { "cell_type": "markdown", "id": "c952e685", "metadata": {}, "source": [ "`.update()`方法更新多个元素:" ] }, { "cell_type": "code", "execution_count": 37, "id": "cbd3b751", "metadata": {}, "outputs": [], "source": [ "t.update([5, 6, 7])" ] }, { "cell_type": 
"code", "execution_count": 38, "id": "7e59b43b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 5, 6, 7}" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t" ] }, { "cell_type": "markdown", "id": "5f9a6312", "metadata": {}, "source": [ "## 不可变集合\n", "\n", "对应于元组与列表的关系,对于集合,Python提供了一种叫做不可变集合的数据结构。\n", "\n", "不可变集合使用frozenset()函数来进行创建:" ] }, { "cell_type": "code", "execution_count": 39, "id": "a88ef871", "metadata": {}, "outputs": [], "source": [ "s = frozenset([1, 2, 3, 'a', 1])" ] }, { "cell_type": "code", "execution_count": 40, "id": "dcacfceb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "frozenset({1, 2, 3, 'a'})" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "s" ] }, { "cell_type": "markdown", "id": "1d488c0e", "metadata": {}, "source": [ "与集合不同的是,不可变集合一旦创建就不可以改变。\n", "\n", "不可变集合的一个主要应用是用来作为字典的键。例如,用一个字典来记录两个城市之间的距离:" ] }, { "cell_type": "code", "execution_count": 41, "id": "c49dedc2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{frozenset({'Los Angeles', 'New York'}): 2498,\n", " frozenset({'Austin', 'Los Angeles'}): 1233,\n", " frozenset({'Austin', 'New York'}): 1515}" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flight_distance = {}\n", "city_pair = frozenset(['Los Angeles', 'New York'])\n", "flight_distance[city_pair] = 2498\n", "flight_distance[frozenset(['Austin', 'Los Angeles'])] = 1233\n", "flight_distance[frozenset(['Austin', 'New York'])] = 1515\n", "flight_distance" ] }, { "cell_type": "markdown", "id": "47072697", "metadata": {}, "source": [ "由于集合不分顺序,所以不同顺序不会影响查阅结果:" ] }, { "cell_type": "code", "execution_count": 42, "id": "81eb8df1", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1515" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flight_distance[frozenset(['New York','Austin'])]" ] }, { "cell_type": 
"code", "execution_count": 43, "id": "04c0a781", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1515" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "flight_distance[frozenset(['Austin','New York'])]" ] }, { "cell_type": "code", "execution_count": null, "id": "a21639af", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.10" } }, "nbformat": 4, "nbformat_minor": 5 }