{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "917503c7",
   "metadata": {},
   "source": [
    "# Sets"
   ]
  },
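  {
   "cell_type": "markdown",
   "id": "8e16c824",
   "metadata": {},
   "source": [
    "The lists and strings seen so far are ordered sequences, whereas a set is an unordered collection of elements.\n",
    "\n",
    "A set never holds duplicates: if the same element is supplied more than once, Python keeps only one copy (uniqueness). To detect duplicates reliably, Python hashes each element, so only hashable objects can be placed in a set (definiteness); among the built-in types these are the immutable ones, such as numbers, strings, and tuples of hashable values."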
   ]
  },
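  {
   "cell_type": "markdown",
   "id": "c0a1e5f1",
   "metadata": {},
   "source": [
    "As a minimal sketch of the requirement that set elements be immutable (the names `ok` and `error_name` are illustrative and not part of the original text): numbers, strings, and tuples can be stored in a set, while a mutable list is rejected with a `TypeError`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0a1e5f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Hashable (immutable) values are accepted as set elements.\n",
    "ok = {1, 'a', (2, 3)}\n",
    "\n",
    "# A list is mutable and unhashable, so it cannot be a set element.\n",
    "try:\n",
    "    bad = {[1, 2]}\n",
    "    error_name = None\n",
    "except TypeError as exc:\n",
    "    error_name = type(exc).__name__\n",
    "\n",
    "error_name"
   ]
  },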
  {
   "cell_type": "markdown",
   "id": "a3fe86ff",
   "metadata": {},
   "source": [
    "## Creating sets"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7f76bfed",
   "metadata": {},
   "source": [
    "An empty set can be created explicitly with the `set()` function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "9a36d941",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = set()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "09823fe0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "set"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(a)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f7041dbc",
   "metadata": {},
   "source": [
    "A set can also be initialized from a list:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "5abd2789",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = set([1, 2, 3, 1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "2bbf546c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 3}"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
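  {
   "cell_type": "markdown",
   "id": "e944847d",
   "metadata": {},
   "source": [
    "The set automatically drops the duplicate element 1.\n",
    "\n",
    "Set elements are displayed inside braces `{}`, and the same brace syntax can be used to create a set:"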
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "36f0069a",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = {1, 2, 3, 1}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "7aa135d8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 3}"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c7a4bd6",
   "metadata": {},
   "source": [
    "An empty set can only be created with `set()`, because in Python `{}` creates an empty dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "3c931eaf",
   "metadata": {},
   "outputs": [],
   "source": [
    "s = {}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "b5d36119",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dict"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "type(s)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "62582fc6",
   "metadata": {},
   "source": [
    "## Set operations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "001afdaa",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = {1, 2, 3, 4}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "d2c03739",
   "metadata": {},
   "outputs": [],
   "source": [
    "b = {3, 4, 5, 6}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "07586eb4",
   "metadata": {},
   "source": [
    "The union of two sets is the set of all elements that appear in either one, with duplicates removed. Use the method `a.union(b)` or the operator `a | b`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "5d750f30",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 3, 4, 5, 6}"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.union(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "b0e4cd8b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 3, 4, 5, 6}"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a | b"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "026a0ad7",
   "metadata": {},
   "source": [
    "The intersection of two sets is the set of elements common to both. Use the method `a.intersection(b)` or the operator `a & b`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "17bf0d05",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{3, 4}"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.intersection(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "069791d6",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{3, 4}"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a & b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "78788eb3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{3, 4}"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b & a"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3efbcd55",
   "metadata": {},
   "source": [
    "The difference of a and b is the set of elements that are in a but not in b. Use the method `a.difference(b)` or the operator `a - b`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "70bd8d5c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2}"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a - b"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "58ec24a2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{5, 6}"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b - a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "e1db9069",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2}"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.difference(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "dfa41597",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{5, 6}"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b.difference(a)"
   ]
  },
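  {
   "cell_type": "markdown",
   "id": "b3327b1d",
   "metadata": {},
   "source": [
    "The symmetric difference of a and b is the set of elements that are in a or in b, but not in both. Use the method `a.symmetric_difference(b)` or the operator `a ^ b` (the same symbol as the XOR operator):"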
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "86f598cb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 5, 6}"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.symmetric_difference(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "7da81797",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 5, 6}"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b.symmetric_difference(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "7fad485b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 5, 6}"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b ^ a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "d26e579c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 5, 6}"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a ^ b"
   ]
  },
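  {
   "cell_type": "markdown",
   "id": "c0a1e5f3",
   "metadata": {},
   "source": [
    "The four operations are related to one another. As a small sketch (the names `via_union`, `via_diffs`, and `all_equal` are illustrative, and `a` and `b` are redefined with the same values as above so the cell stands alone), the symmetric difference can be rebuilt from the other operations in two ways:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0a1e5f4",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = {1, 2, 3, 4}\n",
    "b = {3, 4, 5, 6}\n",
    "\n",
    "# Union minus intersection: everything except the shared elements.\n",
    "via_union = (a | b) - (a & b)\n",
    "# Union of the two one-sided differences.\n",
    "via_diffs = (a - b) | (b - a)\n",
    "\n",
    "# Both constructions agree with the ^ operator.\n",
    "all_equal = (a ^ b) == via_union == via_diffs\n",
    "all_equal"
   ]
  },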
  {
   "cell_type": "markdown",
   "id": "5958a36a",
   "metadata": {},
   "source": [
    "## Subset and superset relations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "6d6041f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = {1, 2, 3}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "fc550b35",
   "metadata": {},
   "outputs": [],
   "source": [
    "b = {1, 2}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "60555390",
   "metadata": {},
   "source": [
    "Use the `.issubset()` method or `b <= a` to test whether b is a subset of a:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "53ca6daa",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b.issubset(a)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "0e8e5e42",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b <= a"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14d162c3",
   "metadata": {},
   "source": [
    "Correspondingly, use the `.issuperset()` method or `a >= b` to test whether a is a superset of b:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "925e8b5d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a.issuperset(b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "ed202f5e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a >= b"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c66308cf",
   "metadata": {},
   "source": [
    "The strict comparison operators `<` and `>` test for a proper subset or a proper superset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "791a6235",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "b > a"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "eaf27b96",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 31,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "a > b"
   ]
  },
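  {
   "cell_type": "markdown",
   "id": "c0a1e5f5",
   "metadata": {},
   "source": [
    "The difference between the strict and non-strict comparisons only shows up when the two sets are equal. A minimal sketch, using the illustrative names `same`, `subset_or_equal`, and `proper_subset` (not part of the original example):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0a1e5f6",
   "metadata": {},
   "outputs": [],
   "source": [
    "a = {1, 2, 3}\n",
    "same = {3, 2, 1}  # equal to a; element order is irrelevant\n",
    "\n",
    "# <= holds because every element of same is also in a.\n",
    "subset_or_equal = same <= a\n",
    "# < fails because a has no element that same lacks.\n",
    "proper_subset = same < a\n",
    "\n",
    "(subset_or_equal, proper_subset)"
   ]
  },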
  {
   "cell_type": "markdown",
   "id": "d1346d70",
   "metadata": {},
   "source": [
    "## Set methods"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "580a76b9",
   "metadata": {},
   "source": [
    "The `.add()` method adds a single element:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "1e7c39cc",
   "metadata": {},
   "outputs": [],
   "source": [
    "t = {1, 2, 3}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "id": "52472d19",
   "metadata": {},
   "outputs": [],
   "source": [
    "t.add(5)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "id": "296d57d2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 3, 5}"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "030366cb",
   "metadata": {},
   "source": [
    "Adding an element that is already present leaves the set unchanged:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "65edef32",
   "metadata": {},
   "outputs": [],
   "source": [
    "t.add(3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "c95cc4cd",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 3, 5}"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c952e685",
   "metadata": {},
   "source": [
    "The `.update()` method adds multiple elements at once:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "cbd3b751",
   "metadata": {},
   "outputs": [],
   "source": [
    "t.update([5, 6, 7])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "7e59b43b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{1, 2, 3, 5, 6, 7}"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t"
   ]
  },
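  {
   "cell_type": "markdown",
   "id": "c0a1e5f7",
   "metadata": {},
   "source": [
    "A small sketch, assuming a fresh illustrative set `u` that is not part of the original example: `.update()` accepts one or more iterables of any kind, and the in-place operator `|=` performs the same union when the right-hand side is itself a set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0a1e5f8",
   "metadata": {},
   "outputs": [],
   "source": [
    "u = {1, 2, 3}\n",
    "\n",
    "# A tuple and a string are both iterables; the string is added character by character.\n",
    "u.update((4, 5), 'ab')\n",
    "\n",
    "# In-place union; the right-hand side must be a set or frozenset.\n",
    "u |= {6}\n",
    "\n",
    "u"
   ]
  },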
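  {
   "cell_type": "markdown",
   "id": "5f9a6312",
   "metadata": {},
   "source": [
    "## Immutable sets\n",
    "\n",
    "Just as tuples are the immutable counterpart of lists, Python provides an immutable counterpart of sets, the frozen set.\n",
    "\n",
    "An immutable set is created with the `frozenset()` function:"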
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "a88ef871",
   "metadata": {},
   "outputs": [],
   "source": [
    "s = frozenset([1, 2, 3, 'a', 1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "id": "dcacfceb",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "frozenset({1, 2, 3, 'a'})"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "s"
   ]
  },
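  {
   "cell_type": "markdown",
   "id": "1d488c0e",
   "metadata": {},
   "source": [
    "Unlike an ordinary set, an immutable set cannot be changed once it has been created.\n",
    "\n",
    "One major use of immutable sets is as dictionary keys. For example, a dictionary can record the distance between pairs of cities:"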
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "id": "c49dedc2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{frozenset({'Los Angeles', 'New York'}): 2498,\n",
       " frozenset({'Austin', 'Los Angeles'}): 1233,\n",
       " frozenset({'Austin', 'New York'}): 1515}"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "flight_distance = {}\n",
    "city_pair = frozenset(['Los Angeles', 'New York'])\n",
    "flight_distance[city_pair] = 2498\n",
    "flight_distance[frozenset(['Austin', 'Los Angeles'])] = 1233\n",
    "flight_distance[frozenset(['Austin', 'New York'])] = 1515\n",
    "flight_distance"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "47072697",
   "metadata": {},
   "source": [
    "Because sets are unordered, the order in which the two cities are given does not affect the lookup:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "id": "81eb8df1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1515"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "flight_distance[frozenset(['New York','Austin'])]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "id": "04c0a781",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1515"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "flight_distance[frozenset(['Austin','New York'])]"
   ]
  },
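  {
   "cell_type": "markdown",
   "id": "c0a1e5f9",
   "metadata": {},
   "source": [
    "To illustrate why `frozenset` rather than `set` is used for the keys, the following sketch (with the hypothetical names `frozen`, `has_add`, `lookup`, and `key_error`) checks that a frozen set has no `.add()` method and that an ordinary set is rejected as a dictionary key with a `TypeError`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c0a1e5fa",
   "metadata": {},
   "outputs": [],
   "source": [
    "frozen = frozenset([1, 2, 3])\n",
    "\n",
    "# A frozenset exposes no mutating methods such as add().\n",
    "has_add = hasattr(frozen, 'add')\n",
    "\n",
    "lookup = {}\n",
    "try:\n",
    "    # An ordinary set is unhashable, so it cannot serve as a dictionary key.\n",
    "    lookup[{1, 2}] = 'value'\n",
    "    key_error = None\n",
    "except TypeError as exc:\n",
    "    key_error = type(exc).__name__\n",
    "\n",
    "(has_add, key_error)"
   ]
  },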
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a21639af",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}