{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 7. Set\n", "\n", "**Sets** are unordered collections of unique, hashable elements in Python. Like lists, they are mutable containers, but they do not allow duplicate elements and do not support indexing. Sets are often used for membership testing, removing duplicates from a list, and performing set operations like union, intersection, and difference.\n", "\n", "We'll learn about the following topics:\n", "\n", " - [7.1. Creating Sets](#Creating_Sets)\n", " - [7.2. Set Properties](#Set_Properties)\n", " - [7.3. Set Operators](#Set_Operators)\n", " - [7.4. Built-in Set Methods](#Builtin_Set_Methods)\n", " - [7.5. Frozen Sets](#Frozen_Sets)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p align=\"center\">\n", " <img width=\"550\" height=\"300\" src=\"https://realpython.com/cdn-cgi/image/width=960,format=auto/https://files.realpython.com/media/Sets-in-Python_Watermarked.cd8d2e9563c3.jpg\">\n", "</p>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<table>\n", " <thead>\n", " <tr>\n", " <th>Name</th>\n", " <th>Type in Python</th>\n", " <th>Description</th>\n", " <th>Example</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <td>Sets</td>\n", " <td>set</td>\n", " <td>unordered collection of unique elements.</td>\n", " <td>{10, 'hello'}</td> \n", " </tr>\n", " </tbody>\n", "</table>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a name='Creating_Sets'></a>\n", "\n", "## 7.1. Creating Sets:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A set is created by placing its elements inside curly braces `{}`, separated by commas. Note that the braces must contain at least one element: empty braces `{}` create a dictionary, not a set." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "set1 = {10, \"hello\", 3.14, True}" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "set" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(set1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A set can also be created with the built-in `set()` function. Called with no arguments, it returns an empty set." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "a = set()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "set" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(a)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Add elements to a set with the add() method\n", "a.add(89)\n", "a.add('hello')\n", "a.add(2.0)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, 89, 'hello'}" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a name='Set_Properties'></a>\n", "\n", "## 7.2. Set Properties:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Unique Elements**" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Create a list with repeated items\n", "lst = [1, 1, 2, 2, 3, 4, 5, 6, 1, 1]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{1, 2, 3, 4, 5, 6}" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Convert to a set to keep only the unique values\n", "set(lst)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After converting the list to a set with the `set()` function, only the unique items remain, because a set stores each element at most once." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Unordered**: Sets do not record the order of their elements, so two sets with the same elements are equal regardless of the order in which the elements were written." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "a = {89, 'hello', 2.0}\n", "b = {2.0, 89, 'hello'}" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a == b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a name='Set_Operators'></a>\n", "\n", "## 7.3. Set Operators:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "c = set()\n", "\n", "c.add('world')\n", "c.add('20')\n", "c.add(89)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Union**: The union of two sets is computed with the `|` operator. It contains all elements that are in either set or in both sets." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, '20', 89, 'hello', 'world'}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a | c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Intersection**: The intersection of two sets is computed with the `&` operator. It contains only the elements that are present in both sets." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{89}" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a & c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Difference**: `a - c` returns the set of all elements that are in `a` but not in `c`." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, 'hello'}" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a - c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Symmetric Difference**: The symmetric difference of two sets is computed with the `^` operator. It contains the elements that are in exactly one of the two sets, that is, in either set but not in both." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, '20', 'hello', 'world'}" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a ^ c" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Subset**: The `<=` operator checks whether one set is a subset of another, that is, whether every element of the first set is also in the second." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d = set()\n", "d.add('hello')\n", "\n", "d <= a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Proper Subset**: The `<` operator checks whether one set is a proper subset of another. A proper subset is the same as a subset, except that the sets can’t be identical. While a set is considered a subset of itself, it is not a proper subset of itself." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a < a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Superset**: The `>=` operator checks whether one set is a superset of another. A superset contains all the elements of the other set and possibly more." 
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a >= d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- **Proper Superset**: The `>` operator checks whether one set is a proper superset of another. A proper superset is the same as a superset, except that the sets can’t be identical." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a > d" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a name='Builtin_Set_Methods'></a>\n", "\n", "## 7.4. Built-in Set Methods:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<table>\n", " <thead>\n", " <tr>\n", " <th>Method</th>\n", " <th>Description</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <td>union(set)</td>\n", " <td>returns a new set containing every element from all of the sets</td>\n", " </tr> \n", " <tr>\n", " <td>intersection(set)</td>\n", " <td>returns the set of elements present in all of the sets</td>\n", " </tr>\n", " <tr>\n", " <td>difference(set)</td>\n", " <td>x1.difference(x2) returns the set of all elements that are in x1 but not in x2</td>\n", " </tr> \n", " <tr>\n", " <td>symmetric_difference(set)</td>\n", " <td>returns the set of elements that are in exactly one of the two sets</td>\n", " </tr>\n", " <tr>\n", " <td>isdisjoint(set)</td>\n", " <td>determines whether or not two sets have any elements in common. 
Returns True if they have no elements in common</td>\n", " </tr>\n", " <tr>\n", " <td>issubset(set)</td>\n", " <td>a.issubset(b) returns True if every element of a is also an element of b</td>\n", " </tr>\n", " <tr>\n", " <td>issuperset(set)</td>\n", " <td>a.issuperset(b) returns True if every element of b is also an element of a</td>\n", " </tr>\n", " <tr>\n", " <td>update(set)</td>\n", " <td>adds to the set any elements of the other set that it does not already contain</td>\n", " </tr>\n", " <tr>\n", " <td>intersection_update(set)</td>\n", " <td>keeps only the elements found in both sets, modifying the original set</td>\n", " </tr>\n", " <tr>\n", " <td>difference_update(set)</td>\n", " <td>like difference(), except that it modifies the original set instead of returning a new one</td>\n", " </tr> \n", " <tr>\n", " <td>symmetric_difference_update(set)</td>\n", " <td>like symmetric_difference(), except that it modifies the original set instead of returning a new one</td>\n", " </tr>\n", " <tr>\n", " <td>add(m)</td>\n", " <td>adds the item m to the set</td>\n", " </tr>\n", " <tr>\n", " <td>remove(m)</td>\n", " <td>removes m from the set; raises a KeyError if m is not in the set</td>\n", " </tr>\n", " <tr>\n", " <td>discard(m)</td>\n", " <td>removes m from the set. 
However, if m is not in the set, discard does nothing instead of raising an exception</td>\n", " </tr> \n", " <tr>\n", " <td>pop()</td>\n", " <td>removes and returns an arbitrary element from the set; raises a KeyError if the set is empty</td>\n", " </tr>\n", " <tr>\n", " <td>clear()</td>\n", " <td>removes all elements from the set</td>\n", " </tr> \n", " </tbody>\n", "</table>" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p align=\"center\">\n", " <img width=\"300\" height=\"100\" src=\"https://files.realpython.com/media/t.ca57b915cec6.png\">\n", "</p> " ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, '20', 89, 'hello', 'world'}" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.union(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "There is a subtle difference between the `|` operator and the `.union()` method. With the `|` operator, both operands must be sets. The `.union()` method, on the other hand, accepts any iterable as an argument, converts it to a set, and then performs the union." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, 28, 89, 'a', 'b', 'hello'}" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.union(('a', 'b', 28))" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "unsupported operand type(s) for |: 'set' and 'tuple'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m~\\AppData\\Local\\Temp\\ipykernel_10548\\564823127.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0ma\u001b[0m \u001b[1;33m|\u001b[0m \u001b[1;33m(\u001b[0m\u001b[1;34m'a'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'b'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m28\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for |: 'set' and 'tuple'" ] } ], "source": [ "a | ('a', 'b', 28)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p align=\"center\">\n", " <img width=\"300\" height=\"100\" src=\"https://files.realpython.com/media/t.9c6d33717cdc.png\">\n", "</p> " ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{89}" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.intersection(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p align=\"center\">\n", " <img width=\"300\" height=\"100\" src=\"https://files.realpython.com/media/t.a90b4c323d99.png\">\n", "</p> " ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, 'hello'}" ] }, "execution_count": 24, "metadata": {}, "output_type": 
"execute_result" } ], "source": [ "a.difference(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<p align=\"center\">\n", " <img width=\"300\" height=\"100\" src=\"https://files.realpython.com/media/t.604de51646cc.png\">\n", "</p>" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, '20', 'hello', 'world'}" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.symmetric_difference(c)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.isdisjoint(c)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "d.issubset(a)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.issuperset(d)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{2.0, 89, 'A', 'a', 'hello'}" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.update(['a', 'A'])\n", "\n", "a" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "#permanently changes the set\n", "a.intersection_update(d)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'hello'}" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "a.remove('hello')" ] }, { "cell_type": "code", "execution_count": 33, 
"metadata": {}, "outputs": [ { "data": { "text/plain": [ "set()" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [], "source": [ "a.discard('hello')" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "ename": "KeyError", "evalue": "'hello'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m~\\AppData\\Local\\Temp\\ipykernel_10548\\918437660.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0ma\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mremove\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'hello'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mKeyError\u001b[0m: 'hello'" ] } ], "source": [ "a.remove('hello')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a name='Frozen_Sets'></a>\n", "\n", "## 7.5. Frozen Sets:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Python provides another built-in type called a frozenset, which is in all respects exactly like a set, except that a frozenset is immutable. " ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [], "source": [ "x = frozenset(['a', 45, '78'])" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "frozenset({45, '78', 'a'})" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Any attempt to modify a frozenset will fail." 
] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "ename": "AttributeError", "evalue": "'frozenset' object has no attribute 'add'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mAttributeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m~\\AppData\\Local\\Temp\\ipykernel_10548\\3997461639.py\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mx\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0madd\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'b'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mAttributeError\u001b[0m: 'frozenset' object has no attribute 'add'" ] } ], "source": [ "x.add('b')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 4 }