{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 6. Object-Oriented Programming III\n", " \n", "Third and final session on OOP. In this session we'll cover the concept of functors and how to build custom iterators." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Functor\n", "A functor is an object that can be called as though it were a function.\n", "Any class that has a `__call__` special method is a functor.\n", "\n", "So, when is it then useful to have a class behaving as a function?\n", "The answer is, when you need it to be able to remember. A function is in this sense quite dumb and cannot bring with it any kind of memory. It would have to pass out it's findings as an output and in again as an argument or inplace changing a variable (not good design patterns as it disturbs the ussage of the function and lowers the abstraction level/increases complexity at ussage)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Functor example\n", "\n", "The examples considers how to clean a text string polluted with various special charactors." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# string to be stripped of special characters\n", "noisy_str = 't#hi?s i\"s m¤y n%ois=y& str/in(g'\n", "\n", "# list of special chars to be stripped for\n", "special_chars = ['!', \"'\", '\"', '#', '¤', '%', '&', '/', '(', ')', '=', '?']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's start by getting the task done using a classic function definition:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "def strip_string(string, special_chars):\n", " clean_str = ''.join([ c for c in string if c not in special_chars ])\n", " return clean_str" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cleaned string is: \"this is my noisy string\"\n" ] } ], "source": [ "clean_str = strip_string(noisy_str, special_chars)\n", "print(f'Cleaned string is: \"{clean_str}\"')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's instead use a functor for the exact same task:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "class StripStringOf():\n", " \n", " def __init__(self, special_chars):\n", " self.special_chars = special_chars\n", " \n", " def __call__(self, string):\n", " clean_str = ''.join([ c for c in string if c not in self.special_chars ])\n", " return clean_str" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cleaned string is: \"this is my noisy string\"\n" ] } ], "source": [ "strip_functor = StripStringOf(special_chars) # initialize functor\n", "clean_str = strip_functor(noisy_str) # use functor as a function\n", "print(f'Cleaned string is: \"{clean_str}\"')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Because of the `__call__` special method we can now use the class instance just as if it's a function. Notice also how we saves the special_chars to self in the class constructor `__init__`, so they're remembered for later use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To underline the functor's ability to remember we'll extend the class with a memory of how many charactors it have stripped:" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "class StripStringOf():\n", " \n", " def __init__(self, special_chars):\n", " self.special_chars = special_chars\n", " self.strip_nos = 0 # number of stripped characters\n", " \n", " def __call__(self, string):\n", " clean_str = ''.join([ c for c in string if c not in self.special_chars ])\n", " self.strip_nos += len(string) - len(clean_str)\n", " return clean_str" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cleaned string is: \"this is my noisy string\"\n", "So far I've stripped 9 special characters\n" ] } ], "source": [ "strip_functor = StripStringOf(special_chars) # initialize functor\n", "clean_str = strip_functor(noisy_str)\n", "print(f'Cleaned string is: \"{clean_str}\"')\n", "print(f\"So far I've stripped {strip_functor.strip_nos} special characters\")" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Cleaned string is: \"this is my noisy string\"\n", "So far I've stripped 18 special characters\n" ] } ], "source": [ "clean_str = strip_functor(noisy_str)\n", "print(f'Cleaned string is: \"{clean_str}\"')\n", "print(f\"So far I've stripped {strip_functor.strip_nos} special characters\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So beside calling methods of a class instance you now also know how to make the instance itself callable. A useful concept when you need a slightly more intelligent function with memory capabilities." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Iterator\n", "Iterators are a very memory-efficient way of looping over something you would rather not load into memory at once.\n", "They're refered to as “lazy” because they only compute the next item once you need it and are therefore also ideal if you to not expect to complete a for-loop (e.g. by breaking out once the item you searched for is found).\n", "\n", "Note that an `iterable` is something you can iterate over while an `iterator` is the object that does the actual iterating.\n", "\n", "More and more of the built-in functions like `filter`, `map`, `enumerate`, `zip`, `reversed` partly covered in session 3 now returns iterator objects instead of a list objects, which is good for performance but sometimes requires you to do a `list()` call to force the lazy object to do its computations." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Type = \n", "Memory size = 9088 bytes\n", "Sum = 499500\n" ] } ], "source": [ "# without iterator\n", "long_list = list(range(1000))\n", "print('Type =', type(long_list))\n", "print(f'Memory size = {long_list.__sizeof__()} bytes')\n", "print('Sum =', sum(long_list))" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Type = \n", "Memory size = 32 bytes\n", "Sum = 499500\n" ] } ], "source": [ "# with iterator\n", "long_list_iter = iter(range(1000))\n", "print('Type =', type(long_list_iter))\n", "print(f'Memory size = {long_list_iter.__sizeof__()} bytes')\n", "print('Sum =', sum(long_list_iter))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All of the built-in iterable data structures can be converted into iterators via `iter()`:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iter([1,2,3])" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iter((1,2,3))" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iter('string')" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "iter({'1':2, '2':3})" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This feature is utilised in e.g. `for` loops where an iterator object is automatically created and followed by a `next()` method call for each loop until the iterator is exusted." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generator\n", "A special type of iterator is the generator (and coroutines) previously presented in session 3.\n", "A generator is built by a function that has one or more yield expressions but can also be defined via the more compact generator expression." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "# defines a generator that keeps squaring a number\n", "def squares(num):\n", " while True:\n", " num = num**2\n", " yield num" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Type = \n" ] } ], "source": [ "# initialize generator\n", "generator = squares(2)\n", "print('Type =', type(generator))" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(generator)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "16\n", "256\n", "65536\n", "4294967296\n" ] } ], "source": [ "for x in generator:\n", " print(x)\n", " if x > 1000000: break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generators provide a very convenient way to implement the iterator protocol with very limited code." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Generator subclass\n", "We can confirm that the `Generator class` is a subclass of the `Iterator class` by:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from collections.abc import Iterator, Generator # imports the respective abstract base classes\n", "issubclass(Generator, Iterator)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "issubclass(Iterator, Generator)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also confirm that the generator instance above is an instance of both the Generator and the Iterator class while the long_list_iter is only an instance of the Iterator class" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isinstance(generator, Generator)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isinstance(generator, Iterator)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isinstance(long_list_iter, Generator)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isinstance(long_list_iter, Iterator)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Custom iterator\n", "So in most cases we can either convert one of the typical data structures into an interator or use the very convinient generator.\n", "\n", "However, what if we need to build our own custom iterator not limited by the simple convient implementations above?\n", "\n", "For this we'll have to define it as a class with the special methods `__iter__` and `__next__`. \n", "The squaring generator example from before is reproduced below as a proper iterator:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [], "source": [ "class Squares():\n", " \n", " def __init__(self, num):\n", " self.num = num\n", " \n", " def __iter__(self):\n", " return self # returns self if an iterator is requested from a iterator\n", " \n", " def __next__(self):\n", " self.num = self.num**2\n", " return self.num" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "# initialize instance\n", "iterator = Squares(2)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(iterator)" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "16\n", "256\n", "65536\n", "4294967296\n" ] } ], "source": [ "for x in iterator:\n", " print(x)\n", " if x > 1000000: break" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is e.g. useful if you're trying to loop over content on a remote service where you don't want to download everything up front and you need an intelligent object capable of dealing with connection issues. Maybe you would like it to try to reconnect 3 times before it throws an ConnectionLost Exception." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercises\n", "*Remember to keep using git for version control of your code. Preferably this should become a new good habit of yours.*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 1\n", "To get some hands-on expereince with functors you shall now implement your own Accumulator class.\n", "Instances of this class shall continuously be accumulating all incomming function call arguments. The remembered current sum should be stored in a private attribute and made available as a property.\n", "\n", "Below is illustrated how your functor should behaive:" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [], "source": [ "from solution_module import Accumulator" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": [ "# initializing two instances with different starting values\n", "adder_A = Accumulator(start=2)\n", "adder_B = Accumulator(start=3)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I've now added 3 to my sum\n" ] } ], "source": [ "adder_A(3)" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I've now added 4 to my sum\n" ] } ], "source": [ "adder_B(4)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I've now added 5 to my sum\n" ] } ], "source": [ "adder_A(5)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "I've now added 6 to my sum\n" ] } ], "source": [ "adder_B(6)" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adder_A.current_sum" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "13" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "adder_B.current_sum" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Hint\n", "Remember that a property (aka. getter method) is defined by adding the @property decorator to a method definition. The name of the method reflects the property name." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise 2\n", "Duplicate the \"count_to_10\" counting generator from the session 3 material using the custom class definition. The constructor (`__init__`) should use the default arguments of start=0 and step=1. Beside \n", "the `__iter__` and `__next__` special methods the object should also have a `set_step` method, so the user can adjust the step size along the way.\n", "\n", "Below is illustrated how your iterator should behaive:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "from solution_module import Count_to_10\n", "counter = Count_to_10(start=0, step=1)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(counter)" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(counter)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n", "6\n", "8\n", "10\n", "I can only count to 10! :(\n" ] } ], "source": [ "counter.set_step(2)\n", "for count in counter:\n", " print(count)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Hint\n", "Remember that an iterator throws a StopIteration exception when it is exhausted. This is done by `raise(StopIteration)`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# End of exercises\n", "*The cell below is for setting the style of this document. It's not part of the exercises.*" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [ { "data": { "text/html": [ "" ], "text/plain": [ "" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Apply css theme to notebook\n", "from IPython.display import HTML\n", "HTML(''.format(open('../css/cowi.css').read()))" ] } ], "metadata": { "hide_input": false, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Table of Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }