{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Operator Overloading in Python\n", "\n", "By Paulo Scardine - http://goo.gl/Ke1P0p\n", "\n", "## The problem\n", "\n", "On StackOverflow, an R user asked how to implement the pipe operator (`%>%`) from the [dplyr](https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html) package, where `x %>% f(y)` is equivalent to `f(x, y)`. In addition, they wanted a syntax similar to the [Pipe](http://pypi.python.org/pypi/pipe/) package from the Cheese Shop (PyPI):\n", "\n", "    df = df | select('one') | rename(one='new_one')\n", "\n", "The Pipe package calls this syntax \"infix notation\"; it is equivalent to:\n", "\n", "    df = rename(select(df, 'one'), one='new_one')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "import pandas as pd\n", "\n", "df = pd.DataFrame({'one' : [1., 2., 3., 4., 4.],\n", "                   'two' : [4., 3., 2., 1., 3.]})\n", "\n", "def select(df, *args):\n", "    # Keep only the named columns.\n", "    return df[list(args)]\n", "\n", "\n", "def rename(df, **kwargs):\n", "    # Each keyword is an old column name; its value is the new name.\n", "    for name, value in kwargs.items():\n", "        df = df.rename(columns={'%s' % name: '%s' % value})\n", "    return df" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
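<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>4</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>3</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>2</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>1</td></tr>\n",
"    <tr><th>4</th><td>4</td><td>3</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>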
" ], "text/plain": [ " one two\n", "0 1 4\n", "1 2 3\n", "2 3 2\n", "3 4 1\n", "4 4 3" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
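<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td></tr>\n",
"    <tr><th>1</th><td>2</td></tr>\n",
"    <tr><th>2</th><td>3</td></tr>\n",
"    <tr><th>3</th><td>4</td></tr>\n",
"    <tr><th>4</th><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>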
" ], "text/plain": [ " one\n", "0 1\n", "1 2\n", "2 3\n", "3 4\n", "4 4" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "select(df, 'one')" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "data": { "text/html": [ "
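<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>other</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td></tr>\n",
"    <tr><th>1</th><td>2</td></tr>\n",
"    <tr><th>2</th><td>3</td></tr>\n",
"    <tr><th>3</th><td>4</td></tr>\n",
"    <tr><th>4</th><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>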
" ], "text/plain": [ " other\n", "0 1\n", "1 2\n", "2 3\n", "3 4\n", "4 4" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rename(select(df, 'one'), one='other')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## How to overload operators in Python\n", "\n", "For each operator, Python has one or more magic `__dunder__` methods: one for the normal operation and one for the reflected (\"right-hand\") operation, which Python tries when the left operand does not support the operation. For example, to implement the `+` operator you define the `__add__` method; `__radd__` covers the case where your object is the right-hand operand.\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "class Idem(object):\n", "    def __add__(self, other):\n", "        # Called for `idem + other`.\n", "        return other * 2\n", "\n", "\n", "idem = Idem()" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "idem + 5" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "TypeError", "evalue": "unsupported operand type(s) for +: 'int' and 'Idem'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[1;36m5\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0midem\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for +: 'int' and 'Idem'" ] } ], "source": [ "5 + idem" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false, "slideshow": { 
"slide_type": "fragment" } }, "outputs": [], "source": [ "class Idem(object):\n", "    def __add__(self, other):\n", "        return other * 2\n", "\n", "    def __radd__(self, other):\n", "        # Called for `other + idem` when `other` cannot handle the addition.\n", "        return self.__add__(other)\n", "\n", "idem = Idem()" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "5 + idem" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## A controversial example\n", "\n", "Adding a `datetime.date` to a `datetime.time`:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": true, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ "import datetime\n", "\n", "# The names are Portuguese: data = date, hora = time.\n", "data = datetime.date.today()\n", "hora = datetime.time(19)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "SmartDate(2015, 11, 23)" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "datetime.time(19, 0)" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hora" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "datetime.datetime(2015, 11, 23, 20, 24, 15, 858188)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "datetime.datetime.now()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } 
}, "outputs": [ { "ename": "TypeError", "evalue": "unsupported operand type(s) for +: 'datetime.date' and 'datetime.time'", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mdata\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0mhora\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for +: 'datetime.date' and 'datetime.time'" ] } ], "source": [ "data + hora" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "class SmartDate(datetime.date):\n", "    def __add__(self, other):\n", "        # date + time gives a datetime; anything else uses date's own addition.\n", "        if isinstance(other, datetime.time):\n", "            return datetime.datetime.combine(self, other)\n", "        return super(SmartDate, self).__add__(other)\n" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "data = SmartDate(*data.timetuple()[:3])" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "datetime.datetime(2015, 11, 23, 19, 0)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data + hora" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Principle of Least Surprise\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true, "slideshow": { "slide_type": "slide" } }, "source": [ "## Finally" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false, "slideshow": { "slide_type": "subslide" } }, "outputs": [], "source": [ 
"def pipe(original):\n", "    class PipeInto(object):\n", "        def __init__(self, *args, **kwargs):\n", "            # Store the call arguments per instance rather than in a shared class-level dict.\n", "            self.args = args\n", "            self.kwargs = kwargs\n", "\n", "        def __rrshift__(self, other):\n", "            # `other >> self` calls the wrapped function with `other` as its first argument.\n", "            return original(other, *self.args, **self.kwargs)\n", "\n", "    return PipeInto\n", "\n", "\n", "@pipe\n", "def select(df, *args):\n", "    return df[list(args)]\n", "\n", "\n", "@pipe\n", "def rename(df, **kwargs):\n", "    for name, value in kwargs.items():\n", "        df = df.rename(columns={'%s' % name: '%s' % value})\n", "    return df" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
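<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>two</th><th>one</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>4</td><td>1</td></tr>\n",
"    <tr><th>1</th><td>3</td><td>2</td></tr>\n",
"    <tr><th>2</th><td>2</td><td>3</td></tr>\n",
"    <tr><th>3</th><td>1</td><td>4</td></tr>\n",
"    <tr><th>4</th><td>3</td><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>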
" ], "text/plain": [ " two one\n", "0 4 1\n", "1 3 2\n", "2 2 3\n", "3 1 4\n", "4 3 4" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df >> select('two', 'one')" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
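<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>one</th><th>two</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td><td>4</td></tr>\n",
"    <tr><th>1</th><td>2</td><td>3</td></tr>\n",
"    <tr><th>2</th><td>3</td><td>2</td></tr>\n",
"    <tr><th>3</th><td>4</td><td>1</td></tr>\n",
"    <tr><th>4</th><td>4</td><td>3</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>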
" ], "text/plain": [ " one two\n", "0 1 4\n", "1 2 3\n", "2 3 2\n", "3 4 1\n", "4 4 3" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "
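<div>\n", "<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\"><th></th><th>first</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>1</td></tr>\n",
"    <tr><th>1</th><td>2</td></tr>\n",
"    <tr><th>2</th><td>3</td></tr>\n",
"    <tr><th>3</th><td>4</td></tr>\n",
"    <tr><th>4</th><td>4</td></tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>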
" ], "text/plain": [ " first\n", "0 1\n", "1 2\n", "2 3\n", "3 4\n", "4 4" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df >> select('one') >> rename(one='first')" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "32" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "16 << 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Takeaways\n", "\n", " \n", "\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ " * Python is awesome\n", " * Grupy is awesome\n", " * VivaReal is awesome\n", " * Use good judgment, kids!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# http://goo.gl/Ke1P0p\n", "\n", "## Questions?\n", "\n", "\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.9" } }, "nbformat": 4, "nbformat_minor": 0 }