{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Sobrecarga de Operdadores em Python\n",
"\n",
"By Paulo Scardine - http://goo.gl/Ke1P0p\n",
"\n",
"## O problema\n",
"\n",
"No site StackOverflow um usuário de R perguntou como implementar o pipe-operator do pacote [dplyr](https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html) (`%>%`), onde `x %>% f(y)` é equivalente a `f(x, y)`. Adicionalmente, ele gostaria de usar uma sintaxe parecida com o pacote [Pipe](http://pypi.python.org/pypi/pipe/) do cheese shop:\n",
"\n",
" df = df | select('one') | rename(one='new_one')\n",
" \n",
"No pacote Pipe esta sintaxe é chamada de \"infix notation\", e é equivalente a:\n",
"\n",
" df = rename(select(df, 'one'), one='new_one')"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"df = pd.DataFrame({'one' : [1., 2., 3., 4., 4.],\n",
" 'two' : [4., 3., 2., 1., 3.]})\n",
"\n",
"def select(df, *args):\n",
" return df[list(args)]\n",
"\n",
"\n",
"def rename(df, **kwargs):\n",
" for name, value in kwargs.items():\n",
" df = df.rename(columns={'%s' % name: '%s' % value})\n",
" return df"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"
\n",
" \n",
" \n",
" | \n",
" one | \n",
" two | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" 4 | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" 3 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
" 2 | \n",
"
\n",
" \n",
" 3 | \n",
" 4 | \n",
" 1 | \n",
"
\n",
" \n",
" 4 | \n",
" 4 | \n",
" 3 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" one two\n",
"0 1 4\n",
"1 2 3\n",
"2 3 2\n",
"3 4 1\n",
"4 4 3"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" one | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
"
\n",
" \n",
" 3 | \n",
" 4 | \n",
"
\n",
" \n",
" 4 | \n",
" 4 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" one\n",
"0 1\n",
"1 2\n",
"2 3\n",
"3 4\n",
"4 4"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"select(df, 'one')"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" other | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
"
\n",
" \n",
" 3 | \n",
" 4 | \n",
"
\n",
" \n",
" 4 | \n",
" 4 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" other\n",
"0 1\n",
"1 2\n",
"2 3\n",
"3 4\n",
"4 4"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"rename(select(df, 'one'), one='other')"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Como sobrecarregar operadores em Python\n",
"\n",
"Para cada operador em Python existe um ou mais métodos mágicos `__dunder__`, um para a operação normal e um para a operação \"à direita\". Por exemplo, para implementar o operador `+`, você precisa sobrecarregar o método `__add__`.\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"class Idem(object):\n",
" def __add__(self, other):\n",
" return other * 2\n",
"\n",
" \n",
"idem = Idem()"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"10"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"idem + 5"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"ename": "TypeError",
"evalue": "unsupported operand type(s) for +: 'int' and 'Idem'",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[1;36m5\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0midem\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for +: 'int' and 'Idem'"
]
}
],
"source": [
"5 + idem"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"class Idem(object):\n",
" def __add__(self, other):\n",
" return other * 2\n",
" \n",
" def __radd__(self, other):\n",
" return self.__add__(other)\n",
" \n",
"idem = Idem()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"10"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"5 + idem"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Um exemplo polêmico\n",
"\n",
"Somar `datetime.date` com `datetime.time`:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": true,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"import datetime\n",
"\n",
"data = datetime.date.today()\n",
"hora = datetime.time(19)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"SmartDate(2015, 11, 23)"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"datetime.time(19, 0)"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hora"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"datetime.datetime(2015, 11, 23, 20, 24, 15, 858188)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"datetime.datetime.now()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"ename": "TypeError",
"evalue": "unsupported operand type(s) for +: 'datetime.date' and 'datetime.time'",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mdata\u001b[0m \u001b[1;33m+\u001b[0m \u001b[0mhora\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[1;31mTypeError\u001b[0m: unsupported operand type(s) for +: 'datetime.date' and 'datetime.time'"
]
}
],
"source": [
"data + hora"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"class SmartDate(datetime.date):\n",
" def __add__(self, other):\n",
" if isinstance(other, datetime.time):\n",
" return datetime.datetime.combine(self, other)\n",
" return super(SmartDate, self).__add__(other)\n"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": true,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"data = SmartDate(*data.timetuple()[:3])"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"datetime.datetime(2015, 11, 23, 19, 0)"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data + hora"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Princípio da Menor Surpresa\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Finalmente"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"def pipe(original):\n",
" class PipeInto(object):\n",
" data = {'function': original}\n",
"\n",
" def __init__(self, *args, **kwargs):\n",
" self.data['args'] = args\n",
" self.data['kwargs'] = kwargs\n",
"\n",
" def __rrshift__(self, other):\n",
" return self.data['function'](\n",
" other, \n",
" *self.data['args'], \n",
" **self.data['kwargs']\n",
" )\n",
"\n",
" return PipeInto\n",
"\n",
"\n",
"@pipe\n",
"def select(df, *args):\n",
" return df[list(cols)]\n",
"\n",
"\n",
"@pipe\n",
"def rename(df, **kwargs):\n",
" for name, value in kwargs.items():\n",
" df = df.rename(columns={'%s' % name: '%s' % value})\n",
" return df"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" two | \n",
" one | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 4 | \n",
" 1 | \n",
"
\n",
" \n",
" 1 | \n",
" 3 | \n",
" 2 | \n",
"
\n",
" \n",
" 2 | \n",
" 2 | \n",
" 3 | \n",
"
\n",
" \n",
" 3 | \n",
" 1 | \n",
" 4 | \n",
"
\n",
" \n",
" 4 | \n",
" 3 | \n",
" 4 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" two one\n",
"0 4 1\n",
"1 3 2\n",
"2 2 3\n",
"3 1 4\n",
"4 3 4"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df >> select('two', 'one')"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" one | \n",
" two | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
" 4 | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
" 3 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
" 2 | \n",
"
\n",
" \n",
" 3 | \n",
" 4 | \n",
" 1 | \n",
"
\n",
" \n",
" 4 | \n",
" 4 | \n",
" 3 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" one two\n",
"0 1 4\n",
"1 2 3\n",
"2 3 2\n",
"3 4 1\n",
"4 4 3"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"
\n",
" \n",
" \n",
" | \n",
" first | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 1 | \n",
"
\n",
" \n",
" 1 | \n",
" 2 | \n",
"
\n",
" \n",
" 2 | \n",
" 3 | \n",
"
\n",
" \n",
" 3 | \n",
" 4 | \n",
"
\n",
" \n",
" 4 | \n",
" 4 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" first\n",
"0 1\n",
"1 2\n",
"2 3\n",
"3 4\n",
"4 4"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df >> select('one') >> rename(one='first') "
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"32"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"16 << 1"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Takeaways\n",
"\n",
" \n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"source": [
" * Python is awesome\n",
" * Grupy is awesome\n",
" * VivaReal is awesome\n",
" * Tenham juízo, crianças!"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# http://goo.gl/Ke1P0p\n",
"\n",
"## Perguntas ???\n",
"\n",
"\n",
""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 0
}