{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# WTF is this? \n", "\n", "or, When is _is_ what you think _is_ is?\n", "\n", "Also, rabbit hole alert..." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%HTML\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# The Problem" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%HTML\n", "

Pay no mind.... pic.twitter.com/mnIPHJXE1h

— David Beazley (@dabeaz) July 27, 2017
\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# let's reproduce it\n", "class A():\n", "    pass\n", "A.__dict__ is A.__dict__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ... and more robustly...\n", "a = A()\n", "a.__class__.__dict__ is a.__class__.__dict__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Our path...\n", "\n", "The code in question involves class objects and instances, the `is` operator, and attribute access via the dot notation. Let's explore how those objects and operations work.\n", "\n", "Out of scope:\n", "* we're not gonna talk about properties (by name)\n", "* we're not gonna talk about descriptors (by name)\n", "* we're not gonna talk about slots\n", "\n", "...but you _will_ run into these concepts if you investigate beyond this tutorial." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1) Python class construction" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "class B():\n", "    pass" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "C = type('C',(),dict())" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "D = type('C',(),dict())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "D" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Takeaways:\n", "\n", "* two forms of class definition\n", "* variables point to objects\n", "\n", "Reminder:\n", "\n", "* all objects in Python 3 are instances of `object`, including objects that are class definitions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2) Python class comparison\n", "\n", "How do these class definitions compare?" 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Start with the equality operator (==)\n", "# --> remember that this is normally defined by the \".__eq__()\" method of the operand on the left" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B == B" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B == C" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "C == D" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B == D" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# check the dictionary of the object's attributes (more about this later)\n", "\n", "vars(B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vars(B) == vars(B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vars(B) == vars(C)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vars(C) == vars(D)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# let's cast it to a real 'dict'\n", "dict(vars(D))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dict(vars(B)) == dict(vars(C))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dict(vars(C)) == dict(vars(D))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# check the directory of attributes (more about this later)\n", "\n", "dir(B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dir(B) == dir(B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, 
"outputs": [], "source": [ "dir(B) == dir(C)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dir(C) == dir(D)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Takeaways:\n", "\n", "* Class definitions are objects with attributes\n", "* Class descriptions (`vars`, `dir`, etc.) are equivalent for self-comparison\n", "* For separately constructed objects, only the lists of attribute names (`dir`) are equivalent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3) Python class identity\n", "\n", "What are these objects?" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# instance and type\n", "\n", "isinstance(B,type)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "isinstance(B,object)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B.__class__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B.__base__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B.__bases__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "id(B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the 'is' operator tests object identity; for two live objects this matches comparing their 'id' values\n", "\n", "B is B" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "id(B) == id(B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# now use B's callability to create an instance of it\n", "b = B()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], 
"source": [ "isinstance(b,B)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type(b).__bases__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# FWIW\n", "type(type)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "type.__bases__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Takeaways:\n", "\n", "* class objects are instances of the type 'type'\n", "* class objects are classes that inherit from 'object'\n", "\n", "WTF?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 4) Object attributes\n", "\n", "In addition to various notions of identity, we also need to investigate attribute access. \n", "\n", "Apart from the problem we're investigating, Python places a lot of importance on interfaces, in which an object is described and classified in terms of its function and attributes, rather than its identity or inheritance properties. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# set some attributes of some objects\n", "setattr(b,'an_instance_attr',1)\n", "setattr(B,'a_class_attr',2)\n", "setattr(B,'a_class_method',lambda x: 3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vars(b)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.__dict__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vars(B)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Conclusion: `__dict__` / `vars()` returns an _instance's_ attributes.\n", "\n", "Let's iterate through `b`'s inheritance tree, and look at the instance attributes." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vars(type)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vars(object)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# collect all the instance attributes of the inheritance tree (don't include type)\n", "\n", "attribute_keys = set( list(vars(b).keys()) + list(vars(B).keys()) + list(vars(object).keys()))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "for attribute_key in attribute_keys:\n", " print('{} : {}'.format(attribute_key,getattr(b,attribute_key)))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# our manual attributes collection should match that from 'dir'\n", "attribute_keys - set(dir(b))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "NOTE: `dir` is not always reliable.\n", "\n", "Take-aways:\n", "\n", "* The `__dict__` attribute lists the instance attributes of an object\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 5) Instance and class attributes" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.an_instance_attr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B.an_instance_attr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B.a_class_attr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.a_class_attr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.a_class_method" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b.a_class_method()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take-aways:\n", "\n", "* instance 
attributes do not affect the associated class attribute set\n", "* class attributes are available for lookup by an instance\n", "\n", "Out of scope:\n", "\n", "* how do instance attributes get added at construction?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 6) Attribute access\n", "\n", "The dot notation searches through the attributes of the instance, then the class, then through parent classes, to find an attribute of the requested name. \n", "\n", "The _method resolution order_ defines how complex inheritance structures are traversed." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "B.mro()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Python's MRO uses the C3 linearization algorithm, which gives a consistent order\n", "# even for multiple (e.g. diamond-shaped) inheritance\n", "# https://en.wikipedia.org/wiki/C3_linearization\n", "\n", "class X():\n", "    a = 1\n", "class Y():\n", "    b = 2\n", "class Z(X,Y):\n", "    c = 3\n", "Z.mro()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Z.c" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Z.b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Z.a" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# get an attribute defined only by the base class\n", "Z.__repr__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To locate the attribute named `my_attr`, Python:\n", "\n", "* searches the `__dict__` attribute of the instance for key `my_attr`\n", "* searches the `__dict__` attributes of the classes in the MRO, in order\n", "* if nothing is found, looks for a `__getattr__` method on the class (again via the MRO) and, if one exists, calls it with `'my_attr'`\n", "* ...other things...\n", "\n", "Take-aways:\n", "\n", "* the method resolution order manages the order and sources for object 
attribute lookup\n", "* attribute lookup is potentially complicated" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 7) An optimization\n", "\n", "Because attribute lookup is common and potentially complicated, \n", "the Python authors decided to enforce some simplifications to the process. \n", "Most important for our problem here: **class-level attributes and methods must be referenced with strings**." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# let's start with the instance-level attribute dictionary\n", "\n", "b.__dict__['an_attr'] = 'value'\n", "b.__dict__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# I don't know why anyone would want to do this, but we'll allow it at the level of instance objects. \n", "# Any hashable object can be a key in an ordinary dictionary.\n", "\n", "b.__dict__[1] = [3,4]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# what happens if we do the same to `b`'s class?\n", "\n", "b.__class__.__dict__[1] = [3,4]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# right, we've seen this \"mappingproxy\" before\n", "b.__class__.__dict__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# also equivalent\n", "B.__dict__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The [MappingProxyType](https://docs.python.org/3/library/types.html#types.MappingProxyType) type is a read-only view of a mapping (dictionary), so we can't set class attributes by assigning into the class's `__dict__`. Class attributes must instead be set with `setattr`, which calls `__setattr__`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# turns out, it's a method of 'object'\n", "B.__setattr__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "setattr(B,1,2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take-aways:\n", "\n", "* class attributes are required to be referenced by strings (enforced when they are set through `setattr` / `__setattr__`), which speeds up attribute lookup.\n", "* the class-level attribute mapping is exposed through a read-only `mappingproxy` object" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 8) Tying it together\n", "\n", "Now we know _why_ a class's `__dict__` attribute returns a read-only `mappingproxy` object. Let's return to the tweet and address the question of the `mappingproxy` object's identity." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%HTML\n", "

Pay no mind.... pic.twitter.com/mnIPHJXE1h

— David Beazley (@dabeaz) July 27, 2017
\n", "" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# the example\n", "A.__dict__ is A.__dict__" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# run this a few times\n", "id(A.__dict__)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Takeaway: a new `mappingproxy` object is created every time `A.__dict__` is accessed, and since two live objects can't share the same memory address at the same time, `A.__dict__ is A.__dict__` will never be true. The reason that a new `mappingproxy` is created for each access of `__dict__` is, unfortunately, out of scope.\n", "\n", "Bonus questions below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# what about this?\n", "id(A.__dict__) == id(A.__dict__)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# or this?\n", "x = id(A.__dict__)\n", "y = id(A.__dict__)\n", "x == y" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# or this?\n", "x = A.__dict__\n", "y = A.__dict__\n", "id(x) == id(y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remember: the return value of the `id` builtin function \"is an integer which is guaranteed to be unique and constant for this object during its lifetime.\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.1" } }, "nbformat": 4, "nbformat_minor": 2 }