{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Types and Dispatch in Julia" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "One of the most important goals of high-level languages is to provide *polymorphism*: the ability for the same code to operate on different kinds of values.\n", "\n", "Julia uses a vocabulary of *types* for this purpose. Types play the following roles:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- Describe \"what kind of thing is this\"\n", "- Describe the representation of a value\n", "- Drive *dispatch*: selecting one of several pieces of code\n", "- Drive *specialization*: optimizing code by assuming values have certain types" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Describing values" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "typeof(3)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sizeof(Int64)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Int64.size" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "isbits(Int64)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Int64.mutable" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "supertype(Int64)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "supertype(Signed)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "supertype(Integer)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "supertype(Real)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "supertype(Number)" ] }, { "cell_type": "code", "execution_count": null, "metadata": 
{}, "outputs": [], "source": [ "# The subtype operator/relation\n", "Integer <: Real" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "String <: Real" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Any >: String" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The `isa` operator/relation\n", "1 isa Int" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1 isa String" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Julia has roughly 5 kinds of types. We just saw two:\n", "\n", "1. Data types - describe concrete data objects\n", "2. Abstract types - group data types together\n", "\n", "There are three more:\n", "\n", "3. Union types\n", "4. UnionAll types\n", "5. The empty type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Union types" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Expresses a *set union* of types." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1 isa Union{Int,String}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "\"hi\" isa Union{Int,String}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## UnionAll types" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Expresses an *iterated set union* of types." 
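, "\n",
"\n",
"A minimal sketch of the iterated-union reading, assuming a hypothetical alias `RealVec` that is not part of the original notebook: every instantiation of `T` allowed by the bound is a member of the UnionAll type.\n",
"\n",
"```julia\n",
"# `RealVec` is a hypothetical name introduced only for this illustration.\n",
"RealVec = (Vector{T} where T<:Real)\n",
"\n",
"Vector{Int} <: RealVec       # true\n",
"Vector{Float64} <: RealVec   # true\n",
"Vector{String} <: RealVec    # false: String is not a subtype of Real\n",
"```"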
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[1] isa Vector{Int}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[1] isa (Vector{T} where T<:Real)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$\\bigcup\\limits_{T<:Real} \\tt{Vector}\\{T\\}$" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Union{Vector{Any},Vector{Real}} <: Vector{T} where T>:Real" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "T where T<:Real" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "rand(1:10,2,2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dump(Array)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Vector" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Vector{Int} <: Vector" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Vector <: Array" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Vector{Int} <: Vector{Any}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "typeintersect((Array{T} where T<:Real), (Array{T,2} where T>:Int))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[[2]] isa (Vector{T} where T<:Vector{S} where S<:Integer)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## The empty type" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Corresponds to the empty set." 
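, "\n",
"\n",
"A short sketch of the set analogy, using only standard functions: the empty type behaves like the empty set under union and intersection.\n",
"\n",
"```julia\n",
"Union{Int, Union{}} === Int              # true: adding the empty set changes nothing\n",
"typeintersect(Int, String) === Union{}   # true: disjoint types share no values\n",
"```"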
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "1 isa Union{}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Union{} <: Int" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Union{} <: String" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "Union{} <: Array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This represents situations where there can't be any value; e.g. an exception is thrown or the program doesn't terminate." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dispatch" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(a, b::Any) = \"fallback\"\n", "f(a::Number, b::Number) = \"a and b are both numbers\"\n", "f(a::Number, b) = \"a is a number\"\n", "f(a, b::Number) = \"b is a number\"\n", "f(a::Integer, b::Integer) = \"a and b are both integers\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "methods(f)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(1.5, 2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(1, \"string\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(1, 2)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "f(1, 2, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tuples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A tuple is an immutable container of any combination of values.\n", "\n", "Often used to represent e.g. ordered pairs, or for returning \"multiple\" values from functions." 
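, "\n",
"\n",
"For instance, a function can return a tuple that the caller destructures immediately. This is only an illustrative sketch: `extent_of` is a hypothetical helper, while `divrem` is a standard Base function.\n",
"\n",
"```julia\n",
"q, r = divrem(7, 2)   # divrem returns the tuple (3, 1)\n",
"\n",
"# Hypothetical helper returning two values as one tuple.\n",
"extent_of(xs) = (minimum(xs), maximum(xs))\n",
"lo, hi = extent_of([3, 1, 4])   # lo == 1, hi == 4\n",
"```"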
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t = (1, \"hi\", 0.33, pi)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "t[2]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# \"destructuring\"\n", "a, b, c = t" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "a" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "b" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "typeof(t)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Tuple types represent the arguments to a function." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "first(methods(f)).sig" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For every function call, the method that gets called is the most specific one such that the argument tuple type is a subtype of the signature." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## \"Diagonal\" dispatch" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d(x::T, y::T) where {T} = \"same type\"\n", "d(x, y) = \"different types\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d(1, 1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "d(1, 2.0)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "[ m.sig for m in methods(d) ]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variadic (or varargs) methods" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v(x...) 
= (x, \"zero or more\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v(x, xs...) = (xs, \"one or more\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v(1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "v(1, 2, 3, 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Variadic tuple types" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "foo(a::Array, Is::Int...) = 0" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "first(methods(foo)).sig" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "vt = Tuple{Array, Vararg{Int}}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "isa(([1],1,2,3), vt)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "isa(([1],1,0.02,3), vt)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Specialization in action" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Internally, the compiler generates specializations for particular types.\n", "\n", "Example: For a 3-argument function `f`, the compiler might decide to generate a specialization for `Tuple{Int, Any, Int}`, if for some reason the second argument isn't important." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "addall(t) = +(t...) 
# \"splat\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@code_typed addall((1,2))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@code_typed addall((1,2,3))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "function alltrue(f, itr)\n", " @inbounds for x in itr\n", " f(x) || return false\n", " end\n", " return true\n", "end" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@which isinteger(1)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@code_typed alltrue(isinteger, [1,2,3])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@code_llvm alltrue(isinteger, [1,2,3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Dispatch, specialization, and performance" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Dynamic dispatch is traditionally considered \"slow\".\n", "\n", "Instead of a `call` instruction, you need to do a table lookup procedure first.\n", "\n", "However:\n", "1. If types are known, the call target can be looked up at compile time.\n", "2. The cost of dynamic dispatch is well worth it *if* you're dispatching to an optimized kernel." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What to specialize on?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can't specialize on *everything* because it would take too long and generate too much code.\n", "\n", "There's no fully general and automatic approach.\n", "\n", "We specialize on types. That's a reasonable default. If the default's not good enough, move more information into types!\n", "\n", "A classic: specializing on the value of an integer." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "function sum1n(::Val{N}) where {N} # given `struct Val{N} end`\n", " s = 0\n", " for i = 1:N\n", " s += i\n", " end\n", " return s\n", "end" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sum1n(Val{10}())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@code_llvm sum1n(Val{10}())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sum1n(n::Integer) = sum1n(Val{n}())" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sum1n(20)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sum1n(rand(1:100)) # dynamic dispatch to specialized code" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# \"Stupid Dispatch Tricks\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Trick 1: processing arguments recursively" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The compiler's optimizations can be exploited to move parts of your own computations to compile time (thus saving time at run time). The general idea is to represent more information within types, instead of using values.\n", "\n", "Example: drop the first element of a tuple." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tuple_tail1(t) = t[2:end]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tuple_tail1((1,2,\"hi\"))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@code_typed tuple_tail1((1,2,\"hi\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not good. 
Key information is represented as integers, and when the compiler sees an integer it generally assumes it doesn't know its value.\n", "\n", "- The compiler counts 1, infinity\n", "- The compiler can match things but cannot do arithmetic or comparisons\n", "- It's very good at knowing the types of function arguments" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "argtail(a, rest...) = rest\n", "tupletail(t) = argtail(t...)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tupletail((1,2,\"hi\"))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "@code_typed tupletail((1,2,\"hi\"))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Exercise\n", "\n", "Write a type-inferable function to...\n", "\n", "1. reverse a tuple\n", "2. take every other element of a tuple\n", "3. interleave the elements of two tuples" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Real-ish example: computing the shape of an indexing operation" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "index_shape(a::Array, idxs) = ish(a, 1, idxs...)\n", "\n", "ish(a, i, ::Real...) = ()\n", "ish(a, i, ::Colon, rest...) = (size(a,i), ish(a,i+1,rest...)...)\n", "ish(a, i, iv::Vector, rest...) = (length(iv), ish(a,i+1,rest...)...)\n", "ish(a, i, ::Real, rest...) 
= ish(a,i+1,rest...)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "index_shape(rand(3,4,5), (1,:,[1,2]))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "index_shape(rand(3,4,5), (:,2,[1,2,1,2,1,2,1,2]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Trick 2: look up \"trait\" values and re-dispatch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Functions of types" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "widen(::Type{Float32}) = Float64" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "widen(Float32)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# We use this for type promotion\n", "promote_type(Int64, Float64)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This can be used to compute attributes of types, then dispatch on those values." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Sample trait\n", "abstract type IteratorSize end\n", "struct SizeUnknown <: IteratorSize end\n", "struct HasLength <: IteratorSize end\n", "struct HasShape <: IteratorSize end\n", "struct IsInfinite <: IteratorSize end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we can define a method that says which value of the trait a certain type has.\n", "\n", "This is like using dispatch as a lookup table to find out properties of a combination of values." 
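, "\n",
"\n",
"A minimal, self-contained sketch of the lookup-then-re-dispatch pattern. All names here (`Speed`, `Fast`, `Slow`, `speed`, `describe`) are hypothetical and unrelated to Base; they only illustrate the shape of the technique.\n",
"\n",
"```julia\n",
"abstract type Speed end\n",
"struct Fast <: Speed end\n",
"struct Slow <: Speed end\n",
"\n",
"# The lookup table: one method per family of types.\n",
"speed(::Type) = Slow()                               # fallback\n",
"speed(::Type{T}) where {T<:AbstractArray} = Fast()\n",
"\n",
"# Look up the trait value, then re-dispatch on it.\n",
"describe(x) = describe(speed(typeof(x)), x)\n",
"describe(::Fast, x) = \"fast path\"\n",
"describe(::Slow, x) = \"slow path\"\n",
"```\n",
"\n",
"With these definitions, `describe([1,2])` should select the `Fast` method and `describe(1)` the `Slow` one."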
] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# `Zip2` is Base's internal two-iterator zip type (not defined in this notebook)\n", "iteratorsize(::Type{T}) where {T<:AbstractArray} = HasShape()\n", "\n", "iteratorsize(::Type{Zip2{I1,I2}}) where {I1,I2} = zip_iteratorsize(iteratorsize(I1),iteratorsize(I2))\n", "\n", "zip_iteratorsize(a, b) = SizeUnknown()\n", "zip_iteratorsize(isz::T, ::T) where {T} = isz\n", "zip_iteratorsize(::HasLength, ::HasShape) = HasLength()\n", "zip_iteratorsize(::HasShape, ::HasLength) = HasLength()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# `collect` gives you all the elements from an iterator as an array;\n", "# `collec` is a simplified re-implementation that dispatches on the trait value\n", "collec(itr) = _collec(itr, eltype(itr), Base.iteratorsize(itr))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "function _collec(itr, T, ::Base.HasLength)\n", " a = Array{T,1}(length(itr))\n", " i = 0\n", " for x in itr\n", " a[i+=1] = x\n", " end\n", " return a\n", "end" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "function _collec(itr, T, ::Base.SizeUnknown)\n", " a = Array{T,1}(0)\n", " for x in itr\n", " push!(a, x)\n", " end\n", " return a\n", "end" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Julia 0.6.2-pre", "language": "julia", "name": "julia-0.6" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.6.0" } }, "nbformat": 4, "nbformat_minor": 1 }