{ "metadata": { "language": "Julia", "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Julia and Statistical Computing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Why Use Julia?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "![](files/julia.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Additional Reasons to Use Julia\n", "\n", "* Clean language design:\n", " * Conservative, highly selective choice of features\n", " * Appropriate for both prototype and production code\n", " * Familiar C-like syntax\n", " * Strong metaprogramming facilities derived from Lisp\n", "* Smooth interoperability with C/Fortran\n", "* Growing, dedicated community" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Why You Shouldn't Use Julia" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Julia is very young\n", "* The Julia package ecosystem is even younger\n", "* Breaking changes are still coming in core and will be quite frequent outside of core Julia\n", "* Language features are still being added: your favorite may not exist yet\n", "* Code quality for packages varies from reasonably well tested to never tested to broken" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "So... Should You Use Julia?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That depends on your use case:\n", "\n", "* If you tend to build lots of tools from scratch, Julia is usable, but a little rough\n", "* If you tend to build upon lots of other packages, Julia isn't ready for you yet" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The Basics of Julia" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Julia has all of the basic types and functions you need to do math" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Integers\n", "1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "1" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "# Floating point numbers\n", "1.0" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "1.0" ] } ], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "# Strings\n", "\"a\"" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "\"a\"" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "# Complex numbers\n", "1 + 2im" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "1 + 2im" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "# Vectors\n", "[1, 2, 3]" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "3-element Array{Int64,1}:\n", 
" 1\n", " 2\n", " 3" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "# Matrices\n", "[1.0 2.0; 3.1 4.2]" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "2x2 Array{Float64,2}:\n", " 1.0 2.0\n", " 3.1 4.2" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "# Elementary functions\n", "sin(1.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "0.8414709848078965" ] } ], "prompt_number": 7 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Julia's Type System" ] }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(Any)[1:4]" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "4-element Array{Any,1}:\n", " AbstractArray{T,N}\n", " AbstractCmd \n", " AbstractRNG \n", " Algorithm " ] } ], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(Number)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "5-element Array{Any,1}:\n", " Complex{Float16}\n", " Complex{Float32}\n", " Complex{Float64}\n", " Complex{T<:Real}\n", " Real " ] } ], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(Real)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "4-element Array{Any,1}:\n", " FloatingPoint \n", " Integer \n", " MathConst{sym} \n", " Rational{T<:Integer}" ] } ], "prompt_number": 10 }, { "cell_type": 
"code", "collapsed": false, "input": [ "subtypes(Integer)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "5-element Array{Any,1}:\n", " BigInt \n", " Bool \n", " Char \n", " Signed \n", " Unsigned" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(None)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method subtypes(Type{None})\nwhile loading In[12], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method subtypes(Type{None})\nwhile loading In[12], in expression starting on line 1" ] } ], "prompt_number": 12 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Why Are Types Important?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "* **Abstraction**: Assert that a computation works for many related types of objects\n", "* **Parametric Polymorphism**: Function behavior can be defined over many types at once\n", "* **Ad Hoc Polymorphism**: Function behavior can vary systematically across types\n", "* **Type Safety**: How should a function work if the wrong type of argument is used?" 
] }, { "cell_type": "heading", "level": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Abstraction / Parametric Polymorphism" ] }, { "cell_type": "code", "collapsed": false, "input": [ "sqrt(2)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "1.4142135623730951" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "sqrt(2.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "1.4142135623730951" ] } ], "prompt_number": 2 }, { "cell_type": "heading", "level": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Ad Hoc Polymorphism" ] }, { "cell_type": "code", "collapsed": false, "input": [ "1 + 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "2" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "1.0 + 1.0" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "2.0" ] } ], "prompt_number": 4 }, { "cell_type": "heading", "level": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Type Safety" ] }, { "cell_type": "code", "collapsed": false, "input": [ "\"wat\" + 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method +(ASCIIString, Int64)\nwhile loading In[5], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method +(ASCIIString, Int64)\nwhile loading In[5], in expression starting on line 1" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": 
[ "1 + \"wat\"" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method +(Int64, ASCIIString)\nwhile loading In[6], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method +(Int64, ASCIIString)\nwhile loading In[6], in expression starting on line 1" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "\"wat\" - 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method -(ASCIIString, Int64)\nwhile loading In[7], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method -(ASCIIString, Int64)\nwhile loading In[7], in expression starting on line 1" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "1 - \"wat\"" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method -(Int64, ASCIIString)\nwhile loading In[8], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method -(Int64, ASCIIString)\nwhile loading In[8], in expression starting on line 1" ] } ], "prompt_number": 8 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Multiple Dispatch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The meaning of a function can depend upon the types of its inputs" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function foo(a, b)\n", " return a + b\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "foo (generic function with 1 method)" ] } ], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "foo(1, 2)" ], "language": "python", "metadata": { "slideshow": { 
"slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "3" ] } ], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "foo(1.0, 2)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "3.0" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(a, b) = a * b" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "bar (generic function with 1 method)" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(2, 3)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "6" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(1, \"wat\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method *(Int64, ASCIIString)\nwhile loading In[14], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method *(Int64, ASCIIString)\nwhile loading In[14], in expression starting on line 1", " in bar at In[12]:1" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(\"NaN\", \"NaN\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "\"NaNNaN\"" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(a::Integer, b::Integer) = 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 
16, "text": [ "folly (generic function with 1 method)" ] } ], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(a::Float64, b::Float64) = 2" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "folly (generic function with 2 methods)" ] } ], "prompt_number": 17 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(1, 2)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ "1" ] } ], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(1.0, 2.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "2" ] } ], "prompt_number": 19 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "User-Defined Types" ] }, { "cell_type": "code", "collapsed": false, "input": [ "type NewType\n", " a::Int\n", " b::UTF8String\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "x = NewType(1, \"this is a string\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 21, "text": [ "NewType(1,\"this is a string\")" ] } ], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "x.a, x.b" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 22, "text": [ "(1,\"this is a string\")" ] } ], "prompt_number": 22 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { 
"slide_type": "slide" } }, "source": [ "Modules / Namespaces" ] }, { "cell_type": "code", "collapsed": false, "input": [ "module NewNamespace\n", " c = 42\n", " export c\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "c" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "c not defined\nwhile loading In[24], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "c not defined\nwhile loading In[24], in expression starting on line 1" ] } ], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "using NewNamespace" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "c" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 26, "text": [ "42" ] } ], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "c = 41" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 27, "text": [ "41" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "Warning: imported binding for c overwritten in module Main\n" ] } ], "prompt_number": 27 }, { "cell_type": "code", "collapsed": false, "input": [ "NewNamespace.c" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ "42" ] } ], "prompt_number": 28 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "What's the Catch?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "If Julia looks like most dynamic languages, why is it so fast?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Designing for Efficiency\n", "\n", "* Julia's strict language definition makes it possible for the Julia compiler to do a lot of work:\n", " * Infer very specific types for variables\n", " * Perform static analysis of code\n", " * Provide custom binary implementations of high-level constructs based on specific types\n", " * Represent data in a maximally efficient way\n", "* Lots of valid code in Python and Ruby is illegal in Julia\n", "* Every Julia type has an obvious deconstruction into machine types" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Static Analysis" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "Julia generates a custom function implementation for every input type signature" ] }, { "cell_type": "code", "collapsed": false, "input": [ "foo(a, b) = a + b" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 29, "text": [ "foo (generic function with 1 method)" ] } ], "prompt_number": 29 }, { "cell_type": "code", "collapsed": false, "input": [ "code_llvm(foo, (Int, Int))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "define i64 @julia_foo1292(i64, i64) {\n", "top:\n", " %2 = add i64 %1, %0, !dbg !3947\n", " ret i64 %2, !dbg !3947\n", "}\n" ] } ], "prompt_number": 30 }, { "cell_type": "code", "collapsed": false, "input": [ "code_llvm(foo, (Float64, Float64))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": 
"stream", "stream": "stdout", "text": [ "\n", "define double @julia_foo1305(double, double) {\n", "top:\n", " %2 = fadd double %0, %1, !dbg !3986\n", " ret double %2, !dbg !3986\n", "}\n" ] } ], "prompt_number": 31 }, { "cell_type": "code", "collapsed": false, "input": [ "code_native(foo, (Int, Int))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\t.section\t__TEXT,__text,regular,pure_instructions\n", "Filename: In[29]\n", "Source line: 1\n", "\tpush\tRBP\n", "\tmov\tRBP, RSP\n", "Source line: 1\n", "\tadd\tRDI, RSI\n", "\tmov\tRAX, RDI\n", "\tpop\tRBP\n", "\tret\n" ] } ], "prompt_number": 32 }, { "cell_type": "code", "collapsed": false, "input": [ "code_native(foo, (Float64, Float64))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\t.section\t__TEXT,__text,regular,pure_instructions\n", "Filename: In[29]\n", "Source line: 1\n", "\tpush\tRBP\n", "\tmov\tRBP, RSP\n", "Source line: 1\n", "\tvaddsd\tXMM0, XMM0, XMM1\n", "\tpop\tRBP\n", "\tret\n" ] } ], "prompt_number": 33 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "For simple functions, Julia's specialized implementation is similar to a natural C implementation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "As Julia gets smarter, its static analyses tools improve all code you write:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function foo()\n", "\tr = 0\n", "\tfor i in 1:12\n", "\t\tr += 2\n", "\tend\n", "\treturn r\n", "end\n", "\n", "code_llvm(foo, ())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "define i64 @julia_foo1307() {\n", "pass2:\n", " ret i64 24, !dbg !3993\n", "}\n" ] } ], "prompt_number": 34 }, { 
"cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "What You Lose" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "To make static analysis effective, Julia rules out certain types of code:\n", "\n", "* Reification of scope\n", "* Evaluating code in arbitrary scopes\n", "* Unexpected changes of type\n", "* Mature community and package system" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "R's reification of scope is ruled out by Julia's design:\n", "\n", "```\n", "f1 <- function(scope.number) {\n", "\tls(env = sys.frame(scope.number))\n", "}\n", "\n", "f2 <- function() {\n", " a <- 1\n", "\tmy.scope <- sys.nframe()\n", "\tf1(my.scope)\n", "}\n", "\n", "f2()\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "R's ability to mutate ad hoc scopes is ruled out by Julia's design:\n", "\n", "```\n", "g1 <- function(var.name, scope.number)\n", "{\n", "\tassign(\"a\", -1, envir = sys.frame(scope.number))\n", "}\n", " \n", "g2 <- function()\n", "{\n", "\ta <- 1\n", "\tscope.number <- sys.nframe()\n", "\tprint(a)\n", "\tg1(\"a\", scope.number)\n", "\tprint(a)\n", "}\n", "\n", "g2()\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "R's type mutation by assignment is ruled out by Julia's design:\n", "\n", "```\n", "v <- c(1, 2, 3)\n", "v[1] <- \"a\"\n", "print(v)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "To make defaults efficient, Julia assumes you want to work close to the metal:\n", "\n", "* Machine precision integers, not infinite precision integers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function fib{T <: Integer}(n::T)\n", " if n == zero(T)\n", " return zero(T)\n", " elseif n == one(T)\n", " return one(T)\n", 
" else\n", " a, b = zero(T), one(T)\n", " i = 1\n", " while i < n\n", " a, b = b, a + b\n", " i += 1\n", " end\n", " return b\n", " end\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 35, "text": [ "fib (generic function with 1 method)" ] } ], "prompt_number": 35 }, { "cell_type": "code", "collapsed": false, "input": [ "fib(95)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 36, "text": [ "-4953053512429003327" ] } ], "prompt_number": 36 }, { "cell_type": "code", "collapsed": false, "input": [ "fib(BigInt(95))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 37, "text": [ "31940434634990099905" ] } ], "prompt_number": 37 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Statistics in Julia" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* StatsBase.jl (Mature, but incomplete)\n", "* DataArrays.jl (Usable, but immature)\n", "* DataFrames.jl (Usable, but immature)\n", "* Distributions.jl (Mature, complete)\n", "* Gadfly.jl (Mature, but incomplete)\n", "* GLM.jl (Usable, but immature)\n", "* Optim.jl, NLopt.jl, DualNumbers.jl, Calculus.jl, JuMP.jl (Mature, fairly complete)" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "StatsBase" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using StatsBase" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 38 }, { "cell_type": "code", "collapsed": false, "input": [ "modes([1, 1, 2, 2, 3])" ], "language": "python", "metadata": { "slideshow": { "slide_type": 
"slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 39, "text": [ "2-element Array{Int64,1}:\n", " 2\n", " 1" ] } ], "prompt_number": 39 }, { "cell_type": "code", "collapsed": false, "input": [ "corspearman(rand(100), rand(100))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 40, "text": [ "-0.08795679567956798" ] } ], "prompt_number": 40 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "DataArrays" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using DataArrays" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 41 }, { "cell_type": "code", "collapsed": false, "input": [ "da = @data([1, 2, NA, 4])" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 42, "text": [ "4-element DataArray{Int64,1}:\n", " 1 \n", " 2 \n", " NA\n", " 4 " ] } ], "prompt_number": 42 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "DataFrames" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using DataFrames" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 43 }, { "cell_type": "code", "collapsed": false, "input": [ "df = DataFrame(\n", " A = [1, 2, 3],\n", " B = [\"a\", \"b\", \"c\"],\n", " C = [1//2, 3//4, 5//6]\n", ")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "html": [ "
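<table><tr><th></th><th>A</th><th>B</th><th>C</th></tr>
<tr><th>1</th><td>1</td><td>a</td><td>1//2</td></tr>
<tr><th>2</th><td>2</td><td>b</td><td>3//4</td></tr>
<tr><th>3</th><td>3</td><td>c</td><td>5//6</td></tr></table>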
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 44, "text": [ "3x3 DataFrame\n", "|-------|---|-----|------|\n", "| Row # | A | B | C |\n", "| 1 | 1 | \"a\" | 1//2 |\n", "| 2 | 2 | \"b\" | 3//4 |\n", "| 3 | 3 | \"c\" | 5//6 |" ] } ], "prompt_number": 44 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Distributions" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using Distributions" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 45 }, { "cell_type": "code", "collapsed": false, "input": [ "srand(1)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 46 }, { "cell_type": "code", "collapsed": false, "input": [ "x = rand(Normal(10, 1), 10)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 47, "text": [ "10-element Array{Float64,1}:\n", " 10.6701 \n", " 10.5509 \n", " 9.93663\n", " 11.3369 \n", " 9.92685\n", " 9.25454\n", " 8.77994\n", " 9.94682\n", " 9.83486\n", " 7.88463" ] } ], "prompt_number": 47 }, { "cell_type": "code", "collapsed": false, "input": [ "pdf(Normal(10, 1), 1.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 48, "text": [ "1.0279773571668915e-18" ] } ], "prompt_number": 48 }, { "cell_type": "code", "collapsed": false, "input": [ "loglikelihood(Normal(10, 1), x)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 49, "text": [ "-13.738557913500014" ] } ], "prompt_number": 49 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "RDatasets" ] }, { "cell_type": "code", "collapsed": 
false, "input": [ "using RDatasets" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 50 }, { "cell_type": "code", "collapsed": false, "input": [ "iris = dataset(\"datasets\", \"iris\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "html": [ "
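<table><tr><th></th><th>SepalLength</th><th>SepalWidth</th><th>PetalLength</th><th>PetalWidth</th><th>Species</th></tr>
<tr><th>1</th><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>
<tr><th>2</th><td>4.9</td><td>3.0</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>
<tr><th>3</th><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td></tr>
<tr><th>4</th><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>
<tr><th>5</th><td>5.0</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>
<tr><th>6</th><td>5.4</td><td>3.9</td><td>1.7</td><td>0.4</td><td>setosa</td></tr>
<tr><th>7</th><td>4.6</td><td>3.4</td><td>1.4</td><td>0.3</td><td>setosa</td></tr>
<tr><th>8</th><td>5.0</td><td>3.4</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>
<tr><th>9</th><td>4.4</td><td>2.9</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>
<tr><th>10</th><td>4.9</td><td>3.1</td><td>1.5</td><td>0.1</td><td>setosa</td></tr>
<tr><th>11</th><td>5.4</td><td>3.7</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>
<tr><th>12</th><td>4.8</td><td>3.4</td><td>1.6</td><td>0.2</td><td>setosa</td></tr>
<tr><th>13</th><td>4.8</td><td>3.0</td><td>1.4</td><td>0.1</td><td>setosa</td></tr>
<tr><th>14</th><td>4.3</td><td>3.0</td><td>1.1</td><td>0.1</td><td>setosa</td></tr>
<tr><th>15</th><td>5.8</td><td>4.0</td><td>1.2</td><td>0.2</td><td>setosa</td></tr>
<tr><th>16</th><td>5.7</td><td>4.4</td><td>1.5</td><td>0.4</td><td>setosa</td></tr>
<tr><th>17</th><td>5.4</td><td>3.9</td><td>1.3</td><td>0.4</td><td>setosa</td></tr>
<tr><th>18</th><td>5.1</td><td>3.5</td><td>1.4</td><td>0.3</td><td>setosa</td></tr>
<tr><th>19</th><td>5.7</td><td>3.8</td><td>1.7</td><td>0.3</td><td>setosa</td></tr>
<tr><th>20</th><td>5.1</td><td>3.8</td><td>1.5</td><td>0.3</td><td>setosa</td></tr>
<tr><th>21</th><td>5.4</td><td>3.4</td><td>1.7</td><td>0.2</td><td>setosa</td></tr>
<tr><th>22</th><td>5.1</td><td>3.7</td><td>1.5</td><td>0.4</td><td>setosa</td></tr>
<tr><th>23</th><td>4.6</td><td>3.6</td><td>1.0</td><td>0.2</td><td>setosa</td></tr>
<tr><th>24</th><td>5.1</td><td>3.3</td><td>1.7</td><td>0.5</td><td>setosa</td></tr>
<tr><th>25</th><td>4.8</td><td>3.4</td><td>1.9</td><td>0.2</td><td>setosa</td></tr>
<tr><th>26</th><td>5.0</td><td>3.0</td><td>1.6</td><td>0.2</td><td>setosa</td></tr>
<tr><th>27</th><td>5.0</td><td>3.4</td><td>1.6</td><td>0.4</td><td>setosa</td></tr>
<tr><th>28</th><td>5.2</td><td>3.5</td><td>1.5</td><td>0.2</td><td>setosa</td></tr>
<tr><th>29</th><td>5.2</td><td>3.4</td><td>1.4</td><td>0.2</td><td>setosa</td></tr>
<tr><th>30</th><td>4.7</td><td>3.2</td><td>1.6</td><td>0.2</td><td>setosa</td></tr></table>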
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 51, "text": [ "150x5 DataFrame\n", "|-------|-------------|------------|-------------|------------|-------------|\n", "| Row # | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |\n", "| 1 | 5.1 | 3.5 | 1.4 | 0.2 | \"setosa\" |\n", "| 2 | 4.9 | 3.0 | 1.4 | 0.2 | \"setosa\" |\n", "| 3 | 4.7 | 3.2 | 1.3 | 0.2 | \"setosa\" |\n", "| 4 | 4.6 | 3.1 | 1.5 | 0.2 | \"setosa\" |\n", "| 5 | 5.0 | 3.6 | 1.4 | 0.2 | \"setosa\" |\n", "| 6 | 5.4 | 3.9 | 1.7 | 0.4 | \"setosa\" |\n", "| 7 | 4.6 | 3.4 | 1.4 | 0.3 | \"setosa\" |\n", "| 8 | 5.0 | 3.4 | 1.5 | 0.2 | \"setosa\" |\n", "| 9 | 4.4 | 2.9 | 1.4 | 0.2 | \"setosa\" |\n", "| 10 | 4.9 | 3.1 | 1.5 | 0.1 | \"setosa\" |\n", "| 11 | 5.4 | 3.7 | 1.5 | 0.2 | \"setosa\" |\n", "\u22ee\n", "| 139 | 6.0 | 3.0 | 4.8 | 1.8 | \"virginica\" |\n", "| 140 | 6.9 | 3.1 | 5.4 | 2.1 | \"virginica\" |\n", "| 141 | 6.7 | 3.1 | 5.6 | 2.4 | \"virginica\" |\n", "| 142 | 6.9 | 3.1 | 5.1 | 2.3 | \"virginica\" |\n", "| 143 | 5.8 | 2.7 | 5.1 | 1.9 | \"virginica\" |\n", "| 144 | 6.8 | 3.2 | 5.9 | 2.3 | \"virginica\" |\n", "| 145 | 6.7 | 3.3 | 5.7 | 2.5 | \"virginica\" |\n", "| 146 | 6.7 | 3.0 | 5.2 | 2.3 | \"virginica\" |\n", "| 147 | 6.3 | 2.5 | 5.0 | 1.9 | \"virginica\" |\n", "| 148 | 6.5 | 3.0 | 5.2 | 2.0 | \"virginica\" |\n", "| 149 | 6.2 | 3.4 | 5.4 | 2.3 | \"virginica\" |\n", "| 150 | 5.9 | 3.0 | 5.1 | 1.8 | \"virginica\" |" ] } ], "prompt_number": 51 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Optim" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using Optim" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 52 }, { "cell_type": "code", "collapsed": false, "input": [ "x = rand(Normal(31, 11), 1_000)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", 
"prompt_number": 53, "text": [ "1000-element Array{Float64,1}:\n", " 22.3331\n", " 41.9854\n", " 25.5507\n", " 35.0022\n", " 38.4288\n", " 37.6098\n", " 14.4765\n", " 26.8999\n", " 16.957 \n", " 38.0962\n", " 51.9647\n", " 13.5306\n", " 34.2712\n", " \u22ee \n", " 15.7977\n", " 21.9261\n", " 23.4996\n", " 40.0516\n", " 27.46 \n", " 36.5742\n", " 33.2723\n", " 21.4413\n", " 31.422 \n", " 39.2109\n", " 24.5928\n", " 39.204 " ] } ], "prompt_number": 53 }, { "cell_type": "code", "collapsed": false, "input": [ "function nll(theta)\n", " -loglikelihood(Normal(theta[1], exp(theta[2])), x)\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 54, "text": [ "nll (generic function with 1 method)" ] } ], "prompt_number": 54 }, { "cell_type": "code", "collapsed": false, "input": [ "nll([1.0, 1.0])" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 55, "text": [ "70556.9217598127" ] } ], "prompt_number": 55 }, { "cell_type": "code", "collapsed": false, "input": [ "fit = optimize(nll, [0.0, 0.0])" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 56, "text": [ "Results of Optimization Algorithm\n", " * Algorithm: Nelder-Mead\n", " * Starting Point: [0.0,0.0]\n", " * Minimum: [30.87859403957906,2.4004058957100334]\n", " * Value of Function at Minimum: 3819.342967\n", " * Iterations: 54\n", " * Convergence: true\n", " * |x - x'| < NaN: false\n", " * |f(x) - f(x')| / |f(x)| < 1.0e-08: true\n", " * |g(x)| < NaN: false\n", " * Exceeded Maximum Number of Iterations: false\n", " * Objective Function Calls: 106\n", " * Gradient Call: 0" ] } ], "prompt_number": 56 }, { "cell_type": "code", "collapsed": false, "input": [ "theta = fit.minimum" ], "language": "python", "metadata": 
{ "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 57, "text": [ "2-element Array{Float64,1}:\n", " 30.8786 \n", " 2.40041" ] } ], "prompt_number": 57 }, { "cell_type": "code", "collapsed": false, "input": [ "Normal(theta[1], exp(theta[2]))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 58, "text": [ "Normal( \u03bc=30.87859403957906 \u03c3=11.027651548809786 )" ] } ], "prompt_number": 58 } ], "metadata": {} } ] }