{ "metadata": { "language": "Julia", "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Julia and Statistical Computing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Why Use Julia?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "![](files/julia.png)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Additional Reasons to Use Julia\n", "\n", "* Clean language design:\n", " * Conservative, highly selective choice of features\n", " * Appropriate for both prototype and production code\n", " * Familiar C-like syntax\n", " * Strong metaprogramming facilities derived from Lisp\n", "* Smooth interoperability with C/Fortran\n", "* Growing, dedicated community" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Why You Shouldn't Use Julia" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* Julia is very young\n", "* The Julia package ecosystem is even younger\n", "* Breaking changes are still coming in core and will be quite frequent outside of core Julia\n", "* Language features are still being added: your favorite may not exist yet\n", "* Code quality for packages varies from reasonably well tested to never tested to broken" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "So... Should You Use Julia?" 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That depends on your use case:\n", "\n", "* If you tend to build lots of tools from scratch, Julia is usable, but a little rough\n", "* If you tend to build upon lots of other packages, Julia isn't ready for you yet" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The Basics of Julia" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Julia has all of the basic types and functions you need to do math" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# Integers\n", "1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "1" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "# Floating point numbers\n", "1.0" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "1.0" ] } ], "prompt_number": 2 }, { "cell_type": "code", "collapsed": false, "input": [ "# Strings\n", "\"a\"" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "\"a\"" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "# Complex numbers\n", "1 + 2im" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "1 + 2im" ] } ], "prompt_number": 4 }, { "cell_type": "code", "collapsed": false, "input": [ "# Vectors\n", "[1, 2, 3]" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ "3-element Array{Int64,1}:\n", 
" 1\n", " 2\n", " 3" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "# Matrices\n", "[1.0 2.0; 3.1 4.2]" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ "2x2 Array{Float64,2}:\n", " 1.0 2.0\n", " 3.1 4.2" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "# Elementary functions\n", "sin(1.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "0.8414709848078965" ] } ], "prompt_number": 7 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Julia's Type System" ] }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(Any)[1:4]" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 8, "text": [ "4-element Array{Any,1}:\n", " AbstractArray{T,N}\n", " AbstractCmd \n", " AbstractRNG \n", " Algorithm " ] } ], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(Number)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "5-element Array{Any,1}:\n", " Complex{Float16}\n", " Complex{Float32}\n", " Complex{Float64}\n", " Complex{T<:Real}\n", " Real " ] } ], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(Real)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "4-element Array{Any,1}:\n", " FloatingPoint \n", " Integer \n", " MathConst{sym} \n", " Rational{T<:Integer}" ] } ], "prompt_number": 10 }, { "cell_type": 
"code", "collapsed": false, "input": [ "subtypes(Integer)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "5-element Array{Any,1}:\n", " BigInt \n", " Bool \n", " Char \n", " Signed \n", " Unsigned" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "subtypes(None)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method subtypes(Type{None})\nwhile loading In[12], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method subtypes(Type{None})\nwhile loading In[12], in expression starting on line 1" ] } ], "prompt_number": 12 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Why Are Types Important?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "* **Abstraction**: Assert that a computation works for many related types of objects\n", "* **Parametric Polymorphism**: Function behavior can be defined over many types at once\n", "* **Ad Hoc Polymorphism**: Function behavior can vary systematically across types\n", "* **Type Safety**: How should a function work if the wrong type of argument is used?" 
] }, { "cell_type": "heading", "level": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Abstraction / Parametric Polymorphism" ] }, { "cell_type": "code", "collapsed": false, "input": [ "sqrt(2)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "1.4142135623730951" ] } ], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": [ "sqrt(2.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ "1.4142135623730951" ] } ], "prompt_number": 2 }, { "cell_type": "heading", "level": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Ad Hoc Polymorphism" ] }, { "cell_type": "code", "collapsed": false, "input": [ "1 + 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ "2" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "1.0 + 1.0" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ "2.0" ] } ], "prompt_number": 4 }, { "cell_type": "heading", "level": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Type Safety" ] }, { "cell_type": "code", "collapsed": false, "input": [ "\"wat\" + 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method +(ASCIIString, Int64)\nwhile loading In[5], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method +(ASCIIString, Int64)\nwhile loading In[5], in expression starting on line 1" ] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": 
[ "1 + \"wat\"" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method +(Int64, ASCIIString)\nwhile loading In[6], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method +(Int64, ASCIIString)\nwhile loading In[6], in expression starting on line 1" ] } ], "prompt_number": 6 }, { "cell_type": "code", "collapsed": false, "input": [ "\"wat\" - 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method -(ASCIIString, Int64)\nwhile loading In[7], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method -(ASCIIString, Int64)\nwhile loading In[7], in expression starting on line 1" ] } ], "prompt_number": 7 }, { "cell_type": "code", "collapsed": false, "input": [ "1 - \"wat\"" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method -(Int64, ASCIIString)\nwhile loading In[8], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method -(Int64, ASCIIString)\nwhile loading In[8], in expression starting on line 1" ] } ], "prompt_number": 8 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Multiple Dispatch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The meaning of a function can depend upon the types of its inputs" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function foo(a, b)\n", " return a + b\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "foo (generic function with 1 method)" ] } ], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "foo(1, 2)" ], "language": "python", "metadata": { "slideshow": { 
"slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "3" ] } ], "prompt_number": 10 }, { "cell_type": "code", "collapsed": false, "input": [ "foo(1.0, 2)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 11, "text": [ "3.0" ] } ], "prompt_number": 11 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(a, b) = a * b" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "bar (generic function with 1 method)" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(2, 3)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "6" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(1, \"wat\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "no method *(Int64, ASCIIString)\nwhile loading In[14], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "no method *(Int64, ASCIIString)\nwhile loading In[14], in expression starting on line 1", " in bar at In[12]:1" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "bar(\"NaN\", \"NaN\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "\"NaNNaN\"" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(a::Integer, b::Integer) = 1" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 
16, "text": [ "folly (generic function with 1 method)" ] } ], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(a::Float64, b::Float64) = 2" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "folly (generic function with 2 methods)" ] } ], "prompt_number": 17 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(1, 2)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 18, "text": [ "1" ] } ], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "folly(1.0, 2.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "2" ] } ], "prompt_number": 19 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "User-Defined Types" ] }, { "cell_type": "code", "collapsed": false, "input": [ "type NewType\n", " a::Int\n", " b::UTF8String\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 20 }, { "cell_type": "code", "collapsed": false, "input": [ "x = NewType(1, \"this is a string\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 21, "text": [ "NewType(1,\"this is a string\")" ] } ], "prompt_number": 21 }, { "cell_type": "code", "collapsed": false, "input": [ "x.a, x.b" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 22, "text": [ "(1,\"this is a string\")" ] } ], "prompt_number": 22 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { 
"slide_type": "slide" } }, "source": [ "Modules / Namespaces" ] }, { "cell_type": "code", "collapsed": false, "input": [ "module NewNamespace\n", " c = 42\n", " export c\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "c" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "LoadError", "evalue": "c not defined\nwhile loading In[24], in expression starting on line 1", "output_type": "pyerr", "traceback": [ "c not defined\nwhile loading In[24], in expression starting on line 1" ] } ], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "using NewNamespace" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 25 }, { "cell_type": "code", "collapsed": false, "input": [ "c" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 26, "text": [ "42" ] } ], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "c = 41" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 27, "text": [ "41" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "Warning: imported binding for c overwritten in module Main\n" ] } ], "prompt_number": 27 }, { "cell_type": "code", "collapsed": false, "input": [ "NewNamespace.c" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ "42" ] } ], "prompt_number": 28 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "What's the Catch?" 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "If Julia looks like most dynamic languages, why is it so fast?" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Designing for Efficiency\n", "\n", "* Julia's strict language definition makes it possible for the Julia compiler to do a lot of work:\n", " * Infer very specific types for variables\n", " * Perform static analysis of code\n", " * Provide custom binary implementations of high-level constructs based on specific types\n", " * Represent data in a maximally efficient way\n", "* Lots of valid code in Python and Ruby is illegal in Julia\n", "* Every Julia type has an obvious deconstruction into machine types" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Static Analysis" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" } }, "source": [ "Julia generates a custom function implementation for every input type signature" ] }, { "cell_type": "code", "collapsed": false, "input": [ "foo(a, b) = a + b" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 29, "text": [ "foo (generic function with 1 method)" ] } ], "prompt_number": 29 }, { "cell_type": "code", "collapsed": false, "input": [ "code_llvm(foo, (Int, Int))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "define i64 @julia_foo1292(i64, i64) {\n", "top:\n", " %2 = add i64 %1, %0, !dbg !3947\n", " ret i64 %2, !dbg !3947\n", "}\n" ] } ], "prompt_number": 30 }, { "cell_type": "code", "collapsed": false, "input": [ "code_llvm(foo, (Float64, Float64))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": 
"stream", "stream": "stdout", "text": [ "\n", "define double @julia_foo1305(double, double) {\n", "top:\n", " %2 = fadd double %0, %1, !dbg !3986\n", " ret double %2, !dbg !3986\n", "}\n" ] } ], "prompt_number": 31 }, { "cell_type": "code", "collapsed": false, "input": [ "code_native(foo, (Int, Int))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\t.section\t__TEXT,__text,regular,pure_instructions\n", "Filename: In[29]\n", "Source line: 1\n", "\tpush\tRBP\n", "\tmov\tRBP, RSP\n", "Source line: 1\n", "\tadd\tRDI, RSI\n", "\tmov\tRAX, RDI\n", "\tpop\tRBP\n", "\tret\n" ] } ], "prompt_number": 32 }, { "cell_type": "code", "collapsed": false, "input": [ "code_native(foo, (Float64, Float64))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\t.section\t__TEXT,__text,regular,pure_instructions\n", "Filename: In[29]\n", "Source line: 1\n", "\tpush\tRBP\n", "\tmov\tRBP, RSP\n", "Source line: 1\n", "\tvaddsd\tXMM0, XMM0, XMM1\n", "\tpop\tRBP\n", "\tret\n" ] } ], "prompt_number": 33 }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "For simple functions, Julia's specialized implementation is similar to a natural C implementation" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "As Julia gets smarter, its static analyses tools improve all code you write:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function foo()\n", "\tr = 0\n", "\tfor i in 1:12\n", "\t\tr += 2\n", "\tend\n", "\treturn r\n", "end\n", "\n", "code_llvm(foo, ())" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "define i64 @julia_foo1307() {\n", "pass2:\n", " ret i64 24, !dbg !3993\n", "}\n" ] } ], "prompt_number": 34 }, { 
"cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "What You Lose" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "To make static analysis effective, Julia rules out certain types of code:\n", "\n", "* Reification of scope\n", "* Evaluating code in arbitrary scopes\n", "* Unexpected changes of type\n", "* Mature community and package system" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "R's reification of scope is ruled out by Julia's design:\n", "\n", "```\n", "f1 <- function(scope.number) {\n", "\tls(env = sys.frame(scope.number))\n", "}\n", "\n", "f2 <- function() {\n", " a <- 1\n", "\tmy.scope <- sys.nframe()\n", "\tf1(my.scope)\n", "}\n", "\n", "f2()\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "R's ability to mutate ad hoc scopes is ruled out by Julia's design:\n", "\n", "```\n", "g1 <- function(var.name, scope.number)\n", "{\n", "\tassign(\"a\", -1, envir = sys.frame(scope.number))\n", "}\n", " \n", "g2 <- function()\n", "{\n", "\ta <- 1\n", "\tscope.number <- sys.nframe()\n", "\tprint(a)\n", "\tg1(\"a\", scope.number)\n", "\tprint(a)\n", "}\n", "\n", "g2()\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "R's type mutation by assignment is ruled out by Julia's design:\n", "\n", "```\n", "v <- c(1, 2, 3)\n", "v[1] <- \"a\"\n", "print(v)\n", "```" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "To make defaults efficient, Julia assumes you want to work close to the metal:\n", "\n", "* Machine precision integers, not infinite precision integers" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function fib{T <: Integer}(n::T)\n", " if n == zero(T)\n", " return zero(T)\n", " elseif n == one(T)\n", " return one(T)\n", 
" else\n", " a, b = zero(T), one(T)\n", " i = 1\n", " while i < n\n", " a, b = b, a + b\n", " i += 1\n", " end\n", " return b\n", " end\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 35, "text": [ "fib (generic function with 1 method)" ] } ], "prompt_number": 35 }, { "cell_type": "code", "collapsed": false, "input": [ "fib(95)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 36, "text": [ "-4953053512429003327" ] } ], "prompt_number": 36 }, { "cell_type": "code", "collapsed": false, "input": [ "fib(BigInt(95))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 37, "text": [ "31940434634990099905" ] } ], "prompt_number": 37 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Statistics in Julia" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* StatsBase.jl (Mature, but incomplete)\n", "* DataArrays.jl (Usable, but immature)\n", "* DataFrames.jl (Usable, but immature)\n", "* Distributions.jl (Mature, complete)\n", "* Gadfly.jl (Mature, but incomplete)\n", "* GLM.jl (Usable, but immature)\n", "* Optim.jl, NLopt.jl, DualNumbers.jl, Calculus.jl, JuMP.jl (Mature, fairly complete)" ] }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "StatsBase" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using StatsBase" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 38 }, { "cell_type": "code", "collapsed": false, "input": [ "modes([1, 1, 2, 2, 3])" ], "language": "python", "metadata": { "slideshow": { "slide_type": 
"slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 39, "text": [ "2-element Array{Int64,1}:\n", " 2\n", " 1" ] } ], "prompt_number": 39 }, { "cell_type": "code", "collapsed": false, "input": [ "corspearman(rand(100), rand(100))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 40, "text": [ "-0.08795679567956798" ] } ], "prompt_number": 40 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "DataArrays" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using DataArrays" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 41 }, { "cell_type": "code", "collapsed": false, "input": [ "da = @data([1, 2, NA, 4])" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 42, "text": [ "4-element DataArray{Int64,1}:\n", " 1 \n", " 2 \n", " NA\n", " 4 " ] } ], "prompt_number": 42 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "DataFrames" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using DataFrames" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 43 }, { "cell_type": "code", "collapsed": false, "input": [ "df = DataFrame(\n", " A = [1, 2, 3],\n", " B = [\"a\", \"b\", \"c\"],\n", " C = [1//2, 3//4, 5//6]\n", ")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 44, "text": [ "3x3 DataFrame\n", "|-------|---|-----|------|\n", "| Row # | A | B | C |\n", "| 1 | 1 | \"a\" | 1//2 |\n", "| 2 | 2 | \"b\" | 3//4 |\n", "| 3 | 3 | \"c\" | 5//6 |" ] } ], "prompt_number": 44 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Distributions" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using Distributions" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 45 }, { "cell_type": "code", "collapsed": false, "input": [ "srand(1)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 46 }, { "cell_type": "code", "collapsed": false, "input": [ "x = rand(Normal(10, 1), 10)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 47, "text": [ "10-element Array{Float64,1}:\n", " 10.6701 \n", " 10.5509 \n", " 9.93663\n", " 11.3369 \n", " 9.92685\n", " 9.25454\n", " 8.77994\n", " 9.94682\n", " 9.83486\n", " 7.88463" ] } ], "prompt_number": 47 }, { "cell_type": "code", "collapsed": false, "input": [ "pdf(Normal(10, 1), 1.0)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 48, "text": [ "1.0279773571668915e-18" ] } ], "prompt_number": 48 }, { "cell_type": "code", "collapsed": false, "input": [ "loglikelihood(Normal(10, 1), x)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 49, "text": [ "-13.738557913500014" ] } ], "prompt_number": 49 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "RDatasets" ] }, { "cell_type": "code", "collapsed": 
false, "input": [ "using RDatasets" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 50 }, { "cell_type": "code", "collapsed": false, "input": [ "iris = dataset(\"datasets\", \"iris\")" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "html": [ "
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 51, "text": [ "150x5 DataFrame\n", "|-------|-------------|------------|-------------|------------|-------------|\n", "| Row # | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |\n", "| 1 | 5.1 | 3.5 | 1.4 | 0.2 | \"setosa\" |\n", "| 2 | 4.9 | 3.0 | 1.4 | 0.2 | \"setosa\" |\n", "| 3 | 4.7 | 3.2 | 1.3 | 0.2 | \"setosa\" |\n", "| 4 | 4.6 | 3.1 | 1.5 | 0.2 | \"setosa\" |\n", "| 5 | 5.0 | 3.6 | 1.4 | 0.2 | \"setosa\" |\n", "| 6 | 5.4 | 3.9 | 1.7 | 0.4 | \"setosa\" |\n", "| 7 | 4.6 | 3.4 | 1.4 | 0.3 | \"setosa\" |\n", "| 8 | 5.0 | 3.4 | 1.5 | 0.2 | \"setosa\" |\n", "| 9 | 4.4 | 2.9 | 1.4 | 0.2 | \"setosa\" |\n", "| 10 | 4.9 | 3.1 | 1.5 | 0.1 | \"setosa\" |\n", "| 11 | 5.4 | 3.7 | 1.5 | 0.2 | \"setosa\" |\n", "\u22ee\n", "| 139 | 6.0 | 3.0 | 4.8 | 1.8 | \"virginica\" |\n", "| 140 | 6.9 | 3.1 | 5.4 | 2.1 | \"virginica\" |\n", "| 141 | 6.7 | 3.1 | 5.6 | 2.4 | \"virginica\" |\n", "| 142 | 6.9 | 3.1 | 5.1 | 2.3 | \"virginica\" |\n", "| 143 | 5.8 | 2.7 | 5.1 | 1.9 | \"virginica\" |\n", "| 144 | 6.8 | 3.2 | 5.9 | 2.3 | \"virginica\" |\n", "| 145 | 6.7 | 3.3 | 5.7 | 2.5 | \"virginica\" |\n", "| 146 | 6.7 | 3.0 | 5.2 | 2.3 | \"virginica\" |\n", "| 147 | 6.3 | 2.5 | 5.0 | 1.9 | \"virginica\" |\n", "| 148 | 6.5 | 3.0 | 5.2 | 2.0 | \"virginica\" |\n", "| 149 | 6.2 | 3.4 | 5.4 | 2.3 | \"virginica\" |\n", "| 150 | 5.9 | 3.0 | 5.1 | 1.8 | \"virginica\" |" ] } ], "prompt_number": 51 }, { "cell_type": "heading", "level": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Optim" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using Optim" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "prompt_number": 52 }, { "cell_type": "code", "collapsed": false, "input": [ "x = rand(Normal(31, 11), 1_000)" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", 
"prompt_number": 53, "text": [ "1000-element Array{Float64,1}:\n", " 22.3331\n", " 41.9854\n", " 25.5507\n", " 35.0022\n", " 38.4288\n", " 37.6098\n", " 14.4765\n", " 26.8999\n", " 16.957 \n", " 38.0962\n", " 51.9647\n", " 13.5306\n", " 34.2712\n", " \u22ee \n", " 15.7977\n", " 21.9261\n", " 23.4996\n", " 40.0516\n", " 27.46 \n", " 36.5742\n", " 33.2723\n", " 21.4413\n", " 31.422 \n", " 39.2109\n", " 24.5928\n", " 39.204 " ] } ], "prompt_number": 53 }, { "cell_type": "code", "collapsed": false, "input": [ "function nll(theta)\n", " -loglikelihood(Normal(theta[1], exp(theta[2])), x)\n", "end" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 54, "text": [ "nll (generic function with 1 method)" ] } ], "prompt_number": 54 }, { "cell_type": "code", "collapsed": false, "input": [ "nll([1.0, 1.0])" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 55, "text": [ "70556.9217598127" ] } ], "prompt_number": 55 }, { "cell_type": "code", "collapsed": false, "input": [ "fit = optimize(nll, [0.0, 0.0])" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 56, "text": [ "Results of Optimization Algorithm\n", " * Algorithm: Nelder-Mead\n", " * Starting Point: [0.0,0.0]\n", " * Minimum: [30.87859403957906,2.4004058957100334]\n", " * Value of Function at Minimum: 3819.342967\n", " * Iterations: 54\n", " * Convergence: true\n", " * |x - x'| < NaN: false\n", " * |f(x) - f(x')| / |f(x)| < 1.0e-08: true\n", " * |g(x)| < NaN: false\n", " * Exceeded Maximum Number of Iterations: false\n", " * Objective Function Calls: 106\n", " * Gradient Call: 0" ] } ], "prompt_number": 56 }, { "cell_type": "code", "collapsed": false, "input": [ "theta = fit.minimum" ], "language": "python", "metadata": 
{ "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 57, "text": [ "2-element Array{Float64,1}:\n", " 30.8786 \n", " 2.40041" ] } ], "prompt_number": 57 }, { "cell_type": "code", "collapsed": false, "input": [ "Normal(theta[1], exp(theta[2]))" ], "language": "python", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 58, "text": [ "Normal( \u03bc=30.87859403957906 \u03c3=11.027651548809786 )" ] } ], "prompt_number": 58 } ], "metadata": {} } ] }