{ "metadata": { "language": "Julia", "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "#1. PLEAC Julia: Strings\n", "[PLEAC = Language Examples Alike Cookbook](http://pleac.sourceforge.net/)\n", "PLEAC examples are drawn from the \"Perl Cookbook\" by Tom Christiansen & Nathan Torkington, published by O'Reilly. They provide a nice range of examples oriented toward data munging, the type of work I tend to want to do first when learning a new language.\n", "\n", "The Julia examples below are principally translations from the [Python version](http://pleac.sourceforge.net/pleac_python/strings.html)\n", "\n", "### Caution\n", "I'm learning as I go, so the code below probably doesn't represent best practice. Your suggestions are welcome! \n", "Please file an issue or make a pull request to the [github repo](https://github.com/catawbasam/IJulia_PLEAC/).\n", "\n", "### Why isn't this in the main PLEAC repo?\n", "IJulia_PLEAC uses IJulia notebook, whose format appears to be incompatible with the PLEAC file format.\n", "\n", "The examples are not complete. Missing items are generally noted in comments. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PLEAC 1.0: Introduction\n", "Let's try out some string literals." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"\\n\" # a string containing 1 character, which is a newline" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 12, "text": [ "\"\\n\"" ] } ], "prompt_number": 12 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"\\\\\\n\" # two characters, \\ and n" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 13, "text": [ "\"\\\\\\n\"" ] } ], "prompt_number": 13 }, { "cell_type": "code", "collapsed": false, "input": [ "length(mystr)==2" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 14, "text": [ "true" ] } ], "prompt_number": 14 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"Jon 'Maddog' Orwant\" # literal single quote inside double quotes" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 15, "text": [ "\"Jon 'Maddog' Orwant\"" ] } ], "prompt_number": 15 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"\"\"Jon \"Maddog\" Orwant\"\"\" # literal double quote inside triple quotes" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 16, "text": [ "\"Jon \\\"Maddog\\\" Orwant\"" ] } ], "prompt_number": 16 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"Jon \\\"Maddog\\\" Orwant\" # escaped double quote" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 17, "text": [ "\"Jon \\\"Maddog\\\" Orwant\"" ] } ], "prompt_number": 17 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"\"\"\n", "This is a multiline string literal\n", "enclosed in triple double quotes.\n", "\"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, 
"output_type": "pyout", "prompt_number": 18, "text": [ "\"This is a multiline string literal\\nenclosed in triple double quotes.\\n\"" ] } ], "prompt_number": 18 }, { "cell_type": "code", "collapsed": false, "input": [ "#julia does not currently have an equivalent to python docstringss, but triple-quoted comments are allowed\n", "function fn(x)\n", " \"\"\"we can put a comment here\"\"\"\n", " return 2x\n", "end\n", "\n", "fn(3)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 155, "text": [ "6" ] } ], "prompt_number": 155 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.1: Accessing Substrings" ] }, { "cell_type": "code", "collapsed": false, "input": [ "offset=2; count=3;\n", "mystr[offset:offset+count]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 19, "text": [ "\"his \"" ] } ], "prompt_number": 19 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr[offset:] # offset to end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 20, "text": [ "\"his is a multiline string literal\\nenclosed in triple double quotes.\\n\"" ] } ], "prompt_number": 20 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Get a 5-char string, skip 3, then grab 2 8-char strings, then the rest" ] }, { "cell_type": "code", "collapsed": false, "input": [ "data = b\"abcdefghijklmnopqrstuvwxyz\"; # byte array 0x61-0x7a\n", "leading, s1, s2, trailing = data[1:5], data[9:16], data[17:24], data[25:]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 21, "text": [ "([0x61,0x62,0x63,0x64,0x65],[0x69,0x6a,0x6b,0x6c,0x6d,0x6e,0x6f,0x70],[0x71,0x72,0x73,0x74,0x75,0x76,0x77,0x78],[0x79,0x7a])" ] } ], "prompt_number": 21 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Split at five-char 
boundaries" ] }, { "cell_type": "code", "collapsed": false, "input": [ "typeof(data)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 22, "text": [ "Array{Uint8,1}" ] } ], "prompt_number": 22 }, { "cell_type": "code", "collapsed": false, "input": [ "a=Array{Uint8,1}[]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 23, "text": [ "0-element Array{Array{Uint8,1},1}" ] } ], "prompt_number": 23 }, { "cell_type": "code", "collapsed": false, "input": [ "push!(a,data[1:5])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 24, "text": [ "1-element Array{Array{Uint8,1},1}:\n", " [0x61,0x62,0x63,0x64,0x65]" ] } ], "prompt_number": 24 }, { "cell_type": "code", "collapsed": false, "input": [ "i=1\n", "count=5\n", "fivers=Array{Uint8,1}[] # 1 dimensional array of bytearrays\n", "while i<=length(data)\n", " chunk = data[ i : min(i+4,length(data)) ]\n", " push!(fivers, chunk )\n", " i=i+count\n", "end\n", "fivers" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 25, "text": [ "6-element Array{Array{Uint8,1},1}:\n", " [0x61,0x62,0x63,0x64,0x65]\n", " [0x66,0x67,0x68,0x69,0x6a]\n", " [0x6b,0x6c,0x6d,0x6e,0x6f]\n", " [0x70,0x71,0x72,0x73,0x74]\n", " [0x75,0x76,0x77,0x78,0x79]\n", " [0x7a] " ] } ], "prompt_number": 25 }, { "cell_type": "markdown", "metadata": {}, "source": [ "####index to sub-string" ] }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"This is what you have\"\n", "# julia indexes are 1-based\n", "first = mystr[1] # \"T\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 26, "text": [ "'T'" ] } ], "prompt_number": 26 }, { "cell_type": "code", "collapsed": false, "input": [ "start = mystr[6:7] # \"is\"" ], "language": "python", 
"metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 27, "text": [ "\"is\"" ] } ], "prompt_number": 27 }, { "cell_type": "code", "collapsed": false, "input": [ "rest = mystr[14:] # \"you have\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 28, "text": [ "\"you have\"" ] } ], "prompt_number": 28 }, { "cell_type": "code", "collapsed": false, "input": [ "last = mystr[end:] # \"e\" a string" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 29, "text": [ "\"e\"" ] } ], "prompt_number": 29 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr[end] #a char" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 30, "text": [ "'e'" ] } ], "prompt_number": 30 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr[end-1] #a char" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 31, "text": [ "'v'" ] } ], "prompt_number": 31 }, { "cell_type": "code", "collapsed": false, "input": [ "end4 = mystr[end-3:] # \"have\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 32, "text": [ "\"have\"" ] } ], "prompt_number": 32 }, { "cell_type": "code", "collapsed": false, "input": [ "piece = mystr[end-7:end-4] # \"you\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 33, "text": [ "\"you \"" ] } ], "prompt_number": 33 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr[1:10]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 34, "text": [ "\"This is wh\"" ] } ], "prompt_number": 34 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr[10+1:end]" ], "language": "python", 
"metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 35, "text": [ "\"at you have\"" ] } ], "prompt_number": 35 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### replace substr" ] }, { "cell_type": "code", "collapsed": false, "input": [ "replace(mystr, \" is \", \" wasn't \")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 36, "text": [ "\"This wasn't what you have\"" ] } ], "prompt_number": 36 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### test for substr" ] }, { "cell_type": "code", "collapsed": false, "input": [ "txt = \"at\"\n", "contains(mystr,txt)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 37, "text": [ "true" ] } ], "prompt_number": 37 }, { "cell_type": "code", "collapsed": false, "input": [ "beginswith(mystr, txt)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 38, "text": [ "false" ] } ], "prompt_number": 38 }, { "cell_type": "code", "collapsed": false, "input": [ "endswith(mystr,txt)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 39, "text": [ "false" ] } ], "prompt_number": 39 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.2: Establishing a Default Value" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "####Background - checking for state of a variable." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "isdefined(:astr) " ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 40, "text": [ "false" ] } ], "prompt_number": 40 }, { "cell_type": "code", "collapsed": false, "input": [ "astr=None" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 41, "text": [ "None" ] } ], "prompt_number": 41 }, { "cell_type": "code", "collapsed": false, "input": [ "astr == None " ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 42, "text": [ "true" ] } ], "prompt_number": 42 }, { "cell_type": "code", "collapsed": false, "input": [ "isdefined(:astr) # defined but None" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 43, "text": [ "true" ] } ], "prompt_number": 43 }, { "cell_type": "code", "collapsed": false, "input": [ ":(astr)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 44, "text": [ ":astr" ] } ], "prompt_number": 44 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Haven't found an existing function to do this:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function isassigned(myvar::Symbol)\n", " \"\"\"return true if the Symbol is defined and not None\"\"\"\n", " DEBUG=false\n", " if DEBUG println( :(myvar) ) end\n", " if DEBUG println( isdefined(myvar) ) end\n", " if isdefined(myvar) \n", " varval = eval(myvar);\n", " if DEBUG println( \"varval $varval\" ) end\n", " if varval==None \n", " if DEBUG println( \"$myvar==None \") end\n", " return false\n", " elseif varval==\"\" \n", " if DEBUG println( \"\"\"$myvar==\"\" \"\"\") end\n", " return false\n", " else\n", " if DEBUG println(\"$myvar has an assigned value\") end\n", " return true\n", " end\n", " else\n", " if DEBUG 
println(\"not defined\") end\n", " return false\n", " end\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 45, "text": [ "isassigned (generic function with 1 method)" ] } ], "prompt_number": 45 }, { "cell_type": "code", "collapsed": false, "input": [ "isassigned( :nosuchvar )" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 46, "text": [ "false" ] } ], "prompt_number": 46 }, { "cell_type": "code", "collapsed": false, "input": [ "isassigned( :astr )" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 47, "text": [ "false" ] } ], "prompt_number": 47 }, { "cell_type": "code", "collapsed": false, "input": [ "ex=999\n", "isassigned(:ex)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 48, "text": [ "true" ] } ], "prompt_number": 48 }, { "cell_type": "markdown", "metadata": {}, "source": [ "####Setting a default value" ] }, { "cell_type": "code", "collapsed": false, "input": [ "mydefault = \"blah\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 49, "text": [ "\"blah\"" ] } ], "prompt_number": 49 }, { "cell_type": "code", "collapsed": false, "input": [ "ystr = isdefined(:(astr)) & (astr!=None) ? astr : mydefault \n", "ystr" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 50, "text": [ "\"blah\"" ] } ], "prompt_number": 50 }, { "cell_type": "code", "collapsed": false, "input": [ "ystr = isassigned(:astr) ? 
astr : mydefault" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 51, "text": [ "\"blah\"" ] } ], "prompt_number": 51 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### if myvar is returned from a function and may be empty/None, then use:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function f(x::Int)\n", " \"\"\"a function that can return None\"\"\" \n", " y = (x==4) ? x : None \n", " return y\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 52, "text": [ "f (generic function with 1 method)" ] } ], "prompt_number": 52 }, { "cell_type": "code", "collapsed": false, "input": [ "myvar = f(3)\n", "myvar== (myvar==None) ? \"blah\" : myvar" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 53, "text": [ "None" ] } ], "prompt_number": 53 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### if you want a default value that can be overridden by the person calling your code, you can often wrap it in a function with a named parameter:" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function fdef(prefix=\"pre\")\n", " \"\"\"a function that can return None\"\"\" \n", " prefix * \" duh\"\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 54, "text": [ "fdef (generic function with 2 methods)" ] } ], "prompt_number": 54 }, { "cell_type": "code", "collapsed": false, "input": [ "fdef(\"asd\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 55, "text": [ "\"asd duh\"" ] } ], "prompt_number": 55 }, { "cell_type": "code", "collapsed": false, "input": [ "fdef()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 56, "text": [ "\"pre duh\"" 
] } ], "prompt_number": 56 }, { "cell_type": "code", "collapsed": false, "input": [ "# use b if b is defined, else c\n", "a = isdefined(:b) ? b : 21" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 57, "text": [ "21" ] } ], "prompt_number": 57 }, { "cell_type": "code", "collapsed": false, "input": [ "ENV[\"USERNAME\"]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 58, "text": [ "\"KEITHC\"" ] } ], "prompt_number": 58 }, { "cell_type": "code", "collapsed": false, "input": [ "haskey(ENV,\"USERNAME\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 59, "text": [ "true" ] } ], "prompt_number": 59 }, { "cell_type": "code", "collapsed": false, "input": [ "function get_username()\n", " if haskey(ENV,\"USERNAME\") \n", " usr=ENV[\"USERNAME\"]\n", " elseif haskey(ENV,\"LOGNAME\") \n", " usr=ENV[\"LOGNAME\"]\n", " elseif haskey(ENV,\"USER\") \n", " usr=ENV[\"USER\"] \n", " else \n", " usr = None\n", " end\n", " return usr\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 60, "text": [ "get_username (generic function with 1 method)" ] } ], "prompt_number": 60 }, { "cell_type": "code", "collapsed": false, "input": [ "get_username()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 61, "text": [ "\"KEITHC\"" ] } ], "prompt_number": 61 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.3: Exchanging values without Using Temporary Variables" ] }, { "cell_type": "code", "collapsed": false, "input": [ "# v1, v2 = v2, v1\n", "x, y = (4, 3);" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 61 }, { "cell_type": "code", "collapsed": false, "input": [ "x" ], "language": "python", "metadata": {}, "outputs": [ { 
"metadata": {}, "output_type": "pyout", "prompt_number": 62, "text": [ "4" ] } ], "prompt_number": 62 }, { "cell_type": "code", "collapsed": false, "input": [ "y" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 63, "text": [ "3" ] } ], "prompt_number": 63 }, { "cell_type": "code", "collapsed": false, "input": [ "one, two, three = split(\"January March August\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 64, "text": [ "3-element Array{String,1}:\n", " \"January\"\n", " \"March\" \n", " \"August\" " ] } ], "prompt_number": 64 }, { "cell_type": "code", "collapsed": false, "input": [ "[one, two, three]'" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 65, "text": [ "1x3 Array{ASCIIString,2}:\n", " \"January\" \"March\" \"August\"" ] } ], "prompt_number": 65 }, { "cell_type": "code", "collapsed": false, "input": [ "one, two, three = two, three, one" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 66, "text": [ "(\"March\",\"August\",\"January\")" ] } ], "prompt_number": 66 }, { "cell_type": "code", "collapsed": false, "input": [ "one" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 67, "text": [ "\"March\"" ] } ], "prompt_number": 67 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.4: Converting between ASCII Characters and Values" ] }, { "cell_type": "code", "collapsed": false, "input": [ "char = 'a'" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 68, "text": [ "'a'" ] } ], "prompt_number": 68 }, { "cell_type": "code", "collapsed": false, "input": [ "ascii_value = int(char)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": 
"pyout", "prompt_number": 69, "text": [ "97" ] } ], "prompt_number": 69 }, { "cell_type": "code", "collapsed": false, "input": [ "Base.char(120) # char(120) should work but I'm getting a namespace collision" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 70, "text": [ "'x'" ] } ], "prompt_number": 70 }, { "cell_type": "code", "collapsed": false, "input": [ "println(\"number $(int(char)) is character '$char'\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "number 97 is character 'a'" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 71 }, { "cell_type": "code", "collapsed": false, "input": [ "ascii_character_numbers = [int(c) for c in \"sample\"]" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 72, "text": [ "6-element Array{Int64,1}:\n", " 115\n", " 97\n", " 109\n", " 112\n", " 108\n", " 101" ] } ], "prompt_number": 72 }, { "cell_type": "code", "collapsed": false, "input": [ "join([c for c in \"sample\"],\"\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 73, "text": [ "\"sample\"" ] } ], "prompt_number": 73 }, { "cell_type": "code", "collapsed": false, "input": [ "join([Base.char(i) for i in ascii_character_numbers],\"\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 74, "text": [ "\"sample\"" ] } ], "prompt_number": 74 }, { "cell_type": "code", "collapsed": false, "input": [ "hal=\"HAL\"\n", "join([Base.char(int(i)+1) for i in hal],\"\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 75, "text": [ "\"IBM\"" ] } ], "prompt_number": 75 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.5: Processing a String one 
Character at a Time" ] }, { "cell_type": "code", "collapsed": false, "input": [ "myarray = [c for c in \"whatever\"]'" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 76, "text": [ "1x8 Array{Char,2}:\n", " 'w' 'h' 'a' 't' 'e' 'v' 'e' 'r'" ] } ], "prompt_number": 76 }, { "cell_type": "code", "collapsed": false, "input": [ "for mychar in \"whatever\"\n", " println(mychar)\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "w" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n", "h\n", "a\n", "t\n", "e\n", "v\n", "e\n", "r\n" ] } ], "prompt_number": 77 }, { "cell_type": "code", "collapsed": false, "input": [ "cset = Set([c for c in \"Whatever\"]...)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 78, "text": [ "Set{Char}('r','a','h','W','v','t','e')" ] } ], "prompt_number": 78 }, { "cell_type": "code", "collapsed": false, "input": [ "split(\"whatever\",\"\")'" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 79, "text": [ "1x8 Array{String,2}:\n", " \"w\" \"h\" \"a\" \"t\" \"e\" \"v\" \"e\" \"r\"" ] } ], "prompt_number": 79 }, { "cell_type": "code", "collapsed": false, "input": [ "sset =Set( split(\"whatever\",\"\")... 
)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 80, "text": [ "Set{ASCIIString}(\"a\",\"e\",\"r\",\"v\",\"w\",\"t\",\"h\")" ] } ], "prompt_number": 80 }, { "cell_type": "code", "collapsed": false, "input": [ "csort = sort([c for c in cset])'" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 81, "text": [ "1x7 Array{Any,2}:\n", " 'W' 'a' 'e' 'h' 'r' 't' 'v'" ] } ], "prompt_number": 81 }, { "cell_type": "code", "collapsed": false, "input": [ "println(\"unique chars are: \", join(csort,\",\"))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "unique chars are: " ] }, { "output_type": "stream", "stream": "stdout", "text": [ "W,a,e,h,r,t,v\n" ] } ], "prompt_number": 82 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr=\"whatever\"\n", "ascvals= [int(c) for c in mystr];" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 82 }, { "cell_type": "code", "collapsed": false, "input": [ "println(\"\"\"Total is $(sum(ascvals)) for \"$mystr\".\"\"\")" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Total is 870 for \"whatever\"." 
] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] } ], "prompt_number": 83 }, { "cell_type": "code", "collapsed": false, "input": [ "function checksum(mystr) \n", " vals = [ int(c) for c in mystr ]\n", " chksum = sum(vals) % (2^16) - 1\n", " return chksum\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 84, "text": [ "checksum (generic function with 1 method)" ] } ], "prompt_number": 84 }, { "cell_type": "code", "collapsed": false, "input": [ "checksum(\"whatever\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 85, "text": [ "869" ] } ], "prompt_number": 85 }, { "cell_type": "code", "collapsed": false, "input": [ "fl = open(\"c:/temp/blah.txt\",\"w\")\n", "write(fl, \"what up, gee?\")\n", "close(fl)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 86 }, { "cell_type": "code", "collapsed": false, "input": [ "flstr = readall(\"c:/temp/blah.txt\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 87, "text": [ "\"what up, gee?\"" ] } ], "prompt_number": 87 }, { "cell_type": "code", "collapsed": false, "input": [ "checksum(flstr)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 88, "text": [ "1140" ] } ], "prompt_number": 88 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.6: Reversing a String by Word or Character" ] }, { "cell_type": "code", "collapsed": false, "input": [ "reverse([1,2,3])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 89, "text": [ "3-element Array{Int64,1}:\n", " 3\n", " 2\n", " 1" ] } ], "prompt_number": 89 }, { "cell_type": "code", "collapsed": false, "input": [ "reverse(\"Stuff\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": 
{}, "output_type": "pyout", "prompt_number": 90, "text": [ "\"ffutS\"" ] } ], "prompt_number": 90 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr=\"a bunch of words\"\n", "myarray = split(mystr)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 91, "text": [ "4-element Array{String,1}:\n", " \"a\" \n", " \"bunch\"\n", " \"of\" \n", " \"words\"" ] } ], "prompt_number": 91 }, { "cell_type": "code", "collapsed": false, "input": [ "join(reverse(myarray), \" \")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 92, "text": [ "\"words of bunch a\"" ] } ], "prompt_number": 92 }, { "cell_type": "code", "collapsed": false, "input": [ "word = \"reviver\"\n", "is_palindrome = (word == reverse(word))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 93, "text": [ "true" ] } ], "prompt_number": 93 }, { "cell_type": "markdown", "metadata": {}, "source": [ "* skipped long palindromes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.7: Expanding and Compressing Tabs" ] }, { "cell_type": "code", "collapsed": false, "input": [ "replace(\"adfa\\t \",'\\t',\" \")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 94, "text": [ "\"adfa \"" ] } ], "prompt_number": 94 }, { "cell_type": "code", "collapsed": false, "input": [ "replace(\" blah\", \" \",'\\t')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 95, "text": [ "\"\\t\\tblah\"" ] } ], "prompt_number": 95 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.8: Expanding Variables in User Input" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#-----------------------------\n", "rows=24; cols=80\n", "text = \"I am $(rows) high and $(cols) long\"\n", 
"print( text)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "I am 24 high and 80 long" ] } ], "prompt_number": 96 }, { "cell_type": "code", "collapsed": false, "input": [ "re = r\"\\d+\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 97, "text": [ "r\"\\d+\"" ] } ], "prompt_number": 97 }, { "cell_type": "code", "collapsed": false, "input": [ "ismatch(re,\"Are you 43?\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 98, "text": [ "true" ] } ], "prompt_number": 98 }, { "cell_type": "code", "collapsed": false, "input": [ "ismatch(re,\"Are you XXIII?\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 99, "text": [ "false" ] } ], "prompt_number": 99 }, { "cell_type": "code", "collapsed": false, "input": [ "m=match(re,\"Are you 43?\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 100, "text": [ "RegexMatch(\"43\")" ] } ], "prompt_number": 100 }, { "cell_type": "code", "collapsed": false, "input": [ "dump(m)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "RegexMatch" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " \n", " match: SubString{UTF8String} \"43\"\n", " captures: Array(Union(Nothing,SubString{UTF8String}),(0,)) []\n", " offset: " ] }, { "output_type": "stream", "stream": "stdout", "text": [ "Int64 9\n", " offsets: Array(Int64,(0,)) []\n" ] } ], "prompt_number": 101 }, { "cell_type": "code", "collapsed": false, "input": [ "m.offset" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 102, "text": [ "9" ] } ], "prompt_number": 102 }, { "cell_type": "code", "collapsed": false, "input": [ "m.match" 
], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 103, "text": [ "\"43\"" ] } ], "prompt_number": 103 }, { "cell_type": "code", "collapsed": false, "input": [ "replace(\"Are you 23 or 43?\",re,\"xx\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 104, "text": [ "\"Are you xx or xx?\"" ] } ], "prompt_number": 104 }, { "cell_type": "code", "collapsed": false, "input": [ "m.captures" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 105, "text": [ "0-element Array{Union(Nothing,SubString{UTF8String}),1}" ] } ], "prompt_number": 105 }, { "cell_type": "code", "collapsed": false, "input": [ "rec=r\"(\\d+)\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 106, "text": [ "r\"(\\d+)\"" ] } ], "prompt_number": 106 }, { "cell_type": "code", "collapsed": false, "input": [ "m=match(rec,\"Are you 43?\")\n", "dump(m)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "RegexMatch" ] }, { "output_type": "stream", "stream": "stdout", "text": [ " \n", " match: SubString{UTF8String} \"43\"\n", " captures: Array(Union(Nothing,SubString{UTF8String}),(1,)) [\"43\"]\n", " offset: Int64 9\n", " offsets: Array(Int64,(1,)) [9]\n" ] } ], "prompt_number": 107 }, { "cell_type": "code", "collapsed": false, "input": [ "parseint( m.captures[1])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 108, "text": [ "43" ] } ], "prompt_number": 108 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \"Are you 43?\"\n", "m = match(rec, mystr)\n", "replace(mystr, m.captures[1], string(2*parseint(m.captures[1])) )" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": 
"pyout", "prompt_number": 109, "text": [ "\"Are you 86?\"" ] } ], "prompt_number": 109 }, { "cell_type": "markdown", "metadata": {}, "source": [ "####Safe Substitution" ] }, { "cell_type": "code", "collapsed": false, "input": [ "function default_string(string_symbol::Symbol, default=\"[Not Defined]\")\n", "    # fall back to the caller-supplied default when the symbol is not defined\n", "    def_str = isdefined(string_symbol) ? eval(string_symbol) : default\n", "    return def_str\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 110, "text": [ "default_string (generic function with 2 methods)" ] } ], "prompt_number": 110 }, { "cell_type": "code", "collapsed": false, "input": [ "\"\"\"who is $(isdefined(:z) ? z : \"[Not Defined]\")\"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 111, "text": [ "\"who is [Not Defined]\"" ] } ], "prompt_number": 111 }, { "cell_type": "code", "collapsed": false, "input": [ "\"\"\"who is $(default_string(:z))\"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 112, "text": [ "\"who is [Not Defined]\"" ] } ], "prompt_number": 112 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.9: Controlling Case" ] }, { "cell_type": "code", "collapsed": false, "input": [ "uppercase(\"bo peep\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 113, "text": [ "\"BO PEEP\"" ] } ], "prompt_number": 113 }, { "cell_type": "code", "collapsed": false, "input": [ "lowercase(\"Bo Peep\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 114, "text": [ "\"bo peep\"" ] } ], "prompt_number": 114 }, { "cell_type": "code", "collapsed": false, "input": [ "mystr=\"bo Peep\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 115, 
"text": [ "\"bo Peep\"" ] } ], "prompt_number": 115 }, { "cell_type": "code", "collapsed": false, "input": [ "function capitalcase(mystr)\n", "    # capitalize the first letter of each word in a string\n", "    return join([ucfirst(s) for s in split(mystr)], \" \")\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 116, "text": [ "capitalcase (generic function with 1 method)" ] } ], "prompt_number": 116 }, { "cell_type": "code", "collapsed": false, "input": [ "capitalcase(\"bo peep went to fetch\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 117, "text": [ "\"Bo Peep Went To Fetch\"" ] } ], "prompt_number": 117 }, { "cell_type": "code", "collapsed": false, "input": [ "ucfirst(\"bo peep\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 118, "text": [ "\"Bo peep\"" ] } ], "prompt_number": 118 }, { "cell_type": "code", "collapsed": false, "input": [ "#skipped random case letters" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 119 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.10: Interpolating Functions and Expressions Within Strings" ] }, { "cell_type": "code", "collapsed": false, "input": [ "n=45\n", "\"I have $(n + 1) guanacos.\" " ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 120, "text": [ "\"I have 46 guanacos.\"" ] } ], "prompt_number": 120 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use templates from the Mustache.jl package to substitute values from a dictionary. 
" ] }, { "cell_type": "code", "collapsed": false, "input": [ "using Mustache" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 121 }, { "cell_type": "code", "collapsed": false, "input": [ "mytemplate = mt\"\n", "To: {{address}}\n", "From: Your Bank\n", "CC: {{cc_number}}\n", "Date: {{date}}\n", "\n", "Dear {{name}},\n", "\n", "Today you bounced check number {{checknum}} to us.\n", "Your account is now closed.\n", "\n", "Sincerely,\n", "the management\n", "\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 122, "text": [ "MustacheTokens({{\"text\",\"\\nTo: \",0,7},{\"name\",\"address\",7,20},{\"text\",\"\\nFrom: Your Bank\\nCC: \",20,43},{\"name\",\"cc_number\",43,58},{\"text\",\"\\nDate: \",58,65},{\"name\",\"date\",65,75},{\"text\",\"\\n\\nDear \",75,82},{\"name\",\"name\",82,92},{\"text\",\",\\n\\nToday you bounced check number \",92,126},{\"name\",\"checknum\",126,140},{\"text\",\" to us.\\nYour account is now closed.\\n\\nSincerely,\\nthe management\\n\",140,203}})" ] } ], "prompt_number": 122 }, { "cell_type": "code", "collapsed": false, "input": [ "person = {\"address\"=>\"Joe@somewhere.com\",\n", " \"name\"=>\"Joe\",\n", " \"date\"=>\"2012-04-22\",\n", " \"cc_number\"=>1234567890,\n", " \"checknum\"=> 512 \n", " }" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 123, "text": [ "{\"date\"=>\"2012-04-22\",\"checknum\"=>512,\"cc_number\"=>1234567890,\"address\"=>\"Joe@somewhere.com\",\"name\"=>\"Joe\"}" ] } ], "prompt_number": 123 }, { "cell_type": "code", "collapsed": false, "input": [ "print(render(mytemplate, person))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "To: Joe@somewhere.com\n", "From: Your Bank\n", "CC: 1234567890\n", "Date: 2012-04-22\n", "\n", "Dear Joe,\n", "\n", "Today you bounced check number 512 to us.\n", "Your 
account is now closed.\n", "\n", "Sincerely,\n", "the management\n" ] } ], "prompt_number": 124 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.11: Indenting Here Documents\n", "It is not clear that we need to do anything here: by default, the lines of a triple-quoted string come back with no leading blanks." ] }, { "cell_type": "code", "collapsed": false, "input": [ "var = \"\"\"\n", " your text\n", " goes here\n", " \"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 125, "text": [ "\"your text\\ngoes here\\n\"" ] } ], "prompt_number": 125 }, { "cell_type": "code", "collapsed": false, "input": [ "print(var)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "your text\n", "goes here\n" ] } ], "prompt_number": 126 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Split into lines, keep everything except the first and last lines, left-strip each one, and rejoin." ] }, { "cell_type": "code", "collapsed": false, "input": [ "poem = \"\"\"\n", " Here's your poem:\n", " Now far ahead the Road has gone,\n", " And I must follow, if I can,\n", " Pursuing it with eager feet,\n", " Until it joins some larger way\n", " Where many paths and errand meet.\n", " And whither then? I cannot say.\n", " --Bilbo in /usr/src/perl/pp_ctl.c \n", " \"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 127, "text": [ "\"Here's your poem:\\nNow far ahead the Road has gone,\\n And I must follow, if I can,\\nPursuing it with eager feet,\\n Until it joins some larger way\\nWhere many paths and errand meet.\\n And whither then? 
I cannot say.\\n --Bilbo in /usr/src/perl/pp_ctl.c \\n\"" ] } ], "prompt_number": 127 }, { "cell_type": "code", "collapsed": false, "input": [ "print( poem)" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Here's your poem:\n", "Now far ahead the Road has gone,\n", " And I must follow, if I can,\n", "Pursuing it with eager feet,\n", " Until it joins some larger way\n", "Where many paths and errand meet.\n", " And whither then? I cannot say.\n", " --Bilbo in /usr/src/perl/pp_ctl.c \n" ] } ], "prompt_number": 128 }, { "cell_type": "code", "collapsed": false, "input": [ "print( join( [lstrip(ln) for ln in split(poem,'\\n')[2:end-2] ], \"\\n\"))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Now far ahead the Road has gone,\n", "And I must follow, if I can,\n", "Pursuing it with eager feet,\n", "Until it joins some larger way\n", "Where many paths and errand meet.\n", "And whither then? I cannot say." 
] } ], "prompt_number": 129 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.12: Reformatting Paragraphs" ] }, { "cell_type": "code", "collapsed": false, "input": [ "require(\"textwrap\")\n", "using TextWrap" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 131 }, { "cell_type": "code", "collapsed": false, "input": [ "txt = \"\"\"\\\n", "Folding and splicing is the work of an editor,\n", "not a mere collection of silicon\n", "and\n", "mobile electrons!\n", "\"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 132, "text": [ "\"Folding and splicing is the work of an editor,\\nnot a mere collection of silicon\\nand\\nmobile electrons!\\n\"" ] } ], "prompt_number": 132 }, { "cell_type": "code", "collapsed": false, "input": [ "print(TextWrap.wrap(txt, width=20, initial_indent=4, subsequent_indent=2))" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ " Folding and\n", " splicing is the\n", " work of an editor,\n", " not a mere\n", " collection of\n", " silicon and mobile\n", " electrons!" ] } ], "prompt_number": 133 }, { "cell_type": "code", "collapsed": false, "input": [ "\"\"\"Expected result:\n", "\n", "01234567890123456789\n", " Folding and\n", " splicing is the\n", " work of an editor,\n", " not a mere\n", " collection of\n", " silicon and mobile\n", " electrons!\n", "\"\"\";" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 133 }, { "cell_type": "code", "collapsed": false, "input": [ "# merge multiple lines into one, then wrap one long line -- SKIP" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 134 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.13: Escaping Characters\n", "Will not solve." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.14: Trimming Blanks from the Ends of a String" ] }, { "cell_type": "code", "collapsed": false, "input": [ "mystr = \" blah \"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 135, "text": [ "\" blah \"" ] } ], "prompt_number": 135 }, { "cell_type": "code", "collapsed": false, "input": [ "lstrip(mystr)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 136, "text": [ "\"blah \"" ] } ], "prompt_number": 136 }, { "cell_type": "code", "collapsed": false, "input": [ "rstrip(mystr)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 137, "text": [ "\" blah\"" ] } ], "prompt_number": 137 }, { "cell_type": "code", "collapsed": false, "input": [ "strip(mystr)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 138, "text": [ "\"blah\"" ] } ], "prompt_number": 138 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.15: Parsing Comma-Separated Data" ] }, { "cell_type": "code", "collapsed": false, "input": [ "ezcsv = \"\"\"a,b,c\n", " 1,2,3\n", " 4,5,6\"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 139, "text": [ "\"a,b,c\\n1,2,3\\n4,5,6\"" ] } ], "prompt_number": 139 }, { "cell_type": "code", "collapsed": false, "input": [ "line = \"\"\"XYZZY,\"\",\"O'Reilly, Inc\",\"Wall, Larry\",\"a \\\\\"glug\\\\\" bit,\",5,\"Error, Core Dumped,\",\"\"\"" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 140, "text": [ "\"XYZZY,\\\"\\\",\\\"O'Reilly, Inc\\\",\\\"Wall, Larry\\\",\\\"a \\\\\\\"glug\\\\\\\" bit,\\\",5,\\\"Error, Core Dumped,\\\",\"" ] } ], "prompt_number": 140 }, { "cell_type": "code", "collapsed": false, 
"input": [ "sio = IOBuffer(ezcsv)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 141, "text": [ "IOBuffer([0x61,0x2c,0x62,0x2c,0x63,0x0a,0x31,0x2c,0x32,0x2c,0x33,0x0a,0x34,0x2c,0x35,0x2c,0x36],true,false,true,false,17,9223372036854775807,1)" ] } ], "prompt_number": 141 }, { "cell_type": "code", "collapsed": false, "input": [ "data_array =readdlm(sio, ',')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 142, "text": [ "3x3 Array{Any,2}:\n", " \"a\" \"b\" \"c\"\n", " 1.0 2.0 3.0 \n", " 4.0 5.0 6.0 " ] } ], "prompt_number": 142 }, { "cell_type": "code", "collapsed": false, "input": [ "close(sio)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 143 }, { "cell_type": "code", "collapsed": false, "input": [ "function stringdlm(source::String, delim::Char; has_header=false, use_mmap=false, ignore_invalid_chars=false)\n", "    # parse a string of delimited text; the keyword arguments are accepted but not used yet\n", "    sio = IOBuffer(source)\n", "    data_array = Base.readdlm(sio, delim)\n", "    close(sio)\n", "    return data_array\n", "end " ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 144, "text": [ "stringdlm (generic function with 1 method)" ] } ], "prompt_number": 144 }, { "cell_type": "code", "collapsed": false, "input": [ "stringdlm(ezcsv,',')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 145, "text": [ "3x3 Array{Any,2}:\n", " \"a\" \"b\" \"c\"\n", " 1.0 2.0 3.0 \n", " 4.0 5.0 6.0 " ] } ], "prompt_number": 145 }, { "cell_type": "markdown", "metadata": {}, "source": [ "####readdlm() does not appear to know about quoted data, so it does not handle embedded delimiters well\n", "\n", "For example, \"Wall, Larry\" should not have been split below." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "#bug in readdlm()?\n", "for ln in stringdlm(line,',')'\n", "    println(ln)\n", "end" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "X" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "YZZY\n", "\"\"\n", "\"O'Reilly\n", " Inc\"\n", "\"Wall\n", " Larry\"\n", "\"a \\\"glug\\\" bit\n", "\"\n", "5.0\n", "\"Error\n", " Core Dumped\n", "\"\n" ] } ], "prompt_number": 146 }, { "cell_type": "code", "collapsed": false, "input": [ "stringdlm(\"\"\"'Bush,Reggie','Morris,Alfred'\"\"\", ',')" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 147, "text": [ "1x3 Array{Any,2}:\n", " \"'Bush\" \"Reggie'\" \"'Morris\"" ] } ], "prompt_number": 147 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Here are a couple of simple test files to demonstrate the treatment of quoted data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "test0.csv:\n", "\n", "    \"Name\",\"Age\",\"Weight\"\n", "    \"Bob\",26,160\n", "    \"Sue\",24,110" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "test1.csv -- the name field includes embedded commas:\n", "\n", "    \"Name\",\"Age\",\"Weight\"\n", "    \"Jones,Bob\",26,160\n", "    \"Collins,Sue\",24,110\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "readdlm() does not seem to know about quotes around strings." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "readdlm(\"test0.csv\",',')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "WARNING: " ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\n" ] }, { "ename": "LoadError", "evalue": "new not defined\nat In[148]:1", "output_type": "pyerr", "traceback": [ "new not defined\nat In[148]:1", " in mmap_array at mmap.jl:133" ] } ], "prompt_number": 148 }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a consequence, readdlm() should have trouble with test1.csv. (In the run recorded here, both readdlm() calls stop earlier with an error raised in mmap_array, so the output does not show the quoting problem itself.)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "readdlm(\"test1.csv\",',')" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stderr", "text": [ "backtraces on your platform are often misleading or partially incorrect\n" ] }, { "ename": "LoadError", "evalue": "new not defined\nat In[149]:1", "output_type": "pyerr", "traceback": [ "new not defined\nat In[149]:1", " in mmap_array at mmap.jl:133" ] } ], "prompt_number": 149 }, { "cell_type": "markdown", "metadata": {}, "source": [ "####Package DataFrames contains a readtable() function that is smarter\n", "However, it does not appear to have an option to read from an IOBuffer at the moment." 
] }, { "cell_type": "code", "collapsed": false, "input": [ "using DataFrames" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "\n", "\n" ] } ], "prompt_number": 150 }, { "cell_type": "code", "collapsed": false, "input": [ "readtable(\"test0.csv\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 151, "text": [ "2x3 DataFrame:\n", " Name Age Weight\n", "[1,] \"Bob\" 26 160\n", "[2,] \"Sue\" 24 110\n" ] } ], "prompt_number": 151 }, { "cell_type": "code", "collapsed": false, "input": [ "readtable(\"test1.csv\")" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 152, "text": [ "2x3 DataFrame:\n", " Name Age Weight\n", "[1,] \"Jones,Bob\" 26 160\n", "[2,] \"Collins,Sue\" 24 110\n" ] } ], "prompt_number": 152 }, { "cell_type": "markdown", "metadata": {}, "source": [ "##PLEAC 1.16-1.18 Soundex and Programs fixstyle, psgrep\n", "Will not solve." ] }, { "cell_type": "code", "collapsed": false, "input": [ ";ipython nbconvert 1_pleac_string.ipynb" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stderr", "text": [ "[NbConvertApp] Using existing profile dir: u'C:\\\\Users\\\\keithc\\\\.ipython\\\\profile_default'\r\n" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "[NbConvertApp] Converting notebook 1_pleac_string.ipynb to html\r\n", "[NbConvertApp] Support files will be in 1_pleac_string_files\\\r\n" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "[NbConvertApp] Loaded template html_full.tpl\r\n" ] }, { "output_type": "stream", "stream": "stderr", "text": [ "[NbConvertApp] Writing 322023 bytes to 1_pleac_string.html\r\n" ] } ], "prompt_number": 157 }, { "cell_type": "code", "collapsed": false, "input": [], "language": "python", "metadata": {}, "outputs": [] } ], "metadata": {} } ] }