{ "metadata": { "name": "", "signature": "sha256:0726660b8f2ee16f37b3438fd53e6b6ce9612c29f721cab4f3d41fb02db4dd6b" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Efficient String Concatenation in Python\n", "\n", "## \uc6d0\ubcf8 \uc790\ub8cc\n", "\n", "- [Efficient String Concatenation in Python](http://www.skymind.com/~ocrow/python_string/)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Introduction\n", "\n", "- long strings in the Python programming language can sometimes result in very slow running code\n", "- In Python the **string** object is **immutable**: **each time** a string is **assigned** to a variable a new object is created in memory to represent the **new value**\n", "- \ub9e4\ubc88 \ubb38\uc790\uc5f4 \ub9c8\uc9c0\ub9c9\uc5d0 \uc0c8\ub85c\uc6b4 \ubb38\uc790\uc5f4\uc744 \ubd99\uc774\uac8c \ub418\uba74 \ub9e4\ubc88 \ubb38\uc790\uc5f4\uc744 \uc0c8\ub85c \uc0dd\uc131\ud55c\ub2e4.\n", "- \ub611\uac19\uc740 \uac78 \ub610 \ubcf5\uc0ac\ud558\uace0 \uadf8 \ub4a4\uc5d0 \ucd94\uac00\ud558\uac8c \ub41c\ub2e4. \uadf8\ub798\uc11c \ub9ce\uc740 string\uc744 \ub2e4\ub8f0 \ub54c \ub290\ub824\uc9c0\uac8c \ub41c\ub2e4." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Six Method\n", "\n", "### Method 1: Naive appending\n", "\n", "- backticks(``): int -> str, str()\uacfc \uac19\uc740 \uc5ed\ud560" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!python --version" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Python 2.7.6\r\n" ] } ], "prompt_number": 102 }, { "cell_type": "code", "collapsed": false, "input": [ "loop_count = 100" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 50 }, { "cell_type": "code", "collapsed": false, "input": [ "def method1():\n", " out_str = ''\n", " for num in xrange(loop_count):\n", " out_str += `num`\n", " return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 281 }, { "cell_type": "code", "collapsed": false, "input": [ "# backticks: int -> str\n", "`10`" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 52, "text": [ "'10'" ] } ], "prompt_number": 52 }, { "cell_type": "code", "collapsed": false, "input": [ "str(10)" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 53, "text": [ "'10'" ] } ], "prompt_number": 53 }, { "cell_type": "code", "collapsed": false, "input": [ "method1()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 54, "text": [ "'0123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899'" ] } ], "prompt_number": 54 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit?" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 30 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1000000 loops, best of 3: 1.25 \u00b5s per loop\n" ] } ], "prompt_number": 37 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 1000 -r 10 method1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1000 loops, best of 10: 9.94 \u00b5s per loop\n" ] } ], "prompt_number": 55 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 1000 -r 10 -p 3 -o method1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1000 loops, best of 10: 10.1 \u00b5s per loop\n" ] }, { "metadata": {}, "output_type": "pyout", "prompt_number": 56, "text": [ "" ] } ], "prompt_number": 56 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit -n 1000 method1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1000 loops, best of 3: 10.1 \u00b5s per loop\n" ] } ], "prompt_number": 57 }, { "cell_type": "code", "collapsed": false, "input": [ "%%timeit?" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 31 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Method 2: MutableString class\n", "\n", "- [string - Why is MutableString deprecated in Python? - Stack Overflow](http://stackoverflow.com/questions/4651344/why-is-mutablestring-deprecated-in-python)\n", "- Mutablestring \ud074\ub798\uc2a4\uac00 \uc5c6\ub2e4. \uc0ac\ub77c\uc9d0" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def method2():\n", " from UserString import MutableString\n", " out_str = MutableString()\n", " for num in xrange(loop_count):\n", " out_str += `num`\n", " return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 282 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Method 3: Character arrays\n", "\n", "- c\uc758 array\ub85c \ub9cc\ub4e4\uace0\n", "- fromstring\uc73c\ub85c string\uc73c\ub85c \uac00\uc838\uc640\uc11c \ubc30\uc5f4\uc758 \ub05d\uc5d0 \ucd94\uac00\ud55c\ub2e4.\n", "- \ub9c8\uc9c0\ub9c9\uc73c\ub85c tostring() \ud574\uc11c string\uc73c\ub85c \ub9cc\ub4e0\ub2e4.\n", "\n", "\n", "- Arryas are mutable in python. \uadf8\ub798\uc11c \uc6d0\ub798 \uc788\ub358 \ubb38\uc790\uc5f4\uc740 \ubcf5\uc0ac\ud558\uc9c0 \uc54a\ub294\ub2e4." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def method3():\n", " from array import array\n", " char_array = array('c')\n", " for num in xrange(loop_count):\n", " char_array.fromstring(`num`)\n", " return char_array.tostring()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 283 }, { "cell_type": "code", "collapsed": false, "input": [ "method3()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 284, "text": [ "'012345678910111213141516171819'" ] } ], "prompt_number": 284 }, { "cell_type": "code", "collapsed": false, "input": [ "from array import array" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 60 }, { "cell_type": "code", "collapsed": false, "input": [ "char_array = array('c')" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 61 }, { "cell_type": "code", "collapsed": false, "input": [ "char_array" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 62, "text": [ "array('c')" ] } ], "prompt_number": 62 }, { "cell_type": "code", "collapsed": false, "input": [ "char_array.fromstring(`5`)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 63 }, { "cell_type": "code", "collapsed": false, "input": [ "char_array" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 64, "text": [ "array('c', '5')" ] } ], "prompt_number": 64 }, { "cell_type": "code", "collapsed": false, "input": [ "char_array.fromstring(`6`)\n", "char_array" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 65, "text": [ "array('c', '56')" ] } ], "prompt_number": 65 }, { "cell_type": "code", "collapsed": false, "input": [ "char_array.tostring()" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 66, "text": [ "'56'" ] } ], "prompt_number": 66 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Method 4: Build a list of strings, then join it\n", "\n", "- \uc774 \ubc29\ubc95\uc744 \ucd94\ucc9c\ud55c\ub2e4. \ub9e4\uc6b0 \ud30c\uc774\uc36c\uc2a4\ub7fd\ub2e4." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def method4():\n", " str_list = []\n", " for num in xrange(loop_count):\n", " str_list.append(`num`)\n", " out_str = ''.join(str_list)\n", " return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 285 }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### lookup \ubd80\ud558 \uc904\uc774\uae30(\uae40\ud0dc\ud638\ub2d8 \uc758\uacac)\n", "\n", "- l.append\ub97c ll\ub85c \ubcc0\uacbd\n", "- .\uc5f0\uc0b0\uc790\uac00 \uc788\uc73c\uba74 \ucc38\uc870\ud558\uae30 \uc704\ud574 \ub290\ub824\uc9c0\uae30 \ub54c\ubb38\uc5d0 \uadf8 \ucc38\uc870\ud558\ub294 \uc18d\ub3c4\ub3c4 \uc904\uc774\uae30 \uc704\ud568\n", "- for\ubb38 \uc548\uc5d0\uc11c \uadf9\ud55c\uc758 \uc18d\ub3c4\ub97c \uc5bb\uae30 \uc704\ud574\uc11c\ub294 .\uc73c\ub85c \ucc38\uc870\ud558\ub294 \uac83\ub3c4 \uc0ac\uc6a9\ud558\uc9c0 \ub9d0\uc790" ] }, { "cell_type": "code", "collapsed": false, "input": [ "l = []" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 374 }, { "cell_type": "code", "collapsed": false, "input": [ "ll = l.append" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 375 }, { "cell_type": "code", "collapsed": false, "input": [ "ll(1)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 376 }, { "cell_type": "code", "collapsed": false, "input": [ "ll" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 377, "text": [ "" ] } ], "prompt_number": 377 }, { "cell_type": "code", "collapsed": false, "input": [ "ll(10)\n", "l" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 378, "text": [ "[1, 10]" ] } ], "prompt_number": 378 }, { "cell_type": "code", "collapsed": false, "input": [ "def method4_2():\n", " str_list = []\n", " str_list2 = str_list.append\n", " for num in xrange(loop_count):\n", " str_list2(`num`)\n", " out_str = ''.join(str_list)\n", " return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 310 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Method 5: Write to a pseudo file\n", "\n", "- \ud30c\uc77c\ucc98\ub7fc \uc800\uc7a5\ub41c\ub2e4. \ud558\uc9c0\ub9cc \uc800\uc7a5\ub418\ub294 \uac83\uc740 string.\n", "- \uba85\ud655\ud558\uac8c \ud30c\uc77c\uc5d0 \ucd94\uac00\ud558\ub294 \uac83\uc740 \ub9e4\uc6b0 \uc27d\ub2e4." ] }, { "cell_type": "code", "collapsed": false, "input": [ "def method5():\n", " from cStringIO import StringIO\n", " file_str = StringIO()\n", " for num in xrange(loop_count):\n", " file_str.write(`num`)\n", " out_str = file_str.getvalue()\n", " return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 286 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Method 6: List comprehensions\n", "\n", "- Method4\uc640 \uac19\uc740\ub370 list.append()\ub97c \uac01 loop\uc5d0\uc11c \ubd80\ub974\uc9c0 \uc54a\uae30 \ub54c\ubb38\uc5d0 \uc18d\ub3c4 \ud5a5\uc0c1\uc774 \uc788\ub2e4.\n", "- \ub9e4\uc6b0 \uc9e7\uc740 \ucf54\ub4dc\ub2e4.\n", "- join, list comprehension\uc744 \ub3d9\uc2dc\uc5d0 \uc0ac\uc6a9\ud588\ub2e4" ] }, { "cell_type": "code", "collapsed": false, "input": [ "def method6():\n", " out_str = ''.join([`num` for num in xrange(loop_count)])\n", " return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 287 }, { "cell_type": "code", "collapsed": false, "input": [ "# generator\ub85c \uc0dd\uc131\n", "def method7():\n", " out_str = ''.join((`num` for num in xrange(loop_count)))\n", " return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 288 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Results\n", "\n", "### Results: Twenty thousand integers(20,000)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "loop_count = 20000" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 320 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 2.58 ms per loop\n" ] } ], "prompt_number": 321 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method2()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method3()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 6.45 ms per loop\n" ] } ], "prompt_number": 322 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method4()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 2.84 ms per loop\n" ] } ], "prompt_number": 323 }, { "cell_type": "code", "collapsed": false, "input": [ "# \ud655\uc2e4\ud788 \ube68\ub77c\uc84c\ub2e4!\n", "%timeit method4_2()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 2.07 ms per loop\n" ] } ], "prompt_number": 324 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method5()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 7.66 ms per loop\n" ] } ], "prompt_number": 325 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method6()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1000 loops, best of 3: 1.76 ms per loop\n" ] } ], "prompt_number": 326 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method7()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "100 loops, best of 3: 2.1 ms per loop\n" ] } ], "prompt_number": 327 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \uc18d\ub3c4\n", "\n", "- [\ub2f9\ub2f9\ud55c \uc790\uae30\ud45c\ud604 : \ucef4\ud4e8\ud130 ms, us, ns, ps, fs, as](http://sea1sun.egloos.com/viewer/8879042)\n", "\n", "#### \ucef4\ud4e8\ud130 \ucc98\ub9ac \uc18d\ub3c4 \ub2e8\uc704\n", "\n", "- \ubc00\ub9ac\ucd08(ms) : 1/1,000\ucd08\n", "- \ub9c8\uc774\ud06c\ub85c\ucd08(\u03bcs) : 1/1,000,000\ucd08\n", "- \ub098\ub178\ucd08(ns) : 1/1,000,000,000\ucd08\n", "- \ud53c\ucf54\ucd08(ps) : 1/1,000,000,000,000\ucd08\n", "- \ud3a8\ud1a0\ucd08(fs) : 1/1,000,000,000,000,000\ucd08\n", "- \uc544\ud1a0\ucd08(as) : 1/1,000,000,000,000,000,000\ucd08\n", "- ms : \ubc00\ub9ac\ucd08 0.001 = 10^(-3)\ucd08\n", "- \u33b2 : \ub9c8\uc774\ud06c\ub85c\ucd08 0.000001\ucd08 = 10^(-6)\ucd08\n", "- \u33b1: \ub098\ub178\ucd08 0.000000001\ucd08 = 10^(-9)\ucd08\n", "- \u33b0 : \ud53c\ucf54\ucd08 0.000000000001\ucd08 = 10^(-12)\ucd08\n", "- fs : \ud3a8\ud1a0\ucd08 0.000000000000001\ucd08 = 10^(-15)\ucd08\n", "- as : \uc544\ud1a0\ucd08 0.000000000000000001\ucd08 = 10^(-18)\ucd08\n", "\n", "\n", "- 1024byte = 1KB (kilo byte)\n", "- 1024KB = 1MB (mega byte)\n", "- 1024MB = 1GB (giga byte) = 1048576 KB\n", "- 1024GB = 1TB (tela byte)\n", "- 1024TB = 1PB (peta byte)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ ">timeit: 100, 1000 loops \uc774\ub4e0 \uc5b4\uc9dc\ud53c best 3\uc758 \uacb0\uacfc\ub9cc \ubcf4\uba74 \ub41c\ub2e4." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \uc6d0\ub798 \ube14\ub85c\uadf8\uc5d0 \uc788\ub294 Image(\ub0b4 \uc2e4\ud5d8 \uacb0\uacfc\uc640 \ub2e4\ub984)\n", "\n", "![Concatenations per second(thousands)](http://www.skymind.com/~ocrow/python_string/test_20k.gif)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### \uacb0\uacfc\uac00 \uc774\uc0c1\ud55c\ub370?\n", "\n", "1. method6: list comprehension\n", "2. method4_2: list append(.\uc5f0\uc0b0\uc790 \uc811\uadfc \uc5c6\uc570)\n", "3. method1: string concatenating\n", "4. method4: list append\n", "\n", "\n", "- \uc6d0\ub798 \ube14\ub85c\uadf8\uc5d0\uc11c \ubcf4\uc5ec\uc900 \uac83\uacfc \ub2e4\ub978 \uacb0\uacfc\uac00 \ub098\ud0c0\ub09c\ub2e4.\n", "- method1\uc774 method4\ubcf4\ub2e4 \ube60\ub978\uac8c \uc774\ud574\uac00 \uc548\ub41c\ub2e4. string\uc774 \ubcc0\uacbd\ub418\uc9c0 \uc54a\ub294 \uac1d\uccb4\uc774\uae30 \ub54c\ubb38\uc5d0 \ub9e4\ubc88 \uc0dd\uc131\ub418\uc11c \ub290\ub9b4\uac83 \uac19\uc740\ub370 \uacb0\uacfc\ub294 \ub354 \ube60\ub974\ub124?(**\uac00\uc7a5 \uc774\ud574 \uc548\ub418\ub294 \uacb0\uacfc**)\n", "- method4_2\uc5d0\uc11c **.\uc5f0\uc0b0\uc790\ub85c \ucc38\uace0\ud558\ub294 \uac83\ub3c4 \uc904\uc600\ub354\ub2c8** \uc18d\ub3c4\uac00 \uc880 \ub354 **\ube68\ub77c\uc84c\ub2e4**.\n", "- method5 \uac19\uc740 \uacbd\uc6b0\uc5d0\ub294 memory\uc5d0 \uc4f8\uac83 \uac19\uc740\ub370 \uc2e4\uc81c temp \ud30c\uc77c\ub85c \uc4f0\ub098? \uc65c \uc774\ub9ac \uc18d\ub3c4\uac00 \ub290\ub9ac\uc9c0?\n", "- \ucc38\uace0\ub85c \ub0b4 \ucef4\ud4e8\ud130\ub294 \ub9e5\ubd81 \ud504\ub85c \ub808\ud2f0\ub098\uc784\n", "\n", "### Profiling\n", "\n", "- \ud504\ub85c\ud30c\uc77c\ub9c1\uc744 \ud558\uba74 \uc5b4\ub5a4 \ud568\uc218\uc5d0\uc11c \uc2dc\uac04\uc774 \uc624\ub798 \uc18c\ubaa8\ub418\ub294\uc9c0 \uc54c \uc218 \uc788\uc73c\ub2c8\uae4c \uc88b\uc744\uac83 \uac19\uc740\ub370?\n", "- \ud504\ub85c\ud30c\uc77c\ub9c1\uc5d0 \ub300\ud574\uc11c \ud55c \ubc88 \uc870\uc0ac\ub97c \ud574\ubd10\uc57c\uaca0\ub2e4" ] }, { "cell_type": "code", "collapsed": false, "input": [ "loop_count = 500000" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 96 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 84.7 ms per loop\n" ] } ], "prompt_number": 97 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method2()" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method3()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 loops, best of 3: 178 ms per loop\n" ] } ], "prompt_number": 98 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method4()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 90 ms per loop\n" ] } ], "prompt_number": 99 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method5()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1 loops, best of 3: 197 ms per loop\n" ] } ], "prompt_number": 100 }, { "cell_type": "code", "collapsed": false, "input": [ "%timeit method6()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "10 loops, best of 3: 62.8 ms per loop\n" ] } ], "prompt_number": 101 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### timeit\uc774 \uc798\ubabb\ub410\ub098? \ub0b4\uac00 \uc9c1\uc811 \uc9dc\ubcf4\uc790" ] }, { "cell_type": "code", "collapsed": false, "input": [ "import time\n", "def check_time(func):\n", " def inner(*args, **kwargs):\n", " start = time.time()\n", " for i in range(10):\n", " ret = func(*args, **kwargs)\n", " end = time.time()\n", " print '%.8f' % (end - start)\n", " return ret\n", " return inner" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 123 }, { "cell_type": "code", "collapsed": false, "input": [ "@check_time\n", "def method1():\n", " out_str = ''\n", " for num in xrange(loop_count):\n", " out_str += `num`\n", "# return out_str" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 135 }, { "cell_type": "code", "collapsed": false, "input": [ "@check_time\n", "def method3():\n", " from array import array\n", " char_array = array('c')\n", " for num in xrange(loop_count):\n", " char_array.fromstring(`num`)\n", " char_array.tostring()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 136 }, { "cell_type": "code", "collapsed": false, "input": [ "@check_time\n", "def method4():\n", " str_list = []\n", " for num in xrange(loop_count):\n", " str_list.append(`num`)\n", " ''.join(str_list)" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 137 }, { "cell_type": "code", "collapsed": false, "input": [ "@check_time\n", "def method5():\n", " from cStringIO import StringIO\n", " file_str = StringIO()\n", " for num in xrange(loop_count):\n", " file_str.write(`num`)\n", " file_str.getvalue()" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 138 }, { "cell_type": "code", "collapsed": false, "input": [ "@check_time\n", "def method6():\n", " ''.join([`num` for num in xrange(loop_count)])" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 139 }, { "cell_type": "code", "collapsed": false, "input": [ "method1()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0.87336206\n" ] } ], "prompt_number": 140 }, { "cell_type": "code", "collapsed": false, "input": [ "method3()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "1.85140491\n" ] } ], "prompt_number": 141 }, { "cell_type": "code", "collapsed": false, "input": [ "method4()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0.94222403\n" ] } ], "prompt_number": 142 }, { "cell_type": "code", "collapsed": false, "input": [ "method5()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "2.01019597\n" ] } ], "prompt_number": 143 }, { "cell_type": "code", "collapsed": false, "input": [ "method6()" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "0.66609311\n" ] } ], "prompt_number": 144 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## \ucd5c\uc885 \uacb0\ub860\n", "\n", "- %timeit\uacfc \ub611\uac19\uc740 \uacb0\uacfc\uac00 \ub098\uc624\ub124\n", "- list comprehension\uc744 \uc0ac\uc6a9\ud55c method6\uc774 \ud56d\uc0c1 win" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Original source(\ube14\ub85c\uadf8 \uc8fc\uc778\uc7a5)" ] }, { "cell_type": "code", "collapsed": false, "input": [ "%%writefile tmp/original_str_performance.py\n", "\n", "\n", "from cStringIO import StringIO\n", "import time, commands, os\n", "from sys import argv\n", "\n", "def method1():\n", "\tout_str = ''\n", "\tfor num in xrange(loop_count):\n", "\t\tout_str += `num`\n", "\tps_stats()\n", "\treturn out_str\n", "\n", "def method2():\n", "\tfrom UserString import MutableString\n", "\tout_str = MutableString()\n", "\tfor num in xrange(loop_count):\n", "\t\tout_str += `num`\n", "\tps_stats()\n", "\treturn out_str\n", "\n", "def method3():\n", "\tfrom array import array\n", "\tchar_array = array('c')\n", "\tfor num in xrange(loop_count):\n", "\t\tchar_array.fromstring(`num`)\n", "\tps_stats()\n", "\treturn char_array.tostring()\n", "\n", "def method4():\n", "\tstr_list = []\n", "\tfor num in xrange(loop_count):\n", "\t\tstr_list.append(`num`)\n", "\tout_str = ''.join(str_list)\n", "\tps_stats()\n", "\treturn out_str\n", "\n", "def method4_2():\n", "\tstr_list = []\n", "\tstr_list2 = str_list.append\n", "\tfor num in xrange(loop_count):\n", "\t\tstr_list2(`num`)\n", "\tout_str = ''.join(str_list)\n", "\tps_stats()\n", "\treturn out_str\n", "\n", "def method5():\n", "\tfile_str = StringIO()\n", "\tfor num in xrange(loop_count):\n", "\t\tfile_str.write(`num`)\n", "\tout_str = file_str.getvalue()\n", "\tps_stats()\n", "\treturn out_str\n", "\n", "def method6():\n", "\tout_str = ''.join([`num` for num in xrange(loop_count)])\n", "\tps_stats()\n", "\treturn out_str\n", "\n", "\n", "def ps_stats():\n", "\tglobal process_size\n", "\tps = commands.getoutput('ps -l ' + `pid`)\n", "\tprocess_size = ps.split()[23]\n", " \n", "\n", "def call_method(num):\n", "\tglobal process_size\n", "\tstart = time.time()\n", "\tz = eval('method' + str(num))()\n", "\tend = time.time()\n", "\tprint \"method\", num\n", "\tprint \"time\", float(end-start) * 100\n", "\tprint \"output size \", len(z) / 1024, \"kb\"\n", "\tprint \"process size\", process_size, \"kb\"\n", "\tprint\n", "\t\n", "loop_count = 100000\n", "pid = os.getpid()\n", "\n", "if len(argv) == 2:\n", "\tcall_method(argv[1])\n", "else:\n", "\tprint \"Usage: python stest.py \\n\" \\\n", "\t\t\" where n is the method number to test\"\n", "\n" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "Overwriting tmp/original_str_performance.py\n" ] } ], "prompt_number": 379 }, { "cell_type": "code", "collapsed": false, "input": [ "!python tmp/original_str_performance.py 1" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 1\r\n", "time 2.25088596344\r\n", "output size 477 kb\r\n", "process size 4616 kb\r\n", "\r\n" ] } ], "prompt_number": 388 }, { "cell_type": "code", "collapsed": false, "input": [ "%run tmp/original_str_performance.py 1" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 1\n", "time 2.80869007111\n", "output size 477 kb\n", "process size 59764 kb\n", "\n" ] } ], "prompt_number": 389 }, { "cell_type": "code", "collapsed": false, "input": [ "%run tmp/original_str_performance.py 2" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 2\n", "time 185.649013519\n", "output size 477 kb\n", "process size 59452 kb\n", "\n" ] } ], "prompt_number": 390 }, { "cell_type": "code", "collapsed": false, "input": [ "%run tmp/original_str_performance.py 3" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 3\n", "time 4.83219623566\n", "output size 477 kb\n", "process size 58760 kb\n", "\n" ] } ], "prompt_number": 391 }, { "cell_type": "code", "collapsed": false, "input": [ "!python tmp/original_str_performance.py 4" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 4\r\n", "time 2.69351005554\r\n", "output size 477 kb\r\n", "process size 10336 kb\r\n", "\r\n" ] } ], "prompt_number": 392 }, { "cell_type": "code", "collapsed": false, "input": [ "%run tmp/original_str_performance.py 4_2" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 4_2\n", "time 2.11400985718\n", "output size 477 kb\n", "process size 62208 kb\n", "\n" ] } ], "prompt_number": 393 }, { "cell_type": "code", "collapsed": false, "input": [ "%run tmp/original_str_performance.py 5" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 5\n", "time 5.35268783569\n", "output size 477 kb\n", "process size 59916 kb\n", "\n" ] } ], "prompt_number": 394 }, { "cell_type": "code", "collapsed": false, "input": [ "%run tmp/original_str_performance.py 6" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "method 6\n", "time 1.88450813293\n", "output size 477 kb\n", "process size 59892 kb\n", "\n" ] } ], "prompt_number": 395 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### xrange vs range\n", "\n", "- xrange\uac00 \uba54\ubaa8\ub9ac \uc801\uac8c \uc18c\ubaa8\n", "- python 3.0 \ubc84\uc804\uc5d0\uc11c\ub294 range\ub3c4 xrange\ub85c \ubcc0\uacbd\ub418\uc11c iterator\uac00 \ub428" ] } ], "metadata": {} } ] }