<!doctype html> <html lang="en"> <head> <meta charset="utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1.0"> <meta name="description" content="IPython Cookbook, "> <!-- FAVICON --> <link rel="apple-touch-icon" sizes="57x57" href="/apple-touch-icon-57x57.png"> <link rel="apple-touch-icon" sizes="114x114" href="/apple-touch-icon-114x114.png"> <link rel="apple-touch-icon" sizes="72x72" href="/apple-touch-icon-72x72.png"> <link rel="apple-touch-icon" sizes="144x144" href="/apple-touch-icon-144x144.png"> <link rel="apple-touch-icon" sizes="60x60" href="/apple-touch-icon-60x60.png"> <link rel="apple-touch-icon" sizes="120x120" href="/apple-touch-icon-120x120.png"> <link rel="apple-touch-icon" sizes="76x76" href="/apple-touch-icon-76x76.png"> <link rel="apple-touch-icon" sizes="152x152" href="/apple-touch-icon-152x152.png"> <link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon-180x180.png"> <link rel="icon" type="image/png" href="/favicon-192x192.png" sizes="192x192"> <link rel="icon" type="image/png" href="/favicon-160x160.png" sizes="160x160"> <link rel="icon" type="image/png" href="/favicon-96x96.png" sizes="96x96"> <link rel="icon" type="image/png" href="/favicon-16x16.png" sizes="16x16"> <link rel="icon" type="image/png" href="/favicon-32x32.png" sizes="32x32"> <meta name="msapplication-TileColor" content="#da532c"> <meta name="msapplication-TileImage" content="/mstile-144x144.png"> <link rel="alternate" href="https://ipython-books.github.io/feeds/all.atom.xml" type="application/atom+xml" title="IPython Cookbook Full Atom Feed"/> <title>IPython Cookbook - 4.6. Using stride tricks with NumPy</title> <link href="//cdnjs.cloudflare.com/ajax/libs/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"> <link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/pure/0.3.0/pure-min.css"> <!--[if lte IE 8]> <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/pure/0.5.0/pure-min.css"> <![endif]--> <!--[if gt IE 8]><!--> <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/pure/0.5.0/pure-min.css"> <!--<![endif]--> <link rel="stylesheet" href="https://ipython-books.github.io/theme/css/styles.css"> <link rel="stylesheet" href="https://ipython-books.github.io/theme/css/pygments.css"> <!-- <link href='https://fonts.googleapis.com/css?family=Lato:300,400,700' rel='stylesheet' type='text/css'> --> <link href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:300,500" rel="stylesheet" type="text/css"> <link href='https://fonts.googleapis.com/css?family=Ubuntu+Mono' rel='stylesheet' type='text/css'> <script src="//cdnjs.cloudflare.com/ajax/libs/jquery/2.0.3/jquery.min.js"></script> </head> <body> <header id="header" class="pure-g"> <div class="pure-u-1 pure-u-md-3-4"> <div id="menu"> <div class="pure-menu pure-menu-open pure-menu-horizontal"> <ul> <li><a href="/">home</a></li> <li><a href="https://github.com/ipython-books/cookbook-2nd-code">Jupyter notebooks</a></li> <li><a href="https://github.com/ipython-books/minibook-2nd-code">minibook</a></li> <li><a href="https://cyrille.rossant.net">author</a></li> </ul> </div> </div> </div> <div class="pure-u-1 pure-u-md-1-4"> <div id="social"> <div class="pure-menu pure-menu-open pure-menu-horizontal"> <ul> <li><a href="https://twitter.com/cyrillerossant"><i class="fa fa-twitter"></i></a></li> <li><a href="https://github.com/ipython-books/cookbook-2nd"><i class="fa fa-github"></i></a></li> </ul> </div> </div> </div> </header> <div id="layout" class="pure-g"> <section id="content" class="pure-u-1 pure-u-md-4-4"> <div class="l-box"> <header id="page-header"> <h1>4.6. Using stride tricks with NumPy</h1> </header> <section id="page"> <p><a href="/"><img src="https://raw.githubusercontent.com/ipython-books/cookbook-2nd/master/cover-cookbook-2nd.png" align="left" alt="IPython Cookbook, Second Edition" height="130" style="margin-right: 20px; margin-bottom: 10px;" /></a> <em>This is one of the 100+ free recipes of the <a href="/">IPython Cookbook, Second Edition</a>, by <a href="http://cyrille.rossant.net">Cyrille Rossant</a>, a guide to numerical computing and data science in the Jupyter Notebook. The ebook and printed book are available for purchase at <a href="https://www.packtpub.com/big-data-and-business-intelligence/ipython-interactive-computing-and-visualization-cookbook-second-e">Packt Publishing</a>.</em></p> <p>▶ <em><a href="https://github.com/ipython-books/cookbook-2nd">Text on GitHub</a> with a <a href="https://creativecommons.org/licenses/by-nc-nd/3.0/us/legalcode">CC-BY-NC-ND license</a></em><br /> ▶ <em><a href="https://github.com/ipython-books/cookbook-2nd-code">Code on GitHub</a> with a <a href="https://opensource.org/licenses/MIT">MIT license</a></em></p> <p>▶ <a href="https://ipython-books.github.io/chapter-4-profiling-and-optimization/"><strong><em>Go to</em></strong> <em>Chapter 4 : Profiling and Optimization</em></a><br /> ▶ <a href="https://github.com/ipython-books/cookbook-2nd-code/blob/master/chapter04_optimization/06_stride_tricks.ipynb"><em><strong>Get</strong> the Jupyter notebook</em></a> </p> <p>In this recipe, we will dig deeper into the internals of NumPy arrays, by generalizing the notion of row-major and column-major orders to multidimensional arrays. The general notion is that of strides, which describe how the items of a multidimensional array are organized within a one-dimensional data buffer. Strides are mostly an implementation detail, but they can also be used in specific situations to optimize some algorithms.</p> <h2>Getting ready</h2> <p>We suppose that NumPy has been imported and that the <code>aid()</code> function has been defined (see the previous recipe, <em>Understanding the internals of NumPy to avoid unnecessary array copying</em>).</p> <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> </pre></div> <div class="highlight"><pre><span></span><span class="k">def</span> <span class="nf">aid</span><span class="p">(</span><span class="n">x</span><span class="p">):</span> <span class="c1"># This function returns the memory</span> <span class="c1"># block address of an array.</span> <span class="k">return</span> <span class="n">x</span><span class="o">.</span><span class="n">__array_interface__</span><span class="p">[</span><span class="s1">'data'</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span> </pre></div> <h2>How to do it...</h2> <p><strong>1. </strong> Strides are integer numbers describing the byte step in the contiguous block of memory for each dimension.</p> <div class="highlight"><pre><span></span><span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="n">x</span><span class="o">.</span><span class="n">strides</span> </pre></div> <div class="highlight"><pre><span></span>(8,) </pre></div> <p>This vector <code>x</code> contains double-precision floating point numbers (float64, 8 bytes); one needs to go <em>8 bytes forward</em> to go from one item to the next.</p> <p><strong>2. </strong> Now, let's look at the strides of a 2D array:</p> <div class="highlight"><pre><span></span><span class="n">y</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">((</span><span class="mi">10</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span> <span class="n">y</span><span class="o">.</span><span class="n">strides</span> </pre></div> <div class="highlight"><pre><span></span>(80, 8) </pre></div> <p>In the first dimension (vertical), one needs to go <em>80 bytes</em> (10 float64 items) <em>forward</em> to go from one item to the next, because the items are internally stored in row-major order. In the second dimension (horizontal), one needs to go <em>8 bytes forward</em> to go from one item to the next.</p> <p><strong>3. </strong> Let's show how we can revisit the broadcasting rules from the previous recipe using strides:</p> <div class="highlight"><pre><span></span><span class="n">n</span> <span class="o">=</span> <span class="mi">1000</span> <span class="n">a</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">arange</span><span class="p">(</span><span class="n">n</span><span class="p">)</span> </pre></div> <p>We will create a new array, <code>b</code>, pointing to the same memory block as <code>a</code>, but with a different shape and different strides. This new array will look like a vertically-tiled version of <code>a</code>. We use a special function in NumPy to change the strides of an array:</p> <div class="highlight"><pre><span></span><span class="n">b</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">lib</span><span class="o">.</span><span class="n">stride_tricks</span><span class="o">.</span><span class="n">as_strided</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">n</span><span class="p">),</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">8</span><span class="p">))</span> </pre></div> <div class="highlight"><pre><span></span><span class="n">b</span> </pre></div> <div class="highlight"><pre><span></span>array([[ 0, 1, 2, ..., 997, 998, 999], [ 0, 1, 2, ..., 997, 998, 999], [ 0, 1, 2, ..., 997, 998, 999], ..., [ 0, 1, 2, ..., 997, 998, 999], [ 0, 1, 2, ..., 997, 998, 999], [ 0, 1, 2, ..., 997, 998, 999]]) </pre></div> <div class="highlight"><pre><span></span><span class="n">b</span><span class="o">.</span><span class="n">size</span><span class="p">,</span> <span class="n">b</span><span class="o">.</span><span class="n">shape</span><span class="p">,</span> <span class="n">b</span><span class="o">.</span><span class="n">nbytes</span> </pre></div> <div class="highlight"><pre><span></span>(1000000, (1000, 1000), 8000000) </pre></div> <p>NumPy believes that this array contains one million different elements, whereas the data buffer actually contains the same 1000 elements as <code>a</code>.</p> <p><strong>4. </strong> We can now perform an efficient outer product using the same principle as with broadcasting rules:</p> <div class="highlight"><pre><span></span><span class="o">%</span><span class="n">timeit</span> <span class="n">b</span> <span class="o">*</span> <span class="n">b</span><span class="o">.</span><span class="n">T</span> </pre></div> <div class="highlight"><pre><span></span>766 µs ± 2.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each) </pre></div> <div class="highlight"><pre><span></span><span class="o">%%</span><span class="n">timeit</span> <span class="n">np</span><span class="o">.</span><span class="n">tile</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span> <span class="o">*</span> <span class="n">np</span><span class="o">.</span><span class="n">tile</span><span class="p">(</span><span class="n">a</span><span class="p">[:,</span> <span class="n">np</span><span class="o">.</span><span class="n">newaxis</span><span class="p">],</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">n</span><span class="p">))</span> </pre></div> <div class="highlight"><pre><span></span>5.55 ms ± 9.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) </pre></div> <h2>How it works...</h2> <p>Every array has a number of dimensions, a shape, a data type, and strides. Strides describe how the items of a multidimensional array are organized in the data buffer. There are many different schemes for arranging the items of a multidimensional array in a one-dimensional block. NumPy implements a <strong>strided indexing scheme</strong>, where the position of any element is a <strong>linear combination</strong> of the dimensions, the coefficients being the strides. In other words, strides describe, in any dimension, how many bytes we need to jump over in the data buffer to go from one item to the next.</p> <p>The position of any element in a multidimensional array is given by a linear combination of its indices, as follows:</p> <p><img alt="Strides" src="https://ipython-books.github.io/pages/chapter04_optimization/images/strides.png" /></p> <p>Artificially changing the strides allows us to make some array operations more efficient than with standard methods, which may involve array copies. Internally, that's how broadcasting works in NumPy.</p> <p>The <code>as_strided()</code> method takes an array, a shape, and strides as arguments. It creates a new array, but uses the same data buffer as the original array. The only thing that changes is the metadata. This trick lets us manipulate NumPy arrays as usual, except that they may take much less memory than what NumPy thinks. Here, using 0 in the strides implies that any array item can be addressed by many multidimensional indices, resulting in memory savings.</p> <blockquote> <p>Be very careful with strided arrays! The <code>as_strided()</code> function does not check whether you stay inside the memory block bounds. This means that you need to handle edge effects manually; otherwise, you may end up with garbage values in your arrays. The documentation says: "<em>This function has to be used with extreme care, see notes. (...) It is advisable to avoid <code>as_strided()</code> when possible.</em>"</p> </blockquote> <p>We will see a more useful application of stride tricks in the next recipe.</p> <h2>See also</h2> <ul> <li>Implementing an efficient rolling average algorithm with stride tricks</li> </ul> </section> </div> </section> <footer id="footer" class="pure-u-1 pure-u-md-4-4"> <div class="l-box"> <div> <p>© <a href="https://cyrille.rossant.net">Cyrille Rossant</a> – Built with <a href="https://github.com/PurePelicanTheme/pure-single">Pure Theme</a> for <a href="https://blog.getpelican.com/">Pelican</a> </p> </div> </div> </footer> </div> <!-- Start of StatCounter Code for Default Guide --> <script type="text/javascript"> var sc_project=9752080; var sc_invisible=1; var sc_security="c177b501"; var scJsHost = (("https:" == document.location.protocol) ? "https://secure." : "http://www."); </script> <script type="text/javascript" src="https://www.statcounter.com/counter/counter.js" async></script> <noscript><div class="statcounter"><a title="Web Analytics" href="https://statcounter.com/" target="_blank"><img class="statcounter" src="//c.statcounter.com/9752080/0/c177b501/1/" alt="Web Analytics"></a></div></noscript> <!-- End of StatCounter Code for Default Guide --> </body> </html>