Chris Albonhttp://chrisalbon.com/Sun, 01 May 2016 12:00:00 -0700Two Way Frequency Tablehttp://chrisalbon.com/r-stats/2-way-frequency-table.html<p><a href="http://www.statmethods.net/stats/frequencies.html">Original source</a></p> <div class="codehilite"><pre><span></span><span class="c1"># Create some data</span> A <span class="o">&lt;-</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;yes&quot;</span><span class="p">,</span> <span class="s">&quot;no&quot;</span><span class="p">,</span><span class="s">&quot;yes&quot;</span><span class="p">,</span> <span class="s">&quot;no&quot;</span><span class="p">,</span><span class="s">&quot;yes&quot;</span><span class="p">,</span> <span class="s">&quot;no&quot;</span><span class="p">,</span><span class="s">&quot;yes&quot;</span><span class="p">,</span> <span class="s">&quot;no&quot;</span><span class="p">)</span> B <span class="o">&lt;-</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;male&quot;</span><span class="p">,</span> <span class="s">&quot;female&quot;</span><span class="p">,</span><span class="s">&quot;female&quot;</span><span class="p">,</span> <span class="s">&quot;male&quot;</span><span class="p">,</span><span class="s">&quot;male&quot;</span><span class="p">,</span> <span class="s">&quot;male&quot;</span><span class="p">,</span><span class="s">&quot;male&quot;</span><span class="p">,</span> <span class="s">&quot;male&quot;</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># A will be rows, B will be columns</span> mytable <span class="o">&lt;-</span> <span class="kp">table</span><span class="p">(</span>A<span class="p">,</span>B<span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># print table</span> mytable </pre></div> <div class="codehilite"><pre><span></span> B A female male no 1 3 yes 1 3 </pre></div> <div class="codehilite"><pre><span></span><span 
class="c1"># A frequencies (summed over B)</span> <span class="kp">margin.table</span><span class="p">(</span>mytable<span class="p">,</span> <span class="m">1</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>A no yes 4 4 </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># B frequencies (summed over A)</span> <span class="kp">margin.table</span><span class="p">(</span>mytable<span class="p">,</span> <span class="m">2</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>B female male 2 6 </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># cell proportions (each cell divided by the grand total)</span> <span class="kp">prop.table</span><span class="p">(</span>mytable<span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span> B A female male no 0.125 0.375 yes 0.125 0.375 </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># row proportions (each row sums to 1)</span> <span class="kp">prop.table</span><span class="p">(</span>mytable<span class="p">,</span> <span class="m">1</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span> B A female male no 0.25 0.75 yes 0.25 0.75 </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># column proportions (each column sums to 1)</span> <span class="kp">prop.table</span><span class="p">(</span>mytable<span class="p">,</span> <span class="m">2</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span> B A female male no 0.5 0.5 yes 0.5 0.5 </pre></div>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/2-way-frequency-table.htmlData Visualization2D Density Plothttp://chrisalbon.com/r-stats/2d-density-plot.html<p>Original source: r graphics cookbook</p> <div class="codehilite"><pre><span></span><span class="c1"># load the gcookbook package for the data</span> <span class="kn">library</span><span class="p">(</span>gcookbook<span class="p">)</span> <span 
class="c1"># load the ggplot2 package</span> <span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span> <span class="c1"># reset the graphing device</span> dev.off<span class="p">()</span> <span class="c1"># create the ggplot2 plot object</span> p <span class="o">&lt;-</span> ggplot<span class="p">(</span>faithful<span class="p">,</span> aes<span class="p">(</span>x <span class="o">=</span> eruptions<span class="p">,</span> y <span class="o">=</span> waiting<span class="p">))</span> <span class="o">+</span> <span class="c1"># add a layer with the points</span> geom_point<span class="p">()</span> <span class="o">+</span> <span class="c1"># and a layer for the density heatmap, with both the alpha and the fill color determined by density (the ..density.. notation refers to a variable computed by the stat itself, not a column in the data)</span> stat_density2d<span class="p">(</span>aes<span class="p">(</span>alpha<span class="o">=</span><span class="m">..</span>density..<span class="p">,</span> fill<span class="o">=</span><span class="m">..</span>density..<span class="p">),</span> geom<span class="o">=</span><span class="s">&quot;tile&quot;</span><span class="p">,</span> contour<span class="o">=</span><span class="kc">FALSE</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>null device 1 </pre></div> <div class="codehilite"><pre><span></span>p </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/2d-density-plot_files/2d-density-plot_2_1.png" /></p>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/2d-density-plot.htmlData VisualizationAdding Time To A Datehttp://chrisalbon.com/r-stats/add-dates-to-a-date.html<div class="codehilite"><pre><span></span><span class="c1"># load the lubridate package</span> <span class="kn">library</span><span class="p">(</span>lubridate<span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># create a date 
variable</span> date.ex <span class="o">&lt;-</span> dmy<span class="p">(</span><span class="s">&quot;1/1/2001&quot;</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>date.ex </pre></div> <div class="codehilite"><pre><span></span>[1] &quot;2001-01-01 UTC&quot; </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># add 45 days to a date</span> date.ex.2 <span class="o">&lt;-</span> date.ex <span class="o">+</span> days<span class="p">(</span><span class="m">45</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>date.ex.2 </pre></div> <div class="codehilite"><pre><span></span>[1] &quot;2001-02-15 UTC&quot; </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># add six weeks to a date</span> date.ex.3 <span class="o">&lt;-</span> date.ex <span class="o">+</span> weeks<span class="p">(</span><span class="m">6</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>date.ex.3 </pre></div> <div class="codehilite"><pre><span></span>[1] &quot;2001-02-12 UTC&quot; </pre></div>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/add-dates-to-a-date.htmlData WranglingAdding labels to a ggplot2 bar graphhttp://chrisalbon.com/r-stats/add-labels-to-bar-graph.html<p>original source: r graphics cookbook</p> <div class="codehilite"><pre><span></span><span class="c1"># load the ggplot2 package</span> <span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span> <span class="c1"># load the gcookbook package</span> <span class="kn">library</span><span class="p">(</span>gcookbook<span class="p">)</span> </pre></div> <h2>Below the top</h2> <div class="codehilite"><pre><span></span><span class="c1"># create a ggplot data</span> ggplot<span class="p">(</span>cabbage_exp<span class="p">,</span> aes<span class="p">(</span>x<span class="o">=</span><span class="kp">interaction</span><span 
class="p">(</span>Date<span class="p">,</span> Cultivar<span class="p">),</span> y<span class="o">=</span>Weight<span class="p">))</span> <span class="o">+</span> <span class="c1"># draw the bar plot</span> geom_bar<span class="p">(</span>stat<span class="o">=</span><span class="s">&quot;identity&quot;</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># draw the weight labels in white, just below the top of each bar</span> geom_text<span class="p">(</span>aes<span class="p">(</span>label<span class="o">=</span>Weight<span class="p">),</span> vjust<span class="o">=</span><span class="m">1.5</span><span class="p">,</span> colour<span class="o">=</span><span class="s">&quot;white&quot;</span><span class="p">)</span> </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/add-labels-to-bar-graph_files/add-labels-to-bar-graph_3_1.png" /></p> <h2>Above the top</h2> <div class="codehilite"><pre><span></span><span class="c1"># create a ggplot data</span> ggplot<span class="p">(</span>cabbage_exp<span class="p">,</span> aes<span class="p">(</span>x<span class="o">=</span><span class="kp">interaction</span><span class="p">(</span>Date<span class="p">,</span> Cultivar<span class="p">),</span> y<span class="o">=</span>Weight<span class="p">))</span> <span class="o">+</span> <span class="c1"># draw the bar plot</span> geom_bar<span class="p">(</span>stat<span class="o">=</span><span class="s">&quot;identity&quot;</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># draw the weight labels just above the top of each bar</span> geom_text<span class="p">(</span>aes<span class="p">(</span>label<span class="o">=</span>Weight<span class="p">),</span> vjust<span class="o">=</span><span class="m">-0.2</span><span class="p">)</span> </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/add-labels-to-bar-graph_files/add-labels-to-bar-graph_5_1.png" /></p> <h1>Labels on a grouped bar chart</h1> <div class="codehilite"><pre><span></span><span 
class="c1"># create the ggplot data for a grouped bar chart</span> ggplot<span class="p">(</span>cabbage_exp<span class="p">,</span> aes<span class="p">(</span>x<span class="o">=</span>Date<span class="p">,</span> y<span class="o">=</span>Weight<span class="p">,</span> fill<span class="o">=</span>Cultivar<span class="p">))</span> <span class="o">+</span> <span class="c1"># plot the bars</span> geom_bar<span class="p">(</span>stat<span class="o">=</span><span class="s">&quot;identity&quot;</span><span class="p">,</span> position<span class="o">=</span><span class="s">&quot;dodge&quot;</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># create the label, &quot;dodged&quot; to fit the bars</span> geom_text<span class="p">(</span>aes<span class="p">(</span>label<span class="o">=</span>Weight<span class="p">),</span> vjust<span class="o">=</span><span class="m">1.5</span><span class="p">,</span> colour<span class="o">=</span><span class="s">&quot;white&quot;</span><span class="p">,</span> position<span class="o">=</span>position_dodge<span class="p">(</span><span class="m">.9</span><span class="p">),</span> size<span class="o">=</span><span class="m">3</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>ymax not defined: adjusting position using y instead </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/add-labels-to-bar-graph_files/add-labels-to-bar-graph_7_2.png" /></p>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/add-labels-to-bar-graph.htmlData VisualizationAdding Levels To A Factorhttp://chrisalbon.com/r-stats/add-levels-to-factors.html<div class="codehilite"><pre><span></span><span class="c1"># create simulated district name data</span> district <span class="o">&lt;-</span> <span class="kt">c</span><span class="p">(</span><span class="s">&quot;NORTH&quot;</span><span class="p">,</span> <span class="s">&quot;NORTHWEST&quot;</span><span class="p">,</span> <span 
class="s">&quot;CENTRAL&quot;</span><span class="p">,</span> <span class="s">&quot;SOUTH&quot;</span><span class="p">,</span> <span class="s">&quot;SOUTHWEST&quot;</span><span class="p">,</span> <span class="s">&quot;EAST&quot;</span><span class="p">)</span> <span class="c1"># set the levels to the existing district categories plus a new SOUTH CENTRAL category</span> <span class="kp">levels</span><span class="p">(</span>district<span class="p">)</span> <span class="o">&lt;-</span> <span class="kt">c</span><span class="p">(</span>district<span class="p">,</span> <span class="s">&quot;SOUTH CENTRAL&quot;</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span><span class="kp">levels</span><span class="p">(</span>district<span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>[1] &quot;NORTH&quot; &quot;NORTHWEST&quot; &quot;CENTRAL&quot; &quot;SOUTH&quot; [5] &quot;SOUTHWEST&quot; &quot;EAST&quot; &quot;SOUTH CENTRAL&quot; </pre></div>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/add-levels-to-factors.htmlBasicsAggregate Data By Week Or Monthhttp://chrisalbon.com/r-stats/aggregate-by-week-or-month.html<p>original source: http://stackoverflow.com/questions/19716244/aggregate-data-by-week-month-etc-in-r?lq=1</p> <div class="codehilite"><pre><span></span><span class="c1"># load the xts package</span> <span class="kn">library</span><span class="p">(</span>xts<span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>Loading required package: zoo Attaching package: ‘zoo’ The following objects are masked from ‘package:base’: as.Date, as.Date.numeric </pre></div> <h2>Create some simulated data</h2> <div class="codehilite"><pre><span></span><span class="c1"># create an element for every day between two dates</span> date <span class="o">&lt;-</span> <span class="kp">seq</span><span class="p">(</span><span class="kp">as.Date</span><span 
class="p">(</span><span class="s">&quot;2006-01-01&quot;</span><span class="p">),</span> <span class="kp">as.Date</span><span class="p">(</span><span class="s">&quot;2007-01-01&quot;</span><span class="p">),</span> by <span class="o">=</span> <span class="s">&quot;1 day&quot;</span><span class="p">)</span> <span class="c1"># create some simulated values</span> score <span class="o">&lt;-</span> runif<span class="p">(</span><span class="m">366</span><span class="p">)</span> <span class="c1"># create a zoo time series object of score and date</span> zoo <span class="o">&lt;-</span> zoo<span class="p">(</span>score<span class="p">,</span> <span class="kp">date</span><span class="p">)</span> </pre></div> <h2>Create some averages</h2> <div class="codehilite"><pre><span></span><span class="c1"># create a weekly average</span> weekly.avg <span class="o">&lt;-</span> apply.weekly<span class="p">(</span>zoo<span class="p">,</span> <span class="kp">mean</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>weekly.avg </pre></div> <div class="codehilite"><pre><span></span>2006-01-01 2006-01-08 2006-01-15 2006-01-22 2006-01-29 2006-02-05 2006-02-12 0.6463105 0.3696941 0.4492466 0.5587588 0.3330893 0.7490642 0.3463500 2006-02-19 2006-02-26 2006-03-05 2006-03-12 2006-03-19 2006-03-26 2006-04-02 0.4594144 0.3015816 0.5016827 0.3824588 0.4501046 0.5086366 0.6927037 2006-04-09 2006-04-16 2006-04-23 2006-04-30 2006-05-07 2006-05-14 2006-05-21 0.5238080 0.6618441 0.4366701 0.6187016 0.6110044 0.5724795 0.5267836 2006-05-28 2006-06-04 2006-06-11 2006-06-18 2006-06-25 2006-07-02 2006-07-09 0.4003268 0.3999404 0.6366840 0.4546525 0.5675619 0.4411083 0.5747285 2006-07-16 2006-07-23 2006-07-30 2006-08-06 2006-08-13 2006-08-20 2006-08-27 0.4136250 0.4936679 0.4814989 0.4419165 0.3644543 0.6385395 0.5230308 2006-09-03 2006-09-10 2006-09-17 2006-09-24 2006-10-01 2006-10-08 2006-10-15 0.5259253 0.5474812 0.4658602 0.4771834 0.6106620 0.4471343 0.4576065 
2006-10-22 2006-10-29 2006-11-05 2006-11-12 2006-11-19 2006-11-26 2006-12-03 0.6124155 0.5418694 0.3136825 0.4227544 0.2406943 0.3723846 0.6079556 2006-12-10 2006-12-17 2006-12-24 2006-12-31 2007-01-01 0.5289365 0.4426345 0.6362102 0.4849858 0.1102631 </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># create a monthly average</span> monthly.avg <span class="o">&lt;-</span> apply.monthly<span class="p">(</span>zoo<span class="p">,</span> <span class="kp">mean</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>monthly.avg </pre></div> <div class="codehilite"><pre><span></span>2006-01-31 2006-02-28 2006-03-31 2006-04-30 2006-05-31 2006-06-30 2006-07-31 0.4636550 0.4345047 0.4883293 0.5791776 0.5045688 0.5301847 0.4698487 2006-08-31 2006-09-30 2006-10-31 2006-11-30 2006-12-31 2007-01-01 0.5033712 0.5218908 0.5171836 0.3661705 0.5341010 0.1102631 </pre></div>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/aggregate-by-week-or-month.htmlData WranglingThe Probability An Economy Seat Is An Aisle?http://chrisalbon.com/articles/aisle_seat_probabilities.html<p>There are two types of people in the world: aisle seaters and window seaters. I am an aisle seater; nothing is worse than limited bathroom access on a long flight. The first thing I do when I get my ticket is check to see if I have an aisle seat. If not, I immediately head over to the airline counter and try to get one.</p> <p>On my last flight, on Turkish Airlines, I ran into a curious situation. I received my boarding pass with my seat number, 18C, but the ticket did not specify if C was an aisle seat or not. Making matters worse, the airline counter was swamped with a few dozen people. 
So I asked myself: <strong>given only the seat letter, C, what is the probability that it is an aisle seat?</strong></p> <p>Later, on the flight, I decided to find out.</p> <h2>Preliminaries</h2> <div class="codehilite"><pre><span></span><span class="c1"># Import required modules</span> <span class="kn">import</span> <span class="nn">pandas</span> <span class="kn">as</span> <span class="nn">pd</span> <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> <span class="c1"># Set plots to display in the iPython notebook</span> <span class="o">%</span><span class="n">matplotlib</span> <span class="n">inline</span> </pre></div> <h2>Setup possible seat configurations</h2> <p>I am a pretty frequent flyer on a variety of airlines and aircraft. There are a variety of seating configurations out there, but typically they follow some basic rules:</p> <ul> <li>No window cluster of seats has more than three seats.</li> <li>On small flights with three seats per row, the single seat is on the left side.</li> <li>No aircraft has more than nine seats per row.</li> </ul> <p>Based on these rules, here are the "typical" seating configurations for aircraft with between two and nine seats per row. A '1' codifies that a seat is an aisle seat, a '0' codifies that it is a non-aisle seat (i.e. 
window or middle), and 'np.nan' denotes that the aircraft has fewer than nine seats per row (this is so all the list lengths are the same).</p> <div class="codehilite"><pre><span></span><span class="c1"># An aircraft with two seats per row</span> <span class="n">rows2</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">]</span> <span class="c1"># An aircraft with three seats per row</span> <span class="n">rows3</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span 
class="p">,]</span> <span class="c1"># An aircraft with four seats per row</span> <span class="n">rows4</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">]</span> <span class="c1"># An aircraft with five seats per row</span> <span class="n">rows5</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span><span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">]</span> <span class="c1"># An aircraft with six seats per row</span> <span class="n">rows6</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span 
class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">]</span> <span class="c1"># An aircraft with seven seats per row</span> <span class="n">rows7</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">]</span> <span class="c1"># An aircraft with eight seats per row</span> <span class="n">rows8</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="n">nan</span><span class="p">]</span> <span class="c1"># An aircraft with nine seats per row</span> <span class="n">rows9</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span 
class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mi">0</span><span class="p">]</span> </pre></div> <p>For example, in an aircraft with five seats per row, <code>rows5</code>, the seating arrangement would be:</p> <ol> <li>window</li> <li>aisle</li> <li>aisle</li> <li>middle</li> <li>window</li> <li>no seat</li> <li>no seat</li> <li>no seat</li> <li>no seat</li> </ol> <p>Next, I'll take advantage of pandas' column-wise calculations, but to do this I need to wrangle the data into a pandas dataframe. Essentially I am using the pandas dataframe as a matrix.</p> <div class="codehilite"><pre><span></span><span class="c1"># Create a list variable of all possible aircraft configurations</span> <span class="n">seating_map</span> <span class="o">=</span> <span class="p">[</span><span class="n">rows2</span><span class="p">,</span> <span class="n">rows3</span><span class="p">,</span> <span class="n">rows4</span><span class="p">,</span> <span class="n">rows5</span><span class="p">,</span> <span class="n">rows6</span><span class="p">,</span> <span class="n">rows7</span><span class="p">,</span> <span class="n">rows8</span><span class="p">,</span> <span class="n">rows9</span><span class="p">]</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># Create a dataframe from the seating_map variable</span> <span class="n">df</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span><span class="n">seating_map</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;A&#39;</span><span class="p">,</span> <span class="s1">&#39;B&#39;</span><span class="p">,</span> <span 
class="s1">&#39;C&#39;</span><span class="p">,</span> <span class="s1">&#39;D&#39;</span><span class="p">,</span> <span class="s1">&#39;E&#39;</span><span class="p">,</span> <span class="s1">&#39;F&#39;</span><span class="p">,</span> <span class="s1">&#39;G&#39;</span><span class="p">,</span> <span class="s1">&#39;H&#39;</span><span class="p">,</span> <span class="s1">&#39;I&#39;</span><span class="p">],</span> <span class="n">index</span><span class="o">=</span><span class="p">[</span><span class="s1">&#39;rows2&#39;</span><span class="p">,</span> <span class="s1">&#39;rows3&#39;</span><span class="p">,</span> <span class="s1">&#39;rows4&#39;</span><span class="p">,</span> <span class="s1">&#39;rows5&#39;</span><span class="p">,</span> <span class="s1">&#39;rows6&#39;</span><span class="p">,</span> <span class="s1">&#39;rows7&#39;</span><span class="p">,</span> <span class="s1">&#39;rows8&#39;</span><span class="p">,</span> <span class="s1">&#39;rows9&#39;</span><span class="p">])</span> </pre></div> <p>Here is all the data we need to construct our probabilities. The columns represent individual seat letters (A, B, etc.) 
while the rows represent the number of seats-per-row in the aircraft.</p> <div class="codehilite"><pre><span></span><span class="c1"># View the dataframe</span> <span class="n">df</span> </pre></div> <div style="max-height:1000px;max-width:1500px;overflow:auto;"> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>A</th> <th>B</th> <th>C</th> <th>D</th> <th>E</th> <th>F</th> <th>G</th> <th>H</th> <th>I</th> </tr> </thead> <tbody> <tr> <th>rows2</th> <td> 1</td> <td> 1</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>rows3</th> <td> 1</td> <td> 1</td> <td> 0</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>rows4</th> <td> 0</td> <td> 1</td> <td> 1</td> <td> 0</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>rows5</th> <td> 0</td> <td> 1</td> <td> 1</td> <td> 0</td> <td> 0</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>rows6</th> <td> 0</td> <td> 1</td> <td> 1</td> <td> 0</td> <td> 1</td> <td> 0</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>rows7</th> <td> 0</td> <td> 1</td> <td> 1</td> <td> 0</td> <td> 1</td> <td> 1</td> <td> 0</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>rows8</th> <td> 0</td> <td> 0</td> <td> 1</td> <td> 1</td> <td> 1</td> <td> 1</td> <td> 0</td> <td> 0</td> <td>NaN</td> </tr> <tr> <th>rows9</th> <td> 0</td> <td> 0</td> <td> 1</td> <td> 1</td> <td> 0</td> <td> 1</td> <td> 1</td> <td> 0</td> <td> 0</td> </tr> </tbody> </table> </div> <h2>Calculate aisle probability</h2> <p>Because each aircraft seats-per-row configuration (i.e. row) is binary (1 if aisle, 0 if non-aisle), the probability that a seat is an aisle is simply the mean value of each seat letter (i.e. 
column).</p> <div class="codehilite"><pre><span></span><span class="c1"># Create a list wherein each element is the mean value of a column</span> <span class="n">aisle_probability</span> <span class="o">=</span> <span class="p">[</span><span class="n">df</span><span class="p">[</span><span class="s1">&#39;A&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;B&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;C&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;D&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;E&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;F&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;G&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;H&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">(),</span> <span class="n">df</span><span class="p">[</span><span class="s1">&#39;I&#39;</span><span class="p">]</span><span class="o">.</span><span class="n">mean</span><span class="p">()]</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># Display the 
variable</span> <span class="n">aisle_probability</span> </pre></div> <div class="codehilite"><pre><span></span>[0.25, 0.75, 0.8571428571428571, 0.33333333333333331, 0.59999999999999998, 0.75, 0.33333333333333331, 0.0, 0.0] </pre></div> <p>So there you have it: the probability that each seat letter is an aisle seat. However, we can make the presentation a little more intuitive.</p> <h2>Visualize seat letter probabilities</h2> <p>The most obvious visualization to convey the probabilities would be seat letters on the x-axis and probabilities on the y-axis. The pandas plot function makes that easy.</p> <div class="codehilite"><pre><span></span><span class="c1"># Create a list of strings to use as the x-axis labels</span> <span class="n">seats</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;Seat A&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat B&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat C&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat D&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat E&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat F&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat G&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat H&#39;</span><span class="p">,</span> <span class="s1">&#39;Seat I&#39;</span><span class="p">]</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># Plot the probabilities as a bar chart, using &#39;seats&#39; as the index</span> <span class="n">pd</span><span class="o">.</span><span class="n">Series</span><span class="p">(</span><span class="n">aisle_probability</span><span class="p">,</span> <span class="n">index</span><span class="o">=</span><span class="n">seats</span><span class="p">)</span><span class="o">.</span><span class="n">plot</span><span class="p">(</span><span class="n">kind</span><span class="o">=</span><span class="s1">&#39;bar&#39;</span><span class="p">,</span> 
<span class="c1"># set y to range between 0 and 1</span> <span class="n">ylim</span><span class="o">=</span><span class="p">[</span><span class="mi">0</span><span class="p">,</span><span class="mi">1</span><span class="p">],</span> <span class="c1"># set the figure size</span> <span class="n">figsize</span><span class="o">=</span><span class="p">[</span><span class="mi">10</span><span class="p">,</span><span class="mi">6</span><span class="p">],</span> <span class="c1"># set the figure title</span> <span class="n">title</span><span class="o">=</span><span class="s1">&#39;Probability of being an Aisle Seat in Economy Class&#39;</span><span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span>&lt;matplotlib.axes._subplots.AxesSubplot at 0x1078300f0&gt; </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/aisle_seat_probabilities/output_20_1.png" /></p> <p>So there we have it! If you are given a boarding pass with seat C, you have an 86% probability of being in an aisle seat!</p> <p>I hope this was helpful!</p>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:articles/aisle_seat_probabilities.htmlAll Combinations For A List Of Objectshttp://chrisalbon.com/python/all_combinations_of_a_list_of_objects.html<h2>Preliminary</h2> <div class="codehilite"><pre><span></span><span class="c1"># Import combinations_with_replacement from itertools</span> <span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">combinations_with_replacement</span> </pre></div> <h2>Create a list of objects</h2> <div class="codehilite"><pre><span></span><span class="c1"># Create a list of objects to combine</span> <span class="n">list_of_objects</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;warplanes&#39;</span><span class="p">,</span> <span class="s1">&#39;armor&#39;</span><span class="p">,</span> <span class="s1">&#39;infantry&#39;</span><span class="p">]</span> 
</pre></div> <h2>Find all combinations (with replacement) for the list</h2> <div class="codehilite"><pre><span></span><span class="c1"># Create an empty list object to hold the results of the loop</span> <span class="n">combinations</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1"># Loop over every combination length, from 1 up to the length of list_of_objects</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">list_of_objects</span><span class="p">)):</span> <span class="c1"># Find every combination (with replacement) of length i+1 and store it as a list</span> <span class="n">combinations</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="nb">list</span><span class="p">(</span><span class="n">combinations_with_replacement</span><span class="p">(</span><span class="n">list_of_objects</span><span class="p">,</span> <span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)))</span> <span class="c1"># View the results</span> <span class="n">combinations</span> </pre></div> <div class="codehilite"><pre><span></span>[[(&#39;warplanes&#39;,), (&#39;armor&#39;,), (&#39;infantry&#39;,)], [(&#39;warplanes&#39;, &#39;warplanes&#39;), (&#39;warplanes&#39;, &#39;armor&#39;), (&#39;warplanes&#39;, &#39;infantry&#39;), (&#39;armor&#39;, &#39;armor&#39;), (&#39;armor&#39;, &#39;infantry&#39;), (&#39;infantry&#39;, &#39;infantry&#39;)], [(&#39;warplanes&#39;, &#39;warplanes&#39;, &#39;warplanes&#39;), (&#39;warplanes&#39;, &#39;warplanes&#39;, &#39;armor&#39;), (&#39;warplanes&#39;, &#39;warplanes&#39;, &#39;infantry&#39;), (&#39;warplanes&#39;, &#39;armor&#39;, &#39;armor&#39;), (&#39;warplanes&#39;, &#39;armor&#39;, &#39;infantry&#39;), (&#39;warplanes&#39;, &#39;infantry&#39;, 
&#39;infantry&#39;), (&#39;armor&#39;, &#39;armor&#39;, &#39;armor&#39;), (&#39;armor&#39;, &#39;armor&#39;, &#39;infantry&#39;), (&#39;armor&#39;, &#39;infantry&#39;, &#39;infantry&#39;), (&#39;infantry&#39;, &#39;infantry&#39;, &#39;infantry&#39;)]] </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># Flatten the list of lists into just a list</span> <span class="n">combinations</span> <span class="o">=</span> <span class="p">[</span><span class="n">i</span> <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">combinations</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">row</span><span class="p">]</span> <span class="c1"># View the results</span> <span class="n">combinations</span> </pre></div> <div class="codehilite"><pre><span></span>[(&#39;warplanes&#39;,), (&#39;armor&#39;,), (&#39;infantry&#39;,), (&#39;warplanes&#39;, &#39;warplanes&#39;), (&#39;warplanes&#39;, &#39;armor&#39;), (&#39;warplanes&#39;, &#39;infantry&#39;), (&#39;armor&#39;, &#39;armor&#39;), (&#39;armor&#39;, &#39;infantry&#39;), (&#39;infantry&#39;, &#39;infantry&#39;), (&#39;warplanes&#39;, &#39;warplanes&#39;, &#39;warplanes&#39;), (&#39;warplanes&#39;, &#39;warplanes&#39;, &#39;armor&#39;), (&#39;warplanes&#39;, &#39;warplanes&#39;, &#39;infantry&#39;), (&#39;warplanes&#39;, &#39;armor&#39;, &#39;armor&#39;), (&#39;warplanes&#39;, &#39;armor&#39;, &#39;infantry&#39;), (&#39;warplanes&#39;, &#39;infantry&#39;, &#39;infantry&#39;), (&#39;armor&#39;, &#39;armor&#39;, &#39;armor&#39;), (&#39;armor&#39;, &#39;armor&#39;, &#39;infantry&#39;), (&#39;armor&#39;, &#39;infantry&#39;, &#39;infantry&#39;), (&#39;infantry&#39;, &#39;infantry&#39;, &#39;infantry&#39;)] </pre></div>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:python/all_combinations_of_a_list_of_objects.htmlBasicsAnnotating Plotshttp://chrisalbon.com/r-stats/annotating-plots.html<p>Original 
source: R Graphics Cookbook</p> <div class="codehilite"><pre><span></span><span class="c1"># load the gcookbook package for the data</span> <span class="kn">library</span><span class="p">(</span>gcookbook<span class="p">)</span> <span class="c1"># load the ggplot2 package</span> <span class="kn">library</span><span class="p">(</span>ggplot2<span class="p">)</span> <span class="c1"># reset the graphing device</span> dev.off<span class="p">()</span> </pre></div> <div class="codehilite"><pre><span></span>quartz_off_screen 3 </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># create the base ggplot2 object from the faithful data</span> p <span class="o">&lt;-</span> ggplot<span class="p">(</span>faithful<span class="p">,</span> aes<span class="p">(</span>x <span class="o">=</span> eruptions<span class="p">,</span> y <span class="o">=</span> waiting<span class="p">))</span> </pre></div> <h2>Add Text</h2> <div class="codehilite"><pre><span></span><span class="c1"># create the ggplot2 plot</span> p <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span> <span class="c1"># add text</span> annotate<span class="p">(</span><span class="s">&quot;text&quot;</span><span class="p">,</span> x <span class="o">=</span> <span class="m">3</span><span class="p">,</span> y <span class="o">=</span> <span class="m">48</span><span class="p">,</span> label<span class="o">=</span><span class="s">&quot;Group 1&quot;</span><span class="p">,</span> family<span class="o">=</span><span class="s">&quot;serif&quot;</span><span class="p">,</span> fontface<span class="o">=</span><span class="s">&quot;italic&quot;</span><span class="p">,</span> colour<span class="o">=</span><span class="s">&quot;darkred&quot;</span><span class="p">,</span> size<span class="o">=</span><span class="m">6</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># add more text</span> annotate<span class="p">(</span><span class="s">&quot;text&quot;</span><span class="p">,</span> x 
<span class="o">=</span> <span class="m">4.5</span><span class="p">,</span> y <span class="o">=</span> <span class="m">66</span><span class="p">,</span> label<span class="o">=</span><span class="s">&quot;Group 2&quot;</span><span class="p">,</span> family<span class="o">=</span><span class="s">&quot;serif&quot;</span><span class="p">,</span> fontface<span class="o">=</span><span class="s">&quot;italic&quot;</span><span class="p">,</span> colour<span class="o">=</span><span class="s">&quot;darkred&quot;</span><span class="p">,</span> size<span class="o">=</span><span class="m">6</span><span class="p">)</span> <span class="c1"># Add Mathematical Expressions</span> <span class="c1"># create the ggplot2 plot</span> p <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span> <span class="c1"># add the formula; parse=TRUE turns the label text into a formula</span> annotate<span class="p">(</span><span class="s">&quot;text&quot;</span><span class="p">,</span> x <span class="o">=</span> <span class="m">4.5</span><span class="p">,</span> y <span class="o">=</span> <span class="m">66</span><span class="p">,</span> parse <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> label <span class="o">=</span> <span class="s">&quot;frac(1, sqrt(2 * pi)) * e ^ {-x^2 / 2}&quot;</span><span class="p">)</span> </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/annotating-plots_files/annotating-plots_4_1.png" /></p> <p><img alt="png" src="http://chrisalbon.com/images/annotating-plots_files/annotating-plots_4_3.png" /></p> <h2>Add Mathematical Expressions</h2> <div class="codehilite"><pre><span></span><span class="c1"># create the ggplot2 plot</span> p <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span> <span class="c1"># add the formula; parse=TRUE turns the label text into a formula</span> annotate<span class="p">(</span><span class="s">&quot;text&quot;</span><span class="p">,</span> x <span 
class="o">=</span> <span class="m">4.5</span><span class="p">,</span> y <span class="o">=</span> <span class="m">66</span><span class="p">,</span> parse <span class="o">=</span> <span class="kc">TRUE</span><span class="p">,</span> label <span class="o">=</span> <span class="s">&quot;frac(1, sqrt(2 * pi)) * e ^ {-x^2 / 2}&quot;</span><span class="p">)</span> </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/annotating-plots_files/annotating-plots_6_1.png" /></p> <h2>Add Lines</h2> <div class="codehilite"><pre><span></span><span class="c1"># load the grid package to create the flat ends of the line segment and the arrow</span> <span class="kn">library</span><span class="p">(</span>grid<span class="p">)</span> <span class="c1"># create the ggplot2 plot</span> p <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span> <span class="c1"># add a horizontal line at y = 66</span> geom_hline<span class="p">(</span>yintercept <span class="o">=</span> <span class="m">66</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># add a vertical line at x = 3</span> geom_vline<span class="p">(</span>xintercept <span class="o">=</span> <span class="m">3</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># add an angled line</span> geom_abline<span class="p">(</span>intercept <span class="o">=</span> <span class="m">37.4</span><span class="p">,</span> slope <span class="o">=</span> <span class="m">9</span><span class="p">)</span> <span class="o">+</span> <span class="c1"># add a line segment</span> annotate<span class="p">(</span><span class="s">&quot;segment&quot;</span><span class="p">,</span> x <span class="o">=</span> <span class="m">1</span><span class="p">,</span> xend <span class="o">=</span> <span class="m">2.5</span><span class="p">,</span> y <span class="o">=</span> <span class="m">75</span><span class="p">,</span> yend <span class="o">=</span> <span class="m">75</span><span class="p">,</span> 
arrow<span class="o">=</span>arrow<span class="p">(</span>ends<span class="o">=</span><span class="s">&quot;both&quot;</span><span class="p">,</span> angle<span class="o">=</span><span class="m">90</span><span class="p">,</span> length<span class="o">=</span>unit<span class="p">(</span><span class="m">.2</span><span class="p">,</span><span class="s">&quot;cm&quot;</span><span class="p">)))</span> <span class="o">+</span> <span class="c1"># add an arrow</span> annotate<span class="p">(</span><span class="s">&quot;segment&quot;</span><span class="p">,</span> x <span class="o">=</span> <span class="m">4</span><span class="p">,</span> xend <span class="o">=</span> <span class="m">5</span><span class="p">,</span> y <span class="o">=</span> <span class="m">60</span><span class="p">,</span> yend <span class="o">=</span> <span class="m">55</span><span class="p">,</span> colour<span class="o">=</span><span class="s">&quot;blue&quot;</span><span class="p">,</span> size<span class="o">=</span><span class="m">2</span><span class="p">,</span> arrow<span class="o">=</span>arrow<span class="p">())</span> </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/annotating-plots_files/annotating-plots_8_1.png" /></p> <h2>Add A Shaded Rectangle</h2> <div class="codehilite"><pre><span></span><span class="c1"># create the ggplot2 plot</span> p <span class="o">+</span> geom_point<span class="p">()</span> <span class="o">+</span> <span class="c1"># add a shaded blue rectangle</span> annotate<span class="p">(</span><span class="s">&quot;rect&quot;</span><span class="p">,</span> xmin<span class="o">=</span><span class="m">1</span><span class="p">,</span> xmax<span class="o">=</span><span class="m">3</span><span class="p">,</span> ymin<span class="o">=</span><span class="m">40</span><span class="p">,</span> ymax<span class="o">=</span><span class="m">100</span><span class="p">,</span> alpha<span class="o">=</span><span class="m">.1</span><span class="p">,</span> fill<span 
class="o">=</span><span class="s">&quot;blue&quot;</span><span class="p">)</span> </pre></div> <p><img alt="png" src="http://chrisalbon.com/images/annotating-plots_files/annotating-plots_10_1.png" /></p>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/annotating-plots.htmlData VisualizationApply A Function On Every Row Of A Dataframehttp://chrisalbon.com/r-stats/apply-function-to-every-row.html<p>original source: http://stackoverflow.com/questions/2074606/doing-a-plyr-operation-on-every-row-of-a-data-frame-in-r</p> <div class="codehilite"><pre><span></span><span class="c1"># Load packages</span> <span class="kn">library</span><span class="p">(</span>plyr<span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># create a simulated dataframe</span> x <span class="o">&lt;-</span> rnorm<span class="p">(</span><span class="m">10</span><span class="p">)</span> y <span class="o">&lt;-</span> rnorm<span class="p">(</span><span class="m">10</span><span class="p">)</span> df <span class="o">&lt;-</span> <span class="kt">data.frame</span><span class="p">(</span>x<span class="p">,</span>y<span class="p">)</span> </pre></div> <div class="codehilite"><pre><span></span><span class="c1"># array to dataframe apply, on df, for each row, apply transform() to create a variable called &quot;max&quot; whose values are the maximum value of x or y (whichever is higher).</span> adply<span class="p">(</span>df<span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="kp">transform</span><span class="p">,</span> max <span class="o">=</span> <span class="kp">max</span><span class="p">(</span>x<span class="p">,</span> y<span class="p">))</span> </pre></div> <div class="codehilite"><pre><span></span> x y max 1 -1.0286311 0.6621974 0.6621974 2 0.5466022 0.2977963 0.5466022 3 -0.6559125 -2.0830247 -0.6559125 4 -1.6942847 -0.2205220 -0.2205220 5 2.2678281 -0.4791234 2.2678281 6 -1.6849528 -0.4873940 
-0.4873940 7 1.1627351 0.5137251 1.1627351 8 1.4182618 0.9697840 1.4182618 9 0.2025052 -0.3519337 0.2025052 10 -0.7100003 -0.6827529 -0.6827529 </pre></div>Chris AlbonSun, 01 May 2016 12:00:00 -0700tag:chrisalbon.com,2016-05-01:r-stats/apply-function-to-every-row.htmlBasics