{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# The Data Farm: Science Week Presentation\n", "\n", "## Learning from Data\n", "\n", "### [Neil D. Lawrence](http://staffwww.dcs.shef.ac.uk/people/N.Lawrence/) and the [Sheffield Machine Learning Research Group](http://ml.dcs.shef.ac.uk/)\n", "#### 5th March 2014\n", "\n", "This notebook has been made available as part of our **Open Data Science** agenda. If you want to read more about this agenda there is a [position paper/blog post available on it here](http://inverseprobability.com/2014/07/01/open-data-science/).\n", "\n", "This session is about 'learning from data'. How do we take the information on the internet and make sense of it? The answer, as you might expect, is using computers and mathematics. Luckily we also have a suite of tools to help. The first tool is a way of programming in Python that really facilitates *interaction* with data. It is known as the \"[IPython Notebook](http://ipython.org/notebook.html)\", or more recently as the \"[Jupyter Project](http://jupyter.org/)\".\n", "\n", "### Welcome to the IPython Notebook\n", "\n", "The notebook is a great way of interacting with computers. In particular it allows me to integrate text descriptions, maths and code all together in the same place. For me, that's what my research is all about. I try to take concepts that people can describe, then I try to capture the essence of the concept in a *mathematical* model. Then I try and implement the model on a computer, often combining it with data, to try and do something fun, useful or, ideally, both. \n", "\n", "For the Science Week lecture on \"The Data Farm\" we looked at *recommender systems*.\n", "\n", "## Recommender Systems\n", "\n", "Do you watch Netflix? Have you ever rated a movie there? Do you buy books or electronics on Amazon? How about grocery shopping? All these companies want you to buy more, watch more or listen more. 
The best way of getting you to do that is by showing you more of what you like. But what do you like? What sort of person are you? Can the computer tell? It can certainly try! And it does so with a \"Recommender System\". Recommender systems are so important to Netflix that they offered a [$1 million prize](http://en.wikipedia.org/wiki/Netflix_Prize) for improving theirs.\n", "\n", "## What is a Recommender System?\n", "\n", "To understand you, the computer has to turn you into a series of numbers (but don't worry, it's not painful). In the 1960s TV series, [\"The Prisoner\"](http://en.wikipedia.org/wiki/The_Prisoner), Patrick McGoohan's character and his fellow prisoners and warders were each assigned a number. But their number didn't seem to reflect anything about their statuses (it couldn't even be used to distinguish between prisoner and warder). To perform computations about preferences we need to turn users into a series of numbers. How many numbers does it take to describe you? How many numbers does it take to summarize anyone's tastes in movies, music or groceries? These are open questions that can be difficult to answer. It depends how much we want to know about you: if it's just your taste in film and clothes that's one thing. If we want to summarize your state of health, that's another. \n", "\n", "Apparently a three-year-old's brain has $10^{15}$ synapses in it: a synapse governs the connection between neurons. It's quite possible that each synapse needs a number to represent it. So in the end it takes a lot of numbers to represent just the brain of a human, and that's before we start talking about what's going on in each of your cells.\n", "\n", "We won't try to represent all this; instead we will perform a 'compression' on you. We will try to compress your opinions about films into just a few numbers. 
We are going to work with movies, and just for fun, we thought we'd get you to rate some movies and try and build a recommendation system.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that Python, like many other programming languages, has a lot of software libraries for doing different things. Here we used three separate libraries (which we also loaded into the computer with the `import` commands) to perform three different jobs. And the software we downloaded will also be available in the form of ... you guessed it ... a library. In this case a library that we've written in Sheffield called `pods`. In the next section we load in the library and use it to download some data for analysis. \n", "\n", "Note that the next step won't work unless you've run the code for the previous step! That's because when running code in the notebook, the Python kernel (which is the software in the computer that's running the Python code) has a *state*. By state we mean the values of the different variables in the system. The values of these variables can have a lot of effects, like which program code is in memory, and therefore accessible to the computer. By running commands in this notebook, we change that state all the time. The software below relies on the software above being run to work. The notebook lets you know which box was run when by placing a number beside the box. After the code's been run you should see a number like:\n", "```\n", "In [3]:\n", "```\n", "if it was the third box to be run. Or `In [2]:` if it was the second (etc.).\n", "\n", "Whilst the notebook is waiting for the kernel to run commands, you will see `In [*]:` as the prompt. If this happens for too long, it may be that you've asked the computer to do something too complicated and it's getting slow. You can *interrupt* the kernel by selecting `Kernel->Interrupt` from the menu in the browser window above. 
If you get really confused about what's going on and want to wipe the state of the kernel clean, then you can select `Kernel->Restart`. That will reset the kernel's state." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [], "source": [ "%matplotlib inline\n", "import pods # Our software for python open data science (pods)\n", "import pandas as pd # the pandas library for data analysis\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "\n", "d = pods.datasets.movie_body_count()\n", "movies = d['Y']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Recommendation in Numbers\n", "\n", "I said that we need to reduce everything to numbers for recommendations on the computer. Let's think about what those numbers could be. A recommender system aims to make suggestions for *items* (films, books, music, groceries) given what it knows about *users'* tastes. The recommendation engine needs to represent the *taste* of all the users and the *characteristics* of each object. That is what our numbers should do.\n", "\n", "One way for organizing objects is to place related things close together. You can spend a lot of time doing this when you are supposed to be revising. It's called tidying up. Tidying up can be quite important, for example in a library we try and put books that are on related topics near to each other on the shelves. One system for doing this is known as [Dewey Decimal Classification](http://en.wikipedia.org/wiki/Dewey_Decimal_Classification). In the Dewey Decimal Classification system (which dates from 1876) each subject is given a number (in fact it's a decimal number---no surprises there!). For example, the fields of *Sciences* and *Mathematics* are all given numbers starting in the 500s. Because computers weren't around in 1876 (well a man called Charles Babbage was toying with the idea ... 
and a lady called Ada Lovelace was even thinking about programming languages but no one had actually built a computer) ... anyway as I was saying, because computers weren't around in 1876 they didn't get a number of their own, so they ended up being given numbers starting with 004. For example, works on the 'mathematical principles' of computer science are given the series 004.0151 (which we might normally write as 4.0151, but when you think about it we could equally write 004.0151). Whilst it's a *classification* system for splitting books into groups (this is sometimes called a *taxonomy*) the books in the system are also normally laid out in the same order as the numbers. Ah ... back to the point: numbers. \n", "\n", "So in the Dewey Decimal system we might now expect that neighbouring numbers represent books that are *related* to each other in subject. That seems to be exactly what we want for our recommender system! Could we somehow represent each film's subject according to a number? \n", "\n", "The problem here is that we are representing it with only *one* number, a so-called *one* dimensional representation of the system. Actually a *one* dimensional representation of a subject can be very awkward. To see this, let's have another think about the Dewey Decimal system. However, this time we can think about subjects which have numbers in the 900s. By the way, I'm not an expert on the Dewey Decimal system, I just read the Wikipedia page on it and then found a list of the numbers from [Nova Southeastern University](http://www.nova.edu/library/help/misc/lc_dewey/dewey900.html#40) ... I may be a bit geeky, but memorizing the Dewey Decimal numbers is a step too far for even me (and don't memorize $\\pi$ either, [memorize $\\tau$ instead](http://www.tauday.com/) ...).\n", "\n", "Anyway, back to those Dewey decimals ... if we look at the list for the 900s from the link above, we see that whilst the ordering for places is somewhat sensible, it is also rather arbitrary. 
We have Europe listed from 940-949, Asia listed from 950-959 and Africa listed from 960-969. That seems OK, because Asia borders Europe, and Africa borders Asia (sorry I've slipped into a Geography lesson here ...). But it's also true that Africa is very close to Europe and the [Carthaginians](http://en.wikipedia.org/wiki/Carthage) had an empire that went across both ... as did the Romans (a bit of History thrown in too). \n", "\n", "

| | title | item | user | rating |
|---|---|---|---|---|
| 6 | Scream (1996) | 288 | lawrennd | 4 |
| 8 | Air Force One (1997) | 300 | lawrennd | 4 |
| 9 | Independence Day (ID4) (1996) | 121 | lawrennd | 3 |
| 50 | Amadeus (1984) | 191 | lawrennd | 4 |
| 69 | Fish Called Wanda, A (1988) | 153 | lawrennd | 4 |
| 81 | Time to Kill, A (1996) | 282 | lawrennd | 3 |
| 118 | Star Trek IV: The Voyage Home (1986) | 230 | lawrennd | 2 |
| 165 | Grease (1978) | 451 | lawrennd | 3 |
| 174 | Pretty Woman (1990) | 739 | lawrennd | 4 |
| 195 | Fried Green Tomatoes (1991) | 660 | lawrennd | 4 |
| 221 | Mission: Impossible (1996) | 405 | spanna | 2 |
| 222 | Fugitive, The (1993) | 79 | spanna | 5 |
| 268 | Trainspotting (1996) | 475 | spanna | 4 |
| 319 | Good Will Hunting (1997) | 272 | spanna | 3 |
| 403 | Return of the Jedi (1983) | 181 | Smeagol | 5 |
| 408 | Air Force One (1997) | 300 | Smeagol | 3 |
| 410 | Raiders of the Lost Ark (1981) | 174 | Smeagol | 4 |
| 415 | Jerry Maguire (1996) | 237 | Smeagol | 3 |
| 430 | Men in Black (1997) | 257 | Smeagol | 2 |
| 463 | Broken Arrow (1996) | 546 | Smeagol | 3 |
| 477 | Dante's Peak (1997) | 323 | Smeagol | 1 |
| 485 | Hunt for Red October, The (1990) | 265 | Smeagol | 4 |
| 512 | Eraser (1996) | 597 | Smeagol | 2 |
| 514 | Beauty and the Beast (1991) | 588 | Smeagol | 5 |
| 515 | Boot, Das (1981) | 515 | Smeagol | 4 |
| 616 | Rock, The (1996) | 117 | filmgeek1988 | 4 |
| 636 | L.A. Confidential (1997) | 302 | filmgeek1988 | 4 |
| 657 | Jurassic Park (1993) | 82 | filmgeek1988 | 5 |
| 673 | Die Hard (1988) | 144 | filmgeek1988 | 4 |
| 674 | Casablanca (1942) | 483 | filmgeek1988 | 5 |
| ... | ... | ... | ... | ... |
| 7466 | Blues Brothers, The (1980) | 186 | Schmaylor | 4 |
| 7468 | Trainspotting (1996) | 475 | Schmaylor | 5 |
| 7606 | Scream (1996) | 288 | genius | 3 |
| 7610 | Raiders of the Lost Ark (1981) | 174 | genius | 5 |
| 7625 | Princess Bride, The (1987) | 173 | genius | 5 |
| 7630 | Men in Black (1997) | 257 | genius | 3 |
| 7649 | Apollo 13 (1995) | 28 | genius | 4 |
| 7656 | One Flew Over the Cuckoo's Nest (1975) | 357 | genius | 5 |
| 7682 | It's a Wonderful Life (1946) | 496 | genius | 5 |
| 7705 | Starship Troopers (1997) | 271 | genius | 5 |
| 7720 | Citizen Kane (1941) | 134 | genius | 5 |
| 7727 | This Is Spinal Tap (1984) | 209 | genius | 5 |
| 7736 | Unforgiven (1992) | 203 | genius | 5 |
| 7753 | Fantasia (1940) | 432 | genius | 4 |
| 7759 | Star Trek III: The Search for Spock (1984) | 229 | genius | 3 |
| 7785 | Grosse Pointe Blank (1997) | 248 | genius | 4 |
| 7809 | Independence Day (ID4) (1996) | 121 | kiramira | 3 |
| 7818 | Star Trek: First Contact (1996) | 222 | kiramira | 3 |
| 7846 | Shawshank Redemption, The (1994) | 64 | kiramira | 3 |
| 7859 | 2001: A Space Odyssey (1968) | 135 | kiramira | 3 |
| 7928 | Sabrina (1995) | 274 | kiramira | 3 |
| 8032 | E.T. the Extra-Terrestrial (1982) | 423 | checkers | 5 |
| 8044 | When Harry Met Sally... (1989) | 216 | checkers | 4 |
| 8045 | Aliens (1986) | 176 | checkers | 5 |
| 8051 | Blade Runner (1982) | 89 | checkers | 4 |
| 8054 | Usual Suspects, The (1995) | 12 | checkers | 4 |
| 8073 | Die Hard (1988) | 144 | checkers | 5 |
| 8079 | Psycho (1960) | 185 | checkers | 3 |
| 8111 | Shining, The (1980) | 200 | checkers | 5 |
| 8135 | Taxi Driver (1976) | 23 | checkers | 3 |

335 rows × 4 columns
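
To get from a long list of ratings like this one to something a recommender can work with, it often helps to lay the ratings out as a grid with one row per user and one column per film. Here is a minimal sketch of that idea using pandas. It copies just five rows from the table above, and the variable names `ratings` and `Y` are our own choices for illustration, not necessarily the ones used elsewhere in this notebook.

```python
import pandas as pd

# Five rows copied from the ratings table above (not the full data set).
ratings = pd.DataFrame(
    {
        "title": [
            "Scream (1996)",
            "Air Force One (1997)",
            "Independence Day (ID4) (1996)",
            "Air Force One (1997)",
            "Raiders of the Lost Ark (1981)",
        ],
        "item": [288, 300, 121, 300, 174],
        "user": ["lawrennd", "lawrennd", "lawrennd", "Smeagol", "Smeagol"],
        "rating": [4, 4, 3, 3, 4],
    },
    index=[6, 8, 9, 408, 410],
)

# One row per user, one column per film (item number).
# NaN marks a film that the user has not rated.
Y = ratings.pivot(index="user", columns="item", values="rating")
print(Y)
```

The gaps (`NaN`) are the films a user has not rated, and those gaps are exactly what a recommender system tries to fill in.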

| | 0 | 1 |
|---|---|---|
| 1 | 0.183076 | 0.189615 |
| 7 | -0.561889 | -0.910444 |
| 8 | -0.750350 | 0.646612 |
| 11 | -0.298702 | -1.006292 |
| 13 | 1.190287 | -0.249183 |
| 15 | -1.089392 | -0.539742 |
| 22 | 0.528807 | -0.918695 |
| 25 | 0.220105 | 0.253632 |
| 28 | -0.923326 | -0.323664 |
| 50 | 0.541759 | -0.446159 |
| 71 | 0.214103 | 0.105616 |
| 77 | -0.777510 | -0.395414 |
| 79 | 0.162071 | 0.935833 |
| 83 | -0.786773 | -0.388716 |
| 89 | 0.667328 | -0.475148 |
| 95 | -0.385357 | -0.721846 |
| 99 | -0.849732 | -0.806150 |
| 124 | -0.762526 | 0.288493 |
| 127 | -0.997096 | 0.160759 |
| 133 | 0.823060 | 0.352506 |
| 135 | 0.845122 | 0.813553 |
| 144 | 0.860058 | -0.014109 |
| 153 | 1.201314 | -0.488023 |
| 172 | -1.373266 | -0.012550 |
| 173 | -0.737328 | 0.493552 |
| 176 | 1.057559 | -0.034322 |
| 179 | 0.792977 | 0.114314 |
| 181 | -1.110744 | 0.699923 |
| 182 | 0.782992 | 0.171123 |
| 187 | -1.012451 | -0.064854 |
| ... | ... | ... |
| 690 | 0.213227 | 0.080885 |
| 479 | 0.813733 | -0.169723 |
| 511 | -0.254329 | -0.125884 |
| 582 | -0.287145 | 0.023956 |
| 24 | -0.366711 | 0.049370 |
| 196 | -0.675826 | 0.028841 |
| 393 | -0.299633 | -0.108062 |
| 423 | 0.485487 | 0.090516 |
| 508 | -0.672849 | -0.343811 |
| 117 | 0.013734 | 0.138576 |
| 188 | 0.858462 | 0.090509 |
| 209 | -0.619404 | 0.545979 |
| 218 | 0.205800 | 0.188982 |
| 307 | 0.103603 | 0.314980 |
| 313 | -0.312871 | 0.978437 |
| 323 | -0.953880 | -1.319440 |
| 411 | 0.011786 | -0.817824 |
| 471 | -0.335378 | -0.302101 |
| 475 | -0.711656 | 0.136600 |
| 515 | 0.137971 | 0.193580 |
| 550 | -0.358279 | -0.279781 |
| 732 | 0.336907 | -0.130924 |
| 739 | -0.277026 | 0.371131 |
| 751 | 0.069441 | 0.270111 |
| 118 | -0.747108 | -0.379176 |
| 248 | -0.288034 | 0.096965 |
| 252 | -1.488702 | 0.556678 |
| 476 | -0.059711 | 0.176887 |
| 154 | 0.335405 | 0.519721 |
| 685 | -1.214512 | 0.487666 |

155 rows × 2 columns
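
The table above gives two numbers for each row, so each row can be drawn as a point on a flat map. If, as the row labels suggest, those rows are films, then films with similar numbers sit close together on the map, a bit like related books on a library shelf. The sketch below copies five rows from the table and finds the nearest neighbours of one of them using straight-line distance. The function `nearest_items` and the variable `V` are hypothetical names of ours, and measuring closeness by Euclidean distance is an assumption made purely for illustration.

```python
import numpy as np
import pandas as pd

# Five rows copied from the two-column table above, labelled by row index.
V = pd.DataFrame(
    [
        [0.183076, 0.189615],
        [0.220105, 0.253632],
        [0.214103, 0.105616],
        [-1.373266, -0.012550],
        [-1.110744, 0.699923],
    ],
    index=[1, 25, 71, 172, 181],
)


def nearest_items(V, item, k=2):
    """Return the k rows of V closest to `item`, with their distances."""
    # Difference between every other row and the chosen row, column by column.
    diffs = V.drop(index=item) - V.loc[item]
    # Straight-line (Euclidean) distance in the two-dimensional map.
    distances = np.sqrt((diffs ** 2).sum(axis=1))
    return distances.sort_values().head(k)


neighbours = nearest_items(V, 1)
print(neighbours)
```

With two numbers per item, a point can have close neighbours in several directions at once, which a single number on a shelf cannot manage.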

| index | user | item | rating | split |
|---|---|---|---|---|
| 874965758 | 1 | 1 | 1.212109 | u1.base |
| 875071561 | 1 | 7 | 0.212109 | u1.base |
| 875072484 | 1 | 8 | -2.787891 | u1.base |
| 875072262 | 1 | 11 | -1.787891 | u1.base |
| 875071805 | 1 | 13 | 1.212109 | u1.base |
| 875071608 | 1 | 15 | 1.212109 | u1.base |
| 875072404 | 1 | 22 | 0.212109 | u1.base |
| 875071805 | 1 | 25 | 0.212109 | u1.base |
| 875072173 | 1 | 28 | 0.212109 | u1.base |
| 874965954 | 1 | 50 | 1.212109 | u1.base |
| 876892425 | 1 | 71 | -0.787891 | u1.base |
| 876893205 | 1 | 77 | 0.212109 | u1.base |
| 875072865 | 1 | 79 | 0.212109 | u1.base |
| 875072370 | 1 | 83 | -0.787891 | u1.base |
| 875072484 | 1 | 89 | 1.212109 | u1.base |
| 875072303 | 1 | 95 | 0.212109 | u1.base |
| 875072547 | 1 | 99 | -0.787891 | u1.base |
| 875071484 | 1 | 124 | 1.212109 | u1.base |
| 874965706 | 1 | 127 | 1.212109 | u1.base |
| 876892818 | 1 | 133 | 0.212109 | u1.base |
| 875072404 | 1 | 135 | 0.212109 | u1.base |
| 875073180 | 1 | 144 | 0.212109 | u1.base |
| 876893230 | 1 | 153 | -0.787891 | u1.base |
| 874965478 | 1 | 172 | 1.212109 | u1.base |
| 878541803 | 1 | 173 | 1.212109 | u1.base |
| 876892468 | 1 | 176 | 1.212109 | u1.base |
| 875072370 | 1 | 179 | -0.787891 | u1.base |
| 874965739 | 1 | 181 | 1.212109 | u1.base |
| 875072520 | 1 | 182 | 0.212109 | u1.base |
| 874965678 | 1 | 187 | 0.212109 | u1.base |
| ... | ... | ... | ... | ... |
| 886832337 | 936 | 250 | 1.212109 | ub.test |
| 886831374 | 936 | 313 | 0.212109 | ub.test |
| 886831415 | 936 | 333 | -0.787891 | ub.test |
| 876762200 | 937 | 258 | 0.212109 | ub.test |
| 876768813 | 937 | 300 | 0.212109 | ub.test |
| 891356390 | 938 | 181 | 1.212109 | ub.test |
| 891350008 | 938 | 300 | -0.787891 | ub.test |
| 891357137 | 938 | 476 | 0.212109 | ub.test |
| 880260956 | 939 | 222 | 1.212109 | ub.test |
| 880260636 | 939 | 326 | 1.212109 | ub.test |
| 880261610 | 939 | 597 | 0.212109 | ub.test |
| 885921577 | 940 | 8 | 1.212109 | ub.test |
| 885921577 | 940 | 56 | 1.212109 | ub.test |
| 885921953 | 940 | 153 | -1.787891 | ub.test |
| 885921451 | 940 | 172 | 0.212109 | ub.test |
| 885921310 | 940 | 181 | -0.787891 | ub.test |
| 885921953 | 940 | 194 | 1.212109 | ub.test |
| 875049144 | 941 | 1 | 1.212109 | ub.test |
| 875049038 | 941 | 222 | -1.787891 | ub.test |
| 875049038 | 941 | 273 | -0.787891 | ub.test |
| 875048887 | 941 | 298 | 1.212109 | ub.test |
| 875048495 | 941 | 300 | 0.212109 | ub.test |
| 891283517 | 942 | 31 | 1.212109 | ub.test |
| 891282396 | 942 | 269 | -1.787891 | ub.test |
| 891282396 | 942 | 313 | -0.787891 | ub.test |
| 891282931 | 942 | 511 | 0.212109 | ub.test |
| 888639093 | 943 | 12 | 1.212109 | ub.test |
| 888692413 | 943 | 237 | 0.212109 | ub.test |
| 875502042 | 943 | 471 | 1.212109 | ub.test |
| 875502042 | 943 | 685 | 0.212109 | ub.test |

266259 rows × 4 columns
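
The ratings in this last table are no longer whole numbers of stars. Every value looks like a star rating from 1 to 5 with the same constant, about 3.79, taken away (for example 5 - 3.787891 = 1.212109). A natural guess, and it is only a guess on our part, is that the constant is the average rating, so that a positive number means "liked it more than average" and a negative number means "liked it less". The sketch below shows that kind of centring on six star ratings recovered from the table under this assumption. The column name `centred` and the variable `offset` are hypothetical names chosen for illustration.

```python
import pandas as pd

# Six rows from the table above, converted back to whole stars by
# assuming that 3.787891 was subtracted from each rating.
Y = pd.DataFrame(
    {
        "user": [1, 1, 1, 1, 940, 940],
        "item": [1, 7, 8, 11, 8, 153],
        "rating": [5, 4, 1, 2, 5, 2],
        "split": ["u1.base"] * 4 + ["ub.test"] * 2,
    }
)

# Average rating of this small sample (the full data set would give a
# different value, presumably the 3.787891 seen in the table).
offset = Y["rating"].mean()

# Centred ratings: positive means above average, negative means below.
Y["centred"] = Y["rating"] - offset
print(Y)
```

After centring, the ratings average out to zero, so a missing rating can sensibly be thought of as "about average" until we know better.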
\n", "