{ "cells": [ { "cell_type": "markdown", "metadata": { "extensions": { "jupyter_dashboards": { "version": 1, "views": { "grid_default": { "col": 0, "height": 4, "hidden": false, "row": 0, "width": 4 }, "report_default": { "hidden": false } } } } }, "source": [ "# Project: Wrangling and Analyzing Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Gathering\n", "In the cells below, gather **all** three pieces of data for this project and load them into the notebook. **Note:** each dataset requires a different gathering method.\n", "1. Directly download the WeRateDogs Twitter archive data (`twitter-archive-enhanced.csv`)" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Import the necessary libraries\n", "import numpy as np\n", "import pandas as pd\n", "import requests\n", "import os\n", "import tweepy\n", "import json\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "extensions": { "jupyter_dashboards": { "version": 1, "views": { "grid_default": { "hidden": true }, "report_default": { "hidden": true } } } } }, "outputs": [], "source": [ "# Read the Twitter archive CSV into a dataframe\n", "df1 = pd.read_csv('twitter-archive-enhanced.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. 
Use the Requests library to download the tweet image predictions (`image-predictions.tsv`)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "# Create the folder that will hold the downloaded image-predictions file\n", "folder = 'twitter'\n", "if not os.path.exists(folder):\n", "    os.makedirs(folder)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Download the file (the URL must not contain leading whitespace)\n", "url = 'https://d17h27t6h515a5.cloudfront.net/topher/2017/August/599fd2ad_image-predictions/image-predictions.tsv'\n", "response = requests.get(url)\n", "response.raise_for_status()  # stop here if the download failed" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Write the downloaded content to a file so we can access it\n", "with open(os.path.join(folder, url.split('/')[-1]), mode='wb') as file:\n", "    file.write(response.content)" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "# Read the TSV file from the folder it was saved to\n", "df2 = pd.read_csv(os.path.join(folder, 'image-predictions.tsv'), sep='\\t')\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. 
Use the Tweepy library to query additional data via the Twitter API (`tweet-json.txt`)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "# Load the JSON file (one tweet per line) into a dataframe\n", "new_list = []  # collects one dictionary per tweet\n", "\n", "with open('tweet-json.txt') as file:\n", "    for tweet in file:\n", "        data = json.loads(tweet)\n", "        tweet_id = data['id']\n", "        retweet_count = data['retweet_count']\n", "        favorite_count = data['favorite_count']\n", "\n", "        new_list.append({\"tweet_id\": tweet_id, \"retweet_count\": int(retweet_count),\n", "                         \"favourite_count\": favorite_count})\n", "\n", "df3 = pd.DataFrame(new_list, columns=['tweet_id', 'retweet_count', 'favourite_count'])\n" ] }, { "cell_type": "markdown", "metadata": { "extensions": { "jupyter_dashboards": { "version": 1, "views": { "grid_default": { "col": 4, "height": 4, "hidden": false, "row": 28, "width": 4 }, "report_default": { "hidden": false } } } } }, "source": [ "## Assessing Data\n", "In this section, detect and document at least **eight (8) quality issues and two (2) tidiness issues**. You must use **both** visual assessment and\n", "programmatic assessment to assess the data.\n", "\n", "**Note:** pay attention to the following key points when you assess the data.\n", "\n", "* You do not need to gather the tweets beyond August 1st, 2017. You can, but note that you won't be able to gather the image predictions for these tweets since you don't have access to the algorithm used.\n", "\n" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | tweet_id | in_reply_to_status_id | in_reply_to_user_id | timestamp | source | text | retweeted_status_id | retweeted_status_user_id | retweeted_status_timestamp | expanded_urls | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 892420643555336193 | NaN | NaN | 2017-08-01 16:23:56 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | This is Phineas. He's a mystical boy. Only eve... | NaN | NaN | NaN | https://twitter.com/dog_rates/status/892420643... | 13 | 10 | Phineas | None | None | None | None |
| 1 | 892177421306343426 | NaN | NaN | 2017-08-01 00:17:27 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | This is Tilly. She's just checking pup on you.... | NaN | NaN | NaN | https://twitter.com/dog_rates/status/892177421... | 13 | 10 | Tilly | None | None | None | None |
| 2 | 891815181378084864 | NaN | NaN | 2017-07-31 00:18:03 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | This is Archie. He is a rare Norwegian Pouncin... | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891815181... | 12 | 10 | Archie | None | None | None | None |
| 3 | 891689557279858688 | NaN | NaN | 2017-07-30 15:58:51 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | This is Darla. She commenced a snooze mid meal... | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891689557... | 13 | 10 | Darla | None | None | None | None |
| 4 | 891327558926688256 | NaN | NaN | 2017-07-29 16:00:24 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | This is Franklin. He would like you to stop ca... | NaN | NaN | NaN | https://twitter.com/dog_rates/status/891327558... | 12 | 10 | Franklin | None | None | None | None |

| | tweet_id | jpg_url | img_num | p1 | p1_conf | p1_dog | p2 | p2_conf | p2_dog | p3 | p3_conf | p3_dog |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 666020888022790149 | https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg | 1 | Welsh_springer_spaniel | 0.465074 | True | collie | 0.156665 | True | Shetland_sheepdog | 0.061428 | True |
| 1 | 666029285002620928 | https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg | 1 | redbone | 0.506826 | True | miniature_pinscher | 0.074192 | True | Rhodesian_ridgeback | 0.072010 | True |
| 2 | 666033412701032449 | https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg | 1 | German_shepherd | 0.596461 | True | malinois | 0.138584 | True | bloodhound | 0.116197 | True |
| 3 | 666044226329800704 | https://pbs.twimg.com/media/CT5Dr8HUEAA-lEu.jpg | 1 | Rhodesian_ridgeback | 0.408143 | True | redbone | 0.360687 | True | miniature_pinscher | 0.222752 | True |
| 4 | 666049248165822465 | https://pbs.twimg.com/media/CT5IQmsXIAAKY4A.jpg | 1 | miniature_pinscher | 0.560311 | True | Rottweiler | 0.243682 | True | Doberman | 0.154629 | True |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2070 | 891327558926688256 | https://pbs.twimg.com/media/DF6hr6BUMAAzZgT.jpg | 2 | basset | 0.555712 | True | English_springer | 0.225770 | True | German_short-haired_pointer | 0.175219 | True |
| 2071 | 891689557279858688 | https://pbs.twimg.com/media/DF_q7IAWsAEuuN8.jpg | 1 | paper_towel | 0.170278 | False | Labrador_retriever | 0.168086 | True | spatula | 0.040836 | False |
| 2072 | 891815181378084864 | https://pbs.twimg.com/media/DGBdLU1WsAANxJ9.jpg | 1 | Chihuahua | 0.716012 | True | malamute | 0.078253 | True | kelpie | 0.031379 | True |
| 2073 | 892177421306343426 | https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg | 1 | Chihuahua | 0.323581 | True | Pekinese | 0.090647 | True | papillon | 0.068957 | True |
| 2074 | 892420643555336193 | https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg | 1 | orange | 0.097049 | False | bagel | 0.085851 | False | banana | 0.076110 | False |
2075 rows × 12 columns
\n", "" ], "text/plain": [ " tweet_id jpg_url \\\n", "0 666020888022790149 https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg \n", "1 666029285002620928 https://pbs.twimg.com/media/CT42GRgUYAA5iDo.jpg \n", "2 666033412701032449 https://pbs.twimg.com/media/CT4521TWwAEvMyu.jpg \n", "3 666044226329800704 https://pbs.twimg.com/media/CT5Dr8HUEAA-lEu.jpg \n", "4 666049248165822465 https://pbs.twimg.com/media/CT5IQmsXIAAKY4A.jpg \n", "... ... ... \n", "2070 891327558926688256 https://pbs.twimg.com/media/DF6hr6BUMAAzZgT.jpg \n", "2071 891689557279858688 https://pbs.twimg.com/media/DF_q7IAWsAEuuN8.jpg \n", "2072 891815181378084864 https://pbs.twimg.com/media/DGBdLU1WsAANxJ9.jpg \n", "2073 892177421306343426 https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg \n", "2074 892420643555336193 https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg \n", "\n", " img_num p1 p1_conf p1_dog p2 \\\n", "0 1 Welsh_springer_spaniel 0.465074 True collie \n", "1 1 redbone 0.506826 True miniature_pinscher \n", "2 1 German_shepherd 0.596461 True malinois \n", "3 1 Rhodesian_ridgeback 0.408143 True redbone \n", "4 1 miniature_pinscher 0.560311 True Rottweiler \n", "... ... ... ... ... ... \n", "2070 2 basset 0.555712 True English_springer \n", "2071 1 paper_towel 0.170278 False Labrador_retriever \n", "2072 1 Chihuahua 0.716012 True malamute \n", "2073 1 Chihuahua 0.323581 True Pekinese \n", "2074 1 orange 0.097049 False bagel \n", "\n", " p2_conf p2_dog p3 p3_conf p3_dog \n", "0 0.156665 True Shetland_sheepdog 0.061428 True \n", "1 0.074192 True Rhodesian_ridgeback 0.072010 True \n", "2 0.138584 True bloodhound 0.116197 True \n", "3 0.360687 True miniature_pinscher 0.222752 True \n", "4 0.243682 True Doberman 0.154629 True \n", "... ... ... ... ... ... 
\n", "2070 0.225770 True German_short-haired_pointer 0.175219 True \n", "2071 0.168086 True spatula 0.040836 False \n", "2072 0.078253 True kelpie 0.031379 True \n", "2073 0.090647 True papillon 0.068957 True \n", "2074 0.085851 False banana 0.076110 False \n", "\n", "[2075 rows x 12 columns]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#viewing the image-predictions dataframe\n", "df2" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
| | tweet_id | retweet_count | favourite_count |
|---|---|---|---|
| 0 | 892420643555336193 | 8853 | 39467 |
| 1 | 892177421306343426 | 6514 | 33819 |
| 2 | 891815181378084864 | 4328 | 25461 |
| 3 | 891689557279858688 | 8964 | 42908 |
| 4 | 891327558926688256 | 9774 | 41048 |
| ... | ... | ... | ... |
| 2349 | 666049248165822465 | 41 | 111 |
| 2350 | 666044226329800704 | 147 | 311 |
| 2351 | 666033412701032449 | 47 | 128 |
| 2352 | 666029285002620928 | 48 | 132 |
| 2353 | 666020888022790149 | 532 | 2535 |
2354 rows × 3 columns

| | tweet_id | img_num | p1_conf | p2_conf | p3_conf |
|---|---|---|---|---|---|
| count | 2.075000e+03 | 2075.000000 | 2075.000000 | 2.075000e+03 | 2.075000e+03 |
| mean | 7.384514e+17 | 1.203855 | 0.594548 | 1.345886e-01 | 6.032417e-02 |
| std | 6.785203e+16 | 0.561875 | 0.271174 | 1.006657e-01 | 5.090593e-02 |
| min | 6.660209e+17 | 1.000000 | 0.044333 | 1.011300e-08 | 1.740170e-10 |
| 25% | 6.764835e+17 | 1.000000 | 0.364412 | 5.388625e-02 | 1.622240e-02 |
| 50% | 7.119988e+17 | 1.000000 | 0.588230 | 1.181810e-01 | 4.944380e-02 |
| 75% | 7.932034e+17 | 1.000000 | 0.843855 | 1.955655e-01 | 9.180755e-02 |
| max | 8.924206e+17 | 4.000000 | 1.000000 | 4.880140e-01 | 2.734190e-01 |

| | tweet_id | in_reply_to_status_id | in_reply_to_user_id | timestamp | source | text | retweeted_status_id | retweeted_status_user_id | retweeted_status_timestamp | expanded_urls | rating_numerator | rating_denominator | name | doggo | floofer | pupper | puppo |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | 888202515573088257 | NaN | NaN | 2017-07-21 01:02:36 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @dog_rates: This is Canela. She attempted s... | 8.874740e+17 | 4.196984e+09 | 2017-07-19 00:47:34 +0000 | https://twitter.com/dog_rates/status/887473957... | 13 | 10 | Canela | None | None | None | None |
| 32 | 886054160059072513 | NaN | NaN | 2017-07-15 02:45:48 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @Athletics: 12/10 #BATP https://t.co/WxwJmv... | 8.860537e+17 | 1.960740e+07 | 2017-07-15 02:44:07 +0000 | https://twitter.com/dog_rates/status/886053434... | 12 | 10 | None | None | None | None | None |
| 36 | 885311592912609280 | NaN | NaN | 2017-07-13 01:35:06 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @dog_rates: This is Lilly. She just paralle... | 8.305833e+17 | 4.196984e+09 | 2017-02-12 01:04:29 +0000 | https://twitter.com/dog_rates/status/830583320... | 13 | 10 | Lilly | None | None | None | None |
| 68 | 879130579576475649 | NaN | NaN | 2017-06-26 00:13:58 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @dog_rates: This is Emmy. She was adopted t... | 8.780576e+17 | 4.196984e+09 | 2017-06-23 01:10:23 +0000 | https://twitter.com/dog_rates/status/878057613... | 14 | 10 | Emmy | None | None | None | None |
| 73 | 878404777348136964 | NaN | NaN | 2017-06-24 00:09:53 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @dog_rates: Meet Shadow. In an attempt to r... | 8.782815e+17 | 4.196984e+09 | 2017-06-23 16:00:04 +0000 | https://www.gofundme.com/3yd6y1c,https://twitt... | 13 | 10 | Shadow | None | None | None | None |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1023 | 746521445350707200 | NaN | NaN | 2016-06-25 01:52:36 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @dog_rates: This is Shaggy. He knows exactl... | 6.678667e+17 | 4.196984e+09 | 2015-11-21 00:46:50 +0000 | https://twitter.com/dog_rates/status/667866724... | 10 | 10 | Shaggy | None | None | None | None |
| 1043 | 743835915802583040 | NaN | NaN | 2016-06-17 16:01:16 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @dog_rates: Extremely intelligent dog here.... | 6.671383e+17 | 4.196984e+09 | 2015-11-19 00:32:12 +0000 | https://twitter.com/dog_rates/status/667138269... | 10 | 10 | None | None | None | None | None |
| 1242 | 711998809858043904 | NaN | NaN | 2016-03-21 19:31:59 +0000 | <a href=\"http://twitter.com/download/iphone\" r... | RT @twitter: @dog_rates Awesome Tweet! 12/10. ... | 7.119983e+17 | 7.832140e+05 | 2016-03-21 19:29:52 +0000 | https://twitter.com/twitter/status/71199827977... | 12 | 10 | None | None | None | None | None |
| 2259 | 667550904950915073 | NaN | NaN | 2015-11-20 03:51:52 +0000 | <a href=\"http://twitter.com\" rel=\"nofollow\">Tw... | RT @dogratingrating: Exceptional talent. Origi... | 6.675487e+17 | 4.296832e+09 | 2015-11-20 03:43:06 +0000 | https://twitter.com/dogratingrating/status/667... | 12 | 10 | None | None | None | None | None |
| 2260 | 667550882905632768 | NaN | NaN | 2015-11-20 03:51:47 +0000 | <a href=\"http://twitter.com\" rel=\"nofollow\">Tw... | RT @dogratingrating: Unoriginal idea. Blatant ... | 6.675484e+17 | 4.296832e+09 | 2015-11-20 03:41:59 +0000 | https://twitter.com/dogratingrating/status/667... | 5 | 10 | None | None | None | None | None |
181 rows × 17 columns

| | tweet_id | jpg_url | img_num | prediction1 | p1_confidence | p1_asdog | prediction2 | p2_asdog | prediction3 | p3_asdog |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 666020888022790149 | https://pbs.twimg.com/media/CT4udn0WwAA0aMy.jpg | 1 | Welsh_springer_spaniel | 0.465074 | True | collie | True | Shetland_sheepdog | True |

| | tweet_id | timestamp | source | text | rating_numerator | rating_denominator | name | dog_stages | retweet_count | favourite_count | jpg_url | img_num | prediction1 | p1_confidence | p1_asdog |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 892420643555336193 | 2017-08-01 16:23:56+00:00 | <a href=\"http://twitter.com/download/iphone\" r... | This is Phineas. He's a mystical boy. Only eve... | 13 | 10 | Phineas | NaN | 8853 | 39467 | https://pbs.twimg.com/media/DGKD1-bXoAAIAUK.jpg | 1 | orange | 0.097049 | False |
| 1 | 892177421306343426 | 2017-08-01 00:17:27+00:00 | <a href=\"http://twitter.com/download/iphone\" r... | This is Tilly. She's just checking pup on you.... | 13 | 10 | Tilly | NaN | 6514 | 33819 | https://pbs.twimg.com/media/DGGmoV4XsAAUL6n.jpg | 1 | chihuahua | 0.323581 | True |
| 2 | 891815181378084864 | 2017-07-31 00:18:03+00:00 | <a href=\"http://twitter.com/download/iphone\" r... | This is Archie. He is a rare Norwegian Pouncin... | 12 | 10 | Archie | NaN | 4328 | 25461 | https://pbs.twimg.com/media/DGBdLU1WsAANxJ9.jpg | 1 | chihuahua | 0.716012 | True |
| 3 | 891689557279858688 | 2017-07-30 15:58:51+00:00 | <a href=\"http://twitter.com/download/iphone\" r... | This is Darla. She commenced a snooze mid meal... | 13 | 10 | Darla | NaN | 8964 | 42908 | https://pbs.twimg.com/media/DF_q7IAWsAEuuN8.jpg | 1 | paper_towel | 0.170278 | False |

| | tweet_id | rating_numerator | rating_denominator | retweet_count | favourite_count | img_num | p1_confidence |
|---|---|---|---|---|---|---|---|
| count | 2.531000e+03 | 2531.000000 | 2531.000000 | 2531.000000 | 2531.000000 | 2531.000000 | 2311.000000 |
| mean | 7.384206e+17 | 12.927301 | 10.423548 | 2899.792967 | 9048.932043 | 1.105492 | 0.597116 |
| std | 6.693796e+16 | 44.252208 | 6.508795 | 5085.331743 | 12668.514176 | 0.644476 | 0.271582 |
| min | 6.660209e+17 | 0.000000 | 0.000000 | 0.000000 | 52.000000 | 0.000000 | 0.044333 |
| 25% | 6.783890e+17 | 10.000000 | 10.000000 | 642.000000 | 2093.500000 | 1.000000 | 0.367945 |
| 50% | 7.124382e+17 | 11.000000 | 10.000000 | 1408.000000 | 4228.000000 | 1.000000 | 0.596796 |
| 75% | 7.904598e+17 | 12.000000 | 10.000000 | 3263.000000 | 11309.500000 | 1.000000 | 0.846807 |
| max | 8.924206e+17 | 1776.000000 | 170.000000 | 79515.000000 | 132810.000000 | 4.000000 | 1.000000 |

| img_num | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| p1_asdog | | | | |
| False | 526 | 42 | 21 | 7 |
| True | 1447 | 184 | 54 | 30 |