{ "metadata": { "name": "Tutorial - Account-Level Analyses" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": "This is the second in a series of notebooks designed to show you how to analyze social media data. For demonstration purposes we are looking at tweets sent by CSR-related Twitter accounts -- accounts related to ethics, equality, the environment, etc. -- of Fortune 200 firms in 2013. We assume you have already downloaded the data and have completed the steps taken in Chapter 1. In this second notebook I will show you how to conduct various account-level (organizational-level) analyses of the Twitter data. Essentially, we will be taking the tweet-level data and aggregating it to the account level." }, { "cell_type": "heading", "level": 1, "metadata": {}, "source": "Chapter 2: Analyze Twitter Data at the Account (Organization) Level" }, { "cell_type": "markdown", "metadata": {}, "source": "First, we will import several necessary Python packages and set some options for viewing the data. As with Chapter 1, we will be using the Python Data Analysis Library, or PANDAS, extensively for our data manipulations." 
}, { "cell_type": "heading", "level": 3, "metadata": {}, "source": "Import packages and set viewing options" }, { "cell_type": "code", "collapsed": false, "input": "import numpy as np\nimport pandas as pd\nfrom pandas import DataFrame\nfrom pandas import Series", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "code", "collapsed": false, "input": "# Set PANDAS to show all columns in a DataFrame\npd.set_option('display.max_columns', None)", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": "Import graphing packages" }, { "cell_type": "markdown", "metadata": {}, "source": "We'll be producing some figures at the end of this tutorial, so we need to import various graphing capabilities. The default Matplotlib library is solid. " }, { "cell_type": "code", "collapsed": false, "input": "# Import the top-level package too, so that its version can be printed\nimport matplotlib\nimport matplotlib.pyplot as plt\nprint(matplotlib.__version__)", "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": "1.3.1\n" } ], "prompt_number": 46 }, { "cell_type": "code", "collapsed": false, "input": "# Necessary for the xticks option, etc.\nfrom pylab import *", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 39 }, { "cell_type": "markdown", "metadata": {}, "source": "One of the great innovations of the IPython Notebook is the ability to see output and graphics \"inline,\" that is, on the same page and immediately below the code cell that produced them. To enable this feature for graphics we run the following line." }, { "cell_type": "code", "collapsed": false, "input": "%matplotlib inline", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 43 }, { "cell_type": "markdown", "metadata": {}, "source": "We will be using Seaborn to help pretty up the default Matplotlib graphics. 
Seaborn does not come installed with Anaconda Python, so you will have to open up a terminal and run `pip install seaborn`." }, { "cell_type": "code", "collapsed": false, "input": "import seaborn as sns\nprint(sns.__version__)", "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": "0.5.1\n" } ], "prompt_number": 45 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": "Read in data" }, { "cell_type": "markdown", "metadata": {}, "source": "In Chapter 1 we deleted tweets from one unneeded Twitter account and also omitted several unnecessary columns (variables). We then saved, or \"pickled,\" the updated dataframe. Let's now open this saved file. As we can see in the operations below, this dataframe contains 54 variables for 32,330 tweets." }, { "cell_type": "code", "collapsed": false, "input": "df = pd.read_pickle('CSR tweets - 2013 by 41 accounts.pkl')\nprint(len(df))\ndf.head(2)", "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": "32330\n" }, { "html": "
| | rowid | query | tweet_id_str | inserted_date | language | coordinates | retweeted_status | created_at | month | year | content | from_user_screen_name | from_user_id | from_user_followers_count | from_user_friends_count | from_user_listed_count | from_user_favourites_count | from_user_statuses_count | from_user_description | from_user_location | from_user_created_at | retweet_count | favorite_count | entities_urls | entities_urls_count | entities_hashtags | entities_hashtags_count | entities_mentions | entities_mentions_count | in_reply_to_screen_name | in_reply_to_status_id | source | entities_expanded_urls | entities_media_count | media_expanded_url | media_url | media_type | video_link | photo_link | twitpic | num_characters | num_words | retweeted_user | retweeted_user_description | retweeted_user_screen_name | retweeted_user_followers_count | retweeted_user_listed_count | retweeted_user_statuses_count | retweeted_user_location | retweeted_tweet_created_at | Fortune_2012_rank | Company | CSR_sustainability | specific_project_initiative_area |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 67340 | humanavitality | 306897327585652736 | 2014-03-09 13:46:50.222857 | en | NaN | NaN | 2013-02-27 22:43:19.000000 | 2 | 2013 | @louloushive (Tweet 2) We encourage other empl... | humanavitality | 274041023 | 2859 | 440 | 38 | 25 | 1766 | This is the official Twitter account for Human... | NaN | Tue Mar 29 16:23:02 +0000 2011 | 0 | 0 | NaN | 0 | NaN | 0 | louloushive | 1 | louloushive | 306218267737989120.00 | web | NaN | nan | NaN | NaN | NaN | 0 | 0 | 0 | 121 | 19 | nan | NaN | NaN | nan | nan | nan | NaN | NaN | 79 | Humana | 0 | 1 |
| 1 | 39454 | FundacionPfizer | 308616393706844160 | 2014-03-09 13:38:20.679967 | es | NaN | NaN | 2013-03-04 16:34:17.000000 | 3 | 2013 | \u00bfSabes por qu\u00e9 la #vacuna contra la #neumon\u00eda ... | FundacionPfizer | 188384056 | 2464 | 597 | 50 | 11 | 2400 | Noticias sobre Responsabilidad Social y Fundac... | M\u00e9xico | Wed Sep 08 16:14:11 +0000 2010 | 1 | 0 | NaN | 0 | vacuna, neumon\u00eda | 2 | NaN | 0 | NaN | nan | web | NaN | nan | NaN | NaN | NaN | 0 | 0 | 0 | 138 | 20 | nan | NaN | NaN | nan | nan | nan | NaN | NaN | 40 | Pfizer | 0 | 1 |

2 rows \u00d7 54 columns
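
The account-level table that follows has one row per `from_user_screen_name`, with the company, the account description, and a tweet count. The cell that built it is not reproduced here. As a minimal sketch of one way to get a table of that shape, assuming a recent pandas release that supports named aggregation, the snippet below aggregates a tiny made-up frame. `toy` and its values are hypothetical stand-ins for `df`, and the output column names are simply copied from the table.

```python
import pandas as pd

# Toy stand-in for the tweet-level dataframe; all values are made up for illustration.
toy = pd.DataFrame({
    'from_user_screen_name': ['acct_a', 'acct_a', 'acct_b'],
    'Company': ['Firm A', 'Firm A', 'Firm B'],
    'from_user_description': ['About A', 'About A', 'About B'],
    'content': ['tweet 1', 'tweet 2', 'tweet 3'],
})

# One row per account: keep the first company and description seen for the
# account, and count its non-missing tweets.
accounts = toy.groupby('from_user_screen_name').agg(
    Company=('Company', 'first'),
    Description=('from_user_description', 'first'),
    Number_of_tweets=('content', 'count'),
)
print(accounts)
```

Taking the first value is a simplification: it assumes the company and description do not change across an account's tweets.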
| from_user_screen_name | Company | Description | Number_of_tweets |
|---|---|---|---|
| 3M_FoodSafety | 3M | 3M Food Safety \| global manufacturer of innova... | 325 |
| ATTAspire | AT&T | Inspiring students to achieve their dreams. | 336 |
| AlcoaFoundation | Alcoa | Since 1952, we\u2019ve invested $570 million to imp... | 1053 |
| AmgenFoundation | Amgen | The Amgen Foundation seeks to advance science ... | 135 |
| BofA_Community | Bank of America Corp. | We\u2019re connecting local communities to the peop... | 1722 |
| CiscoCSR | Cisco Systems | Sharing stories about how Cisco and our partne... | 2511 |
| CiscoEDU | Cisco Systems | Tweets on the Cisco Connected Learning Experie... | 932 |
| CitizenDisney | Walt Disney | We believe in the power of stories, families a... | 211 |
| ClickToEmpower | Allstate | Click to Empower is a Web initiative of The Al... | 149 |
| Comcastdreambig | Comcast | Comcast empowers communities across the countr... | 581 |
| DE_Youtility | Duke Energy | This profile will retire soon, so head on over... | 300 |
| Dell4Good | Dell | News from Dell corp. responsibility team: sust... | 829 |
| DellEDU | Dell | Connecting with students, educators, school ad... | 1108 |
| DuPont_ability | DuPont | DuPont sustainability news: clean energy, sola... | 1509 |
| EnviroSears | Sears Holdings | Discussing sustainability topics at Sears Hold... | 53 |
| FordDriveGreen | Ford Motor | Ford is committed to affordable, sustainable p... | 27 |
| FundacionPfizer | Pfizer | Noticias sobre Responsabilidad Social y Fundac... | 421 |
| GreenIBM | International Business Machines | Official Twitter account for IBM Big Green Inn... | 81 |
| HeartRescue | Medtronic | A collaborative effort supported by the Medtro... | 322 |
| HoneywellBuild | Honeywell International | Honeywell Building Solutions installs, integra... | 242 |
| IBMSmartCities | International Business Machines | Official IBM Smarter Cities account. Managed b... | 1570 |
| Intelinvolved | Intel | Connecting & enriching lives worldwide to crea... | 1524 |
| JNJStories | Johnson & Johnson | We're tweeting about social good over on @JNJC... | 9 |
| Microsoft_Green | Microsoft | The official Twitter account for Microsoft's E... | 436 |
| PG_CSDW | Procter & Gamble | News from P&G's Children\u2019s Safe Drinking Water... | 187 |
| PPGIdeascapes | PPG Industries | PPG Ideascapes provides sustainable building p... | 160 |
| PromesaPepsiCo | PepsiCo | PepsiCo's commitment to the Hispanic community... | 7 |
| SprintGreenNews | Sprint Nextel | News and Information about Sprint's sustainabi... | 201 |
| TeachingMoney | Capital One Financial | Teaching Money is a new initiative created to ... | 187 |
| WalmartAction | Wal-Mart Stores | Our Community Action Network works to improve ... | 1915 |
| WalmartGreen | Wal-Mart Stores | Together, we can create a sustainable world & ... | 1434 |
| citizenIBM | International Business Machines | This official Citizen IBM Twitter feed is admi... | 1039 |
| ecomagination | General Electric | A forum for fresh thinking and conversation ab... | 594 |
| gehealthy | General Electric | A shared commitment to creating better health ... | 2461 |
| googlestudents | | Google news and updates especially for student... | 211 |
| hpglobalcitizen | Hewlett-Packard | HP Living Progress is focused to create a bett... | 470 |
| humanavitality | Humana | This is the official Twitter account for Human... | 762 |
| mathmovesu | Raytheon | We represent the centerpiece citizenship initi... | 1146 |
| msftcitizenship | Microsoft | Sharing stories about how Microsoft and our pa... | 2493 |
| nikebetterworld | Nike | A better world needs the world\u2019s biggest team.... | 153 |
| verizongiving | Verizon Communications | We are focused on using technology to solve cr... | 2524 |

41 rows \u00d7 3 columns
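
The next table lists, under a `size` column, the companies that run more than one of the 41 accounts. The code behind it is not shown here. A hedged sketch of how such a count could be computed from an account-level table follows. The `accounts` frame is a hypothetical miniature with made-up values, and the greater-than-one filter is an assumption inferred from the table, in which every company shown has at least two accounts.

```python
import pandas as pd

# Hypothetical miniature of the account-level table (one row per account).
accounts = pd.DataFrame(
    {'Company': ['Firm A', 'Firm A', 'Firm B']},
    index=pd.Index(['acct_a1', 'acct_a2', 'acct_b'], name='from_user_screen_name'),
)

# Count the accounts belonging to each company, then keep only those
# companies that have more than one account.
per_company = accounts.groupby('Company').size().to_frame('size')
multi = per_company[per_company['size'] > 1]
print(multi)
```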
| Company | size |
|---|---|
| Cisco Systems | 2 |
| Dell | 2 |
| General Electric | 2 |
| International Business Machines | 3 |
| Microsoft | 2 |
| Wal-Mart Stores | 2 |

6 rows \u00d7 1 columns
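
The table below repeats the account-level data, ordered from the most to the least active account. As a small sketch of that ordering, using a made-up `accounts` frame, the snippet sorts on `Number_of_tweets` in descending order. It assumes a current pandas, where the method is `DataFrame.sort_values`. Releases contemporary with this notebook used `DataFrame.sort` instead, so the call actually used here may have differed.

```python
import pandas as pd

# Hypothetical account-level counts; the numbers are for illustration only.
accounts = pd.DataFrame(
    {'Number_of_tweets': [7, 2524, 300]},
    index=pd.Index(['acct_a', 'acct_b', 'acct_c'], name='from_user_screen_name'),
)

# Order the accounts from most to fewest tweets.
ranked = accounts.sort_values('Number_of_tweets', ascending=False)
print(ranked)
```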
| from_user_screen_name | Company | Description | Number_of_tweets |
|---|---|---|---|
| verizongiving | Verizon Communications | We are focused on using technology to solve cr... | 2524 |
| CiscoCSR | Cisco Systems | Sharing stories about how Cisco and our partne... | 2511 |
| msftcitizenship | Microsoft | Sharing stories about how Microsoft and our pa... | 2493 |
| gehealthy | General Electric | A shared commitment to creating better health ... | 2461 |
| WalmartAction | Wal-Mart Stores | Our Community Action Network works to improve ... | 1915 |
| BofA_Community | Bank of America Corp. | We\u2019re connecting local communities to the peop... | 1722 |
| IBMSmartCities | International Business Machines | Official IBM Smarter Cities account. Managed b... | 1570 |
| Intelinvolved | Intel | Connecting & enriching lives worldwide to crea... | 1524 |
| DuPont_ability | DuPont | DuPont sustainability news: clean energy, sola... | 1509 |
| WalmartGreen | Wal-Mart Stores | Together, we can create a sustainable world & ... | 1434 |
| mathmovesu | Raytheon | We represent the centerpiece citizenship initi... | 1146 |
| DellEDU | Dell | Connecting with students, educators, school ad... | 1108 |
| AlcoaFoundation | Alcoa | Since 1952, we\u2019ve invested $570 million to imp... | 1053 |
| citizenIBM | International Business Machines | This official Citizen IBM Twitter feed is admi... | 1039 |
| CiscoEDU | Cisco Systems | Tweets on the Cisco Connected Learning Experie... | 932 |
| Dell4Good | Dell | News from Dell corp. responsibility team: sust... | 829 |
| humanavitality | Humana | This is the official Twitter account for Human... | 762 |
| ecomagination | General Electric | A forum for fresh thinking and conversation ab... | 594 |
| Comcastdreambig | Comcast | Comcast empowers communities across the countr... | 581 |
| hpglobalcitizen | Hewlett-Packard | HP Living Progress is focused to create a bett... | 470 |
| Microsoft_Green | Microsoft | The official Twitter account for Microsoft's E... | 436 |
| FundacionPfizer | Pfizer | Noticias sobre Responsabilidad Social y Fundac... | 421 |
| ATTAspire | AT&T | Inspiring students to achieve their dreams. | 336 |
| 3M_FoodSafety | 3M | 3M Food Safety \| global manufacturer of innova... | 325 |
| HeartRescue | Medtronic | A collaborative effort supported by the Medtro... | 322 |
| DE_Youtility | Duke Energy | This profile will retire soon, so head on over... | 300 |
| HoneywellBuild | Honeywell International | Honeywell Building Solutions installs, integra... | 242 |
| googlestudents | | Google news and updates especially for student... | 211 |
| CitizenDisney | Walt Disney | We believe in the power of stories, families a... | 211 |
| SprintGreenNews | Sprint Nextel | News and Information about Sprint's sustainabi... | 201 |
| PG_CSDW | Procter & Gamble | News from P&G's Children\u2019s Safe Drinking Water... | 187 |
| TeachingMoney | Capital One Financial | Teaching Money is a new initiative created to ... | 187 |
| PPGIdeascapes | PPG Industries | PPG Ideascapes provides sustainable building p... | 160 |
| nikebetterworld | Nike | A better world needs the world\u2019s biggest team.... | 153 |
| ClickToEmpower | Allstate | Click to Empower is a Web initiative of The Al... | 149 |
| AmgenFoundation | Amgen | The Amgen Foundation seeks to advance science ... | 135 |
| GreenIBM | International Business Machines | Official Twitter account for IBM Big Green Inn... | 81 |
| EnviroSears | Sears Holdings | Discussing sustainability topics at Sears Hold... | 53 |
| FordDriveGreen | Ford Motor | Ford is committed to affordable, sustainable p... | 27 |
| JNJStories | Johnson & Johnson | We're tweeting about social good over on @JNJC... | 9 |
| PromesaPepsiCo | PepsiCo | PepsiCo's commitment to the Hispanic community... | 7 |

41 rows \u00d7 3 columns