What Do Boy Bands Tell Us About Disasters? The Social Media Response to the Nepal Earthquake
Shepard, David Lawrence (UCLA, United States of America), shepard.david@gmail.com
Hashimoto, Takako (Chiba University of Commerce, Japan), takako@cuc.ac.jp
Kuboyama, Tetsuji (Gakushuin University, Japan), kuboyama@tk.cc.gakushuin.ac.jp
Shin, Kilho (Hyogo University, Japan), yshin@ai.u-hyogo.ac.jp


Long Paper
Keywords: social media, event detection, text mining, natural language processing, data mining

When an earthquake struck Nepal in 2015, the band One Direction sent a tweet encouraging their fans to donate to relief efforts. This tweet was retweeted a few times, but was quickly lost in a flood of other tweets about One Direction’s tour. At the same time, an Indian Hindu extremist politician flooded his Twitter stream with rumors that Christian missionaries were coercing Nepalis to convert in exchange for humanitarian aid, and an Indian religious group mixed substantial numbers of tweets about a movie it had released with tweets about its relief efforts in Nepal. These are just a few of the users who engaged with the disaster from a distance: they had different motives for tweeting about it, and different levels of engagement with it. We call such users “onlookers”: they tweet about a disaster, but are not directly affected by it.

This paper analyzes onlooker behavior. It argues that onlookers who will send only a few tweets about a disaster can be predicted from their interests, while onlookers who will tweet heavily about it share few, if any, common interests. We show that onlookers who primarily tweet about entertainment and news topics are likely to mention the disaster, yet send few tweets about it. On the other hand, onlookers who tweet substantially about a disaster after it happens are difficult to identify before the disaster occurs, because they do not share common interests aside from the disaster itself.

Background

Natural disasters often attract significant attention on Twitter, both from those affected and from those who are distant. A substantial amount of research has explored how social media leads users to engage with political, social, and humanitarian problems; however, opinions on social media’s effectiveness—whether it prompts users to donate money, stay informed, or participate in campaigns—are mixed. Some argue that displaying concern on social media is more about acquiring social capital than effecting change (Shulman, 2009; Gladwell, 2011; Morozov, 2012; Morozov, 2014), while a Pew Research Center survey finds that social media does create change (Rainie et al., 2016). Many have also argued that social media was important, though not essential, to protests in Egypt (Howard et al., 2011; Tufekci and Wilson, 2012) and other nations (Rainie et al., 2016; Shirky, 2011). One analysis found that charities’ use of social media does not increase donations (Malcolm, 2016), while another finds that certain tweeting strategies do (Gasso Climent, 2015), although tweets may not raise awareness of the charity’s causes (Bravo and Hoffman-Goetz, 2015). Where all these studies concur is that social media enables a substantial amount of discourse about crises. The question we explore is how to predict how much attention Twitter users will pay to a crisis: social media allows a user to send a single retweet about a disaster—as many One Direction fans did—or to sustain interest by following other users and sending many tweets about the event over a period of time.

Additionally, there is little question that social media is useful to those directly affected by disasters. A substantial body of research finds that social media helps first responders (Regalado et al., 2015; Dugdale et al., 2012; Omilion-Hodges and McClain, 2016; Burns, 2015; Xiao et al., 2015; Kaewkitipong et al., 2016; Meng et al., 2015; Madianou, 2015; Palen, 2008), and specialized algorithms have been developed for that purpose (Pohl et al., 2013a; Pohl et al., 2013b; Platt et al., 2011). Little work examines users who tweet about disasters at a distance, however, despite the large numbers of such users. We examine these onlookers because they produce a large share of the tweets about humanitarian crises.

Method

We use quantitative text analysis to identify tweets about the earthquake, to cluster onlookers based on shared interests, and to derive a correlation between onlookers’ interests and the number of tweets they sent about the earthquake. This section outlines our methods.

To obtain a broad sample of onlookers, we gathered a dataset of over 5 million tweets sent by around 15,000 users in the three weeks following the Nepal earthquake. We harvested the data from the Twitter REST API by searching for any tweets that mentioned the word “Nepal” between April 24 and May 8, 2015. We randomly selected 15,000 users from this set and harvested all of the tweets they sent between April 24, 2014 and May 5, 2015. We attempted to capture only English-speaking users to increase the likelihood of sampling users not directly affected by the earthquake, though we still captured some users who tweeted in multiple languages. This process left us with roughly 11,000 onlookers.
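
The sketch below illustrates this harvesting workflow. It is a minimal, illustrative reconstruction rather than the script we actually ran: it assumes tweepy 3.x-style access to the Twitter REST API, the credentials and request limits are placeholders, and the standard search endpoint only returns recent tweets, so in practice the collection ran continuously over the two-week search window.

    # Illustrative sketch of the harvesting workflow (tweepy 3.x-style calls;
    # credentials and limits are placeholders, not the values used in the study).
    import random
    import tweepy

    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    api = tweepy.API(auth, wait_on_rate_limit=True)

    # 1. Collect tweets mentioning "Nepal" and record which users sent them.
    user_ids = set()
    for tweet in tweepy.Cursor(api.search, q="Nepal", lang="en").items(10000):
        user_ids.add(tweet.user.id)

    # 2. Randomly sample users, then pull each sampled user's timeline.
    sample = random.sample(sorted(user_ids), min(15000, len(user_ids)))
    timelines = {}
    for uid in sample:
        try:
            timelines[uid] = [t.text for t in
                              tweepy.Cursor(api.user_timeline, user_id=uid).items(3200)]
        except tweepy.TweepError:  # skip protected or deleted accounts
            continue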

To determine how many tweets a user had sent about the earthquake, we trained a Naive Bayes classifier using MALLET (McCallum, 2002) on the tweets of 100 onlookers (about 30,000 tweets), hand-labelling each tweet as either quake-related or not. We applied the trained classifier to the remainder of the dataset to count each user’s quake-related tweets. Spot checks showed that this technique had acceptable accuracy.
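
The same classification step can be sketched with scikit-learn’s multinomial Naive Bayes standing in for MALLET’s NaiveBayes trainer; the training tweets and labels below are hypothetical placeholders for the hand-labelled set, not examples from our data.

    # Rough equivalent of the MALLET Naive Bayes step, using scikit-learn as a
    # stand-in tool; the training examples and labels are hypothetical.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    train_texts = ["praying for everyone in kathmandu tonight",
                   "can't wait for the concert next week"]
    train_labels = ["quake", "other"]

    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(train_texts, train_labels)

    def count_quake_tweets(tweets):
        """Count how many of a user's tweets the classifier labels quake-related."""
        if not tweets:
            return 0
        return sum(label == "quake" for label in clf.predict(tweets))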

To find shared interests, we used Latent Dirichlet Allocation (Blei et al., 2003), treating all of a user’s tweets as a single document. We ran LDA in MALLET with various numbers of topics and settled on 80. These topics spanned a broad range of themes: greetings, news, entertainment, and technology, plus four topics directly related to the earthquake. We then looked for connections between onlookers by building an edge list of shared topics, creating a weighted edge between two onlookers if over 25% of both onlookers’ tweets consisted of a shared topic; the edge weight was the product of their affinities to that topic. Using Gephi (Bastian et al., 2009), we then ran a weighted Louvain modularity algorithm (Blondel et al., 2008) over this onlooker graph to generate communities of users.
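
The graph-building step can be sketched as follows, with gensim standing in for MALLET’s LDA and NetworkX’s Louvain implementation standing in for Gephi’s. Where the text does not specify how multiple shared topics between the same pair of users combine, the sketch sums the per-topic products; that choice is an assumption, as are the toy documents.

    # Sketch of the shared-interest graph; gensim and NetworkX stand in for
    # MALLET (LDA) and Gephi (Louvain), which were used in the study itself.
    from itertools import combinations
    from gensim import corpora, models
    import networkx as nx
    from networkx.algorithms.community import louvain_communities  # NetworkX >= 2.8

    # user_docs maps user id -> all of that user's tweets joined into one document.
    user_docs = {"alice": "nepal earthquake relief donate kathmandu",
                 "bob": "one direction tour tickets nepal donate"}

    tokenised = {u: doc.split() for u, doc in user_docs.items()}
    dictionary = corpora.Dictionary(tokenised.values())
    corpus = {u: dictionary.doc2bow(toks) for u, toks in tokenised.items()}
    lda = models.LdaModel(list(corpus.values()), id2word=dictionary, num_topics=80)

    # Per-user topic affinities (share of the user's document assigned to each topic).
    affinity = {u: dict(lda.get_document_topics(bow, minimum_probability=0.0))
                for u, bow in corpus.items()}

    # Weighted edge when both users devote over 25% of their tweets to a shared
    # topic; the weight is the product of their affinities, summed over all
    # qualifying topics (an assumption where the text is silent).
    G = nx.Graph()
    for u, v in combinations(affinity, 2):
        weight = sum(affinity[u][t] * affinity[v][t]
                     for t in affinity[u]
                     if affinity[u][t] > 0.25 and affinity[v].get(t, 0) > 0.25)
        if weight > 0:
            G.add_edge(u, v, weight=weight)

    communities = louvain_communities(G, weight="weight")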

Analysis

This experiment identified 21 communities of onlookers, which we labelled using the strongest topics in each. They are listed in the table below.

ID | Label | Avg. Tweets | Users | Avg. Quake-Related Tweets
0 | Foreign language (Spanish) | 419 | 882 | 9
1 | Japanese Music | 403 | 135 | 5
2 | Greetings | 326 | 199 | 5
3 | Portuguese/Fifth Harmony | 710 | 658 | 7
4 | News Media | 977 | 11 | 1
5 | News Media | 652 | 1312 | 22
6 | News/Politics | 600 | 1236 | 26
7 | Indonesia | 386 | 416 | 8
8 | Foreign language (unknown) | 495 | 312 | 5
9 | Unclassified | 372 | 589 | 25
10 | Dera Sacha Sauda | 1732 | 54 | 205
11 | News about Russia | 780 | 30 | 18
12 | Shopping | 1153 | 226 | 11
13 | Greetings | 476 | 1010 | 11
14 | Greetings | 439 | 1347 | 5
15 | Science and animals | 521 | 108 | 13
16 | One Direction | 388 | 1085 | 10
17 | Foreign language (Italian) | 553 | 47 | 14
18 | TV/Music | 598 | 722 | 5
19 | Music Videos | 649 | 14 | 30
20 | Nepal | 393 | 748 | 125

After pruning the foreign-language communities and those that were difficult to classify (Communities 0, 7–9, and 17), we can group the remaining onlookers into three broad interest groups: Casual Users (Communities 1–3, 12–14, 18, and 19), News and Pundits (Communities 4–6 and 13), and Engaged Users (Community 20). We formed these groups based on the topics that were strongest in each community, but the groups also correlated with the number of quake-related tweets their members sent. They are described in the table below.

Category | Proportion | Quake-Related Tweets/Week | Topic Affinities
Casual Onlookers | 46% | 3 | Entertainment, greetings
News Onlookers | 25% | 6 | News and politics
Engaged Onlookers | 12% | 10 | Nepal

Casual Onlookers. Onlookers in these communities showed high activity but low engagement with the disaster, sending an average of three quake-related tweets per week. Their primary topics of discussion were entertainment or greetings and positive sentiments. This is the largest group.

News Onlookers. These are either the accounts of professional news outlets or those of amateur pundits. Quake-related tweet counts are low in this group as well: users sent an average of six relevant tweets per week. News outlets generally moved from one topic to another quickly, and pundits sustained interest only in the topics that appealed to them.

Engaged Onlookers. This group sent the most quake-related tweets of all users; the strongest LDA topics in this group were two “Nepal earthquake” topics. However, users in this community had few other topics in common with each other.

This breakdown suggests a model for predicting the number of tweets onlookers will send about an event. There will be roughly three categories of onlookers: Casual Onlookers, News Onlookers, and Engaged Onlookers. Casual Onlookers will make up roughly 50% of onlookers and will send only a few tweets over the first three weeks; membership in this group is predicted by an interest in entertainment topics. News Onlookers will be roughly half as numerous as Casual Onlookers but roughly twice as engaged; an onlooker’s affinity to this group is predicted by a general interest in news. Finally, Engaged Onlookers will send 10–20 times as many tweets as Casual Onlookers and will comprise slightly over 10% of onlookers. This group sends the most tweets about an event, but membership in it cannot be predicted from preexisting interests.

Conclusion

We find that it is easy to predict shallow engagement with a disaster on Twitter, but difficult to anticipate sustained interest. Onlookers who tweet about entertainment are likely to pass on at least a few messages about donating money, because entertainers are likely to post these messages and fans are likely to retweet them. On the other hand, onlookers who tweet more about an event are likely to have preexisting interests that intersect with a particular aspect of the disaster, but the relevant interests are hard to predict because doing so would require knowing the nature of the disaster ahead of time. For example, to know that the Hindu extremist politician would tweet rumors of coerced conversions in Nepal, we would have had to predict a crisis that would produce such rumors.

Additionally, we acknowledge that our model is derived from a single case study. As such, we treat it as provisional pending further experiments. We hope to confirm this model with future work.

Bibliography

Bastian, M., Heymann, S. and Jacomy, M. (2009). Gephi: an open source software for exploring and manipulating networks. http://www.aaai.org/ocs/index.php/ICWSM/09/paper/viewFile/154./1009 (accessed 22 February 2016).
Blei, D. M., Ng, A. Y. and Jordan, M. I. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research, 3: 993–1022 (accessed 30 June 2014).
Blondel, V. D., Guillaume, J.-L., Lambiotte, R. and Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10). doi:10.1088/1742-5468/2008/10/P10008 (accessed 17 October 2013).
Bravo, C. A. and Hoffman-Goetz, L. (2015). Tweeting About Prostate and Testicular Cancers: What Are Individuals Saying in Their Discussions About the 2013 Movember Canada Campaign? Journal of Cancer Education, 1(8). doi:10.1007/s13187-015-0838-8 (accessed 20 February 2016).
Burns, R. (2015). Digital Humanitarianism and the Geospatial Web: Emerging Modes of Mapping and the Transformation of Humanitarian Practices. Thesis. https://digital.lib.washington.edu:443/researchworks/handle/1773/33947 (accessed 20 February 2016).
Dugdale, J., Van de Walle, B. and Koeppinghoff, C. (2012). Social media and SMS in the Haiti earthquake. ACM Press, p. 713. doi:10.1145/2187980.2188189. http://dl.acm.org/citation.cfm?doid=2187980.2188189 (accessed 20 February 2016).
Gasso Climent, C. (2015). Twitter as a social marketing tool: modifying tweeting behavior in order to encourage donations. Bachelor’s thesis. http://essay.utwente.nl/68039/ (accessed 20 February 2016).
Gladwell, M. (2011). Outliers: The Story of Success. Reprint edition. Back Bay Books.
Howard, P. N., Duffy, A., Freelon, D., Hussain, M., Mari, W. and Mazaid, M. (2011). Opening Closed Regimes: What Was the Role of Social Media During the Arab Spring? http://ictlogy.net/bibliography/reports/projects.php?idp=2170 (accessed 15 May 2014).
Kaewkitipong, L., Chen, C. C. and Ractham, P. (2016). A community-based approach to sharing knowledge before, during, and after crisis events: A case study from Thailand. Computers in Human Behavior, 54: 653–66. doi:10.1016/j.chb.2015.07.063 (accessed 20 February 2016).
Madianou, M. (2015). Digital Inequality and Second-Order Disasters: Social Media in the Typhoon Haiyan Recovery. Social Media + Society, 1(2). doi:10.1177/2056305115603386 (accessed 20 February 2016).
Malcolm, K. (2016). How Social Media Affects the Annual Fund Revenues of Nonprofit Organizations. Walden Dissertations and Doctoral Studies. http://scholarworks.waldenu.edu/dissertations/2005.
McCallum, A. K. (2002). MALLET: A Machine Learning for Language Toolkit. Amherst, MA: UMass Amherst. http://mallet.cs.umass.edu.
Meng, Q., Zhang, N., Zhao, X., Li, F. and Guan, X. (2015). The governance strategies for public emergencies on social media and their effects: a case study based on the microblog data. Electronic Markets, 26(1): 15–29. doi:10.1007/s12525-015-0202-1 (accessed 20 February 2016).
Morozov, E. (2012). The Net Delusion: The Dark Side of Internet Freedom. Reprint edition. New York: PublicAffairs.
Morozov, E. (2014). To Save Everything, Click Here: The Folly of Technological Solutionism. First Trade Paper edition. New York: PublicAffairs.
Omilion-Hodges, L. M. and McClain, K. L. (2016). University use of social media and the crisis lifecycle: Organizational messages, first information responders’ reactions, reframed messages and dissemination patterns. Computers in Human Behavior, 54: 630–38. doi:10.1016/j.chb.2015.06.002 (accessed 20 February 2016).
Palen, L. (2008). Online social media in crisis events. Educause Quarterly, 31(3): 76–78 (accessed 20 February 2016).
Platt, A., Hood, C. and Citrin, L. (2011). From Earthquakes to ‘#morecowbell’: Identifying Sub-topics in Social Network Communications. 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing (SocialCom). IEEE, pp. 541–44. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6113164 (accessed 17 May 2014).
Pohl, D., Bouchachia, A. and Hellwagner, H. (2013a). Supporting Crisis Management via Detection of Sub-Events in Social Networks. International Journal of Information Systems for Crisis Response and Management (IJISCRAM), 5(3): 20–36 (accessed 17 May 2014).
Pohl, D., Bouchachia, A. and Hellwagner, H. (2013b). Social media for crisis management: clustering approaches for sub-event detection. Multimedia Tools and Applications, pp. 1–32 (accessed 17 May 2014).
Rainie, L., Purcell, K. and Smith, A. (2016). The Social Side of the Internet. Pew Research Center. http://www.pewinternet.org/2011/01/18/the-social-side-of-the-internet/ (accessed 21 February 2016).
Regalado, R. V. J., McHale, K., Dela Cruz, B., Garcia, J. P. F., Ma, C., Kalaw, D. F. and Lu, V. E. (2015). FILIET: An Information Extraction System for Filipino Disaster-Related Tweets. Manila, Philippines: De La Salle University. http://www.dlsu.edu.ph/conferences/dlsu_research_congress/2015/proceedings/SEE/010-HCT_Regalado_RJ.pdf (accessed 20 February 2016).
Shirky, C. (2011). The Political Power of Social Media: Technology, the Public Sphere, and Political Change. Foreign Affairs, 90: 28.
Shulman, S. W. (2009). The Case Against Mass E-mails: Perverse Incentives and Low Quality Public Participation in U.S. Federal Rulemaking. Policy & Internet, 1(1): 23–53. doi:10.2202/1944-2866.1010 (accessed 21 February 2016).
Tufekci, Z. and Wilson, C. (2012). Social Media and the Decision to Participate in Political Protest: Observations From Tahrir Square. Journal of Communication, 62(2): 363–79. doi:10.1111/j.1460-2466.2012.01629.x (accessed 15 May 2014).
Xiao, Y., Huang, Q. and Wu, K. (2015). Understanding social media data for disaster management. Natural Hazards, 79(3): 1663–79. doi:10.1007/s11069-015-1918-0 (accessed 20 February 2016).