Big Crisis Data: Social Media in Disasters and Time-Critical Situations


This wiki page about Big Crisis Data is hosted at the Humanitarian Computing Library. It contains pointers to useful resources about social media in disasters, and it is always under construction. Please feel free to edit and contribute links.


Corpora of social media messages for crisis and disaster research.

Software for natural language processing (NLP)

Programs and libraries for tokenization, part-of-speech tagging, entity extraction, entity linking, and other NLP operations.

In Java:

  • WEKA: open-source data mining software in Java.
  • MALLET: natural language processing and topic modeling.
  • Apache OpenNLP: natural language processing.
  • GATE: text processing.
  • ArkNLP: Twitter-specific natural language processing.

In Python:

  • NLTK: natural language toolkit.
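
As a rough illustration of the tokenization step mentioned above, the following is a minimal sketch that uses only the Python standard library. It is not the API of any tool listed here; the pattern `TOKEN_RE` and the function `tokenize` are hypothetical names introduced for this example, and the sample message is made up. The libraries above handle many more cases (emoticons, abbreviations, other languages).

```python
import re

# Hypothetical pattern for illustration. Order matters: URLs are tried first,
# then @mentions and #hashtags, then ordinary words, then single symbols.
TOKEN_RE = re.compile(
    r"https?://\S+"      # URLs, kept whole
    r"|[@#]\w+"          # @mentions and #hashtags, kept with their prefix
    r"|\w+(?:'\w+)?"     # words, with an optional internal apostrophe
    r"|[^\w\s]"          # any other single non-space character
)


def tokenize(message: str) -> list:
    """Split a social media message into a list of string tokens."""
    return TOKEN_RE.findall(message)


if __name__ == "__main__":
    # Made-up example message.
    print(tokenize("Flooding near #Riverside, stay safe @cityalerts http://example.org/map"))
```

Keeping hashtags, mentions, and URLs as single tokens is a design choice: in crisis messages they often carry the location or the source of the report, so splitting them on punctuation would lose information.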



Software for geographical information systems


Free maps:

  • OpenStreetMap: geographical information, useful for building gazetteers.
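
To make the idea of a gazetteer concrete, here is a minimal sketch of a lookup that maps place names found in a message to coordinates. It does not use OpenStreetMap data or any OpenStreetMap API; the dictionary `GAZETTEER` and the function `find_places` are hypothetical, and the two entries carry approximate coordinates for illustration only. A real gazetteer would be far larger and would need to handle ambiguous and misspelled names.

```python
import re

# Toy gazetteer: lowercase place name -> (latitude, longitude).
# Entries and approximate coordinates are for illustration only.
GAZETTEER = {
    "kathmandu": (27.72, 85.32),
    "port-au-prince": (18.59, -72.31),
}


def find_places(message, gazetteer=GAZETTEER):
    """Return (name, (lat, lon)) pairs for gazetteer names found in the message."""
    text = message.lower()
    found = []
    for name, coords in gazetteer.items():
        # Word boundaries avoid matching a name inside a longer word.
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            found.append((name, coords))
    return found


if __name__ == "__main__":
    # Made-up example message.
    print(find_places("Roads blocked in Kathmandu after the quake"))
```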

Software for crowdsourcing

Free/open systems

Integrated systems that are open and free.


Past and present related conferences and workshops:

  • SWDM'16: Social Web for Disaster Management
  • ISCRAM'16: Information Systems for Crisis Response and Management
  • SAFE'15: Workshop on Semantics and Analytics for Emergency Response
  • KDD-LESI'14: Workshop on Learning About Emergencies from Social Information
  • AAAI Spring Symposium'15: Structured Data for Humanitarian Technologies
  • HUMTEC'16: Humanitarian Technologies

Volunteer organizations

There are many digital volunteering organizations; this list contains a few examples:

Blogs and social media

Blogs and social media accounts of researchers and practitioners working in social innovation in general, and/or social media for disasters in particular.


Talks about big data for emergency/disaster management, and about social media data for research in general.

Related books


  • The Arabic example on page 39 says "al3ab" but should say "al7ob" (thanks to Sallam Abualhaija).
  • Page 110 places the success of Swift and Twain in spreading fake news in the "eighteenth" century; it should say "nineteenth".

Book information

Author(s): Carlos Castillo
Institution: Cambridge University Press
Publication Date: 2016-07-14
Keyword(s): Social Media, Crowdsourcing
Abstract: Social media is an invaluable source of time-critical information during a crisis. However, emergency response and humanitarian relief organizations that would like to use this information struggle with an avalanche of social media messages that exceeds human capacity to process. Emergency managers, decision makers, and affected communities can make sense of social media through a combination of machine computation and human compassion - expressed by thousands of digital volunteers who publish, process, and summarize potentially life-saving information. This book brings together computational methods from many disciplines: natural language processing, semantic technologies, data mining, machine learning, network analysis, human-computer interaction, and information visualization, focusing on methods that are commonly used for processing social media messages under time-critical constraints, and offering more than 500 references to in-depth information.
Type: book