Big Crisis Data: Social Media in Disasters and Time-Critical Situations
This wiki page about Big Crisis Data is hosted at the Humanitarian Computing Library. It contains pointers to useful resources about social media in disasters, and it is always under construction. Please feel free to edit and contribute links.
Corpora of social media messages for crisis and disaster research.
- CrisisLex: several corpora of disaster-related social media messages.
- CredBank: corpus for credibility research.
- HDX: Humanitarian Data eXchange, datasets of humanitarian variables by UN OCHA.
- TREC Microblog Corpus: corpus of social media messages.
- TREC Temporal Summarization Track: corpus for social media update summarization.
Software for natural language processing (NLP)
Programs and libraries for tokenization, part-of-speech tagging, entity extraction, entity linking, and other NLP operations.
- WEKA: open-source data mining software in Java.
- MALLET: natural language processing and topic modeling.
- Apache OpenNLP: natural language processing.
- GATE: text processing.
- ArkNLP: Twitter-specific natural language processing.
- NLTK: natural language toolkit.
- Transliteration of non-Roman scripts.
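As a rough illustration of the first operation mentioned above, tokenization, here is a minimal sketch that uses only the Python standard library. It is not how any of the listed tools work internally. The pattern and the `tokenize` function are assumptions made for this example, and the sample message, handle, and URL are invented. It keeps URLs, user mentions, and hashtags as single tokens, which generic tokenizers often split apart.

```python
import re

# Illustrative pattern (an assumption, not taken from any listed library).
# Order matters: URLs, mentions and hashtags are tried before plain words.
TOKEN_PATTERN = re.compile(
    r"""
    https?://\S+        # URLs, kept whole
    | @\w+              # user mentions
    | \#\w+             # hashtags
    | \w+(?:'\w+)?      # words, with an optional internal apostrophe
    | [^\w\s]           # any other single non-space symbol
    """,
    re.VERBOSE,
)


def tokenize(message: str) -> list[str]:
    """Split a short social media message into tokens."""
    return TOKEN_PATTERN.findall(message)


if __name__ == "__main__":
    # Invented example message; the handle and URL are placeholders.
    print(tokenize("Flooding near #Jakarta, roads closed! @example_agency http://example.org/x"))
```

One known limitation of this sketch is that punctuation directly following a URL is absorbed into the URL token. The Twitter-specific tools listed above handle such cases, and many others, far more carefully.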
Software for geographical information systems
- GeoNames: geographical database of place names, useful for geo-tagging.
- OpenStreetMap: geographical information, useful for building gazetteers.
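A gazetteer is essentially a lookup table from place names to locations. The sketch below shows the idea with a tiny hand-written dictionary. The `GAZETTEER` entries, their rounded coordinates, and the `find_places` function are all assumptions made for illustration. A real gazetteer would be built from a geographical data source such as those listed above and would also need to handle ambiguous names.

```python
import re

# Hypothetical toy gazetteer: lowercase place name -> approximate
# (latitude, longitude), rounded for illustration only.
GAZETTEER = {
    "jakarta": (-6.2, 106.8),
    "kathmandu": (27.7, 85.3),
    "new york": (40.7, -74.0),
}


def find_places(message, gazetteer=GAZETTEER):
    """Return (name, coordinates) pairs for gazetteer names found in the message."""
    text = message.lower()
    found = []
    for name, coords in gazetteer.items():
        # Require non-word characters (or the text boundary) around the name,
        # so that "jakarta" does not match inside "jakartans".
        if re.search(r"(?<!\w)" + re.escape(name) + r"(?!\w)", text):
            found.append((name, coords))
    return found


if __name__ == "__main__":
    # Invented example message.
    print(find_places("Aftershock felt in Kathmandu, also covered by New York media"))
```

Matching on exact names is deliberately simplistic. It misses misspellings and abbreviations, and it cannot tell apart different places that share a name.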
Software for crowdsourcing
Integrated systems that are open and free.
Past and present related conferences and workshops:
- SWDM'16: Social Web for Disaster Management
- ISCRAM'16: Information Systems for Crisis Response and Management
- SAFE'15: Workshop on Semantics and Analytics for Emergency Response
- KDD-LESI'14: Workshop on Learning About Emergencies from Social Information.
- AAAI Spring Symposium'15: Structured Data for Humanitarian Technologies.
- HUMTEC'16: Humanitarian Technologies
There are many digital volunteering organizations; this list contains a few examples:
- Digital Humanitarian Network
- Stand-By Task Force
- Humanity Road
- Geeks Without Borders
- Humanitarian OpenStreetMap Team (HOT)
- Crisis Commons
- Translators Without Borders
Blogs and social media accounts of researchers and practitioners working on social innovation in general, and/or on social media for disasters in particular.
- Andrej Verity - @AndrejVerity
- Patrick Meier - @PatrickMeier
- Heather Leson - @HeatherLeson
Talks about big data for emergency/disaster management, and about social media data for research in general.
- Frontiers in Crisis Informatics by Leysia Palen (2015)
- Changing the World, One Map at a Time by Patrick Meier (2011)
- Big Data Gets Personal by Kate Crawford (2013)
- Algorithmic Issues of Big Data by Kate Crawford (2013)
- Digital Humanitarians: How Big Data is Changing the Face of Humanitarian Response by Patrick Meier (2015)
- Social Media in Disaster Response: How Experience Architects Can Build for Participation by Liza Potts (2013)
- Disasters 2.0: The Application of Social Media Systems for Modern Emergency Management by Adam Crowe (2012)
- Erratum: the Arabic example on page 39 says "al3ab"; it should say "al7ob" (thanks to Sallam Abualhaija).
Institution: Cambridge University Press
Keyword(s): Social Media, Crowdsourcing
Abstract: Social media is an invaluable source of time-critical information during a crisis. However, emergency response and humanitarian relief organizations that would like to use this information struggle with an avalanche of social media messages that exceeds human capacity to process. Emergency managers, decision makers, and affected communities can make sense of social media through a combination of machine computation and human compassion - expressed by thousands of digital volunteers who publish, process, and summarize potentially life-saving information. This book brings together computational methods from many disciplines: natural language processing, semantic technologies, data mining, machine learning, network analysis, human-computer interaction, and information visualization, focusing on methods that are commonly used for processing social media messages under time-critical constraints, and offering more than 500 references to in-depth information.