Ryan Thomson and I experimented with Massmine, a Raspberry Pi, and R to collect and analyze Twitter data for sociological research. We recently wrote about this for the website of the University of Florida Bureau of Economic and Business Research.
(Continues after figure.)
The revolution of informatics, Big Data, and computational social science has reached every corner of the academy and beyond. Social scientists are increasingly trying to use Big Data in their research, particularly from social media and networking websites, but often can’t afford the prepackaged datasets designed for marketing firms and other clients in the private sector. How can we get low-cost access to the large amounts of available, public raw data on online activity?
This is a central question raised by the open science movement, which seeks to make data, statistical programs, findings, and scholarship available to everyone. Over the past twenty years, the R programming language for statistical computing and its large user community have played a prominent role in this movement. Some of the most popular R packages, including twitteR, were developed to make social media data collection and analysis much easier than before.
Another software breakthrough for the collection of social media data came from Massmine, a suite of Mac and Linux tools designed by Aaron Beveridge at the University of Florida and Nicholas Van Horn at Capital University. Massmine allows academics without programming experience to collect social media data from many different sources. At the same time, hardware has become increasingly effective and affordable. A Raspberry Pi, for example, is a $35 single-board computer the size of a credit card: Massmine can run on it and seamlessly collect terabytes of social media data.
A Raspberry Pi micro-computer running software like Massmine has the potential to change the production of knowledge in the social sciences and humanities, opening the world of Big Data to researchers and the public alike. Rather than relying on costly third-party services and specialized marketing data, scholars and community organizations now have the DIY tools to access data and knowledge on online discourses, behaviors and interactions.
The full article is here.