November 25, 2013

This post is the second in a series that examines this year's console launches through the eyes of the Twitterverse. For more on the project, see the description in the previous post.

Fun analysis, thanks for putting all that together!

Once again, I welcome any and all articles from the Quarter to Three Analytics Department.

But I have to wonder how useful Twitter data is in most situations. What specifically comes to mind are the waves Sharknado made in the Twitter-sphere. Everyone talked about it, but no one actually watched it.

I agree with you. Trying to use standard tools to determine sentiment in tweets is like trying to use a cocktail straw to suck a gallon of distilled water out of a cesspool. Nevertheless, with the vast quantities of data, there are some trends that are easy to pick up. One thing that I did not discuss (but you will see if you look at the raw data) is that I stratified by original / raw / total corpus of tweets. In terms of messages being spread, it's relevant (I think?) to look at the total corpus. However, if you want to look for trends sans retweets, the files of "originals" are better source data. Take that with a grain of salt, though, because there are some retweets in the file of originals, too. Were I to start this project again today, I would change the Twitter stream harvesting script to return some different fields that might make it easier to stratify. Maybe that's a good Thanksgiving project.

The other thing one can ask and wonder about is the staying power of a retweet vs. an original tweet, or the impression left by viewing a celebrity tweet. To my knowledge, those are unknowns. I've certainly found a lot of interesting resources and links through Twitter, but most of what shows up on my feed seems like discarded thoughts that nobody wants to read.