I wrote a blog post on the tools to capture data from Twitter back in July, followed up by a post on the challenges of using Twitter as a data source in August, for the London School of Economics and Political Science blog.

I have received some queries regarding historical Twitter data, especially about DiscoverText and Sifter. Using DiscoverText and Sifter, it is possible to retrieve and analyse Twitter data (Firehose).

I thought I would answer some of the typical questions I receive in this post. I hope it is helpful for students across degree programs, and also for academics looking into using Twitter data.

Q – Is DiscoverText easy to use?  A – Yes. I have a non-technical background, and I have found DiscoverText very simple and easy to use. You can find some excellent video tutorials here. Depending on the learner, it could take as little as a few days to start using some of the advanced features. It is possible to manually code tweets, machine-classify them, and de-duplicate them without being an expert in programming! DiscoverText offers a free, no-commitment, 30-day trial, which I would recommend as a way to test out the software.

Q – Is Twitter data expensive? A – It can range from not that expensive to very expensive. It is possible to get a good dataset for just a couple of hundred dollars. I purchased a neat dataset of Ebola tweets for $100.  It is possible to generate a free estimate via Sifter, here.

Q – Is it possible to retrieve Twitter data from previous years? A – Yes, using Sifter it is entirely possible to retrieve Twitter data going back to the very start of Twitter. However, remember that Twitter has introduced new metadata fields over the years, so these may not be present in older datasets. This is a good resource.
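To illustrate the point about missing metadata fields, here is a minimal Python sketch of how you might check an exported dataset before analysing it. It assumes the tweets have already been loaded as a list of dictionaries; the field names and the two helper functions are hypothetical and are not taken from DiscoverText or Sifter, so check them against your own export.

```python
from collections import Counter

# Hypothetical field names, for illustration only; check your own export.
FIELDS = ["id", "text", "created_at", "lang"]


def normalise_tweet(record, fields=FIELDS):
    """Return a dict with every expected field, using None where it is missing."""
    return {name: record.get(name) for name in fields}


def field_coverage(records, fields=FIELDS):
    """Return the fraction of records in which each field is present and not None."""
    counts = Counter()
    total = 0
    for record in records:
        total += 1
        for name in fields:
            if record.get(name) is not None:
                counts[name] += 1
    return {name: (counts[name] / total if total else 0.0) for name in fields}
```

Running `field_coverage` over a dataset gives a quick view of which fields are only partly populated, which is useful to know before building an analysis that depends on them.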

Q – My topic is X, is there any way to retrieve the data for free? A – In almost all cases, the answer to this question is no. This is because it is not possible to share Twitter datasets.

Q – My research question is X. Can Twitter data help address it? A – This can be difficult to answer: in some cases yes, and in some cases no. My advice would be to use Twitter’s advanced search feature (link here), which will display tweets all the way back to Jack’s first tweet. You could therefore use this as a way of seeing tweets before making a commitment to purchase.

Q – What do you use for historical Twitter data?  A – For my PhD project, I have used DiscoverText and Sifter. I am yet to come across a better alternative, especially for academic use. The support is excellent, and I almost always receive a reply to my email queries within a few minutes.

Q – How do you use DiscoverText?  A – I was interviewed by DiscoverText (link here); the interview provides further information on how I have used DiscoverText.

The best person to contact about DiscoverText and Sifter is Dr Stu Shulman.




  1. Hanaa · January 6, 2016

    Thanks, this is incredibly useful information! I wonder if you had any comments about generating and analysing data utilizing Twitter, rather than analysing existing data?

    • Wasim Ahmed · January 10, 2016

      Hi Hanaa, it is possible to both generate and analyse Twitter data within DiscoverText. There are some very powerful features such as machine learning, and de-duplication features. I would highly recommend checking the features out. Mozdeh, NodeXL, Chorus, and COSMOS (which are all free) are desktop tools which can be used to both generate and analyse Twitter data. I recommend taking a look at these too. Hope this helps!


      • Hanaa · February 12, 2016

        thank you!


      • Wasim Ahmed · February 12, 2016

        Glad it was helpful!


