Big Data Text Analysis


These are basic instructions for creating a spam-free subproject in Mozdeh from your original project. There is also a more advanced spam removal technique that involves using the spam marking facilities of Mozdeh.
  1. Start Mozdeh and load your project.
  2. Use the keyword search facility of Mozdeh to browse for spam tweets and note the distinctive words or phrases that they use.
  3. In the Mozdeh search interface, construct a query that matches only spam tweets, then check that the results contain no genuine tweets.
  4. Re-run the query from step 3 after ticking the option to save results to a subproject. This subproject will contain only the spam tweets.
  5. To create a new subproject without the spam tweets, select Manage Subproject from the Subprojects option in the Analyse menu. Then click on the spam subproject just created and click the Invert... button. This will create a new subproject without the spam. Now close Mozdeh and restart it, this time selecting the spam-free subproject to analyse the data without the spam.
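
The logic behind steps 3 to 5 can be sketched in a few lines of Python. This is only an illustration of the idea (match the spam, then keep everything else) and is not how Mozdeh itself is implemented; the function name `split_spam`, the sample tweets and the spam phrases are all hypothetical.

```python
def split_spam(tweets, spam_phrases):
    """Split tweets into (spam, clean) using case-insensitive phrase matching.

    Hypothetical helper for illustration only; not part of Mozdeh.
    """
    lowered = [phrase.lower() for phrase in spam_phrases]
    spam, clean = [], []
    for tweet in tweets:
        text = tweet.lower()
        # A tweet counts as spam if it contains any of the distinctive phrases
        # (the equivalent of the spam query in steps 3 and 4).
        if any(phrase in text for phrase in lowered):
            spam.append(tweet)
        else:
            # Everything not matched is the "inverted" set from step 5.
            clean.append(tweet)
    return spam, clean


# Made-up sample data.
tweets = [
    "Get FREE followers now, click here",
    "Interesting talk on text analysis today",
    "Click here to win a prize",
    "Collecting tweets about the election",
]
spam_phrases = ["free followers", "click here"]

spam, clean = split_spam(tweets, spam_phrases)
print(len(spam), "spam tweets;", len(clean), "tweets kept")
```

As in Mozdeh, the quality of the result depends on the query: a phrase that is too general will also move genuine tweets into the spam set, so it is worth checking the spam matches before inverting.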

Now follow the instructions to analyse the tweets, starting from the second step.

Made by the Statistical Cybermetrics Research Group at the University of Wolverhampton during the CREEN and CyberEmotions EU projects.