Big Data Text Analysis


Analysing tweets in Mozdeh (basic instructions)

The tweets can be explored by searching them, graphing them or automatically scanning them for keyword spikes. Each of these is described briefly below, although their application in research is not discussed here. Whichever is required, begin by starting Mozdeh, selecting the project that was indexed as described above and clicking the Open selected project button.

    1. Searching the tweets. Once Mozdeh is opened as above, the search screen will be shown. If it is not shown, select Search from the Analyze menu (see the arrow pointing to sister in the picture below). Simple queries can be entered into the text box near the bottom of the screen to search the indexed tweets. The results are displayed in chronological order, and there are various options for advanced searching, such as jumping to results from specific days. Clicking on a result reveals the full text of the tweet. The default search operator is OR, so a search for justin bieber matches tweets containing justin, bieber, or both. To match only tweets containing both justin and bieber, search for justin AND bieber instead.
      • It is possible to search by tweeter gender and by the strength of sentiment of the tweets (see below).
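The difference between the default OR search and an explicit AND search can be illustrated with a minimal Python sketch. This is not Mozdeh's own code: the `tokenize` and `matches` functions and the sample tweets are hypothetical, and the matching is simplified to whole lowercase words split on whitespace.

```python
def tokenize(text):
    """Lowercase a tweet and split it into a set of whitespace-separated words."""
    return set(text.lower().split())


def matches(query, tweet):
    """Return True if the tweet matches the query.

    Terms joined by AND must all be present in the tweet;
    otherwise a single matching term is enough (OR).
    """
    words = tokenize(tweet)
    if " AND " in query:
        terms = [t.strip().lower() for t in query.split(" AND ")]
        return all(t in words for t in terms)
    terms = query.lower().split()
    return any(t in words for t in terms)


# Made-up tweets, for illustration only.
tweets = [
    "justin bieber announces tour",
    "justin timberlake new album",
    "nothing to see here",
]

or_hits = [t for t in tweets if matches("justin bieber", t)]
and_hits = [t for t in tweets if matches("justin AND bieber", t)]
```

Here `or_hits` contains the first two tweets, because each mentions justin, while `and_hits` contains only the first, which mentions both terms.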

    1. Creating a time series graph of the tweets. To obtain a time series graph of the tweets, select Graph Time Series from the Mozdeh Analyze menu. [screenshot of graph interface] Enter a blank search and click Create Graph with Boolean Search to generate a graph of the whole corpus. To create a graph of just the tweets matching a particular search, enter the query and click the same button.
    2. Finding words that occur disproportionately often for one topic or gender compared to another. Select Search from the Mozdeh Analyze menu (this is the starting interface), then select the Co-word tab. To find words that are more common in tweets from one topic or gender compared to another, enter the words (or queries) for the topic in the Comma-separated box, separated by commas, select the Compare male vs. female option if wanted, and click the Calculate button. This generates a plain text file that needs to be loaded into a spreadsheet to be read. It lists words that occur in a higher proportion of tweets for one topic or gender than for the other.

    [screenshot of the co-word search interface]

    [screenshot of terms more common in female tweets than in male tweets]
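The idea behind the co-word comparison can be sketched in a few lines of Python. The text above does not say which statistic Mozdeh uses, so this sketch assumes the simplest option: the difference between the proportions of tweets in each group that contain a word. The function names and sample tweets are hypothetical.

```python
from collections import Counter


def doc_proportions(tweets):
    """Map each word to the fraction of tweets that contain it."""
    if not tweets:
        return {}
    counts = Counter()
    for tweet in tweets:
        # Use a set so a word is counted at most once per tweet.
        counts.update(set(tweet.lower().split()))
    n = len(tweets)
    return {word: count / n for word, count in counts.items()}


def disproportionate_words(group_a, group_b, top=5):
    """Rank words by how much more common they are in group_a than in group_b."""
    pa = doc_proportions(group_a)
    pb = doc_proportions(group_b)
    diffs = {word: pa[word] - pb.get(word, 0.0) for word in pa}
    return sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Made-up tweets, for illustration only.
group_a = ["love my sister", "my sister rocks", "great game"]
group_b = ["great game tonight", "the game was great", "my car"]

ranked = disproportionate_words(group_a, group_b)
top_word, top_diff = ranked[0]
```

With these sample tweets, `top_word` is `sister`: it appears in two of the three tweets in the first group and in none of the second. A proportion difference is easy to read but ignores sample size, so a significance test would be a natural refinement for real data.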


    [Advanced] Identifying keyword spikes in the data (may take hours). [screenshot of scanning interface] To obtain a list of the keywords that create the biggest spikes in the data, which are normally associated with significant events within the corpus, select Time Series Scanning from the Mozdeh Analyze menu. Click the Make Time Series File For All Words Matching Conditions button to run the scan, which may take hours, and then load the results into a spreadsheet to interpret them.
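As a rough illustration of what a keyword spike is, the Python sketch below counts how many tweets per day contain each word and scores a word by how far its busiest day rises above its average on the remaining days. The scoring rule is an assumption made for this example, not a description of Mozdeh's algorithm, and the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def daily_word_counts(dated_tweets):
    """Count, for each word, how many tweets per day contain it."""
    counts = defaultdict(Counter)  # word -> {day: number of tweets}
    for day, text in dated_tweets:
        for word in set(text.lower().split()):
            counts[word][day] += 1
    return counts


def spike_score(series, days):
    """Peak daily count minus the mean count over all other days."""
    values = [series.get(day, 0) for day in days]
    peak = max(values)
    rest = values.copy()
    rest.remove(peak)  # drop one occurrence of the peak day
    baseline = sum(rest) / len(rest) if rest else 0.0
    return peak - baseline


# Made-up (day, tweet) pairs, for illustration only.
dated_tweets = [
    ("day1", "quiet day"),
    ("day2", "earthquake hits"),
    ("day2", "big earthquake today"),
    ("day2", "earthquake news"),
    ("day3", "quiet day again"),
]

days = sorted({day for day, _ in dated_tweets})
counts = daily_word_counts(dated_tweets)
ranked = sorted(
    ((word, spike_score(series, days)) for word, series in counts.items()),
    key=lambda kv: kv[1],
    reverse=True,
)
```

In this sample, `earthquake` ranks first because all three of its tweets fall on a single day. Scanning every word in a large corpus in this way is what makes the real operation slow.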

Made by the Statistical Cybermetrics Research Group at the University of Wolverhampton during the CREEN and CyberEmotions EU projects.