Big Data Text Analysis


Keyword Searches and Filters to Explore Texts in Mozdeh

The texts collected by a project can be explored by searching them.

    1. Start Mozdeh, then select and open the project if it is not already open. Once the project is open, the search screen below will be shown.

    2. Enter a query into the text box near the top of the screen. This can be one or more keywords, a phrase search "in quotes", or multiple terms or phrases joined together with the AND or OR commands. The default for searching is OR, so a search for justin bieber will match texts containing either justin or bieber or both. To search for texts containing both justin and bieber, search for justin AND bieber instead.
    3. Click the Search button to search the indexed texts with your query. The first 40+ matches will be displayed in the large box near the bottom of the window. The results are displayed in chronological order. The Mozdeh title bar will also report the total number of matches (17472 in the example below).
    4. If there are more matches than Mozdeh displays in one go, click the Next and Previous buttons to navigate between them.
    5. Clicking on a result will reveal its full text in simplified form. To see the original text, first click on the Show original tweet option.
    6. Try out some of the options for advanced searching, such as jumping to results from specific days. Click on the Search button after each change to apply it.
    7. Mozdeh returns search results in approximately chronological order by default. Try sorting the results in different orders by altering the Sort by value. The Random option may be useful if you want to see what typical texts look like or if you need a random sample for a content analysis.

    8. Try searching for texts written by a specific gender by selecting the gender from the dropdown box (see above) and clicking on the Search button. Mozdeh estimates gender by matching each user's first name against first-name gender data from the 1990 US census. It allocates a gender to about 30% of users: those with an identifiable first name that is 90% male or 90% female in the 1990 US census (accuracy estimates).
    9. Try searching for texts within a given positive and/or negative sentiment range by entering minimum and/or maximum positive and negative sentiment strength values in the sentiment boxes and clicking on the Search button. The scale is from 1 (no sentiment) to 5 (very strong sentiment). For example, a minimum positive sentiment of 3 and a maximum of 5 would generate a list of texts with moderate, strong or very strong positive sentiment.

    11. Try searching for texts written within a specific date range: find the numbers of the first and last dates to search for in the First date to show dropdown box (see above), add them in square brackets after the search term, then click on the Search button. The example below searches for texts containing room in the first three dates (in this case all dates from 2002).
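
To make the search logic in steps 2, 8 and 9 more concrete, here is a rough Python sketch of how such filters could behave. It is not Mozdeh's own code. The function names, the case-insensitive word matching, the restriction to single keywords (no phrases, no mixing of AND and OR), the reading of "90%" as "at least 90%", and the (male, female) count format are all assumptions made for illustration.

```python
import re


def tokens(text):
    """Split a text into lowercase words (case-insensitive matching is an assumption)."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def matches_query(query, text):
    """Return True if the text satisfies a simple keyword query.

    Terms joined with AND must all be present. Otherwise the terms are
    treated as OR, which the instructions describe as the default.
    """
    words = tokens(text)
    if " AND " in query:
        terms = [t.strip().lower() for t in query.split(" AND ")]
        return all(t in words for t in terms)
    terms = [t.lower() for t in query.split() if t != "OR"]
    return any(t in words for t in terms)


def in_sentiment_range(strength, minimum=1, maximum=5):
    """Return True if a sentiment strength on the 1 to 5 scale lies in the range."""
    return minimum <= strength <= maximum


def estimate_gender(first_name, name_counts, threshold=0.9):
    """Guess a gender from hypothetical (male_count, female_count) name data.

    Returns None when the name is unknown or neither gender reaches the
    threshold share (90% here, read as 'at least 90%').
    """
    counts = name_counts.get(first_name.lower())
    if counts is None:
        return None
    male, female = counts
    total = male + female
    if total == 0:
        return None
    if male >= threshold * total:
        return "male"
    if female >= threshold * total:
        return "female"
    return None
```

With this sketch, matches_query("justin bieber", "I saw Justin today") is true because one keyword is enough under OR, whereas matches_query("justin AND bieber", "I saw Justin today") is false because bieber is missing.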

Other searches

Wildcards are allowed in phrase searches in the above interface. For example, "I * you" matches any three-word phrase where the first word is I and the last is you.
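
One way to picture the wildcard rule is as a pattern in which each * stands for exactly one word. The Python sketch below illustrates that idea with a regular expression. It is not how Mozdeh is implemented, and the function names, the case-insensitive matching and the definition of a word (letters, digits and underscores only) are assumptions.

```python
import re


def phrase_to_regex(phrase):
    """Turn a phrase such as 'I * you' into a regex where * matches one word."""
    parts = [r"\w+" if word == "*" else re.escape(word) for word in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def phrase_matches(phrase, text):
    """Return True if the wildcard phrase occurs anywhere in the text."""
    return phrase_to_regex(phrase).search(text) is not None
```

Under these assumptions, phrase_matches("I * you", "I love you so much") is true, while phrase_matches("I * you", "I really love you") is false because two words separate I and you.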

Made by the Statistical Cybermetrics Research Group at the University of Wolverhampton during the CREEN and CyberEmotions EU projects.