Big Data Text Analysis
Since Japanese text does not put spaces between words, Mozdeh uses a helper program, written in Java, to insert them before the analyses. Your computer must therefore be able to run Java programs. Please pilot test the steps below on a few minutes of data first to make sure that everything works on your computer!
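If you want to check for Java before collecting any data, the short Python sketch below looks for the java command on your system path, which is the same lookup that fails with the "'java' is not recognized" error. It is not part of Mozdeh: it assumes that Python 3 is installed, and the function names find_java and java_version_text are made up for this illustration.

```python
import shutil
import subprocess


def find_java(search_path=None):
    """Return the full path of the java launcher, or None if it is not found.

    search_path is a hypothetical convenience parameter: leave it as None to
    search the normal system PATH.
    """
    return shutil.which("java", path=search_path)


def java_version_text(java_path):
    """Run 'java -version' and return what it prints (Java writes this to stderr)."""
    result = subprocess.run([java_path, "-version"], capture_output=True, text=True)
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("Java was not found on PATH; install Java before collecting data.")
    else:
        print("Java found at:", java_path)
        print(java_version_text(java_path))
```

If the script reports that Java was not found, install Java first. A successful check here does not replace the pilot test, since it only confirms that the java command can be started.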
Start Mozdeh and enter a project name (English only please - Japanese characters might crash the text segmentation step).
Enter any number of query terms in Japanese, one per line, and select ja as the language. Click Keep... and then click to start collecting.
When you have finished collecting tweets, click Stop Monitoring.
Accept the option to filter tweets (optional).
Mozdeh will now download a Java program called Segmenter.jar and save it to the moz_data folder that contains your projects. A new "command" window will then open and attempt to run this program on your Japanese tweets. If Java is not installed on your system, this window will show an error message like: 'java' is not recognized as an internal or external command, operable program or batch file (see below). If you see this message, install Java on your computer and start again with a new project.
If you don't see an error message then follow the instructions in the command box (press any key).
Next click OK, make sure that 3 is selected for the language group in the next box, and click OK in the boxes that follow. This may take a long time - days if you have collected weeks of data.
Finally, click Search to show your data and notice that there are now spaces between words. The spaces may not always be in the correct places. Read other parts of this website to find out how to analyse this text.
For example, in the screen below a query has been run and the Mine associations button clicked, with the results shown below right.
Made by the Statistical Cybermetrics Research Group at the University of Wolverhampton during the CREEN and CyberEmotions EU projects.