Big Data Text Analysis


Username anonymisation with Mozdeh

The standard version of Mozdeh offers optional username anonymisation when collecting YouTube comments and Reddit posts. In the compulsory anonymisation version of Mozdeh, this anonymisation is automatic and cannot be switched off. The two versions are otherwise the same. The compulsory anonymisation version is designed for universities that need to comply with ethical requirements when setting social media data collection assignments for their students. Both versions of Mozdeh also offer optional anonymisation when importing data from elsewhere, including previously collected tweets.

To activate anonymisation for YouTube and Reddit in the standard version of Mozdeh, check the anonymisation option on the data collection screen.

To anonymise data imported into Mozdeh with the Import Data button, select the anonymisation options shown after clicking this button.

Important: The anonymisation in Mozdeh is partial. It covers usernames only, not full names. Mozdeh replaces the username of each post author with a number, and it also replaces each @username mentioned in a post with a number. The replacement is consistent: for example, @userbob is always replaced by the same number wherever @userbob is found, whether as the poster or as someone mentioned in a post.

Mozdeh does not attempt to detect and anonymise names. So if someone posted that "Mike Thelwall is an idiot" then it would not be changed, but "@MikeThelwall is an idiot" would be changed to something like "@user1234 is an idiot".
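
The behaviour described above can be illustrated with a short Python sketch. This is not Mozdeh's own code: the class name, the "user" plus number format, the sequential numbering, the case-insensitive matching and the @mention pattern are all assumptions made for illustration. It only shows the general idea of a consistent mapping from usernames to numbers, applied both to post authors and to @usernames inside posts.

```python
import re

# Assumed pattern for @mentions: "@" followed by letters, digits or underscores,
# not preceded by a word character (so e-mail addresses are left alone).
# Real platforms have their own username rules; this is only an approximation.
MENTION = re.compile(r"(?<!\w)@(\w+)")


class UsernameAnonymiser:
    """Hypothetical helper that maps each username to one fixed number."""

    def __init__(self):
        self._ids = {}  # normalised username -> assigned number

    def pseudonym(self, username):
        # Assumption: usernames are compared case-insensitively, without "@".
        key = username.lstrip("@").lower()
        if key not in self._ids:
            self._ids[key] = len(self._ids) + 1  # next unused number
        return f"user{self._ids[key]}"

    def anonymise_post(self, author, text):
        # Replace the author's username, then every @mention in the text,
        # using the same mapping so one user always gets the same number.
        new_author = self.pseudonym(author)
        new_text = MENTION.sub(
            lambda m: "@" + self.pseudonym(m.group(1)), text
        )
        return new_author, new_text


anonymiser = UsernameAnonymiser()

# A full name written without "@" is not detected, so the text is unchanged.
print(anonymiser.anonymise_post("MikeThelwall", "Mike Thelwall is an idiot"))
# ('user1', 'Mike Thelwall is an idiot')

# The same user mentioned with "@" gets the number assigned to them as a poster.
print(anonymiser.anonymise_post("userbob", "@MikeThelwall is an idiot"))
# ('user2', '@user1 is an idiot')
```

As in Mozdeh, a sketch like this leaves any personal names in the text untouched, so the output still needs checking before it is treated as anonymous.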

Made by the University of Wolverhampton during the CREEN and CyberEmotions EU projects and updated at the University of Sheffield.