Applying Machine Learning, Text Mining and Social Network Analysis for matching entities.
Online discussion forums are used by many people to connect, for good and for bad. Unfortunately, these boards can also be used by terrorists and other criminals to connect with one another, and to spread propaganda, as during the recent US Presidential election. In such settings, it is simple for an adversary to create multiple accounts to amplify their presence online and sway public opinion.
This project aims to use methods from Machine Learning, Social Network Analysis and Text Mining to create a probabilistic model that determines whether two users on a discussion forum are in fact the same person. The model can take into account information about posting behaviour, language use, interaction with other users, etc. The intended outcome is a decision-support tool that law enforcement can use to counter criminal activity online.
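As a rough illustration of how such signals could be combined, the sketch below scores a pair of accounts on three similarity features (language use, posting behaviour, interaction with other users) and maps them to a probability with a logistic function. It is not the project's actual model: the account fields (`posts`, `post_hours`, `contacts`), the choice of features, and the weights are all hypothetical, and in practice the weights would be learned from labelled account pairs.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams, a simple proxy for writing style."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two count vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Overlap between two sets of interaction partners."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(u, v):
    """Similarity features for two accounts (field names are assumptions)."""
    return [
        # Language use: character trigram profile of all posts.
        cosine(char_ngrams(" ".join(u["posts"])),
               char_ngrams(" ".join(v["posts"]))),
        # Posting behaviour: distribution of posting hours (0-23).
        cosine(Counter(u["post_hours"]), Counter(v["post_hours"])),
        # Interaction: shared contacts among other users.
        jaccard(set(u["contacts"]), set(v["contacts"])),
    ]


# Hypothetical weights and bias; a real model would fit these to data.
WEIGHTS = [3.0, 2.0, 2.0]
BIAS = -4.0


def same_author_probability(u, v):
    """Logistic combination of the pair features into a score in (0, 1)."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, pair_features(u, v)))
    return 1.0 / (1.0 + math.exp(-z))
```

A score like this would only rank candidate pairs for a human analyst to review, in line with the decision-support role described above.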
The image is used under a Creative Commons licence, with credit to Kevin Dooley on Flickr.