Flame Prediction Based on Harmful Expression Judgement Using Distributed Representation

Abstract

In recent years, flaming, that is, hostile or insulting interaction on social media, has become a problem. To avoid or minimize flaming, it is helpful for a system to automatically check messages before posting and determine whether they include expressions likely to trigger flaming. We target two types of harmful expressions: insulting expressions and expressions likely to cause a quarrel. We first constructed an original harmful-expression dictionary. To minimize the cost of collecting expressions, we built the dictionary semi-automatically using word distributed representations. The method uses distributed representations of harmful expressions and general expressions as features and constructs a classifier of harmful versus general expressions based on these features. An evaluation experiment found that the proposed method extracted harmful expressions with an accuracy of approximately 70%. The method was also able to extract unknown expressions, although it tended to wrongly extract non-harmful expressions as well. It can identify unknown harmful expressions not included in the basic dictionary and can capture semantic relationships among harmful expressions. Although the method cannot presently be applied directly to multi-word expressions, such a capability could be added by introducing time-series learning.
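The core idea, classifying words as harmful or general from their distributed representations, can be sketched as follows. This is a minimal illustration only: the vectors are random stand-ins for real word2vec-style embeddings, the seed lexicons are hypothetical, and the abstract does not specify the classifier used, so a simple nearest-centroid rule stands in here.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 50  # assumed embedding dimensionality

# Stand-in embeddings: harmful and general seed words are drawn
# around different centers so the two classes are separable.
# In the actual method these would be pretrained word vectors for
# dictionary entries.
harmful_vecs = rng.normal(loc=1.0, scale=0.5, size=(4, DIM))
general_vecs = rng.normal(loc=-1.0, scale=0.5, size=(4, DIM))

harmful_centroid = harmful_vecs.mean(axis=0)
general_centroid = general_vecs.mean(axis=0)

def classify(vec):
    """Label a word vector by its nearer class centroid."""
    d_harmful = np.linalg.norm(vec - harmful_centroid)
    d_general = np.linalg.norm(vec - general_centroid)
    return "harmful" if d_harmful < d_general else "general"

# An "unknown" expression whose vector lies near the harmful
# cluster is flagged even though it is absent from the dictionary.
unknown = rng.normal(loc=1.0, scale=0.5, size=DIM)
print(classify(unknown))
```

Because classification operates in embedding space rather than on a fixed word list, expressions never seen in the seed dictionary can still be detected, which matches the behavior reported in the abstract.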