Method of Profanity Detection Using Word Embedding and LSTM

Abstract

As the number of Internet users rises, cyberbullying has increased rapidly. Among the types of cyberbullying, verbal abuse is emerging as the most serious problem, and profanity is being identified and blocked to prevent it. However, users craft their words to evade blocking. Existing profanity detection methods can identify deliberate typos and profanity written with special characters with high accuracy. However, because they cannot grasp the meaning of words or the flow of sentences, they classify standard words such as “Sibaljeom” (starting point, a Korean word that sounds similar to a swear word) and “Saekkibalgalag” (little toe, a Korean word that sounds similar to another swear word) less accurately. To solve this problem, this study proposes a method of detecting profanity using a deep learning model that can grasp the meaning and context of words after separating Hangul into the onset, nucleus, and coda.
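The separation of Hangul into onset, nucleus, and coda can be illustrated with a minimal sketch based on the standard Unicode arithmetic for precomposed Hangul syllables. This is not the authors' code: the function names `decompose_syllable` and `decompose` are hypothetical, and dropping an empty coda and passing non-Hangul characters through unchanged are assumptions made for illustration only.

```python
# Jamo tables in the order used by the Unicode Hangul syllable block.
ONSETS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"        # 19 initial consonants
NUCLEI = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"    # 21 vowels
CODAS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28, index 0 = no coda

HANGUL_BASE = 0xAC00   # first precomposed syllable
HANGUL_LAST = 0xD7A3   # last precomposed syllable


def decompose_syllable(ch):
    """Return (onset, nucleus, coda) for one precomposed Hangul syllable.

    Hypothetical helper: returns None for any other character.
    """
    code = ord(ch)
    if not (HANGUL_BASE <= code <= HANGUL_LAST):
        return None
    index = code - HANGUL_BASE
    onset = index // (21 * 28)          # 588 syllables per onset
    nucleus = (index % (21 * 28)) // 28
    coda = index % 28                   # 0 means the syllable has no coda
    return ONSETS[onset], NUCLEI[nucleus], CODAS[coda]


def decompose(text):
    """Flatten text into a list of jamo tokens.

    Assumptions: an empty coda is dropped, and non-Hangul characters
    (Latin letters, digits, special characters) are kept as-is.
    """
    tokens = []
    for ch in text:
        parts = decompose_syllable(ch)
        if parts is None:
            tokens.append(ch)
        else:
            tokens.extend(p for p in parts if p)
    return tokens
```

A token sequence produced this way could then be fed to an embedding layer and a sequence model; how the original study tokenizes and encodes the jamo is not stated in the abstract.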
