155 research outputs found
Cyberbullying detection: Hybrid models based on machine learning and natural language processing techniques
The rise in web and social media interactions has resulted in the efortless proliferation of offensive language and hate speech. Such online harassment, insults, and attacks are commonly termed cyberbullying. The sheer volume of userâgenerated content has made it challenging to iden-tify such illicit content. Machine learning has wide applications in text classification, and researchers are shifting towards using deep neural networks in detecting cyberbullying due to the several ad-vantages they have over traditional machine learning algorithms. This paper proposes a novel neural network framework with parameter optimization and an algorithmic comparative study of eleven classification methods: four traditional machine learning and seven shallow neural networks on two real world cyberbullying datasets. In addition, this paper also examines the effect of feature extraction and wordâembeddingâtechniquesâbased natural language processing on algorithmic per-formance. Key observations from this study show that bidirectional neural networks and attention models provide high classification results. Logistic Regression was observed to be the best among the traditional machine learning classifiers used. Term FrequencyâInverse Document Frequency (TFâIDF) demonstrates consistently high accuracies with traditional machine learning techniques. Global Vectors (GloVe) perform better with neural network models. BiâGRU and BiâLSTM worked best amongst the neural networks used. The extensive experiments performed on the two datasets establish the importance of this work by comparing eleven classification methods and seven feature extraction techniques. Our proposed shallow neural networks outperform existing stateâofâtheâart approaches for cyberbullying detection, with accuracy and F1âscores as high as ~95% and ~98%, respectively
Deep neural networks for identification of sentential relations
Natural language processing (NLP) is one of the most important technologies in the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate mostly in language: web search, advertisement, emails, customer service, language translation, etc. There are a large variety of underlying tasks and machine learning models powering NLP applications.
Recently, deep learning approaches have obtained exciting performance across a broad array of NLP tasks. These models can often be trained in an end-to-end paradigm without traditional, task-specific feature engineering.
This dissertation focuses on a specific NLP task --- sentential relation
identification. Successfully identifying the relations of two sentences can contribute greatly to some downstream NLP problems. For example, in open-domain question answering, if the system can recognize that a new question is a paraphrase of a previously observed question, the known answers can be returned directly,
avoiding redundant reasoning. For another, it is also helpful to discover some latent knowledge, such as inferring ``the weather is good today'' from another description ``it is sunny today''. This dissertation presents some deep neural networks (DNNs) which are developed to handle this sentential relation identification problem. More specifically, this problem is addressed by this dissertation in the following three aspects.
(i) Sentential relation representation is built on the matching between
phrases of arbitrary lengths. Stacked Convolutional Neural Networks (CNNs) are employed to model the sentences, so that each filter can cover a local phrase, and filters in lower level span shorter phrases and filters in higher level span longer phrases. CNNs in stack enable to model sentence phrases in different granularity and different abstraction.
(ii) Phrase matches contribute differently to the tasks. This motivates us to propose an attention mechanism in CNNs for these tasks, differing from the popular research of attention mechanisms in Recurrent Neural Networks (RNNs). Attention mechanisms are implemented in both convolution layer as well as pooling layer in deep CNNs, in order to figure out automatically which phrase of one sentence matches a specific phrase of the other sentence. These matches are supposed to be indicative to the final decision. Another contribution in terms of attention mechanism is inspired by the observation that some sentential relation identification task, like answer selection for multi-choice question answering, is mainly determined by phrase alignments of stronger degree; in contrast, some tasks such as textual entailment benefit more from the phrase alignments of weaker degree. This motivates us to propose a dynamic
``attentive pooling'' to select phrase alignments of different intensities for different task categories.
(iii) In certain scenarios, sentential relation can only be successfully identified within specific background knowledge, such as the multi-choice question answering based on passage comprehension. In this case, the relation between two sentences (question and answer candidate) depends on not only the semantics in the two sentences, but also the information encoded in the given passage.
Overall, the work in this dissertation models sentential relations in
hierarchical DNNs, different attentions and different background knowledge. All systems got state-of-the-art performances in representative tasks.Die Verarbeitung natĂŒrlicher Sprachen (engl.: natural language processing - NLP) ist eine der wichtigsten Technologien des Informationszeitalters. Weiterhin ist das Verstehen komplexer sprachlicher AusdrĂŒcke ein essentieller
Teil kĂŒnstlicher Intelligenz. Anwendungen von NLP sind ĂŒberall zu finden, da Menschen haupt\-sĂ€ch\-lich ĂŒber Sprache kommunizieren: Internetsuchen, Werbung, E-Mails, Kundenservice, Ăbersetzungen, etc. Es gibt eine groĂe Anzahl Tasks und Modelle des maschinellen Lernens fĂŒr NLP-Anwendungen.
In den letzten Jahren haben Deep-Learning-AnsĂ€tze vielversprechende Ergebnisse fĂŒr eine groĂe Anzahl verschiedener NLP-Tasks erzielt. Diese Modelle können oft end-to-end trainiert werden, kommen also ohne auf den Task zugeschnittene Feature aus.
Diese Dissertation hat einen speziellen NLP-Task als Fokus: Sententielle Relationsidentifizierung. Die Beziehung zwischen zwei SĂ€tzen erfolgreich zu erkennen, kann die Performanz fĂŒr nachfolgende NLP-Probleme stark verbessern. FĂŒr open-domain question answering, zum Beispiel, kann ein System, das erkennt, dass eine neue Frage eine Paraphrase einer bereits gesehenen Frage ist, die be\-kann\-te Antwort direkt zurĂŒckgeben und damit mehrfaches
Schlussfolgern vermeiden. Zudem ist es auch hilfreich, zu Grunde liegendes Wissen zu entdecken, so wie das SchlieĂen der Tatsache "das Wetter ist gut" aus der Beschreibung "es ist heute sonnig". Diese Dissertation stellt einige tiefe neuronale Netze (eng.: deep neural networks - DNNs) vor, die speziell fĂŒr das Problem der sententiellen Re\-la\-tions\-i\-den\-ti\-fi\-zie\-rung entwickelt wurden. Im Speziellen wird dieses Problem in dieser Dissertation unter den
folgenden drei Aspekten behandelt: (i) Sententielle Relationsrepr\"{a}sentationen basieren auf einem Matching zwischen Phrasen beliebiger LĂ€nge. Tiefe
convolutional neural networks (CNNs) werden verwendet, um diese SĂ€tze zu modellieren, sodass jeder Filter eine lokale Phrase abdecken kann, wobei Filter in niedrigeren Schichten kĂŒrzere und Filter in höheren Schichten lĂ€ngere Phrasen umfassen. Tiefe CNNs machen es möglich, SĂ€tze in unterschiedlichen GranularitĂ€ten und Abstraktionsleveln zu modellieren. (ii) Matches zwischen Phrasen tragen unterschiedlich zu unterschiedlichen Tasks bei. Das motiviert uns, einen Attention-Mechanismus fĂŒr CNNs fĂŒr diese Tasks einzufĂŒhren, der sich von dem bekannten Attention-Mechanismus fĂŒr recurrent neural networks
(RNNs) unterscheidet. Wir implementieren Attention-Mechanismen sowohl im convolution layer als auch im pooling layer tiefer CNNs, um herauszufinden, welche Phrasen eines Satzes bestimmten Phrasen eines anderen Satzes entsprechen. Wir erwarten, dass solche Matches die finale Entscheidung stark beeinflussen. Ein anderer Beitrag zu Attention-Mechanismen
wurde von der Beobachtung inspiriert, dass einige
sententielle Relationsidentifizierungstasks, zum Beispiel die Auswahl einer Antwort fĂŒr multi-choice question answering hauptsĂ€chlich von Phrasen\-a\-lignie\-rungen stĂ€rkeren Grades bestimmt werden. Im Gegensatz dazu profitieren andere Tasks wie textuelles SchlieĂen mehr von Phrasenalignierungen schwĂ€cheren Grades. Das motiviert uns, ein dynamisches "attentive pooling" zu entwickeln, um Phrasenalignierungen verschiedener StĂ€rken fĂŒr verschiedene
Taskkategorien auszuwÀhlen. (iii) In bestimmten Szenarien können sententielle Relationen nur mit entsprechendem Hintergrundwissen erfolgreich identifiziert werden, so wie multi-choice question answering auf der Grundlage des VerstÀndnisses eines Absatzes. In diesem Fall hÀngt die Relation zwischen zwei SÀtzen (der Frage und der möglichen Antwort) nicht nur von der Semantik der beiden SÀtze, sondern auch von der in dem gegebenen Absatz enthaltenen
Information ab.
Insgesamt modellieren die in dieser Dissertation enthaltenen Arbeiten sententielle Relationen in hierarchischen DNNs, mit verschiedenen Attention-Me\-cha\-nis\-men und wenn unterschiedliches Hintergrundwissen zur Verf\ {u}gung steht. Alle Systeme erzielen state-of-the-art Ergebnisse fĂŒr die entsprechenden Tasks
Tackling Sequence to Sequence Mapping Problems with Neural Networks
In Natural Language Processing (NLP), it is important to detect the
relationship between two sequences or to generate a sequence of tokens given
another observed sequence. We call the type of problems on modelling sequence
pairs as sequence to sequence (seq2seq) mapping problems. A lot of research has
been devoted to finding ways of tackling these problems, with traditional
approaches relying on a combination of hand-crafted features, alignment models,
segmentation heuristics, and external linguistic resources. Although great
progress has been made, these traditional approaches suffer from various
drawbacks, such as complicated pipeline, laborious feature engineering, and the
difficulty for domain adaptation. Recently, neural networks emerged as a
promising solution to many problems in NLP, speech recognition, and computer
vision. Neural models are powerful because they can be trained end to end,
generalise well to unseen examples, and the same framework can be easily
adapted to a new domain.
The aim of this thesis is to advance the state-of-the-art in seq2seq mapping
problems with neural networks. We explore solutions from three major aspects:
investigating neural models for representing sequences, modelling interactions
between sequences, and using unpaired data to boost the performance of neural
models. For each aspect, we propose novel models and evaluate their efficacy on
various tasks of seq2seq mapping.Comment: PhD thesi
- âŠ