Transforming unstructured voice and text data into insight for paramedic emergency service using recurrent and convolutional neural networks
Paramedics often have to make lifesaving decisions within a limited time in
an ambulance. They sometimes ask the doctor for additional medical
instructions, during which valuable time passes for the patient. This study
aims to automatically fuse voice and text data to provide tailored situational
awareness information to paramedics. To train and test speech recognition
models, we built a bidirectional deep recurrent neural network (long short-term
memory (LSTM)). Then we used convolutional neural networks on top of
custom-trained word vectors for sentence-level classification. Each
sentence is automatically assigned to one of four classes: patient
status, medical history, treatment plan, and medication reminder.
Incident reports are then generated automatically, extracting keywords to
assist paramedics and physicians in making decisions. We found that the
proposed system can provide timely medication notifications based on
unstructured voice and text data, which is not currently possible in
paramedic emergencies. In addition, the automatic incident report
generation provided by the proposed system improves the routine but
error-prone documentation tasks of paramedics and doctors, helping them
focus on patient care.
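The sentence-classification stage described above follows the familiar pattern of convolving over word vectors and max-pooling over time. A minimal numpy sketch of that forward pass is shown below; all dimensions, weights, and the `sentence_cnn` helper are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumptions; the abstract does not give hyperparameters).
seq_len, emb_dim = 12, 50        # sentence length x word-vector size
num_filters, window = 8, 3       # 1-D conv filters over 3-word windows
num_classes = 4                  # patient status, medical history,
                                 # treatment plan, medication reminder

def sentence_cnn(embeddings, W_conv, b_conv, W_out, b_out):
    """Convolve over word windows, max-pool over time, then classify
    the pooled feature vector into one of the four classes."""
    L, _ = embeddings.shape
    feats = np.empty((L - window + 1, num_filters))
    for t in range(L - window + 1):
        patch = embeddings[t:t + window].ravel()           # stacked window
        feats[t] = np.maximum(patch @ W_conv + b_conv, 0)  # ReLU
    pooled = feats.max(axis=0)                             # max-over-time
    logits = pooled @ W_out + b_out
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                                 # class probabilities

x = rng.standard_normal((seq_len, emb_dim))
W_conv = rng.standard_normal((window * emb_dim, num_filters)) * 0.1
b_conv = np.zeros(num_filters)
W_out = rng.standard_normal((num_filters, num_classes)) * 0.1
b_out = np.zeros(num_classes)

probs = sentence_cnn(x, W_conv, b_conv, W_out, b_out)
print(probs.shape)  # one probability per class
```

The max-over-time pooling is what lets sentences of different lengths map to a fixed-size feature vector before classification.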
Author profiling with bidirectional RNNs using attention with GRUs : notebook for PAN at CLEF 2017
This paper describes our approach for the Author Profiling Shared Task at PAN 2017. The goal was to classify the gender and language variety of a Twitter user solely from their tweets. Author profiling can be applied in various fields such as marketing, security, and forensics. Twitter already uses similar techniques to deliver personalized advertisements to its users. PAN 2017 provided a corpus for this purpose in four languages: English, Spanish, Portuguese, and Arabic.
To solve the problem we used a deep learning approach, which has recently shown strong results in Natural Language Processing. Our submitted model consists of a bidirectional Recurrent Neural Network implemented with a Gated Recurrent Unit (GRU) combined with an attention mechanism. We achieved an average accuracy over all languages of 75.31% in gender classification and 85.22% in language variety classification.
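The bidirectional GRU with attention described above can be sketched in a few lines of numpy: run a GRU in each direction, concatenate the per-step states, and let a learned scoring vector weight them into one fixed-size representation. Everything here (sizes, the `bigru_attention` helper, random weights) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
hid, emb = 4, 6   # tiny sizes purely for illustration (assumptions)

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """Standard GRU update for one time step."""
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    z = sig(x @ Wz + h @ Uz)                 # update gate
    r = sig(x @ Wr + h @ Ur)                 # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)
    return (1 - z) * h + z * h_tilde

def bigru_attention(xs, params_f, params_b, v):
    """Forward and backward GRU passes, concatenated states,
    then attention pooling with scoring vector v."""
    h_f, h_b = np.zeros(hid), np.zeros(hid)
    fwd, bwd = [], []
    for x in xs:
        h_f = gru_cell(x, h_f, *params_f); fwd.append(h_f)
    for x in xs[::-1]:
        h_b = gru_cell(x, h_b, *params_b); bwd.append(h_b)
    H = np.hstack([np.array(fwd), np.array(bwd[::-1])])   # (T, 2*hid)
    scores = np.tanh(H) @ v                               # one score per step
    alpha = np.exp(scores - scores.max()); alpha /= alpha.sum()
    return alpha @ H                                      # weighted sum

def make_params():
    # (Wz, Uz, Wr, Ur, Wh, Uh) in the order gru_cell expects
    return tuple(rng.standard_normal(s) * 0.3
                 for s in [(emb, hid), (hid, hid)] * 3)

xs = rng.standard_normal((5, emb))                        # 5 token embeddings
rep = bigru_attention(xs, make_params(), make_params(),
                      rng.standard_normal(2 * hid))
print(rep.shape)  # fixed-size tweet representation
```

A linear layer over `rep` would then produce the gender or language-variety prediction; attention lets the model weight the most informative tokens instead of relying only on the final hidden state.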
An Empirical Study of Offensive Language in Online Interactions
In the past decade, usage of social media platforms has increased significantly. People use these platforms to connect with friends and family and to share information, news, and opinions. Platforms such as Facebook and Twitter are often used to propagate offensive and hateful content online. The open nature and anonymity of the internet fuel aggressive and inflamed conversations. Companies and federal institutions are striving to make social media cleaner, more welcoming, and unbiased. In this study, we first explore the underlying topics in popular offensive language datasets using statistical and neural topic modeling. The current state-of-the-art models for aggression detection only present a toxicity score based on the entire post. Content moderators often have to deal with lengthy texts without any word-level indicators. We propose a neural transformer approach for detecting the tokens that make a particular post aggressive. The pre-trained BERT model has achieved state-of-the-art results in various natural language processing tasks. However, the model is trained on general-purpose corpora and lacks aggressive social media linguistic features. We propose fBERT, a retrained BERT model trained on over a million offensive tweets from the SOLID dataset. We demonstrate the effectiveness and portability of fBERT over BERT in various shared offensive language detection tasks. We further propose a new multi-task aggression detection (MAD) framework for post- and token-level aggression detection using neural transformers. The experiments confirm the effectiveness of the multi-task learning model over individual models, particularly when the amount of training data is limited.
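The multi-task setup described above amounts to two classification heads sharing one encoder: a pooled head scores the whole post, while a per-token head marks the words that make it aggressive. The numpy sketch below illustrates only that head structure; the random features standing in for fBERT's contextual embeddings, the head weights, and the 0.5 threshold are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T, d = 6, 8   # tokens x encoder hidden size (illustrative assumptions)

# Stand-in for contextual token embeddings from a shared transformer
# encoder such as fBERT (random features here, not real model output).
H = rng.standard_normal((T, d))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Two task heads on top of the shared representation (multi-task setup):
W_post = rng.standard_normal((d, 2)) * 0.1   # post-level: offensive / not
W_tok  = rng.standard_normal((d, 2)) * 0.1   # token-level: aggressive / not

post_probs  = softmax(H.mean(axis=0) @ W_post)  # pooled post prediction
token_probs = softmax(H @ W_tok)                # one prediction per token

# Tokens a moderator interface could highlight (assumed 0.5 threshold):
flagged = np.where(token_probs[:, 1] > 0.5)[0]
print(post_probs.shape, token_probs.shape)
```

Training both heads jointly lets the token-level task borrow signal from the more plentiful post-level labels, which is why the multi-task model helps most when token-level training data is scarce.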
- …