354,089 research outputs found

    Automatic Detection and Classification of Argument Components using Multi-task Deep Neural Network

    In this article we propose a novel method for automatically extracting and classifying argument components from raw texts. We introduce a multi-task deep learning framework that exploits weight parameters trained on simple auxiliary tasks, such as part-of-speech tagging or chunking, to solve more complex tasks that require a fine-grained understanding of natural language. Our results show that advanced deep learning techniques framed in a multi-task setting can compete with state-of-the-art systems that depend on handcrafted features.

    Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

    Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that represents the meaning of documents by embedding them in a low-dimensional vector space, while preserving the distinctions of word and sentence order that are crucial for capturing nuanced semantics. Our model is based on an extended Dynamic Convolution Neural Network, which learns convolution filters at both the sentence and document level, hierarchically learning to capture and compose low-level lexical features into high-level semantic concepts. We demonstrate the effectiveness of this model on a range of document modelling tasks, achieving strong results with no feature engineering and with a more compact model. Inspired by recent advances in visualising deep convolution networks for computer vision, we present a novel visualisation technique for our document networks which not only provides insight into their learning process, but can also be interpreted to produce a compelling automatic summarisation system for texts.

    Multimodal Visual Concept Learning with Weakly Supervised Techniques

    Despite the availability of a huge amount of video data accompanied by descriptive texts, it is not always easy to exploit the information contained in natural language in order to automatically recognize video concepts. Towards this goal, in this paper we use textual cues as a means of supervision, introducing two weakly supervised techniques that extend the Multiple Instance Learning (MIL) framework: Fuzzy Sets Multiple Instance Learning (FSMIL) and Probabilistic Labels Multiple Instance Learning (PLMIL). The former encodes the spatio-temporal imprecision of the linguistic descriptions with Fuzzy Sets, while the latter models different interpretations of each description's semantics with Probabilistic Labels; both are formulated through a convex optimization algorithm. In addition, we provide a novel technique, based on semantic similarity computations, to extract weak labels in the presence of complex semantics. We evaluate our methods on two distinct problems, namely face and action recognition, in the challenging and realistic setting of movies accompanied by their screenplays, contained in the COGNIMUSE database. We show that, on both tasks, our method considerably outperforms a state-of-the-art weakly supervised approach, as well as other baselines. Comment: CVPR 201

    Syntactic manipulation for generating more diverse and interesting texts

    Natural Language Generation plays an important role in the domain of dialogue systems, as it determines how users perceive the system. Recently, deep-learning-based systems have been proposed to tackle this task, as they generalize better and require less manual effort to implement for new domains. However, deep learning systems usually adopt a very homogeneous-sounding writing style which expresses little variation. In this work, we present our system for Natural Language Generation, in which we control various aspects of the surface realization in order to increase the lexical variability of the utterances, such that they sound more diverse and interesting. For this, we use a Semantically Controlled Long Short-term Memory Network (SCLSTM), and apply its specialized cell to control various syntactic features of the generated texts. We present an in-depth human evaluation in which we show the effects of these surface manipulations on the perception of potential users.

    Text Classification: A Perspective of Deep Learning Methods

    In recent years, with the rapid growth of information on the Internet, the number of complex texts and documents has increased exponentially. Classifying these texts accurately requires a deeper understanding of deep learning methods, which have therefore become increasingly important in text classification. Text classification is a class of tasks that automatically assigns a set of documents to multiple predefined categories based on their content and subject matter. Its main goal is to enable users to extract information from textual resources and to combine steps such as retrieval, classification, and machine learning techniques in order to sort texts into different categories. Many new deep learning techniques have already achieved excellent results in natural language processing. The success of these learning algorithms relies on their ability to understand complex models and non-linear relationships in data. However, finding the right structure, architecture, and techniques for text classification remains a challenge for researchers. This paper introduces deep-learning-based text classification algorithms, including the important steps required for text classification tasks, such as feature extraction, feature reduction, and evaluation strategies and methods. At the end of the article, different deep learning text classification methods are compared and summarized.
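
    The three steps named in the abstract above (feature extraction, feature reduction, evaluation) can be illustrated with a minimal sketch. This is not the paper's method: it assumes a bag-of-words representation, a document-frequency cutoff for feature reduction, and a multinomial naive Bayes classifier swapped in for a deep model so that the example needs only the standard library. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def extract_features(text):
    """Feature extraction (assumed): lowercase bag-of-words token counts."""
    return Counter(text.lower().split())


def select_vocabulary(docs, min_df=2):
    """Feature reduction (assumed): keep tokens seen in at least min_df documents."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(extract_features(doc)))
    return {tok for tok, n in doc_freq.items() if n >= min_df}


def train_naive_bayes(docs, labels, vocab):
    """Stand-in classifier: multinomial naive Bayes over the reduced vocabulary."""
    class_counts = Counter(labels)
    token_counts = {c: Counter() for c in class_counts}
    for doc, label in zip(docs, labels):
        for tok, n in extract_features(doc).items():
            if tok in vocab:
                token_counts[label][tok] += n
    return class_counts, token_counts, vocab


def predict(model, text):
    """Return the class with the highest add-one-smoothed log-probability."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for c in sorted(class_counts):
        score = math.log(class_counts[c] / total_docs)
        denom = sum(token_counts[c].values()) + len(vocab)
        for tok, n in extract_features(text).items():
            if tok in vocab:
                score += n * math.log((token_counts[c][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label


def accuracy(model, docs, labels):
    """Evaluation: fraction of documents whose predicted label matches the gold label."""
    hits = sum(predict(model, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


# Toy data, invented purely for illustration.
train_docs = ["good great film", "great good acting", "bad awful film", "awful bad plot"]
train_labels = ["pos", "pos", "neg", "neg"]

vocab = select_vocabulary(train_docs, min_df=2)
model = train_naive_bayes(train_docs, train_labels, vocab)
```

    A deep-learning pipeline of the kind the paper surveys would replace the counting and naive Bayes steps with learned representations, but the overall flow of extraction, reduction, training, and evaluation keeps the same shape.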

    Classifiers and text mining: application to a specific context

    The constant growth of social networks has not only brought us new ways of interacting with each other, but has also led to a severe increase in negative behaviours: hate speech, racism, gender harassment, cyberbullying, etc. Manually detecting this kind of behaviour in millions of daily social media posts is out of the question. The solution lies in developing intelligent systems to automate such detection tasks. As the nature of these texts is completely subjective, this problem falls under the field of sentiment analysis, which aims to systematically identify and study affective states and subjective information in textual data using natural language processing techniques. In particular, this project focuses on researching different machine learning techniques related to natural language processing, in order to automate the reliable detection and classification of sexist behaviours in social media texts. We tackle the task of adequately processing the data extracted from social media, and research various text classification techniques and models that we use to develop and evaluate a variety of classifiers. Bachelor's thesis (UDC.FIC). Computer Engineering. Academic year 2021/202

    Understanding Climate Legislation Decisions with Machine Learning

    Effective action is crucial in order to avert climate disaster. Key to enacting change is the swift adoption of climate-positive legislation which advocates for climate change mitigation and adaptation. This is because government legislation can have far-reaching impact, due to the relationships between climate policy, technology, and market forces. To advocate for legislation, current strategies aim to identify potential levers and obstacles, presenting an opportunity for the application of recent advances in machine learning language models. Here we propose a machine learning pipeline to analyse climate legislation, aiming to investigate the feasibility of natural language processing for the classification of climate legislation texts, to predict policy voting outcomes. By providing a model of the decision-making process, the proposed pipeline can enhance transparency and aid policy advocates and decision makers in understanding legislative decisions, thereby providing a tool to monitor and understand legislative decisions towards climate-positive impact.