27 research outputs found

    Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

    Argumentation Mining System for Corpus-based Discourse Analysis based on Structured Arguments

    An argumentation mining (AM) system can analyze large volumes of text from a variety of sources. It is now highly useful in business, economics, and finance, with digital marketing and social media the most promising fields. AM is a form of corpus-based discourse analysis that involves the automatic identification of argumentative structure in text. At its core, AM extracts structured arguments from natural text, which is often unstructured or noisy. The work considers theoretical approaches to AM and pragmatic schemes that meet the needs of social-media-generated data, recognizing the need for more flexible and expandable schemes that can adjust to the argumentation conditions found in social media. In this setting it is very challenging to design an argumentation scheme that identifies the distinct sub-tasks and captures the needs of social media text, which again points to the need for a more flexible and extensible framework. Corpus-based machine learning of linguistic annotations has enabled researchers to identify recurring patterns of language use and to uncover hidden meaning in all areas of Natural Language Processing.

    Multilingual Universal Sentence Encoder for Semantic Retrieval

    We introduce two pre-trained, retrieval-focused multilingual sentence encoding models, based respectively on the Transformer and CNN model architectures. The models embed text from 16 languages into a single semantic space using a multi-task trained dual-encoder that learns tied representations using translation-based bridge tasks (Chidambaram et al., 2018). The models provide performance that is competitive with the state of the art on semantic retrieval (SR), translation pair bitext retrieval (BR), and retrieval question answering (ReQA). On English transfer learning tasks, our sentence-level embeddings approach, and in some cases exceed, the performance of monolingual, English-only sentence embedding models. Our models are made available for download on TensorFlow Hub. Comment: 6 pages, 6 tables, 2 listings, and 1 figure

    A Case Study and Qualitative Analysis of Simple Cross-Lingual Opinion Mining

    User-generated content from social media is produced in many languages, making it technically challenging to compare the themes discussed in one domain across different cultures and regions. Such comparison is relevant for domains in a globalized world, such as market research, where people from two nations and markets might have different requirements for a product. We propose a simple, modern, and effective method for building a single topic model with sentiment analysis capable of covering multiple languages simultaneously, based on a pre-trained state-of-the-art deep neural network for natural language understanding. To demonstrate its feasibility, we apply the model to newspaper articles and user comments of a specific domain, i.e., organic food products and related consumption behavior. The themes match across languages. Additionally, we obtain a high proportion of stable and domain-relevant topics, a meaningful relation between topics and their respective textual contents, and an interpretable representation for social media documents. Marketing can potentially benefit from our method, since it provides an easy-to-use means of addressing specific customer interests from different market regions around the globe. For reproducibility, we provide the code, data, and results of our study. Comment: 10 pages, 2 tables, 5 figures, full paper, peer-reviewed, published at KDIR/IC3k 2021 conference