Parsing Argumentation Structures in Persuasive Essays
In this article, we present a novel approach for parsing argumentation
structures. We identify argument components using sequence labeling at the
token level and apply a new joint model for detecting argumentation structures.
The proposed model globally optimizes argument component types and
argumentative relations using integer linear programming. We show that our
model considerably improves the performance of base classifiers and
significantly outperforms challenging heuristic baselines. Moreover, we
introduce a novel corpus of persuasive essays annotated with argumentation
structures. We show that our annotation scheme and annotation guidelines
successfully guide human annotators to substantial agreement. This corpus and
the annotation guidelines are freely available for ensuring reproducibility and
to encourage future research in computational argumentation.

Comment: Under review in Computational Linguistics. First submission: 26
October 2015. Revised submission: 15 July 201
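The joint model described above casts decoding as a global optimization over component types and argumentative relations. As a purely illustrative sketch (not the authors' ILP implementation), the idea can be shown with a brute-force search over a tiny search space; all scores, labels and the structural constraint here are invented for illustration:

```python
# Toy sketch of joint global inference over argument component types and
# support relations, in the spirit of the ILP formulation described above.
# Instead of an actual integer linear program we brute-force the (tiny)
# search space; scores and the constraint set are invented for illustration.
from itertools import product

TYPES = ("claim", "premise")

def joint_decode(type_scores, rel_scores):
    """Pick types and support links maximizing total score, subject to the
    toy constraint that every premise supports exactly one other component
    and at least one component is a claim."""
    n = len(type_scores)
    best, best_score = None, float("-inf")
    for types in product(TYPES, repeat=n):
        if "claim" not in types:  # at least one claim must anchor the structure
            continue
        score = sum(type_scores[i][t] for i, t in enumerate(types))
        links, ok = {}, True
        for i, t in enumerate(types):
            if t != "premise":
                continue
            # each premise attaches to its highest-scoring target
            cands = [(rel_scores.get((i, j), float("-inf")), j)
                     for j in range(n) if j != i]
            s, j = max(cands)
            if s == float("-inf"):
                ok = False
                break
            score += s
            links[i] = j
        if ok and score > best_score:
            best, best_score = (types, links), score
    return best

# Two components: the base classifier slightly prefers "claim" for both, but
# the strong support relation 1 -> 0 flips component 1 to "premise" globally.
type_scores = [{"claim": 2.0, "premise": 0.5},
               {"claim": 1.0, "premise": 0.9}]
rel_scores = {(1, 0): 1.5, (0, 1): 0.1}
types, links = joint_decode(type_scores, rel_scores)
```

The point of the example is that the relation score overrides the locally preferred label, which is exactly the kind of base-classifier error a global objective can correct.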
Self-Adaptive Hierarchical Sentence Model
The ability to accurately model a sentence at varying stages (e.g.,
word-phrase-sentence) plays a central role in natural language processing. As
an effort towards this goal we propose a self-adaptive hierarchical sentence
model (AdaSent). AdaSent effectively forms a hierarchy of representations from
words to phrases and then to sentences through recursive gated local
composition of adjacent segments. We design a competitive mechanism (through
gating networks) to allow the representations of the same sentence to be
engaged in a particular learning task (e.g., classification), thereby
mitigating the vanishing-gradient problem persistent in other recursive
models. Both qualitative and quantitative analyses show that AdaSent can
automatically form and select the representations suitable for the task at
hand during training, yielding superior classification performance over
competitor models on 5 benchmark data sets.

Comment: 8 pages, 7 figures, accepted as a full paper at IJCAI 201
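The recursive gated composition of adjacent segments can be sketched numerically. This is a toy stand-in, not AdaSent itself: the gate and the composition function below are invented placeholders for the learned gating networks and composition in the paper, but the hierarchy-building loop mirrors the word-phrase-sentence structure described above:

```python
# Minimal numeric sketch of gated local composition of adjacent segments,
# loosely following the hierarchy described above. Gate and composition are
# hand-written toys standing in for the paper's learned networks.
def gate(left, right):
    # toy gate in (0, 1]: per-dimension similarity of the two segments
    return [1.0 / (1.0 + abs(a - b)) for a, b in zip(left, right)]

def compose(left, right):
    # toy composition: element-wise average of the two segment vectors
    return [(a + b) / 2.0 for a, b in zip(left, right)]

def build_hierarchy(word_vecs):
    """Return all levels, from word representations up to one sentence
    vector. Each level mixes adjacent segments as g*compose + (1-g)*left."""
    levels, current = [word_vecs], word_vecs
    while len(current) > 1:
        nxt = []
        for left, right in zip(current, current[1:]):
            g, c = gate(left, right), compose(left, right)
            nxt.append([gi * ci + (1 - gi) * li
                        for gi, ci, li in zip(g, c, left)])
        levels.append(nxt)
        current = nxt
    return levels

# three 2-d "word vectors" collapse through a 2-segment level to 1 sentence vector
levels = build_hierarchy([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each level is one segment shorter than the one below it, so n words yield n levels ending in a single sentence representation; in AdaSent these intermediate levels are what the competitive mechanism selects among per task.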
Does working memory capacity predict literal and inferential comprehension of bilinguals' digital reading in a multitasking setting?
The ubiquity of multitasking has led researchers to investigate its potential costs for reading and learning (Clinton-Lisell, 2021). While some studies have not shown detrimental effects of multitasking on reading comprehension (Bowman et al., 2010; Cho et al., 2015; Pashler et al., 2013), one study has found a benefit of multitasking (Tran et al., 2013). These results, nevertheless, do not converge with the findings of recent meta-analyses, which have suggested both a negative effect of multitasking on reading comprehension (Clinton-Lisell, 2021) and a disruptive effect of listening to lyrical music while reading for comprehension (Vasilev et al., 2018). Previous research converges with theories of how working memory copes with the complexity of reading as a process, since several subprocesses must be orchestrated so that the ultimate goal of reading – the construction of a mental representation – is fully achieved (Tomitch, 2020). In addition, no previous study has investigated reading as a multilevel construct in which both literal and inferential comprehension (Alptekin & Erçetin, 2010; Kintsch, 1998) are assessed in a multitasking setting. With that in mind, we investigated whether working memory capacity, measured by the Self-Administrable Reading Span Test (Oliveira et al., 2021), predicts proficient bilinguals' performance in literal and inferential comprehension, by means of comprehension questions (Pearson & Johnson, 1978) and reading times, in a multitasking setting with two conditions – listening to lyrical music (experimental) as opposed to listening to non-lyrical music (control). Multiple linear regression analyses revealed that working memory capacity significantly predicted inferential comprehension, but not literal comprehension or reading times, and only when participants were listening to lyrical music.
Results are discussed in terms of both the effects of multitasking on reading comprehension and the role of working memory in language comprehension.
Summarizing Product Reviews Using Dynamic Relation Extraction
The accumulated review data for a single product on Amazon.com could potentially take several weeks to examine manually. Computationally extracting the essence of a document is a substantial task, which has been explored previously through many different approaches. We explore how statistical prediction can be used to perform dynamic relation extraction. Using patterns in the syntactic structure of a sentence, each word is classified as either product feature or descriptor, and then linked together by association. The classifiers are trained with a manually annotated training set and features from dependency parse trees produced by the Stanford CoreNLP library.

In this thesis we compare the most widely used machine learning algorithms to find the one most suitable for our scenario. We ultimately found that the classification step was most successful with SVM, reaching an F-score of 80 percent for the relation extraction classification step. The results of the predictions are presented in a graphical interface displaying the relations. An end-to-end evaluation was also conducted, where our system achieved a relaxed recall of 53.35%.
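The classify-then-link pipeline above can be sketched in miniature. The real system classifies words with an SVM over dependency-parse features from Stanford CoreNLP; this toy substitutes hand-written lexicons (invented for illustration) and a simple nearest-feature association rule:

```python
# Toy sketch of the feature/descriptor extraction-and-linking step described
# above. The real system uses an SVM over dependency-parse features; here a
# hand-written lexicon lookup (invented for illustration) classifies each
# word, and descriptors attach to the nearest following feature word.
FEATURES = {"battery", "screen", "price"}        # toy product-feature lexicon
DESCRIPTORS = {"great", "dim", "cheap", "long"}  # toy descriptor lexicon

def extract_relations(tokens):
    """Return (descriptor, feature) pairs by nearest-feature association."""
    relations = []
    pending = []  # descriptors waiting for a feature to attach to
    for tok in tokens:
        word = tok.lower()
        if word in DESCRIPTORS:
            pending.append(word)
        elif word in FEATURES:
            relations.extend((d, word) for d in pending)
            pending = []
    return relations

pairs = extract_relations("Great battery and a cheap but dim screen".split())
```

Note how both "cheap" and "dim" attach to "screen": linking is many-to-one by association rather than strictly pairwise, which matches the relation-graph output described above.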
Towards Personalized and Human-in-the-Loop Document Summarization
The ubiquitous availability of computing devices and the widespread use of
the internet continuously generate large amounts of data. As a result, the
amount of information available on any given topic far exceeds humans'
processing capacity, causing what is known as information overload. To cope
efficiently with large amounts of information and generate content of
significant value to users, we need to identify, merge and summarise
information. Data summaries can help gather related information and
collect it into a shorter format that enables answering complicated questions,
gaining new insight and discovering conceptual boundaries.
This thesis focuses on three main challenges to alleviate information
overload using novel summarisation techniques. It further intends to facilitate
the analysis of documents to support personalised information extraction. This
thesis separates the research issues into four areas, covering (i) feature
engineering in document summarisation, (ii) traditional static and inflexible
summaries, (iii) traditional generic summarisation approaches, and (iv) the
need for reference summaries. We propose novel approaches to tackle these
challenges by: (i) enabling automatic, intelligent feature engineering, (ii)
enabling flexible and interactive summarisation, and (iii) utilising
intelligent and personalised summarisation approaches. Experimental results
demonstrate the effectiveness of the proposed approaches compared to other
state-of-the-art models. We further propose summarisation-based solutions to
the information overload problem in several domains, covering network traffic
data, health data and business process data.

Comment: PhD thesis
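The thesis above covers several summarisation approaches; as a generic, purely illustrative baseline (not the author's method), extractive summarisation can be sketched as scoring each sentence by the summed corpus frequency of its content words and keeping the top-k sentences in original order:

```python
# Minimal frequency-based extractive summariser, shown only as a generic
# baseline illustration of summarisation. Stopword list and example
# sentences are invented for the demo.
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "is", "in", "it", "are"}

def summarise(sentences, k=1):
    tokens = [[w.lower().strip(".,") for w in s.split()] for s in sentences]
    # corpus-wide frequency of content words
    freq = Counter(w for toks in tokens for w in toks if w not in STOPWORDS)
    # score each sentence by the summed frequency of its content words
    scores = [sum(freq[w] for w in toks if w not in STOPWORDS)
              for toks in tokens]
    # take the k highest-scoring sentences, restored to original order
    top = sorted(sorted(range(len(sentences)),
                        key=lambda i: -scores[i])[:k])
    return [sentences[i] for i in top]

docs = ["Summarisation reduces information overload.",
        "Information overload makes reading hard.",
        "Cats are nice."]
summary = summarise(docs, k=1)
```

Static baselines like this treat every reader identically, which is precisely the inflexibility the thesis's interactive and personalised approaches are meant to address.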