2 research outputs found

Extracting Scales of Measurement Automatically from Biomedical Text with Special Emphasis on Comparative and Superlative Scales

Author: Baker Sara
Publication venue: Scholarship@Western
Publication date: 04/04/2019

Abstract: This thesis addresses the automatic extraction of scales of measurement from biomedical text, with special emphasis on comparative and superlative scales. Comparison sentences, a critical part of scales of measurement, play a significant role in gathering information from large numbers of biomedical research papers. A comparison sentence is defined as any sentence that contains two or more entities that are being compared. The thesis discusses several types of comparison sentences, such as gradable and non-gradable comparisons. The main goal is to extract comparison sentences automatically from the full text of biomedical articles. To that end, the thesis presents a Java program that analyzes biomedical text and identifies comparison sentences by matching each sentence against 37 syntactic and semantic features. These features are intended to help extract comparative sentences from any biomedical text. Two machine learning techniques are applied with the 37 features to assess the curated dataset, and the results are compared with earlier studies.
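The feature-matching idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not list the 37 features and describes the actual program as written in Java, so the three regular-expression cues below (`than_after_comparative`, `superlative_marker`, `non_gradable_cue`) and the function names are hypothetical stand-ins chosen for illustration, not the thesis's feature set.

```python
import re

# Hypothetical lexical cues (assumptions for illustration only);
# the thesis's 37 syntactic and semantic features are not given in the abstract.
FEATURES = {
    # Gradable: "higher than", "more effective than", "less toxic than"
    "than_after_comparative": re.compile(
        r"\b(?:more|less|fewer|\w+er)\s+(?:\w+\s+)?than\b", re.I
    ),
    # Superlative: "most common", "least severe"
    "superlative_marker": re.compile(r"\b(?:most|least)\s+\w+", re.I),
    # Non-gradable: "similar to", "same as", "compared with", "versus"
    "non_gradable_cue": re.compile(
        r"\b(?:similar to|same as|differ(?:s|ed)? from|compared (?:with|to)|versus)\b",
        re.I,
    ),
}


def extract_features(sentence):
    """Return a dict mapping each feature name to True/False for this sentence."""
    return {name: bool(pattern.search(sentence)) for name, pattern in FEATURES.items()}


def is_comparison(sentence):
    """Flag a sentence as a candidate comparison if any cue fires."""
    return any(extract_features(sentence).values())
```

In a setup like the one the abstract describes, boolean vectors of this kind would presumably be passed to a classifier rather than combined with a simple `any`; purely lexical cues such as these will also produce false positives (for example "rather than").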

Structurally informed methods for improved sentiment analysis

Author: Kessler Stefanie Wiltrud
Publication date: 01/01/2017

Abstract: Sentiment analysis deals with methods to automatically analyze opinions in natural language texts, e.g., product reviews. Such reviews contain a large number of fine-grained opinions, but to automatically extract detailed information it is necessary to handle a wide variety of verbalizations of opinions. The goal of this thesis is to develop robust, structurally informed models for sentiment analysis that address challenges arising from structurally complex verbalizations of opinions. We look at two examples of such verbalizations that benefit from incorporating structural information into the analysis: negation and comparisons. Negation directly influences the polarity of sentiment expressions, e.g., while "good" is positive, "not good" expresses a negative opinion. We propose a machine learning approach that uses information from dependency parse trees to determine whether a sentiment word is in the scope of a negation expression. Comparisons like "X is better than Y" are the main topic of this thesis. We present a machine learning system for the task of detecting the individual components of comparisons: the anchor or predicate of the comparison, the entities that are compared, which aspect they are compared in, and which entity is preferred. Again, we use structural context from a dependency parse tree to improve the performance of our system. We discuss two ways of addressing the limited availability of training data for our system. First, we create a manually annotated corpus of comparisons in product reviews (camera reviews), the largest such resource available to date. Second, we use the semi-supervised method of structural alignment to expand a small seed set of labeled sentences with similar sentences from a large set of unlabeled sentences. Finally, we work on the task of producing a ranked list of products that complements the isolated prediction of ratings and supports the user in the process of decision making. We demonstrate how the information from comparisons can be used to rank products, and we evaluate the result against two conceptually different external gold standard rankings.
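As a rough illustration of how extracted comparisons could feed a product ranking, the sketch below scores each product by wins minus losses over (preferred, other) pairs. The abstract does not describe the ranking method actually used in the thesis, so this scoring rule and the name `rank_products` are assumptions made for illustration.

```python
from collections import Counter


def rank_products(comparisons):
    """Rank products from pairwise comparisons.

    `comparisons` is an iterable of (preferred, other) tuples, e.g. the
    output of a comparison-detection step. Score = wins - losses; ties are
    broken alphabetically. This is an assumed scoring rule, not the
    thesis's method.
    """
    score = Counter()
    for preferred, other in comparisons:
        score[preferred] += 1
        score[other] -= 1
    return sorted(score, key=lambda product: (-score[product], product))
```

For example, `rank_products([("A", "B"), ("A", "C"), ("B", "C")])` places `A` first, since it is preferred in both of its comparisons.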