67 research outputs found
Application of Common Sense Computing for the Development of a Novel Knowledge-Based Opinion Mining Engine
The ways people express their opinions and sentiments have radically changed in the past few years thanks to the advent of social networks, web communities, blogs, wikis and other online collaborative media. The distillation of knowledge from this huge amount of unstructured information can be a key factor for marketers who want to create an image or identity in the minds of their customers for their product, brand, or organisation. These online social data, however, remain hardly accessible to computers, as they are specifically meant for human consumption. The automatic analysis of online opinions, in fact, involves a deep understanding of natural language text by machines, from which we are still very far.
Hitherto, online information retrieval has been mainly based on algorithms relying on the textual representation of web-pages. Such algorithms are very good at retrieving texts, splitting them into parts, checking the spelling and counting their words. But when it comes to interpreting sentences and extracting meaningful information, their capabilities are known to be very limited. Existing approaches to opinion mining and sentiment analysis, in particular, can be grouped into three main categories: keyword spotting, in which text is classified into categories based on the presence of fairly unambiguous affect words; lexical affinity, which assigns arbitrary words a probabilistic affinity for a particular emotion; statistical methods, which calculate the valence of affective keywords and word co-occurrence frequencies on the base of a large training corpus. Early works aimed to classify entire documents as containing overall positive or negative polarity, or rating scores of reviews.
Such systems were mainly based on supervised approaches relying on manually labelled samples, such as movie or product reviews where the opinionist’s overall positive or negative attitude was explicitly indicated. However, opinions and sentiments do not occur only at document level, nor they are limited to a single valence or target. Contrary or complementary attitudes toward the same topic or multiple topics can be present across the span of a document. In more recent works, text analysis granularity has been taken down to segment and sentence level, e.g., by using presence of opinion-bearing lexical items (single words or n-grams) to detect subjective sentences, or by exploiting association rule mining for a feature-based analysis of product reviews. These approaches, however, are still far from being able to infer the cognitive and affective information associated with natural language as they mainly rely on knowledge bases that are still too limited to efficiently process text at sentence level.
In this thesis, common sense computing techniques are further developed and applied to bridge the semantic gap between word-level natural language data and the concept-level opinions conveyed by these. In particular, the ensemble application of graph mining and multi-dimensionality reduction techniques on two common sense knowledge bases was exploited to develop a novel intelligent engine for open-domain opinion mining and sentiment analysis. The proposed approach, termed sentic computing, performs a clause-level semantic analysis of text, which allows the inference of both the conceptual and emotional information associated with natural language opinions and, hence, a more efficient passage from (unstructured) textual information to (structured) machine-processable data.
The engine was tested on three different resources, namely a Twitter hashtag repository, a LiveJournal database and a PatientOpinion dataset, and its performance compared both with results obtained using standard sentiment analysis techniques and using different state-of-the-art knowledge bases such as Princeton’s WordNet, MIT’s ConceptNet and Microsoft’s Probase. Differently from most currently available opinion mining services, the developed engine does not base its analysis on a limited set of affect words and their co-occurrence frequencies, but rather on common sense concepts and the cognitive and affective valence conveyed by these. This allows the engine to be domain-independent and, hence, to be embedded in any opinion mining system for the development of intelligent applications in multiple fields such as Social Web, HCI and e-health. Looking ahead, the combined novel use of different knowledge bases and of common sense reasoning techniques for opinion mining proposed in this work, will, eventually, pave the way for development of more bio-inspired approaches to the design of natural language processing systems capable of handling knowledge, retrieving it when necessary, making analogies and learning from experience
Commonsense Reasoning for Conversational AI: A Survey of the State of the Art
Large, transformer-based pretrained language models like BERT, GPT, and T5
have demonstrated a deep understanding of contextual semantics and language
syntax. Their success has enabled significant advances in conversational AI,
including the development of open-dialogue systems capable of coherent, salient
conversations which can answer questions, chat casually, and complete tasks.
However, state-of-the-art models still struggle with tasks that involve higher
levels of reasoning - including commonsense reasoning that humans find trivial.
This paper presents a survey of recent conversational AI research focused on
commonsense reasoning. The paper lists relevant training datasets and describes
the primary approaches to include commonsense in conversational AI. The paper
also discusses benchmarks used for evaluating commonsense in conversational AI
problems. Finally, the paper presents preliminary observations of the limited
commonsense capabilities of two state-of-the-art open dialogue models,
BlenderBot3 and LaMDA, and its negative effect on natural interactions. These
observations further motivate research on commonsense reasoning in
conversational AI.Comment: Accepted to Workshop on Knowledge Augmented Methods for Natural
Language Processing, in conjunction with AAAI 202
Position bias mitigation : a knowledge-aware graph model for emotion cause extraction
The Emotion Cause Extraction (ECE)} task aims to identify clauses which
contain emotion-evoking information for a particular emotion expressed in text.
We observe that a widely-used ECE dataset exhibits a bias that the majority of
annotated cause clauses are either directly before their associated emotion
clauses or are the emotion clauses themselves. Existing models for ECE tend to
explore such relative position information and suffer from the dataset bias. To
investigate the degree of reliance of existing ECE models on clause relative
positions, we propose a novel strategy to generate adversarial examples in
which the relative position information is no longer the indicative feature of
cause clauses. We test the performance of existing models on such adversarial
examples and observe a significant performance drop. To address the dataset
bias, we propose a novel graph-based method to explicitly model the emotion
triggering paths by leveraging the commonsense knowledge to enhance the
semantic dependencies between a candidate clause and an emotion clause.
Experimental results show that our proposed approach performs on par with the
existing state-of-the-art methods on the original ECE dataset, and is more
robust against adversarial attacks compared to existing models.Comment: ACL2021 Main Conference Long pape
- …