1,238 research outputs found
Leveraging Structural and Semantic Correspondence for Attribute-Oriented Aspect Sentiment Discovery
Opinionated text often involves attributes such as authorship and location
that influence the sentiments expressed for different aspects. We posit that
structural and semantic correspondence is both prevalent in opinionated text,
especially when associated with attributes, and crucial in accurately revealing
its latent aspect and sentiment structure. However, it is not recognized by
existing approaches.
We propose Trait, an unsupervised probabilistic model that discovers aspects
and sentiments from text and associates them with different attributes. To this
end, Trait infers and leverages structural and semantic correspondence using a
Markov Random Field. We show empirically that by incorporating attributes
explicitly Trait significantly outperforms state-of-the-art baselines both by
generating attribute profiles that accord with our intuitions, as shown via
visualization, and by yielding topics of greater semantic cohesion. Comment: EMNLP 201
LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model
Universally modeling all typical information extraction tasks (UIE) with one
generative language model (GLM) has shown great potential in the latest
study, where various IE predictions are unified into a linearized hierarchical
expression under a GLM. Syntactic structure information, an effective
feature that has been used extensively in the IE community, should also be
beneficial to UIE. In this work, we propose a novel structure-aware GLM, fully
unleashing the power of syntactic knowledge for UIE. A heterogeneous structure
inductor is explored to induce, without supervision, rich heterogeneous structural
representations by post-training an existing GLM. In particular, a structural
broadcaster is devised to compact various latent trees into explicit high-order
forests, helping to guide a better generation during decoding. We finally
introduce a task-oriented structure fine-tuning mechanism, further adjusting
the learned structures to best match the end task's needs. Over 12 IE
benchmarks across 7 tasks, our system shows significant improvements over the
baseline UIE system. Further in-depth analyses show that our GLM learns rich
task-adaptive structural bias that largely resolves the crux of UIE:
long-range dependence and boundary identification. Source code is available at
https://github.com/ChocoWu/LasUIE. Comment: NeurIPS2022 conference paper
Octa: Omissions and Conflicts in Target-Aspect Sentiment Analysis
Sentiments in opinionated text are often determined by both aspects and
target words (or targets). We observe that targets and aspects interrelate in
subtle ways, often yielding conflicting sentiments. Thus, a naive aggregation
of sentiments from aspects and targets treated separately, as in existing
sentiment analysis models, impairs performance.
We propose Octa, an approach that jointly considers aspects and targets when
inferring sentiments. To capture and quantify relationships between targets and
context words, Octa uses a selective self-attention mechanism that handles
implicit or missing targets. Specifically, Octa involves two layers of
attention mechanisms for, respectively, selective attention between targets and
context words and attention over words based on aspects. On benchmark datasets,
Octa outperforms leading models by a large margin, yielding (absolute) gains in
accuracy of 1.6% to 4.3%. Comment: Accepted by Findings of EMNLP 202
Concepts and Methods from Artificial Intelligence in Modern Information Systems – Contributions to Data-driven Decision-making and Business Processes
Today, organizations are facing a variety of challenging, technology-driven developments, three of the most notable ones being the surge in uncertain data, the emergence of unstructured data, and a complex, dynamically changing environment. These developments require organizations to transform in order to stay competitive. Artificial Intelligence, with its fields of decision-making under uncertainty, natural language processing, and planning, offers valuable concepts and methods to address these developments. The dissertation at hand utilizes and furthers these contributions in three focal points to address research gaps in existing literature and to provide concrete concepts and methods for the support of organizations in the transformation and improvement of data-driven decision-making, business processes, and business process management. In particular, the focal points are the assessment of data quality, the analysis of textual data, and the automated planning of process models. With regard to data quality assessment, probability-based approaches for measuring consistency and identifying duplicates, as well as requirements for data quality metrics, are suggested. With respect to the analysis of textual data, the dissertation proposes a topic modeling procedure to gain knowledge from CVs as well as a model based on sentiment analysis to explain ratings from customer reviews. Regarding automated planning of process models, concepts and algorithms for an automated construction of parallelizations in process models, an automated adaptation of process models, and an automated construction of multi-actor process models are provided.
Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim?
Memes can sway people's opinions over social media as they combine visual and
textual information in an easy-to-consume manner. Since memes instantly turn
viral, it becomes crucial to infer their intent and potentially associated
harmfulness to take timely measures as needed. A common problem associated with
meme comprehension lies in detecting the entities referenced and characterizing
the role of each of these entities. Here, we aim to understand whether the meme
glorifies, vilifies, or victimizes each entity it refers to. To this end, we
address the task of role identification of entities in harmful memes, i.e.,
detecting who is the 'hero', the 'villain', and the 'victim' in the meme, if
any. We utilize HVVMemes - a memes dataset on US Politics and Covid-19 memes,
released recently as part of the CONSTRAINT@ACL-2022 shared-task. It contains
memes, entities referenced, and their associated roles: hero, villain, victim,
and other. We further design VECTOR (Visual-semantic role dEteCToR), a robust
multi-modal framework for the task, which integrates entity-based contextual
information in the multi-modal representation and compare it to several
standard unimodal (text-only or image-only) or multi-modal (image+text) models.
Our experimental results show that our proposed model achieves an improvement
of 4% over the best baseline and 1% over the best competing stand-alone
submission from the shared-task. Besides presenting an extensive experimental
setup with comparative analyses, we finally highlight the challenges
encountered in addressing the complex task of semantic role labeling within
memes. Comment: Accepted at EACL 2023 (Main Track). 9 Pages (main content),
Limitations, Ethical Considerations + 4 Pages (Refs.) + Appendix; 8 Figures;
5 Tables; Paper ID: 80
Big Data Computing for Geospatial Applications
The convergence of big data and geospatial computing has brought forth challenges and opportunities to Geographic Information Science with regard to geospatial data management, processing, analysis, modeling, and visualization. This book highlights recent advancements in integrating new computing approaches, spatial methods, and data management strategies to tackle geospatial big data challenges and meanwhile demonstrates opportunities for using big data for geospatial applications. Crucial to the advancements highlighted in this book is the integration of computational thinking and spatial thinking and the transformation of abstract ideas and models to concrete data structures and algorithms
Entity-Oriented Search
This open access book covers all facets of entity-oriented search—where “search” can be interpreted in the broadest sense of information access—from a unified point of view, and provides a coherent and comprehensive overview of the state of the art. It represents the first synthesis of research in this broad and rapidly developing area. Selected topics are discussed in-depth, the goal being to establish fundamental techniques and methods as a basis for future research and development. Additional topics are treated at a survey level only, containing numerous pointers to the relevant literature. A roadmap for future research, based on open issues and challenges identified along the way, rounds out the book. The book is divided into three main parts, sandwiched between introductory and concluding chapters. The first two chapters introduce readers to the basic concepts, provide an overview of entity-oriented search tasks, and present the various types and sources of data that will be used throughout the book. Part I deals with the core task of entity ranking: given a textual query, possibly enriched with additional elements or structural hints, return a ranked list of entities. This core task is examined in a number of different variants, using both structured and unstructured data collections, and numerous query formulations. In turn, Part II is devoted to the role of entities in bridging unstructured and structured data. Part III explores how entities can enable search engines to understand the concepts, meaning, and intent behind the query that the user enters into the search box, and how they can provide rich and focused responses (as opposed to merely a list of documents)—a process known as semantic search. The final chapter concludes the book by discussing the limitations of current approaches, and suggesting directions for future research. Researchers and graduate students are the primary target audience of this book. 
A general background in information retrieval is sufficient to follow the material, including an understanding of basic probability and statistics concepts as well as a basic knowledge of machine learning concepts and supervised learning algorithms
Data Mining
The availability of big data due to computerization and automation has generated an urgent need for new techniques to analyze and convert big data into useful information and knowledge. Data mining is a promising and leading-edge technology for mining large volumes of data, looking for hidden information, and aiding knowledge discovery. It can be used for characterization, classification, discrimination, anomaly detection, association, clustering, trend or evolution prediction, and much more in fields such as science, medicine, economics, engineering, computers, and even business analytics. This book presents basic concepts, ideas, and research in data mining
Multimodal Explainable Artificial Intelligence: A Comprehensive Review of Methodological Advances and Future Research Directions
The current study focuses on systematically analyzing the recent advances in
the field of Multimodal eXplainable Artificial Intelligence (MXAI). In
particular, the relevant primary prediction tasks and publicly available
datasets are initially described. Subsequently, a structured presentation of
the MXAI methods of the literature is provided, taking into account the
following criteria: a) The number of the involved modalities, b) The stage at
which explanations are produced, and c) The type of the adopted methodology
(i.e. mathematical formalism). Then, the metrics used for MXAI evaluation are
discussed. Finally, a comprehensive analysis of current challenges and future
research directions is provided. Comment: 26 pages, 11 figures