183 research outputs found
Neural Coreference Resolution for Turkish
Coreference resolution deals with resolving mentions of the same underlying entity in a given text. This challenging task is an indispensable aspect of text understanding and has important applications in various language processing systems such as question answering and machine translation. Although a significant amount of studies is devoted to coreference resolution, the research on Turkish is scarce and mostly limited to pronoun resolution. To our best knowledge, this article presents the first neural Turkish coreference resolution study where two learning-based models are explored. Both models follow the mention-ranking approach while forming clusters of mentions. The first model uses a set of hand-crafted features whereas the second coreference model relies on embeddings learned from large-scale pre-trained language models for capturing similarities between a mention and its candidate antecedents. Several language models trained specifically for Turkish are used to obtain mention representations and their effectiveness is compared in conducted experiments using automatic metrics. We argue that the results of this study shed light on the possible contributions of neural architectures to Turkish coreference resolution.119683
Recommended from our members
A modular, open-source information extraction framework for identifying clinical concepts and processes of care in clinical narratives
In this thesis, a synthesis is presented of the knowledge models required by clinical informa- tion systems that provide decision support for longitudinal processes of care. Qualitative research techniques and thematic analysis are novelly applied to a systematic review of the literature on the challenges in implementing such systems, leading to the development of an original conceptual framework. The thesis demonstrates how these process-oriented systems make use of a knowledge base derived from workflow models and clinical guidelines, and argues that one of the major barriers to implementation is the need to extract explicit and implicit information from diverse resources in order to construct the knowledge base. Moreover, concepts in both the knowledge base and in the electronic health record (EHR) must be mapped to a common ontological model. However, the majority of clinical guideline information remains in text form, and much of the useful clinical information residing in the EHR resides in the free text fields of progress notes and laboratory reports. In this thesis, it is shown how natural language processing and information extraction techniques provide a means to identify and formalise the knowledge components required by the knowledge base. Original contributions are made in the development of lexico-syntactic patterns and the use of external domain knowledge resources to tackle a variety of information extraction tasks in the clinical domain, such as recognition of clinical concepts, events, temporal relations, term disambiguation and abbreviation expansion. Methods are developed for adapting existing tools and resources in the biomedical domain to the processing of clinical texts, and approaches to improving the scalability of these tools are proposed and evalu- ated. These tools and techniques are then combined in the creation of a novel approach to identifying processes of care in the clinical narrative. It is demonstrated that resolution of coreferential and anaphoric relations as narratively and temporally ordered chains provides a means to extract linked narrative events and processes of care from clinical notes. Coreference performance in discharge summaries and progress notes is largely dependent on correct identification of protagonist chains (patient, clinician, family relation), pronominal resolution, and string matching that takes account of experiencer, temporal, spatial, and anatomical context; whereas for laboratory reports additional, external domain knowledge is required. The types of external knowledge and their effects on system performance are identified and evaluated. Results are compared against existing systems for solving these tasks and are found to improve on them, or to approach the performance of recently reported, state-of-the- art systems. Software artefacts developed in this research have been made available as open-source components within the General Architecture for Text Engineering framework
Recommended from our members
Advances in statistical script learning
When humans encode information into natural language, they do so with the
clear assumption that the reader will be able to seamlessly make inferences
based on world knowledge. For example, given the sentence ``Mrs. Dalloway said
she would buy the flowers herself,'' one can make a number of probable
inferences based on event co-occurrences: she bought flowers, she went to a
store, she took the flowers home, and so on.
Observing this, it is clear that many different useful natural language
end-tasks could benefit from models of events as they typically co-occur
(so-called script models).
Robust question-answering systems must be able to infer highly-probable implicit
events from what is explicitly stated in a text, as must robust
information-extraction systems that map from unstructured text to formal
assertions about relations expressed in the text. Coreference resolution
systems, semantic role labeling, and even syntactic parsing systems could, in
principle, benefit from event co-occurrence models.
To this end, we present a number of contributions related to statistical
event co-occurrence models. First, we investigate a method of incorporating
multiple entities into events in a count-based co-occurrence model. We find that
modeling multiple entities interacting across events allows for improved
empirical performance on the task of modeling sequences of events in documents.
Second, we give a method of applying Recurrent Neural Network sequence models
to the task of predicting held-out predicate-argument structures from documents.
This model allows us to easily incorporate entity noun information, and can
allow for more complex, higher-arity events than a count-based co-occurrence
model. We find the neural model improves performance considerably over the
count-based co-occurrence model.
Third, we investigate the performance of a sequence-to-sequence encoder-decoder
neural model on the task of predicting held-out predicate-argument events from
text. This model does not explicitly model any external syntactic information,
and does not require a parser. We find the text-level model to be competitive in
predictive performance with an event level model directly mediated by an
external syntactic analysis.
Finally, motivated by this result, we investigate incorporating features derived
from these models into a baseline noun coreference resolution system. We find
that, while our additional features do not appreciably improve top-level
performance, we can nonetheless provide empirical improvement on a number of
restricted classes of difficult coreference decisions.Computer Science
Inducing Stereotypical Character Roles from Plot Structure
If we are to understand stories, we must understand characters: characters are central to every narrative and drive the action forward. Critically, many stories (especially cultural ones) employ stereotypical character roles in their stories for different purposes, including efficient communication among bundles of default characteristics and associations, ease understanding of those characters\u27 role in the overall narrative, and many more. These roles include ideas such as hero, villain, or victim, as well as culturally-specific roles such as, for example, the donor (in Russian tales) or the trickster (in Native American tales). My thesis aims to learn these roles automatically, inducing them from data using a clustering technique.
The first step of learning character roles, however, is to identify which coreference chains correspond to characters, which are defined by narratologists as animate entities that drive the plot forward. The first part of my work has focused on this character identification problem, specifically focusing on the problem of animacy detection. Prior work treated animacy as a word-level property, and researchers developed statistical models to classify words as either animate or inanimate. I claimed this approach to the problem is ill-posed and presented a new hybrid approach for classifying the animacy of coreference chains that achieved state-of-the-art performance.
The next step of my work is to develop approaches first to identify the characters and then a new unsupervised clustering approach to learn stereotypical roles. My character identification system consists of two stages: first, I detect animate chains from the coreference chains using my existing animacy detector; second, I apply a supervised machine learning model that identifies which of those chains qualify as characters. I proposed a narratologically grounded definition of character and built a supervised machine learning model with a small set of features that achieved state-of-the-art performance.
In the last step, I successfully implemented a clustering approach with plot and thematic information to cluster the archetypes. This work resulted in a completely new approach to understanding the structure of stories, greatly advancing the state-of-the-art of story understanding
Structured learning with latent trees: a joint approach to coreference resolution
This thesis explores ways to define automated coreference resolution systems by using structured machine learning techniques. We design supervised models that learn to build coreference clusters from raw text: our main objective is to get model able to process documentsglobally, in a structured fashion, to ensure coherent outputs. Our models are trained and evaluated on the English part of the CoNLL-2012 Shared Task annotated corpus with standard metrics. We carry out detailed comparisons of different settings so as to refine our models anddesign a complete end-to-end coreference resolver. Specifically, we first carry out a preliminary work on improving the way features areemployed by linear models for classification: we extend existing work on separating different types of mention pairs to define more accurate classifiers of coreference links. We then define various structured models based on latent trees to learn to build clusters globally, andnot only from the predictions of a mention pair classifier. We study different latent representations (various shapes and sparsity) and show empirically that the best suited structure is some restricted class of trees related to the best-first rule for selecting coreference links. Wefurther improve this latent representation by integrating anaphoricity modelling jointly with coreference, designing a global (structured at the document level) and joint model outperforming existing models on gold mentions evaluation. We finally design a complete end-to-endresolver and evaluate the improvement obtained by our new models on detected mentions, a more realistic setting for coreference resolution
A Survey on Semantic Processing Techniques
Semantic processing is a fundamental research domain in computational
linguistics. In the era of powerful pre-trained language models and large
language models, the advancement of research in this domain appears to be
decelerating. However, the study of semantics is multi-dimensional in
linguistics. The research depth and breadth of computational semantic
processing can be largely improved with new technologies. In this survey, we
analyzed five semantic processing tasks, e.g., word sense disambiguation,
anaphora resolution, named entity recognition, concept extraction, and
subjectivity detection. We study relevant theoretical research in these fields,
advanced methods, and downstream applications. We connect the surveyed tasks
with downstream applications because this may inspire future scholars to fuse
these low-level semantic processing tasks with high-level natural language
processing tasks. The review of theoretical research may also inspire new tasks
and technologies in the semantic processing domain. Finally, we compare the
different semantic processing techniques and summarize their technical trends,
application trends, and future directions.Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN
1566-2535. The equal contribution mark is missed in the published version due
to the publication policies. Please contact Prof. Erik Cambria for detail
Intelligent Information Access to Linked Data - Weaving the Cultural Heritage Web
The subject of the dissertation is an information alignment experiment of two cultural heritage information systems (ALAP): The Perseus Digital Library and Arachne. In modern societies, information integration is gaining importance for many tasks such as business decision making or even catastrophe management. It is beyond doubt that the information available in digital form can offer users new ways of interaction. Also, in the humanities and cultural heritage communities, more and more information is being published online. But in many situations the way that information has been made publicly available is disruptive to the research process due to its heterogeneity and distribution. Therefore integrated information will be a key factor to pursue successful research, and the need for information alignment is widely recognized.
ALAP is an attempt to integrate information from Perseus and Arachne, not only on a schema level, but to also perform entity resolution. To that end, technical peculiarities and philosophical implications of the concepts of identity and co-reference are discussed. Multiple approaches to information integration and entity resolution are discussed and evaluated. The methodology that is used to implement ALAP is mainly rooted in the fields of information retrieval and knowledge discovery.
First, an exploratory analysis was performed on both information systems to get a first impression of the data. After that, (semi-)structured information from both systems was extracted and normalized. Then, a clustering algorithm was used to reduce the number of needed entity comparisons. Finally, a thorough matching was performed on the different clusters. ALAP helped with identifying challenges and highlighted the opportunities that arise during the attempt to align cultural heritage information systems
- …