19 research outputs found

Ranking and Retrieval under Semantic Relevance

Author: Chen Tongfei
Publication venue: 'The Busan Gyeongnam Mathematical Society'
Publication date: 16/02/2021

This thesis presents a series of conceptual and empirical developments on the ranking and retrieval of candidates under semantic relevance. Part I of the thesis introduces the concept of uncertainty in various semantic tasks (such as recognizing textual entailment) in natural language processing, and the machine learning techniques commonly employed to model these semantic phenomena. A unified view of ranking and retrieval will be presented, and the trade-off between model expressiveness, performance, and scalability in model design will be discussed. Part II of the thesis focuses on applying these ranking and retrieval techniques to text: Chapter 3 examines the feasibility of ranking hypotheses given a premise with respect to a human's subjective probability of the hypothesis happening, effectively extending the traditional categorical task of natural language inference. Chapter 4 focuses on detecting situation frames for documents using ranking methods. Then we extend the ranking notion to retrieval, and develop both sparse (Chapter 5) and dense (Chapter 6) vector-based methods to facilitate scalable retrieval for potential answer paragraphs in question answering. Part III turns the focus to mentions and entities in text, while continuing the theme of ranking and retrieval: Chapter 7 discusses the ranking of fine-grained types that an entity mention could belong to, leading to state-of-the-art performance on hierarchical multi-label fine-grained entity typing. Chapter 8 extends the semantic relation of coreference to a cross-document setting, enabling models to retrieve from a large corpus, instead of from a single document, when resolving coreferent entity mentions.

Johns Hopkins University

JScholarship

Reality in Perspectives

Author: Khalili Mahdi
Publication date: 01/01/2022

This dissertation is about human knowledge of reality. In particular, it argues that scientific knowledge is bounded by historically available instruments and theories; nevertheless, the use of several independent instruments and theories can provide access to the persistent potentialities of reality. The replicability of scientific observations and experiments allows us to obtain explorable evidence of robust entities and properties. The dissertation includes seven chapters. It also studies three cases – namely, Higgs bosons and hypothetical Ϝ-particles (section 2.4), the Ptolemaic and Kepler model of the planets (section 6.7), and the special theory of relativity (chapter 7). Chapter 1 is the introduction of the dissertation. Chapter 2 clarifies the notion of the real on the basis of two concepts: persistence and resistance. These concepts enable me to explain my ontological belief in the real potentialities of human-independent things and the implications of this view for the perceptual and epistemological levels of discussion. On the basis of the concept of “overlapping perspectives”, chapter 3 argues that entity realism and perspectivism are complementary. That is, an entity that manifests itself through several experimental/observational methods is something real, but our knowledge of its nature is perspectival. Critically studying the recent views of entity realism, chapter 4 extends the discussion of entity realism and provides a criterion for the reality of property tokens. Chapter 5, in contrast, develops the perspectival aspects of my view on the basis of the phenomenological-hermeneutical approaches to the philosophy of science. This chapter also elaborates my view of empirical evidence, as briefly expressed in sections 2.5 and 4.5. Chapter 6 concerns diachronic theoretical perspectives. It first explains my view of progress, according to which current perspectives are broader than past ones. 
Second, it argues that the successful explanations and predictions of abandoned theories can be accounted for from our currently acceptable perspectives. The case study of Ptolemaic astronomy supports the argument of this chapter. Chapter 7 serves as the conclusion of the dissertation by applying the central themes of the previous chapters to the case study of special relativity theory. I interpret frame-dependent properties, such as length and time duration, and the constancy of the speed of light according to realist perspectivism.

PhilPapers

VU Research Portal

Sensory Representation and Cognitive Architecture: An alternative to phenomenal concepts

Author: Fazekas Peter
Jakab Zoltán

We present a cognitive-physicalist account of phenomenal consciousness. We argue that phenomenal concepts do not differ from other types of concepts. When explaining the peculiarities of conscious experience, the right place to look is at sensory/perceptual representations and their interaction with general conceptual structures. We utilize Jerry Fodor’s psychosemantic theory to formulate our view. We compare and contrast our view with that of Murat Aydede and Güven Güzeldere, who, using Dretskean psychosemantic theory, arrived at a solution different from ours in some ways. We have suggested that the representational atomism of certain sensory experiences plays a central role in reconstructing the epistemic gap associated with conscious experience; still, atomism is not the whole story. It needs to be supplemented by some additional principles. We also add an account of introspection, and suggest some cognitive features that might distinguish representational atoms with phenomenal character from those without it.

PhilPapers

A modular, open-source information extraction framework for identifying clinical concepts and processes of care in clinical narratives

Author: Gooch P.

In this thesis, a synthesis is presented of the knowledge models required by clinical information systems that provide decision support for longitudinal processes of care. Qualitative research techniques and thematic analysis are novelly applied to a systematic review of the literature on the challenges in implementing such systems, leading to the development of an original conceptual framework. The thesis demonstrates how these process-oriented systems make use of a knowledge base derived from workflow models and clinical guidelines, and argues that one of the major barriers to implementation is the need to extract explicit and implicit information from diverse resources in order to construct the knowledge base. Moreover, concepts in both the knowledge base and in the electronic health record (EHR) must be mapped to a common ontological model. However, the majority of clinical guideline information remains in text form, and much of the useful clinical information residing in the EHR resides in the free text fields of progress notes and laboratory reports. In this thesis, it is shown how natural language processing and information extraction techniques provide a means to identify and formalise the knowledge components required by the knowledge base. Original contributions are made in the development of lexico-syntactic patterns and the use of external domain knowledge resources to tackle a variety of information extraction tasks in the clinical domain, such as recognition of clinical concepts, events, temporal relations, term disambiguation and abbreviation expansion. Methods are developed for adapting existing tools and resources in the biomedical domain to the processing of clinical texts, and approaches to improving the scalability of these tools are proposed and evaluated. These tools and techniques are then combined in the creation of a novel approach to identifying processes of care in the clinical narrative.
It is demonstrated that resolution of coreferential and anaphoric relations as narratively and temporally ordered chains provides a means to extract linked narrative events and processes of care from clinical notes. Coreference performance in discharge summaries and progress notes is largely dependent on correct identification of protagonist chains (patient, clinician, family relation), pronominal resolution, and string matching that takes account of experiencer, temporal, spatial, and anatomical context; whereas for laboratory reports additional, external domain knowledge is required. The types of external knowledge and their effects on system performance are identified and evaluated. Results are compared against existing systems for solving these tasks and are found to improve on them, or to approach the performance of recently reported, state-of-the-art systems. Software artefacts developed in this research have been made available as open-source components within the General Architecture for Text Engineering framework.

City Research Online

Grounding event references in news

Author: Altena R.
Geerlings W.A.
Klingeren B. van
Lange W.C.M. de
Werf T.S.
Publication venue: School of Information Technologies
Publication date: 01/01/2000

Events are frequently discussed in natural language, and their accurate identification is central to language understanding. Yet they are diverse and complex in ontology and reference; computational processing hence proves challenging. News provides a shared basis for communication by reporting events. We perform several studies into news event reference. One annotation study characterises each news report in terms of its update and topic events, but finds that topic is better considered through explicit references to background events. In this context, we propose the event linking task which, analogous to named entity linking or disambiguation, models the grounding of references to notable events. It defines the disambiguation of an event reference as a link to the archival article that first reports it. When two references are linked to the same article, they need not be references to the same event. Event linking hopes to provide an intuitive approximation to coreference, erring on the side of over-generation in contrast with the literature. The task is also distinguished in considering event references from multiple perspectives over time. We diagnostically evaluate the task by first linking references to past, newsworthy events in news and opinion pieces to an archive of the Sydney Morning Herald. The intensive annotation results in only a small corpus of 229 distinct links. However, we observe that a number of hyperlinks targeting online news correspond to event links. We thus acquire two large corpora of hyperlinks at very low cost. From these we learn weights for temporal and term overlap features in a retrieval system. These noisy data lead to significant performance gains over a bag-of-words baseline. While our initial system can accurately predict many event links, most will require deep linguistic processing for their disambiguation.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Sydney eScholarship

Radboud Repository

Dissertations of the University of Groningen

Anaphora resolution for Arabic machine translation: a case study of nafs

Author: Hamouda Wafya
Publication venue: Newcastle University
Publication date: 01/01/2014

PhD Thesis

In the age of the internet, email, and social media, there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria.

Egyptian Government

Newcastle University eTheses

A Survey on Semantic Processing Techniques

Author: Cambria Erik
Chen Guanyi
He Kai
Mao Rui
Ni Jinjie
Yang Zonglin
Zhang Xulang
Publication date: 22/10/2023

Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyze five semantic processing tasks, namely word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions.

Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missing from the published version due to the publication policies. Please contact Prof. Erik Cambria for details.

arXiv.org e-Print Archive

Unsupervised Induction of Frame-Based Linguistic Forms

Author: Ferraro Francis
Publication venue: 'The Busan Gyeongnam Mathematical Society'
Publication date: 22/05/2018

This thesis studies the use of bulk, structured, linguistic annotations in order to perform unsupervised induction of meaning for three kinds of linguistic forms: words, sentences, and documents. The primary linguistic annotations I consider throughout this thesis are frames, which encode core linguistic, background or societal knowledge necessary to understand abstract concepts and real-world situations. I begin with an overview of linguistically-based structured meaning representation; I then analyze available large-scale natural language processing (NLP) and linguistic resources and corpora for their abilities to accommodate bulk, automatically-obtained frame annotations. I then proceed to induce meanings of the different forms, progressing from the word level, to the sentence level, and finally to the document level. I first show how to use these bulk annotations in order to better encode linguistic- and cognitive-science-backed semantic expectations within word forms. I then demonstrate a straightforward approach for learning large lexicalized and refined syntactic fragments, which encode and memoize commonly used phrases and linguistic constructions. Next, I consider two unsupervised models for document and discourse understanding; one is a purely generative approach that naturally accommodates layer annotations and is the first to capture and unify a complete frame hierarchy. The other conditions on limited amounts of external annotations, imputing missing values when necessary, and can more readily scale to large corpora. These discourse models help improve document understanding and type-level understanding.

JScholarship

On Coreferring Text-extracted Event Descriptions with the aid of Ontological Reasoning

Author: Alessio Palmero Aprosio
Loris Bozzato
Luciano Serafini
Marco Rospocher
Stefano Borgo
Publication date: 01/01/2016

Systems for automatic extraction of semantic information about events from large textual resources are now available: these tools are capable of generating RDF datasets about text-extracted events, and this knowledge can be used to reason over the recognized events. On the other hand, text-based tasks for event recognition, such as event coreference (i.e. recognizing whether two textual descriptions refer to the same event), do not take into account ontological information of the extracted events in their process. In this paper, we propose a method to derive event coreference on text-extracted event data using semantic-based rule reasoning. We demonstrate our method considering a limited (yet representative) set of event types: we introduce a formal analysis of their ontological properties and, on the basis of this, we define a set of coreference criteria. We then implement these criteria as RDF-based reasoning rules to be applied on text-extracted event data. We evaluate the effectiveness of our approach over a standard coreference benchmark dataset.

arXiv.org e-Print Archive

Archivio della ricerca - Fondazione Bruno Kessler
