9,486 research outputs found
Taxonomy Induction using Hypernym Subsequences
We propose a novel, semi-supervised approach towards domain taxonomy
induction from an input vocabulary of seed terms. Unlike all previous
approaches, which typically extract direct hypernym edges for terms, our
approach utilizes a novel probabilistic framework to extract hypernym
subsequences. Taxonomy induction from extracted subsequences is cast as an
instance of the minimumcost flow problem on a carefully designed directed
graph. Through experiments, we demonstrate that our approach outperforms
stateof- the-art taxonomy induction approaches across four languages.
Importantly, we also show that our approach is robust to the presence of noise
in the input vocabulary. To the best of our knowledge, no previous approaches
have been empirically proven to manifest noise-robustness in the input
vocabulary
Towards Building a Knowledge Base of Monetary Transactions from a News Collection
We address the problem of extracting structured representations of economic
events from a large corpus of news articles, using a combination of natural
language processing and machine learning techniques. The developed techniques
allow for semi-automatic population of a financial knowledge base, which, in
turn, may be used to support a range of data mining and exploration tasks. The
key challenge we face in this domain is that the same event is often reported
multiple times, with varying correctness of details. We address this challenge
by first collecting all information pertinent to a given event from the entire
corpus, then considering all possible representations of the event, and
finally, using a supervised learning method, to rank these representations by
the associated confidence scores. A main innovative element of our approach is
that it jointly extracts and stores all attributes of the event as a single
representation (quintuple). Using a purpose-built test set we demonstrate that
our supervised learning approach can achieve 25% improvement in F1-score over
baseline methods that consider the earliest, the latest or the most frequent
reporting of the event.Comment: Proceedings of the 17th ACM/IEEE-CS Joint Conference on Digital
Libraries (JCDL '17), 201
Knowledge Organization Systems (KOS) in the Semantic Web: A Multi-Dimensional Review
Since the Simple Knowledge Organization System (SKOS) specification and its
SKOS eXtension for Labels (SKOS-XL) became formal W3C recommendations in 2009 a
significant number of conventional knowledge organization systems (KOS)
(including thesauri, classification schemes, name authorities, and lists of
codes and terms, produced before the arrival of the ontology-wave) have made
their journeys to join the Semantic Web mainstream. This paper uses "LOD KOS"
as an umbrella term to refer to all of the value vocabularies and lightweight
ontologies within the Semantic Web framework. The paper provides an overview of
what the LOD KOS movement has brought to various communities and users. These
are not limited to the colonies of the value vocabulary constructors and
providers, nor the catalogers and indexers who have a long history of applying
the vocabularies to their products. The LOD dataset producers and LOD service
providers, the information architects and interface designers, and researchers
in sciences and humanities, are also direct beneficiaries of LOD KOS. The paper
examines a set of the collected cases (experimental or in real applications)
and aims to find the usages of LOD KOS in order to share the practices and
ideas among communities and users. Through the viewpoints of a number of
different user groups, the functions of LOD KOS are examined from multiple
dimensions. This paper focuses on the LOD dataset producers, vocabulary
producers, and researchers (as end-users of KOS).Comment: 31 pages, 12 figures, accepted paper in International Journal on
Digital Librarie
A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web
Over the past decade, rapid advances in web technologies, coupled with
innovative models of spatial data collection and consumption, have generated a
robust growth in geo-referenced information, resulting in spatial information
overload. Increasing 'geographic intelligence' in traditional text-based
information retrieval has become a prominent approach to respond to this issue
and to fulfill users' spatial information needs. Numerous efforts in the
Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the
Linking Open Data initiative have converged in a constellation of open
knowledge bases, freely available online. In this article, we survey these open
knowledge bases, focusing on their geospatial dimension. Particular attention
is devoted to the crucial issue of the quality of geo-knowledge bases, as well
as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic
Network, is outlined as our contribution to this area. Research directions in
information integration and Geographic Information Retrieval (GIR) are then
reviewed, with a critical discussion of their current limitations and future
prospects
- …