3,313 research outputs found
Automatic Gloss Finding for a Knowledge Base using Ontological Constraints
While there has been much research on automatically constructing structured Knowledge Bases (KBs), most of it has focused on generating facts to populate a KB. However, a useful KB must go beyond facts. For example, glosses (short natural language definitions) have been found to be very useful in tasks such as Word Sense Disambiguation. However, the important problem of Automatic Gloss Finding, i.e., assigning glosses to entities in an initially gloss-free KB, is relatively unexplored. We address that gap in this paper. In particular, we propose GLOFIN, a hierarchical semi-supervised learning algorithm for this problem which makes effective use of limited amounts of supervision and available ontological constraints. To the best of our knowledge, GLOFIN is the first system for this task. Through extensive experiments on real-world datasets, we demonstrate GLOFIN's effectiveness. It is encouraging to see that GLOFIN outperforms other state-of-the-art SSL algorithms, especially in low supervision settings. We also demonstrate GLOFIN's robustness to noise through experiments on a wide variety of KBs, ranging from user contributed (e.g., Freebase) to automatically constructed (e.g., NELL). To facilitate further research in this area, we have already made the datasets and code used in this paper publicly available.
Large-Scale information extraction from textual definitions through deep syntactic and semantic analysis
We present DEFIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions, we leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DEFIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations.
Dealing with uncertain entities in ontology alignment using rough sets
This is the author's accepted manuscript; the final published article is available from the link below. Copyright © 2012 IEEE.

Ontology alignment facilitates exchange of knowledge among heterogeneous data sources. Many approaches to ontology alignment use multiple similarity measures to map entities between ontologies. However, a key challenge remains in dealing with uncertain entities, for which the employed similarity measures produce conflicting results on the similarity of the mapped entities. This paper presents OARS, a rough-set based approach to ontology alignment which achieves a high degree of accuracy in situations where uncertainty arises from conflicting results generated by different similarity measures. OARS employs a combinational approach, considering both lexical and structural similarity measures. OARS is extensively evaluated against the benchmark ontologies of the Ontology Alignment Evaluation Initiative (OAEI) 2010; it performs best on recall in comparison with a number of alignment systems while achieving comparable precision.
Towards semantic alignment of heterogeneous structures and its application to digital humanities
Different variants of the notion of "alignment" have been adopted in a range of areas, focusing on homogeneous structures (e.g., text alignment [8], database alignment [1] or ontology alignment [4]) or heterogeneous structures (e.g., annotation of text with ontologies [3], alignment of dictionaries and ontologies [2], alignments between relational databases and ontologies [9]). These alignment approaches, however, take little account of the alignment of multiple structures. This type of approach is becoming increasingly necessary to manage the growing volume of unstructured information sources available on the Web (encyclopedias such as Wikipedia, social media data, etc.) and LOD knowledge bases. In addition, the approaches are mostly developed for the English language. These needs have to be addressed through a global vision of alignment that takes into account a multiplicity of structures in which knowledge can be expressed. This paper seeks a holistic approach to semantic computing and alignment when considering heterogeneous structures in which knowledge is represented. FCT CEECIND/01997/2017, UIDB/00057/202
Semantic framework for regulatory compliance support
Regulatory Compliance Management (RCM) is a management process which an organization implements to conform to regulatory guidelines. Two processes that contribute towards automating RCM are: (i) extraction of meaningful entities from regulatory text and (ii) mapping of regulatory guidelines to organisational processes. These processes help in keeping the RCM up to date as regulatory guidelines change. The update process is still manual, since there is comparatively little research in this direction. Semantic Web technologies are potential candidates for automating the update process. Stand-alone frameworks exist that use Semantic Web techniques such as Information Extraction, Ontology Population, Similarity computation and Ontology Mapping; however, the integration of these approaches into semantic compliance management has not yet been explored. Considering these two processes as crucial constituents, the aim of this thesis is to automate the processes of RCM. It proposes a framework called RegCMantic.
The proposed framework is designed and developed in two main phases. The first part of the framework extracts regulatory entities from regulatory guidelines; extracting meaningful entities helps in relating the guidelines to organisational processes. The framework identifies the document-components and extracts entities from them using four components: (i) a parser, (ii) definition terms, (iii) ontological concepts and (iv) rules. The parser breaks a sentence down into useful segments, and entities are then extracted from these segments using the definition terms, ontological concepts and rules. The extracted entities are core-entities, such as subject, action and obligation, and aux-entities, such as time, place, purpose, procedure and condition.
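The extraction stage described above can be sketched as a simple pipeline. The sketch below is a hypothetical illustration, not the RegCMantic implementation: the lexicons, entity labels and the time-expression rule are all assumed for demonstration.

```python
# Hypothetical sketch of rule-based regulatory-entity extraction:
# segment a guideline sentence, then match definition terms,
# ontological concepts, and rules against each segment.
import re

# Illustrative lexicons (assumptions, not RegCMantic's actual resources).
DEFINITION_TERMS = {"shall": "obligation", "must": "obligation"}
ONTOLOGY_CONCEPTS = {"manufacturer": "subject", "record": "action"}

def parse_segments(sentence):
    """Parser component: break a sentence into useful segments."""
    return [s.strip() for s in re.split(r"[,;]", sentence) if s.strip()]

def extract_entities(sentence):
    """Apply definition terms, ontological concepts and rules per segment."""
    entities = {"core": [], "aux": []}
    for segment in parse_segments(sentence):
        for word in segment.lower().split():
            if word in DEFINITION_TERMS:
                entities["core"].append((DEFINITION_TERMS[word], word))
            if word in ONTOLOGY_CONCEPTS:
                entities["core"].append((ONTOLOGY_CONCEPTS[word], word))
        # Rule component: a toy pattern for one aux-entity type (time).
        if re.search(r"\bwithin \d+ days\b", segment.lower()):
            entities["aux"].append(("time", segment))
    return entities

print(extract_entities(
    "The manufacturer shall record batch data, within 30 days of release"))
```

A real system would use full dependency parses rather than comma splitting, but the division of labour between parser, lexicons and rules follows the four components named in the abstract.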
The second part of the framework relates the regulatory guidelines with organisational
processes. The proposed framework uses a mapping algorithm, which considers three types of
entities in the regulatory-domain and two types of entities in the process-domain. In the regulatory-domain, the considered entities are regulation-topic, core-entities and aux-entities; in the process-domain, they are subject and action. Using these entities, the algorithm computes an aggregate of three similarity scores: a topic-score, a core-score and an aux-score. The aggregate similarity score determines whether a regulatory guideline is related to an organisational process.
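The aggregation step can be sketched as a weighted combination with a decision threshold; the weights and threshold below are illustrative assumptions, not values from the thesis.

```python
# Hypothetical sketch of the aggregate similarity decision described above.
# Weights and threshold are illustrative, not RegCMantic's actual values.
def aggregate_similarity(topic_score, core_score, aux_score,
                         weights=(0.4, 0.4, 0.2)):
    """Weighted aggregate of the three per-domain similarity scores."""
    w_topic, w_core, w_aux = weights
    return w_topic * topic_score + w_core * core_score + w_aux * aux_score

def is_related(topic_score, core_score, aux_score, threshold=0.5):
    """A guideline maps to a process when the aggregate crosses a threshold."""
    return aggregate_similarity(topic_score, core_score, aux_score) >= threshold

print(is_related(0.8, 0.7, 0.2))  # 0.32 + 0.28 + 0.04 = 0.64 -> True
```

A weighted sum is the simplest aggregation consistent with the abstract; the thesis may equally use a learned or rule-based combination.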
The RegCMantic framework is validated through the development of a prototype system. The prototype implements a case study involving regulatory guidelines governing the pharmaceutical industry in the UK. Evaluation of the case-study results shows improved accuracy in extracting regulatory entities and in relating regulatory guidelines to organisational processes. This research contributes to extracting meaningful entities from regulatory guidelines provided as unstructured text, and to semantically mapping regulatory guidelines to organisational processes.
Foundational Ontologies meet Ontology Matching: A Survey
Ontology matching is a research area aimed at finding ways to make different ontologies interoperable. Solutions to the problem have been proposed from different disciplines, including databases, natural language processing, and machine learning. The role of foundational ontologies for ontology matching is an important one. It is multifaceted and with room for development. This paper presents an overview of the different tasks involved in ontology matching that consider foundational ontologies. We discuss the strengths and weaknesses of existing proposals and highlight the challenges to be addressed in the future.
Probabilistic Label Relation Graphs with Ising Models
We consider classification problems in which the label space has structure. A
common example is hierarchical label spaces, corresponding to the case where
one label subsumes another (e.g., animal subsumes dog). But labels can also be
mutually exclusive (e.g., dog vs cat) or unrelated (e.g., furry, carnivore). To
jointly model hierarchy and exclusion relations, the notion of a HEX (hierarchy
and exclusion) graph was introduced in [7]. This combined a conditional random
field (CRF) with a deep neural network (DNN), achieving state-of-the-art
results when applied to visual object classification problems where the
training labels were drawn from different levels of the ImageNet hierarchy
(e.g., an image might be labeled with the basic level category "dog", rather
than the more specific label "husky"). In this paper, we extend the HEX model
to allow for soft or probabilistic relations between labels, which is useful
when there is uncertainty about the relationship between two labels (e.g., an
antelope is "sort of" furry, but not to the same degree as a grizzly bear). We
call our new model pHEX, for probabilistic HEX. We show that the pHEX graph can
be converted to an Ising model, which allows us to use existing off-the-shelf
inference methods (in contrast to the HEX method, which needed specialized
inference algorithms). Experimental results show significant improvements in a
number of large-scale visual object classification tasks, outperforming the
previous HEX model. Comment: International Conference on Computer Vision (2015).
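The conversion described above can be illustrated with a toy label graph: hierarchy and exclusion edges become pairwise potentials, and softening the constraints means assigning finite rather than infinite penalties to violating states. The label set, edge list and penalty value below are illustrative assumptions, and brute-force enumeration stands in for the off-the-shelf Ising inference the paper uses.

```python
# Hypothetical sketch of a HEX-style graph relaxed into an Ising-like model.
# Labels take values in {0, 1}; a hierarchy edge (parent, child) penalizes
# child=1 with parent=0, and an exclusion edge penalizes both being 1.
# Finite penalties give soft (pHEX-style) relations; letting the penalty
# grow to infinity recovers the hard HEX constraints.
import itertools
import math

labels = ["animal", "dog", "cat"]
hierarchy = [("animal", "dog"), ("animal", "cat")]  # parent subsumes child
exclusion = [("dog", "cat")]                        # mutually exclusive
PENALTY = 4.0  # soft-constraint strength (an illustrative value)

def energy(state):
    """Sum of pairwise potentials for one joint label assignment."""
    e = 0.0
    for parent, child in hierarchy:
        if state[child] == 1 and state[parent] == 0:
            e += PENALTY  # child asserted without its parent
    for a, b in exclusion:
        if state[a] == 1 and state[b] == 1:
            e += PENALTY  # two exclusive labels asserted together
    return e

# Brute-force inference over the 2^3 joint states (off-the-shelf
# Ising solvers replace this loop at scale).
states = [dict(zip(labels, bits))
          for bits in itertools.product([0, 1], repeat=len(labels))]
Z = sum(math.exp(-energy(s)) for s in states)
p_violating = math.exp(-energy({"animal": 0, "dog": 1, "cat": 0})) / Z
p_consistent = math.exp(-energy({"animal": 1, "dog": 1, "cat": 0})) / Z
print(p_violating < p_consistent)  # soft constraints downweight violations
```

The key point of the relaxation is visible here: a state violating a hierarchy edge keeps nonzero probability instead of being forbidden outright, which is what lets pHEX express "sort of" relations between labels.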