3,313 research outputs found

    Automatic Gloss Finding for a Knowledge Base using Ontological Constraints

    While there has been much research on automatically constructing structured Knowledge Bases (KBs), most of it has focused on generating facts to populate a KB. However, a useful KB must go beyond facts. For example, glosses (short natural language definitions) have been found to be very useful in tasks such as Word Sense Disambiguation. However, the important problem of Automatic Gloss Finding, i.e., assigning glosses to entities in an initially gloss-free KB, is relatively unexplored. We address that gap in this paper. In particular, we propose GLOFIN, a hierarchical semi-supervised learning algorithm for this problem which makes effective use of limited amounts of supervision and available ontological constraints. To the best of our knowledge, GLOFIN is the first system for this task. Through extensive experiments on real-world datasets, we demonstrate GLOFIN's effectiveness. It is encouraging to see that GLOFIN outperforms other state-of-the-art SSL algorithms, especially in low-supervision settings. We also demonstrate GLOFIN's robustness to noise through experiments on a wide variety of KBs, ranging from user-contributed (e.g., Freebase) to automatically constructed (e.g., NELL). To facilitate further research in this area, we have already made the datasets and code used in this paper publicly available.
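The use of ontological constraints described above can be sketched roughly as follows: a candidate gloss-to-entity assignment is kept only when the gloss's category is consistent with the entity's category in the hierarchy. The hierarchy, categories and tuples here are illustrative assumptions, not GLOFIN's actual data or algorithm.

```python
# Illustrative sketch (assumed data, not GLOFIN's algorithm): filter candidate
# gloss assignments so a gloss typed at an incompatible category is rejected.

PARENT = {"dog": "animal", "animal": "thing", "city": "location",
          "location": "thing"}

def ancestors(category):
    """Walk up the (toy) hierarchy and collect all ancestor categories."""
    out = []
    while category in PARENT:
        category = PARENT[category]
        out.append(category)
    return out

def consistent(entity_category, gloss_category):
    """A gloss typed at the same category or an ancestor is admissible."""
    return (gloss_category == entity_category
            or gloss_category in ancestors(entity_category))

# (entity, entity category, candidate gloss, gloss category)
assignments = [("Rex", "dog", "a domesticated canine", "animal"),
               ("Rex", "dog", "a large settlement", "city")]
kept = [a for a in assignments if consistent(a[1], a[3])]
```

Only the first candidate survives, since "city" is neither "dog" nor one of its ancestors.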

    Large-Scale information extraction from textual definitions through deep syntactic and semantic analysis

    We present DEFIE, an approach to large-scale Information Extraction (IE) based on a syntactic-semantic analysis of textual definitions. Given a large corpus of definitions we leverage syntactic dependencies to reduce data sparsity, then disambiguate the arguments and content words of the relation strings, and finally exploit the resulting information to organize the acquired relations hierarchically. The output of DEFIE is a high-quality knowledge base consisting of several million automatically acquired semantic relations.
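To make the idea of mining relations from textual definitions concrete, here is a deliberately minimal sketch that pulls an is-a triple out of a definition. The pattern and triple format are illustrative assumptions; DEFIE itself relies on full syntactic-semantic analysis, not a regular expression.

```python
import re

# Toy pattern for definitions of the form "X is a/an/the Y that/which ...".
# This is an assumed simplification, not DEFIE's extraction machinery.
DEFINITION_PATTERN = re.compile(
    r"^(?P<definiendum>[\w ]+?) is (?:a|an|the) (?P<genus>[\w ]+?)(?: that| which|\.)",
    re.IGNORECASE,
)

def extract_is_a(definition):
    """Return an (entity, 'is-a', genus) triple, or None if no match."""
    m = DEFINITION_PATTERN.search(definition)
    if not m:
        return None
    return (m.group("definiendum").strip(), "is-a", m.group("genus").strip())

triple = extract_is_a("A dictionary is a reference work that lists words.")
```

Real definition corpora need dependency parsing to handle the many surface forms a single regex cannot cover, which is precisely the sparsity problem the abstract mentions.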

    Dealing with uncertain entities in ontology alignment using rough sets

    This is the author's accepted manuscript (© 2012 IEEE). Ontology alignment facilitates exchange of knowledge among heterogeneous data sources. Many approaches to ontology alignment use multiple similarity measures to map entities between ontologies. However, a key challenge remains in dealing with uncertain entities, for which the employed similarity measures produce conflicting results. This paper presents OARS, a rough-set based approach to ontology alignment which achieves a high degree of accuracy in situations where uncertainty arises from conflicting results generated by different similarity measures. OARS employs a combinational approach, considering both lexical and structural similarity measures. OARS is extensively evaluated on the benchmark ontologies of the Ontology Alignment Evaluation Initiative (OAEI) 2010 and achieves the best recall among a number of alignment systems while delivering comparable precision.
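The rough-set intuition behind handling conflicting measures can be sketched as follows: mappings on which all measures agree fall into the positive or negative region, while conflicting evidence lands in the boundary region for further treatment. The measures, threshold and data are illustrative assumptions, not the OARS implementation.

```python
# Illustrative sketch (assumed measures/threshold, not OARS itself):
# partition candidate mappings into rough-set style regions.

def rough_set_regions(candidates, measures, threshold=0.7):
    """Split (a, b) pairs into positive, negative and boundary regions.

    candidates: list of (entity_a, entity_b) pairs
    measures: list of functions (a, b) -> similarity in [0, 1]
    """
    positive, negative, boundary = [], [], []
    for pair in candidates:
        verdicts = [m(*pair) >= threshold for m in measures]
        if all(verdicts):          # all measures agree: accept
            positive.append(pair)
        elif not any(verdicts):    # all measures agree: reject
            negative.append(pair)
        else:                      # conflicting evidence: uncertain
            boundary.append(pair)
    return positive, negative, boundary

# Toy stand-ins for a lexical and a structural measure.
lexical = lambda a, b: 1.0 if a.lower() == b.lower() else 0.0
structural = lambda a, b: 0.8 if a[0].lower() == b[0].lower() else 0.2

pos, neg, bnd = rough_set_regions(
    [("Car", "car"), ("Car", "Cat"), ("Car", "auto")],
    [lexical, structural])
```

Only the boundary region requires the extra rough-set reasoning; the agreed regions can be accepted or rejected directly.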

    Towards semantic alignment of heterogeneous structures and its application to digital humanities

    Different variants of the notion of ‘alignment’ have been adopted in a range of areas, focusing on homogeneous structures (e.g., text alignment [8], database alignment [1] or ontology alignment [4]) or heterogeneous structures (e.g., annotation of text with ontologies [3], alignment of dictionaries and ontologies [2], alignments between relational databases and ontologies [9]). These alignment approaches, however, take little account of the alignment of multiple structures. This type of approach is becoming increasingly necessary to manage the growing volume of unstructured information sources available on the Web (encyclopedias such as Wikipedia, social media data, etc.) and LOD knowledge bases. In addition, the approaches are mostly developed for the English language. These needs have to be addressed through a global vision of alignment that takes into account a multiplicity of structures in which knowledge can be expressed. This paper seeks a holistic approach to semantic computing and alignment when considering heterogeneous structures in which knowledge is represented. FCT CEECIND/01997/2017, UIDB/00057/202

    Semantic framework for regulatory compliance support

    Regulatory Compliance Management (RCM) is a management process which an organisation implements to conform to regulatory guidelines. Some processes that contribute towards automating RCM are: (i) extraction of meaningful entities from regulatory text and (ii) mapping of regulatory guidelines to organisational processes. These processes help in updating the RCM when regulatory guidelines change. The update process is still manual, since there has been comparatively little research in this direction. Semantic Web technologies are potential candidates for making the update process automatic. There are stand-alone frameworks that use Semantic Web technologies such as Information Extraction, Ontology Population, Similarities and Ontology Mapping. However, the integration of these approaches into semantic compliance management has not been explored yet. Considering these two processes as crucial constituents, the aim of this thesis is to automate the processes of RCM. It proposes a framework called RegCMantic. The proposed framework is designed and developed in two main phases. The first part of the framework extracts regulatory entities from regulatory guidelines. The extraction of meaningful entities from the regulatory guidelines helps in relating the regulatory guidelines to organisational processes. The proposed framework identifies the document components and extracts the entities from them. The framework extracts important regulatory entities using four components: (i) a parser, (ii) definition terms, (iii) ontological concepts and (iv) rules. The parser breaks a sentence down into useful segments. The extraction is carried out by applying the definition terms, ontological concepts and rules to these segments. The entities extracted are the core-entities, such as subject, action and obligation, and the aux-entities, such as time, place, purpose, procedure and condition.
The second part of the framework relates the regulatory guidelines to organisational processes. The proposed framework uses a mapping algorithm which considers three types of entities in the regulatory domain and two types of entities in the process domain. In the regulatory domain, the considered entities are regulation-topic, core-entities and aux-entities; in the process domain, they are subject and action. Using these entities, it computes an aggregate of three similarity scores: topic-score, core-score and aux-score. The aggregate similarity score determines whether a regulatory guideline is related to an organisational process. The RegCMantic framework is validated through the development of a prototype system. The prototype system implements a case study involving regulatory guidelines governing the pharmaceutical industry in the UK. The evaluation of the results from the case study has shown improved accuracy in extracting regulatory entities and in relating regulatory guidelines to organisational processes. This research has contributed to extracting meaningful entities from regulatory guidelines provided as unstructured text and to mapping regulatory guidelines to organisational processes semantically.
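The score aggregation described above can be sketched as a weighted sum of the three scores compared against a threshold. The weights, threshold and example values are illustrative assumptions; the thesis does not necessarily use a linear combination.

```python
# Illustrative sketch (assumed weights/threshold, not RegCMantic's algorithm):
# combine topic-score, core-score and aux-score into one aggregate.

def aggregate_similarity(topic_score, core_score, aux_score,
                         weights=(0.3, 0.5, 0.2)):
    """Weighted aggregate of the three scores, each assumed in [0, 1]."""
    w_topic, w_core, w_aux = weights
    return w_topic * topic_score + w_core * core_score + w_aux * aux_score

def is_related(topic_score, core_score, aux_score, threshold=0.5):
    """Decide whether a guideline relates to a process."""
    return aggregate_similarity(topic_score, core_score, aux_score) >= threshold

# e.g. strong core match (subject/action), weaker topic and aux evidence
score = aggregate_similarity(0.4, 0.9, 0.1)
```

Weighting the core-score highest reflects the abstract's emphasis on subject and action as the entities shared between the two domains.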

    Foundational Ontologies meet Ontology Matching: A Survey

    Ontology matching is a research area aimed at finding ways to make different ontologies interoperable. Solutions to the problem have been proposed from different disciplines, including databases, natural language processing, and machine learning. Foundational ontologies play an important role in ontology matching, one that is multifaceted and leaves room for development. This paper presents an overview of the different tasks involved in ontology matching that consider foundational ontologies. We discuss the strengths and weaknesses of existing proposals and highlight the challenges to be addressed in the future.

    Probabilistic Label Relation Graphs with Ising Models

    We consider classification problems in which the label space has structure. A common example is hierarchical label spaces, corresponding to the case where one label subsumes another (e.g., animal subsumes dog). But labels can also be mutually exclusive (e.g., dog vs. cat) or unrelated (e.g., furry, carnivore). To jointly model hierarchy and exclusion relations, the notion of a HEX (hierarchy and exclusion) graph was introduced in [7]. This combined a conditional random field (CRF) with a deep neural network (DNN), yielding state-of-the-art results when applied to visual object classification problems where the training labels were drawn from different levels of the ImageNet hierarchy (e.g., an image might be labeled with the basic-level category "dog" rather than the more specific label "husky"). In this paper, we extend the HEX model to allow for soft or probabilistic relations between labels, which is useful when there is uncertainty about the relationship between two labels (e.g., an antelope is "sort of" furry, but not to the same degree as a grizzly bear). We call our new model pHEX, for probabilistic HEX. We show that the pHEX graph can be converted to an Ising model, which allows us to use existing off-the-shelf inference methods (in contrast to the HEX method, which needed specialized inference algorithms). Experimental results show significant improvements in a number of large-scale visual object classification tasks, outperforming the previous HEX model. Comment: International Conference on Computer Vision (2015)
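The Ising-model view of soft label relations can be sketched on a toy graph: hierarchy becomes a positive coupling (labels prefer to co-fire), soft exclusion a negative one, and classifier evidence enters as per-label fields. The labels, coupling strengths and brute-force inference are illustrative assumptions, not the pHEX parameterization.

```python
import itertools

# Illustrative sketch (assumed couplings/fields, not pHEX's construction):
# standard Ising energy E(s) = -sum_ij J_ij s_i s_j - sum_i h_i s_i,
# with each label state s_i in {-1, +1}.

def ising_energy(state, couplings, fields):
    """Energy of one label assignment under the Ising model."""
    energy = -sum(h * s for h, s in zip(fields, state))
    for (i, j), J in couplings.items():
        energy -= J * state[i] * state[j]
    return energy

# Labels: 0 = animal, 1 = dog, 2 = cat
couplings = {
    (0, 1): +2.0,   # hierarchy: "dog" favours "animal" being on with it
    (0, 2): +2.0,   # hierarchy: "cat" favours "animal"
    (1, 2): -3.0,   # soft exclusion: "dog" and "cat" disfavoured together
}
fields = [0.5, 1.0, 0.0]  # classifier evidence leaning towards "dog"

# Brute force is fine for 3 labels; real systems use off-the-shelf Ising
# inference, which is exactly the advantage the abstract highlights.
best = min(itertools.product([-1, +1], repeat=3),
           key=lambda s: ising_energy(s, couplings, fields))
```

The lowest-energy state turns on "animal" and "dog" and turns off "cat", i.e. the hierarchy and exclusion preferences are both respected.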