    Learning Correlations between Linguistic Indicators and Semantic Constraints: Reuse of Context-Dependent Descriptions of Entities

    This paper presents the results of a study on the semantic constraints imposed on lexical choice by certain contextual indicators. We show how such indicators are computed and how correlations between them and the choice of a noun phrase description of a named entity can be automatically established using supervised learning. Based on this correlation, we have developed a technique for automatic lexical choice of descriptions of entities in text generation. We discuss the underlying relationship between the pragmatics of choosing an appropriate description that serves a specific purpose in the automatically generated text and the semantics of the description itself. We present our work in the framework of the more general concept of reuse of linguistic structures that are automatically extracted from large corpora. We present a formal evaluation of our approach and we conclude with some thoughts on potential applications of our method.
    Comment: 7 pages, uses colacl.sty and acl.bst, uses epsfig. To appear in the Proceedings of the Joint 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics (COLING-ACL'98).
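    A minimal sketch of the kind of supervised learning step described above: contextual indicator features are paired with the description the writer actually chose, and a classifier learns the correlation that later drives lexical choice. The feature names, example descriptions, and the use of scikit-learn here are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's method): learning a mapping from
# contextual indicators to the choice of an entity description.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: indicators computed around a named entity,
# paired with the noun phrase description actually used in the text.
examples = [
    ({"topic": "politics", "first_mention": True,  "section": "lead"}, "President of the United States"),
    ({"topic": "politics", "first_mention": False, "section": "body"}, "the president"),
    ({"topic": "finance",  "first_mention": True,  "section": "lead"}, "Federal Reserve chairman"),
    ({"topic": "finance",  "first_mention": False, "section": "body"}, "the chairman"),
]

features, choices = zip(*examples)
vectorizer = DictVectorizer()
X = vectorizer.fit_transform(features)

# A simple classifier stands in for the learned correlation model.
model = LogisticRegression(max_iter=1000)
model.fit(X, choices)

# Lexical choice at generation time: pick the description most strongly
# correlated with the current context.
context = {"topic": "politics", "first_mention": True, "section": "lead"}
print(model.predict(vectorizer.transform([context]))[0])
```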

    An Application of Fuzzy Inductive Logic Programming for Textual Entailment and Value Mining

    The aim of this preliminary report is to give an overview of textual entailment in natural language processing (NLP), to present our approach to research, and to explain the possible applications for such a system. Our system presupposes several modules, namely the sentiment analysis module, the anaphora resolution module, the named entity recognition module, and the relationship extraction module. State-of-the-art modules will be used, but no new research will be devoted to them. The research focuses on the main module, which extracts background knowledge from the extracted relationships via resolution and inverse resolution (inductive logic programming). The last part focuses on possible economic applications of our research.
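    A hypothetical skeleton of the pipeline described above, with trivial stand-ins for the off-the-shelf modules. The module implementations, example text, and fact format are invented, and the core ILP step is reduced to emitting ground facts rather than performing resolution or inverse resolution.

```python
# Hedged sketch of the presupposed module pipeline; every body below is a
# placeholder for an existing state-of-the-art component.

def sentiment_module(text):        # stand-in for sentiment analysis
    return {"polarity": 0.0}

def anaphora_module(text):         # stand-in for anaphora resolution
    return text

def ner_module(text):              # stand-in for named entity recognition
    return ["Acme Corp", "Widget Inc"]

def relation_module(text, entities):  # stand-in for relationship extraction
    return [("Acme Corp", "acquired", "Widget Inc")]

def induce_background_knowledge(relations):
    # The actual research module would generalise extracted relations into
    # rules via resolution / inverse resolution (ILP); here we only emit
    # each relation as a ground fact.
    return [f"{rel}({subj}, {obj})." for subj, rel, obj in relations]

text = "Acme Corp acquired Widget Inc."
resolved = anaphora_module(text)
sentiment = sentiment_module(resolved)
entities = ner_module(resolved)
relations = relation_module(resolved, entities)
print(induce_background_knowledge(relations))
```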

    Towards hypermedia support in database systems

    The general goal of our research is to automatically generate links and other hypermedia-related services for analytical applications. Using a dynamic hypermedia engine (DHE), the following features have been automated for database systems. Links are generated based on the database's relational (physical) schema and its original (non-normalized) entity-relationship specification; database application developers may also specify the relationships between different classes of database elements. These elements can be controlled by the same or a different database application, or even by another software system. A DHE prototype has been developed that illustrates the above for a relational database management system. The DHE is the only approach to automated linking that specializes in adding hyperlinks automatically to analytical applications that generate their displays dynamically (e.g., as the result of a user query). The DHE's linking is based on the structure of the application, not on keyword search or lexical analysis of the display values within its screens and documents. The DHE aims to provide hypermedia functionality without altering applications, by building application wrappers as an intermediary between the applications and the engine.
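    A rough sketch of schema-based link generation in the spirit of the DHE: hyperlinks are derived from declared foreign-key relationships rather than from keyword search over display values. The table names, URL scheme, and data structures are assumptions for illustration, not the DHE's actual design.

```python
# Illustrative sketch: generate hyperlink anchors for a displayed row from
# the schema's declared relationships, not from the values themselves.

foreign_keys = {
    # (table, column) -> (referenced_table, referenced_column); invented schema
    ("orders", "customer_id"): ("customers", "id"),
    ("orders", "product_id"):  ("products", "id"),
}

def links_for_row(table, row):
    """Produce a link for every foreign-key value appearing in the row."""
    anchors = []
    for (src_table, column), (dst_table, dst_column) in foreign_keys.items():
        if src_table == table and column in row:
            anchors.append({
                "label": f"{column} -> {dst_table}",
                "target": f"/{dst_table}?{dst_column}={row[column]}",
            })
    return anchors

# A dynamically generated query result row gains links with no change to
# the application that produced it.
print(links_for_row("orders", {"id": 7, "customer_id": 42, "product_id": 3}))
```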

    An automated materials and processes identification tool for material informatics using deep learning approach

    This article reports a tool, termed MatRec, that enables Materials Informatics via a deep learning approach. The tool captures data, makes appropriate domain suggestions, extracts various entities such as materials and processes, and helps to establish entity-value relationships. It uses keyword extraction, a document similarity index to suggest relevant documents, and a deep learning approach employing Bi-LSTM for entity extraction. For example, materials and processes for electrical charge storage under an electric double layer capacitor (EDLC) mechanism are demonstrated herewith. A knowledge graph approach finds and visualizes different latent knowledge sets from the processed information. MatRec achieved an F1 score of ~96% for entity extraction, ~83% for material-value relationship extraction, and ~87% for process-value relationship extraction. The proposed MatRec could be extended to solve material selection issues for various applications and could be an excellent tool for academia and industry.
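    A minimal Bi-LSTM sequence tagger sketch (PyTorch) of the kind the article describes for entity extraction. The vocabulary, BIO-style tag set, and hyperparameters below are invented, and the model is untrained; this only illustrates the architecture, not MatRec itself.

```python
# Hedged sketch: a tiny Bi-LSTM tagger for materials/process entity spans.
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, tagset_size, embed_dim=64, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden_dim, tagset_size)

    def forward(self, token_ids):
        hidden, _ = self.lstm(self.embed(token_ids))
        return self.out(hidden)  # per-token tag scores

# Toy vocabulary and tags for the EDLC example domain.
vocab = {"<pad>": 0, "activated": 1, "carbon": 2, "electrode": 3,
         "was": 4, "annealed": 5}
tags = {"O": 0, "B-MATERIAL": 1, "I-MATERIAL": 2, "B-PROCESS": 3}

model = BiLSTMTagger(len(vocab), len(tags))
sentence = torch.tensor([[1, 2, 3, 4, 5]])  # "activated carbon electrode was annealed"
scores = model(sentence)
print(scores.argmax(dim=-1))  # predicted tag ids (arbitrary until trained)
```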

    Advanced Entity Resolution Techniques

    Entity resolution is the task of determining which records in one or more data sets correspond to the same real-world entities. Entity resolution is an important problem with a range of applications for government agencies, commercial organisations, and research institutions. Due to the important practical applications and many open challenges, entity resolution is an active area of research and a variety of techniques have been developed for each part of the entity resolution process. This thesis is about trying to improve the viability of sophisticated entity resolution techniques for real-world entity resolution problems. Collective entity resolution techniques are a subclass of entity resolution approaches that incorporate relationships into the entity resolution process and introduce dependencies between matching decisions. Group linkage techniques match multiple related records at the same time. Temporal entity resolution techniques incorporate changing attribute values and relationships into the entity resolution process. Population reconstruction techniques match records with different entity roles and very limited information in the presence of domain constraints. Sophisticated entity resolution techniques such as these produce good results when applied to small data sets in an academic environment. However, they suffer from a number of limitations which make them harder to apply to real-world problems. In this thesis, we aim to address several of these limitations with the goal that this will enable such advanced entity resolution techniques to see more use in practical applications.
    One of the main limitations of existing advanced entity resolution techniques is poor scalability. We propose a novel size-constrained blocking framework that allows the user to set minimum and maximum block-size thresholds, and then generates blocks where the number of records in each block is within the size range. This allows efficiency requirements to be met, improves parallelisation, and allows expensive techniques with poor scalability, such as Markov logic networks, to be used.
    Another significant limitation of advanced entity resolution techniques in practice is a lack of training data. Collective entity resolution techniques make use of relationship information, so a bootstrapping process is required in order to generate initial relationships. Many techniques for temporal entity resolution, group linkage and population reconstruction also require training data. In this thesis we propose a novel approach for automatically generating high quality training data using a combination of domain constraints and ambiguity. We also show how we can incorporate these constraints and ambiguity measures into active learning to further improve the training data set.
    We also address the problem of parameter tuning and evaluation. Advanced entity resolution approaches typically have a large number of parameters that need to be tuned for good performance. We propose a novel approach using transitive closure that eliminates unsound parameter choices in the blocking and similarity calculation steps and reduces the number of iterations of the entity resolution process and the corresponding evaluation.
    Finally, we present a case study where we extend our training data generation approach for situations where relationships exist between records. We make use of the relationship information to validate the matches generated by our technique, and we also extend the concept of ambiguity to cover groups, allowing us to increase the size of the generated set of matches. We apply this approach to a very complex and challenging data set of population registry data and demonstrate that we can still create high quality training data when other approaches are inadequate.
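    A hedged sketch of the size-constrained blocking idea described above: oversized blocks are split on an additional blocking key and undersized ones are pooled and merged, so block sizes land within user-set thresholds. The record fields, keys, and merge strategy are invented for illustration; the thesis's actual framework is more sophisticated.

```python
# Illustrative size-constrained blocking, assuming an ordered list of
# blocking keys to split on and a simple chunked merge of small blocks.
from collections import defaultdict

def block(records, keys, min_size, max_size, depth=0):
    """Group records so each block's size falls within [min_size, max_size]."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec[keys[depth]]].append(rec)

    final, leftovers = [], []
    for group in blocks.values():
        if len(group) > max_size and depth + 1 < len(keys):
            # Oversized block: split it further on the next blocking key.
            final.extend(block(group, keys, min_size, max_size, depth + 1))
        elif len(group) < min_size:
            leftovers.extend(group)  # undersized: pool for merging
        else:
            final.append(group)
    # Merge pooled records into chunks of at most max_size.
    for i in range(0, len(leftovers), max_size):
        final.append(leftovers[i:i + max_size])
    return final

records = [{"surname": "Smith", "postcode": "2600"},
           {"surname": "Smith", "postcode": "2601"},
           {"surname": "Jones", "postcode": "2600"}]
print(block(records, keys=["surname", "postcode"], min_size=2, max_size=2))
```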

    Exploring the Effect of Curvature on the Consistency of Dead Reckoned Paths for Different Error Threshold Metrics

    Dead reckoning is widely employed as an entity update packet reduction technique in Distributed Interactive Applications (DIAs). Such techniques reduce network bandwidth consumption and thus limit the effects of network latency on the consistency of networked simulations. A key component of the dead reckoning method is the underlying error threshold metric, as this directly determines when an entity update packet is to be sent between local and remote users. The most common metric is the spatial threshold, which is simply based on the distance between a local user's actual position and their predicted position. Other recently proposed metrics include the time-space threshold and the hybrid threshold, both of which are summarised within. This paper investigates the issue of user movement in relation to dead reckoning and each of the threshold metrics. In particular, the relationship between the curvature of movement, the various threshold metrics and absolute consistency is studied. Experimental live trials across the Internet allow a comparative analysis of how users behave when different threshold metrics are used with varying degrees of curvature. The presented results provide justification for the use of a hybrid threshold approach when dead reckoning is employed in DIAs.
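    A small sketch of dead reckoning with a spatial error threshold, the baseline metric described above; the time-space and hybrid metrics extend this basic test. The linear motion model, positions, and threshold value below are invented for illustration.

```python
# Illustrative spatial-threshold test: an update packet is sent only when
# the remote prediction drifts too far from the local entity's true position.
import math

def predict(last_update, t):
    """Remote-side prediction: linear extrapolation from the last update."""
    x, y, vx, vy, t0 = last_update
    return (x + vx * (t - t0), y + vy * (t - t0))

def spatial_threshold_exceeded(actual, predicted, threshold):
    """True when the spatial error warrants an entity update packet."""
    return math.dist(actual, predicted) > threshold

last_update = (0.0, 0.0, 1.0, 0.0, 0.0)   # x, y, vx, vy, timestamp
actual_pos = (4.0, 1.5)                   # local position at t = 4 (curved path)
predicted = predict(last_update, t=4.0)
print(predicted, spatial_threshold_exceeded(actual_pos, predicted, threshold=1.0))
```

    Curved movement, as studied in the paper, makes the linear prediction drift faster, which is why the choice of threshold metric interacts with path curvature.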