7,077 research outputs found

    Supervised Typing of Big Graphs using Semantic Embeddings

    Full text link
    We propose a supervised algorithm for generating type embeddings in the same semantic vector space as a given set of entity embeddings. The algorithm is agnostic to the derivation of the underlying entity embeddings. It does not require any manual feature engineering, generalizes well to hundreds of types and achieves near-linear scaling on Big Graphs containing many millions of triples and instances by virtue of an incremental execution. We demonstrate the utility of the embeddings on a type recommendation task, outperforming a non-parametric feature-agnostic baseline while achieving 15x speedup and near-constant memory usage on a full partition of DBpedia. Using state-of-the-art visualization, we illustrate the agreement of our extensionally derived DBpedia type embeddings with the manually curated domain ontology. Finally, we use the embeddings to probabilistically cluster about 4 million DBpedia instances into 415 types in the DBpedia ontology.Comment: 6 pages, to be published in Semantic Big Data Workshop at ACM, SIGMOD 2017; extended version in preparation for Open Journal of Semantic Web (OJSW

    Data-Driven Shape Analysis and Processing

    Full text link
    Data-driven methods play an increasingly important role in discovering geometric, structural, and semantic relationships between 3D shapes in collections, and applying this analysis to support intelligent modeling, editing, and visualization of geometric data. In contrast to traditional approaches, a key feature of data-driven approaches is that they aggregate information from a collection of shapes to improve the analysis and processing of individual shapes. In addition, they are able to learn models that reason about properties and relationships of shapes without relying on hard-coded rules or explicitly programmed instructions. We provide an overview of the main concepts and components of these techniques, and discuss their application to shape classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis, through reviewing the literature and relating the existing works with both qualitative and numerical comparisons. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing.Comment: 10 pages, 19 figure

    Input variable selection in time-critical knowledge integration applications: A review, analysis, and recommendation paper

    Get PDF
    This is the post-print version of the final paper published in Advanced Engineering Informatics. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright @ 2013 Elsevier B.V.The purpose of this research is twofold: first, to undertake a thorough appraisal of existing Input Variable Selection (IVS) methods within the context of time-critical and computation resource-limited dimensionality reduction problems; second, to demonstrate improvements to, and the application of, a recently proposed time-critical sensitivity analysis method called EventTracker to an environment science industrial use-case, i.e., sub-surface drilling. Producing time-critical accurate knowledge about the state of a system (effect) under computational and data acquisition (cause) constraints is a major challenge, especially if the knowledge required is critical to the system operation where the safety of operators or integrity of costly equipment is at stake. Understanding and interpreting, a chain of interrelated events, predicted or unpredicted, that may or may not result in a specific state of the system, is the core challenge of this research. The main objective is then to identify which set of input data signals has a significant impact on the set of system state information (i.e. output). Through a cause-effect analysis technique, the proposed technique supports the filtering of unsolicited data that can otherwise clog up the communication and computational capabilities of a standard supervisory control and data acquisition system. The paper analyzes the performance of input variable selection techniques from a series of perspectives. It then expands the categorization and assessment of sensitivity analysis methods in a structured framework that takes into account the relationship between inputs and outputs, the nature of their time series, and the computational effort required. The outcome of this analysis is that established methods have a limited suitability for use by time-critical variable selection applications. By way of a geological drilling monitoring scenario, the suitability of the proposed EventTracker Sensitivity Analysis method for use in high volume and time critical input variable selection problems is demonstrated.E

    Evolutionary Learning of Fuzzy Rules for Regression

    Get PDF
    The objective of this PhD Thesis is to design Genetic Fuzzy Systems (GFS) that learn Fuzzy Rule Based Systems to solve regression problems in a general manner. Particularly, the aim is to obtain models with low complexity while maintaining high precision without using expert-knowledge about the problem to be solved. This means that the GFSs have to work with raw data, that is, without any preprocessing that help the learning process to solve a particular problem. This is of particular interest, when no knowledge about the input data is available or for a first approximation to the problem. Moreover, within this objective, GFSs have to cope with large scale problems, thus the algorithms have to scale with the data

    IIVFDT: Ignorance Functions based Interval-Valued Fuzzy Decision Tree with Genetic Tuning

    Get PDF
    The choice of membership functions plays an essential role in the success of fuzzy systems. This is a complex problem due to the possible lack of knowledge when assigning punctual values as membership degrees. To face this handicap, we propose a methodology called Ignorance functions based Interval-Valued Fuzzy Decision Tree with genetic tuning, IIVFDT for short, which allows to improve the performance of fuzzy decision trees by taking into account the ignorance degree. This ignorance degree is the result of a weak ignorance function applied to the punctual value set as membership degree. Our IIVFDT proposal is composed of four steps: (1) the base fuzzy decision tree is generated using the fuzzy ID3 algorithm; (2) the linguistic labels are modeled with Interval-Valued Fuzzy Sets. To do so, a new parametrized construction method of Interval-Valued Fuzzy Sets is defined, whose length represents such ignorance degree; (3) the fuzzy reasoning method is extended to work with this representation of the linguistic terms; (4) an evolutionary tuning step is applied for computing the optimal ignorance degree for each Interval-Valued Fuzzy Set. The experimental study shows that the IIVFDT method allows the results provided by the initial fuzzy ID3 with and without Interval-Valued Fuzzy Sets to be outperformed. The suitability of the proposed methodology is shown with respect to both several state-of-the-art fuzzy decision trees and C4.5. Furthermore, we analyze the quality of our approach versus two methods that learn the fuzzy decision tree using genetic algorithms. Finally, we show that a superior performance can be achieved by means of the positive synergy obtained when applying the well known genetic tuning of the lateral position after the application of the IIVFDT method.Spanish Government TIN2011-28488 TIN2010-1505
    corecore