16 research outputs found

    Querying Factorized Probabilistic Triple Databases

    Full text link
    Abstract. An increasing amount of data is becoming available in the form of large triple stores, with the Semantic Web’s linked open data cloud (LOD) as one of the most prominent examples. Data quality and completeness are key issues in many community-generated data stores, like LOD, which motivates probabilistic and statistical approaches to data representation, reasoning and querying. In this paper we address the issue from the perspective of probabilistic databases, which account for uncertainty in the data via a probability distribution over all database instances. We obtain a highly compressed representation using the re-cently developed RESCAL approach and demonstrate experimentally that efficient querying can be obtained by exploiting inherent features of RESCAL via sub-query approximations of deterministic views

    Fusing Multiview Functional Brain Networks by Joint Embedding for Brain Disease Identification

    Get PDF
    Background: Functional brain networks (FBNs) derived from resting-state functional MRI (rs-fMRI) have shown great potential in identifying brain disorders, such as autistic spectrum disorder (ASD). Therefore, many FBN estimation methods have been proposed in recent years. Most existing methods only model the functional connections between brain regions of interest (ROIs) from a single view (e.g., by estimating FBNs through a specific strategy), failing to capture the complex interactions among ROIs in the brain. Methods: To address this problem, we propose fusion of multiview FBNs through joint embedding, which can make full use of the common information of multiview FBNs estimated by different strategies. More specifically, we first stack the adjacency matrices of FBNs estimated by different methods into a tensor and use tensor factorization to learn the joint embedding (i.e., a common factor of all FBNs) for each ROI. Then, we use Pearson’s correlation to calculate the connections between each embedded ROI in order to reconstruct a new FBN. Results: Experimental results obtained on the public ABIDE dataset with rs-fMRI data reveal that our method is superior to several state-of-the-art methods in automated ASD diagnosis. Moreover, by exploring FBN “features” that contributed most to ASD identification, we discovered potential biomarkers for ASD diagnosis. The proposed framework achieves an accuracy of 74.46%, which is generally better than the compared individual FBN methods. In addition, our method achieves the best performance compared to other multinetwork methods, i.e., an accuracy improvement of at least 2.72%. Conclusions: We present a multiview FBN fusion strategy through joint embedding for fMRI-based ASD identification. The proposed fusion method has an elegant theoretical explanation from the perspective of eigenvector centrality

    A Review of Relational Machine Learning for Knowledge Graphs

    Get PDF
    Relational machine learning studies methods for the statistical analysis of relational, or graph-structured, data. In this paper, we provide a review of how such statistical models can be “trained” on large knowledge graphs, and then used to predict new facts about the world (which is equivalent to predicting new edges in the graph). In particular, we discuss two different kinds of statistical relational models, both of which can scale to massive datasets. The first is based on tensor factorization methods and related latent variable models. The second is based on mining observable patterns in the graph. We also show how to combine these latent and observable models to get improved modeling power at decreased computational cost. Finally, we discuss how such statistical models of graphs can be combined with text-based information extraction methods for automatically constructing knowledge graphs from the Web. In particular, we discuss Google’s Knowledge Vault project.This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF - 1231216

    Content-based image analysis with applications to the multifunction printer imaging pipeline and image databases

    Get PDF
    Image understanding is one of the most important topics for various applications. Most of image understanding studies focus on content-based approach while some others also rely on meta data of images. Image understanding includes several sub-topics such as classification, segmentation, retrieval and automatic annotation etc., which are heavily studied recently. This thesis proposes several new methods and algorithms for image classification, retrieval and automatic tag generation. The proposed algorithms have been tested and verified in multiple platforms. For image classification, our proposed method can complete classification in real-time under hardware constraints of all-in-one printer and adaptively improve itself by online learning. Another image understanding engine includes both classification and image quality analysis is designed to solve the optimal compression problem of printing system. Our proposed image retrieval algorithm can be applied to either PC or mobile device to improve the hybrid learning experience. We also develop a new matrix factorization algorithm to better recover the image meta data (tag). The proposed algorithm outperforms other existing matrix factorization methods
    corecore