835 research outputs found

    Supervised Typing of Big Graphs using Semantic Embeddings

    We propose a supervised algorithm for generating type embeddings in the same semantic vector space as a given set of entity embeddings. The algorithm is agnostic to the derivation of the underlying entity embeddings. It does not require any manual feature engineering, generalizes well to hundreds of types and achieves near-linear scaling on Big Graphs containing many millions of triples and instances by virtue of an incremental execution. We demonstrate the utility of the embeddings on a type recommendation task, outperforming a non-parametric feature-agnostic baseline while achieving 15x speedup and near-constant memory usage on a full partition of DBpedia. Using state-of-the-art visualization, we illustrate the agreement of our extensionally derived DBpedia type embeddings with the manually curated domain ontology. Finally, we use the embeddings to probabilistically cluster about 4 million DBpedia instances into 415 types in the DBpedia ontology.
    Comment: 6 pages, to be published in Semantic Big Data Workshop at ACM SIGMOD 2017; extended version in preparation for Open Journal of Semantic Web (OJSW)
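
The abstract above does not spell out how the type embeddings are computed, so the following is only a minimal sketch of one way an incremental, single-pass derivation could look. It assumes each type vector is the normalized sum of the embeddings of its instances, and that types are recommended by cosine similarity. The function names `build_type_embeddings` and `recommend_types` are hypothetical and do not come from the paper.

```python
from collections import defaultdict

import numpy as np


def build_type_embeddings(stream, dim):
    """Accumulate one vector per type in a single pass over (vector, types) pairs.

    Assumption: a type embedding is the L2-normalized sum of its instances'
    entity embeddings. Memory grows with the number of types, not entities.
    """
    sums = defaultdict(lambda: np.zeros(dim))
    for vec, types in stream:
        for type_name in types:
            sums[type_name] += vec
    type_embs = {}
    for type_name, total in sums.items():
        norm = np.linalg.norm(total)
        if norm > 0:  # skip degenerate types whose instance vectors cancel out
            type_embs[type_name] = total / norm
    return type_embs


def recommend_types(vec, type_embs, k=3):
    """Rank types by cosine similarity between an entity vector and type vectors."""
    unit = vec / np.linalg.norm(vec)
    scores = {name: float(unit @ emb) for name, emb in type_embs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Because the accumulator holds one vector per type, the sketch processes entities as a stream, which is one plausible reading of the "incremental execution" and "near-constant memory usage" mentioned in the abstract.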

    Network Embedding Learning in Knowledge Graph

    University of Technology Sydney. Faculty of Engineering and Information Technology.
    A Knowledge Graph stores a large number of human knowledge facts in the form of a multi-relational network structure and is widely used as a core technique in real-world applications, including search engines, question answering systems, and recommender systems. For example, the Google search engine uses a Knowledge Graph to provide an extra info box for user queries, the WolframAlpha site provides a question answering service that relies on a Knowledge Graph, and eBay uses a Knowledge Graph as a semantic enhancement for its recommendation service. Motivated by several characteristics of Knowledge Graphs, including incompleteness, structural inferability, and semantic application enhancement, a number of efforts have been made in the area of Knowledge Graph analysis. Some works contribute to Knowledge Graph construction and maintenance through crowdsourcing. Some earlier network embedding learning models perform well on homogeneous network analysis, but their performance when applied directly to a Knowledge Graph is limited because the multi-relational information of the Knowledge Graph is ignored. This led to the concept of Knowledge Graph embedding learning: by learning representations for Knowledge Graph components, including entities and relations, the latent semantic information is extracted into embedding representations. Embedding techniques are also used in collaborative learning between a Knowledge Graph and external application scenarios, where the goal is to use the Knowledge Graph as a semantic enhancement to improve the performance of external applications. However, some problems remain in Knowledge Graph completion, reasoning, and external application. First, a proper model is required for Knowledge Graph self-completion, and a proper integration solution is required to add extra conceptual taxonomy information to the completion process. Second, a framework is needed that uses sub-structure information of the Knowledge Graph network in knowledge reasoning. Third, a collaborative learning framework for Knowledge Graph completion and downstream machine learning tasks needs to be designed. In this thesis, we take recommender systems as an example of a downstream machine learning task. To address these research problems, the thesis proposes the following approaches:
    • A bipartite graph embedding based approach for Knowledge Graph self-completion, in which each knowledge fact is represented as a bipartite graph structure for more reasonable triple inference.
    • An embedding based cross-completion approach for completing the factual Knowledge Graph with additional conceptual taxonomy information, in which the components of the factual Knowledge Graph and the conceptual taxonomy (entities, relations, and types) are jointly represented by embeddings.
    • Two sub-structure based Knowledge Graph transitive relation embedding approaches for knowledge reasoning, in which the transitive structural information contained in Knowledge Graph network sub-structures is learned into relation embeddings.
    • Two hierarchical collaborative embedding approaches for collaborative learning on a Knowledge Graph and a recommender system, which link Knowledge Graph entities with recommender items so that entities, relations, items, and users are represented by embeddings in a collaborative space.
    The main contribution of this thesis is a set of approaches that can be used in multiple Knowledge Graph related domains: completion, reasoning, and application. Two approaches achieve more accurate Knowledge Graph completion, two model knowledge reasoning based on network sub-structure analysis, and the remaining two apply a Knowledge Graph to a recommender system application.

    Crosslingual Document Embedding as Reduced-Rank Ridge Regression

    There has recently been much interest in extending vector-based word representations to multiple languages, such that words can be compared across languages. In this paper, we shift the focus from words to documents and introduce a method for embedding documents written in any language into a single, language-independent vector space. For training, our approach leverages a multilingual corpus where the same concept is covered in multiple languages (but not necessarily via exact translations), such as Wikipedia. Our method, Cr5 (Crosslingual reduced-rank ridge regression), starts by training a ridge-regression-based classifier that uses language-specific bag-of-word features in order to predict the concept that a given document is about. We show that, when constraining the learned weight matrix to be of low rank, it can be factored to obtain the desired mappings from language-specific bags-of-words to language-independent embeddings. As opposed to most prior methods, which use pretrained monolingual word vectors, postprocess them to make them crosslingual, and finally average word vectors to obtain document vectors, Cr5 is trained end-to-end and is thus natively crosslingual as well as document-level. Moreover, since our algorithm uses the singular value decomposition as its core operation, it is highly scalable. Experiments show that our method achieves state-of-the-art performance on a crosslingual document retrieval task. Finally, although not trained for embedding sentences and words, it also achieves competitive performance on crosslingual sentence and word retrieval tasks.
    Comment: In The Twelfth ACM International Conference on Web Search and Data Mining (WSDM '19)
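
The low-rank factoring step described above can be illustrated with the textbook construction for reduced-rank ridge regression. This is a minimal single-matrix sketch, not the Cr5 implementation: it assumes a dense feature matrix `X` (documents by bag-of-words features) and a target matrix `Y` (documents by concepts), and the function name `reduced_rank_ridge` is hypothetical.

```python
import numpy as np


def reduced_rank_ridge(X, Y, lam, rank):
    """Return factors A (features x rank) and B (rank x targets).

    A @ B is the rank-constrained ridge solution; X @ A gives the
    low-dimensional embeddings of the rows of X.
    """
    n_features = X.shape[1]
    # Full-rank ridge solution: (X^T X + lam I)^{-1} X^T Y
    W = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ Y)
    # Fitted values on the ridge-augmented design [X; sqrt(lam) I]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(n_features)])
    Y_hat = X_aug @ W
    # The top right-singular vectors span the best rank-r target subspace
    _, _, Vt = np.linalg.svd(Y_hat, full_matrices=False)
    V_r = Vt[:rank].T
    A = W @ V_r   # maps bag-of-words features to embeddings
    B = V_r.T     # maps embeddings back to concept scores
    return A, B
```

In a crosslingual setting one would, under the same assumptions, obtain one block of `A` per language, so that documents from different languages are mapped into the shared `rank`-dimensional space.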

    Evaluating the Effectiveness of Margin Parameter when Learning Knowledge Embedding Representation for Domain-specific Multi-relational Categorized Data

    Learning knowledge representation is an increasingly important technology that supports a variety of machine learning related applications. However, the choice of hyperparameters is seldom justified and usually relies on exhaustive search. Understanding the effect of hyperparameter combinations on embedding quality is crucial to avoid this inefficient process and to enhance the practicality of vector representation methods. We evaluate the effects of distinct values of the margin parameter, focusing on translational embedding representation models for multi-relational categorized data. We assess the influence of the margin on the quality of embedding models by contrasting accuracy on the traditional link prediction task with accuracy on a classification task. The findings provide evidence that lower margin values are not rigorous enough to help the learning process, whereas larger values produce too much noise, pushing the entities out to the surface of the hyperspace and thus requiring constant regularization. Finally, the correlation between link prediction and classification accuracy shows that the traditional validation protocol for embedding models is a weak metric of the quality of embedding representations.
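
To make the role of the margin concrete, here is a minimal sketch of the margin-based ranking loss commonly used with translational models such as TransE. The abstract does not name the exact model or loss, so treating it as TransE with an L2 distance is an assumption, and the function names are hypothetical.

```python
import numpy as np


def transe_distance(head, relation, tail):
    """Translational score: distance between head + relation and tail."""
    return np.linalg.norm(head + relation - tail, axis=-1)


def margin_ranking_loss(pos_dist, neg_dist, margin):
    """Hinge loss asking corrupted triples to be at least `margin` farther away."""
    return np.maximum(0.0, margin + pos_dist - neg_dist)


# Distances for two (true triple, corrupted triple) pairs
pos = np.array([0.5, 1.0])
neg = np.array([1.0, 3.0])

small = margin_ranking_loss(pos, neg, margin=0.1)   # both pairs already satisfied
medium = margin_ranking_loss(pos, neg, margin=1.0)  # only the first pair is active
large = margin_ranking_loss(pos, neg, margin=4.0)   # every pair is penalized
```

With a small margin most pairs contribute zero loss and therefore no training signal, while a large margin keeps every pair active and keeps pushing embeddings apart. This is consistent with the abstract's observation that large margins call for constant regularization.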

    Novel Perspectives and Applications of Knowledge Graph Embeddings: From Link Prediction to Risk Assessment and Explainability

    Knowledge graph representation is an important embedding technology that supports a variety of machine learning related applications. By learning the distributed representation of multi-relational data, knowledge embedding models are supposed to efficiently deal with the semantic relatedness of their constituents. However, failing in the fundamental task of creating an appropriate form to represent knowledge harms any attempt at designing subsequent machine learning tasks. Several knowledge embedding methods have been proposed in the last decade. Although there is a consensus on the idea that enhanced approaches are more efficient, more complex projections in the hyperspace that indeed favor link prediction (or knowledge graph completion) can result in a loss of semantic similarity. We propose a new evaluation task that aims at performing risk assessment on domain-specific categorized multi-relational datasets, designed as a classification problem based on the resulting embeddings. We assess the quality of embedding representations based on the synergy of the resulting clusters of target subjects. We show that more sophisticated embedding approaches do not necessarily favor embedding quality, and that the traditional link prediction validation protocol is a weak metric of the quality of embedding representations. Finally, we present insights about using the synergy analysis to provide risk assessment explainability based on the probability distribution of feature-value pairs within embedded clusters.

    Aspects of the topological dynamics of sparse graph automorphism groups

    We examine sparse graph automorphism groups from the perspective of the Kechris-Pestov-Todorčević (KPT) correspondence. The sparse graphs that we discuss are Hrushovski constructions: we consider the ‘ab initio’ Hrushovski construction M_0, the Fraïssé limit of the class of 2-sparse graphs with self-sufficient closure; M_1, a simplified version of M_0; and the ω-categorical Hrushovski construction M_F. We prove a series of results that show that the automorphism groups of these Hrushovski constructions demonstrate very different behaviour to previous classes studied in the KPT context. Extending results of Evans, Hubička and Nešetřil, we show that Aut(M_0) has no coprecompact amenable subgroup. We investigate the fixed points on type spaces property, a weakening of extreme amenability, and show that for a particular choice of control function F, Aut(M_F) does not have any closed oligomorphic subgroup with this property. Next we consider the Aut(M_1)-flow of linear orders on M_1, and show that minimal subflows of this have all Aut(M_1)-orbits meagre. We give partial analogous results for the Aut(M_0)-flow of linear orders on M_0, and find the universal minimal flow of the automorphism group of the “dimension 0” part of M_0.

    Investigating the temporal dynamics of inter-organizational exchange: patient transfers among Italian hospitals

    Previous research on interaction behavior among organizations (resource exchange, collaboration, communication) has typically aggregated records of those behaviors over time to constitute a ‘network’ of organizational relationships. We instead directly study structural-temporal patterns in organizational exchange, focusing on the dynamics of reciprocation. Applying this lens to a community of Italian hospitals during the period 2003-2007, we observe two mechanisms of interorganizational reciprocation: organizational embedding and resource dependence. We flesh out these two mechanisms by showing how they operate in distinct time frames: Dependence operates on contemporaneous exchange structures, whereas embedding develops through longer-term historical patterns. We also show how these processes operate differently in competitive and noncompetitive contexts, operationalized in terms of market differentiation and geographic space. In noncompetitive contexts, we observe both logics of reciprocation, dependence in the short term and embedding over the long term, developing into patterns of generalized exchange in this population. In competitive contexts, we observe neither form of reciprocation and instead observe the microfoundations of status hierarchies in exchange.