175 research outputs found

    Effects of Locality and Rule Language on Explanations for Knowledge Graph Embeddings

    Full text link
    Knowledge graphs (KGs) are key tools in many AI-related tasks such as reasoning or question answering. This has, in turn, propelled research in link prediction in KGs, the task of predicting missing relationships from the available knowledge. Solutions based on KG embeddings have shown promising results in this matter. On the downside, these approaches are usually unable to explain their predictions. While some works have proposed to compute post-hoc rule explanations for embedding-based link predictors, these efforts have mostly resorted to rules with unbounded atoms, e.g., bornIn(x,y) => residence(x,y), learned on a global scope, i.e., the entire KG. None of these works has considered the impact of rules with bounded atoms such as nationality(x,England) => speaks(x,English), or the impact of learning from regions of the KG, i.e., local scopes. We therefore study the effects of these factors on the quality of rule-based explanations for embedding-based link predictors. Our results suggest that more specific rules and local scopes can improve the accuracy of the explanations. Moreover, these rules can provide further insights into the inner workings of KG embeddings for link prediction.
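The idea of a bounded-atom rule explaining a predicted link can be sketched as follows. This is an illustrative toy, not the paper's method: the KG triples and the rule are assumptions made up for the example.

```python
# Toy KG as a set of (subject, relation, object) triples.
kg = {
    ("anne", "nationality", "England"),
    ("bob", "nationality", "England"),
    ("anne", "speaks", "English"),
}

def rule_explains(kg, predicted, rule_body, rule_head_rel):
    """Check whether a bounded-atom rule fires for a predicted triple.

    predicted: the triple (s, rel, o) produced by the link predictor.
    rule_body: (relation, bound_object) - the bounded atom of the rule body,
               e.g. nationality(x, England).
    rule_head_rel: the relation in the rule head, e.g. speaks.
    """
    s, rel, o = predicted
    body_rel, body_obj = rule_body
    # The rule explains the prediction if its head relation matches the
    # predicted relation and the body atom holds for the subject in the KG.
    return rel == rule_head_rel and (s, body_rel, body_obj) in kg

# The rule nationality(x, England) => speaks(x, English) explains a predicted
# triple bob-speaks-English, because bob's nationality is asserted in the KG.
print(rule_explains(kg, ("bob", "speaks", "English"),
                    ("nationality", "England"), "speaks"))  # True
```

A local scope, in this picture, would restrict `kg` to a region around the subject before checking the body atom.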

    Reproducing and Explaining Entity and Relation Embeddings for Link Prediction in Knowledge Graphs

    Get PDF
    Embedding knowledge graphs is a common method to encode the information in a graph as vectors in a low-dimensional space. There are two shortcomings in the field of knowledge graph embeddings for link prediction. The first shortcoming is that, as far as we know, current software libraries to compute knowledge graph embeddings differ from the original papers proposing these embeddings. Certain implementations are faithful to the original papers, while others deviate from them, with differences ranging from minute to significant. These implementation variations make it difficult to compare the same algorithm across libraries and hinder our ability to reproduce results. In this report, we describe a new framework, AugmentedKGE (aKGE), to embed knowledge graphs. The library features multiple knowledge graph embedding algorithms and a rank-based evaluator, and is developed entirely in Python and PyTorch. The second shortcoming arises in the evaluation process of link prediction, where the goal is to rank, based on scores, a positive triple against a (typically large) number of negative triples. The accuracy metrics used in the evaluation of link prediction are aggregations of the ranks of the positive triples under evaluation and do not typically provide enough detail as to why some negative triples are ranked higher than their positive counterparts. Providing explanations for these triples aids in understanding the results of link predictions based on knowledge graph embeddings. Current approaches mainly focus on explaining embeddings rather than predictions, and on single predictions rather than all the link predictions made by the embeddings of a given knowledge graph. In this report, we present an approach to explain all these predictions by providing two metrics that serve to quantify and compare the explainability of different embeddings.
From the results of evaluating aKGE, we observe that its accuracy metrics are better than those obtained from the standard implementation of OpenKE. From the explainability results, we observe that the Horn rules obtained explain more than 50% of all the negative triples generated.
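The rank-based evaluation described above can be sketched in a few lines: each positive triple gets a rank among its negatives, and the ranks are aggregated into the usual metrics (mean rank, mean reciprocal rank, Hits@k). The scores below are made up for illustration; this is not aKGE's evaluator.

```python
def rank_of_positive(pos_score, neg_scores):
    # Rank = 1 + number of negatives scored at least as high
    # (pessimistic handling of ties).
    return 1 + sum(1 for s in neg_scores if s >= pos_score)

def aggregate(ranks, k=10):
    """Aggregate per-triple ranks into MR, MRR, and Hits@k."""
    n = len(ranks)
    return {
        "MR": sum(ranks) / n,
        "MRR": sum(1.0 / r for r in ranks) / n,
        f"Hits@{k}": sum(1 for r in ranks if r <= k) / n,
    }

ranks = [rank_of_positive(0.9, [0.5, 0.95, 0.1]),   # one negative outranks -> rank 2
         rank_of_positive(0.8, [0.1, 0.2, 0.3])]    # no negative outranks -> rank 1
print(aggregate(ranks, k=1))  # {'MR': 1.5, 'MRR': 0.75, 'Hits@1': 0.5}
```

The report's point is visible here: the aggregates say nothing about *why* the 0.95 negative outranked its positive, which is the gap the explainability metrics aim to fill.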

    Large Language Models as General Pattern Machines

    Full text link
    We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences -- from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFGs), to richer spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art. Surprisingly, pattern completion proficiency can be partially retained even when the sequences are expressed using tokens randomly sampled from the vocabulary. These results suggest that without any additional training, LLMs can serve as general sequence modelers, driven by in-context learning. In this work, we investigate how these zero-shot capabilities may be applied to problems in robotics -- from extrapolating sequences of numbers that represent states over time to complete simple motions, to least-to-most prompting of reward-conditioned trajectories that can discover and represent closed-loop policies (e.g., a stabilizing controller for CartPole). While difficult to deploy today for real systems due to latency, context size limitations, and compute costs, the approach of using LLMs to drive low-level control may provide an exciting glimpse into how the patterns among words could be transferred to actions.
    Comment: 21 pages, 25 figures. To appear at the Conference on Robot Learning (CoRL) 2023.
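A minimal sketch of how the PCFG-generated token sequences mentioned above can be produced: sample strings from a small probabilistic context-free grammar. The grammar here (generating a^n b^n) is an assumption made up for the example, not one used in the paper.

```python
import random

# Each nonterminal maps to (probability, expansion) pairs;
# symbols absent from the grammar are terminals.
grammar = {
    "S": [(0.5, ["a", "S", "b"]), (0.5, ["a", "b"])],
}

def sample(symbol, rng):
    """Recursively expand a symbol into a list of terminal tokens."""
    if symbol not in grammar:          # terminal: emit as-is
        return [symbol]
    probs, expansions = zip(*grammar[symbol])
    expansion = rng.choices(expansions, weights=probs)[0]
    return [tok for part in expansion for tok in sample(part, rng)]

seq = sample("S", random.Random(0))
print("".join(seq))  # a string of the form a^n b^n
```

Sequences like these, serialized as tokens, are the kind of input an LLM would be prompted to complete in-context.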

    Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?

    Full text link
    Software vulnerabilities impose significant costs on enterprises. Despite extensive efforts in the research and development of software vulnerability detection methods, uncaught vulnerabilities continue to put software owners and users at risk. Many current vulnerability detection methods require that code snippets compile and build before attempting detection. This, unfortunately, introduces a long latency from the time a vulnerability is injected to the time it is removed, which can substantially increase the cost of fixing a vulnerability. We recognize that current advances in machine learning can be used to detect vulnerable code patterns in syntactically incomplete code snippets as the developer is writing the code, at EditTime. In this paper we present a practical system that leverages deep learning on a large-scale data set of vulnerable code patterns to learn complex manifestations of more than 250 vulnerability types and detect vulnerable code patterns at EditTime. We discuss zero-shot, few-shot, and fine-tuning approaches on state-of-the-art pre-trained Large Language Models (LLMs). We show that in comparison with state-of-the-art vulnerability detection models our approach improves the state of the art by 10%. We also evaluate our approach on detecting vulnerabilities in code auto-generated by code LLMs. Evaluation on a benchmark of high-risk code scenarios shows a reduction of up to 90% in vulnerabilities.
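The EditTime idea (scanning syntactically incomplete snippets as they are typed, with no compile step) can be illustrated with a toy pattern-based scorer standing in for the paper's learned LLM detector. The patterns, pattern names, and snippet below are illustrative assumptions only.

```python
import re

# Toy stand-in for a learned detector: regexes for a few risky patterns.
RISKY_PATTERNS = {
    "sql-injection": re.compile(r"execute\([^)]*%s"),       # string-formatted SQL
    "hardcoded-secret": re.compile(r"(password|api_key)\s*=\s*['\"]"),
}

def scan_incomplete_snippet(code: str):
    """Flag risky patterns even if the snippet would not compile."""
    return [name for name, pat in RISKY_PATTERNS.items() if pat.search(code)]

# A snippet mid-edit: the second line has an unclosed paren, so it would not
# build, yet both risky patterns are already detectable.
snippet = ('api_key = "sk-live-..."\n'
           'cursor.execute("SELECT * FROM t WHERE id=%s" % (user')
print(scan_incomplete_snippet(snippet))  # flags both patterns
```

A real system would replace the regex table with model inference over the editor buffer, but the interface - incomplete text in, warnings out - is the same.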

    Efficient Methods for Knowledge Base Construction and Query

    Full text link
    Recently, knowledge bases have been widely used in search engines, question-answering systems, and many other applications. The abundant entity profiles and relational information in knowledge bases help downstream applications learn more about user queries. However, in automated knowledge base construction, ambiguity in data sources is one of the main challenges. Given a constructed knowledge base, it is hard to efficiently find entities of interest and extract their relatedness information due to the knowledge base's large size. In this thesis, we adopt natural language processing tools, machine learning, and graph/text query techniques to deal with these challenges. First, we introduce a machine-learning-based framework for efficient entity linking to deal with the ambiguity issue in documents. For entity linking, deep-learning-based methods have outperformed traditional machine-learning-based ones, but they demand a large amount of data and incur a high training cost. We propose a lightweight, customisable and time-efficient method based on traditional machine learning techniques. Our approach achieves performance comparable to the state-of-the-art deep-learning-based ones while being significantly faster to train. Second, we adopt deep learning to deal with the Entity Resolution (ER) problem, which aims to reduce data ambiguity in structured data sources. The existing BERT-based method has set new state-of-the-art performance on the ER task, but it suffers from high computational cost due to the large number of candidate pairs to match. We propose to use BERT in a siamese network to encode the entities separately, and adopt a blocking-matching scheme in a multi-task learning framework. The blocking module filters out candidate entity pairs that are unlikely to match, while the matching module uses an enhanced alignment network to decide whether a pair is a match.
Experiments show that our approach outperforms state-of-the-art models in both efficiency and effectiveness. Third, we propose a flexible query auto-completion (QAC) framework to support efficient error-tolerant QAC for entity queries in the knowledge base. Most existing works overlook the quality of the suggested completions, and their efficiency needs to be improved. Our framework is designed on the basis of a noisy channel model, which consists of a language model and an error model; thus, many QAC ranking methods and spelling correction methods can easily be plugged into the framework. To address the efficiency issue, we devise a neighbourhood generation method, accompanied by a trie index, to quickly find candidates for the error model. The experiments show that our method improves the state of the art in error-tolerant QAC. Last but not least, we design a visualisation system to facilitate efficient relatedness queries in a large-scale knowledge graph. Given a pair of entities, we aim to efficiently extract a succinct sub-graph that explains the relatedness of the pair. Existing methods, whether graph-based or list-based, all have limitations when dealing with large complex graphs. We propose to use bisimulation to summarise the sub-graph, combining semantically similar entities. Our method exhibits the most prominent patterns while keeping them in an integrated graph.
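The noisy-channel formulation behind the QAC framework can be sketched directly: rank each candidate completion c for a typed string q by P(q | c) * P(c), where P(c) comes from a language model and P(q | c) from an error model. Everything below is an illustrative assumption - the completion list, the unigram "language model", and the similarity-based "error model" (the thesis's trie index and neighbourhood generation are replaced by a plain dictionary scan).

```python
from difflib import SequenceMatcher

# Toy language model P(c): a handful of completions with prior probabilities.
completions = {"barack obama": 0.4, "baseball": 0.35, "bastille": 0.25}

def error_model(q, prefix):
    # Toy P(q | c): similarity between the typed string and the true prefix,
    # standing in for a proper spelling-error channel model.
    return SequenceMatcher(None, q, prefix).ratio()

def suggest(q, k=2):
    """Rank completions by error-model likelihood times language-model prior."""
    scored = [(error_model(q, c[:len(q)]) * p, c) for c, p in completions.items()]
    return [c for _, c in sorted(scored, reverse=True)[:k]]

print(suggest("basw"))  # error-tolerant: "baseball" ranks first despite the typo
```

Because the two factors are independent, either the ranking method or the error model can be swapped out without touching the other - the plug-in property the framework is built around.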

    Towards Automated Circuit Discovery for Mechanistic Interpretability

    Full text link
    Recent work in mechanistic interpretability has reverse-engineered nontrivial behaviors of transformer models. These contributions required considerable effort and researcher intuition, which makes it difficult to apply the same methods to understand the complex behavior that current models display. At their core, however, the workflows behind these discoveries are surprisingly similar. Researchers create a data set and metric that elicit the desired model behavior, subdivide the network into appropriate abstract units, replace activations of those units to identify which are involved in the behavior, and then interpret the functions that these units implement. By varying the data set, metric, and units under investigation, researchers can understand the functionality of each neural network region and the circuits they compose. This work proposes a novel algorithm, Automatic Circuit DisCovery (ACDC), to automate the identification of the important units in the network. Given a model's computational graph, ACDC finds subgraphs that explain a behavior of the model. ACDC was able to reproduce a previously identified circuit for Python docstrings in a small transformer, identifying 6/7 important attention heads that compose up to 3 layers deep, while including 91% fewer connections.
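The core loop of the approach (iterate over edges of the computational graph, knock each one out, and keep only edges whose removal shifts the behavioral metric by more than a threshold) can be shown schematically. The toy two-edge "model" and threshold below are assumptions for illustration, not the transformer circuits from the paper.

```python
# Toy "model": output y depends strongly on edge a->y and negligibly on b->y.
def run(edges):
    a, b = 2.0, 5.0
    y = 0.0
    if ("a", "y") in edges:
        y += 3.0 * a          # edge that matters for the behavior
    if ("b", "y") in edges:
        y += 1e-4 * b         # edge that barely matters
    return y

def acdc_prune(edges, metric, tau=0.01):
    """Greedily remove edges whose ablation changes the metric by < tau."""
    kept = set(edges)
    for e in list(edges):                 # try knocking out each edge in turn
        baseline = metric(kept)
        ablated = metric(kept - {e})
        if abs(baseline - ablated) < tau:  # removal barely changes behavior
            kept.discard(e)                # so prune it from the subgraph
    return kept

print(acdc_prune({("a", "y"), ("b", "y")}, run))  # only ('a', 'y') survives
```

In the real algorithm, "ablating" an edge means patching in activations from a corrupted input rather than deleting a term, but the prune-if-the-metric-barely-moves decision rule is the same.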

    Neural Networks for Building Semantic Models and Knowledge Graphs

    Get PDF
    The abstract is in the attached document. Author: Futia, Giuseppe.

    GENEOs, compactifications, and graphs

    Get PDF
    Our objective in this thesis is to study the pseudo-metric and topological structure of the space of group equivariant non-expansive operators (GENEOs). We introduce the notions of compactification of a perception pair, collectionwise surjectivity, and compactification of a space of GENEOs. We obtain some compactification results for perception pairs and the space of GENEOs. We show that when the data spaces are totally bounded and we endow the common domains with metric structures, the perception pairs and every collectionwise surjective space of GENEOs can be embedded isometrically into compact ones through compatible embeddings. An important part of the study of the topology of the space of GENEOs is to populate it in a rich manner. We introduce the notion of a generalized permutant and show that this concept too, like that of a permutant, is useful for defining new GENEOs. We define the analogues of some of the aforementioned concepts in a graph-theoretic setting, enabling us to use the power of the theory of GENEOs for the study of graphs in an efficient way. We define the notions of a graph perception pair, a graph permutant, and a graph GENEO. We develop two models for the theory of graph GENEOs. The first model addresses the case of graphs having weights assigned to their vertices, while the second addresses weights assigned to the edges. We prove some new results in the proposed theory of graph GENEOs and exhibit the power of our models by describing their applications to the structural study of simple graphs. We introduce the concept of a graph permutant and show that this concept can be used to define new graph GENEOs between distinct graph perception pairs, thereby enabling us to populate the space of graph GENEOs in a rich manner and shed more light on its structure.
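The two defining GENEO properties can be checked numerically on a toy finite domain: equivariance, F(phi . g) = F(phi) . g for a group of index shifts, and non-expansiveness, ||F(phi) - F(psi)||_inf <= ||phi - psi||_inf. The operator (a moving average on functions over Z/NZ) and the sample data are illustrative assumptions, not constructions from the thesis.

```python
N = 8  # data: functions on Z/NZ, represented as length-N lists;
       # the group: cyclic index shifts g

def shift(phi, g):
    return [phi[(i + g) % N] for i in range(N)]

def F(phi):
    # Moving average: commutes with shifts (equivariant) and, being a convex
    # combination of values, non-expansive in the sup norm.
    return [(phi[i] + phi[(i + 1) % N]) / 2 for i in range(N)]

def sup_dist(phi, psi):
    return max(abs(x - y) for x, y in zip(phi, psi))

phi = [0, 1, 4, 2, 0, 3, 1, 5]
psi = [1, 1, 3, 2, 1, 4, 0, 5]

print(F(shift(phi, 3)) == shift(F(phi), 3))            # True: equivariance
print(sup_dist(F(phi), F(psi)) <= sup_dist(phi, psi))  # True: non-expansiveness
```

Operators like F are the points of the space whose compactifications the thesis studies; graph GENEOs replace the shift group with graph symmetries.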