2,732 research outputs found

    On the non-efficient PAC learnability of conjunctive queries

    Get PDF
    This note serves three purposes: (i) we provide a self-contained exposition of the fact that conjunctive queries are not efficiently learnable in the Probably-Approximately-Correct (PAC) model, paying clear attention to the complicating fact that this concept class lacks the polynomial-size fitting property, a property that is tacitly assumed in much of the computational learning theory literature; (ii) we establish a strong negative PAC learnability result that applies to many restricted classes of conjunctive queries (CQs), including acyclic CQs for a wide range of notions of acyclicity; (iii) we show that CQs (and UCQs) are efficiently PAC learnable with membership queries.
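    To make the setting concrete, here is a minimal, self-contained sketch (not taken from the note) of a conjunctive query evaluated by brute-force homomorphism search, together with the membership-query oracle such a learner is allowed to consult; the relation name R and the sample database are purely illustrative.

```python
from itertools import product

# A conjunctive query as a head plus a list of atoms over variables; e.g.
# q(x) :- R(x, y), R(y, z) asks for nodes with an outgoing path of length 2.
QUERY_HEAD = ("x",)
QUERY_BODY = [("R", ("x", "y")), ("R", ("y", "z"))]

def answers(db, head=QUERY_HEAD, body=QUERY_BODY):
    """Evaluate a CQ by brute-force homomorphism search
    (exponential in the number of variables; fine for toy instances)."""
    domain = {c for rel in db.values() for tup in rel for c in tup}
    variables = sorted({v for _, args in body for v in args})
    results = set()
    for assignment in product(domain, repeat=len(variables)):
        env = dict(zip(variables, assignment))
        if all(tuple(env[v] for v in args) in db.get(rel, set())
               for rel, args in body):
            results.add(tuple(env[v] for v in head))
    return results

def membership_oracle(db, tup):
    """A membership query: is `tup` an answer to the hidden target query
    on database `db`? A learner may ask this for examples of its choosing."""
    return tup in answers(db)

db = {"R": {("a", "b"), ("b", "c")}}
print(answers(db))                    # {('a',)}
print(membership_oracle(db, ("b",)))  # False: no 2-step path starts at b
```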

    How to Turn Your Knowledge Graph Embeddings into Generative Models

    Full text link
    Some of the most successful knowledge graph embedding (KGE) models for link prediction -- CP, RESCAL, TuckER, ComplEx -- can be interpreted as energy-based models. Under this perspective, they are not amenable to exact maximum-likelihood estimation (MLE) or sampling, and they struggle to integrate logical constraints. This work re-interprets the score functions of these KGEs as circuits -- constrained computational graphs allowing efficient marginalisation. Then, we design two recipes to obtain efficient generative circuit models by either restricting their activations to be non-negative or squaring their outputs. Our interpretation comes with little or no loss of performance for link prediction, while the circuits framework unlocks exact learning by MLE, efficient sampling of new triples, and the guarantee that logical constraints are satisfied by design. Furthermore, our models scale more gracefully than the original KGEs on graphs with millions of entities.
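    As a hedged illustration of the non-negative recipe, the toy sketch below applies it to a CP factorisation: squashing the factors to be non-negative makes the score a sum of products, so the partition function factorises and exact triple probabilities become cheap. The sizes and the exp squashing are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ent, n_rel, rank = 5, 2, 4

# CP factorisation: score(s, r, o) = sum_k S[s,k] * R[r,k] * O[o,k].
# Non-negative recipe: passing the factors through exp makes every
# summand, and hence every score, >= 0.
S = np.exp(rng.normal(size=(n_ent, rank)))
R = np.exp(rng.normal(size=(n_rel, rank)))
O = np.exp(rng.normal(size=(n_ent, rank)))

def score(s, r, o):
    return float(np.sum(S[s] * R[r] * O[o]))

# Because the score is a sum of products, the partition function over all
# (s, r, o) triples factorises -- this is the circuit structure that makes
# exact marginalisation O(rank) instead of O(n_ent^2 * n_rel).
Z = float(np.sum(S.sum(0) * R.sum(0) * O.sum(0)))

def prob(s, r, o):
    return score(s, r, o) / Z  # exact triple probability

# Sanity check: probabilities over all triples sum to 1.
total = sum(prob(s, r, o) for s in range(n_ent)
            for r in range(n_rel) for o in range(n_ent))
print(round(total, 6))  # 1.0
```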

    Fuzzy Norm-Explicit Product Quantization for Recommender Systems

    Get PDF
    As data resources grow, providing recommendations that best meet users' demands has become a vital requirement in business and everyday life to overcome the information overload problem. However, building a system that suggests relevant recommendations has always been a point of debate. One of the most cost-efficient techniques for producing relevant recommendations at low complexity is Product Quantization (PQ), and PQ approaches have continued to develop in recent years. The crucial challenge is improving product quantization's recall without compromising its complexity. This makes the algorithm suitable for problems that require a greater number of potentially relevant items without disregarding others, at high speed and low cost to keep up with traffic. This is the case for online shops, where targeted recommendations are important even though customers may also be open to exploring other products. A recent line of work exploits the notion of norm sub-vectors encoded in product quantizers. This research proposes a fuzzy approach to perform norm-based product quantization. Type-2 fuzzy sets (T2FSs) define the codebook, allowing sub-vectors to be associated with more than one codebook element, and the norm calculus is then resolved by means of integration. Our method improves recall, making the algorithm suitable for problems that require retrieving as many potentially relevant items as possible without disregarding others. The proposed approach is tested on three public recommender benchmark datasets and compared against seven PQ approaches for Maximum Inner-Product Search (MIPS). The proposed method outperforms PQ approaches such as NEQ, PQ, and RQ by up to +6%, +5%, and +8%, achieving recalls of 94%, 69%, and 59% on the Netflix, Audio, and Cifar60k datasets, respectively. Moreover, its computing time and complexity nearly equal those of the most computationally efficient PQ method in the state of the art.
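    For orientation, the sketch below shows the vanilla product-quantization baseline for MIPS that such methods build on (one codebook per sub-space, per-item codes, lookup-table scoring); it does not implement the type-2 fuzzy or norm-explicit extensions, and all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def kmeans(X, k, iters=20):
    """Tiny k-means, enough for a toy codebook."""
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                C[j] = X[labels == j].mean(0)
    return C

d, m, k, n = 16, 4, 8, 500          # dim, sub-spaces, codes per space, items
sub = d // m
X = rng.normal(size=(n, d)).astype(np.float32)

# Train one codebook per sub-space and encode every item as m code indices.
books = [kmeans(X[:, i*sub:(i+1)*sub], k) for i in range(m)]
codes = np.stack([np.argmin(((X[:, i*sub:(i+1)*sub][:, None]
                              - books[i][None]) ** 2).sum(-1), axis=1)
                  for i in range(m)], axis=1)

def approx_mips(q, top=5):
    """Approximate inner products via per-sub-space lookup tables."""
    tables = [books[i] @ q[i*sub:(i+1)*sub] for i in range(m)]  # m x (k,)
    scores = sum(tables[i][codes[:, i]] for i in range(m))      # (n,)
    return np.argsort(-scores)[:top]

q = rng.normal(size=d).astype(np.float32)
print(approx_mips(q))            # approximate top-5 item ids
print(np.argsort(-(X @ q))[:5])  # exact top-5 for comparison
```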

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    The Visual Sensory Memory Task: Integrating new knowledge and investigating pattern separation with a new memory task using abstract, similarity-adjustable stimuli

    Get PDF
    In classical memory tasks, it is often necessary to distinguish between old and new stimuli. Recent studies also use tasks in which stimuli appear that are similar to, but not identical with, familiar stimuli, so-called lures. These tasks were designed to study two postulated sub-functions of memory: pattern separation and pattern completion. The stimuli are usually pictures of everyday objects, for which, however, differing prior knowledge may influence memory performance, and the degree of similarity between two pictures cannot be determined objectively. The Visual Sensory Memory Task (VSMT) developed by Kaernbach and colleagues is a visual pink-noise-based task that can be used to construct lure stimuli with precisely quantifiable degrees of similarity. In this dissertation, existing test procedures are first compared with the newly developed test to investigate the validity of the VSMT. The VSMT is then administered to different age groups to study its reliability, and finally, the neural processes in the hippocampus during memory retrieval are examined. The results of the first experiment demonstrate the validity of the VSMT as a neuropsychological measurement tool for studying declarative memory. The second experiment showed that it is possible to perform the VSMT with different age groups and that performance shows the same profile across all age groups. In addition, it showed the expected better performance of young adults compared to four- to five-year-old children as well as adults over 65 years of age. The results of the third experiment confirm existing findings about the dentate gyrus as the central region for pattern separation and about the CA3 region as the core of the dynamic balance between pattern separation and pattern completion. Furthermore, they point to an involvement of the subiculum in pattern separation.
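    As a rough illustration of the underlying idea (not the VSMT's actual construction), the sketch below generates 2D pink (1/f) noise and builds a lure by mixing a target image with fresh noise, so that a single mixing weight yields a quantifiable degree of similarity.

```python
import numpy as np

rng = np.random.default_rng(2)

def pink_noise(n=128):
    """2D pink (1/f) noise: shape white noise in the frequency domain."""
    white = rng.normal(size=(n, n))
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.sqrt(fx**2 + fy**2)
    f[0, 0] = 1.0  # avoid division by zero at the DC component
    img = np.real(np.fft.ifft2(np.fft.fft2(white) / f))
    return (img - img.mean()) / img.std()

def make_lure(target, similarity):
    """Mix the target with fresh noise; `similarity` in [0, 1] controls
    how close the lure is to the original stimulus."""
    fresh = pink_noise(target.shape[0])
    lure = similarity * target + np.sqrt(1 - similarity**2) * fresh
    return (lure - lure.mean()) / lure.std()

target = pink_noise()
lure = make_lure(target, similarity=0.8)
print(np.corrcoef(target.ravel(), lure.ravel())[0, 1])  # ~0.8
```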

    Evaluating Symbolic AI as a Tool to Understand Cell Signalling

    Get PDF
    The diverse and highly complex nature of modern phosphoproteomics research produces a high volume of data. Chemical phosphoproteomics, especially, is amenable to a variety of analytical approaches. In this thesis we evaluate novel symbolic-AI-based algorithms as potential tools in the analysis of cell signalling. Initially, we developed a first-order deductive, logic-based model. This allowed us to identify previously unreported inhibitor-kinase relationships which could offer novel therapeutic targets for further investigation. Following this, we made use of the probabilistic reasoning of ProbLog to augment the aforementioned Prolog-based model with an intuitively calculated degree of belief. This allowed us to rank previous associations while also further increasing our confidence in already established predictions. Finally, we applied our methodology to a Saccharomyces cerevisiae gene-perturbation phosphoproteomics dataset. In this context we were able to confirm the majority of ground truths, i.e. gene deletions, as having taken place as intended. For the remaining deletions, again using a purely symbolic approach, we were able to provide predictions on the rewiring of kinase-based signalling networks following kinase-encoding gene deletions. The explainable, human-readable and white-box nature of this approach was highlighted; however, its brittleness due to missing, inconsistent or conflicting background knowledge was also examined.
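    The flavour of the probabilistic step can be sketched in a few lines: probabilistic facts carry degrees of belief, and a derived atom's probability is obtained by combining its independent derivations (noisy-or). The drug and kinase names and the probabilities below are hypothetical, and real ProbLog inference is more general than this toy.

```python
# A toy, hand-rolled flavour of the ProbLog idea used in the thesis
# (illustrative only; the actual work uses ProbLog's exact inference).
# Probabilistic facts: degree of belief that an inhibitor binds a kinase.
binds = {("drugA", "CDK1"): 0.9, ("drugA", "CDK2"): 0.4}
# Deterministic background knowledge: which kinase phosphorylates which site.
phosphorylates = {("CDK1", "siteX"), ("CDK2", "siteX")}

def p_site_inhibited(drug, site):
    """P(site is inhibited) = noisy-or over the kinases the drug may bind,
    assuming the binding events are independent."""
    p_not = 1.0
    for (d, kinase), p in binds.items():
        if d == drug and (kinase, site) in phosphorylates:
            p_not *= 1.0 - p
    return 1.0 - p_not

print(p_site_inhibited("drugA", "siteX"))  # 1 - 0.1 * 0.6 = 0.94
```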

    Explainable temporal data mining techniques to support the prediction task in Medicine

    Get PDF
    In recent decades, the increasing amount of data available in all fields has raised the need to discover new knowledge and explain the hidden information found. On the one hand, the rapid increase of interest in, and use of, artificial intelligence (AI) in computer applications has raised a parallel concern about its ability (or lack thereof) to provide understandable, or explainable, results to users. In the biomedical informatics and computer science communities, there is considerable discussion about the "un-explainable" nature of artificial intelligence, where algorithms and systems often leave users, and even developers, in the dark with respect to how results were obtained. Especially in the biomedical context, the need to explain an artificial intelligence system's results is legitimate given the importance of patient safety. On the other hand, current database systems enable us to store huge quantities of data, and their analysis through data mining techniques makes it possible to extract relevant knowledge and useful hidden information. Relationships and patterns within these data could provide new medical knowledge. The analysis of such healthcare/medical data collections could greatly help to observe the health conditions of the population and extract useful information that can be exploited in the assessment of healthcare/medical processes. In particular, the prediction of medical events is essential for preventing disease, understanding disease mechanisms, and increasing patient quality of care. In this context, an important aspect is to verify whether the database content supports the capability of predicting future events. In this thesis, we start by addressing the problem of explainability, discussing some of the most significant challenges that need to be addressed with scientific and engineering rigor in a variety of biomedical domains. We analyze the "temporal component" of explainability, focusing on different perspectives such as the use of temporal data, the temporal task, temporal reasoning, and the dynamics of explainability with respect to the user perspective and to knowledge. Starting from this panorama, we focus our attention on two different temporal data mining techniques. For the first, based on trend abstractions, we start from the concept of Trend-Event Pattern and, moving through the concept of prediction, propose a new kind of predictive temporal pattern, namely Predictive Trend-Event Patterns (PTE-Ps). The framework aims to combine complex temporal features to extract a compact and non-redundant predictive set of patterns composed of such temporal features. For the second, based on functional dependencies, we propose a methodology for deriving a new kind of approximate temporal functional dependency, called Approximate Predictive Functional Dependencies (APFDs), based on a three-window framework. We then discuss the concept of approximation, the data complexity of deriving an APFD, the introduction of two new error measures, and finally the quality of APFDs in terms of coverage and reliability. Exploiting these methodologies, we analyze intensive care unit data from the MIMIC dataset.
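    The classical notion that approximate functional dependencies build on can be illustrated briefly: the g3-style error of a dependency is the fraction of rows that must be dropped for it to hold exactly. The sketch below computes it for hypothetical ICU-style rows; the thesis's APFDs add temporal windows and new error measures on top of this idea.

```python
from collections import Counter, defaultdict

def g3_error(rows, lhs, rhs):
    """g3-style error of the functional dependency lhs -> rhs: the fraction
    of rows that must be removed for the FD to hold exactly."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[tuple(row[a] for a in lhs)][tuple(row[a] for a in rhs)] += 1
    keep = sum(max(c.values()) for c in groups.values())
    return 1.0 - keep / len(rows)

# Hypothetical rows: does heart-rate class determine the chosen therapy?
rows = [
    {"hr": "high", "therapy": "beta-blocker"},
    {"hr": "high", "therapy": "beta-blocker"},
    {"hr": "high", "therapy": "none"},
    {"hr": "low",  "therapy": "none"},
]
print(g3_error(rows, lhs=["hr"], rhs=["therapy"]))  # 0.25
```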

    Understanding comparative questions and retrieving argumentative answers

    Get PDF
    Making decisions is an integral part of everyday life, yet it can be a difficult and complex process. While people's wants and needs are unlimited, resources are often scarce, making it necessary to research the possible alternatives and weigh the pros and cons before making a decision. Nowadays, the Internet has become the main source of information when it comes to comparing alternatives, making search engines the primary means for collecting new information. However, relying only on term matching is not sufficient to adequately address requests for comparisons. Therefore, search systems should go beyond this approach to effectively address comparative information needs. In this dissertation, I explore from different perspectives how search systems can respond to comparative questions. First, I examine approaches to identifying comparative questions and study their underlying information needs. Second, I investigate a methodology to identify important constituents of comparative questions, such as the to-be-compared options, and to detect the stance of answers towards these comparison options. Then, I address ambiguous comparative search queries by studying an interactive clarification search interface. Finally, addressing the answering of comparative questions, I investigate retrieval approaches that consider not only the topical relevance of potential answers but also account for the presence of arguments towards the comparison options mentioned in the questions. By addressing these facets, I aim to provide a comprehensive understanding of how to effectively satisfy the information needs of searchers seeking to compare different alternatives.

    Integrating connection search into graph queries

    Get PDF
    When graph database users explore unfamiliar graphs, potentially with heterogeneous structure, they may need to find how two or more groups of nodes are connected in a graph, even when they are not able to describe the connections. This is only partially supported by existing query languages, which allow searching for paths, but not for trees connecting three or more node groups. In this work, we formally show how to integrate connecting tree patterns (CTPs, in short) with a graph query language such as GPML, SPARQL or Cypher, leading to Extended Queries (EQs, in short). We then study a set of algorithms for evaluating CTPs; we generalize prior keyword search work to be complete, most importantly by (i) considering bidirectional edge traversal, (ii) allowing users to select any score function for ranking CTP results and (iii) returning all results. To cope with very large search spaces, we propose efficient pruning techniques and formally establish a large set of cases where our best algorithm, MOLESP, is complete even with pruning. Our experiments validate the performance of our algorithms on many synthetic and real-world workloads.
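    A minimal sketch of the underlying keyword-search idea (far simpler than MOLESP and its pruning): run a multi-source BFS from each node group, traversing edges in both directions, root the tree at the node minimising the summed distances, and return the union of the BFS paths. The toy graph below is illustrative, and this heuristic does not guarantee a minimum-weight tree.

```python
from collections import deque

def bfs_dist_parent(adj, sources):
    """Multi-source BFS; edges are traversed in both directions via `adj`."""
    dist, parent = {}, {}
    dq = deque()
    for s in sources:
        dist[s], parent[s] = 0, None
        dq.append(s)
    while dq:
        u = dq.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], parent[v] = dist[u] + 1, u
                dq.append(v)
    return dist, parent

def connecting_tree(edges, groups):
    """A tree linking one node from each group, rooted at the node that
    minimises the summed BFS distances to all groups."""
    adj = {}
    for u, v in edges:  # undirected adjacency = bidirectional traversal
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    runs = [bfs_dist_parent(adj, g) for g in groups]
    common = set(adj)
    for dist, _ in runs:
        common &= set(dist)
    root = min(common, key=lambda n: sum(d[n] for d, _ in runs))
    tree = set()
    for _, parent in runs:  # walk each BFS path back to its group
        n = root
        while parent[n] is not None:
            tree.add((parent[n], n))
            n = parent[n]
    return root, tree

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "e")]
print(connecting_tree(edges, [{"a"}, {"d"}, {"e"}]))  # root and tree edges
```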

    Intelligent interface agents for biometric applications

    Get PDF
    This thesis investigates the benefits of applying the intelligent agent paradigm to biometric identity verification systems. Multimodal biometric systems, despite their additional complexity, hold the promise of providing a higher degree of accuracy and robustness. Multimodal biometric systems are examined in this work, leading to the design and implementation of a novel distributed multimodal identity verification system based on an intelligent agent framework. User interface design issues are also important in the domain of biometric systems and present an exceptional opportunity for employing adaptive interface agents. Through the use of such interface agents, system performance may be improved, leading to an increase in recognition rates over a non-adaptive system while producing a more robust and agreeable user experience. The investigation of such adaptive systems has been a focus of the work reported in this thesis. The research presented in this thesis is divided into two main parts. First, the design, development and testing of a novel distributed multimodal authentication system employing intelligent agents is presented. The second part details the design and implementation of an adaptive interface layer based on interface agent technology and demonstrates its integration with a commercial fingerprint recognition system. The performance of these systems is then evaluated using databases of biometric samples gathered during the research. The results obtained from the experimental evaluation of the multimodal system demonstrated a clear improvement in the accuracy of the system compared to a unimodal biometric approach. The adoption of the intelligent agent architecture at the interface level resulted in a system where false reject rates were reduced compared to a system that did not employ an intelligent interface. The results obtained from both systems clearly demonstrate the benefits of combining an intelligent agent framework with a biometric system to provide a more robust and flexible application.
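    As a hedged sketch of how a multimodal system can combine matchers at the score level (a common approach; the thesis's agent-based design is considerably richer), the snippet below min-max-normalises two matchers' scores and fuses them with a weighted sum. The weights, threshold, and score values are arbitrary illustrative choices, not tuned values from the thesis.

```python
import numpy as np

def minmax(scores):
    """Normalise matcher scores to [0, 1] before fusing."""
    s = np.asarray(scores, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def fuse(face_scores, finger_scores, w_face=0.4, threshold=0.6):
    """Weighted-sum score-level fusion of two matchers."""
    fused = (w_face * minmax(face_scores)
             + (1 - w_face) * minmax(finger_scores))
    return fused, fused >= threshold  # scores and accept/reject decisions

face   = [0.62, 0.15, 0.88, 0.40]  # hypothetical matcher outputs
finger = [55, 12, 80, 47]          # raw scores on a different scale
fused, accept = fuse(face, finger)
print(np.round(fused, 2), accept)
```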