29 research outputs found

    Going beyond persistent homology using persistent homology

    Full text link
    Representational limits of message-passing graph neural networks (MP-GNNs), e.g., in terms of the Weisfeiler-Leman (WL) test for isomorphism, are well understood. Augmenting these graph models with topological features via persistent homology (PH) has gained prominence, but identifying the class of attributed graphs that PH can recognize remains open. We introduce a novel concept of color-separating sets to provide a complete resolution to this important problem. Specifically, we establish the necessary and sufficient conditions for distinguishing graphs based on the persistence of their connected components, obtained from filter functions on vertex and edge colors. Our constructions expose the limits of vertex- and edge-level PH, proving that neither category subsumes the other. Leveraging these theoretical insights, we propose RePHINE for learning topological features on graphs. RePHINE efficiently combines vertex- and edge-level PH, achieving a scheme that is provably more powerful than both. Integrating RePHINE into MP-GNNs boosts their expressive power, resulting in gains over standard PH on several benchmarks for graph classification.Comment: Accepted to NeurIPS 202

    Provably expressive temporal graph networks

    Full text link
    Temporal graph networks (TGNs) have gained prominence as models for embedding dynamic interactions, but little is known about their theoretical underpinnings. We establish fundamental results about the representational power and limits of the two main categories of TGNs: those that aggregate temporal walks (WA-TGNs), and those that augment local message passing with recurrent memory modules (MP-TGNs). Specifically, novel constructions reveal the inadequacy of MP-TGNs and WA-TGNs, proving that neither category subsumes the other. We extend the 1-WL (Weisfeiler-Leman) test to temporal graphs, and show that the most powerful MP-TGNs should use injective updates, as in this case they become as expressive as the temporal WL. Also, we show that sufficiently deep MP-TGNs cannot benefit from memory, and MP/WA-TGNs fail to compute graph properties such as girth. These theoretical insights lead us to PINT -- a novel architecture that leverages injective temporal message passing and relative positional features. Importantly, PINT is provably more expressive than both MP-TGNs and WA-TGNs. PINT significantly outperforms existing TGNs on several real-world benchmarks.Comment: Accepted to NeurIPS 202

    Minimal Learning Machine: Theoretical Results and Clustering-Based Reference Point Selection

    Full text link
    The Minimal Learning Machine (MLM) is a nonlinear supervised approach based on learning a linear mapping between distance matrices computed in the input and output data spaces, where distances are calculated using a subset of points called reference points. Its simple formulation has attracted several recent works on extensions and applications. In this paper, we aim to address some open questions related to the MLM. First, we detail theoretical aspects that assure the interpolation and universal approximation capabilities of the MLM, which were previously only empirically verified. Second, we identify the task of selecting reference points as having major importance for the MLM's generalization capability. Several clustering-based methods for reference point selection in regression scenarios are then proposed and analyzed. Based on an extensive empirical evaluation, we conclude that the evaluated methods are both scalable and useful. Specifically, for a small number of reference points, the clustering-based methods outperformed the standard random selection of the original MLM formulation.Comment: 29 pages, Accepted to JML

    Sudden cardiac death multiparametric classification system for Chagas heart disease's patients based on clinical data and 24-hours ECG monitoring

    Get PDF
    About 6.5 million people are infected with Chagas disease (CD) globally, and WHO estimates that $ > million people worldwide suffer from ChHD. Sudden cardiac death (SCD) represents one of the leading causes of death worldwide and affects approximately 65% of ChHD patients at a rate of 24 per 1000 patient-years, much greater than the SCD rate in the general population. Its occurrence in the specific context of ChHD needs to be better exploited. This paper provides the first evidence supporting the use of machine learning (ML) methods within non-invasive tests: patients' clinical data and cardiac restitution metrics (CRM) features extracted from ECG-Holter recordings as an adjunct in the SCD risk assessment in ChHD. The feature selection (FS) flows evaluated 5 different groups of attributes formed from patients' clinical and physiological data to identify relevant attributes among 57 features reported by 315 patients at HUCFF-UFRJ. The FS flow with FS techniques (variance, ANOVA, and recursive feature elimination) and Naive Bayes (NB) model achieved the best classification performance with 90.63% recall (sensitivity) and 80.55% AUC. The initial feature set is reduced to a subset of 13 features (4 Classification; 1 Treatment; 1 CRM; and 7 Heart Tests). The proposed method represents an intelligent diagnostic support system that predicts the high risk of SCD in ChHD patients and highlights the clinical and CRM data that most strongly impact the final outcome
    corecore