12 research outputs found

    Lifelong learning in evolving graphs with limited labeled data and unseen class detection

    Large-scale graph data in the real world are often dynamic rather than static. The data change over time with new nodes, edges, and even classes appearing, such as in citation networks and research-and-development collaboration networks. Graph neural networks (GNNs) have emerged as the standard method for numerous tasks on graph-structured data. In this work, we employ a two-step procedure to explore how GNNs can be incrementally adapted to new unseen graph data. First, we analyze the boundary between transductive and inductive learning on standard benchmark datasets. After inductive pretraining, we add unlabeled data to the graph and show that the models are stable. Then, we explore the case of continually adding more and more labeled data, while considering cases where not all past instances are annotated with class labels. Furthermore, we introduce new classes while the graph evolves and explore methods that automatically detect instances from previously unseen classes. In order to deal with evolving graphs in a principled way, we propose a lifelong learning framework for graph data along with an evaluation protocol. In this framework, we evaluate representative GNN architectures. We observe that implicit knowledge within model parameters becomes more important when explicit knowledge, i.e., data from past tasks, is limited. We find that in open-world node classification, the data from surprisingly few past tasks are sufficient to match the performance obtained by remembering data from all past tasks. In the challenging task of unseen class detection, we find that using a weighted cross-entropy loss is important for stability.
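The weighted cross-entropy loss mentioned in the abstract can be sketched as follows. This is a minimal illustration of the general idea, not the authors' exact setup: the inverse-frequency weighting scheme and all numbers below are assumptions for the example.

```python
import math

def weighted_cross_entropy(probs, label, class_weights):
    """Cross-entropy for one example, scaled by the label's class weight.

    Up-weighting rare classes (e.g. newly emerging ones) makes errors
    on them contribute more to the loss, counteracting class imbalance.
    """
    return -class_weights[label] * math.log(probs[label])

def inverse_frequency_weights(labels, num_classes):
    """One common weighting scheme: weight proportional to 1 / class frequency."""
    counts = [max(labels.count(c), 1) for c in range(num_classes)]
    total = len(labels)
    return [total / (num_classes * n) for n in counts]

# Toy example: class 1 is rare, so its weight exceeds 1.0
labels = [0, 0, 0, 0, 0, 0, 1, 1]
w = inverse_frequency_weights(labels, num_classes=2)

# Identical predicted confidence (0.8), but the rare-class error costs more
loss_common = weighted_cross_entropy([0.8, 0.2], 0, w)
loss_rare = weighted_cross_entropy([0.2, 0.8], 1, w)
```

With the toy labels above, the rare class gets weight 2.0 versus about 0.67 for the common class, so a mistake of equal confidence on the rare class is penalized roughly three times as heavily.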

    Representation Learning for Texts and Graphs: A Unified Perspective on Efficiency, Multimodality, and Adaptability

    [...] This thesis is situated between natural language processing and graph representation learning and investigates selected connections. First, we introduce matrix embeddings as an efficient text representation sensitive to word order. [...] Experiments with ten linguistic probing tasks, eleven supervised, and five unsupervised downstream tasks reveal that vector and matrix embeddings have complementary strengths and that a jointly trained hybrid model outperforms both. Second, a popular pretrained language model, BERT, is distilled into matrix embeddings. [...] The results on the GLUE benchmark show that these models are competitive with other recent contextualized language models while being more efficient in time and space. Third, we compare three model types for text classification: bag-of-words, sequence-, and graph-based models. Experiments on five datasets show that, surprisingly, a wide multilayer perceptron on top of a bag-of-words representation is competitive with recent graph-based approaches, questioning the necessity of graphs synthesized from the text. [...] Fourth, we investigate the connection between text and graph data in document-based recommender systems for citations and subject labels. Experiments on six datasets show that the title as side information improves the performance of autoencoder models. [...] We find that the meaning of item co-occurrence is crucial for the choice of input modalities and an appropriate model. Fifth, we introduce a generic framework for lifelong learning on evolving graphs in which new nodes, edges, and classes appear over time. [...] The results show that by reusing previous parameters in incremental training, it is possible to employ smaller history sizes with only a slight decrease in accuracy compared to training with complete history. Moreover, weighting the binary cross-entropy loss function is crucial to mitigate the problem of class imbalance when detecting newly emerging classes. [...]
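The bag-of-words representation that the thesis finds surprisingly competitive can be sketched in a few lines. The vocabulary, the whitespace tokenization, and the raw-count weighting here are simplifying assumptions for illustration; the key property is that word order is discarded entirely.

```python
from collections import Counter

def bag_of_words(doc, vocab):
    """Map a document to a fixed-length count vector over a vocabulary.

    All positional information is lost: only how often each vocabulary
    word occurs survives. A wide MLP on top of such vectors is the
    baseline the abstract describes.
    """
    counts = Counter(doc.lower().split())
    return [counts[word] for word in vocab]

vocab = ["graph", "neural", "network", "text"]
vec = bag_of_words("Graph neural networks and text on graph data", vocab)
# "graph" occurs twice, "network" (exact form) not at all
```

Note that "networks" does not match the vocabulary entry "network" under this naive exact-match tokenization; real pipelines would typically normalize or stem tokens first.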

    Unsupervised Real-Time Network Intrusion and Anomaly Detection by Memristor Based Autoencoder

    Custom low-power hardware systems for real-time network security and anomaly detection are in high demand, as these would allow for adequate protection in battery-powered network devices, such as edge devices and the Internet of Things. This paper presents a memristor-based system for real-time intrusion detection, as well as anomaly detection based on autoencoders. Intrusion detection is performed by training only a single autoencoder, and the overall detection accuracy of this system is 92.91%, with a malicious-packet detection accuracy of 98.89%. The system described in this paper is also capable of using two autoencoders to perform anomaly detection with real-time online learning. We show that the system flags anomalous data but, over time, stops flagging a particular datatype if its presence becomes abundant. Utilizing memristors in these designs allows us to present extremely low-power systems for intrusion and anomaly detection while sacrificing little accuracy.
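The core decision rule in autoencoder-based anomaly detection is a threshold on reconstruction error: a model trained only on normal traffic reconstructs normal packets well and anomalous ones poorly. The sketch below illustrates that logic only; the toy reconstruction function, feature values, and threshold are assumptions, not the paper's memristor implementation.

```python
def is_anomalous(packet, reconstruct, threshold):
    """Flag a packet if its mean squared reconstruction error is too high."""
    recon = reconstruct(packet)
    error = sum((a - b) ** 2 for a, b in zip(packet, recon)) / len(packet)
    return error > threshold

# Toy stand-in for a trained autoencoder: it pulls inputs toward the
# "normal traffic" profile, so normal packets reconstruct almost exactly.
def toy_reconstruct(x):
    normal_profile = [0.5, 0.5, 0.5]
    return [(a + b) / 2 for a, b in zip(x, normal_profile)]

normal_packet = [0.5, 0.5, 0.5]   # matches the profile: near-zero error
attack_packet = [5.0, 0.0, 9.0]   # far from the profile: large error

flag_normal = is_anomalous(normal_packet, toy_reconstruct, threshold=0.1)
flag_attack = is_anomalous(attack_packet, toy_reconstruct, threshold=0.1)
```

The paper's online-learning behavior, where an abundant datatype eventually stops being flagged, corresponds to the second autoencoder continuing to train on the stream so that frequent patterns become reconstructible.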

    Learning Representations Toward the Understanding of Out-of-Distribution for Neural Networks

    Data-driven representations achieve powerful generalization performance in diverse information processing tasks. However, the generalization is often limited to test data from the same distribution as the training data (in-distribution (ID)). In addition, neural networks often make overconfident and incorrect predictions for data outside the training distribution, called out-of-distribution (OOD). In this dissertation, we develop representations that can characterize OOD for neural networks and utilize the characterization to efficiently generalize to OOD. We categorize data-driven representations based on information flow in neural networks and develop novel gradient-based representations. In particular, we utilize the backpropagated gradients to represent what the neural networks have not learned in the data. The capability of gradient-based representations for OOD characterization is comprehensively analyzed in comparison with standard activation-based representations. We also utilize a regularization technique for the gradient-based representations to better characterize OOD. Finally, we develop activation-based representations learned with auxiliary information to efficiently generalize to data from OOD. We use an unsupervised learning framework to learn aligned representations of visual and attribute data. These aligned representations are utilized to calibrate the overconfident predictions toward ID classes, and the generalization performance is validated in the application of generalized zero-shot learning (GZSL). The developed GZSL method, GatingAE, achieves state-of-the-art performance in generalizing to OOD with significantly fewer model parameters than other state-of-the-art methods.
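The intuition behind gradient-based OOD scores can be sketched with a simplified stand-in: the gradient of the cross-entropy loss with respect to the logits (against the predicted class) is small when the network is confident, as on ID data, and larger when the output is flat, as often happens on OOD data. This is only an illustration of the gradient-magnitude idea; the dissertation's representations use full backpropagated parameter gradients, and the logits below are made up.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def gradient_ood_score(logits):
    """Norm of dL/dlogits for cross-entropy against the predicted class.

    For softmax + cross-entropy this gradient is (p - y), so a confident,
    peaked prediction yields a near-zero norm while a flat, uncertain
    output yields a large one.
    """
    p = softmax(logits)
    y = [0.0] * len(p)
    y[p.index(max(p))] = 1.0
    grad = [pi - yi for pi, yi in zip(p, y)]
    return math.sqrt(sum(g * g for g in grad))

id_score = gradient_ood_score([9.0, 0.0, 0.0])    # confident, ID-like
ood_score = gradient_ood_score([0.1, 0.0, 0.05])  # near-uniform, OOD-like
```

A threshold on such a score then separates inputs the network has effectively "learned" from those it has not, which is the framing the dissertation develops much more thoroughly.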

    FLUID: A Unified Evaluation Framework for Flexible Sequential Data

    Modern ML methods excel when training data is IID, large-scale, and well labeled. Learning in less ideal conditions remains an open challenge. The sub-fields of few-shot, continual, transfer, and representation learning have made substantial strides in learning under adverse conditions, each affording distinct advantages through its methods and insights. These methods address different challenges, such as data arriving sequentially or scarce training examples; however, the difficult conditions an ML system will face over its lifetime often cannot be anticipated prior to deployment. Therefore, general ML systems which can handle the many challenges of learning in practical settings are needed. To foster research towards the goal of general ML methods, we introduce a new unified evaluation framework - FLUID (Flexible Sequential Data). FLUID integrates the objectives of few-shot, continual, transfer, and representation learning while enabling comparison and integration of techniques across these subfields. In FLUID, a learner faces a stream of data and must make sequential predictions while choosing how to update itself, adapt quickly to novel classes, and deal with changing data distributions, all while accounting for the total amount of compute. We conduct experiments on a broad set of methods which shed new insight on the advantages and limitations of current solutions and indicate new research problems to solve. As a starting point towards more general methods, we present two new baselines which outperform other evaluated methods on FLUID. Project page: https://raivn.cs.washington.edu/projects/FLUID/.
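The predict-then-update loop at the heart of such sequential evaluation can be sketched generically. This is an assumed minimal protocol, not FLUID's actual API: the learner is scored on each example before seeing its label, then decides how to update itself.

```python
from collections import Counter

def stream_eval(predict, update, stream):
    """Sequential evaluation: score the prediction first, then reveal the
    label and let the learner update. The ordering matters -- accuracy is
    measured before the learner has seen each label."""
    correct = 0
    for x, y in stream:
        if predict(x) == y:
            correct += 1
        update(x, y)  # the learner may adapt, add new classes, etc.
    return correct / len(stream)

# Trivial learner for illustration: always predicts the majority label seen so far
class MajorityLearner:
    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def update(self, x, y):
        self.counts[y] += 1

stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
learner = MajorityLearner()
acc = stream_eval(learner.predict, learner.update, stream)
```

A real FLUID-style evaluation would additionally track compute spent on updates and introduce novel classes mid-stream; the loop above only captures the sequential prediction skeleton.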