
    Modular lifelong machine learning

    Deep learning has drastically improved the state-of-the-art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem. The overall training cost further increases when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems, which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021). As a result, they neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge. Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules which are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning, and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired from an LML algorithm. Notably, it can perform forward transfer, avoid negative transfer and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures. Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of a neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations. Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improvement in anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods. Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.

    An innovative EEG-based emotion recognition using a single channel-specific feature from the brain rhythm code method

    Introduction: Efficiently recognizing emotions is a critical pursuit in brain–computer interface (BCI), as it has many applications for intelligent healthcare services. In this work, an innovative approach inspired by the genetic code in bioinformatics, which utilizes brain rhythm code features consisting of δ, θ, α, β, or γ, is proposed for electroencephalography (EEG)-based emotion recognition. Methods: These features are first extracted using the sequencing technique. After evaluating them using four conventional machine learning classifiers, an optimal channel-specific feature that produces the highest accuracy in each emotional case is identified, so emotion recognition through minimal data is realized. By doing so, the complexity of emotion recognition can be significantly reduced, making it more achievable for practical hardware setups. Results: The best classification accuracies achieved for the DEAP and MAHNOB datasets range from 83% to 92%, and for the SEED dataset the best accuracy is 78%. The experimental results are impressive, considering the minimal data employed. Further investigation of the optimal features shows that their representative channels are primarily in the frontal region, and the associated rhythmic characteristics are of multiple kinds. Additionally, individual differences are found, as the optimal feature varies with subjects. Discussion: Compared to previous studies, this work provides insights into designing portable devices, as a single electrode is sufficient to achieve satisfactory performance. Consequently, it would advance the understanding of brain rhythms, which offers an innovative solution for classifying EEG signals in diverse BCI applications, including emotion recognition.

    Sputter deposition on composites : interplay between film and substrate properties


    GripNet: Graph information propagation on supergraph for heterogeneous graphs

    Heterogeneous graph representation learning aims to learn low-dimensional vector representations of different types of entities and relations to empower downstream tasks. Existing popular methods either capture semantic relationships but indirectly leverage node/edge attributes in a complex way, or leverage node/edge attributes directly without taking semantic relationships into account. They also scale poorly when multiple convolution operations are involved. To overcome these limitations, this paper proposes a flexible and efficient Graph information propagation Network (GripNet) framework. Specifically, we introduce a new supergraph data structure consisting of supervertices and superedges. A supervertex is a semantically-coherent subgraph. A superedge defines an information propagation path between two supervertices. GripNet learns new representations for the supervertex of interest by propagating information along the defined path using multiple layers. We construct multiple large-scale graphs and evaluate GripNet against competing methods to show its superiority in link prediction, node classification, and data integration.
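
    The supergraph described in this abstract can be pictured with a small data-structure sketch. This is not the GripNet implementation: the class names `Supervertex` and `Superedge`, the helper `propagation_order`, and the toy gene/drug example are all hypothetical, and the sketch assumes the supergraph is acyclic so that a propagation order exists.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Supervertex:
    """A semantically coherent subgraph: one node type plus its internal edges."""
    name: str
    nodes: list
    edges: list = field(default_factory=list)  # (u, v) pairs inside this subgraph


@dataclass
class Superedge:
    """A directed propagation path between two supervertices."""
    source: str
    target: str
    links: list  # (u, v) pairs with u in the source and v in the target


def propagation_order(supervertices, superedges):
    """Return supervertex names in an order where every source precedes its targets.

    Assumption (not stated in the abstract): the supergraph is a DAG.
    Uses Kahn's topological sort and raises ValueError on a cycle.
    """
    indegree = {sv.name: 0 for sv in supervertices}
    children = {sv.name: [] for sv in supervertices}
    for se in superedges:
        children[se.source].append(se.target)
        indegree[se.target] += 1
    queue = deque(name for name, deg in indegree.items() if deg == 0)
    order = []
    while queue:
        name = queue.popleft()
        order.append(name)
        for child in children[name]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(indegree):
        raise ValueError("supergraph contains a cycle")
    return order


# Hypothetical toy supergraph: information flows from genes to drugs.
genes = Supervertex("gene", nodes=["g1", "g2"], edges=[("g1", "g2")])
drugs = Supervertex("drug", nodes=["d1"])
gene_to_drug = Superedge("gene", "drug", links=[("g1", "d1")])
order = propagation_order([drugs, genes], [gene_to_drug])
```

    In this toy case the supervertex of interest is `drug`, so representations would be computed for `gene` first and then passed along the superedge.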

    GlobalMind: Global Multi-head Interactive Self-attention Network for Hyperspectral Change Detection

    High spectral resolution imagery of the Earth's surface enables users to monitor changes over time at a fine-grained scale, playing an increasingly important role in agriculture, defense, and emergency response. However, most current algorithms are still confined to describing local features and fail to incorporate a global perspective, which limits their ability to capture interactions between global features and usually results in incomplete change regions. In this paper, we propose a Global Multi-head INteractive self-attention change Detection network (GlobalMind) to explore the implicit correlation between different surface objects and variant land cover transformations, acquiring a comprehensive understanding of the data and accurate change detection results. Firstly, a simple but effective Global Axial Segmentation (GAS) strategy is designed to expand the self-attention computation along the row space or column space of hyperspectral images, allowing global connections with high efficiency. Secondly, with GAS, the global spatial multi-head interactive self-attention (Global-M) module is crafted to mine the abundant spatial-spectral features involving potential correlations between the ground objects from the entire rich and complex hyperspectral space. Moreover, to acquire accurate and complete cross-temporal changes, we devise a global temporal interactive multi-head self-attention (GlobalD) module which incorporates the relevance and variation of bi-temporal spatial-spectral features, and which, combined with GAS, integrates potential changes of the same kind across the local and global range. We perform extensive experiments on five commonly used hyperspectral datasets, and our method outperforms the state-of-the-art algorithms with high accuracy and efficiency. Comment: 14 page, 18 figure
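
    The idea of computing self-attention along rows or columns, rather than over all pixels at once, can be illustrated with a generic axial-attention sketch. This is not the authors' GAS strategy or their Global-M or GlobalD modules: it uses plain dot-product attention with no learned projections and no multiple heads, and the function names `attend` and `axial_attention` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(seq):
    """Dot-product self-attention over one sequence of feature vectors.

    Queries, keys and values are all the raw vectors (no learned weights).
    """
    dim = len(seq[0])
    out = []
    for query in seq:
        scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
                  for key in seq]
        weights = softmax(scores)
        out.append([sum(w * value[i] for w, value in zip(weights, seq))
                    for i in range(dim)])
    return out


def axial_attention(image, axis):
    """Apply attention within each row ('row') or each column ('col').

    image is an H x W x C nested list. Each pixel only attends to pixels on
    the same row or column, so the number of score computations is
    H*W*W or W*H*H instead of (H*W)**2 for full attention.
    """
    height, width = len(image), len(image[0])
    if axis == "row":
        return [attend(row) for row in image]
    if axis == "col":
        cols = [attend([image[r][c] for r in range(height)])
                for c in range(width)]
        return [[cols[c][r] for c in range(width)] for r in range(height)]
    raise ValueError("axis must be 'row' or 'col'")


# Toy 2 x 3 image with 2 channels: each row is constant but rows differ.
toy = [[[1.0, 0.0] for _ in range(3)],
       [[0.0, 1.0] for _ in range(3)]]
row_out = axial_attention(toy, "row")
col_out = axial_attention(toy, "col")
```

    In the toy image, row attention leaves every pixel unchanged because each row is constant, while column attention mixes the two rows. Applying both passes in sequence lets information reach every pixel from every other pixel.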

    FairGen: Towards Fair Graph Generation

    There have been tremendous efforts over the past decades dedicated to the generation of realistic graphs in a variety of domains, ranging from social networks to computer networks, from gene regulatory networks to online transaction networks. Despite the remarkable success, the vast majority of these works are unsupervised in nature and are typically trained to minimize the expected graph reconstruction loss, which would result in the representation disparity issue in the generated graphs, i.e., the protected groups (often minorities) contribute less to the objective and thus suffer from systematically higher errors. In this paper, we aim to tailor graph generation to downstream mining tasks by leveraging label information and user-preferred parity constraint. In particular, we start from the investigation of representation disparity in the context of graph generative models. To mitigate the disparity, we propose a fairness-aware graph generative model named FairGen. Our model jointly trains a label-informed graph generation module and a fair representation learning module by progressively learning the behaviors of the protected and unprotected groups, from the 'easy' concepts to the 'hard' ones. In addition, we propose a generic context sampling strategy for graph generative models, which is proven to be capable of fairly capturing the contextual information of each group with a high probability. Experimental results on seven real-world data sets, including web-based graphs, demonstrate that FairGen (1) obtains performance on par with state-of-the-art graph generative models across six network properties, (2) mitigates the representation disparity issues in the generated graphs, and (3) substantially boosts the model performance by up to 17% in downstream tasks via data augmentation.

    An investigation of speaker independent phrase break models in End-to-End TTS systems

    This paper presents our work on phrase break prediction in the context of end-to-end TTS systems, motivated by the following questions: (i) Is there any utility in incorporating an explicit phrasing model in an end-to-end TTS system?, and (ii) How do you evaluate the effectiveness of a phrasing model in an end-to-end TTS system? In particular, the utility and effectiveness of phrase break prediction models are evaluated in the context of children's story synthesis, using listener comprehension. We show by means of perceptual listening evaluations that there is a clear preference for stories synthesized after predicting the location of phrase breaks using a trained phrasing model, over stories directly synthesized without predicting the location of phrase breaks. Comment: Submitted for review to IEEE Acces

    Disentanglement of Latent Representations via Sparse Causal Interventions

    The process of generating data such as images is controlled by independent and unknown factors of variation. The retrieval of these variables has been studied extensively in the disentanglement, causal representation learning, and independent component analysis fields. Recently, approaches merging these domains together have shown great success. Instead of directly representing the factors of variation, the problem of disentanglement can be seen as finding the interventions on one image that yield a change to a single factor. Following this assumption, we introduce a new method for disentanglement inspired by causal dynamics that combines causality theory with vector-quantized variational autoencoders. Our model considers the quantized vectors as causal variables and links them in a causal graph. It performs causal interventions on the graph and generates atomic transitions affecting a unique factor of variation in the image. We also introduce a new task of action retrieval that consists of finding the action responsible for the transition between two images. We test our method on standard synthetic and real-world disentanglement datasets. We show that it can effectively disentangle the factors of variation and perform precise interventions on high-level semantic attributes of an image without affecting its quality, even with imbalanced data distributions. Comment: 16 pages, 10 pages for the main paper and 6 pages for the supplement, 14 figures, submitted to IJCAI 2023. V2: added link to repositor

    Boosting Few-shot Action Recognition with Graph-guided Hybrid Matching

    Class prototype construction and matching are core aspects of few-shot action recognition. Previous methods mainly focus on designing spatiotemporal relation modeling modules or complex temporal alignment algorithms. Despite the promising results, they ignored the value of class prototype construction and matching, leading to unsatisfactory performance in recognizing similar categories in every task. In this paper, we propose GgHM, a new framework with Graph-guided Hybrid Matching. Concretely, we learn task-oriented features under the guidance of a graph neural network during class prototype construction, optimizing the intra- and inter-class feature correlation explicitly. Next, we design a hybrid matching strategy, combining frame-level and tuple-level matching to classify videos with multivariate styles. We additionally propose a learnable dense temporal modeling module to enhance the temporal representation of video features and build a more solid foundation for the matching process. GgHM shows consistent improvements over other challenging baselines on several few-shot datasets, demonstrating the effectiveness of our method. The code will be publicly available at https://github.com/jiazheng-xing/GgHM. Comment: Accepted by ICCV202

    Digital Twin-Oriented Complex Networked Systems based on Heterogeneous node features and interaction rules

    This study proposes an extendable modelling framework for Digital Twin-Oriented Complex Networked Systems (DT-CNSs) with the goal of generating networks that faithfully represent real systems. The modelling process focuses on (i) features of nodes and (ii) interaction rules for creating connections that are built based on individual nodes' preferences. We conduct experiments on simulation-based DT-CNSs that incorporate various features and rules about network growth and different transmissibilities related to an epidemic spread on these networks. We present a case study on disaster resilience of social networks given an epidemic outbreak by investigating the infection occurrence within a specific time and social distance. The experimental results show how different levels of structural and dynamic complexity, concerned with feature diversity and flexibility of interaction rules respectively, influence network growth and epidemic spread. The analysis revealed that, to achieve maximum disaster resilience, mitigation policies should be targeted at nodes with preferred features, as they have higher infection risks and should be the focus of the epidemic control.