1,454 research outputs found

    Exploring the Efficacy of Transfer Learning in Mining Image-Based Software Artifacts

    Transfer learning allows us to train deep architectures requiring a large number of learned parameters, even if the amount of available data is limited, by leveraging existing models previously trained for another task. Here we explore the applicability of transfer learning, using models pre-trained on non-software-engineering data, to the problem of classifying software UML diagrams. Our experimental results show that training responds positively to transfer learning as sample size varies, even though the pre-trained model was not exposed to training instances from the software domain. We contrast the transferred network with other networks to show its advantage on training sets of different sizes, which indicates that transfer learning is as effective as custom deep architectures when large amounts of training data are not available.

    Exploring the Applicability of Low‑Shot Learning in Mining Software Repositories

    Background: Despite the numerous and well-documented recent successes of deep learning, the application of standard deep architectures to many classification problems within empirical software engineering remains problematic due to the large volumes of labeled data required for training. Here we argue that, for some problems, this hurdle can be overcome by combining low-shot learning with simpler deep architectures that reduce the total number of parameters to be learned. Findings: We apply low-shot learning to the task of classifying UML class and sequence diagrams from GitHub, and demonstrate that surprisingly good performance can be achieved using only tens or hundreds of examples per category when paired with an appropriate architecture. A large, off-the-shelf architecture, on the other hand, does not perform better than random guessing even when trained on thousands of samples. Conclusion: Our findings suggest that identifying problems within empirical software engineering that lend themselves to low-shot learning could accelerate the adoption of deep learning algorithms within the empirical software engineering community.

    Learning in the Machine: To Share or Not to Share?

    Weight-sharing is one of the pillars behind Convolutional Neural Networks and their successes. However, in physical neural systems such as the brain, weight-sharing is implausible. This discrepancy raises the fundamental question of whether weight-sharing is necessary. If so, to what degree of precision? If not, what are the alternatives? The goal of this study is to investigate these questions, primarily through simulations in which the weight-sharing assumption is relaxed. Taking inspiration from neural circuitry, we explore the use of Free Convolutional Networks and neurons with variable connection patterns. Using Free Convolutional Networks, we show that while weight-sharing is a pragmatic optimization approach, it is not a necessity in computer vision applications. Furthermore, Free Convolutional Networks match the performance observed in standard architectures when trained using properly translated data (akin to video). Under the assumption of translationally augmented data, Free Convolutional Networks learn translationally invariant representations that yield an approximate form of weight-sharing.
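
The distinction between tied and untied weights can be illustrated with a minimal one-dimensional sketch. It assumes that a "free" convolution means a convolution-like layer whose kernel is stored separately for each output position. That reading, and the function names `shared_conv1d` and `free_conv1d`, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def shared_conv1d(x, kernel):
    """Standard (valid) cross-correlation: one kernel reused at every position."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

def free_conv1d(x, kernels):
    """Untied variant (assumed reading of a 'free' convolution):
    row i of `kernels` is a separate kernel used only at output position i."""
    n_pos, k = kernels.shape
    return np.array([x[i:i + k] @ kernels[i] for i in range(n_pos)])

x = np.arange(6, dtype=float)
kernel = np.array([1.0, 0.0, -1.0])

# Copying the same kernel to every position recovers exact weight sharing.
tied = np.tile(kernel, (len(x) - len(kernel) + 1, 1))
shared_out = shared_conv1d(x, kernel)
free_out = free_conv1d(x, tied)

# Changing the kernel at a single position breaks the sharing constraint.
untied = tied.copy()
untied[0] = [0.0, 1.0, 0.0]
untied_out = free_conv1d(x, untied)

print(kernel.size, tied.size)  # parameters: shared layer vs. untied layer
```

When every row of the untied kernel array is identical, the two layers compute the same output. The untied layer simply carries one kernel per position and so has more parameters.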

    Level velocity statistics of hyperbolic chaos

    A generalized version of the standard map is quantized as a model of quantum chaos. It is shown that, in the hyperbolic chaotic regime, the second moment of the quantum level velocity is $\sim 1/\hbar$, as predicted by random matrix theory. Comment: 11 pages, 4 figures

    An End-to-End CNN with Attentional Mechanism Applied to Raw EEG in a BCI Classification Task

    Objective. Motor-imagery (MI) classification based on electroencephalography (EEG) has long been studied in neuroscience and has more recently been widely used in healthcare applications such as mobile assistive robots and neurorehabilitation. In particular, EEG-based motor-imagery classification methods that rely on convolutional neural networks (CNNs) have achieved relatively high classification accuracy. However, naively training CNNs to classify raw EEG data from all channels, especially for high-density EEG, is computationally demanding and requires huge training sets. It often also introduces many irrelevant input features, making it difficult for the CNN to extract the informative ones. This problem is compounded by a dearth of training data, which is particularly acute for MI tasks because they are cognitively demanding and thus fatigue-inducing. Approach. To address these issues, we proposed an end-to-end CNN-based neural network with an attentional mechanism, together with different data augmentation (DA) techniques. We tested it on two benchmark MI datasets, Brain-Computer Interface (BCI) Competition IV 2a and 2b. Main results. Our proposed neural-network architecture outperformed all state-of-the-art methods that we found in the literature, with and without DA, reaching an average classification accuracy of 93.6% and 87.83% on BCI 2a and 2b, respectively. We also directly compare decoding of MI and motor-execution (ME) tasks. Focusing on MI classification, we find optimal channel configurations and the best DA techniques, and we investigate combining data across participants and the role of transfer learning. Significance. Our proposed approach improves the classification accuracy for MI in the benchmark datasets. In addition, collecting our own dataset enables us to compare MI and ME and to investigate various aspects of EEG decoding critical for neuroscience and BCI.