1,565 research outputs found

    Exploring the Efficacy of Transfer Learning in Mining Image-Based Software Artifacts

    Transfer learning allows us to train deep architectures that require a large number of learned parameters, even when the amount of available data is limited, by leveraging existing models previously trained for another task. Here we explore the applicability of transfer learning, using models pre-trained on non-software-engineering data, to the problem of classifying software UML diagrams. Our experimental results show that training benefits from transfer learning across a range of sample sizes, even though the pre-trained model was never exposed to training instances from the software domain. We contrast the transferred network with other networks to show its advantage on training sets of different sizes, indicating that transfer learning is as effective as custom deep architectures when large amounts of training data are not available.
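    As a concrete illustration of the general recipe described above, a minimal sketch in Keras is given below, assuming an ImageNet-pretrained VGG16 backbone; the backbone choice, input size, and two-diagram-class head are illustrative assumptions, not the authors' reported configuration.

        # Hypothetical sketch: reuse an ImageNet-pretrained backbone for
        # UML-diagram classification (illustrative; not the paper's exact setup).
        import tensorflow as tf
        from tensorflow.keras import layers, models

        base = tf.keras.applications.VGG16(
            weights="imagenet", include_top=False, input_shape=(224, 224, 3))
        base.trainable = False  # freeze transferred weights; train only the head

        model = models.Sequential([
            base,
            layers.GlobalAveragePooling2D(),
            layers.Dense(128, activation="relu"),
            layers.Dense(2, activation="softmax"),  # assumed: two diagram classes
        ])
        model.compile(optimizer="adam",
                      loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        # model.fit(train_images, train_labels, epochs=10)  # small labeled set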

    Exploring the Applicability of Low‑Shot Learning in Mining Software Repositories

    Background: Despite the well-documented and numerous recent successes of deep learning, the application of standard deep architectures to many classification problems within empirical software engineering remains problematic due to the large volumes of labeled data required for training. Here we make the argument that, for some problems, this hurdle can be overcome by taking advantage of low-shot learning in combination with simpler deep architectures that reduce the total number of parameters that need to be learned. Findings: We apply low-shot learning to the task of classifying UML class and sequence diagrams from GitHub, and demonstrate that surprisingly good performance can be achieved by using only tens or hundreds of examples for each category when paired with an appropriate architecture. A large, off-the-shelf architecture, on the other hand, does not perform above random guessing even when trained on thousands of samples. Conclusion: Our findings suggest that identifying problems within empirical software engineering that lend themselves to low-shot learning could accelerate the adoption of deep learning algorithms within the empirical software engineering community.
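    To make the parameter-reduction argument concrete, here is a sketch of the kind of compact architecture it points to, assuming Keras; the layer sizes and the small grayscale input are assumptions rather than the authors' reported design.

        # Hypothetical compact CNN: few enough parameters that tens or
        # hundreds of labeled examples per category can plausibly suffice.
        import tensorflow as tf
        from tensorflow.keras import layers, models

        model = models.Sequential([
            layers.Input(shape=(128, 128, 1)),      # assumed small grayscale input
            layers.Conv2D(8, 3, activation="relu"),
            layers.MaxPooling2D(),
            layers.Conv2D(16, 3, activation="relu"),
            layers.MaxPooling2D(),
            layers.GlobalAveragePooling2D(),        # avoids a large dense layer
            layers.Dense(2, activation="softmax"),  # class vs. sequence diagram
        ])
        model.summary()  # on the order of 10^3 trainable parameters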

    Level velocity statistics of hyperbolic chaos

    A generalized version of the standard map is quantized as a model of quantum chaos. It is shown that, in the hyperbolic chaotic regime, the second moment of the quantum level velocity scales as $\sim 1/\hbar$, as predicted by random matrix theory.
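    For readers outside the field, the quantity in question has a compact standard definition; in the LaTeX below, $E_n$ is the $n$-th eigenlevel and $\lambda$ the external parameter of the map (notation assumed, not taken from the paper):

        % level velocity of the n-th level, and the scaling of its second moment
        v_n = \frac{\partial E_n}{\partial \lambda},
        \qquad
        \left\langle v_n^2 \right\rangle \sim \frac{1}{\hbar}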

    Learning in the Machine: To Share or Not to Share?

    Weight-sharing is one of the pillars behind Convolutional Neural Networks and their successes. However, in physical neural systems such as the brain, weight-sharing is implausible. This discrepancy raises the fundamental question of whether weight-sharing is necessary. If so, to what degree of precision? If not, what are the alternatives? The goal of this study is to investigate these questions, primarily through simulations in which the weight-sharing assumption is relaxed. Taking inspiration from neural circuitry, we explore the use of Free Convolutional Networks and neurons with variable connection patterns. Using Free Convolutional Networks, we show that while weight-sharing is a pragmatic optimization approach, it is not a necessity in computer vision applications. Furthermore, Free Convolutional Networks match the performance observed in standard architectures when trained using properly translated data (akin to video). Under the assumption of translationally augmented data, Free Convolutional Networks learn translationally invariant representations that yield an approximate form of weight-sharing.
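    One concrete way to relax weight-sharing is a locally connected layer, in which each spatial position gets its own filter weights; the sketch below contrasts the two, assuming the LocallyConnected2D layer available in older TensorFlow/Keras releases. It illustrates the untied-weights idea in general, not the authors' specific Free Convolutional Network implementation.

        # Weight-shared convolution vs. its untied ("free") counterpart.
        import tensorflow as tf
        from tensorflow.keras import layers

        shared = layers.Conv2D(16, 3)            # one 3x3 filter bank reused everywhere
        free = layers.LocallyConnected2D(16, 3)  # separate 3x3 weights per location

        x = tf.random.normal((1, 32, 32, 1))
        print(shared(x).shape, free(x).shape)    # identical output shapes
        print(shared.count_params(), free.count_params())  # untied has ~900x more here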

    An End-to-End CNN with Attentional Mechanism Applied to Raw EEG in a BCI Classification Task

    Objective. Motor-imagery (MI) classification based on electroencephalography (EEG) has long been studied in neuroscience and has more recently been widely used in healthcare applications such as mobile assistive robots and neurorehabilitation. In particular, EEG-based motor-imagery classification methods that rely on convolutional neural networks (CNNs) have achieved relatively high classification accuracy. However, naively training CNNs to classify raw EEG data from all channels, especially for high-density EEG, is computationally demanding and requires huge training sets. It also often introduces many irrelevant input features, making it difficult for the CNN to extract the informative ones. This problem is compounded by a dearth of training data, which is particularly acute for MI tasks, because these are cognitively demanding and thus fatigue inducing. Approach. To address these issues, we propose an end-to-end CNN-based neural network with an attentional mechanism together with different data augmentation (DA) techniques. We tested it on two benchmark MI datasets, Brain-Computer Interface (BCI) Competition IV 2a and 2b. Main results. Our proposed neural-network architecture outperformed all state-of-the-art methods that we found in the literature, with and without DA, reaching an average classification accuracy of 93.6% and 87.83% on BCI 2a and 2b, respectively. We also directly compare decoding of MI and motor-execution (ME) tasks. Focusing on MI classification, we find optimal channel configurations and the best DA techniques, as well as investigate combining data across participants and the role of transfer learning. Significance. Our proposed approach improves the classification accuracy for MI on the benchmark datasets. In addition, collecting our own dataset enables us to compare MI and ME and to investigate various aspects of EEG decoding critical for neuroscience and BCI.
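    A minimal sketch of the general idea of pairing a CNN with a channel-attention (squeeze-and-excitation-style) block on raw EEG is shown below, assuming Keras; the filter counts, the 22-channel by 1000-sample input, and the four output classes are illustrative assumptions, not the paper's architecture.

        # Illustrative CNN + attention block for raw EEG (not the paper's model).
        import tensorflow as tf
        from tensorflow.keras import layers, models

        inputs = layers.Input(shape=(22, 1000, 1))  # channels x time (BCI IV 2a-like)
        x = layers.Conv2D(16, (1, 25), activation="elu")(inputs)  # temporal filters
        x = layers.Conv2D(16, (22, 1), activation="elu")(x)       # spatial filters

        # Squeeze-and-excitation-style attention: learn to reweight feature maps.
        s = layers.GlobalAveragePooling2D()(x)
        s = layers.Dense(8, activation="relu")(s)
        s = layers.Dense(16, activation="sigmoid")(s)
        x = layers.Multiply()([x, layers.Reshape((1, 1, 16))(s)])

        x = layers.AveragePooling2D((1, 75))(x)
        x = layers.Flatten()(x)
        outputs = layers.Dense(4, activation="softmax")(x)  # assumed: 4 MI classes
        model = models.Model(inputs, outputs)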

    A Fortran-Keras Deep Learning Bridge for Scientific Computing

    Implementing artificial neural networks is commonly achieved via high-level programming languages such as Python and easy-to-use deep learning libraries such as Keras. These software libraries come preloaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, a deep learning practitioner will favor training a neural network model in Python, where these tools are readily available. However, many large-scale scientific computation projects are written in Fortran, making it difficult to integrate them with modern deep learning methods. To alleviate this problem, we introduce a software library, the Fortran-Keras Bridge (FKB). This two-way bridge connects environments where deep learning resources are plentiful with those where they are scarce. The paper describes several unique features offered by FKB, such as customizable layers, loss functions, and network ensembles. The paper concludes with a case study that applies FKB to address open questions about the robustness of an experimental approach to global climate simulation, in which subgrid physics are outsourced to deep neural network emulators. In this context, FKB enables a hyperparameter search of more than one hundred candidate models of subgrid cloud and radiation physics, initially implemented in Keras, to be transferred and used in Fortran. This process allows the emulators' emergent behavior to be assessed, i.e., what happens when fit imperfections are coupled to explicit planetary-scale fluid dynamics. The results reveal a previously unrecognized strong relationship between offline validation error and online performance, in which the choice of the optimizer proves unexpectedly critical. This in turn reveals many new neural network architectures that produce considerable improvements in climate model stability, including some with reduced error, for an especially challenging training dataset.
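    On the Python side, the workflow described above begins with an ordinary Keras model saved to HDF5, which FKB's tooling then converts for use from Fortran; a sketch of that first step follows (the emulator's input/output sizes and layers are assumptions, and the Fortran-side loading is omitted here).

        # Sketch: build and save a Keras emulator as HDF5 so FKB's conversion
        # tooling can make it callable from Fortran (Fortran side omitted).
        import tensorflow as tf
        from tensorflow.keras import layers, models

        model = models.Sequential([
            layers.Input(shape=(64,)),            # assumed: subgrid input column
            layers.Dense(128, activation="relu"),
            layers.Dense(128, activation="relu"),
            layers.Dense(32),                     # assumed: subgrid tendencies out
        ])
        model.compile(optimizer="adam", loss="mse")
        # model.fit(X_train, y_train, epochs=20)  # hypothetical training data
        model.save("emulator.h5")                 # HDF5 handed to FKB's converter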