1,454 research outputs found
Exploring the Efficacy of Transfer Learning in Mining Image-Based Software Artifacts
Transfer learning allows us to train deep architectures with a large number of learned parameters, even when the amount of available data is limited, by leveraging existing models previously trained for another task. Here we explore the applicability of transfer learning, using models pre-trained on non-software-engineering data, to the problem of classifying software UML diagrams. Our experimental results show that training benefits from transfer learning across a range of sample sizes, even though the pre-trained model was never exposed to training instances from the software domain. We contrast the transferred network with other networks to show its advantage on different-sized training sets, which indicates that transfer learning is as effective as custom deep architectures when large amounts of training data are not available.
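The setup the abstract describes can be made concrete. Below is a minimal sketch in PyTorch/torchvision, assuming an ImageNet-pre-trained ResNet-18 as the "non-software-engineering" backbone; the two-class label set and the uml_data directory layout are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal transfer-learning sketch: freeze a backbone pre-trained on
# ImageNet (non-software data) and train only a new head on UML diagrams.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False  # keep the transferred features fixed

# Replace the classification head for the UML-diagram task
# (here: diagram vs. non-diagram; the real label set may differ).
model.fc = nn.Linear(model.fc.in_features, 2)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical directory layout: uml_data/train/<class_name>/*.png
train_set = datasets.ImageFolder("uml_data/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Because the backbone is frozen, only the new head's parameters are learned, which is what makes small software-domain training sets viable.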
Exploring the Applicability of Low-Shot Learning in Mining Software Repositories
Background: Despite the well-documented and numerous recent successes of deep learning, the application of standard deep architectures to many classification problems within empirical software engineering remains problematic due to the large volumes of labeled data required for training. Here we make the argument that, for some problems, this hurdle can be overcome by taking advantage of low-shot learning in combination with simpler deep architectures that reduce the total number of parameters that need to be learned.
Findings: We apply low-shot learning to the task of classifying UML class and sequence diagrams from GitHub, and demonstrate that surprisingly good performance can be achieved using only tens or hundreds of examples per category when paired with an appropriate architecture. A large, off-the-shelf architecture, on the other hand, performs no better than random guessing even when trained on thousands of samples.
Conclusion: Our findings suggest that identifying problems within empirical software engineering that lend themselves to low-shot learning could accelerate the adoption of deep learning algorithms within the empirical software engineering community.
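The "simpler deep architectures" argument can be illustrated directly. The following PyTorch sketch shows a deliberately small CNN of the kind the abstract advocates; the layer sizes, grayscale input, and two-class setup (class vs. sequence diagrams) are illustrative assumptions, not the authors' architecture.

```python
# A small CNN whose parameter budget is modest enough to be trained on
# tens or hundreds of labeled diagrams.
import torch
import torch.nn as nn

class SmallDiagramNet(nn.Module):
    def __init__(self, num_classes: int = 2):  # e.g., class vs. sequence
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=5, stride=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),  # collapse spatial detail aggressively
        )
        self.classifier = nn.Linear(16 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

# Roughly 2,000 learned parameters, versus 10^7+ in a large off-the-shelf
# architecture; this gap is what makes low-shot training feasible.
logits = SmallDiagramNet()(torch.randn(4, 1, 64, 64))  # -> (4, 2)
```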
Learning in the Machine: To Share or Not to Share?
Weight-sharing is one of the pillars behind Convolutional Neural Networks and their successes. However, in physical neural systems such as the brain, weight-sharing is implausible. This discrepancy raises the fundamental question of whether weight-sharing is necessary. If so, to what degree of precision? If not, what are the alternatives? The goal of this study is to investigate these questions, primarily through simulations where the weight-sharing assumption is relaxed. Taking inspiration from neural circuitry, we explore the use of Free Convolutional Networks and neurons with variable connection patterns. Using Free Convolutional Networks, we show that while weight-sharing is a pragmatic optimization approach, it is not a necessity in computer vision applications. Furthermore, Free Convolutional Networks match the performance observed in standard architectures when trained using properly translated data (akin to video). Under the assumption of translationally augmented data, Free Convolutional Networks learn translationally invariant representations that yield an approximate form of weight-sharing.
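To make the weight-sharing relaxation concrete, here is a minimal sketch of a "free" (locally connected) convolutional layer in PyTorch: the sliding-window structure of a convolution is kept, but each output location gets its own weights instead of sharing one kernel. This is an illustration of the idea, not the authors' code.

```python
# Locally connected 2D layer: convolution-like receptive fields with
# no weight-sharing across spatial locations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FreeConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, in_size):
        super().__init__()
        self.k = kernel_size
        h, w = in_size
        self.out_h, self.out_w = h - kernel_size + 1, w - kernel_size + 1
        n_loc = self.out_h * self.out_w
        # One weight vector per output location: no sharing across space.
        self.weight = nn.Parameter(
            torch.randn(out_ch, n_loc, in_ch * kernel_size ** 2) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_ch, n_loc))

    def forward(self, x):
        # patches: (N, in_ch*k*k, n_loc), one column per output location
        patches = F.unfold(x, self.k)
        # contract the patch dimension separately at each location l
        out = torch.einsum("npl,clp->ncl", patches, self.weight) + self.bias
        return out.reshape(x.size(0), -1, self.out_h, self.out_w)

# Usage: 28x28 grayscale input, 8 output channels, 3x3 receptive fields.
layer = FreeConv2d(1, 8, 3, (28, 28))
y = layer(torch.randn(2, 1, 28, 28))  # -> (2, 8, 26, 26)
```

The parameter count grows with the number of spatial locations, which is exactly the cost that weight-sharing avoids; the study's question is whether training on translated data recovers the shared-kernel structure anyway.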
Level velocity statistics of hyperbolic chaos
A generalized version of the standard map is quantized as a model of quantum chaos. It is shown that, in the hyperbolic chaotic regime, the second moment of the quantum level velocity agrees with the prediction of random matrix theory.
Comment: 11 pages, 4 figures
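The quantities behind this result can be written down compactly. The sketch below uses standard conventions for a quantized kicked map; the Floquet operator shown is that of the ordinary kicked rotor, and the paper's generalized map may differ.

```latex
% One-step Floquet operator of the quantized kicked map and its eigenphases:
\[
  \hat{U}(K) = e^{-i\hat{p}^{2}/2\hbar}\, e^{-iK\cos\hat{q}/\hbar},
  \qquad
  \hat{U}(K)\,\lvert n\rangle = e^{-i\theta_{n}(K)}\,\lvert n\rangle .
\]
% Level velocities are the parametric derivatives of the eigenphases,
\[
  v_{n} = \frac{\partial \theta_{n}}{\partial K},
\]
% and the quantity compared against random matrix theory is their second
% moment $\langle v_{n}^{2}\rangle$ after the usual unfolding of the spectrum.
```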
An End-to-End CNN with Attentional Mechanism Applied to Raw EEG in a BCI Classification Task
Objective. Motor-imagery (MI) classification based on electroencephalography (EEG) has long been studied in neuroscience and, more recently, widely used in healthcare applications such as mobile assistive robots and neurorehabilitation. In particular, EEG-based motor-imagery classification methods that rely on convolutional neural networks (CNNs) have achieved relatively high classification accuracy. However, naively training CNNs to classify raw EEG data from all channels, especially for high-density EEG, is computationally demanding and requires huge training sets. It also often introduces many irrelevant input features, making it difficult for the CNN to extract the informative ones. This problem is compounded by a dearth of training data, which is particularly acute for MI tasks, because these are cognitively demanding and thus fatigue inducing. Approach. To address these issues, we propose an end-to-end CNN-based neural network with an attentional mechanism, together with different data augmentation (DA) techniques, and test it on two benchmark MI datasets, Brain-Computer Interface (BCI) Competition IV 2a and 2b. Main results. Our proposed neural-network architecture outperformed all state-of-the-art methods that we found in the literature, with and without DA, reaching average classification accuracies of 93.6% and 87.83% on BCI 2a and 2b, respectively. We also directly compare the decoding of MI and motor-execution (ME) tasks. Focusing on MI classification, we find optimal channel configurations and the best DA techniques, and investigate combining data across participants and the role of transfer learning. Significance. Our proposed approach improves classification accuracy for MI on the benchmark datasets. In addition, collecting our own dataset enables us to compare MI and ME and to investigate various aspects of EEG decoding critical for neuroscience and BCI.
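The architectural idea, a CNN over raw multi-channel EEG with a learned attention weighting over channels, can be sketched briefly. The PyTorch model below is an illustration of that idea; the attention design, layer sizes, and defaults are assumptions, not the paper's exact model.

```python
# End-to-end CNN on raw EEG with channel attention: the network learns a
# per-channel weight so informative electrodes are emphasized before the
# convolutional feature extractor.
import torch
import torch.nn as nn

class AttentionEEGNet(nn.Module):
    def __init__(self, n_channels=22, n_samples=1000, n_classes=4):
        super().__init__()
        # Channel attention: one scalar weight per EEG channel.
        self.attn = nn.Sequential(
            nn.Linear(n_samples, 1),   # summarize each channel's time course
            nn.Flatten(1),
            nn.Softmax(dim=1),
        )
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=25, stride=2), nn.ELU(),
            nn.Conv1d(16, 32, kernel_size=11, stride=2), nn.ELU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.classifier = nn.Linear(32 * 8, n_classes)

    def forward(self, x):               # x: (batch, channels, samples)
        w = self.attn(x)                # (batch, channels) attention weights
        x = x * w.unsqueeze(-1)         # re-weight channels before the CNN
        return self.classifier(self.conv(x).flatten(1))

# BCI Competition IV 2a has 22 EEG channels and 4 MI classes; a 4 s trial
# at 250 Hz gives 1000 samples, matching the defaults above.
logits = AttentionEEGNet()(torch.randn(8, 22, 1000))  # -> (8, 4)
```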