Feature emergence via margin maximization: case studies in algebraic tasks
Understanding the internal representations learned by neural networks is a
cornerstone challenge in the science of machine learning. While there have been
significant recent strides in some cases towards understanding how neural
networks implement specific target functions, this paper explores a
complementary question -- why do networks arrive at particular computational
strategies? Our inquiry focuses on the algebraic learning tasks of modular
addition, sparse parities, and finite group operations. Our primary theoretical
findings analytically characterize the features learned by stylized neural
networks for these algebraic tasks. Notably, our main technique demonstrates
how the principle of margin maximization alone can be used to fully specify the
features learned by the network. Specifically, we prove that the trained
networks utilize Fourier features to perform modular addition and employ
features corresponding to irreducible group-theoretic representations to
perform compositions in general groups, aligning closely with the empirical
observations of Nanda et al. and Chughtai et al. More generally, we hope our
techniques can help to foster a deeper understanding of why neural networks
adopt specific computational strategies.
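The Fourier-feature mechanism this abstract refers to can be illustrated numerically: logits of the form cos(2πk(a+b−c)/p) peak exactly at c = (a+b) mod p, and by standard trigonometric identities they decompose into products of Fourier features of the inputs. The sketch below is a toy check of that fact only (the modulus, frequency set, and code are illustrative assumptions, not the paper's construction):

```python
import numpy as np

p = 7            # modulus (kept small for an exhaustive check)
freqs = [1, 2, 3]  # a few Fourier frequencies

def logits(a, b):
    # Logit for each candidate answer c: sum_k cos(2*pi*k*(a+b-c)/p).
    # Via cos(w(a+b-c)) = cos(w(a+b))cos(wc) + sin(w(a+b))sin(wc), each
    # term factors into Fourier features of the inputs and of c.
    c = np.arange(p)
    return sum(np.cos(2 * np.pi * k * (a + b - c) / p) for k in freqs)

# Exhaustive check: argmax over c recovers (a + b) mod p for every pair,
# since every cosine term equals 1 exactly when a+b-c is a multiple of p.
assert all(np.argmax(logits(a, b)) == (a + b) % p
           for a in range(p) for b in range(p))
```

The argmax is unique because for any nonzero residue d, the partial cosine sum over the chosen frequencies is strictly below its value at d = 0.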
Molding the Knowledge in Modular Neural Networks
Problem description. The learning of monolithic neural networks becomes harder with growing network size. Likewise, the knowledge obtained during learning becomes harder to extract. These disadvantages stem from a lack of internal structure, which, if present, would reduce the degrees of freedom in evolving toward a training target. A suitable internal structure is required, both with respect to modular network construction and to nodal discrimination. Details on the grouping and selection of nodes can sometimes be inferred from the characteristics of the application area; otherwise, a comprehensive search of the solution space is necessary.
Modular Neural Network Approaches for Surgical Image Recognition
Deep learning-based applications have seen a lot of success in recent years.
Text, audio, image, and video have all been explored with great success using
deep learning approaches. The use of convolutional neural networks (CNN) in
computer vision, in particular, has yielded reliable results. In order to
achieve these results, a large amount of data is required. However, the dataset
cannot always be accessible. Moreover, annotating data can be difficult and
time-consuming. Self-training is a semi-supervised approach that managed to
alleviate this problem and achieve state-of-the-art performances. Theoretical
analysis even proved that it may result in a better generalization than a
normal classifier. Another problem neural networks can face is the increasing
complexity of modern problems, which requires high computational and storage
costs. One way to mitigate this issue is a strategy inspired by human
cognition known as modular learning. The principle of the approach is to
decompose a complex problem into simpler sub-tasks. This approach has several
advantages, including faster learning, better generalization, and improved
interpretability.
In the first part of this paper, we introduce and evaluate different
architectures of modular learning for Dorsal Capsulo-Scapholunate Septum (DCSS)
instability classification. Our experiments have shown that modular learning
improves performance compared to non-modular systems. Moreover, we found that
the weighted modular approach, which weights each module's output using the
probabilities produced by the gating module, achieved almost perfect
classification. In the second part, we present our approach for data labeling
and segmentation with self-training applied to shoulder arthroscopy images.
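The gating-weighted combination described in this abstract can be sketched in a few lines: each expert module emits class probabilities, and the gating module's softmax scores weight them before the final argmax. All numbers, module counts, and names below are hypothetical illustrations, not the paper's actual architecture:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over gating scores.
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical toy setup: two expert modules, each emitting class
# probabilities for a binary task, plus a gating module's raw scores.
expert_outputs = np.array([[0.9, 0.1],   # expert 1's class probabilities
                           [0.2, 0.8]])  # expert 2's class probabilities
gate_scores = np.array([2.0, 0.5])       # gating module's confidence in each expert

# Weighted modular prediction: weight each expert's output by the
# gating probability, then pick the highest-scoring class.
gate_probs = softmax(gate_scores)
combined = gate_probs @ expert_outputs   # convex combination of expert outputs
prediction = int(np.argmax(combined))
```

Because the gating probabilities sum to one, the combined vector remains a valid probability distribution over classes.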
A neural network with modular hierarchical learning
This invention provides a new hierarchical approach for supervised neural learning of time-dependent trajectories. The modular hierarchical methodology leads to architectures that are more structured than fully interconnected networks. The networks utilize a general feedforward flow of information and sparse recurrent connections to achieve dynamic effects. The advantages include the sparsity of units and connections and the modular organization. A further advantage is that the learning is much more circumscribed than in fully interconnected systems. The present invention is embodied by a neural network including a plurality of neural modules, each having a pre-established performance capability, wherein each neural module has an output outputting present results of the performance capability and an input for changing the present results of the performance capability. For pattern recognition applications, the performance capability may be an oscillation capability producing a repeating wave pattern as the present results. In the preferred embodiment, each of the plurality of neural modules includes a pre-established capability portion and a performance adjustment portion connected to control the pre-established capability portion.