57,064 research outputs found

    Feature emergence via margin maximization: case studies in algebraic tasks

    Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning. While there have been significant recent strides in understanding how neural networks implement specific target functions in some cases, this paper explores a complementary question: why do networks arrive at particular computational strategies? Our inquiry focuses on the algebraic learning tasks of modular addition, sparse parities, and finite group operations. Our primary theoretical findings analytically characterize the features learned by stylized neural networks for these algebraic tasks. Notably, our main technique demonstrates how the principle of margin maximization alone can be used to fully specify the features learned by the network. Specifically, we prove that the trained networks utilize Fourier features to perform modular addition and employ features corresponding to irreducible group-theoretic representations to perform compositions in general groups, aligning closely with the empirical observations of Nanda et al. and Chughtai et al. More generally, we hope our techniques can help foster a deeper understanding of why neural networks adopt specific computational strategies.
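
    To make the Fourier-feature claim concrete, here is a minimal numpy sketch (an illustration of the mechanism, not the paper's trained network or its exact construction). It relies on the identity sum_k cos(2*pi*k*(a+b-c)/p) = p when c = (a+b) mod p and 0 otherwise, and uses the angle-addition formulas to combine per-input cosine/sine features, mirroring how paired Fourier features can implement modular addition.

        import numpy as np

        p = 17
        k = np.arange(p)  # all Fourier frequencies mod p

        def fourier(x):
            # cosine/sine features of x at every frequency: angle 2*pi*k*x/p
            theta = 2 * np.pi * k * x / p
            return np.cos(theta), np.sin(theta)

        def logits(a, b):
            ca, sa = fourier(a)
            cb, sb = fourier(b)
            # angle-addition identities give cos/sin of 2*pi*k*(a+b)/p
            cab = ca * cb - sa * sb
            sab = sa * cb + ca * sb
            out = np.empty(p)
            for c in range(p):
                cc, sc = fourier(c)
                # sum_k cos(2*pi*k*(a+b-c)/p): p if c == (a+b) mod p, else 0
                out[c] = np.sum(cab * cc + sab * sc)
            return out

        for a, b in [(3, 5), (9, 12), (16, 16)]:
            assert np.argmax(logits(a, b)) == (a + b) % p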

    Molding the Knowledge in Modular Neural Networks

    Problem description. The learning of monolithic neural networks becomes harder with growing network size, and the knowledge obtained while learning likewise becomes harder to extract. These disadvantages are caused by a lack of internal structure that, if present, would reduce the degrees of freedom in evolving towards a training target. A suitable internal structure with respect to modular network construction as well as to nodal discrimination is required. Details on the grouping and selection of nodes can sometimes be concluded from the characteristics of the application area; otherwise a comprehensive search within the solution space is necessary.

    Modular Neural Network Approaches for Surgical Image Recognition

    Deep learning-based applications have seen a lot of success in recent years. Text, audio, image, and video have all been explored with great success using deep learning approaches. The use of convolutional neural networks (CNN) in computer vision, in particular, has yielded reliable results. Achieving these results, however, requires a large amount of data, and a suitable dataset is not always accessible. Moreover, annotating data can be difficult and time-consuming. Self-training is a semi-supervised approach that has managed to alleviate this problem and achieve state-of-the-art performance; theoretical analysis has even shown that it can generalize better than a standard classifier. Another problem neural networks face is the increasing complexity of modern tasks, which demands high computational and storage costs. One way to mitigate this issue is modular learning, a strategy inspired by human cognition. The principle of the approach is to decompose a complex problem into simpler sub-tasks. This approach has several advantages, including faster learning, better generalization, and improved interpretability. In the first part of this paper, we introduce and evaluate different architectures of modular learning for Dorsal Capsulo-Scapholunate Septum (DCSS) instability classification. Our experiments show that modular learning improves performance compared to non-modular systems. Moreover, we found that the weighted modular variant, which weights each module's output by the probabilities from the gating module, achieved almost perfect classification. In the second part, we present our approach to data labeling and segmentation with self-training applied to shoulder arthroscopy images.
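
    The abstract does not spell out the weighted-modular architecture, so the following minimal numpy sketch shows one plausible reading: a gating module produces a probability distribution over expert modules, and the final prediction is the probability-weighted mixture of the experts' class distributions. All names, shapes, and the softmax gating are illustrative assumptions, not the paper's implementation.

        import numpy as np

        rng = np.random.default_rng(0)

        def softmax(z):
            z = z - z.max(axis=-1, keepdims=True)
            e = np.exp(z)
            return e / e.sum(axis=-1, keepdims=True)

        # hypothetical sizes: d-dim image features, m expert modules, c classes
        d, m, c = 64, 3, 4
        W_gate = rng.normal(size=(d, m)) * 0.1        # gating module
        W_experts = rng.normal(size=(m, d, c)) * 0.1  # one sub-task classifier per module

        def weighted_modular_predict(x):
            gate = softmax(x @ W_gate)                # (batch, m) module probabilities
            expert_logits = np.einsum('bd,mdc->bmc', x, W_experts)
            expert_probs = softmax(expert_logits)     # each module's class distribution
            # weighted modular output: mix expert predictions by gating probabilities
            return np.einsum('bm,bmc->bc', gate, expert_probs)

        x = rng.normal(size=(5, d))
        print(weighted_modular_predict(x).sum(axis=1))  # each row sums to 1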

    A neural network with modular hierarchical learning

    This invention provides a new hierarchical approach for supervised neural learning of time-dependent trajectories. The modular hierarchical methodology leads to architectures which are more structured than fully interconnected networks. The networks utilize a general feedforward flow of information and sparse recurrent connections to achieve dynamic effects. The advantages include the sparsity of units and connections and the modular organization. A further advantage is that learning is much more circumscribed than in fully interconnected systems. The present invention is embodied by a neural network including a plurality of neural modules, each having a pre-established performance capability, wherein each neural module has an output for the present results of the performance capability and an input for changing those results. For pattern recognition applications, the performance capability may be an oscillation capability producing a repeating wave pattern as the present results. In the preferred embodiment, each of the plurality of neural modules includes a pre-established capability portion and a performance adjustment portion connected to control the pre-established capability portion.
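
    As a rough illustration of the patent's vocabulary (a hypothetical reading, not the patented implementation), the sketch below models each neural module as a fixed oscillator (the pre-established capability portion) whose repeating wave output can be shifted in frequency and phase through an input-driven performance adjustment portion; several modules are then combined in a simple feedforward sum.

        import numpy as np

        class NeuralModule:
            """Hypothetical module: a fixed oscillator plus adjustable parameters."""

            def __init__(self, base_freq):
                self.base_freq = base_freq  # pre-established capability portion
                self.freq_adj = 0.0         # performance adjustment portion
                self.phase_adj = 0.0

            def adjust(self, dfreq, dphase):
                # the module's input: changes the present results of the capability
                self.freq_adj += dfreq
                self.phase_adj += dphase

            def output(self, t):
                # repeating wave pattern as the module's present results
                f = self.base_freq + self.freq_adj
                return np.sin(2 * np.pi * f * t + self.phase_adj)

        t = np.linspace(0.0, 1.0, 100)
        modules = [NeuralModule(f) for f in (1.0, 2.0, 3.0)]
        modules[1].adjust(dfreq=0.5, dphase=np.pi / 4)
        trajectory = sum(m.output(t) for m in modules)  # feedforward combination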