
    Parallel growing and training of neural networks using output parallelism

    In order to find an appropriate architecture for a large-scale real-world application automatically and efficiently, a natural approach is to divide the original problem into a set of sub-problems. In this paper, we propose a simple neural network task decomposition method based on output parallelism. Using this method, a problem can be divided flexibly into several sub-problems, each composed of the whole input vector and a fraction of the output vector. Each module (one per sub-problem) is responsible for producing a fraction of the output vector of the original problem, so the hidden structures serving different output units of the original problem are decoupled. These modules can be grown and trained in parallel on parallel processing elements. Combined with a constructive learning algorithm, our method requires neither excessive computation nor any prior knowledge concerning decomposition. The feasibility of output parallelism is analyzed and proved. Several benchmarks are implemented to test the validity of this method; the results show that it can reduce computation time, increase learning speed, and improve generalization accuracy for both classification and regression problems.
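    The sketch below illustrates the output-parallelism idea in its simplest form: each module receives the full input but is trained only on its own slice of the output vector, so the modules are independent and could be trained on separate processors. It is not the paper's code; the fixed module size, toy data, and sequential training loop are assumptions made for illustration (the paper grows each module constructively).

```python
# Illustrative sketch (not the paper's code): decompose a 4-output problem into
# independent per-output modules, each seeing the full input but only a slice of
# the target. Module sizes and the toy data are assumptions; in the paper the
# modules are grown constructively and trained on parallel processing elements,
# whereas here they are plain fixed-size MLPs trained one after another.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 10)            # full input vector (10 attributes)
Y = torch.randn(256, 4)             # 4 output units, split into 4 sub-problems

def make_module(n_in, n_out):
    # one module per sub-problem: whole input -> a fraction of the output vector
    return nn.Sequential(nn.Linear(n_in, 16), nn.Tanh(), nn.Linear(16, n_out))

output_slices = [slice(i, i + 1) for i in range(Y.shape[1])]
modules = [make_module(X.shape[1], 1) for _ in output_slices]

for module, sl in zip(modules, output_slices):   # independent, hence parallelizable
    opt = torch.optim.Adam(module.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        loss = nn.functional.mse_loss(module(X), Y[:, sl])
        loss.backward()
        opt.step()

# the full prediction is simply the concatenation of the modules' outputs
Y_hat = torch.cat([m(X) for m in modules], dim=1)
print(Y_hat.shape)  # torch.Size([256, 4])
```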

    Incremental learning with respect to new incoming input attributes

    Neural networks are generally exposed to dynamic environments where training patterns or input attributes (features) are likely to be introduced into the current domain incrementally. This paper considers the situation where a new set of input attributes must be considered and added to an existing neural network. The conventional approach is to discard the existing network and redesign one from scratch, which wastes the old knowledge and the previous effort. In order to reduce computation time, improve generalization accuracy, and enhance the intelligence of the learned models, we present the ILIA algorithms (ILIA1, ILIA2, ILIA3, ILIA4 and ILIA5), which are capable of Incremental Learning in terms of Input Attributes. With the ILIA algorithms, when new input attributes are introduced into the original problem, the existing neural network is retained and a new sub-network is constructed and trained incrementally; the new sub-network and the old one are then merged to form a network for the changed problem. In addition, the ILIA algorithms can decide whether the new incoming input attributes are relevant to the output and consistent with the existing input attributes, and suggest accepting or rejecting them accordingly. Experimental results show that the ILIA algorithms are efficient and effective for both classification and regression problems.
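    A hedged sketch of the general ILIA idea follows: keep the already-trained network frozen, train a new sub-network on the new attributes only, and merge the two outputs. The exact growth and merge rules differ between ILIA1 to ILIA5 and are not reproduced here; the additive merge, all layer sizes, and the synthetic data below are assumptions chosen only to make the pattern concrete.

```python
# One plausible ILIA-style variant (an assumption, not the authors' algorithm):
# the frozen old network's output is summed with a trainable sub-network that is
# fed only by the newly introduced input attributes.
import torch
import torch.nn as nn

torch.manual_seed(0)
X_old = torch.randn(256, 8)                      # original input attributes
X_new = torch.randn(256, 3)                      # newly introduced attributes
y = torch.randn(256, 1)

old_net = nn.Sequential(nn.Linear(8, 16), nn.Tanh(), nn.Linear(16, 1))
# assume old_net was already trained on (X_old, y); retain its knowledge frozen
for p in old_net.parameters():
    p.requires_grad_(False)

sub_net = nn.Sequential(nn.Linear(3, 8), nn.Tanh(), nn.Linear(8, 1))

opt = torch.optim.Adam(sub_net.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    merged = old_net(X_old) + sub_net(X_new)     # merged network for the changed problem
    loss = nn.functional.mse_loss(merged, y)
    loss.backward()
    opt.step()

# If the merged network does not improve on old_net alone (e.g. on validation
# loss), that is evidence for rejecting the new attributes as irrelevant.
```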

    SCANN: Synthesis of Compact and Accurate Neural Networks

    Deep neural networks (DNNs) have become the driving force behind recent artificial intelligence (AI) research. An important problem in implementing a neural network is the design of its architecture. Typically, such an architecture is obtained manually by exploring its hyperparameter space and is kept fixed during training. This approach is time-consuming and inefficient. Another issue is that modern neural networks often contain millions of parameters, whereas many applications and devices require small inference models. However, efforts to migrate DNNs to such devices typically entail a significant loss of classification accuracy. To address these challenges, we propose a two-step neural network synthesis methodology, called DR+SCANN, that combines two complementary approaches to design compact and accurate DNNs. At the core of our framework is the SCANN methodology, which uses three basic architecture-changing operations, namely connection growth, neuron growth, and connection pruning, to synthesize feed-forward architectures with arbitrary structure. SCANN encapsulates three synthesis methodologies that apply a repeated grow-and-prune paradigm to three architectural starting points. DR+SCANN combines the SCANN methodology with dataset dimensionality reduction to alleviate the curse of dimensionality. We demonstrate the efficacy of SCANN and DR+SCANN on various image and non-image datasets: SCANN is evaluated on the MNIST and ImageNet benchmarks, and DR+SCANN on nine small- to medium-size datasets. We also show that our synthesis methodology yields neural networks that are much better at navigating the accuracy vs. energy efficiency trade-off, which would enable neural network-based inference even on Internet-of-Things sensors.
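    The following is a minimal sketch of one grow-and-prune step in the spirit of SCANN's connection pruning and connection growth operations (neuron growth is omitted). It is not the authors' implementation; the pruning fraction, the gradient-based growth criterion, and the single toy layer are assumptions used to show the mask-based mechanics.

```python
# Assumed grow-and-prune step: prune active connections with the smallest weight
# magnitude, then grow back inactive connections with the largest gradient
# magnitude (i.e. those expected to reduce the loss most if added).
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(10, 16)
mask = torch.ones_like(layer.weight)             # 1 = connection present

X, y = torch.randn(64, 10), torch.randn(64, 16)
loss = nn.functional.mse_loss(X @ (layer.weight * mask).t() + layer.bias, y)
loss.backward()                                  # populates layer.weight.grad

# prune: drop a fraction of the active connections with the smallest magnitude
prune_frac = 0.2
w = (layer.weight * mask).abs()
k = int(prune_frac * int(mask.sum()))
threshold = torch.kthvalue(w[mask.bool()], k).values
mask[w <= threshold] = 0.0

# grow: re-enable the currently inactive connections with the largest gradients
grad = layer.weight.grad.abs() * (1 - mask)
grow_k = k // 2
flat_idx = torch.topk(grad.flatten(), grow_k).indices
mask.view(-1)[flat_idx] = 1.0

print(f"active connections: {int(mask.sum())} / {mask.numel()}")
```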

    Evolutionary cellular configurations for designing feed-forward neural networks architectures

    Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks (IWANN 2001), Granada, Spain, June 13–15, 2001. In recent years, interest in developing automatic methods to determine appropriate architectures for feed-forward neural networks has increased. Most of these methods are based on evolutionary computation paradigms. Some of them use direct representations of the parameters of the network; such representations do not scale, since very large structures are required to represent large architectures. A more interesting alternative is indirect schemes, which encode a compact representation of the neural network. In this work, an indirect constructive encoding scheme is presented. The scheme is based on cellular automata representations in order to increase the scalability of the method.
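    To make the indirect-encoding idea concrete, here is a small sketch in which a compact genome (a cellular automaton rule plus a seed row) is expanded into a layer's connectivity matrix. The paper's actual encoding uses a different, constructive automaton; the elementary 1-D CA, the rule number, and the seed position below are assumptions chosen purely for illustration.

```python
# Assumed indirect encoding: a rule byte + an 8-bit seed expand, via a 1-D
# elementary cellular automaton, into a binary connectivity matrix for one
# layer of a feed-forward network.
import numpy as np

def expand(rule: int, seed: np.ndarray, steps: int) -> np.ndarray:
    """Run an elementary 1-D CA for `steps` generations; rows = generations."""
    rule_bits = np.array([(rule >> i) & 1 for i in range(8)], dtype=np.uint8)
    rows = [seed.copy()]
    for _ in range(steps):
        prev = rows[-1]
        left, right = np.roll(prev, 1), np.roll(prev, -1)
        neighbourhood = 4 * left + 2 * prev + right   # values 0..7 index the rule table
        rows.append(rule_bits[neighbourhood])
    return np.stack(rows)

n_inputs, n_hidden = 8, 9
genome_rule = 110                                      # arbitrary example rule
genome_seed = np.eye(1, n_inputs, 3, dtype=np.uint8)[0]  # single seed cell

# compact genome (1 rule byte + 8 seed bits) -> 9 x 8 connectivity matrix
connectivity = expand(genome_rule, genome_seed, steps=n_hidden - 1)
print(connectivity)   # entry (i, j) = 1 means hidden unit i connects to input j
```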

    Neural Network architectures design by Cellular Automata evolution

    4th Conference of Systemics, Cybernetics and Informatics, Orlando, 23–26 July 2000. The design of the architecture is a crucial step in the successful application of a neural network. However, in most cases architecture design is essentially a human expert's job: it depends heavily on both the expert's experience and a tedious trial-and-error process. Therefore, the development of automatic methods to determine the architecture of feed-forward neural networks is a field of interest in the neural network community. These methods are generally based on search techniques such as genetic algorithms, simulated annealing, or evolution strategies. Most of the designed methods rely on a direct representation of the parameters of the network. This representation does not scale, since very large structures are required to represent large architectures. In this work, an indirect constructive encoding scheme is proposed to find optimal architectures of feed-forward neural networks. The scheme is based on cellular automata representations in order to increase the scalability of the method.
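    The abstract describes an outer evolutionary search over compact encodings rather than over full weight structures. The sketch below shows only that outer loop; `decode_and_evaluate` is a hypothetical placeholder (it stands in for decoding the cellular automaton, building the network, training it, and returning a validation score), and the genome layout, selection scheme, and mutation rate are assumptions.

```python
# Assumed outer evolutionary loop over compact genomes (rule bits + seed bits).
# decode_and_evaluate is a stand-in fitness function so the loop is runnable;
# it does NOT implement the paper's decoding or training.
import random

random.seed(0)
GENOME_BITS = 8 + 8                      # 8 rule bits + 8 seed bits

def decode_and_evaluate(genome):
    # placeholder score rewarding sparse seeds near an arbitrary target rule;
    # in practice: decode the CA, grow the network, train, return accuracy
    rule = int("".join(map(str, genome[:8])), 2)
    sparsity = genome[8:].count(0) / 8
    return sparsity - abs(rule - 110) / 255.0

def mutate(genome, p=0.05):
    return [b ^ 1 if random.random() < p else b for b in genome]

population = [[random.randint(0, 1) for _ in range(GENOME_BITS)] for _ in range(20)]
for generation in range(30):
    ranked = sorted(population, key=decode_and_evaluate, reverse=True)
    parents = ranked[:5]                                  # truncation selection
    population = parents + [mutate(random.choice(parents)) for _ in range(15)]

best = max(population, key=decode_and_evaluate)
print("best genome:", best, "fitness:", round(decode_and_evaluate(best), 3))
```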