
    Function approximation in Hilbert spaces: a general sequential method and a particular implementation with neural networks

    A sequential method for approximating vectors in Hilbert spaces, called Sequential Approximation with Optimal Coefficients (SAOC), is presented. Most existing sequential methods choose the new term so that it matches the previous residue as closely as possible. Although this strategy leads to approximations that converge towards the target function, it may be far from the best strategy with regard to the number of terms of the approximation. SAOC combines two key ideas. The first is the optimization of the coefficients (the linear part of the approximation). The second is the flexibility to choose the frequencies (the nonlinear part). The only relation with the residue has to do with its capability to approximate the target vector f. SAOC maintains orthogonal-like properties. The theoretical results obtained prove that, under reasonable conditions, the construction of the approximation is always possible and that, in the limit, the residue of the approximation obtained with SAOC is the best that can be obtained with any subset of the given set of vectors. In addition, it seems that it should achieve the same accuracy as other existing sequential methods with fewer terms. In the particular case of L^2, it can be applied to polynomials, Fourier series, wavelets and neural networks, among others. A particular implementation using neural networks is also presented. In fact, the profit is reciprocal, because SAOC can be used as an inspiration to construct and train a neural network.
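
    To make the two key ideas concrete, the following is a minimal NumPy sketch of a SAOC-style construction: at each step every candidate "frequency" is tried, all linear coefficients are re-fitted optimally by least squares, and the candidate that most reduces the residue is kept. All names are illustrative assumptions; this is a sketch of the idea, not the authors' implementation.

        import numpy as np

        def saoc_sketch(f, candidates, n_terms):
            # f: target vector (n,); candidates: list of basis vectors (n,).
            # Assumes n_terms <= len(candidates). Returns the indices of the
            # chosen vectors and their optimal (least-squares) coefficients.
            chosen, coeffs = [], None
            for _ in range(n_terms):
                best = None
                for j, v in enumerate(candidates):
                    if j in chosen:
                        continue
                    Phi = np.column_stack([candidates[k] for k in chosen] + [v])
                    c, *_ = np.linalg.lstsq(Phi, f, rcond=None)  # optimal coefficients
                    res = np.linalg.norm(f - Phi @ c)            # resulting residue
                    if best is None or res < best[0]:
                        best = (res, j, c)
                _, j, coeffs = best
                chosen.append(j)
            return chosen, coeffs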

    Comparing error minimized extreme learning machines and support vector sequential feed-forward neural networks

    Recently, error minimized extreme learning machines (EM-ELMs) have been proposed as a simple and efficient approach to build single-hidden-layer feed-forward networks (SLFNs) sequentially. They add random hidden nodes one by one (or group by group) and update the output weights incrementally to minimize the sum-of-squares error on the training set. Other very similar methods that also construct SLFNs sequentially had been reported earlier, with the main difference that their hidden-layer weights are a subset of the data instead of being random. By analogy with the concept of support vectors originating in support vector machines (SVMs), these approaches can be referred to as support vector sequential feed-forward neural networks (SV-SFNNs), and they are a particular case of the Sequential Approximation with Optimal Coefficients and Interacting Frequencies (SAOCIF) method. In this paper, it is first shown that EM-ELMs can also be cast as a particular case of SAOCIF. In particular, EM-ELMs can easily be extended to test a number of random candidates at each step and select the best of them, as SAOCIF does. Moreover, it is demonstrated that the cost of the calculation of the optimal output-layer weights in the originally proposed EM-ELMs can be improved if it is replaced by the one included in SAOCIF. Secondly, we present the results of an experimental study on 10 benchmark classification and 10 benchmark regression data sets, comparing EM-ELMs and SV-SFNNs under the same conditions for both models. Although both models have the same (efficient) computational cost, a statistically significant improvement in the generalization performance of SV-SFNNs over EM-ELMs was found in 12 out of the 20 benchmark problems.
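
    As an illustration of the candidate-testing extension mentioned above, here is a hedged NumPy sketch: several random hidden nodes are tried at each step and the one minimizing the training sum-of-squares error is kept. For clarity the output weights are recomputed with a pseudoinverse, whereas EM-ELMs update them incrementally; all names are hypothetical.

        import numpy as np

        def add_best_random_node(X, y, H, n_candidates=10, rng=np.random.default_rng(0)):
            # X: inputs (n, d); y: targets (n,); H: current hidden-layer outputs
            # (n, m) or None. Tries several random hidden nodes and keeps the one
            # that minimizes the training sum-of-squares error.
            best = None
            for _ in range(n_candidates):
                w = rng.standard_normal(X.shape[1])   # random input weights
                b = rng.standard_normal()             # random bias
                h = np.tanh(X @ w + b)                # candidate hidden-node output
                H_new = np.column_stack([H, h]) if H is not None else h[:, None]
                beta = np.linalg.pinv(H_new) @ y      # least-squares output weights
                sse = np.sum((H_new @ beta - y) ** 2)
                if best is None or sse < best[0]:
                    best = (sse, H_new, beta)
            return best[1], best[2]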

    Benchmarking the selection of the hidden-layer weights in extreme learning machines

    Recent years have seen a growing interest in neural networks whose hidden-layer weights are randomly selected, such as Extreme Learning Machines (ELMs). These models are motivated by their ease of development, high computational learning speed and relatively good results. Alternatively, constructive models that select the hidden-layer weights as a subset of the data have in some cases shown superior performance to random-based ones. In this work, we present a comparison between original ELMs (i.e., ELMs where the hidden-layer weights are selected randomly, which we will call ELM-Random) and a modified version of ELMs where the hidden-layer weights are a subset of the input data (which we will call ELM-Input). We focus our comparison on the behavior of both strategies for different sizes of the training set and different network sizes. The results on several benchmark data sets for classification problems show that ELM-Input performs better than ELM-Random in some cases and comparably in the rest. In some cases, this general trend is observed for all sizes of the training set and all network sizes; in others, it is mostly observed when the training set is small. Therefore, the strategy of selecting the hidden-layer weights among the data can be considered a good alternative or complement to the standard random selection for ELMs.
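
    A minimal sketch of the two strategies compared above, assuming a tanh hidden layer and least-squares output weights (the activation and helper names are illustrative assumptions, not the paper's exact setup):

        import numpy as np

        def elm_fit(X, y, n_hidden, strategy="random", rng=np.random.default_rng(0)):
            if strategy == "random":               # ELM-Random: random hidden weights
                W = rng.standard_normal((n_hidden, X.shape[1]))
            else:                                  # ELM-Input: weights are a subset of the data
                idx = rng.choice(len(X), size=n_hidden, replace=False)
                W = X[idx]
            H = np.tanh(X @ W.T)                   # hidden-layer activations
            beta = np.linalg.pinv(H) @ y           # least-squares output weights
            return W, beta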

    Weighted Contrastive Divergence

    Learning algorithms for energy-based Boltzmann architectures that rely on gradient descent are in general computationally prohibitive, typically due to the exponential number of terms involved in computing the partition function. One therefore has to resort to approximation schemes for the evaluation of the gradient. This is the case of Restricted Boltzmann Machines (RBMs) and their learning algorithm, Contrastive Divergence (CD). It is well known that CD has a number of shortcomings, and its approximation to the gradient has several drawbacks. Overcoming these defects has been the basis of much research, and new algorithms have been devised, such as persistent CD. In this manuscript we propose a new algorithm, called Weighted CD (WCD), built from small modifications of the negative phase in standard CD. However small these modifications may be, the experimental work reported in this paper suggests that WCD provides a significant improvement over standard CD and persistent CD at a small additional computational cost.
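
    As a sketch of the general idea of reweighting the negative phase, the following CD-1 weight gradient for a binary RBM averages the negative samples with weights proportional to exp(-F(v)), their unnormalized probability, instead of uniformly. This is one plausible reading of such a modification, not necessarily the exact WCD weighting; all names are illustrative.

        import numpy as np

        def sigmoid(x):
            return 1.0 / (1.0 + np.exp(-x))

        def free_energy(v, W, b, c):
            # Standard free energy of a binary RBM: F(v) = -b.v - sum_j softplus(c_j + v.W_j)
            return -v @ b - np.logaddexp(0.0, v @ W + c).sum(axis=1)

        def wcd1_grad(v0, W, b, c, rng=np.random.default_rng(0)):
            # v0: batch of visible vectors (n, d). Returns a weighted CD-1 gradient of W.
            h0 = sigmoid(v0 @ W + c)                   # positive phase
            h_samp = (rng.random(h0.shape) < h0).astype(float)
            v1 = sigmoid(h_samp @ W.T + b)             # one Gibbs step (mean-field visibles)
            h1 = sigmoid(v1 @ W + c)
            w = np.exp(-free_energy(v1, W, b, c))      # negative-phase sample weights
            w /= w.sum()
            pos = v0.T @ h0 / len(v0)                  # uniform positive-phase average
            neg = (v1 * w[:, None]).T @ h1             # weighted negative-phase average
            return pos - neg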

    On the use of pairwise distance learning for brain signal classification with limited observations

    The increasing access to brain signal data using electroencephalography creates new opportunities to study electrophysiological brain activity and perform ambulatory diagnoses of neurological disorders. This work proposes a pairwise distance learning approach for schizophrenia classification relying on the spectral properties of the signal. To be able to handle clinical trials with a limited number of observations (i.e. case and/or control individuals), we propose a Siamese neural network architecture that learns a discriminative feature space from pairwise combinations of observations per channel. In this way, the multivariate order of the signal is used as a form of data augmentation, further supporting the network's generalization ability. Convolutional layers with parameters learned under a cosine contrastive loss are proposed to adequately explore spectral images derived from the brain signal. The proposed approach to schizophrenia diagnosis was tested on reference clinical trial data under a resting-state protocol, achieving 0.95 ± 0.05 accuracy, 0.98 ± 0.02 sensitivity and 0.92 ± 0.07 specificity. Results show that the features extracted using the proposed neural network are markedly superior to the baselines for diagnosing schizophrenia (+20pp in accuracy and sensitivity), suggesting the existence of non-trivial electrophysiological brain patterns able to capture discriminative neuroplasticity profiles among individuals. The code is available on Github: https://github.com/DCalhas/siamese_schizophrenia_eeg.
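
    A minimal sketch of a cosine contrastive loss of the kind described above, assuming the embeddings come from the twin convolutional branches of the Siamese network; the margin value and function names are illustrative assumptions, not the paper's exact formulation (see the repository above for the actual code).

        import numpy as np

        def cosine_contrastive_loss(z1, z2, same_class, margin=0.5):
            # z1, z2: embedding batches (n, k); same_class: boolean array (n,).
            # Pulls same-class pairs toward cosine similarity 1 and pushes
            # different-class pairs below the margin.
            cos = np.sum(z1 * z2, axis=1) / (
                np.linalg.norm(z1, axis=1) * np.linalg.norm(z2, axis=1) + 1e-12)
            pos = (1.0 - cos) * same_class                       # similar pairs
            neg = np.maximum(0.0, cos - margin) * (~same_class)  # dissimilar pairs
            return np.mean(pos + neg)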

    A fuzzy rule model for high level musical features on automated composition systems

    Algorithmic composition systems are now well understood. However, when they are used for specific tasks, such as creating material for a part of a piece, it is common to prefer, among all of their possible outputs, those exhibiting specific properties. Even though the number of valid outputs is huge, the selection is often performed manually, either using expertise in the algorithmic model, by means of sampling techniques, or sometimes even by chance. This process has traditionally been automated using machine learning techniques. However, whether these techniques can really capture, to a great degree, the human rationale behind the selection remains an open question. The present work discusses a possible approach that combines expert opinion with a fuzzy methodology for rule extraction to model high-level features. An early implementation able to explore the universe of outputs of a particular algorithm by means of the extracted rules is discussed. The rules search for objects similar to those having a desired and pre-identified feature. In this sense, the model can be seen as a finder of objects with specific properties.

    On the selection of hidden neurons with heuristic search strategies for approximation

    Feature Selection techniques usually follow some search strategy to select a suitable subset from a set of features. Most neural network growing algorithms perform a search with Forward Selection with the objective of finding a reasonably good subset of neurons. Using this link between both fields (feature selection and neuron selection), we propose and analyze different algorithms for the construction of neural networks based on heuristic search strategies coming from the feature selection field. The results of an experimental comparison to Forward Selection using both synthetic and real data show that a much better approximation can be achieved, though at the expense of a higher computational cost.
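
    As a point of reference for the search strategies discussed above, here is a minimal sketch of the Forward Selection baseline over a pool of candidate hidden neurons, fitting least-squares output weights for each candidate subset. Names are illustrative; the heuristic strategies proposed in the paper replace this greedy loop with other searches.

        import numpy as np

        def fit_error(H, y, S):
            # Least-squares output weights for neuron subset S, and the resulting SSE.
            c, *_ = np.linalg.lstsq(H[:, S], y, rcond=None)
            return np.sum((H[:, S] @ c - y) ** 2)

        def forward_select(H, y, n_select):
            # H: outputs of a pool of candidate hidden neurons (n, p); y: target (n,).
            S = []
            for _ in range(n_select):
                rest = [k for k in range(H.shape[1]) if k not in S]
                S.append(min(rest, key=lambda k: fit_error(H, y, S + [k])))
            return S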

    Extended linear models with Gaussian prior on the parameters and adaptive expansion vectors

    We present an approximate Bayesian method for regression and classification with models linear in the parameters. Similar to the Relevance Vector Machine (RVM), each parameter is associated with an expansion vector. Unlike the RVM, the number of expansion vectors is specified beforehand. We assume an overall Gaussian prior on the parameters and find, with a gradient based process, the expansion vectors that (locally) maximize the evidence. This approach has lower computational demands than the RVM, and has the advantage that the vectors do not necessarily belong to the training set. Therefore, in principle, better vectors can be found. Furthermore, other hyperparameters can be learned in the same smooth joint optimization. Experimental results show that the freedom of the expansion vectors to be located away from the training data causes overfitting problems. These problems are alleviated by including a hyperprior that penalizes expansion vectors located far away from the input data.
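
    A minimal sketch of the quantity being maximized, assuming an RBF expansion and illustrative hyperparameter names: the log evidence (marginal likelihood) of a model linear in the parameters with an overall zero-mean Gaussian prior, which the gradient-based process above maximizes with respect to the expansion vectors V.

        import numpy as np

        def log_evidence(X, y, V, alpha=1.0, noise_var=0.1, gamma=1.0):
            # X: inputs (n, d); y: targets (n,); V: expansion vectors (m, d).
            # Parameters w ~ N(0, alpha^{-1} I), observation noise ~ N(0, noise_var).
            d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
            Phi = np.exp(-gamma * d2)                            # RBF design matrix (n, m)
            C = noise_var * np.eye(len(X)) + Phi @ Phi.T / alpha # marginal covariance of y
            _, logdet = np.linalg.slogdet(C)
            return -0.5 * (logdet + y @ np.linalg.solve(C, y) + len(X) * np.log(2 * np.pi))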

    A methodological approach for algorithmic composition systems' parameter spaces aesthetic exploration

    Algorithmic composition is the process of creating musical material by means of formal methods. As a consequence of their design, algorithmic composition systems are (explicitly or implicitly) described in terms of parameters. Thus, parameter space exploration plays a key role in learning a system's capabilities. However, in the computer music field, this task has received little attention. This is partly because the changes produced in the human perception of the outputs, in response to changes in the parameters, can be highly nonlinear; therefore, models with strongly predictable outputs are needed. The present work describes a methodology for the human perceptual (or aesthetic) exploration of generative systems' parameter spaces. As the systems' outputs are intended to produce an aesthetic experience in humans, audition plays a central role in the process. The methodology starts from a set of parameter combinations which are perceptually evaluated by the user. The sampling process for such combinations depends on the system under study and possibly on heuristic considerations. The evaluated set is processed by a compaction algorithm able to generate linguistic rules describing the distinct perceptions (classes) in the user evaluation. The semantic level of the extracted rules allows for interpretability, while showing great potential in describing high- and low-level musical entities. As the resulting rules represent discrete points in the parameter space, possible extensions for interpolation between points are also discussed. Finally, some practical implementations and paths for further research are presented.

    Classifying and generalizing successful parameter combinations for sound design

    Operating parametric systems in the context of sound design imposes cognitive and practical challenges. The present contribution applies rule extraction to analyze and to generalize a set of parameter combinations, which have been preselected by a user since they produce sound results within a desired perceptual category. Then, it is discussed how and under which conditions these generalizations can be used, for example, for the automation of specific tasks.