
    Correlations between hidden units in multilayer neural networks and replica symmetry breaking

    We consider feed-forward neural networks with one hidden layer, tree architecture, and a fixed hidden-to-output Boolean function. Focusing on the saturation limit of the storage problem, we investigate the influence of replica symmetry breaking on the distribution of local fields at the hidden units. These field distributions determine the probability of finding a specific activation pattern of the hidden units as well as the corresponding correlation coefficients, and therefore quantify the division of labor among the hidden units. We find that, although it markedly modifies the storage capacity and the distribution of local fields, replica symmetry breaking has only a minor effect on the correlation coefficients. Detailed numerical results are provided for the PARITY, COMMITTEE and AND machines with K=3 hidden units and nonoverlapping receptive fields. Comment: 9 pages, 3 figures, RevTex, accepted for publication in Phys. Rev.
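
    A minimal numerical sketch of the quantities referred to above, not the paper's replica calculation: for a tree machine with K=3 hidden units and nonoverlapping receptive fields, it estimates the mean activations and pairwise covariances of the hidden units conditioned on output +1. The couplings are random (untrained), so the sketch only illustrates the definitions, not the saturation-limit values; field size, pattern count and the decoder implementations are illustrative choices.

        # Illustrative sketch (not the paper's replica calculation): estimates the mean
        # activations and pairwise covariances of the hidden units of a tree machine
        # with K=3 hidden units and nonoverlapping receptive fields, conditioned on
        # output +1, using random couplings and random Boolean input patterns.
        import numpy as np

        rng = np.random.default_rng(0)
        K, n, P = 3, 100, 20000          # hidden units, inputs per receptive field, patterns

        decoders = {
            "PARITY":    lambda tau: np.prod(tau, axis=1),
            "COMMITTEE": lambda tau: np.sign(np.sum(tau, axis=1)),
            "AND":       lambda tau: np.where(np.all(tau > 0, axis=1), 1.0, -1.0),
        }

        w = rng.standard_normal((K, n))                       # one coupling vector per branch
        x = rng.choice([-1.0, 1.0], size=(P, K, n))           # random Boolean inputs
        fields = np.einsum('pkn,kn->pk', x, w) / np.sqrt(n)   # local fields at the hidden units
        tau = np.sign(fields)                                 # hidden-unit activations

        for name, decode in decoders.items():
            t = tau[decode(tau) > 0]                          # patterns mapped to output +1
            m = t.mean(axis=0)
            C = t.T @ t / len(t) - np.outer(m, m)             # covariances of the activations
            print(name, "means:", np.round(m, 3),
                  " covariances (1,2) (1,3) (2,3):", np.round([C[0, 1], C[0, 2], C[1, 2]], 3))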

    Correlation of internal representations in feed-forward neural networks

    Feed-forward multilayer neural networks implementing random input-output mappings develop characteristic correlations between the activities of their hidden nodes, which are important for understanding the storage and generalization performance of the network. It is shown how these correlations can be calculated from the joint probability distribution of the aligning fields at the hidden units for an arbitrary decoder function between hidden layer and output. Explicit results are given for the PARITY, AND, and COMMITTEE machines with an arbitrary number of hidden nodes near saturation. Comment: 6 pages, latex, 1 figure

    Odor naming and interpretation performance in 881 schizophrenia subjects: association with clinical parameters

    BACKGROUND: Olfactory function tests are sensitive tools for assessing sensory-cognitive processing in schizophrenia. However, associations of central olfactory measures with clinical outcome parameters have not been studied simultaneously in large samples of schizophrenia patients. METHODS: In the framework of the comprehensive phenotyping of the GRAS (Göttingen Research Association for Schizophrenia) cohort, we modified and extended existing odor naming (active memory retrieval) and interpretation (attribute assignment) tasks and evaluated them in 881 schizophrenia patients and 102 healthy controls matched for age, gender and smoking behavior. Associations with emotional processing, neuropsychological test performance and disease outcome were studied. RESULTS: Schizophrenia patients underperformed controls in both olfactory tasks. Odor naming deficits were primarily associated with compromised cognition, interpretation deficits with positive symptom severity and general alertness. Contrasting the extreme performers among schizophrenia patients on odor interpretation (best versus worst percentile; N=88 each) with healthy individuals (N=102) underscores the relationship between impaired odor interpretation and psychopathology, cognitive dysfunction, and emotional processing (all p<0.004). CONCLUSIONS: The strong association of performance in higher olfactory measures, odor naming and interpretation, with lead symptoms of schizophrenia and determinants of disease severity highlights their clinical and scientific significance. Based on the results obtained here in an exploratory fashion in a large patient sample, the development of an easy-to-use clinical test with improved psychometric properties may be encouraged.

    Statistical Mechanics of Learning: A Variational Approach for Real Data

    Using a variational technique, we generalize the statistical physics approach of learning from random examples to make it applicable to real data. We demonstrate the validity and relevance of our method by computing approximate estimators for generalization errors that are based on training data alone. Comment: 4 pages, 2 figures

    Storage capacity of correlated perceptrons

    We consider an ensemble of K single-layer perceptrons exposed to random inputs and investigate the conditions under which the couplings of these perceptrons can be chosen such that prescribed correlations between the outputs occur. A general formalism is introduced using a multi-perceptron cost function that allows one to determine the maximal number of random inputs as a function of the desired values of the correlations. Replica-symmetric results for K=2 and K=3 are compared with properties of two-layer networks with tree structure and a fixed Boolean function between hidden units and output. The results show which correlations in the hidden layer of multi-layer neural networks are crucial for the value of the storage capacity. Comment: 16 pages, Latex2
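
    One simple way to make the question above concrete is to prescribe, for each random pattern, a pair of correlated target outputs and train K=2 perceptrons with the classical perceptron rule; the fraction of patterns that can be stored then depends on the target correlation c. The sketch below does exactly that. The sizes, the shared-input assumption and the learning schedule are illustrative choices, not the paper's formalism.

        # Illustrative sketch (not the replica formalism of the paper): store P random
        # patterns in K=2 perceptrons whose prescribed +-1 outputs have correlation c,
        # using the classical perceptron rule, and report the fraction actually stored.
        import numpy as np

        rng = np.random.default_rng(1)
        N, P, c = 200, 300, 0.5                     # input size, patterns, target output correlation

        # Draw correlated +-1 output pairs with <sigma_1 sigma_2> = c.
        sigma1 = rng.choice([-1, 1], size=P)
        sigma2 = np.where(rng.random(P) < (1 + c) / 2, sigma1, -sigma1)
        targets = np.stack([sigma1, sigma2], axis=1).astype(float)

        xi = rng.choice([-1.0, 1.0], size=(P, N))   # random inputs, here shared by both perceptrons
        w = np.zeros((2, N))

        for sweep in range(500):                    # perceptron learning, unit by unit
            errors = 0
            for mu in range(P):
                for k in range(2):
                    if targets[mu, k] * (w[k] @ xi[mu]) <= 0:
                        w[k] += targets[mu, k] * xi[mu] / N
                        errors += 1
            if errors == 0:
                break

        stored = np.all(targets * (xi @ w.T) > 0, axis=1).mean()
        print(f"fraction of patterns stored with both prescribed outputs: {stored:.3f}")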

    Statistical Mechanical Development of a Sparse Bayesian Classifier

    The demand for extracting rules from high-dimensional real-world data is increasing in various fields. However, the possible redundancy of such data sometimes makes it difficult to obtain good generalization ability for novel samples. To resolve this problem, we provide a scheme that reduces the effective dimensionality of the data by pruning redundant components for bicategorical classification, based on the Bayesian framework. First, the potential of the proposed method is confirmed in ideal situations using the replica method. Unfortunately, performing the scheme exactly is computationally difficult. We therefore develop a tractable approximation algorithm, which turns out to offer nearly optimal performance in ideal cases when the system size is large. Finally, the efficacy of the developed classifier is examined experimentally for a real-world problem of colon cancer classification, which shows that the developed method can be practically useful. Comment: 13 pages, 6 figures
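
    The paper's Bayesian scheme and its approximation algorithm are not reproduced here; the sketch below only illustrates the underlying idea of pruning redundant input components for binary classification, using plain logistic regression and magnitude-based pruning against a validation set. The synthetic data, pruning schedule and all constants are assumptions made for the example.

        # Minimal sketch of dimension pruning for binary classification (illustrative only;
        # the paper's Bayesian scheme and its approximation algorithm are not reproduced).
        import numpy as np

        def fit_logistic(X, y, lr=0.1, steps=500):
            """Plain gradient-ascent logistic regression; labels y in {-1, +1}."""
            w = np.zeros(X.shape[1])
            for _ in range(steps):
                margins = np.clip(y * (X @ w), -50, 50)
                w += lr * X.T @ (y / (1 + np.exp(margins))) / len(y)
            return w

        rng = np.random.default_rng(2)
        dim, relevant, n_train, n_val = 100, 10, 200, 200
        w_true = np.concatenate([rng.standard_normal(relevant), np.zeros(dim - relevant)])
        X = rng.standard_normal((n_train + n_val, dim))
        y = np.sign(X @ w_true + 0.1 * rng.standard_normal(len(X)))
        Xtr, ytr, Xva, yva = X[:n_train], y[:n_train], X[n_train:], y[n_train:]

        active = np.arange(dim)                     # indices of the components kept so far
        best_acc, best_active = 0.0, active
        while len(active) > 1:
            w = fit_logistic(Xtr[:, active], ytr)
            acc = np.mean(np.sign(Xva[:, active] @ w) == yva)
            if acc >= best_acc:
                best_acc, best_active = acc, active.copy()
            drop = max(1, len(active) // 10)        # prune the components with the smallest |w|
            active = active[np.sort(np.argsort(np.abs(w))[drop:])]

        print(f"best validation accuracy {best_acc:.3f} using {len(best_active)} of {dim} components")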

    Training a perceptron in a discrete weight space

    On-line and batch learning of a perceptron in a discrete weight space, where each weight can take $2L+1$ different values, are examined analytically and numerically. The learning algorithm is based on training the continuous perceptron and predicting with the clipped weights. The learning is described by a new set of order parameters, composed of the overlaps between the teacher and the continuous/clipped students. Different scenarios are examined, among them on-line learning with discrete/continuous transfer functions and off-line Hebb learning. The generalization error of the clipped weights decays asymptotically as $\exp(-K \alpha^2)$ / $\exp(-e^{|\lambda| \alpha})$ in the case of on-line learning with binary/continuous activation functions, respectively, where $\alpha$ is the number of examples divided by $N$, the size of the input vector, and $K$ is a positive constant that decays linearly with $1/L$. For finite $N$ and $L$, perfect agreement between the discrete student and the teacher is obtained for $\alpha \propto \sqrt{L \ln(NL)}$. A crossover to a generalization error $\propto 1/\alpha$, characteristic of continuous weights with binary output, is obtained for synaptic depth $L > O(\sqrt{N})$. Comment: 10 pages, 5 figs., submitted to PR
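
    A minimal sketch of the clipping idea described above, under illustrative assumptions: a teacher perceptron with integer weights in {-L, ..., L}, on-line perceptron-style learning of a continuous student, and prediction with the student's weights clipped back onto the 2L+1 allowed values; the generalization error of the clipped student is then estimated from its overlap with the teacher. The update rule, the clipping map and all constants are example choices, not the paper's exact algorithm.

        # Illustrative sketch: continuous on-line learning with prediction by clipped
        # weights (teacher weights in {-L,...,L}; constants and update rule are examples).
        import numpy as np

        rng = np.random.default_rng(3)
        N, L, P = 500, 2, 20000                      # input size, synaptic depth, examples

        teacher = rng.integers(-L, L + 1, size=N).astype(float)
        w = np.zeros(N)                              # continuous student weights

        def clip_weights(w, L):
            """Map continuous weights onto the 2L+1 allowed integer values."""
            scale = L / (np.max(np.abs(w)) + 1e-12)
            return np.clip(np.round(w * scale), -L, L)

        for t in range(1, P + 1):
            x = rng.choice([-1.0, 1.0], size=N)
            label = np.sign(teacher @ x)
            if np.sign(w @ x) != label:              # perceptron-style on-line update
                w += label * x / np.sqrt(N)
            if t % 5000 == 0:
                wc = clip_weights(w, L)
                R = teacher @ wc / (np.linalg.norm(teacher) * np.linalg.norm(wc) + 1e-12)
                eps = np.arccos(np.clip(R, -1.0, 1.0)) / np.pi   # standard overlap-based error estimate
                print(f"alpha = {t / N:5.1f}   clipped-student generalization error ~ {eps:.3f}")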

    Replica theory for learning curves for Gaussian processes on random graphs

    Statistical physics approaches can be used to derive accurate predictions for the performance of inference methods learning from potentially noisy data, as quantified by the learning curve, defined as the average error versus number of training examples. We analyse a challenging problem in the area of non-parametric inference where an effectively infinite number of parameters has to be learned, specifically Gaussian process regression. When the inputs are vertices of a random graph and the outputs are noisy function values, we show that replica techniques can be used to obtain exact performance predictions in the limit of large graphs. The covariance of the Gaussian process prior is defined by a random walk kernel, the discrete analogue of squared exponential kernels on continuous spaces. Conventionally this kernel is normalised only globally, so that the prior variance can differ between vertices; as a more principled alternative we consider local normalisation, where the prior variance is uniform.
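
    A small sketch of the two normalisations contrasted above, on a sparse random graph: build a p-step random walk kernel from the normalised graph Laplacian, then either rescale it by a single global constant (so the prior variance still varies across vertices) or by each vertex's own prior variance (so it becomes uniform). The graph model, the kernel parameters and the use of networkx are illustrative assumptions, not the paper's replica analysis.

        # Illustrative sketch of global vs. local normalisation of a random walk kernel
        # on a sparse random graph (parameters and graph model are example choices).
        import numpy as np
        import networkx as nx

        n, c, p, a = 500, 3.0, 10, 2.0        # vertices, mean degree, kernel steps, smoothness parameter
        G = nx.erdos_renyi_graph(n, c / n, seed=0)
        G = G.subgraph(max(nx.connected_components(G), key=len)).copy()   # keep the giant component only
        A = nx.to_numpy_array(G)
        d_isqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))

        # p-step random walk kernel K = ((1 - 1/a) I + (1/a) D^{-1/2} A D^{-1/2})^p,
        # a discrete analogue of the squared exponential kernel (a >= 2 keeps it PSD).
        M = (1 - 1 / a) * np.eye(len(A)) + (1 / a) * (d_isqrt @ A @ d_isqrt)
        K = np.linalg.matrix_power(M, p)

        # Global normalisation: one overall scale, so the prior variances K_ii still differ.
        K_global = K / K.diagonal().mean()
        # Local normalisation: rescale by each vertex's own prior variance, making K_ii uniform.
        s = 1.0 / np.sqrt(K.diagonal())
        K_local = K * np.outer(s, s)

        print("global normalisation: prior variance from %.3f to %.3f"
              % (K_global.diagonal().min(), K_global.diagonal().max()))
        print("local  normalisation: prior variance from %.3f to %.3f"
              % (K_local.diagonal().min(), K_local.diagonal().max()))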