4,954 research outputs found

    Inference of the Kinetic Ising Model with Heterogeneous Missing Data

    We consider the problem of inferring a causality structure from multiple binary time series by using the Kinetic Ising Model in datasets where a fraction of observations is missing. We build on a recent work on Mean Field methods for the inference of the model with hidden spins and develop a pseudo-Expectation-Maximization algorithm that works even in conditions of severe data sparsity. The methodology relies on the Martin-Siggia-Rose path integral method with a second-order saddle-point solution, which makes it possible to calculate the log-likelihood in polynomial time and gives as output a maximum likelihood estimate of the couplings matrix and of the missing observations. We also propose a recursive version of the algorithm, where at every iteration some missing values are substituted by their maximum likelihood estimate, showing that the method can be used together with sparsification schemes like LASSO regularization or decimation. We test the performance of the algorithm on synthetic data and find interesting properties in its dependence on the heterogeneity of the observation frequency of spins, and in its behavior when some of the hypotheses necessary for the saddle-point approximation are violated, such as the small couplings limit and the assumption of statistical independence between couplings.

    Shaping the learning landscape in neural networks around wide flat minima

    Learning in Deep Neural Networks (DNN) takes place by minimizing a non-convex high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to find good minimizers without getting stuck in local critical points, and such minimizers are often satisfactory at avoiding overfitting. How these two features can be kept under control in nonlinear devices composed of millions of tunable connections is a profound and far-reaching open question. In this paper we study basic non-convex one- and two-layer neural network models which learn random patterns, and derive a number of basic geometrical and algorithmic features which suggest some answers. We first show that the error loss function presents a few extremely wide flat minima (WFM) which coexist with narrower minima and critical points. We then show that the minimizers of the cross-entropy loss function overlap with the WFM of the error loss. We also show examples of learning devices for which WFM do not exist. From the algorithmic perspective we derive entropy-driven greedy and message-passing algorithms which focus their search on wide flat regions of minimizers. In the case of SGD and cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM. We corroborate the results by a numerical study of the correlations between the volumes of the minimizers, their Hessian and their generalization performance on real data. Comment: 37 pages (16 main text), 10 figures (7 main text)

    Contextual impacts on industrial processes brought by the digital transformation of manufacturing: a systematic review

    The digital transformation of manufacturing (a phenomenon also known as "Industry 4.0" or "Smart Manufacturing") is attracting growing interest at both practitioner and academic levels, but is still in its infancy and needs deeper investigation. Even though the current and potential advantages of digital manufacturing are remarkable in terms of improved efficiency, sustainability, customization, and flexibility, only a limited number of companies have already developed the ad hoc strategies necessary to achieve superior performance. Through a systematic review, this study aims to assess the current state of the art of the academic literature on the paradigm shift occurring in manufacturing settings, in order to provide definitions and to point out recurring patterns and gaps to be addressed by future research. For the literature search, the most representative keywords, strict criteria, and classification schemes based on authoritative reference studies were used. The final sample of 156 primary publications was analyzed through a systematic coding process to identify theoretical and methodological approaches, together with other significant elements. This analysis allowed a mapping of the literature based on clusters of critical themes, to synthesize the developments of different research streams and provide the most representative picture of its current state. Research areas, insights, and gaps resulting from this analysis contributed to a schematic research agenda, which clearly indicates the space for future evolution of the state of knowledge in this field.