8 research outputs found

    Distributed Learning for Multiple Source Data

    Distributed learning is the problem of inferring a function when the data to be analyzed are distributed across a network of agents. Different application domains impose different constraints on the solution, including low computational power at each location, limited underlying connectivity (e.g., no broadcasting capability), or transfer constraints related to enormous bandwidth requirements. It is therefore no longer possible to send the data to a central node, where learning algorithms are traditionally run, and new techniques able to model and exploit big data locally are needed. Motivated by these observations, this thesis proposes new techniques that efficiently avoid a fully centralized implementation, require no coordinating node, and use only in-network communication. The focus is on both supervised and unsupervised distributed learning procedures which, so far, have been addressed only in very specific settings. For instance, some of them are not actually distributed because they merely split the computation between different subsystems; others require a fusion center collecting data from all the agents at each iteration; still others are implementable only on specific network topologies such as fully connected graphs. In the first part of this thesis, these limits are overcome by using spectral clustering, ensemble clustering, or density-based approaches to realize a purely distributed architecture in which there is no hierarchy and all agents are peers. Each agent learns only from its own dataset, while information about the others is obtained in a decentralized way through a process of communication and collaboration among the agents. Experimental results, together with theoretical convergence properties, prove the effectiveness of these proposals.
In the second part of the thesis, the proposed contributions are tested in several real-world distributed applications. Telemedicine and e-health prove to be among the most prolific application areas in this respect. The mapping of learning algorithms onto low-power hardware resources is also found to be an interesting application area in the context of distributed wireless networks. Finally, a study on the generation and control of renewable energy sources is presented. Overall, the algorithms presented throughout the thesis cover a wide range of practical applications and trace the path to many future extensions, whether as scientific research or as technology-transfer results.
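The fully decentralized setting described above, in which peer agents exchange information only with their neighbors and without a fusion center, is commonly built on a consensus (in-network averaging) primitive. The following is a minimal sketch of that primitive, not the thesis's specific algorithms; the Metropolis-Hastings mixing weights and ring topology are illustrative assumptions.

```python
import numpy as np

def consensus_average(local_values, adjacency, steps=200):
    """Decentralized averaging: each agent repeatedly mixes its estimate
    with those of its neighbors; no fusion center is ever contacted."""
    n = len(local_values)
    degrees = adjacency.sum(axis=1)
    # Metropolis-Hastings weights give a doubly stochastic mixing matrix,
    # which guarantees convergence to the network-wide average.
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adjacency[i, j]:
                W[i, j] = 1.0 / (1 + max(degrees[i], degrees[j]))
        W[i, i] = 1.0 - W[i].sum()
    x = np.array(local_values, dtype=float)
    for _ in range(steps):
        x = W @ x  # one round of in-network communication
    return x

# Ring of 4 agents, each holding one local statistic
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
print(consensus_average([1.0, 2.0, 3.0, 6.0], A))  # every agent converges to 3.0
```

In a distributed clustering procedure, each agent would run such rounds to agree on shared quantities (e.g., cluster statistics) computed from data it never sees directly.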

    Scalable, probabilistic simulation in a distributed design environment

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2008. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 110-114). Integrated simulations have been used to predict and analyze the integrated behavior of large, complex product and technology systems throughout their design cycles. During the process of integration, uncertainties arise from many sources, such as material properties, manufacturing variations, and model inaccuracy. Concerns about uncertainty and robustness in large-scale integrated design can be significant, especially when system performance is sensitive to these variations. Probabilistic simulation can be an important tool for enabling uncertainty analysis, sensitivity analysis, risk assessment, and reliability-based design in integrated simulation environments. Monte Carlo methods have been widely used to solve probabilistic simulation problems. To achieve the desired estimation accuracy, a large number of samples is typically needed. However, large integrated simulation systems are often computationally heavy and time-consuming due to their complexity and scale, making the conventional Monte Carlo approach computationally prohibitive. This work focuses on developing an efficient and scalable approach for probabilistic simulation in integrated simulation environments. A predictive machine learning and statistical approach is proposed in this thesis. Using random sampling of the system input distributions and running the integrated simulation for each input state, a random sample of limited size can be obtained for each system output. Based on this limited output sample, a multilayer, feed-forward neural network is constructed as an estimator for the underlying cumulative distribution function.
A mathematical model for the cumulative probability distribution function is then derived and used to estimate the underlying probability density function by differentiation. Statistically processing the sample used by the neural network is important in order to provide a good training set to the neural network estimator. Combining the statistical information from the empirical output distribution with a kernel estimate yields a training set containing as much information about the underlying distribution as possible. A back-propagation algorithm with adaptive learning rates is implemented to train the neural network estimator. To incorporate into the learning process the hint that a cumulative probability distribution function must be monotonic, a novel hint-reinforced back-propagation approach is created. The neural network estimator trained on empirical and kernel information (the NN-EK estimator) can then be obtained. To further improve the estimation, the statistical method of bootstrap aggregating (bagging) is used: multiple versions of the estimator are generated using bootstrap resampling and aggregated to improve the estimator. A prototype implementation of the proposed approach is developed, and test results on different models show its advantage over the conventional Monte Carlo approach, reducing computation time tens-fold for the same level of estimation accuracy. by Wei Mao. Ph.D.
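The statistical preprocessing step described above, combining the empirical output distribution with a kernel estimate to form CDF training targets, can be sketched as follows. This is an illustrative reconstruction, not the thesis's implementation: the equal-weight blend of the two estimates and Silverman's rule-of-thumb bandwidth are assumptions.

```python
import numpy as np
from math import erf

def cdf_training_set(sample, grid_size=50):
    """Build (x, F(x)) training pairs for a CDF estimator by blending
    the empirical CDF with a Gaussian-kernel CDF estimate."""
    sample = np.sort(np.asarray(sample, dtype=float))
    n = sample.size
    # Silverman's rule-of-thumb bandwidth for the kernel estimate
    h = 1.06 * sample.std(ddof=1) * n ** (-1 / 5)
    xs = np.linspace(sample[0], sample[-1], grid_size)
    # Empirical CDF: fraction of sample points at or below each grid point
    emp = np.searchsorted(sample, xs, side="right") / n
    # Kernel CDF: average of Gaussian CDFs centred at each sample point
    z = (xs[:, None] - sample[None, :]) / h
    ker = 0.5 * (1 + np.vectorize(erf)(z / np.sqrt(2))).mean(axis=1)
    return xs, 0.5 * (emp + ker)  # equal-weight blend (an assumption)

rng = np.random.default_rng(0)
xs, targets = cdf_training_set(rng.normal(size=200))
assert np.all(np.diff(targets) >= 0)  # targets are monotone non-decreasing
```

Because both ingredients are themselves valid CDF estimates, the blended targets remain monotone, which is exactly the hint the hint-reinforced back-propagation step enforces in the trained network.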

    Systems Analytics and Integration of Big Omics Data

    A “genotype” is essentially an organism’s full hereditary information, which is obtained from its parents. A “phenotype” is an organism’s actual observed physical and behavioral properties. These may include traits such as morphology, size, height, eye color, and metabolism. One of the pressing challenges in computational and systems biology is genotype-to-phenotype prediction. This is challenging given the amount of data generated by modern Omics technologies. This “Big Data” is so large and complex that traditional data processing applications are not up to the task. Challenges arise in the collection, analysis, mining, sharing, transfer, visualization, archiving, and integration of these data. This Special Issue focuses on the systems-level analysis of Omics data, recent developments in gene ontology annotation, and advances in biological pathways and network biology. The integration of Omics data with clinical and biomedical data using machine learning is explored. This Special Issue covers new methodologies in the context of gene–environment interactions, tissue-specific gene expression, and how external factors or host genetics impact the microbiome.

    Fractional Calculus and the Future of Science

    Newton foresaw the limitations of geometry’s description of planetary behavior and developed fluxions (differentials) as the new language for celestial mechanics and as the way to implement his laws of mechanics. Two hundred years later, Mandelbrot introduced the notion of fractals into the scientific lexicon of geometry, dynamics, and statistics, and in so doing suggested ways to see beyond the limitations of Newton’s laws. Mandelbrot’s mathematical essays suggest how fractals may lead to an understanding of turbulence and viscoelasticity, and ultimately to the end of the dominance of Newton’s macroscopic world view. Fractional Calculus and the Future of Science examines the nexus of these two game-changing contributions to our scientific understanding of the world. It addresses how non-integer differential equations replace Newton’s laws to describe the many guises of complexity, most of which lay beyond Newton’s experience, and many of which had even eluded Mandelbrot’s powerful intuition. The book’s authors look behind the mathematics and examine what must be true about a phenomenon’s behavior to justify replacing an integer-order with a noninteger-order (fractional) derivative. This window into the future of specific scientific disciplines, viewed through the lens of the fractional calculus, suggests how what is seen entails a shift in scientific thinking and understanding.
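For concreteness, one standard form of the non-integer derivative discussed in the book is the Riemann–Liouville fractional derivative (stated here only as an illustration; the book treats several such operators):

```latex
% Riemann-Liouville fractional derivative of order \alpha, with n-1 < \alpha < n
D^{\alpha} f(t) = \frac{1}{\Gamma(n-\alpha)} \, \frac{d^{n}}{dt^{n}}
    \int_{0}^{t} \frac{f(\tau)}{(t-\tau)^{\alpha - n + 1}} \, d\tau
```

For integer \(\alpha = n\) this reduces to the ordinary \(n\)-th derivative, while for non-integer \(\alpha\) the integral makes the operator non-local: the derivative at time \(t\) depends on the entire history of \(f\), which is the mathematical signature of the complex, memory-bearing phenomena the book describes.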

    Predicting conditional probability densities with the Gaussian mixture - RVFL network

    The incorporation of the Random Vector Functional Link (RVFL) concept into mixture models for predicting conditional probability densities achieves a considerable speed-up of the training process. This allows the creation of a large ensemble of predictors, which results in an improvement in generalization performance.
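The speed-up comes from the core RVFL idea: the hidden-layer weights are drawn at random and never trained, so only the linear output weights need to be solved, in closed form. The following is a minimal regression-only sketch of that idea, not the paper's Gaussian mixture model; the network sizes and the ridge parameter are illustrative assumptions.

```python
import numpy as np

def rvfl_fit_predict(X, y, X_test, hidden=50, ridge=1e-3, seed=0):
    """RVFL sketch: random, untrained hidden weights plus direct
    input-output links; output weights solved by ridge least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))  # fixed random hidden weights
    b = rng.normal(size=hidden)

    def features(A):
        H = np.tanh(A @ W + b)       # random nonlinear expansion
        return np.hstack([A, H])     # direct links + hidden units

    F = features(X)
    # Closed-form ridge solution -- this is the entire "training" step
    beta = np.linalg.solve(F.T @ F + ridge * np.eye(F.shape[1]), F.T @ y)
    return features(X_test) @ beta

X = np.linspace(-1, 1, 100)[:, None]
y = np.sin(3 * X[:, 0])
pred = rvfl_fit_predict(X, y, X)
print(np.abs(pred - y).max())  # small in-sample error
```

Because each such predictor trains in a single linear solve, building a large ensemble of them (each with different random hidden weights) is cheap, which is the property the abstract exploits.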

    Advanced Operation and Maintenance in Solar Plants, Wind Farms and Microgrids

    This reprint presents advances in the operation and maintenance of solar plants, wind farms, and microgrids. This compendium of scientific articles helps clarify the current state of the art in the subject, and it is hoped that it will be of interest to the reader.