    Cascading Randomized Weighted Majority: A New Online Ensemble Learning Algorithm

    With the increasing volume of data in the world, the best approach for learning from this data is to exploit an online learning algorithm. Online ensemble methods are online algorithms that take advantage of an ensemble of classifiers to predict the labels of data. Prediction with expert advice is a well-studied problem in the online ensemble learning literature. The Weighted Majority algorithm and the randomized weighted majority (RWM) algorithm are the best-known solutions to this problem, aiming to converge to the best expert. Because the best expert among a set of experts does not necessarily have the minimum error in all regions of the data space, defining specific regions and converging to the best expert in each of these regions will lead to a better result. In this paper, we aim to resolve this defect of RWM algorithms by proposing a novel online ensemble algorithm for the problem of prediction with expert advice. We propose a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets. Comment: 15 pages, 3 figures
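
For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of the standard randomized weighted majority scheme, not the cascading variant proposed in the paper; the function name, the penalty factor `beta`, and the `seed` argument are assumptions made for this example.

```python
import random


def randomized_weighted_majority(expert_predictions, labels, beta=0.5, seed=0):
    """Standard RWM sketch (hypothetical helper, not the paper's cascading algorithm).

    expert_predictions[t][i] is the prediction of expert i at round t.
    labels[t] is the true label revealed after round t.
    beta in (0, 1) is the multiplicative penalty applied to wrong experts.
    Returns the final expert weights and the learner's mistake count.
    """
    rng = random.Random(seed)
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, label in zip(expert_predictions, labels):
        # Follow one expert, drawn with probability proportional to its weight.
        chosen = rng.choices(range(n_experts), weights=weights, k=1)[0]
        if preds[chosen] != label:
            mistakes += 1
        # Once the label is revealed, shrink the weight of every wrong expert.
        weights = [w * beta if p != label else w for w, p in zip(weights, preds)]
    return weights, mistakes
```

Because a single weight vector is kept over the whole data space, the scheme concentrates on the expert that is best overall, which is the limitation the abstract says the cascading version addresses.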

    Evaluating Data Assimilation Algorithms

    Data assimilation leads naturally to a Bayesian formulation in which the posterior probability distribution of the system state, given the observations, plays a central conceptual role. The aim of this paper is to use this Bayesian posterior probability distribution as a gold standard against which to evaluate various commonly used data assimilation algorithms. A key aspect of geophysical data assimilation is the high dimensionality and low predictability of the computational model. With this in mind, yet with the goal of allowing an explicit and accurate computation of the posterior distribution, we study the 2D Navier-Stokes equations in a periodic geometry. We compute the posterior probability distribution by state-of-the-art statistical sampling techniques. The commonly used algorithms that we evaluate against this accurate gold standard, as quantified by comparing the relative error in reproducing its moments, are 4DVAR and a variety of sequential filtering approximations based on 3DVAR and on extended and ensemble Kalman filters. The primary conclusions are that: (i) with appropriate parameter choices, approximate filters can perform well in reproducing the mean of the desired probability distribution; (ii) however, they typically perform poorly when attempting to reproduce the covariance; (iii) this poor performance is compounded by the need to modify the covariance in order to induce stability. Thus, whilst filters can be a useful tool in predicting mean behavior, they should be viewed with caution as predictors of uncertainty. These conclusions are intrinsic to the algorithms and will not change if the model complexity is increased, for example by employing a smaller viscosity or by using a detailed NWP model.

    Entropy-based approach to missing-links prediction

    Link prediction is an active research field within network theory, aiming at uncovering missing connections or predicting the emergence of future relationships from the observed network structure. This paper represents our contribution to the stream of research concerning missing-links prediction. Here, we propose an entropy-based method to predict a given percentage of missing links, by identifying them with the most probable non-observed ones. The probability coefficients are computed by solving suitably defined null models over the accessible network structure. Upon comparing our likelihood-based, local method with the most popular algorithms over a set of economic, financial and food networks, we find ours to perform best, as indicated by a number of statistical indicators (e.g. the precision, the area under the ROC curve, etc.). Moreover, the entropy-based formalism adopted in the present paper allows us to straightforwardly extend the link-prediction exercise to directed networks as well, thus overcoming one of the main limitations of current algorithms. The higher accuracy achievable by employing these methods - together with their greater flexibility - makes them strong competitors of available link-prediction algorithms.

    Modeling the Flow of Yield-Stress Fluids in Porous Media

    Yield-stress is a problematic and controversial non-Newtonian flow phenomenon. In this article, we investigate the flow of yield-stress substances through porous media within the framework of pore-scale network modeling. We also investigate the validity of the Minimum Threshold Path (MTP) algorithms to predict the pressure yield point of a network depicting random or regular porous media. Percolation theory as a basis for predicting the yield point of a network is briefly presented and assessed. In the course of this study, a yield-stress flow simulation model and several numerical algorithms related to yield-stress in porous media were developed, implemented and assessed. The general conclusion is that modeling the flow of yield-stress fluids in porous media is too difficult and problematic. More fundamental modeling strategies are required to tackle this problem in the future. Comment: 27 pages and 5 figures