The limits of human predictions of recidivism.
Dressel and Farid recently found that laypeople were as accurate as statistical algorithms in predicting whether a defendant would reoffend, casting doubt on the value of risk assessment tools in the criminal justice system. We report the results of a replication and extension of Dressel and Farid's experiment. Under conditions similar to the original study, we found nearly identical results, with humans and algorithms performing comparably. However, algorithms beat humans in the three other datasets we examined. The performance gap between humans and algorithms was particularly pronounced when, in a departure from the original study, participants were not provided with immediate feedback on the accuracy of their responses. Algorithms also outperformed humans when the information provided for predictions included an enriched (versus restricted) set of risk factors. These results suggest that algorithms can outperform human predictions of recidivism in ecologically valid settings.
Cascading Randomized Weighted Majority: A New Online Ensemble Learning Algorithm
With the increasing volume of data in the world, the best approach for
learning from this data is to exploit an online learning algorithm. Online
ensemble methods are online algorithms which take advantage of an ensemble of
classifiers to predict labels of data. Prediction with expert advice is a
well-studied problem in the online ensemble learning literature. The Weighted
Majority algorithm and the randomized weighted majority (RWM) are the most
well-known solutions to this problem, aiming to converge to the best expert.
Since, among a set of experts, the best one does not necessarily have the minimum
error in all regions of data space, defining specific regions and converging to
the best expert in each of these regions will lead to a better result. In this
paper, we aim to resolve this defect of RWM algorithms by proposing a novel
online ensemble algorithm to the problem of prediction with expert advice. We
propose a cascading version of RWM to achieve not only better experimental
results but also a better error bound for sufficiently large datasets. Comment: 15 pages, 3 figures
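As a point of reference for the abstract above, here is a minimal sketch of the standard randomized weighted majority (RWM) baseline, not the cascading variant the paper proposes. The function name, the penalty factor `beta`, and the `seed` parameter are illustrative assumptions, not taken from the paper.

```python
import random


def randomized_weighted_majority(expert_predictions, labels, beta=0.5, seed=0):
    """Run plain RWM over a sequence of rounds.

    expert_predictions: list of rounds; each round is a list holding one
        prediction per expert.
    labels: the true label revealed after each round.
    beta: multiplicative penalty in (0, 1) applied to wrong experts
        (an assumed default, chosen for illustration).
    Returns the final expert weights and the learner's mistake count.
    """
    rng = random.Random(seed)
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    mistakes = 0
    for preds, y in zip(expert_predictions, labels):
        # Follow one expert, sampled with probability proportional to its weight.
        chosen = rng.choices(range(n_experts), weights=weights, k=1)[0]
        if preds[chosen] != y:
            mistakes += 1
        # Shrink the weight of every expert that was wrong this round.
        weights = [w * beta if p != y else w for w, p in zip(weights, preds)]
    return weights, mistakes
```

With one expert that is always right and one that is always wrong, the wrong expert's weight decays geometrically, so the learner follows the good expert with increasing probability. The weights here are global; the abstract's point is that a single global weighting cannot track experts that are best only in some regions of the data space.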
Evaluating Data Assimilation Algorithms
Data assimilation leads naturally to a Bayesian formulation in which the
posterior probability distribution of the system state, given the observations,
plays a central conceptual role. The aim of this paper is to use this Bayesian
posterior probability distribution as a gold standard against which to evaluate
various commonly used data assimilation algorithms.
A key aspect of geophysical data assimilation is the high dimensionality and
low predictability of the computational model. With this in mind, yet with the
goal of allowing an explicit and accurate computation of the posterior
distribution, we study the 2D Navier-Stokes equations in a periodic geometry.
We compute the posterior probability distribution by state-of-the-art
statistical sampling techniques. The commonly used algorithms that we evaluate
against this accurate gold standard, as quantified by comparing the relative
error in reproducing its moments, are 4DVAR and a variety of sequential
filtering approximations based on 3DVAR and on extended and ensemble Kalman
filters.
The primary conclusions are that: (i) with appropriate parameter choices,
approximate filters can perform well in reproducing the mean of the desired
probability distribution; (ii) however they typically perform poorly when
attempting to reproduce the covariance; (iii) this poor performance is
compounded by the need to modify the covariance, in order to induce stability.
Thus, whilst filters can be a useful tool in predicting mean behavior, they
should be viewed with caution as predictors of uncertainty. These conclusions
are intrinsic to the algorithms and will not change if the model complexity is
increased, for example by employing a smaller viscosity, or by using a detailed
numerical weather prediction (NWP) model.
Entropy-based approach to missing-links prediction
Link-prediction is an active research field within network theory, aiming at
uncovering missing connections or predicting the emergence of future
relationships from the observed network structure. This paper represents our
contribution to the stream of research concerning missing links prediction.
Here, we propose an entropy-based method to predict a given percentage of
missing links, by identifying them with the most probable non-observed ones.
The probability coefficients are computed by solving suitably defined
null-models over the accessible network structure. Upon comparing our
likelihood-based, local method with the most popular algorithms over a set of
economic, financial and food networks, we find ours to perform best, as shown
by several statistical indicators (e.g. the precision and the area under
the ROC curve). Moreover, the entropy-based formalism adopted in the
present paper allows us to straightforwardly extend the link-prediction
exercise to directed networks as well, thus overcoming one of the main
limitations of current algorithms. The higher accuracy achievable by employing
these methods, together with their greater flexibility, makes them strong
competitors to available link-prediction algorithms.
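The selection step described in the abstract above (identifying missing links with the most probable non-observed ones) can be sketched as follows. This is only an illustration of the ranking step: the `link_prob` callable is a hypothetical stand-in for the probability coefficients that the paper obtains from its null models, and the function name and signature are assumptions.

```python
def predict_missing_links(n_nodes, observed_edges, link_prob, k):
    """Return the k most probable node pairs that are not observed links.

    n_nodes: number of nodes, labelled 0 .. n_nodes - 1 (undirected network).
    observed_edges: iterable of (i, j) pairs already present in the network.
    link_prob: hypothetical callable (i, j) -> probability for i < j; in the
        paper these values would come from a fitted null model.
    k: how many candidate links to return.
    """
    observed = {frozenset(edge) for edge in observed_edges}
    candidates = []
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if frozenset((i, j)) not in observed:
                candidates.append((link_prob(i, j), (i, j)))
    # Highest probability first; ties are broken by node pair for determinism.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [pair for _, pair in candidates[:k]]
```

In practice `k` would be set from the percentage of links assumed to be missing, and the returned pairs would be scored against held-out links with indicators such as precision.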
Modeling the Flow of Yield-Stress Fluids in Porous Media
Yield-stress is a problematic and controversial non-Newtonian flow
phenomenon. In this article, we investigate the flow of yield-stress substances
through porous media within the framework of pore-scale network modeling. We
also investigate the validity of the Minimum Threshold Path (MTP) algorithms to
predict the pressure yield point of a network depicting random or regular
porous media. Percolation theory as a basis for predicting the yield point of a
network is briefly presented and assessed. In the course of this study, a
yield-stress flow simulation model alongside several numerical algorithms
related to yield-stress in porous media were developed, implemented and
assessed. The general conclusion is that modeling the flow of yield-stress
fluids in porous media is too difficult and problematic. More fundamental
modeling strategies are required to tackle this problem in the future. Comment: 27 pages and 5 figures