    Universality of Bayesian mixture predictors

    The problem is that of sequential probability forecasting for finite-valued time series. The data is generated by an unknown probability distribution over the space of all one-way infinite sequences. It is known that this measure belongs to a given set C, but the latter is completely arbitrary (uncountably infinite, without any structure given). The performance is measured with asymptotic average log loss. In this work it is shown that the minimax asymptotic performance is always attainable, and it is attained by a convex combination of countably many measures from the set C (a Bayesian mixture). This was previously only known for the case when the best achievable asymptotic error is 0. This also contrasts with previous results showing that in the non-realizable case all Bayesian mixtures may be suboptimal, while there is a predictor that achieves the optimal performance.
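
To make the idea of a Bayesian mixture predictor concrete, here is a minimal sketch. It is not the paper's construction: it assumes a binary alphabet and a small finite family of i.i.d. Bernoulli measures standing in for a countable subset of C. The function names, the parameters `prior` and `thetas`, and the example sequence are all hypothetical and chosen only for illustration.

```python
import math

def mixture_log_loss(prior, thetas, sequence):
    """Cumulative log loss (in nats) of a Bayesian mixture on a binary sequence.

    prior:  prior weights over the components, summing to 1 (assumed).
    thetas: P(symbol = 1) under each i.i.d. Bernoulli component (assumed,
            a finite stand-in for a countable subset of the set C).
    """
    posterior = list(prior)
    total = 0.0
    for x in sequence:
        # Probability each component assigns to the observed symbol.
        likes = [t if x == 1 else 1.0 - t for t in thetas]
        # The mixture forecast is the posterior-weighted average.
        p_mix = sum(w * l for w, l in zip(posterior, likes))
        total += -math.log(p_mix)
        # Bayes update of the component weights.
        posterior = [w * l / p_mix for w, l in zip(posterior, likes)]
    return total

def component_log_loss(theta, sequence):
    """Cumulative log loss of a single Bernoulli(theta) component."""
    return sum(-math.log(theta if x == 1 else 1.0 - theta) for x in sequence)

if __name__ == "__main__":
    seq = [1, 1, 0, 1, 1, 1, 0, 1]   # illustrative data, not from the paper
    thetas = [0.2, 0.5, 0.8]
    prior = [1 / 3, 1 / 3, 1 / 3]
    n = len(seq)
    print("mixture average log loss:", mixture_log_loss(prior, thetas, seq) / n)
    for t in thetas:
        print(f"theta={t} average log loss:", component_log_loss(t, seq) / n)
```

In this finite setting the mixture's cumulative log loss exceeds that of any component with prior weight w by at most log(1/w), so the gap in average log loss vanishes as the sequence grows. The abstract's result concerns the much harder case of an arbitrary set C.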

    Reconciling modern machine learning practice and the bias-variance trade-off

    Breakthroughs in machine learning are rapidly changing science and society, yet our fundamental understanding of this technology has lagged far behind. Indeed, one of the central tenets of the field, the bias-variance trade-off, appears to be at odds with the observed behavior of methods used in the modern machine learning practice. The bias-variance trade-off implies that a model should balance under-fitting and over-fitting: rich enough to express underlying structure in data, simple enough to avoid fitting spurious patterns. However, in the modern practice, very rich models such as neural networks are trained to exactly fit (i.e., interpolate) the data. Classically, such models would be considered over-fit, and yet they often obtain high accuracy on test data. This apparent contradiction has raised questions about the mathematical foundations of machine learning and their relevance to practitioners. In this paper, we reconcile the classical understanding and the modern practice within a unified performance curve. This "double descent" curve subsumes the textbook U-shaped bias-variance trade-off curve by showing how increasing model capacity beyond the point of interpolation results in improved performance. We provide evidence for the existence and ubiquity of double descent for a wide spectrum of models and datasets, and we posit a mechanism for its emergence. This connection between the performance and the structure of machine learning models delineates the limits of classical analyses, and has implications for both the theory and practice of machine learning

    Ecological Traits Fail to Consistently Predict Moth Species Persistence in Managed Forest Stands

    Species traits have been used as predictors of species extinction and colonization probabilities in fragmented landscapes. Thus far, trait-based analytical frameworks have been less commonly employed as predictive tools for species persistence following a disturbance. I tested whether life history traits, dietary traits, and functional traits were correlated with moth species persistence probabilities in forest stands subjected to varying levels of timber harvest. Three harvest treatments were used: control stands (unharvested since 1960), shelterwood cut stands (15% canopy removed), and patch cut stands (80% standing bole removed). Logistic regression models were built to assess whether species persistence probabilities were a function of species traits; separate models were constructed for each level of timber harvest treatment. Species persistence probabilities were mainly a function of pre-harvest abundances. Species traits had idiosyncratic effects on species persistence depending on the level of timber harvest employed. These results suggest that species traits may indirectly influence how moth species assemblages change as a result of forest management by determining pre-harvest abundance rather than persistence per se. The absence of significant trait effects on persistence probabilities may also reflect prior reduction in species trait space. That is, the range of species trait combinations sampled in this study was much lower than observed in historically unlogged eastern deciduous forest systems. Thus, the lack of significant trait-persistence correlations observed here might indicate historic extinctions of species from prior logging events that have not been offset by post-harvest recovery of original species assemblages

    Probabilistic and fuzzy reasoning in simple learning classifier systems

    This paper is concerned with the general stimulus-response problem as addressed by a variety of simple learning classifier systems (CSs). We suggest a theoretical model from which the assessment of uncertainty emerges as a primary concern. A number of representation schemes borrowing from fuzzy logic theory are reviewed, and some connections with a well-known neural architecture are revisited. In pursuit of the uncertainty-measuring goal, the use of explicit probability distributions in the action part of classifiers is advocated. Some ideas supporting the design of a hybrid system incorporating Bayesian learning on top of the basic CS algorithm are sketched.