
    Multivariate Analysis from a Statistical Point of View

    Multivariate Analysis is an increasingly common tool in experimental high energy physics; however, many of the common approaches were borrowed from other fields. We clarify what the goal of a multivariate algorithm should be for the search for a new particle and compare different approaches. We also translate the Neyman-Pearson theory into the language of statistical learning theory. (Comment: Talk from PhyStat2003, Stanford, CA, USA, September 2003; 4 pages, LaTeX, 1 eps figure. PSN WEJT00)
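    For reference, the Neyman-Pearson criterion the abstract refers to can be stated in its standard textbook form (notation ours, not taken from the paper): when testing a simple background hypothesis $H_0$ against a simple signal hypothesis $H_1$, the most powerful test at false-positive rate $\alpha$ selects events for which the likelihood ratio exceeds a threshold,

        \[
        \Lambda(x) \;=\; \frac{p(x \mid H_1)}{p(x \mid H_0)} \;>\; k_\alpha,
        \qquad
        \Pr\bigl(\Lambda(X) > k_\alpha \mid H_0\bigr) \;=\; \alpha .
        \]

    Any discriminant that is a monotone function of $\Lambda(x)$ defines the same selection regions, which is the benchmark against which learned multivariate discriminants can be compared.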

    Fast rates in statistical and online learning

    The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning: a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most of these conditions are special cases of a single, unifying condition that comes in two forms: the central condition for 'proper' learning algorithms that always output a hypothesis in the given model, and stochastic mixability for online algorithms that may make predictions outside of the model. We show that under surprisingly weak assumptions both conditions are, in a certain sense, equivalent. The central condition has a re-interpretation in terms of convexity of a set of pseudoprobabilities, linking it to density estimation under misspecification. For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning. Yet, while the Bernstein condition is two-sided, the central condition is one-sided, making it more suitable for dealing with unbounded losses. In its stochastic mixability form, our condition generalizes both a stochastic exp-concavity condition identified by Juditsky, Rigollet and Tsybakov and Vovk's notion of mixability. Our unifying conditions thus provide a substantial step towards a characterization of fast rates in statistical learning, similar to how classical mixability characterizes constant regret in the sequential prediction with expert advice setting. (Comment: 69 pages, 3 figures)
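    For concreteness, the Bernstein condition mentioned above can be stated in its standard form (notation ours, not taken from the paper): for some constants $B > 0$ and $\beta \in (0, 1]$, every $f$ in the model $\mathcal{F}$ satisfies

        \[
        \mathbb{E}\bigl[(\ell_f(Z) - \ell_{f^*}(Z))^2\bigr]
        \;\le\;
        B \,\bigl(\mathbb{E}\bigl[\ell_f(Z) - \ell_{f^*}(Z)\bigr]\bigr)^{\beta},
        \]

    where $\ell_f(Z)$ is the loss of $f$ on a random example $Z$ and $f^*$ minimizes the expected loss over $\mathcal{F}$. Roughly, $\beta = 1$ yields rates of order $1/n$ for bounded losses, the limit $\beta \to 0$ recovers the usual $1/\sqrt{n}$ regime, and the Tsybakov margin condition corresponds to the classification case with 0-1 loss.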

    What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation

    One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error and a pre-specified maximum number of training updates, we investigate the convergence of the backpropagation algorithm with respect to a) the complexity of the required function approximation, b) the size of the network in relation to the size required for an optimal solution, and c) the degree of noise in the training data. In general, for a) the solution found is worse when the function to be approximated is more complex, for b) oversize networks can result in lower training and generalization error, and for c) the use of committee or ensemble techniques can be more beneficial as the amount of noise in the training data is increased. For the experiments we performed, we do not obtain the optimal solution in any case. We further support the observation that larger networks can produce better training and generalization error using a face recognition example, where a network with many more parameters than training points generalizes better than smaller networks. (Also cross-referenced as UMIACS-TR-96-22.)
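    A minimal sketch of an experiment in this spirit (not the paper's setup; the synthetic task, noise level, and network sizes are arbitrary choices, using scikit-learn):

        # Compare training and test error of MLPs of increasing hidden-layer size
        # on a noisy synthetic regression task.
        from sklearn.datasets import make_friedman1
        from sklearn.model_selection import train_test_split
        from sklearn.neural_network import MLPRegressor
        from sklearn.metrics import mean_squared_error

        X, y = make_friedman1(n_samples=300, noise=1.0, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

        for hidden in (2, 8, 32, 128):  # from undersized to heavily over-parameterized
            net = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=5000, random_state=0)
            net.fit(X_tr, y_tr)
            print(f"hidden={hidden:4d}  "
                  f"train MSE={mean_squared_error(y_tr, net.predict(X_tr)):.3f}  "
                  f"test MSE={mean_squared_error(y_te, net.predict(X_te)):.3f}")

    Comparing training and test error across the loop is the analogue of the paper's observation that oversized networks need not generalize worse.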

    Automatic solar feature detection using image processing and pattern recognition techniques

    The objective of the research in this dissertation is to develop a software system to automatically detect and characterize solar flares, filaments, and Coronal Mass Ejections (CMEs), the core of so-called solar activity. These tools will assist in predicting space weather caused by violent solar activity. Image processing and pattern recognition techniques are applied in this system. For automatic flare detection, advanced pattern recognition techniques such as the Multi-Layer Perceptron (MLP), Radial Basis Function (RBF) networks, and the Support Vector Machine (SVM) are used. By tracking the entire process of a flare, the motion properties of two-ribbon flares are derived automatically. For solar filament detection, the Stabilized Inverse Diffusion Equation (SIDE) is used to enhance and sharpen filaments; a new method for automatic threshold selection is proposed to extract filaments from the background; and an SVM classifier with nine input features is used to differentiate between sunspots and filaments. Once a filament is identified, morphological thinning, pruning, and adaptive edge linking methods are applied to determine its properties. Furthermore, a filament matching method is proposed to detect filament disappearance. The automatic detection and characterization of flares and filaments have been successfully applied to Hα full-disk images that are continuously obtained at Big Bear Solar Observatory (BBSO). For automatically detecting and classifying CMEs, image enhancement, segmentation, and pattern recognition techniques are applied to Large Angle and Spectrometric Coronagraph (LASCO) C2 and C3 images. The processed LASCO and BBSO images are saved to a file archive, and the physical properties of detected solar features, such as intensity and speed, are recorded in our database. Researchers are able to access the solar feature database and analyze the solar data efficiently and effectively. The detection and characterization system greatly improves the ability to monitor the evolution of solar events and has the potential to be used to predict space weather.
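    A minimal sketch of the sunspot-versus-filament classification step only (the feature values here are random placeholders; the dissertation's own nine features and preprocessing pipeline are not reproduced):

        # Train an SVM to separate filament candidates from sunspots, given a
        # precomputed nine-dimensional feature vector for each candidate region.
        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 9))                   # one row per candidate, nine features
        y = (X[:, 1] + 0.5 * X[:, 3] > 0).astype(int)   # 1 = filament, 0 = sunspot (toy labels)

        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
        clf.fit(X, y)
        print("training accuracy:", clf.score(X, y))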

    A Model of Inductive Bias Learning

    A major problem in machine learning is that of inductive bias: how to choose a learner's hypothesis space so that it is large enough to contain a solution to the problem being learnt, yet small enough to ensure reliable generalization from reasonably sized training sets. Typically such bias is supplied by hand through the skill and insights of experts. In this paper a model for automatically learning bias is investigated. The central assumption of the model is that the learner is embedded within an environment of related learning tasks. Within such an environment the learner can sample from multiple tasks, and hence it can search for a hypothesis space that contains good solutions to many of the problems in the environment. Under certain restrictions on the set of all hypothesis spaces available to the learner, we show that a hypothesis space that performs well on a sufficiently large number of training tasks will also perform well when learning novel tasks in the same environment. Explicit bounds are also derived demonstrating that learning multiple tasks within an environment of related tasks can potentially give much better generalization than learning a single task.
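    A minimal sketch of the underlying idea, learning a representation shared across related tasks with task-specific heads on top (the architecture, dimensions, and synthetic tasks below are illustrative assumptions, not the paper's model):

        # Learn a shared feature map (the "bias") from several related regression tasks.
        import torch
        from torch import nn

        torch.manual_seed(0)
        n_tasks, dim_in, dim_feat, n_per_task = 5, 10, 3, 100

        # Synthetic related tasks: every target depends on the same 3-dim projection of x.
        true_proj = torch.randn(dim_in, dim_feat)
        task_w = [torch.randn(dim_feat, 1) for _ in range(n_tasks)]
        Xs = [torch.randn(n_per_task, dim_in) for _ in range(n_tasks)]
        Ys = [X @ true_proj @ w + 0.1 * torch.randn(n_per_task, 1) for X, w in zip(Xs, task_w)]

        shared = nn.Sequential(nn.Linear(dim_in, dim_feat), nn.ReLU())          # shared bias
        heads = nn.ModuleList(nn.Linear(dim_feat, 1) for _ in range(n_tasks))   # per-task hypotheses
        opt = torch.optim.Adam(list(shared.parameters()) + list(heads.parameters()), lr=1e-2)

        for _ in range(2000):
            opt.zero_grad()
            loss = sum(nn.functional.mse_loss(h(shared(X)), Y)
                       for h, X, Y in zip(heads, Xs, Ys)) / n_tasks
            loss.backward()
            opt.step()
        print("average training loss across tasks:", float(loss))

    A new task drawn from the same environment can then reuse the trained shared map and fit only a fresh head, which is the sense in which the learned bias is expected to transfer.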