    Comparison of the CPU and memory performance of StatPatternRecognition (SPR) and Toolkit for MultiVariate Analysis (TMVA)

    High Energy Physics data sets are often characterized by a huge number of events. Therefore, it is extremely important to use statistical packages able to efficiently analyze these unprecedented amounts of data. We compare the performance of the statistical packages StatPatternRecognition (SPR) and Toolkit for MultiVariate Analysis (TMVA). We focus on how the CPU time and memory usage of the learning process scale with data set size. As classifiers, we consider Random Forests, Boosted Decision Trees and Neural Networks. For our tests, we employ a data set widely used in the machine learning community, the "Threenorm" data set, as well as data tailored for testing various edge cases. For each data set, we progressively increase its size and measure the CPU time and memory needed to build the classifiers implemented in SPR and TMVA. We show that SPR is often significantly faster and consumes significantly less memory. For example, the SPR implementation of Random Forest is an order of magnitude faster and consumes an order of magnitude less memory than TMVA on Threenorm data.
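
The measurement described in this abstract (CPU time and memory of the learning step as the data set grows) can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_train` is a hypothetical stand-in for a training routine, not SPR or TMVA code, and `tracemalloc` sees only Python-level allocations and slows execution while tracing, so the numbers are indicative rather than comparable to the paper's.

```python
import time
import tracemalloc


def toy_train(rows):
    """Hypothetical stand-in for a classifier's training step (not SPR or TMVA)."""
    # Sort every feature column, as a tree learner might before scanning splits.
    n_features = len(rows[0])
    return [sorted(row[j] for row in rows) for j in range(n_features)]


def profile_scaling(train, sizes, n_features=4):
    """Return a list of (n_events, cpu_seconds, peak_bytes), one per size."""
    results = []
    for n in sizes:
        # Deterministic synthetic data, built before tracing starts so that
        # only the training step's own allocations are counted.
        rows = [[(i * 31 + j * 17) % 101 for j in range(n_features)]
                for i in range(n)]
        tracemalloc.start()
        start = time.process_time()  # CPU time, not wall-clock time
        train(rows)
        cpu_seconds = time.process_time() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        results.append((n, cpu_seconds, peak_bytes))
    return results


if __name__ == "__main__":
    for n, cpu, peak in profile_scaling(toy_train, [1000, 2000, 4000]):
        print(f"n={n:6d}  cpu={cpu:.4f}s  peak={peak / 1024:.1f} KiB")
```

Doubling the size at each step, as in the usage above, makes it easy to read the scaling behaviour off the printed columns.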

    Guinea pigs sublethally infected with aerosolized Legionella pneumophila develop humoral and cell-mediated immune responses and are protected against lethal aerosol challenge. A model for studying host defense against lung infections caused by intracellular pathogens.

    We have employed the guinea pig model of L. pneumophila infection, which mimics Legionnaires' disease in humans both clinically and pathologically, to study humoral and cell-mediated immune responses to L. pneumophila and to examine protective immunity after aerosol exposure, the natural route of infection. Guinea pigs exposed to sublethal concentrations of L. pneumophila by aerosol developed strong humoral immune responses. By the indirect fluorescent antibody assay, exposed guinea pigs had a median serum antibody titer (expressed as the reciprocal of the highest positive dilution) of 32, whereas control guinea pigs had a median titer of less than 1. Sublethally infected (immunized) guinea pigs also developed strong cell-mediated immune responses. In response to L. pneumophila antigens, splenic lymphocytes from immunized but not control animals proliferated strongly in vitro, as measured by their capacity to incorporate [3H]thymidine. Moreover, immunized but not control guinea pigs developed strong cutaneous delayed-type hypersensitivity to intradermally injected L. pneumophila antigens. Sublethally infected (immunized) guinea pigs exhibited strong protective immunity to L. pneumophila. In two independent experiments, all 22 immunized guinea pigs survived aerosol challenge with one or three times the lethal dose of L. pneumophila, whereas none of 16 sham-immunized control guinea pigs survived (p < 0.0001 in each experiment). Immunized guinea pigs were not protected significantly from challenge with 10 times the lethal dose. Immunized but not control animals cleared the bacteria from their lungs. This study demonstrates that guinea pigs sublethally infected with L. pneumophila by the aerosol route develop strong humoral immune responses to this pathogen, develop strong cell-mediated immune responses and cutaneous delayed-type hypersensitivity to L. pneumophila antigens, are protected against subsequent lethal aerosol challenge, and are able to clear the bacteria from their lungs. The guinea pig model of L. pneumophila pulmonary infection is an excellent one for studying general principles of host defense against pulmonary infections caused by intracellular pathogens.

    Semi-supervised Learning for Photometric Supernova Classification

    We present a semi-supervised method for photometric supernova typing. Our approach is to first use the nonlinear dimension reduction technique diffusion map to detect structure in a database of supernova light curves and subsequently employ random forest classification on a spectroscopically confirmed training set to learn a model that can predict the type of each newly observed supernova. We demonstrate that this is an effective method for supernova typing. As supernova numbers increase, our semi-supervised method efficiently utilizes this information to improve classification, a property not enjoyed by template-based methods. Applied to supernova data simulated by Kessler et al. (2010b) to mimic those of the Dark Energy Survey, our methods achieve (cross-validated) 95% Type Ia purity and 87% Type Ia efficiency on the spectroscopic sample, but only 50% Type Ia purity and 50% efficiency on the photometric sample due to their spectroscopic follow-up strategy. To improve the performance on the photometric sample, we search for better spectroscopic follow-up procedures by studying the sensitivity of our machine-learned supernova classification to the specific strategy used to obtain training sets. With a fixed amount of spectroscopic follow-up time, we find that deeper magnitude-limited spectroscopic surveys are better for producing training sets. For supernova Ia (II-P) typing, we obtain a 44% (1%) increase in purity to 72% (87%) and 30% (162%) increase in efficiency to 65% (84%) of the sample using a 25th (24.5th) magnitude-limited survey instead of the shallower spectroscopic sample used in the original simulations. When redshift information is available, we incorporate it into our analysis using a novel method of altering the diffusion map representation of the supernovae. Incorporating host redshifts leads to a 5% improvement in Type Ia purity and 13% improvement in Type Ia efficiency. Comment: 16 pages, 11 figures, accepted for publication in MNRAS
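
The purity and efficiency figures quoted in this abstract can be made concrete with a short sketch. It assumes the conventional definitions (purity = true positives over all objects classified as the target type, efficiency = true positives over all true objects of that type); the abstract does not spell out its definitions, so treat these as an assumption. The helper names are illustrative. The second helper shows that the quoted increases are relative ones: going from 50% to 72% purity is a 44% increase.

```python
def purity_and_efficiency(true_types, predicted_types, target="Ia"):
    """Purity and efficiency for one class, under the conventional definitions.

    purity     = TP / (TP + FP): fraction of predicted-target objects that are correct
    efficiency = TP / (TP + FN): fraction of true-target objects that are recovered
    """
    pairs = list(zip(true_types, predicted_types))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    purity = tp / (tp + fp) if tp + fp else 0.0
    efficiency = tp / (tp + fn) if tp + fn else 0.0
    return purity, efficiency


def relative_increase(old, new):
    """Relative change from old to new, e.g. 0.50 -> 0.72 gives about 0.44."""
    return (new - old) / old


if __name__ == "__main__":
    true_types = ["Ia", "Ia", "Ia", "II-P", "II-P", "Ibc"]
    predicted = ["Ia", "Ia", "II-P", "Ia", "II-P", "Ibc"]
    print(purity_and_efficiency(true_types, predicted))
    print(relative_increase(0.50, 0.72), relative_increase(0.50, 0.65))
```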

    A Methodology for the Diagnostic of Aircraft Engine Based on Indicators Aggregation

    Aircraft engine manufacturers collect large amounts of engine-related data during flights. These data are used to detect anomalies in the engines in order to help companies optimize their maintenance costs. This article introduces and studies a generic methodology for building automatic detection of early signs of anomalies in a way that is understandable by the human operators who make the final maintenance decision. The main idea of the method is to generate a very large number of binary indicators based on parametric anomaly scores designed by experts, complemented by simple aggregations of those scores. The best indicators are selected via a classical forward scheme, leading to a much reduced number of indicators that are tuned to a data set. We illustrate the value of the method on simulated data which contain realistic early signs of anomalies. Comment: Proceedings of the 14th Industrial Conference, ICDM 2014, St. Petersburg : Russian Federation (2014)
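
The two steps named in this abstract, turning anomaly scores into binary indicators and then choosing a few of them by forward selection, can be sketched as follows. The abstract does not give the selection criterion or the classifier, so the scoring rule used here ("flag an anomaly if any selected indicator fires", scored by accuracy) is a hypothetical placeholder, as are the function names and thresholds.

```python
def binarize(scores, threshold):
    """Turn a numeric anomaly score into a binary indicator."""
    return [s > threshold for s in scores]


def accuracy(selected, indicators, labels):
    """Accuracy of the rule 'anomaly if any selected indicator fires' (assumed rule)."""
    correct = 0
    for i, label in enumerate(labels):
        predicted = any(indicators[name][i] for name in selected)
        correct += predicted == label
    return correct / len(labels)


def forward_select(indicators, labels, max_indicators=3):
    """Greedy forward selection: add the best indicator while the score improves."""
    selected = []
    best = accuracy(selected, indicators, labels)
    while len(selected) < max_indicators:
        candidates = [name for name in indicators if name not in selected]
        if not candidates:
            break
        score, name = max(
            (accuracy(selected + [n], indicators, labels), n) for n in candidates
        )
        if score <= best:
            break  # no remaining candidate improves the current subset
        selected.append(name)
        best = score
    return selected, best


if __name__ == "__main__":
    labels = [False, False, True, True, True, False]
    indicators = {
        "score_a>2": binarize([0.1, 0.5, 3.0, 2.5, 1.0, 0.2], 2),
        "score_b>1": binarize([0.0, 0.3, 0.2, 0.9, 1.8, 0.1], 1),
        "noise>1": binarize([5, 0, 5, 0, 0, 5], 1),
    }
    print(forward_select(indicators, labels))
```

Because each selected indicator is a named threshold on an expert-designed score, the resulting subset stays readable by an operator, which is the property the abstract emphasizes.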

    Texture-based crowd detection and localisation

    This paper presents a crowd detection system based on texture analysis. State-of-the-art techniques based on the co-occurrence matrix are revisited and a novel set of features is proposed. These features provide a richer description of the co-occurrence matrix and can be exploited to obtain stronger classification results, especially when smaller portions of the image are considered. This is extremely useful for crowd localisation: acquired images are divided into smaller regions in order to perform a classification on each one. A thorough evaluation of the proposed system on a real-world data set is also presented; it validates the improvements in reliability of crowd detection and localisation.
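
For readers unfamiliar with the co-occurrence matrix this abstract builds on, here is a minimal sketch. It computes a normalized gray-level co-occurrence matrix for one pixel offset, two classical texture features (contrast and energy), and a split of the image into regions for per-region classification. The paper's novel features are not described in the abstract and are not reproduced here; the function names and the tiny test image are illustrative assumptions.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return counts  # image too small for this offset
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Classical contrast feature: sum of p(i, j) * (i - j)^2."""
    return sum(v * (i - j) ** 2 for i, row in enumerate(p) for j, v in enumerate(row))


def energy(p):
    """Classical energy (angular second moment): sum of p(i, j)^2."""
    return sum(v * v for row in p for v in row)


def split_into_tiles(image, tile):
    """Yield (row, col, sub_image) for non-overlapping tile x tile regions."""
    for r in range(0, len(image) - tile + 1, tile):
        for c in range(0, len(image[0]) - tile + 1, tile):
            yield r, c, [row[c:c + tile] for row in image[r:r + tile]]


if __name__ == "__main__":
    image = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    p = cooccurrence_matrix(image, levels=4)
    print("contrast:", contrast(p), "energy:", energy(p))
    for r, c, region in split_into_tiles(image, 2):
        print((r, c), contrast(cooccurrence_matrix(region, levels=4)))
```

Smaller regions contain fewer pixel pairs, so their matrices are sparser and noisier, which is why the abstract singles out small image portions as the harder case.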

    Machine learning with the hierarchy‐of‐hypotheses (HoH) approach discovers novel pattern in studies on biological invasions

    Research synthesis on simple yet general hypotheses and ideas is challenging in scientific disciplines studying highly context‐dependent systems such as medical, social, and biological sciences. This study shows that machine learning, an equation‐free statistical modeling approach from artificial intelligence, is a promising synthesis tool for discovering novel patterns and the source of controversy in a general hypothesis. We apply a decision tree algorithm, assuming that evidence from various contexts can be adequately integrated in a hierarchically nested structure. As a case study, we analyzed 163 articles that studied a prominent hypothesis in invasion biology, the enemy release hypothesis. Treating this as a classification problem, we explored whether any of the nine attributes that classify each study can differentiate its conclusions. The results corroborated that machine learning can be useful for research synthesis, as the algorithm detected patterns that had already been the focus of previous narrative reviews. Compared with the previous synthesis study that assessed the same evidence collection based on experts' judgement, the algorithm newly proposed that the studies focusing on Asian regions mostly supported the hypothesis, suggesting that more detailed investigations in these regions can enhance our understanding of the hypothesis. We suggest that machine learning algorithms can be a promising synthesis tool especially where studies (a) reformulate a general hypothesis from different perspectives, (b) use different methods or variables, or (c) report insufficient information for conducting meta‐analyses.

    ICA as a preprocessing technique for classification

    In this paper we propose the use of the independent component analysis (ICA) [1] technique for improving the classification rate of decision trees and multilayer perceptrons [2], [3]. The use of ICA in the preprocessing stage makes the structure of both classifiers simpler, and therefore improves their generalization properties. The hypothesis behind the proposed preprocessing is that ICA will transform the feature space into a space where the components are independent and aligned to the axes, and therefore better suited to the way a decision tree is constructed. Also, the inference of the weights of a multilayer perceptron will be much easier because the gradient search in the weight space will follow independent trajectories. The result is that the classifiers are less complex and on some databases the error rate is lower. This idea is also applicable to regression.

    Improving access to Special Collections by automating descriptive metadata creation

    Presentation given at the Utah Library Association Annual Conference, Layton, UT

    Predictive Maintenance on the Machining Process and Machine Tool

    This paper presents the process required to implement data-driven Predictive Maintenance (PdM), covering not only the machine decision making but also data acquisition and processing. A short review of the different approaches and techniques in maintenance is given. The main contribution of this paper is a solution for the predictive maintenance problem in a real machining process. Several steps are needed to reach the solution, and each is carefully explained. The results show that the Preventive Maintenance (PM) carried out in a real machining process could be changed into a PdM approach. A decision-making application was developed to provide a visual analysis of the Remaining Useful Life (RUL) of the machining tool. This work is a proof of concept of the presented methodology in one process, but it is replicable for most serial production processes.

    Peer Observation as a Job-Embedded Professional Development Tool

    Teacher professional development is typically provided outside of the workplace, and is therefore disconnected from daily classroom practices. An alternative model of professional development is peer observation, which is contextualized through coaching and collaboration in the classroom. To date, research into the practice of peer observation is lacking. To fill that gap, this study examined the influence of peer observation on teacher practice, while identifying the factors that were most beneficial and most challenging about peer observation and its influence on workplace collegiality. This study used qualitative methods and action research that allowed teachers to be part of the research process. Three teams of teachers participated in the study at a suburban high school. Each team consisted of two teachers, pairing an experienced teacher with an inexperienced teacher. Participants in the study reported how peer observation provided professional development in the context of their workplace. Teachers in each team shared the same instructional content area which, according to the findings, made the peer observation process more relevant. Peer observation was also found to build and strengthen collegiality, facilitate an exchange of instructional techniques between teachers, and break down isolating instructional practices. Participants also appreciated receiving feedback from a colleague in a non-threatening way.