217 research outputs found

    Automated data pre-processing via meta-learning

    A data mining algorithm may perform differently on datasets with different characteristics; for example, it might perform better on a dataset with continuous attributes than on one with categorical attributes, or the other way around. In practice, a dataset usually needs to be pre-processed. Taking into account all the possible pre-processing operators, there is a staggeringly large number of alternatives, and inexperienced users become overwhelmed. We show that this problem can be addressed by an automated approach, leveraging ideas from meta-learning. Specifically, we consider a wide range of data pre-processing techniques and a set of data mining algorithms. For each data mining algorithm and selected dataset, we are able to predict the transformations that improve the result of the algorithm on the respective dataset. Our approach will help non-expert users to more effectively identify the transformations appropriate to their applications, and hence to achieve improved results. The final publication is available at link.springer.com. Peer Reviewed. Postprint (published version).

    Numerical Implementation of lepton-nucleus interactions and its effect on neutrino oscillation analysis

    We discuss the implementation of the nuclear model based on realistic nuclear spectral functions in the GENIE neutrino interaction generator. Besides improving on the Fermi gas description of the nuclear ground state, our scheme involves a new prescription for $Q^2$ selection, meant to efficiently enforce energy-momentum conservation. The results of our simulations, validated through comparison to electron scattering data, have been obtained for a variety of target nuclei, ranging from carbon to argon, and cover the kinematical region in which quasi-elastic scattering is the dominant reaction mechanism. We also analyse the influence of the adopted nuclear model on the determination of neutrino oscillation parameters. Comment: 19 pages, 35 figures, version accepted by Phys. Rev.

    Determining appropriate approaches for using data in feature selection

    Feature selection is increasingly important in data analysis and machine learning in the big data era. However, how to use the data in feature selection, i.e. whether to use ALL or only PART of a dataset, has become a serious and tricky issue. Whilst the conventional practice of using all the data in feature selection may lead to selection bias, using part of the data may, on the other hand, lead to underestimating the relevant features under some conditions. This paper investigates these two strategies systematically in terms of reliability and effectiveness, and then determines their suitability for datasets with different characteristics. Reliability is measured by the Average Tanimoto Index and the Inter-method Average Tanimoto Index, and effectiveness is measured by the mean generalisation accuracy of classification. The computational experiments are carried out on ten real-world benchmark datasets and fourteen synthetic datasets. The synthetic datasets are generated with a pre-set number of relevant features, varied numbers of irrelevant features and instances, and different levels of added noise. The results indicate that the PART approach is more effective in reducing the bias when the size of a dataset is small, but starts to lose its advantage as the dataset size increases.
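As a rough illustration of the reliability measure named in this abstract, the sketch below computes the Tanimoto index between two selected-feature subsets and averages it over all pairs of subsets (for example, one subset per resampling fold). The set-based definition (intersection size over union size) is the common one; the paper's exact Average Tanimoto Index and Inter-method variant may be defined differently. The function names and the `folds` data are hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) index of two feature subsets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty selections are treated as identical (an assumption).
        return 1.0
    return len(a & b) / len(union)


def average_tanimoto(subsets):
    """Mean pairwise Tanimoto index over a collection of feature subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two feature subsets")
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature subsets selected on three resampling folds.
folds = [
    {"f1", "f2", "f3"},
    {"f1", "f2", "f4"},
    {"f1", "f2", "f3"},
]
stability = average_tanimoto(folds)
```

A value near 1 means the selection procedure returns nearly the same features on every fold; a value near 0 means the selected subsets barely overlap.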

    Development of a quality assurance process for the SoLid experiment

    The SoLid experiment has been designed to search for an oscillation pattern induced by a light sterile neutrino state, utilising the BR2 reactor of SCK•CEN, in Belgium. The detector leverages a new hybrid technology, utilising two distinct scintillators in a cubic array, creating a highly segmented detector volume. A combination of 5 cm cubic polyvinyltoluene cells, with $^6$LiF:ZnS(Ag) sheets on two faces of each cube, facilitates reconstruction of the neutrino signals. Whilst the high granularity provides a powerful toolset to discriminate backgrounds, the segmentation by itself also represents a challenge in terms of homogeneity and calibration, for a consistent detector response. The search for this light sterile neutrino implies a sensitivity to distortions of around O(10)% in the energy spectrum of reactor $\bar{\nu}_e$. Hence, a very good neutron detection efficiency, light yield and homogeneous detector response are critical for data validation. The minimal requirements for the SoLid physics program are a light yield and a neutron detection efficiency larger than 40 PA/MeV/cube and 50% respectively. In order to guarantee these minimal requirements, the collaboration developed a rigorous quality assurance process for all 12800 cubic cells of the detector. To carry out the quality assurance process, an automated calibration system called CALIPSO was designed and constructed. CALIPSO provides precise, automatic placement of radioactive sources in front of each cube of a given detector plane (16 x 16 cubes). A combination of Na-22, Cf-252 and AmBe gamma and neutron sources was used by CALIPSO during the quality assurance process. Initially, the scanning identified defective components, allowing for repair during initial construction of the SoLid detector. Secondly, a full analysis of the calibration data revealed initial estimations for the light yield of over 60 PA/MeV and neutron reconstruction efficiency of 68%, validating the SoLid physics requirements.

    Algebraic Comparison of Partial Lists in Bioinformatics

    The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling a biological phenotype in terms of a classification or regression model. Due to resampling protocols, or simply within a meta-analysis comparison, it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained instead of a single list. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.

    Determination of muon momentum in the MicroBooNE LArTPC using an improved model of multiple Coulomb scattering

    We discuss a technique for measuring a charged particle's momentum by means of multiple Coulomb scattering (MCS) in the MicroBooNE liquid argon time projection chamber (LArTPC). Unlike other track momentum reconstruction methods (range-based and calorimetric momentum reconstruction), this method does not require the full particle ionization track to be contained inside the detector volume. We motivate use of this technique, describe a tuning of the underlying phenomenological formula, quantify its performance on fully contained beam-neutrino-induced muon tracks both in simulation and in data, and quantify its performance on exiting muon tracks in simulation. Using simulation, we have shown that the standard Highland formula should be re-tuned specifically for scattering in liquid argon, which significantly improves the bias and resolution of the momentum measurement. With the tuned formula, we find agreement between data and simulation for contained tracks, with a small bias in the momentum reconstruction and with resolutions that vary as a function of track length, improving from about 10% for the shortest (one meter long) tracks to 5% for longer (several meter) tracks. For simulated exiting muons with at least one meter of track contained, we find a similarly small bias, and a resolution which is less than 15% for muons with momentum below 2 GeV/c. Above 2 GeV/c, results are given as a first estimate of the MCS momentum measurement capabilities of MicroBooNE for high momentum exiting tracks.
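For orientation, the standard (untuned) Highland formula mentioned in this abstract can be sketched as below. This is the textbook form, theta_0 = (13.6 MeV / (beta c p)) z sqrt(x / X0) [1 + 0.038 ln(x / X0)], not the re-tuned version the paper derives. The function name is hypothetical, and the liquid-argon radiation length of 14 cm and the 14 cm segment length used in the example are assumed values for illustration only.

```python
import math

MUON_MASS_MEV = 105.658  # muon rest mass in MeV/c^2
X0_LAR_CM = 14.0         # radiation length of liquid argon in cm (assumed value)


def highland_theta0(p_mev, segment_cm, x0_cm=X0_LAR_CM,
                    mass_mev=MUON_MASS_MEV, kappa_mev=13.6, charge=1):
    """RMS projected scattering angle (radians) from the standard Highland formula.

    p_mev      -- particle momentum in MeV/c
    segment_cm -- thickness of material traversed, in cm
    kappa_mev  -- the 13.6 MeV constant, exposed so a tuned value can be substituted
    """
    energy = math.hypot(p_mev, mass_mev)  # E = sqrt(p^2 + m^2) in natural units
    beta = p_mev / energy
    t = segment_cm / x0_cm                # thickness in radiation lengths
    return (kappa_mev / (beta * p_mev)) * abs(charge) * math.sqrt(t) * (
        1.0 + 0.038 * math.log(t)
    )


# Expected angular spread for a 1 GeV/c muon over one 14 cm segment.
theta_1gev = highland_theta0(1000.0, 14.0)
```

Because the expected angular spread falls roughly as 1/p, the scatter observed between successive track segments can be inverted to estimate momentum even when the track exits the detector.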

    Long-Baseline Neutrino Facility (LBNF) and Deep Underground Neutrino Experiment (DUNE) Conceptual Design Report Volume 2: The Physics Program for DUNE at LBNF

    The Physics Program for the Deep Underground Neutrino Experiment (DUNE) at the Fermilab Long-Baseline Neutrino Facility (LBNF) is described.

    Conditional Neural Relational Inference for Interacting Systems

    In this work, we aim to learn to model the dynamics of similar yet distinct groups of interacting objects. These groups follow some common physical laws that exhibit specificities captured through a vectorial description. We develop a model that allows us to do conditional generation from any such group given its vectorial description. Unlike previous work on learning dynamical systems, which can only do trajectory completion and requires part of the trajectory dynamics to be provided as input at generation time, we generate using only the conditioning vector, with no access to trajectories at generation time. We evaluate our model in the setting of modeling human gait and, in particular, pathological human gait.

    A Proposal for a Three Detector Short-Baseline Neutrino Oscillation Program in the Fermilab Booster Neutrino Beam

    A Short-Baseline Neutrino (SBN) physics program of three LAr-TPC detectors located along the Booster Neutrino Beam (BNB) at Fermilab is presented. This new SBN Program will deliver a rich and compelling physics opportunity, including the ability to resolve a class of experimental anomalies in neutrino physics and to perform the most sensitive search to date for sterile neutrinos at the eV mass-scale through both appearance and disappearance oscillation channels. Using data sets of 6.6e20 protons on target (P.O.T.) in the LAr1-ND and ICARUS T600 detectors plus 13.2e20 P.O.T. in the MicroBooNE detector, we estimate that a search for muon neutrino to electron neutrino appearance can be performed with ~5 sigma sensitivity for the LSND allowed (99% C.L.) parameter region. In this proposal for the SBN Program, we describe the physics analysis, the conceptual design of the LAr1-ND detector, the design and refurbishment of the T600 detector, the infrastructure required to execute the program, and a possible reconfiguration of the BNB target and horn system to improve its performance for oscillation searches. Comment: 209 pages, 129 figures

    Indication for the disappearance of reactor electron antineutrinos in the Double Chooz experiment

    The Double Chooz Experiment presents an indication of reactor electron antineutrino disappearance consistent with neutrino oscillations. A ratio of 0.944 $\pm$ 0.016 (stat) $\pm$ 0.040 (syst) observed to predicted events was obtained in 101 days of running at the Chooz Nuclear Power Plant in France, with two 4.25 GW$_{th}$ reactors. The results were obtained from a single 10 m$^3$ fiducial volume detector located 1050 m from the two reactor cores. The reactor antineutrino flux prediction used the Bugey4 measurement as an anchor point. The deficit can be interpreted as an indication of a non-zero value of the still unmeasured neutrino mixing parameter $\sin^2 2\theta_{13}$. Analyzing both the rate of the prompt positrons and their energy spectrum we find $\sin^2 2\theta_{13}$ = 0.086 $\pm$ 0.041 (stat) $\pm$ 0.030 (syst), or, at 90% CL, 0.015 < $\sin^2 2\theta_{13}$ < 0.16. Comment: 7 pages, 4 figures (new version after PRL referee's comments)
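To get a feel for the size of the quoted deficit, one simple back-of-the-envelope check is to add the statistical and systematic uncertainties in quadrature. The sketch below does this for the observed-to-predicted ratio; it assumes the two uncertainties are independent and Gaussian, which is a simplification and not the collaboration's analysis. The function and variable names are hypothetical.

```python
import math


def combine_in_quadrature(*errors):
    """Total uncertainty from independent contributions: sqrt(sum of squares)."""
    return math.sqrt(sum(e * e for e in errors))


# Values quoted in the abstract: ratio = 0.944 +/- 0.016 (stat) +/- 0.040 (syst).
ratio = 0.944
ratio_err = combine_in_quadrature(0.016, 0.040)

# Distance of the ratio from 1 (no disappearance), in units of the
# combined uncertainty, under the independence assumption above.
deficit_sigma = (1.0 - ratio) / ratio_err
```

The systematic term dominates the combined uncertainty, which is consistent with the abstract describing the result as an indication rather than an observation.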