1,525 research outputs found

    Adversarial domain adaptation to reduce sample bias of a high energy physics classifier

    We apply adversarial domain adaptation to reduce sample bias in a classification machine learning algorithm. We add a gradient reversal layer to a neural network to simultaneously classify signal versus background events, while minimising the difference of the classifier response to a background sample using an alternative MC model. We show this on the example of simulated events at the LHC with $t\bar{t}H$ signal versus $t\bar{t}b\bar{b}$ background classification. Comment: 15 pages, 8 figures, to be submitted to JINS
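The gradient reversal layer mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `GradientReversal`, the scaling factor `lam`, and the example numbers are all assumptions made for illustration. The layer passes features through unchanged on the forward pass and flips the sign of the gradient on the backward pass, so the feature extractor is pushed to make the two background samples harder to tell apart.

```python
class GradientReversal:
    """Identity on the forward pass, sign-flipped gradient on the backward pass.

    `lam` (hypothetical name) scales the strength of the reversed gradient.
    """

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, features):
        # Features reach the domain head unchanged.
        return list(features)

    def backward(self, grad_output):
        # The gradient from the domain head is negated (and scaled) before
        # it flows back into the shared feature extractor.
        return [-self.lam * g for g in grad_output]


grl = GradientReversal(lam=0.5)
features = [0.2, -1.3]                      # hypothetical feature vector
passed_on = grl.forward(features)           # unchanged on the way forward
grad_from_domain_head = [2.0, -4.0]         # hypothetical upstream gradient
grad_to_features = grl.backward(grad_from_domain_head)
print(passed_on, grad_to_features)
```

In a real training framework the same behaviour would normally be expressed through that framework's custom-gradient mechanism rather than a hand-written `backward` method.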

    Neural network setups for a precise detection of the many-body localization transition: finite-size scaling and limitations

    Determining phase diagrams and phase transitions semi-automatically using machine learning has received a lot of attention recently, with results in good agreement with more conventional approaches in most cases. When it comes to more quantitative predictions, such as the identification of the universality class or the precise determination of critical points, the task is more challenging. As an exacting test-bed, we study the Heisenberg spin-1/2 chain in a random external field, which is known to display a transition from a many-body localized to a thermalizing regime whose nature is not entirely characterized. We introduce different neural network structures and dataset setups to achieve a finite-size scaling analysis with the least possible physical bias (no assumed knowledge of the phase transition and direct input of wave-function coefficients), using state-of-the-art input data simulating chains of sizes up to L=24. In particular, we use domain adversarial techniques to ensure that the network learns scale-invariant features. We find a variability of the output results with respect to network and training parameters, resulting in relatively large uncertainties on the final estimates of the critical point and correlation length exponent, which tend to be larger than the values obtained from conventional approaches. We put the emphasis on interpretability throughout the paper and discuss what the network appears to learn for the various architectures used. Our findings show that a quantitative analysis of phase transitions of unknown nature remains a difficult task with neural networks when using minimally engineered physical input. Comment: v2: published version

    Model independent measurements of Standard Model cross sections with Domain Adaptation

    With the ever-growing amount of data collected by the ATLAS and CMS experiments at the CERN LHC, fiducial and differential measurements of the Higgs boson production cross section have become important tools to test the standard model predictions with an unprecedented level of precision, as well as to seek deviations that can reveal the presence of physics beyond the standard model. These measurements are in general designed to be easily comparable to any present or future theoretical prediction, and to achieve this goal it is important to keep the model dependence to a minimum. Nevertheless, the reduction of the model dependence usually comes at the expense of the measurement precision, preventing the full potential of the signal extraction procedure from being exploited. In this paper a novel methodology based on the machine learning concept of domain adaptation is proposed, which allows using a complex deep neural network in the signal extraction procedure while ensuring a minimal dependence of the measurements on the theoretical modelling of the signal. Comment: 16 pages, 10 figures

    Adversarial training to improve robustness of adversarial deep neural classifiers in the NOvA experiment

    The NOvA experiment is a long-baseline neutrino oscillation experiment consisting of two functionally identical detectors situated off-axis in Fermilab’s NuMI neutrino beam. The Near Detector observes the unoscillated beam at Fermilab, while the Far Detector observes the oscillated beam 810 km away. This allows for measurements of the oscillation probabilities for multiple oscillation channels, ν_μ → ν_μ, anti ν_μ → anti ν_μ, ν_μ → ν_e and anti ν_μ → anti ν_e, leading to measurements of the neutrino oscillation parameters sin θ_23, Δm^2_32 and δ_CP. These measurements are produced from an extensive analysis of the recorded data. Deep neural networks are deployed at multiple stages of this analysis. The Event CVN network is deployed for the purposes of identifying and classifying the interaction types of selected neutrino events. The effects of systematic uncertainties present in the measurements on the network performance are investigated and are found to cause negligible variations. The robustness of these network trainings is therefore demonstrated, which further justifies their current usage in the analysis beyond the standard validation. The effects on the network performance of larger systematic alterations to the training datasets beyond the systematic uncertainties, such as an exchange of the neutrino event generators, are investigated. The differences in network performance corresponding to the introduced variations are found to be minimal. Domain adaptation techniques are implemented in the AdCVN framework. These methods are deployed for the purpose of improving the Event CVN robustness for scenarios with systematic variations in the underlying data.

    Shifting representations: Adventures in cross-modality domain adaptation for medical image analysis

    Development and Deployment of a Deep Neural Network based Flavor Tagger for Belle II

    Automatic detection of pathological regions in medical images

    Medical images are an essential tool in the daily clinical routine for the detection, diagnosis, and monitoring of diseases. Different imaging modalities such as magnetic resonance (MR) or X-ray imaging are used to visualize the manifestations of various diseases, providing physicians with valuable information. However, analyzing every single image by human experts is a tedious and laborious task. Deep learning methods have shown great potential to support this process, but many images are needed to train reliable neural networks. Besides the accuracy of the final method, the interpretability of the results is crucial for a deep learning method to be established. A fundamental problem in the medical field is the availability of sufficiently large datasets due to the variability of different imaging techniques and their configurations. The aim of this thesis is the development of deep learning methods for the automatic identification of anomalous regions in medical images. Each method is tailored to the amount and type of available data. In the first step, we present a fully supervised segmentation method based on denoising diffusion models. This requires a large dataset with pixel-wise manual annotations of the pathological regions. Due to the implicit ensemble characteristic, our method provides uncertainty maps to allow interpretability of the model’s decisions. Manual pixel-wise annotations face the problems that they are prone to human bias, hard to obtain, and often even unavailable. Weakly supervised methods avoid these issues by only relying on image-level annotations. We present two different approaches based on generative models to generate pixel-wise anomaly maps using only image-level annotations, i.e., a generative adversarial network and a denoising diffusion model. Both perform image-to-image translation between a set of healthy and a set of diseased subjects. 
    Pixel-wise anomaly maps can be obtained by computing the difference between the original image of the diseased subject and the synthetic image of its healthy representation. In an extension of the diffusion-based anomaly detection method, we present a flexible framework to solve various image-to-image translation tasks. With this method, we managed to change the size of tumors in MR images, and we were able to add realistic pathologies to images of healthy subjects. Finally, we focus on a problem frequently occurring when working with MR images: If not enough data from one MR scanner are available, data from other scanners need to be considered. This multi-scanner setting introduces a bias between the datasets of different scanners, limiting the performance of deep learning models. We present a regularization strategy on the model’s latent space to overcome the problems raised by this multi-site setting.
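The pixel-wise anomaly map described in this abstract is a per-pixel difference between the diseased image and its synthetic healthy counterpart. The following is a minimal sketch of that one step only, not the thesis's method: the function name `anomaly_map`, the use of the absolute difference, and the tiny integer "images" are assumptions for illustration, and the generative model that produces the healthy image is not shown.

```python
def anomaly_map(original, synthetic_healthy):
    """Per-pixel absolute difference between two equally sized 2D images.

    Images are plain nested lists of intensities. Taking the absolute
    value is an assumption; the abstract only says "difference".
    """
    return [
        [abs(o - s) for o, s in zip(row_o, row_s)]
        for row_o, row_s in zip(original, synthetic_healthy)
    ]


# Hypothetical 2x2 images: the right-hand column contains the "pathology".
diseased = [[10, 90], [20, 80]]
healthy = [[10, 20], [20, 30]]
result = anomaly_map(diseased, healthy)
print(result)  # [[0, 70], [0, 50]]
```

Pixels where the two images agree map to zero, so large values mark regions that the translation to a healthy image had to change.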

    Semi-supervised detection of industrial fouling using ultrasound

    Fouling is a large-scale problem in industrial equipment such as heat exchangers or pipes, used in factories, ships, airplanes, etc. Traditionally, such equipment is cleaned using sandblasting, chemicals or mechanical methods, all of which require halting the process, which is costly. Recently, high-power ultrasound has become a viable alternative to these methods. In ultrasonic cleaning, ultrasound is projected into the equipment from the outside, which means that the equipment does not need to be halted to perform cleaning. While the cleaning itself is not invasive in nature, in most cases vision cannot be used to determine whether cleaning is actually necessary or not. What remains is to find a detection method that is also non-invasive. It is possible to use ultrasound as a kind of radar to detect whether or not fouling is present, and this has been attempted in previous literature. However, until now, such methods have required extensive manual calculation and knowledge of the physical properties of the setup. We present the first ever system to concurrently clean and detect industrial fouling using ultrasound and deep learning. Our method does not rely on specific properties of the equipment, allowing it to generalize to large industrial processes where it is not practical to calculate or simulate the cleaning scenario. To this end, we extend existing literature on semi-supervised learning by presenting algorithms used to learn from a monotonic process, and model the high-dimensional signal data using a convolutional neural network that is highly robust to temporal variance. This thesis presents the machine learning solution behind the system, and the cleaning components are provided by Altum Technologies. Further, we explore methods to detect and counter the so-called domain shift that occurs when experimenting in the physical world, and provide experimental evidence that our methods work in practice.