67 research outputs found
Meson spectroscopy at non-zero temperature using lattice QCD
This thesis explores two main topics: the effects of temperature on several Quantum Chromodynamics mesonic observables, with a concrete focus on the temperature dependence of the mesonic mass spectrum, and the numerical spectral reconstruction of lattice correlation functions employing deep neural networks. In the first two chapters, a brief introduction to standard lattice Quantum Chromodynamics and non-zero temperature field theory is provided. Using the tools presented in the introductory chapters, a complete spectroscopy analysis of the temperature dependence of several mesonic ground-state masses is developed. From this study, novel results on the restoration of chiral symmetry as a function of temperature are obtained by studying the degree of degeneracy between the ρ(770) and a1(1260) states. Additionally, a complete study of the thermal effects affecting the mesonic D(s)-sector below the pseudocritical temperature of the system is provided. A self-contained chapter discussing the pion velocity in the medium is also included in the document. The pion velocity is estimated as a function of the temperature using non-zero temperature lattice Quantum Chromodynamics. In addition, after providing a detailed introduction to the field of neural networks, their application to numerical spectral reconstruction is studied. A simple implementation in which deep neural networks are applied to numerical spectral reconstruction is tested in order to explore its limits and applicability.
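For context, the reconstruction problem such networks address is the inversion of the standard finite-temperature relation between a Euclidean meson correlator G(τ) and the spectral function ρ(ω) (conventions assumed here; the thesis may normalize differently):

```latex
G(\tau) = \int_0^{\infty} \frac{d\omega}{2\pi}\, \rho(\omega)\,
          \frac{\cosh\!\left[\omega\left(\tau - \frac{1}{2T}\right)\right]}
               {\sinh\!\left(\frac{\omega}{2T}\right)}
```

Because G(τ) is known only at a finite number of noisy time slices, this inversion is ill posed, which is what motivates regularized or learned reconstruction methods.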
Green Function and Electromagnetic Potential for Computer Vision and Convolutional Neural Network Applications
For advanced computer vision (CV) tasks such as classification, scene segmentation, and salient object detection, extracting features from images is mandatory. One of the most used tools for feature extraction is the convolutional kernel, with each kernel being specialized for detecting a specific feature. In recent years, the convolutional neural network (CNN) became the standard method of feature detection, since it allows thousands of kernels to be optimized at the same time. However, one limitation of the CNN is that all the kernels are small (usually between 3x3 and 7x7), which limits the receptive field. Another limitation is that feature merging is done via weighted additions and pooling, which cannot be used to merge spatial-domain features with gradient-domain features, since those features are not located at the same pixel coordinates.
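The receptive-field limitation of small stacked kernels can be made concrete: with stride 1, each additional k×k layer grows the receptive field by k−1 pixels. A minimal sketch (the helper name is mine, not from the thesis):

```python
def receptive_field(num_layers, kernel_size=3):
    # Receptive field (in pixels) of a stack of stride-1 convolutions:
    # each layer adds (kernel_size - 1) pixels, so rf = 1 + L * (k - 1).
    rf = 1
    for _ in range(num_layers):
        rf += kernel_size - 1
    return rf
```

Ten 3x3 layers therefore see only a 21-pixel window, far smaller than a typical image.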
The objective of this thesis is to develop electromagnetic (EM) convolutions and Green's function (GF) convolutions to be used in computer vision and convolutional neural networks (CNN). These new kernels do not have the limitations of the standard CNN kernels: by using kernels bigger than the image, they allow an unlimited receptive field and interaction between any pixels in the image. They allow merging spatial-domain features with gradient-domain features by integrating any vector field. Additionally, they can transform any vector field of features into its least-error conservative field, meaning that the field of features becomes smooth, irrotational, and conservative (line-integrable).
At first, we developed different symmetrical and asymmetrical convolutional kernels based on EM and GF that are both resolution and rotation invariant. Then we developed the first method for determining the probability of being inside partial edges, which allows extrapolating thin edge features into the full 2D space. Furthermore, the current thesis proves that GF kernels are the least-error gradient and Laplacian solvers, and they are empirically demonstrated to be faster than the fastest competing method and easier to implement.
Consequently, using the fast gradient solver, we developed the first method that directly combines edges with saliency maps in the gradient domain, then solves the gradient to go back to the saliency domain. The improvement in the F-measure of the saliency maps is on average 6.6 times that of the nearest competing algorithm on a selected dataset. Then, to improve the saliency maps further, we developed the DSS-GIS model, which combines edges with salient regions deep inside the network.
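As an illustration of the gradient-domain workflow described above (edit features in the gradient domain, then solve back to the image domain), here is a minimal FFT-based least-squares gradient integrator. It is a generic periodic spectral Poisson solve, a stand-in sketch rather than the thesis's GF solver:

```python
import numpy as np

def integrate_gradient(gx, gy):
    """Recover the scalar field whose gradient best matches (gx, gy).

    Solves the Poisson equation lap(u) = div(g) with a periodic FFT-based
    spectral solver; the result is the least-squares integration of a
    possibly non-conservative vector field, up to an additive constant.
    """
    h, w = gx.shape
    kx, ky = np.meshgrid(np.fft.fftfreq(w) * 2 * np.pi,
                         np.fft.fftfreq(h) * 2 * np.pi)
    # Divergence of g in Fourier space: d/dx <-> i*kx, d/dy <-> i*ky.
    div_hat = 1j * kx * np.fft.fft2(gx) + 1j * ky * np.fft.fft2(gy)
    denom = -(kx ** 2 + ky ** 2)
    denom[0, 0] = 1.0            # avoid 0/0; the mean of u is arbitrary
    u = np.fft.ifft2(div_hat / denom).real
    return u - u.mean()          # fix the free constant to zero mean
```

Feeding it a field assembled from edited or mixed gradients returns the smoothest image consistent with them, which is the sense in which the solve projects onto the nearest conservative field.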
Convolutional Neural Networks for the CHIPS Neutrino Detector R&D Project
The CHerenkov detectors In mine PitS (Chips) neutrino detector R&D project aims to develop novel strategies and technologies for very large yet "cheap as chips" water Cherenkov neutrino detectors. Via deployment in a body of water, use of commercially available components, and instrumentation coverage optimisation for the study of exclusively accelerator beam neutrinos, Chips will enable megaton-scale detectors to become a reality at the cost of 300k per kt of sensitive mass. During the summer of 2019 a prototype Chips detector, Chips-5, was deployed into the Wentworth 2W disused mine pit in northern Minnesota, 7 mrad off the NuMI beam axis. A novel data acquisition system was introduced using cheap single-board computers and open-source software. This work presents a novel approach to water Cherenkov neutrino detector event reconstruction and classification. Three forms of a Convolutional Neural Network, a type of deep learning algorithm, have been trained to reject cosmic muon events, classify beam events, and estimate neutrino energies, all using only the raw detector event as input. When evaluated on the expected distribution of Chips-5 events, this new approach is shown to be robust and explainable, as well as providing a significant performance increase over the standard likelihood-based reconstruction and simple neural network classification. Promisingly, the performance presented here is comparable to that of the more complex (and expensive) neutrino oscillation experiments within the field.
Adversarial training to improve robustness of adversarial deep neural classifiers in the NOvA experiment
The NOvA experiment is a long-baseline neutrino oscillation experiment consisting of two functionally identical detectors situated off-axis in Fermilab's NuMI neutrino beam. The Near Detector observes the unoscillated beam at Fermilab, while the Far Detector observes the oscillated beam 810 km away. This allows for measurements of the oscillation probabilities for multiple oscillation channels, ν_μ → ν_μ, anti-ν_μ → anti-ν_μ, ν_μ → ν_e and anti-ν_μ → anti-ν_e, leading to measurements of the neutrino oscillation parameters sinθ_23, Δm^2_32 and δ_CP.
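For reference, in the leading two-flavor approximation the muon-neutrino survival probability over a baseline L at energy E is given by the textbook formula (natural units; not specific to this thesis):

```latex
P(\nu_\mu \to \nu_\mu) \approx 1 - \sin^2(2\theta_{23})\,
\sin^2\!\left(\frac{\Delta m^2_{32}\, L}{4E}\right)
```

which shows how comparing Far and Near Detector spectra at fixed L/E constrains θ_23 and Δm²_32.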
These measurements are produced from an extensive analysis of the recorded data. Deep neural networks are deployed at multiple stages of this analysis. The Event CVN network is deployed for the purpose of identifying and classifying the interaction types of selected neutrino events. The effects of systematic uncertainties present in the measurements on the network performance are investigated and are found to cause negligible variations. The robustness of these network trainings is therefore demonstrated, which further justifies their current usage in the analysis beyond the standard validation.
The effects on the network performance for larger systematic alterations to the training datasets beyond the systematic uncertainties, such as an exchange of the neutrino event generators, are investigated. The differences in network performance corresponding to the introduced variations are found to be minimal.
Domain adaptation techniques are implemented in the AdCVN framework. These methods are deployed to improve the Event CVN robustness in scenarios with systematic variations in the underlying data.
Neural networks and quantum many-body physics: exploring reciprocal benefits.
One of the main reasons why the physics of quantum many-body systems is hard lies in the curse of dimensionality: the number of states of such systems increases exponentially with the number of degrees of freedom involved. As a result, computations for realistic systems become intractable, and even numerical methods are limited to comparably small system sizes. Many efforts in modern physics research are therefore concerned with finding efficient representations of quantum states and clever approximation schemes that would allow physicists to characterize systems of interest.
Meanwhile, Deep Learning (DL) has solved many non-scientific problems that had been inaccessible to conventional methods for a similar reason. The concept underlying DL is to extract knowledge from data by identifying patterns and regularities. The remarkable success of DL has excited many physicists about the prospect of leveraging its power to solve intractable problems in physics. At the same time, DL has turned out to be an interesting complex many-body problem in itself. In contrast to its widespread empirical applications, the theoretical foundation of DL is strongly underdeveloped. In particular, as long as its decision-making process and the interpretability of its results remain opaque, DL cannot claim the status of a scientific tool.
In this thesis, I explore the interface between DL and quantum many-body physics, and investigate DL both as a tool and as a subject of study. The first project presented here is a theory-based study of a fundamental open question about the role of width and the number of parameters in deep neural networks. In this work, we consider a DL setup for the image recognition task on standard benchmarking datasets, and we combine controlled experiments with a theoretical analysis, including analytical calculations for a toy model. The other three works focus on the application of Restricted Boltzmann Machines (RBMs) as generative models for the task of wavefunction reconstruction from measurement data on a quantum many-body system. First, we implement this approach as a software package, making it available as a tool for experimentalists. Following the idea that physics problems can be used to characterize DL tools, we then use our extensive knowledge of this setup to conduct a systematic study of how the RBM complexity scales with the complexity of the physical system. Finally, in a follow-up study we focus on the effects of parameter-pruning techniques on the RBM and its scaling behavior.
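The RBM ansatz used in such wavefunction-reconstruction work has a closed form once the hidden units are traced out: for visible spins σ, the unnormalized amplitude is ψ(σ) = exp(Σ_i a_i σ_i) Π_j 2 cosh(b_j + Σ_i W_ij σ_i). A minimal sketch (parameter names are mine, not from the thesis):

```python
import numpy as np

def rbm_amplitude(sigma, a, b, W):
    # Unnormalized RBM wavefunction amplitude for spins sigma in {-1, +1}^n.
    # The m hidden units are traced out analytically, leaving a product
    # of 2*cosh terms, one per hidden unit.
    theta = b + W.T @ sigma          # effective field on each hidden unit
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))
```

With all parameters zero the amplitude is flat over configurations (2^m for m hidden units); training adjusts a, b, and W so that |ψ(σ)|² matches the measured outcome distribution.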
Learning Generalizable Visual Patterns Without Human Supervision
Owing to the existence of large labeled datasets, Deep Convolutional Neural Networks have ushered in a renaissance in computer vision. However, almost all of the visual data we generate daily - several human lifetimes' worth of it - remains unlabeled and thus out of reach of today's dominant supervised learning paradigm. This thesis focuses on techniques that steer deep models towards learning generalizable visual patterns without human supervision. Our primary tool in this endeavor is the design of Self-Supervised Learning tasks, i.e., pretext tasks whose labels do not involve human labor. Besides enabling learning from large amounts of unlabeled data, we demonstrate how self-supervision can capture relevant patterns that supervised learning largely misses. For example, we design learning tasks that learn deep representations capturing shape from images, motion from video, and 3D pose features from multi-view data. Notably, the design of these tasks follows a common principle: the recognition of data transformations. The strong performance of the learned representations on downstream vision tasks such as classification, segmentation, action recognition, or pose estimation validates this pretext-task design.
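One concrete instance of the "recognition of data transformations" principle is rotation prediction, where the rotation applied to an image serves as a free label. A minimal sketch of the data pipeline (the function name is mine; the thesis's own tasks differ):

```python
import numpy as np

def rotation_pretext_batch(images, rng):
    # Self-supervised pretext task: rotate each image by a random
    # multiple of 90 degrees; the rotation index (0-3) becomes the
    # label, obtained without any human annotation.
    ks = rng.integers(0, 4, size=len(images))
    rotated = np.stack([np.rot90(img, k) for img, k in zip(images, ks)])
    return rotated, ks
```

A network trained to predict the rotation index from the rotated image must learn shape and orientation cues, which is what makes the learned representation useful downstream.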
This thesis also explores the use of Generative Adversarial Networks (GANs) for unsupervised representation learning. Besides leveraging generative adversarial learning to define image transformation for self-supervised learning tasks, we also address training instabilities of GANs through the use of noise.
While unsupervised techniques can significantly reduce the burden of supervision, in the end we still rely on some annotated examples to fine-tune learned representations towards a target task. To improve learning from scarce or noisy labels, we describe a supervised learning algorithm with improved generalization in these challenging settings.
Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems
Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in the natural sciences. Today, AI has started to advance the natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science, namely AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from the subatomic (wavefunctions and electron density) and atomic (molecules, proteins, materials, and interactions) to the macro (fluids, climate, and subsurface) scales, and they form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems with deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified, and we hope this initial effort may trigger more community interest and efforts to further advance AI4Science.
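A common way to capture physics first principles such as rotational and translational symmetry is to build the model on invariant inputs: featurizing only pairwise distances makes any readout exactly invariant to rigid motions and atom permutations. A toy sketch (the exponential readout stands in for a learned function; all names are mine):

```python
import numpy as np

def invariant_energy(positions):
    # Toy energy model that is invariant to rotations, translations, and
    # permutations of the particles: it depends only on the multiset of
    # pairwise distances, which rigid motions leave unchanged.
    diffs = positions[:, None, :] - positions[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    iu = np.triu_indices(len(positions), k=1)   # each pair counted once
    return float(np.sum(np.exp(-dists[iu])))    # stand-in for a learned readout
```

Equivariant architectures generalize this idea to outputs (forces, fields) that must transform along with the input rather than stay fixed.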
- …