
    On information captured by neural networks: connections with memorization and generalization

    Despite the popularity and success of deep learning, there is limited understanding of when, how, and why neural networks generalize to unseen examples. Since learning can be seen as extracting information from data, we formally study information captured by neural networks during training. Specifically, we start by viewing learning in the presence of noisy labels from an information-theoretic perspective and derive a learning algorithm that limits label noise information in weights. We then define a notion of unique information that an individual sample provides to the training of a deep network, shedding some light on the behavior of neural networks on examples that are atypical, ambiguous, or belong to underrepresented subpopulations. We relate example informativeness to generalization by deriving nonvacuous generalization gap bounds. Finally, by studying knowledge distillation, we highlight the important role of data and label complexity in generalization. Overall, our findings contribute to a deeper understanding of the mechanisms underlying neural network generalization. Comment: PhD thesis

    Neutron scattering studies of heterogeneous catalysis

    Understanding the structural dynamics/evolution of catalysts and the related surface chemistry is essential for establishing structure–catalysis relationships, where spectroscopic and scattering tools play a crucial role. Among many such tools, neutron scattering, though less well known, has a unique power for investigating catalytic phenomena. Since neutrons interact with the nuclei of matter, the neutron–nucleon interaction provides unique information on light elements (mainly hydrogen), neighboring elements, and isotopes, which is complementary to X-ray and photon-based techniques. Neutron vibrational spectroscopy has been the most utilized neutron scattering approach for heterogeneous catalysis research, providing chemical information on surface/bulk species (mostly H-containing) and reaction chemistry. Neutron diffraction and quasielastic neutron scattering can also supply important information on catalyst structures and dynamics of surface species. Other neutron approaches, such as small angle neutron scattering and neutron imaging, have been much less used but still give distinctive catalytic information. This review provides a comprehensive overview of recent advances in neutron scattering investigations of heterogeneous catalysis, focusing on surface adsorbates, reaction mechanisms, and catalyst structural changes revealed by neutron spectroscopy, diffraction, quasielastic neutron scattering, and other neutron techniques. Perspectives are also provided on the challenges and future opportunities in neutron scattering studies of heterogeneous catalysis.

    Beam scanning by liquid-crystal biasing in a modified SIW structure

    A fixed-frequency beam-scanning 1D antenna based on Liquid Crystals (LCs) is designed for application in 2D scanning with lateral alignment. The 2D array environment imposes full decoupling of adjacent 1D antennas, which often conflicts with the LC requirement of DC biasing: the proposed design accommodates both. The LC medium is placed inside a Substrate Integrated Waveguide (SIW) modified to work as a Groove Gap Waveguide, with radiating slots etched on the upper broad wall, so that the structure radiates as a Leaky-Wave Antenna (LWA). This allows effective application of the DC bias voltage needed for tuning the LCs. At the same time, the RF field remains laterally confined, making it possible to lay several antennas in parallel and achieve 2D beam scanning. The design is validated by simulation employing the actual properties of a commercial LC medium.

    Reinforcement learning in large state action spaces

    Reinforcement learning (RL) is a promising framework for training intelligent agents that learn to optimize long-term utility by directly interacting with the environment. Creating RL methods that scale to large state-action spaces is critical to ensuring real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization, and a lack of guarantees about important properties like performance, generalization, and robustness in potentially unseen scenarios. This thesis aims to bridge this gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we present the first results on several different problems, e.g. tensorization of the Bellman equation, which allows exponential sample efficiency gains (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results on observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory). In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.

    Machine learning approach towards predicting turbulent fluid flow using convolutional neural networks

    In this thesis, we present a novel method that uses convolutional neural networks to predict turbulent fluid flow through an array of obstacles. In recent years, machine learning has exploded in popularity due to its ability to create accurate data-driven models and the abundance of available data. In an attempt to understand the characteristics of turbulent fluid flow, we utilise a novel convolutional autoencoder neural network to predict the first ten proper orthogonal decomposition (POD) modes of turbulent fluid flow. We find that the model predicts the first two POD modes well, although with less accuracy for the remaining eight. In addition, we find that the ML-predicted POD modes are accurate enough to be used to reconstruct turbulent flow that adequately captures the large-scale details of the original simulation.

    Application of multi-scale computational techniques to complex materials systems

    The applications of computational materials science are ever-increasing, connecting fields far beyond traditional subfields in materials science. This dissertation demonstrates the broad scope of multi-scale computational techniques by investigating multiple unrelated complex material systems, namely scandate thermionic cathodes and the metallic foam component of micrometeoroid and orbital debris (MMOD) shielding. Sc-containing scandate cathodes have been widely reported to exhibit superior properties compared to previous thermionic cathodes; however, knowledge of their precise operating mechanism remains elusive. Here, quantum mechanical calculations were utilized to map the phase space of stable, highly-faceted and chemically-complex W nanoparticles, accounting for both finite temperature and chemical environment. The precise processing conditions required to form the characteristic W nanoparticle observed experimentally were then distilled. Metallic foams, a central component of MMOD shielding, also represent a highly-complex materials system, albeit at a far larger length scale than W nanoparticles. The non-periodic, randomly-oriented constituent ligaments of metallic foams and similar materials create a significant variability in properties that is generally difficult to model. Rather than homogenizing the material such that its unique characteristic structural features are neglected, here, a stochastic modeling approach is applied that integrates complex geometric structure and utilizes continuum calculations to predict the resulting probabilistic distributions of elastic properties. Though different in many aspects, scandate cathodes and metallic foams are united by complexity that is impractical, even dangerous, to ignore and well-suited to exploration with multi-scale computational methods.

    A Statistical View of Column Subset Selection

    We consider the problem of selecting a small subset of representative variables from a large dataset. In the computer science literature, this dimensionality reduction problem is typically formalized as Column Subset Selection (CSS). Meanwhile, the typical statistical formalization is to find an information-maximizing set of Principal Variables. This paper shows that these two approaches are equivalent, and moreover, both can be viewed as maximum likelihood estimation within a certain semi-parametric model. Using these connections, we show how to efficiently (1) perform CSS using only summary statistics from the original dataset; (2) perform CSS in the presence of missing and/or censored data; and (3) select the subset size for CSS in a hypothesis testing framework.

    Understanding Data Manipulation and How to Leverage it To Improve Generalization

    Augmentations and other transformations of data, either in the input or latent space, are a critical component of modern machine learning systems. While these techniques are widely used in practice and known to provide improved generalization in many cases, it is still unclear how data manipulation impacts learning and generalization. To take a step toward addressing the problem, this thesis focuses on understanding and leveraging data augmentation and alignment for improving machine learning performance and transfer. In the first part of the thesis, we establish a novel theoretical framework to understand how data augmentation (DA) impacts learning in linear regression and classification tasks. The results demonstrate how the spectrum of the augmented, transformed data plays a key role in characterizing the behavior of different augmentation strategies, especially in the overparameterized regime. The tools developed in this aim provide simple guidelines to build new augmentation strategies and a simple framework for comparing the generalization of different types of DA. In the second part of the thesis, we demonstrate how latent data alignment can be used to tackle the domain transfer problem, where training and testing datasets vary in distribution. Our algorithm builds upon joint clustering and data-matching through optimal transport, and outperforms the pure matching algorithm baselines on both synthetic and real datasets. Extensions of the generalization analysis and algorithm design for data augmentation and alignment to nonlinear models, such as artificial neural networks and random feature models, are discussed. This thesis provides tools and analyses for better data manipulation design, which benefit both supervised and unsupervised learning schemes. Ph.D.

    Modified Theories of Gravity and Cosmological Applications

    This reprint focuses on recent aspects of gravitational theory and cosmology. It contains subjects of particular interest for modified gravity theories and applications to cosmology; special attention is given to Einstein–Gauss–Bonnet gravity, f(R) gravity, anisotropic inflation, extra-dimension theories of gravity, black holes, dark energy, Palatini gravity, anisotropic spacetime, Einstein–Finsler gravity, off-diagonal cosmological solutions, Hawking temperature, and scalar-tensor-vector theories.