Meta-learning algorithms and applications
Meta-learning in the broader context concerns how an agent learns about its own learning, allowing it to improve its learning process. Learning how to learn is not only beneficial for humans; it has also shown substantial benefits for improving how machines learn. In the context of machine learning, meta-learning enables models to improve their learning process by selecting suitable meta-parameters that influence learning. For deep learning specifically, the meta-parameters typically describe details of how the model is trained, but they can also include a description of the model itself - the architecture. Meta-learning is usually done with specific goals in mind, for example improving the ability to generalize or to learn new concepts from only a few examples.
Meta-learning can be powerful, but it comes with a key downside: it is often computationally costly. If these costs were alleviated, meta-learning would be more accessible to developers of new artificial intelligence models, allowing them to achieve more ambitious goals or to save resources. As a result, one key focus of our research is on significantly improving the efficiency of meta-learning. We develop two approaches, EvoGrad and PASHA, which significantly improve meta-learning efficiency in two common scenarios. EvoGrad allows us to efficiently optimize the values of a large number of differentiable meta-parameters, while PASHA enables us to efficiently optimize any type of meta-parameter, but fewer in number.
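As a concrete illustration of the differentiable-meta-parameter setting that EvoGrad targets (though not the EvoGrad or PASHA algorithms themselves), the following minimal PyTorch sketch learns a single ridge penalty by descending the validation loss through the closed-form inner solution; all names, sizes and numbers are illustrative assumptions.

```python
# Hypothetical sketch: optimizing one differentiable meta-parameter
# (a ridge penalty) by gradient descent on the validation loss.
import torch

torch.manual_seed(0)
X_tr, y_tr = torch.randn(80, 10), torch.randn(80)   # toy training split
X_va, y_va = torch.randn(40, 10), torch.randn(40)   # toy validation split

log_lam = torch.zeros(1, requires_grad=True)         # meta-parameter (log penalty)
meta_opt = torch.optim.Adam([log_lam], lr=0.1)

for step in range(100):
    lam = log_lam.exp()
    # inner problem solved in closed form, keeping the graph back to lam
    A = X_tr.T @ X_tr + lam * torch.eye(10)
    w = torch.linalg.solve(A, X_tr.T @ y_tr)
    val_loss = ((X_va @ w - y_va) ** 2).mean()        # outer (meta) objective
    meta_opt.zero_grad()
    val_loss.backward()                               # d(val_loss)/d(log_lam)
    meta_opt.step()

print("learned ridge penalty:", log_lam.exp().item())
```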
Meta-learning is a tool that can be applied to solve various problems. Most commonly it is applied to learning new concepts from only a small number of examples (few-shot learning), but other applications exist too. To showcase the practical impact that meta-learning can make in the context of neural networks, we use meta-learning as a novel solution for two selected problems: more accurate uncertainty quantification (calibration) and general-purpose few-shot learning. Both are practically important problems, and with meta-learning approaches we can obtain better solutions than those obtained with existing approaches. Calibration is important for safety-critical applications of neural networks, while general-purpose few-shot learning tests a model's ability to generalize few-shot learning across diverse tasks such as recognition, segmentation and keypoint estimation.
More efficient algorithms, as well as novel applications, enable the field of meta-learning to make a more significant impact on the broader area of deep learning and potentially to solve problems that were previously too challenging. Ultimately, both allow us to better utilize the opportunities that artificial intelligence presents.
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Constraining the anisotropic expansion of the universe with Type Ia supernovae and improving the treatment of selection effects within Bayesian hierarchical models
In this thesis, I aim to apply advanced methods in Bayesian statistical modelling to Type Ia Supernova (SNIa) data to determine tighter constraints on the fiducial Lambda-Cold-Dark-Matter (LCDM) cosmology and to improve the modelling of systematic uncertainties in the data. The body of work covered herein can be broadly classified into two main topics:
I re-examine the contentious question of constraints on anisotropic expansion from SNIa in the light of a novel determination of peculiar velocities, which are crucial for testing isotropy with SNe, out to distances < 200/h Mpc. The Bayesian hierarchical model BAHAMAS is adopted to constrain a dipole in the distance modulus in the context of the LCDM model, and the deceleration parameter in a phenomenological Cosmographic expansion. I find no evidence for anisotropic expansion and place a tight upper bound on the amplitude of a dipole in both the LCDM setting and the Cosmographic expansion approach. Using Bayesian model comparison, I obtain posterior odds in excess of 900:1 (640:1) against a constant-in-redshift dipole for LCDM (the Cosmographic expansion).
One of the modern problems of supernova cosmology is accounting for selection effects caused by Malmquist bias in a principled way. Here, I present a complete formalism for handling selection effects in Type Ia supernova (SNIa) cosmology in the context of Bayesian hierarchical modelling. I demonstrate the method on simulated data sets where selection cuts are made on the apparent magnitude and show that previous results by Rubin et al. (2015) are incorrect and can lead to biased reconstruction of cosmological parameters. I show how this formalism is easily extended to include the Phillips corrections that are used to standardize SNe. The formalism presented exhibits better statistical properties, in terms of bias and mean squared error, than a traditional ad hoc correction and the model of Rubin et al. (2015).
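The selection-effect correction can be illustrated with a toy example that is not the BAHAMAS model: apparent magnitudes are drawn from a Gaussian, a Malmquist-style cut keeps only objects brighter than a threshold, and the likelihood of each selected object is divided by its selection probability. The magnitudes, cut and parameter values below are invented for illustration.

```python
# Toy Malmquist-bias correction: naive mean vs. selection-corrected MLE.
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)
mu_true, sigma, m_cut = 24.0, 0.5, 24.2            # invented magnitudes and cut

m = rng.normal(mu_true, sigma, size=5000)
m_obs = m[m < m_cut]                                # keep only "detected" objects

def neg_loglike(mu):
    # p(m | mu, selected) = N(m; mu, sigma) / P(m < m_cut | mu)
    log_norm = stats.norm.logpdf(m_obs, mu, sigma)
    log_sel = stats.norm.logcdf(m_cut, mu, sigma)
    return -(log_norm - log_sel).sum()

fit = optimize.minimize_scalar(neg_loglike, bounds=(22.0, 26.0), method="bounded")
print("naive mean of selected sample:", m_obs.mean())  # biased toward bright values
print("selection-corrected MLE:", fit.x)               # close to mu_true
```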
Revisiting and modeling power-law distributions in empirical outage data of power systems
The size distributions of planned and forced outages in power systems, and of the restoration times that follow them, have been studied for almost two decades and have drawn great interest because they display heavy tails. The phenomenon has been explained by various threshold models that self-tune to their critical points, but, as many papers point out, the explanations remain intuitive and more empirical data are needed to support the hypotheses. In this paper, the authors analyze outage data collected from various public sources to calculate the outage-energy and outage-duration exponents of possible power-law fits. Temporal thresholds are applied to identify crossovers from initial short-time behavior to power-law tails. We revisit and add to the possible explanations of the uniformity of these exponents. By performing power spectral analyses on the outage event time series and the outage duration time series, we find that, on the one hand, although overwhelmed by white noise, outage events show traits of self-organized criticality (SOC), which may be modeled by a crossover from random percolation to a directed-percolation branching process with dissipation, coupled to a conserved density. On the other hand, in the responses to outages, the heavy tails in the outage duration distributions could be a consequence of the highly optimized tolerance (HOT) mechanism, based on the optimized allocation of maintenance resources.
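For context, tail exponents of this kind are commonly estimated with the continuous maximum-likelihood estimator of Clauset, Shalizi and Newman (2009); the sketch below applies it to synthetic Pareto-tailed values standing in for outage energies, not to the paper's data.

```python
# Minimal power-law tail fit via the continuous MLE: alpha = 1 + n / sum(ln(x/x_min)).
import numpy as np

def fit_power_law_tail(x, x_min):
    """MLE of the tail exponent alpha for x >= x_min, assuming p(x) ~ x^(-alpha)."""
    tail = np.asarray(x, dtype=float)
    tail = tail[tail >= x_min]
    n = tail.size
    alpha = 1.0 + n / np.sum(np.log(tail / x_min))
    stderr = (alpha - 1.0) / np.sqrt(n)              # standard asymptotic error
    return alpha, stderr

rng = np.random.default_rng(1)
# synthetic Pareto-tailed "outage energies" with true exponent alpha = 2.2
sample = 10.0 * (1.0 - rng.random(20000)) ** (-1.0 / (2.2 - 1.0))
print(fit_power_law_tail(sample, x_min=10.0))        # alpha close to 2.2
```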
Differentially Private Synthetic Heavy-tailed Data
The U.S. Census Longitudinal Business Database (LBD) product contains employment and payroll information of all U.S. establishments and firms dating back to 1976 and is an invaluable resource for economic research. However, the sensitive information in the LBD requires confidentiality measures, which the U.S. Census in part addressed by releasing a synthetic version (SynLBD) of the data to protect firms' privacy while ensuring its usability for research activities, but without provable privacy guarantees. In this paper, we propose using the framework of differential privacy (DP), which offers strong provable privacy protection against arbitrary adversaries, to generate synthetic heavy-tailed data with a formal privacy guarantee while preserving high levels of utility. We propose using the K-Norm Gradient Mechanism (KNG) with quantile regression for DP synthetic data generation. The proposed methodology offers the flexibility of the well-known exponential mechanism while adding less noise. We propose implementing KNG in a stepwise and sandwich order, such that new quantile estimation relies on previously sampled quantiles, to more efficiently use the privacy-loss budget. Generating synthetic heavy-tailed data with a formal privacy guarantee while preserving high levels of utility is a challenging problem for data curators and researchers. However, through a simulation study and an application to the Synthetic Longitudinal Business Database, we show that the proposed methods can achieve better data utility relative to the original KNG at the same privacy-loss budget.
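To illustrate the general idea of spending a privacy-loss budget across several quantiles released in a chosen order, the following sketch uses the exponential-mechanism quantile of Smith (2011) as a stand-in for KNG with quantile regression; the "sandwich" bounding of each new quantile by previously released ones is only roughly mimicked, and every bound, parameter and dataset below is an illustrative assumption.

```python
# Stand-in for stepwise DP quantile release (exponential mechanism, not KNG).
import numpy as np

def dp_quantile(x, q, eps, lo, hi, rng):
    """eps-DP release of the q-th quantile of x, output restricted to [lo, hi]."""
    x = np.clip(np.sort(np.asarray(x, dtype=float)), lo, hi)
    pts = np.concatenate(([lo], x, [hi]))          # endpoints of candidate intervals
    n = x.size
    utility = -np.abs(np.arange(n + 1) - q * n)    # rank utility, sensitivity 1
    logw = np.log(np.maximum(np.diff(pts), 1e-12)) + 0.5 * eps * utility
    w = np.exp(logw - logw.max())
    k = rng.choice(n + 1, p=w / w.sum())
    return rng.uniform(pts[k], pts[k + 1])

rng = np.random.default_rng(0)
data = 10.0 * rng.pareto(1.5, size=5000)           # synthetic heavy-tailed values
eps_total, order = 1.0, [0.5, 0.25, 0.75, 0.9]     # median first, then the rest
lo, hi, released = 0.0, 1e6, {}
for q in order:
    # "sandwich" the output range between quantiles already released
    lo_q = max([lo] + [v for p, v in released.items() if p < q])
    hi_q = min([hi] + [v for p, v in released.items() if p > q])
    hi_q = max(hi_q, lo_q)                         # guard against out-of-order releases
    released[q] = dp_quantile(data, q, eps_total / len(order), lo_q, hi_q, rng)
print(released)
```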
Epistemic parity: reproducibility as an evaluation metric for differential privacy
Differential privacy (DP) data synthesizers are increasingly proposed to afford public release of sensitive information, offering theoretical guarantees for privacy (and, in some cases, utility), but limited empirical evidence of utility in practical settings. Utility is typically measured as the error on representative proxy tasks, such as descriptive statistics, multivariate correlations, the accuracy of trained classifiers, or performance over a query workload. Whether these results generalize to practitioners' experience has been questioned in a number of settings, including the U.S. Census. In this paper, we propose an evaluation methodology for synthetic data that avoids assumptions about the representativeness of proxy tasks, instead measuring the likelihood that published conclusions would change had the authors used synthetic data, a condition we call epistemic parity. Our methodology consists of reproducing empirical conclusions of peer-reviewed papers on real, publicly available data, then re-running these experiments a second time on DP synthetic data and comparing the results. We instantiate our methodology over a benchmark of recent peer-reviewed papers that analyze public datasets in the ICPSR social science repository. We model quantitative claims computationally to automate the experimental workflow, and we model qualitative claims by reproducing visualizations and comparing the results manually. We then generate DP synthetic datasets using multiple state-of-the-art mechanisms and estimate the likelihood that these conclusions will hold. We find that, for reasonable privacy regimes, state-of-the-art DP synthesizers are able to achieve high epistemic parity for several papers in our benchmark. However, some papers, and particularly some specific findings, are difficult to reproduce for any of the synthesizers. Given these results, we advocate for a new class of mechanisms that can reorder the priorities for DP data synthesis: favor stronger guarantees for utility (as measured by epistemic parity) and offer privacy protection with a focus on application-specific threat models and risk assessment.
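A drastically simplified version of such a check, assuming the published claim reduces to the sign and significance of one regression coefficient (real claims rarely do), might look as follows; the data, column layout and thresholds are placeholders rather than anything from the benchmark.

```python
# Toy epistemic-parity-style check: does a coefficient's sign and significance
# agree when the analysis is rerun on a synthetic copy of the data?
import numpy as np

def ols_coef_and_t(X, y):
    """OLS estimate and t-statistic for the last regressor's coefficient."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[-1], beta[-1] / np.sqrt(cov[-1, -1])

def same_conclusion(real, synth, t_crit=1.96):
    b_r, t_r = ols_coef_and_t(*real)
    b_s, t_s = ols_coef_and_t(*synth)
    return (np.sign(b_r) == np.sign(b_s)) and ((abs(t_r) > t_crit) == (abs(t_s) > t_crit))

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))
y = 0.3 * x[:, 0] + rng.normal(size=500)
x_syn = x + rng.normal(0.0, 0.5, size=x.shape)     # stand-in for a DP synthetic copy
print(same_conclusion((x, y), (x_syn, y)))
```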
Microwave-shielded ultracold polar molecules
Since the realization of Bose--Einstein condensates and degenerate Fermi gases, ultracold atoms with tunable interactions have become an essential platform for studying quantum many-body phenomena. Notable examples include the realization of the BCS--BEC crossover and the simulation of the Bose/Fermi Hubbard model. Ultracold polar molecules could enrich the quantum gas toolbox with their long-range dipole-dipole interaction, which offers not only new opportunities in many-body physics, such as realizing the topological superfluid and the extended Hubbard model, but also applications in quantum chemistry, quantum computation, and precision measurements. However, the large number of internal degrees of freedom of molecules presents a significant challenge both in cooling them to quantum degeneracy and in controlling their interactions. Unlike an atomic gas, a dense molecular sample suffers from fast collisional losses, preventing the implementation of evaporative cooling and the observation of scattering resonances. In this thesis, we describe how we solved the long-standing issue of collisional losses by microwave shielding, created a degenerate Fermi gas of NaK molecules, and discovered a new type of scattering resonance via which we created the first ultracold tetratomic molecules in the 100-nK regime.
By synchronizing the rotation of polar molecules with a circularly polarized microwave electric field, we equip the molecular sample with a highly tunable intermolecular potential. This not only stabilizes the gas against inelastic collisions but also enables field-linked scattering resonances for precise control over scattering lengths. At long range, the molecules interact via their induced rotating dipole moments. As they approach each other, their orientations realign to produce a repulsive force, thereby mitigating inelastic collisions at close distances. With an elastic-to-inelastic collision ratio of 500, we have achieved evaporative cooling of the molecular gas down to 21 nK and 0.36 times the Fermi temperature, setting a new record for the coldest polar molecular gas to date.
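For orientation, the standard dipole-dipole interaction and its time average for two dipoles rotating synchronously in the plane perpendicular to the quantization axis (the long-range regime described above) can be written as the textbook expressions below, which are not taken from the thesis:

$$
V_{dd}(\mathbf{r}) = \frac{d^2}{4\pi\varepsilon_0 r^3}\bigl(1 - 3\cos^2\theta\bigr),
\qquad
\langle V_{dd}\rangle_{\mathrm{rot}} = -\frac{d^2}{8\pi\varepsilon_0 r^3}\bigl(1 - 3\cos^2\theta\bigr),
$$

where $d$ is the induced dipole moment, $r$ the intermolecular distance and $\theta$ the angle between the intermolecular axis and the rotation axis. The factor $-1/2$ from time averaging makes the long-range interaction attractive for molecules approaching within the rotation plane, while the short-range repulsion responsible for shielding comes from the realignment of the dipoles at close distance described above.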
Thanks to the collisional stability of microwave-shielded molecules, we can load them directly into a magic 3D optical lattice, predominantly into a single layer, achieving a peak filling fraction of 24%. These ultracold molecules, owing to their long lifetimes in the ground state and their long-range dipolar coupling, provide a unique platform to study quantum magnetism. With the achieved high filling fraction, we are prepared to study non-equilibrium spin dynamics such as rotational synchronization and spin squeezing.
We demonstrated that the interaction between microwave-shielded polar molecules is highly tunable via the microwave power, detuning, and polarization. When the interaction potential is deep enough to host field-linked bound states at the collisional threshold, a shape resonance is induced, allowing us to tune the scattering rate by three orders of magnitude. The field-linked resonances enable control over the scattering length in a similar fashion to Feshbach resonances in ultracold atoms, promising the realization of strongly correlated phases such as a dipolar p-wave superfluid. They also pave the way to investigating the interplay between short-range and long-range interactions in novel quantum matter, such as exotic supersolids.
Moreover, through a field-linked resonance, we associated weakly bound tetratomic molecules for the first time in the 100-nK regime, with a phase-space density of 0.04. The transition from a Fermi gas of diatomic molecules to a Bose gas of tetratomic molecules paves the way for the dipolar BCS--BEC crossover.
With microwave-shielded polar molecules, we have realized a quantum gas featuring highly tunable long-range interactions. The technique is universal to polar molecules with a sufficiently large dipole moment and thus offers a general strategy for cooling and manipulating polar molecules and for associating weakly bound ultracold polyatomic molecules. Utilizing the toolbox developed for ultracold atoms, this platform has the potential to unlock an entirely new realm of quantum simulation of many-body physics.