Self-supervised learning for transferable representations
Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that leverage only raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. We then focus on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models on many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks move beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalises to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation.
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with the downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
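The augmentation trade-off discussed above arises from the contrastive objective that such learners optimise: two augmented views of the same image are pulled together, all other images pushed apart. As an illustrative sketch (shapes, names, and the temperature value are our assumptions, not taken from the thesis), a minimal numpy implementation of the NT-Xent / InfoNCE loss used by SimCLR-style contrastive learners:

```python
import numpy as np

def ntxent_loss(z1, z2, tau=0.5):
    """Minimal NT-Xent (InfoNCE) contrastive loss.

    z1, z2: (n, d) embeddings of two augmented views of the same
    n images (hypothetical shapes); tau is the temperature.
    """
    z = np.concatenate([z1, z2], axis=0)               # (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity space
    sim = z @ z.T / tau                                # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    n = z1.shape[0]
    # the positive for view i is the other view of the same image
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(2 * n), pos] - logsumexp)
    return loss.mean()
```

Stronger augmentations make the positive pair harder to match, which is exactly the mechanism behind the spatial/appearance invariance trade-off the thesis studies.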
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Learning Control by Iterative Inversion
We propose iterative inversion -- an algorithm for learning an
inverse function without input-output pairs, using only samples from the
desired output distribution and access to the forward function. The key
challenge is a distribution shift between the desired outputs and
the outputs of an initial random guess, and we prove that iterative inversion
can steer the learning correctly, under rather strict conditions on the
function. We apply iterative inversion to learn control. Our input is a set of
function. We apply iterative inversion to learn control. Our input is a set of
demonstrations of desired behavior, given as video embeddings of trajectories
(without actions), and our method iteratively learns to imitate trajectories
generated by the current policy, perturbed by random exploration noise. Our
approach does not require rewards, and only employs supervised learning, which
can be easily scaled to use state-of-the-art trajectory embedding techniques
and policy representations. Indeed, with a VQ-VAE embedding, and a
transformer-based policy, we demonstrate non-trivial continuous control on
several tasks. Further, we report improved performance on imitating diverse
behaviors compared to reward-based methods. Comment: ICML 2023. Videos available at
https://sites.google.com/view/iter-inve
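As a toy illustration of the loop described above (the forward function, the linear policy class, and the noise scale are all hypothetical choices of ours, not the paper's setup), iterative inversion on a scalar system: each round rolls the current inverse guess through the forward function and refits the inverse by supervised regression on the resulting (output, input) pairs:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: 2.0 * x + 1.0          # hypothetical known forward function

# samples from the desired output distribution (no input-output pairs given)
y_des = rng.normal(size=1000)

a, b = rng.normal(), rng.normal()    # random initial inverse guess
for _ in range(20):
    # act with the current inverse, perturbed by exploration noise
    x = a * y_des + b + 0.1 * rng.normal(size=y_des.size)
    y = f(x)                          # observe the forward function's outputs
    # supervised step: refit the inverse on (output, input) pairs
    A = np.stack([y, np.ones_like(y)], axis=1)
    a, b = np.linalg.lstsq(A, x, rcond=None)[0]

# how well does the learned inverse hit the desired outputs?
err = np.abs(f(a * y_des + b) - y_des).max()
```

In the paper's setting the regression step is replaced by training a policy to imitate trajectories generated by the current policy, with video embeddings playing the role of the outputs.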
Unsupervised representation learning with recognition-parametrised probabilistic models
We introduce a new approach to probabilistic unsupervised learning based on the recognition-parametrised model (RPM): a normalised semi-parametric hypothesis class for joint distributions over observed and latent variables. Under the key assumption that observations are conditionally independent given latents, the RPM combines parametric prior and observation-conditioned latent distributions with non-parametric observation marginals. This approach leads to a flexible learnt recognition model capturing latent dependence between observations, without the need for an explicit, parametric generative model. The RPM admits exact maximum-likelihood learning for discrete latents, even for powerful neural-network-based recognition. We develop effective approximations applicable in the continuous-latent case. Experiments demonstrate the effectiveness of the RPM on high-dimensional data, learning image classification from weak indirect supervision; direct image-level latent Dirichlet allocation; and recognition-parametrised Gaussian process factor analysis (RP-GPFA) applied to multi-factorial spatiotemporal datasets. The RPM provides a powerful framework to discover meaningful latent structure underlying observational data, a function critical to both animal and artificial intelligence.
On information captured by neural networks: connections with memorization and generalization
Despite the popularity and success of deep learning, there is limited
understanding of when, how, and why neural networks generalize to unseen
examples. Since learning can be seen as extracting information from data, we
formally study information captured by neural networks during training.
Specifically, we start by viewing learning in the presence of noisy labels from
an information-theoretic perspective and derive a learning algorithm that
limits label noise information in weights. We then define a notion of unique
information that an individual sample provides to the training of a deep
network, shedding some light on the behavior of neural networks on examples
that are atypical, ambiguous, or belong to underrepresented subpopulations. We
relate example informativeness to generalization by deriving nonvacuous
generalization gap bounds. Finally, by studying knowledge distillation, we
highlight the important role of data and label complexity in generalization.
Overall, our findings contribute to a deeper understanding of the mechanisms
underlying neural network generalization. Comment: PhD thesis
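The knowledge-distillation setting mentioned above rests on the standard temperature-softened objective of Hinton et al.: the student matches the teacher's softened predictive distribution. A minimal numpy sketch (the temperature and array shapes are illustrative assumptions of ours, not details from the thesis):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """KL(teacher || student) on temperature-softened predictions,
    scaled by T^2 so gradients stay comparable across temperatures."""
    p_t = softmax(teacher_logits / T)
    p_s = softmax(student_logits / T)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)
    return (T ** 2) * kl.mean()
```

A higher temperature exposes more of the teacher's "dark knowledge" (relative probabilities of wrong classes), which is where data and label complexity enter the picture.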
Expanding measures: Random walks and rigidity on homogeneous spaces
Let G be a real Lie group, Λ ≤ G a lattice and H ≤ G a connected
semisimple subgroup without compact factors and with finite center. We define
the notion of H-expanding measures μ on H and, applying recent work of
Eskin-Lindenstrauss, prove that μ-stationary probability measures on G/Λ
are homogeneous. Transferring a construction by Benoist-Quint and
drawing on ideas of Eskin-Mirzakhani-Mohammadi, we construct Lyapunov/Margulis
functions to show that H-expanding random walks on G/Λ satisfy a
recurrence condition and that homogeneous subspaces are repelling. Combined
with a countability result, this allows us to prove equidistribution of
trajectories in G/Λ for H-expanding random walks and to obtain orbit
closure descriptions. Finally, elaborating on an idea of Simmons-Weiss, we
deduce Birkhoff genericity of a class of measures with respect to some diagonal
flows and extend their applications to Diophantine approximation on similarity
fractals to a non-conformal and weighted setting. Comment: 63 pages; revised the presentation of the proof of Corollary 1.2 and
made other small changes and corrections. Accepted for publication by Forum
of Mathematics, Sigma
Transition 2.0: Re-establishing Constitutional Democracy in EU Member States
The central question of Transition 2.0 is this: what (and how) may a new government do to re-establish constitutional democracy, as well as repair membership within the European Union, without breaching the European rule of law? This volume demonstrates that EU law and international commitments impose constraints but also offer tools and assistance for facilitating the way back after rule of law and democratic backsliding. The various contributions explore the constitutional, legal, and social framework of 'Transition 2.0'.
Signal compaction using polynomial EVD for spherical array processing with applications
Multi-channel signals captured by spatially separated sensors often contain a high level of data redundancy. A compact signal representation enables more efficient storage and processing, which has been exploited for data compression, noise reduction, and speech and image coding. This paper focuses on the compact representation of speech signals acquired by spherical microphone arrays. A polynomial matrix eigenvalue decomposition (PEVD) can spatially decorrelate signals over a range of time lags and is known to achieve optimum multi-channel data compaction. However, the complexity of PEVD algorithms scales at best cubically with the number of channel signals, e.g., the number of microphones in the spherical array used for processing. In contrast, the spherical harmonic transform (SHT) provides a compact spatial representation of the 3-dimensional sound field measured by spherical microphone arrays, referred to as eigenbeam signals, at a cost that rises only quadratically with the number of microphones. Yet, the SHT's spatially orthogonal basis functions cannot completely decorrelate sound field components over a range of time lags. In this work, we propose to exploit the compact representation offered by the SHT to reduce the number of channels used for subsequent PEVD processing. In the proposed framework for signal representation, we show that the diagonality factor improves by up to 7 dB over the microphone signal representation at a significantly lower computational cost. Moreover, when applying this framework to speech enhancement and source separation, the proposed method improves metrics known as short-time objective intelligibility (STOI) and source-to-distortion ratio (SDR) by up to 0.2 and 20 dB, respectively.
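The channel-reduction step described above maps M microphone signals onto (N+1)^2 eigenbeam signals for SHT order N, and the cheaper PEVD then runs on the smaller channel set. A sketch of that projection (microphone count, order, random array geometry, and the least-squares encoding are our illustrative assumptions; a practical encoder would also apply radial/mode-strength equalisation):

```python
import numpy as np
from scipy.special import sph_harm

rng = np.random.default_rng(1)
M, T, order = 32, 400, 3                 # mics, time samples, SHT order
n_beams = (order + 1) ** 2               # 16 eigenbeam channels for order 3

# hypothetical microphone directions on the sphere
azi = rng.uniform(0, 2 * np.pi, M)       # azimuth angles
pol = np.arccos(rng.uniform(-1, 1, M))   # polar angles (uniform on sphere)

# spherical-harmonic matrix Y: one column per (degree n, order m) pair
Y = np.column_stack([
    sph_harm(m, n, azi, pol)             # scipy: order m, degree n
    for n in range(order + 1) for m in range(-n, n + 1)
])                                       # shape (M, n_beams)

p = rng.normal(size=(M, T))              # stand-in microphone signals
eigenbeams = np.linalg.pinv(Y) @ p       # least-squares SHT: (n_beams, T)
```

The subsequent PEVD then operates on 16 rather than 32 channels, which is where the cubic-versus-quadratic cost argument in the abstract pays off.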
Superconductivity in a Topological Lattice Model with Strong Repulsion
The highly tunable nature of synthetic quantum materials -- both in the
solid-state and cold atom contexts -- invites examining which microscopic
ingredients aid in the realization of correlated phases of matter such as
superconductors. Recent experimental advances in moiré materials suggest that
unifying the features of the Fermi-Hubbard model and quantum Hall systems
creates a fertile ground for the emergence of such phases. Here, we introduce a
minimal 2D lattice model that incorporates exactly these features:
time-reversal symmetry, band topology, and strong repulsive interactions. By
using infinite cylinder density matrix renormalization group methods (cylinder
iDMRG), we investigate the ground state phase diagram of this model. We find
that it hosts an interaction-induced quantum spin Hall (QSH) insulator and
demonstrate that weakly hole-doping this state gives rise to a superconductor
at a finite circumference, with indications that this behavior persists on
larger cylinders. At the aforementioned circumference, the superconducting
phase is surprisingly robust to perturbations including additional repulsive
interactions in the pairing channel. By developing a technique to probe the
superconducting gap function in iDMRG, we phenomenologically characterize the
superconductor. Namely, we demonstrate that it is formed from the weak pairing
of holes atop the QSH insulator. Furthermore, we determine the pairing symmetry
of the superconductor, finding it to be -wave -- reminiscent of the
unconventional superconductivity reported in experiments on twisted bilayer
graphene (TBG). Motivated by this, we elucidate structural similarities and
differences between our model and those of TBG in its chiral limit. Finally, to
provide a more direct experimental realization, we detail an implementation of
our Hamiltonian in a system of cold fermionic alkaline-earth atoms in an
optical lattice. Comment: 27 pages (with 8 figures) + 35 pages supplementary (with 14 figures)