
    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process: it is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable recent shift toward approaches that leverage only raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. We then focus on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models on many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalises to real-world transformations. This begins to explain the differing empirical performance achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with the downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.
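    As a concrete illustration of the spatial versus appearance-based augmentation split discussed above, here is a minimal Python sketch using torchvision. The two transform families and their parameters are illustrative assumptions drawn from common contrastive-learning practice; the thesis's actual augmentation policies may differ.

        # Illustrative split of contrastive augmentations into a spatial
        # family and an appearance family (parameters are assumptions).
        import torchvision.transforms as T

        # Spatial augmentations: invariance to cropping, scale, and flips.
        spatial = T.Compose([
            T.RandomResizedCrop(224, scale=(0.2, 1.0)),
            T.RandomHorizontalFlip(),
        ])

        # Appearance augmentations: invariance to colour and blur.
        appearance = T.Compose([
            T.RandomApply([T.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
            T.RandomGrayscale(p=0.2),
            T.GaussianBlur(kernel_size=23),
        ])

        full = T.Compose([spatial, appearance, T.ToTensor()])

    Per the trade-off described in the abstract, a contrastive learner trained with only one of these families should transfer best to downstream tasks whose nuisance variation matches that family.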

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Learning Control by Iterative Inversion

    We propose iterative inversion -- an algorithm for learning an inverse function without input-output pairs, using only samples from the desired output distribution and access to the forward function. The key challenge is a distribution shift between the desired outputs and the outputs of an initial random guess, and we prove that iterative inversion can steer the learning correctly, under rather strict conditions on the function. We apply iterative inversion to learn control. Our input is a set of demonstrations of desired behavior, given as video embeddings of trajectories (without actions), and our method iteratively learns to imitate trajectories generated by the current policy, perturbed by random exploration noise. Our approach does not require rewards, and only employs supervised learning, which can be easily scaled to use state-of-the-art trajectory embedding techniques and policy representations. Indeed, with a VQ-VAE embedding and a transformer-based policy, we demonstrate non-trivial continuous control on several tasks. Further, we report improved performance on imitating diverse behaviors compared to reward-based methods. Comment: ICML 2023. Videos available at https://sites.google.com/view/iter-inve
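    The core loop described above is simple enough to sketch numerically. Below is a hedged, self-contained toy in Python/NumPy: the forward function f, the linear inverse model, and all names are illustrative assumptions, not the paper's actual control setup (which uses video embeddings and a transformer policy).

        # Toy iterative inversion: learn g ~ f^{-1} from samples of the
        # desired output distribution plus black-box access to f.
        import numpy as np

        rng = np.random.default_rng(0)

        def f(x):                       # forward function (black box)
            return np.tanh(x) + 0.5 * x

        y_desired = rng.normal(0.0, 1.0, size=(1000, 1))  # samples from D

        w, b = 1.0, 0.0                 # g(y) = w*y + b, refit each round
        for t in range(50):
            x = w * y_desired + b                           # current guess
            x_explore = x + 0.1 * rng.normal(size=x.shape)  # exploration noise
            y_reached = f(x_explore)                        # roll forward model
            # Supervised step: fit g on (output reached, input used).
            A = np.hstack([y_reached, np.ones_like(y_reached)])
            w, b = np.linalg.lstsq(A, x_explore, rcond=None)[0].ravel()

        print("inversion error:", np.abs(f(w * y_desired + b) - y_desired).mean())

    Each iteration fits the inverse on pairs generated by the current guess plus noise, so the distribution of reached outputs drifts toward the desired output distribution, mirroring the distribution-shift dynamic the abstract describes.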

    Unsupervised representation learning with recognition-parametrised probabilistic models

    We introduce a new approach to probabilistic unsupervised learning based on the recognition-parametrised model (RPM): a normalised semiparametric hypothesis class for joint distributions over observed and latent variables. Under the key assumption that observations are conditionally independent given latents, the RPM combines parametric prior and observation-conditioned latent distributions with non-parametric observation marginals. This approach leads to a flexible learnt recognition model capturing latent dependence between observations, without the need for an explicit, parametric generative model. The RPM admits exact maximum-likelihood learning for discrete latents, even for powerful neural-network-based recognition. We develop effective approximations applicable in the continuous-latent case. Experiments demonstrate the effectiveness of the RPM on high-dimensional data, learning image classification from weak indirect supervision; direct image-level latent Dirichlet allocation; and recognition-parametrised Gaussian process factor analysis (RP-GPFA) applied to multi-factorial spatiotemporal datasets. The RPM provides a powerful framework to discover meaningful latent structure underlying observational data, a function critical to both animal and artificial intelligence.
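    In our notation (an assumption; the paper's conventions may differ), the factorisation the abstract describes follows from conditional independence plus Bayes' rule applied to each observation factor, with parametric recognition factors $f_{\theta_j}$ and non-parametric marginals $p_{0,j}$:

        \[
        p_\theta(x_{1:J}, z) \;=\; p_\theta(z) \prod_{j=1}^{J} p(x_j \mid z)
        \;=\; p_\theta(z) \prod_{j=1}^{J} \frac{f_{\theta_j}(z \mid x_j)\, p_{0,j}(x_j)}{F_{\theta_j}(z)},
        \qquad
        F_{\theta_j}(z) \;=\; \int f_{\theta_j}(z \mid x_j)\, p_{0,j}(x_j)\, \mathrm{d}x_j,
        \]

    with each $p_{0,j}$ taken as the empirical distribution of the $j$-th observation, so that the marginals stay non-parametric while the prior and recognition factors are learnt parametrically.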

    On information captured by neural networks: connections with memorization and generalization

    Despite the popularity and success of deep learning, there is limited understanding of when, how, and why neural networks generalize to unseen examples. Since learning can be seen as extracting information from data, we formally study information captured by neural networks during training. Specifically, we start by viewing learning in the presence of noisy labels from an information-theoretic perspective and derive a learning algorithm that limits label noise information in weights. We then define a notion of unique information that an individual sample provides to the training of a deep network, shedding some light on the behavior of neural networks on examples that are atypical, ambiguous, or belong to underrepresented subpopulations. We relate example informativeness to generalization by deriving nonvacuous generalization gap bounds. Finally, by studying knowledge distillation, we highlight the important role of data and label complexity in generalization. Overall, our findings contribute to a deeper understanding of the mechanisms underlying neural network generalization. Comment: PhD thesis
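    For orientation, one representative bound of the information-theoretic kind alluded to above is the Xu-Raginsky bound (our choice of illustration; the thesis's own bounds differ in form): for a loss that is $\sigma$-subgaussian, the weights $W$ learnt from a training sample $S$ of size $n$ satisfy

        \[
        \bigl|\,\mathbb{E}\!\left[\mathrm{gen}(S, W)\right]\bigr| \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(S; W)},
        \]

    so limiting the information the weights capture about the data (for instance, about noisy labels) directly controls the expected generalization gap.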

    Expanding measures: Random walks and rigidity on homogeneous spaces

    Let $G$ be a real Lie group, $\Lambda < G$ a lattice, and $H < G$ a connected semisimple subgroup without compact factors and with finite center. We define the notion of $H$-expanding measures $\mu$ on $H$ and, applying recent work of Eskin-Lindenstrauss, prove that $\mu$-stationary probability measures on $G/\Lambda$ are homogeneous. Transferring a construction by Benoist-Quint and drawing on ideas of Eskin-Mirzakhani-Mohammadi, we construct Lyapunov/Margulis functions to show that $H$-expanding random walks on $G/\Lambda$ satisfy a recurrence condition and that homogeneous subspaces are repelling. Combined with a countability result, this allows us to prove equidistribution of trajectories in $G/\Lambda$ for $H$-expanding random walks and to obtain orbit closure descriptions. Finally, elaborating on an idea of Simmons-Weiss, we deduce Birkhoff genericity of a class of measures with respect to some diagonal flows and extend their applications to Diophantine approximation on similarity fractals to a non-conformal and weighted setting. Comment: 63 pages; revised the presentation of the proof of Corollary 1.2 and made other small changes and corrections. Accepted for publication by Forum of Mathematics, Sigma
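    For readers outside homogeneous dynamics, the stationarity notion used above is the standard one (notation ours): a probability measure $\nu$ on $G/\Lambda$ is $\mu$-stationary when it is invariant on average under the random walk driven by $\mu$,

        \[
        \mu * \nu \;:=\; \int_{H} g_{*}\nu \;\mathrm{d}\mu(g) \;=\; \nu,
        \]

    where $g_{*}\nu$ denotes the pushforward of $\nu$ under the translation action of $g$.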

    Transition 2.0: Re-establishing Constitutional Democracy in EU Member States

    The central question of Transition 2.0 is this: what (and how) may a new government do to re-establish constitutional democracy, as well as repair membership within the European Union, without breaching the European rule of law? This volume demonstrates that EU law and international commitments impose constraints but also offer tools and assistance for facilitating the way back after rule-of-law and democratic backsliding. The various contributions explore the constitutional, legal, and social framework of 'Transition 2.0'.

    Signal compaction using polynomial EVD for spherical array processing with applications

    Multi-channel signals captured by spatially separated sensors often contain a high level of data redundancy. A compact signal representation enables more efficient storage and processing, which has been exploited for data compression, noise reduction, and speech and image coding. This paper focuses on the compact representation of speech signals acquired by spherical microphone arrays. A polynomial matrix eigenvalue decomposition (PEVD) can spatially decorrelate signals over a range of time lags and is known to achieve optimum multi-channel data compaction. However, the complexity of PEVD algorithms scales at best cubically with the number of channels, e.g., the number of microphones in a spherical array used for processing. In contrast, the spherical harmonic transform (SHT) provides a compact spatial representation of the 3-dimensional sound field measured by spherical microphone arrays, referred to as eigenbeam signals, at a cost that rises only quadratically with the number of microphones. Yet, the SHT's spatially orthogonal basis functions cannot completely decorrelate sound field components over a range of time lags. In this work, we propose to exploit the compact representation offered by the SHT to reduce the number of channels used for subsequent PEVD processing. In the proposed framework for signal representation, we show that the diagonality factor improves by up to 7 dB over the microphone signal representation, at a significantly lower computational cost. Moreover, when applying this framework to speech enhancement and source separation, the proposed method improves short-time objective intelligibility (STOI) and source-to-distortion ratio (SDR) by up to 0.2 and 20 dB, respectively.
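    A hedged sketch of the channel-reduction step described above, in Python with SciPy: project the M microphone signals onto the (N+1)^2 eigenbeam channels before any PEVD stage. The array geometry, SH order, and random stand-in signals are illustrative assumptions; the paper's array, encoding regularisation, and PEVD algorithm are not reproduced here.

        # Project M microphone signals onto (N+1)^2 spherical-harmonic
        # "eigenbeam" channels, reducing the channel count seen by PEVD.
        import numpy as np
        from scipy.special import sph_harm

        M, N = 32, 3                    # microphones; SH order -> 16 eigenbeams
        rng = np.random.default_rng(0)
        azim = rng.uniform(0, 2 * np.pi, M)        # microphone azimuths
        polar = np.arccos(rng.uniform(-1, 1, M))   # microphone colatitudes

        # SH matrix Y: one row per microphone, one column per (n, m) pair.
        # SciPy's sph_harm takes (order m, degree n, azimuth, colatitude).
        Y = np.column_stack([
            sph_harm(m, n, azim, polar)
            for n in range(N + 1) for m in range(-n, n + 1)
        ])                                         # shape (M, (N+1)^2)

        p = rng.normal(size=(M, 4096))             # stand-in mic signals
        eigenbeams = np.linalg.pinv(Y) @ p         # 16 channels, not 32

        # A cubic-cost PEVD now runs on 16 channels instead of 32,
        # roughly a (32/16)^3 = 8x saving in that stage.
        print(eigenbeams.shape)                    # (16, 4096)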

    Superconductivity in a Topological Lattice Model with Strong Repulsion

    The highly tunable nature of synthetic quantum materials -- both in the solid-state and cold atom contexts -- invites examining which microscopic ingredients aid in the realization of correlated phases of matter such as superconductors. Recent experimental advances in moiré materials suggest that unifying the features of the Fermi-Hubbard model and quantum Hall systems creates a fertile ground for the emergence of such phases. Here, we introduce a minimal 2D lattice model that incorporates exactly these features: time-reversal symmetry, band topology, and strong repulsive interactions. By using infinite cylinder density matrix renormalization group methods (cylinder iDMRG), we investigate the ground state phase diagram of this model. We find that it hosts an interaction-induced quantum spin Hall (QSH) insulator and demonstrate that weakly hole-doping this state gives rise to a superconductor at a finite circumference, with indications that this behavior persists on larger cylinders. At the aforementioned circumference, the superconducting phase is surprisingly robust to perturbations including additional repulsive interactions in the pairing channel. By developing a technique to probe the superconducting gap function in iDMRG, we phenomenologically characterize the superconductor. Namely, we demonstrate that it is formed from the weak pairing of holes atop the QSH insulator. Furthermore, we determine the pairing symmetry of the superconductor, finding it to be $p$-wave -- reminiscent of the unconventional superconductivity reported in experiments on twisted bilayer graphene (TBG). Motivated by this, we elucidate structural similarities and differences between our model and those of TBG in its chiral limit. Finally, to provide a more direct experimental realization, we detail an implementation of our Hamiltonian in a system of cold fermionic alkaline-earth atoms in an optical lattice. Comment: 27 pages (with 8 figures) + 35 pages supplementary (with 14 figures)