1,526 research outputs found

    Undergraduate Catalog of Studies, 2023-2024


    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable recent shift toward approaches that leverage raw data alone. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. We then focus on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models on many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation. Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.

    Sensing Collectives: Aesthetic and Political Practices Intertwined

    Are aesthetics and politics really two different things? The book takes a new look at how they intertwine, by turning from theory to practice. Case studies trace how sensory experiences are created and how collective interests are shaped. They investigate how aesthetics and politics are entangled, both in building and disrupting collective orders, in governance and innovation. The cases range from populist rallies and artistic activism, through alternative lifestyles and consumer culture, to corporate PR and governmental policies. The authors are academics and artists. The result is a new mapping of the intermingling and co-constitution of aesthetics and politics in engagements with collective orders.

    Visual Guidance for Unmanned Aerial Vehicles with Deep Learning

    Unmanned Aerial Vehicles (UAVs) have been widely applied in the military and civilian domains. In recent years, the operation mode of UAVs has been evolving from teleoperation to autonomous flight. To achieve autonomous flight, a reliable guidance system is essential. Since the combination of the Global Positioning System (GPS) and an Inertial Navigation System (INS) cannot sustain autonomous flight in situations where GPS is degraded or unavailable, computer vision has been widely explored as a primary method for UAV guidance. Moreover, GPS does not provide any information to the robot on the presence of obstacles. Stereo cameras have a complex architecture and need a minimum baseline to generate a disparity map. By contrast, monocular cameras are simple and require fewer hardware resources. Benefiting from state-of-the-art Deep Learning (DL) techniques, especially Convolutional Neural Networks (CNNs), a monocular camera is sufficient to extrapolate mid-level visual representations such as depth maps and optical flow (OF) maps from the environment. Therefore, the objective of this thesis is to develop a real-time visual guidance method for UAVs in cluttered environments using a monocular camera and DL. The three major tasks performed in this thesis are investigating the development of DL techniques and monocular depth estimation (MDE), developing real-time CNNs for MDE, and developing visual guidance methods on the basis of the developed MDE system. A comprehensive survey is conducted, which covers Structure from Motion (SfM)-based methods, traditional handcrafted feature-based methods, and state-of-the-art DL-based methods. More importantly, it also investigates the application of MDE in robotics. Based on the survey, two CNNs for MDE are developed. In addition to promising accuracy, these two CNNs run at high frame rates (126 fps and 90 fps respectively) on a single modest-power Graphics Processing Unit (GPU). For the third task, visual guidance for UAVs is first developed on top of the designed MDE networks. To improve the robustness of UAV guidance, OF maps are integrated into the developed visual guidance method. A cross-attention module is applied to fuse the features learned from the depth maps and OF maps. The fused features are then passed through a deep reinforcement learning (DRL) network to generate the policy for guiding the flight of the UAV. Additionally, a simulation framework is developed which integrates AirSim, Unreal Engine and PyTorch. The effectiveness of the developed visual guidance method is validated through extensive experiments in the simulation framework.

    Artificial intelligence for advanced manufacturing quality

    100 p. This thesis addresses the challenge of AI-based image quality control systems applied to the manufacturing industry, aiming to improve this field through the use of advanced techniques for data acquisition and processing, in order to obtain robust, reliable and optimal systems. It presents contributions on the use of complex data acquisition techniques, the application and design of specialised neural networks for defect detection, and the integration and validation of these systems in production processes. It has been developed in the context of several applied research projects that provided practical feedback on the usefulness of the proposed computational advances as well as real-life data for experimental validation.

    Implicit Object Pose Estimation on RGB Images Using Deep Learning Methods

    With the rise of robotic and camera systems and the success of deep learning in computer vision, there is growing interest in precisely determining object positions and orientations. This is crucial for tasks like automated bin picking, where a camera sensor analyzes images or point clouds to guide a robotic arm in grasping objects. Pose recognition has broader applications, such as predicting a car's trajectory in autonomous driving or adapting objects in virtual reality based on the viewer's perspective. This dissertation focuses on RGB-based pose estimation methods that use depth information only for refinement, which is a challenging problem. Recent advances in deep learning have made it possible to predict object poses in RGB images, despite challenges such as object overlap and object symmetries. We introduce two implicit deep learning-based pose estimation methods for RGB images, covering the entire process from data generation to pose selection. Furthermore, theoretical findings on Fourier embeddings are shown to improve the performance of so-called implicit neural representations, which are then successfully utilized for the task of implicit pose estimation.

    Religion, Education, and the ‘East’. Addressing Orientalism and Interculturality in Religious Education Through Japanese and East Asian Religions

    This work addresses the theme of Japanese religions in order to rethink theories and practices pertaining to the field of Religious Education. Through an interdisciplinary framework that combines the study of religions, didactics and intercultural education, this book confronts the case study of Religious Education in England with two ‘challenges’ in order to reveal blind spots, tackle unquestioned assumptions and highlight problematic areas. These ‘challenges’, while focusing primarily on Japanese religions, are addressed within the wider contexts of other East Asian traditions and of the modern historical exchanges with Euro-American societies. As a result, a model for teaching Japanese and other East Asian religions is discussed and proposed in order to fruitfully engage issues such as orientalism, occidentalism, interculturality and critical thinking.

    Mapping Brain Development and Decoding Brain Activity with Diffuse Optical Tomography

    Functional neuroimaging has been used to map brain function as well as decode information from brain activity. However, applications like studying early brain development or enabling augmentative communication in patients with severe motor disabilities have been constrained by extant imaging modalities, which can be challenging to use in young children and entail major tradeoffs between logistics and image quality. Diffuse optical tomography (DOT) is an emerging method combining logistical advantages of optical imaging with enhanced image quality. Here, we developed one of the world’s largest DOT systems for high-performance optical brain imaging in children. From visual cortex activity in adults, we decoded the locations of checkerboard visual stimuli, e.g. localizing a 60-degree wedge rotating through 36 positions with an error of 25.8±24.7 degrees. Using animated movies as more child-friendly stimuli, we mapped reproducible responses to speech and faces with DOT in awake, typically developing 1- to 7-year-old children and adults. We then decoded, with accuracy significantly above chance, which movie a participant was watching or listening to from DOT data. This work lays a valuable foundation for ongoing research with wearable imaging systems and increasingly complex algorithms to map atypical brain development and decode covert semantic information in clinical populations.