
    Visual Similarity Using Limited Supervision

    The visual world is a conglomeration of objects, scenes, motion, and much more. As humans, we look at the world through our eyes, but we understand it by using our brains. From a young age, humans learn to recognize objects by association, meaning that we link an object or action to the most similar one in our memory to make sense of it. Within the field of Artificial Intelligence, Computer Vision gives machines the ability to see. While digital cameras provide eyes to the machine, Computer Vision develops its brain. To that end, Deep Learning has emerged as a very successful tool. This method allows machines to learn solutions to problems directly from the data. On the basis of Deep Learning, computers nowadays can also learn to interpret the visual world. However, the process of learning in machines is very different from ours. In Deep Learning, images and videos are grouped into predefined, artificial categories. However, describing a group of objects, or actions, with a single integer (category) disregards most of its characteristics and pair-wise relationships. To circumvent this, we propose to expand the categorical model by using visual similarity, which better mirrors the human approach.

    Deep Learning requires a large set of manually annotated samples, which form the training set. Retrieving training samples is easy given the endless amount of images and videos available on the internet. However, this also requires manual annotations, which are very costly and laborious to obtain and thus a major bottleneck in modern computer vision. In this thesis, we investigate visual similarity methods to solve image and video classification. In particular, we search for a solution where human supervision is marginal. We focus on Zero-Shot Learning (ZSL), where only a subset of categories is manually annotated. After studying existing methods in the field, we identify common limitations and propose methods to tackle them.
    In particular, ZSL image classification is trained using only discriminative supervision, i.e. predefined categories, while ignoring other descriptive characteristics. To tackle this, we propose a new approach to learn shared features, i.e. non-discriminative, thus descriptive characteristics, which improves existing methods by a large margin. However, while ZSL has shown great potential for the task of image classification, for example in the case of face recognition, it has performed poorly for video classification. We identify the reasons for the lack of growth in the field and provide a new, powerful baseline.

    Unfortunately, even if ZSL requires only partially labeled data, it still needs supervision during training. For that reason, we also investigate purely unsupervised methods. A successful paradigm is self-supervision: the model is trained using a surrogate task where supervision is automatically provided. The key to self-supervision is the ability of deep learning to transfer the knowledge learned from one task to a new task. The more similar the two tasks are, the more effective the transfer is. Similar to our work on ZSL, we also studied the common limitations of existing self-supervision approaches and proposed a method to overcome them. To improve self-supervised learning, we propose a policy network which controls the parameters of the surrogate task and is trained through reinforcement learning. Finally, we present a real-life application where utilizing visual similarity with limited supervision provides a better solution compared to existing parametric approaches. We analyze the behavior of motor-impaired rodents during a single repeating action, for which our method provides an objective similarity of behavior, facilitating comparisons across animal subjects and across time during recovery.
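    The zero-shot idea described above — labeling an image of an unseen category by comparing its visual embedding against semantic class embeddings — can be sketched as follows. The embeddings, class names, and helper functions here are illustrative toy values, not the thesis's actual model or data:

    ```python
    import numpy as np

    # Minimal zero-shot classification sketch (illustrative, not the thesis code).
    # A model maps an image to a visual embedding; each class, seen or unseen,
    # has a semantic embedding (e.g. from attributes or word vectors). An image
    # of an unseen class is assigned the class with the most similar embedding.

    def cosine(a, b):
        """Cosine similarity between two vectors."""
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def zero_shot_predict(visual_emb, class_embs):
        """Return the class whose semantic embedding is nearest to the visual one."""
        return max(class_embs, key=lambda name: cosine(visual_emb, class_embs[name]))

    # Toy example: "zebra" was never seen during training, but its semantic
    # description (striped + horse-like) places it nearest to the query.
    class_embs = {
        "horse": np.array([1.0, 0.0, 0.0]),
        "tiger": np.array([0.0, 1.0, 0.0]),
        "zebra": np.array([0.7, 0.7, 0.0]),
    }
    query = np.array([0.6, 0.8, 0.0])
    print(zero_shot_predict(query, class_embs))  # prints "zebra"
    ```

    The key property is that prediction needs no retraining for new classes: adding a category only requires adding its semantic embedding.
    
    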

    Machine Learning for Fluid Mechanics

    The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from field measurements, experiments and large-scale simulations at multiple spatiotemporal scales. Machine learning offers a wealth of techniques to extract information from data that could be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of past history, current developments, and emerging opportunities of machine learning for fluid mechanics. It outlines fundamental machine learning methodologies and discusses their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that considers data as an inherent part of modeling, experimentation, and simulation. Machine learning provides a powerful information processing framework that can enrich, and possibly even transform, current lines of fluid mechanics research and industrial applications.
    Comment: To appear in the Annual Reviews of Fluid Mechanics, 202

    Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

    Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once, and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: Previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state-of-the-art by a wide margin. Our code, evaluation procedure and model weights are available at this http URL
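    The stricter benchmarking idea in this abstract — making the test task genuinely unknown by disallowing class overlap between training and test data — can be sketched with a small filter. This helper and the class names are hypothetical illustrations, not the authors' released evaluation code:

    ```python
    # Sketch of a stricter ZSL evaluation protocol: any test class that also
    # appears in the training vocabulary is removed, so the model is evaluated
    # only on classes it truly never saw during training.

    def truly_unseen_classes(train_classes, test_classes):
        """Keep only test classes absent from the training set (case-insensitive)."""
        train = {c.lower() for c in train_classes}
        return [c for c in test_classes if c.lower() not in train]

    train = ["archery", "bowling", "surfing"]
    test = ["Archery", "juggling", "fencing"]
    print(truly_unseen_classes(train, test))  # prints ['juggling', 'fencing']
    ```

    Under such a protocol, reported accuracy cannot be inflated by classes the model has effectively already learned, which is the domain-shift requirement the abstract argues for.
    
    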