9 research outputs found

    Towards a Visipedia: Combining Computer Vision and Communities of Experts

    Get PDF
    Motivated by the idea of a Visipedia, where users can search and explore by image, this thesis presents tools and techniques for empowering expert communities through computer vision. The collective aim of this work is to provide a scalable foundation upon which an application like Visipedia can be built. We conduct experiments using two highly motivated communities, the birding community and the naturalist community, and report results and lessons on how to build the necessary components of a Visipedia. First, we conduct experiments analyzing the behavior of state-of-the-art computer vision classifiers on long tailed datasets. We find poor feature sharing between classes, potentially limiting the applicability of these models and emphasizing the ability to intelligently direct data collection resources. Second, we devise online crowdsourcing algorithms to make dataset collection for binary labels, multiclass labels, keypoints, and mulit-instance bounding boxes faster, cheaper, and more accurate. These methods jointly estimate labels, worker skills, and train computer vision models for these tasks. Experiments show that we can achieve significant cost savings compared to traditional data collection techniques, and that we can produce a more accurate dataset compared to traditional data collection techniques. Third, we present two fine-grained datasets, detail how they were constructed, and analyze the test accuracy of state-of-the-art methods. These datasets are then used to create applications that help users identify species in their photographs: Merlin, an app assisting users in identifying birds species, and iNaturalist, an app that assists users in identifying a broad variety of species. Finally, we present work aimed at reducing the computational burden of large scale classification with the goal of creating an application that allows users to classify tens of thousands of species in real time on their mobile device. As a whole, the lessons learned and the techniques presented in this thesis bring us closer to the realization of a Visipedia.</p

    Interactive computer vision through the Web

    Get PDF
    Computer vision is the computational science aiming at reproducing and improving the ability of human vision to understand its environment. In this thesis, we focus on two fields of computer vision, namely image segmentation and visual odometry and we show the positive impact that interactive Web applications provide on each. The first part of this thesis focuses on image annotation and segmentation. We introduce the image annotation problem and challenges it brings for large, crowdsourced datasets. Many interactions have been explored in the literature to help segmentation algorithms. The most common consist in designating contours, bounding boxes around objects, or interior and exterior scribbles. When crowdsourcing, annotation tasks are delegated to a non-expert public, sometimes on cheaper devices such as tablets. In this context, we conducted a user study showing the advantages of the outlining interaction over scribbles and bounding boxes. Another challenge of crowdsourcing is the distribution medium. While evaluating an interaction in a small user study does not require complex setup, distributing an annotation campaign to thousands of potential users might differ. Thus we describe how the Elm programming language helped us build a reliable image annotation Web application. A highlights tour of its functionalities and architecture is provided, as well as a guide on how to deploy it to crowdsourcing services such as Amazon Mechanical Turk. The application is completely opensource and available online. In the second part of this thesis we present our open-source direct visual odometry library. In that endeavor, we provide an evaluation of other open-source RGB-D camera tracking algorithms and show that our approach performs as well as the currently available alternatives. The visual odometry problem relies on geometry tools and optimization techniques traditionally requiring much processing power to perform at realtime framerates. Since we aspire to run those algorithms directly in the browser, we review past and present technologies enabling high performance computations on the Web. In particular, we detail how to target a new standard called WebAssembly from the C++ and Rust programming languages. Our library has been started from scratch in the Rust programming language, which then allowed us to easily port it to WebAssembly. Thanks to this property, we are able to showcase a visual odometry Web application with multiple types of interactions available. A timeline enables one-dimensional navigation along the video sequence. Pairs of image points can be picked on two 2D thumbnails of the image sequence to realign cameras and correct drifts. Colors are also used to identify parts of the 3D point cloud, selectable to reinitialize camera positions. Combining those interactions enables improvements on the tracking and 3D point reconstruction results

    EG-ICE 2021 Workshop on Intelligent Computing in Engineering

    Get PDF
    The 28th EG-ICE International Workshop 2021 brings together international experts working at the interface between advanced computing and modern engineering challenges. Many engineering tasks require open-world resolutions to support multi-actor collaboration, coping with approximate models, providing effective engineer-computer interaction, search in multi-dimensional solution spaces, accommodating uncertainty, including specialist domain knowledge, performing sensor-data interpretation and dealing with incomplete knowledge. While results from computer science provide much initial support for resolution, adaptation is unavoidable and most importantly, feedback from addressing engineering challenges drives fundamental computer-science research. Competence and knowledge transfer goes both ways

    Fine Art Pattern Extraction and Recognition

    Get PDF
    This is a reprint of articles from the Special Issue published online in the open access journal Journal of Imaging (ISSN 2313-433X) (available at: https://www.mdpi.com/journal/jimaging/special issues/faper2020)

    GEOBIA 2016 : Solutions and Synergies., 14-16 September 2016, University of Twente Faculty of Geo-Information and Earth Observation (ITC): open access e-book

    Get PDF

    EG-ICE 2021 Workshop on Intelligent Computing in Engineering

    Get PDF
    The 28th EG-ICE International Workshop 2021 brings together international experts working at the interface between advanced computing and modern engineering challenges. Many engineering tasks require open-world resolutions to support multi-actor collaboration, coping with approximate models, providing effective engineer-computer interaction, search in multi-dimensional solution spaces, accommodating uncertainty, including specialist domain knowledge, performing sensor-data interpretation and dealing with incomplete knowledge. While results from computer science provide much initial support for resolution, adaptation is unavoidable and most importantly, feedback from addressing engineering challenges drives fundamental computer-science research. Competence and knowledge transfer goes both ways

    Remote Sensing for Land Administration 2.0

    Get PDF
    The reprint “Land Administration 2.0” is an extension of the previous reprint “Remote Sensing for Land Administration”, another Special Issue in Remote Sensing. This reprint unpacks the responsible use and integration of emerging remote sensing techniques into the domain of land administration, including land registration, cadastre, land use planning, land valuation, land taxation, and land development. The title was chosen as “Land Administration 2.0” in reference to both this Special Issue being the second volume on the topic “Land Administration” and the next-generation requirements of land administration including demands for 3D, indoor, underground, real-time, high-accuracy, lower-cost, and interoperable land data and information
    corecore