224 research outputs found

    Towards Closing the Energy Gap Between HOG and CNN Features for Embedded Vision

    Computer vision enables a wide range of applications in robotics/drones, self-driving cars, smart Internet of Things, and portable/wearable electronics. For many of these applications, local embedded processing is preferred due to privacy and/or latency concerns. Accordingly, energy-efficient embedded vision hardware delivering real-time and robust performance is crucial. While deep learning is gaining popularity in several computer vision algorithms, it consumes significantly more energy than traditional hand-crafted approaches. In this paper, we provide an in-depth analysis of the computation, energy and accuracy trade-offs between learned features such as deep Convolutional Neural Networks (CNN) and hand-crafted features such as Histogram of Oriented Gradients (HOG). This analysis is supported by measurements from two chips that implement these algorithms. Our goal is to understand the source of the energy discrepancy between the two approaches and to provide insight into the potential areas where CNNs can be improved and eventually approach the energy efficiency of HOG while maintaining their outstanding accuracy.
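
    To make the hand-crafted side of this comparison concrete, the following is a minimal sketch of the orientation histogram at the core of HOG, computed for a single cell. It is an illustration under simplifying assumptions and not the implementation measured in the paper: the function name `hog_cell_histogram`, the 9-bin unsigned layout, and hard (non-interpolated) bin assignment are all choices made here for brevity, and block-level normalisation across neighbouring cells is omitted.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram for one grayscale cell (list of rows of floats).

    Hypothetical helper for illustration: unsigned gradients (0-180 degrees),
    hard bin assignment, L2 normalisation of the single cell.
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    # Central differences need a one-pixel border, so only interior pixels vote.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # Each pixel votes for its orientation bin, weighted by magnitude.
            hist[int(angle // bin_width) % n_bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]


# A horizontal intensity ramp: every gradient points along the x axis.
cell = [[float(x) for x in range(8)] for _ in range(8)]
hist = hog_cell_histogram(cell)
```

    For this ramp all gradient energy falls into the first bin. The per-pixel work is a handful of subtractions, one magnitude and one angle computation, which gives a rough sense of why such features are cheap compared with the many multiply-accumulate operations of a deep CNN.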

    Sistema de detección y clasificación de peces utilizando visión computacional (Fish detection and classification system using computer vision)

    The management of hydrobiological resources involves both an ecological aspect, through the balance of the ecosystem, and an economic aspect, through the control of the quantity and quality of the fishery resources produced in Peru. Currently, work related to this management is carried out by private companies and state entities such as Imarpe. Their mission is to protect the quality of the resources that reach the homes of millions of Peruvians. This research aims to develop a system for the detection, classification and, finally, measurement of various species of fish, using computer vision techniques such as the SURF algorithm and convolutional neural networks. The tests, which used two fish species, showed that identification reaches an accuracy of 90% and classification reaches an accuracy of 80%. These values are achieved under certain conditions that are discussed in the article.

    Improving the accuracy of weed species detection for robotic weed control in complex real-time environments

    Alex Olsen applied deep learning and machine vision to improve the accuracy of weed species detection in complex real-time environments. His robotic weed control prototype, AutoWeed, presents a new, efficient tool for weed management in crop and pasture and has led to the launch of a startup agricultural technology company.

    Recognising and localising human actions

    Human action recognition in challenging video data is becoming an increasingly important research area. Given the growing number of cameras and robots pointing their lenses at humans, the need for automatic recognition of human actions arises, promising Google-style video search and automatic video summarisation/description. Furthermore, for any autonomous robotic system to interact with humans, it must first be able to understand and quickly react to human actions. Although the best action classification methods aggregate features from the entire video clip in which the action unfolds, this global representation may include irrelevant scene context and movements which are shared amongst multiple action classes. For example, a waving action may be performed whilst walking; however, if the walking movement appears in distinct action classes, then it should not be included in training a waving movement classifier. For this reason, we propose an action classification framework in which more discriminative action subvolumes are learned in a weakly supervised setting, owing to the difficulty of manually labelling massive video datasets. The learned models are used to simultaneously classify video clips and to localise actions to a given space-time subvolume. Each subvolume is cast as a bag-of-features (BoF) instance in a multiple-instance-learning framework, which in turn is used to learn its class membership. We demonstrate quantitatively that even with single fixed-sized subvolumes, the classification performance of our proposed algorithm is superior to our BoF baseline on the majority of performance measures, and shows promise for space-time action localisation on the most challenging video datasets. Exploiting spatio-temporal structure in the video should also improve results, just as deformable part models have proven highly successful in object recognition.
    However, whereas objects have clear boundaries, which means we can easily define a ground truth for initialisation, 3D space-time actions are inherently ambiguous and expensive to annotate in large datasets. Thus, it is desirable to adapt pictorial star models to action datasets without location annotation, and to features invariant to changes in pose such as bag-of-features and Fisher vectors, rather than low-level HoG. Thus, we propose local deformable spatial bag-of-features (LDSBoF) in which local discriminative regions are split into a fixed grid of parts that are allowed to deform in both space and time at test-time. In our experimental evaluation we demonstrate that by using local, deformable space-time action parts, we are able to achieve very competitive classification performance, whilst being able to localise actions even in the most challenging video datasets. A recent trend in action recognition is towards larger and more challenging datasets, an increasing number of action classes and larger visual vocabularies. For the global classification of human action video clips, the bag-of-visual-words pipeline is currently the best performing. However, the strategies chosen to sample features and construct a visual vocabulary are critical to performance, in fact often dominating performance. Thus, we provide a critical evaluation of various approaches to building a vocabulary and show that good practices do have a significant impact. By subsampling and partitioning features strategically, we are able to achieve state-of-the-art results on 5 major action recognition datasets using relatively small visual vocabularies. Another promising approach to recognising human actions first encodes the action sequence via a generative dynamical model. However, using classical distances for their classification does not necessarily deliver good results.
    Therefore we propose a general framework for learning distance functions between dynamical models, given a training set of labelled videos. The optimal distance function is selected among a family of 'pullback' ones, induced by a parametrised mapping of the space of models. We focus here on hidden Markov models and their model space, and show how pullback distance learning greatly improves action recognition performance with respect to base distances. Finally, the action classification systems that use a single global representation for each video clip are tailored for offline batch classification benchmarks. For human-robot interaction, however, current systems fall short, either because they can only detect one human action per video frame, or because they assume the video is available ahead of time. In this work we propose an online human action detection system that can incrementally detect multiple concurrent space-time actions. In this way, it becomes possible to learn new action classes on-the-fly, allowing multiple people to actively teach and interact with a robot.
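
    The idea of treating each subvolume as a bag-of-features instance inside a multiple-instance-learning setting can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis's method: the helper names (`nearest_codeword`, `bof_histogram`, `score_video`), the two-word codebook, the hand-set linear weights, and the choice of max-pooling over instances (a common multiple-instance formulation) are all hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest (squared Euclidean) to descriptor."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bof_histogram(descriptors, codebook):
    """Normalised visual-word histogram for one subvolume's local descriptors."""
    hist = [0.0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_codeword(descriptor, codebook)] += 1.0
    total = sum(hist) or 1.0
    return [v / total for v in hist]


def score_video(subvolumes, codebook, weights, bias=0.0):
    """Score every subvolume with a linear model and max-pool over them.

    Returns (video score, index of the best subvolume). The index is what
    gives a coarse localisation without any location annotation.
    """
    scores = []
    for descriptors in subvolumes:
        hist = bof_histogram(descriptors, codebook)
        scores.append(sum(w * h for w, h in zip(weights, hist)) + bias)
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


# Toy data: 2-D descriptors, a two-word codebook, weights favouring word 1.
codebook = [(0.0, 0.0), (1.0, 1.0)]
weights = [-1.0, 1.0]
subvolumes = [
    [(0.1, 0.0), (0.0, 0.2)],               # background-like subvolume
    [(0.9, 1.0), (1.1, 0.8), (0.1, 0.1)],   # mostly action-like descriptors
]
video_score, best_index = score_video(subvolumes, codebook, weights)
```

    Here the second subvolume wins, so the clip is classified by its most discriminative region and that region doubles as the localisation. In a weakly supervised setting only the clip-level label would be available during training, and the weights would be learned rather than set by hand.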

    Enabling the Development and Implementation of Digital Twins : Proceedings of the 20th International Conference on Construction Applications of Virtual Reality

    Welcome to the 20th International Conference on Construction Applications of Virtual Reality (CONVR 2020). This year we are meeting online due to the current Coronavirus pandemic. The overarching theme for CONVR 2020 is "Enabling the development and implementation of Digital Twins". CONVR is one of the world-leading conferences in the areas of virtual reality, augmented reality and building information modelling. Each year, more than 100 participants from all around the globe meet to discuss and exchange the latest developments and applications of virtual technologies in the architectural, engineering, construction and operation industry (AECO). The conference is also known for having a unique blend of participants from both academia and industry. This year, with all the difficulties of replicating a real face-to-face meeting, we are carefully planning the conference to ensure that all participants have a perfect experience. We have a group of leading keynote speakers from industry and academia who are covering up-to-date hot topics and are enthusiastic and keen to share their knowledge with you. CONVR participants are very loyal to the conference, and many have attended most of the last eighteen editions. This year we are welcoming numerous first-timers and we aim to help them make the most of the conference by introducing them to other participants.

    15th SC@RUG 2018 proceedings 2017-2018

