
    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor robotic environments. Concretely, it exploits knowledge of the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in planar homography, used to propose regions of interest in which to find objects, and in recursive Bayesian filtering, used to integrate observations over time. The method is evaluated on six virtual indoor environments, covering the detection of nine object classes over a total of ∼7k frames. Results show that it improves recall and F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction (58.8%) of the object categorization entropy compared to a two-stage video object detection baseline, at the cost of a small time overhead (120 ms) and a precision loss (0.92).
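    The abstract's full pipeline is not reproduced here, but the core homography step can be sketched in a few lines: given a planar homography between consecutive frames, a detected bounding box is propagated by warping its corners. This is a minimal numpy sketch under stated assumptions; `propagate_box` and the translation-only `H` are illustrative, not the authors' code.

    ```python
    import numpy as np

    def propagate_box(H, box):
        """Map an axis-aligned box (x1, y1, x2, y2) from frame t to frame t+1
        through a planar homography H, returning the bounding box of the
        warped corners as the new region of interest."""
        x1, y1, x2, y2 = box
        corners = np.array([[x1, y1, 1.0],
                            [x2, y1, 1.0],
                            [x2, y2, 1.0],
                            [x1, y2, 1.0]])
        warped = corners @ H.T                    # apply the homography
        warped = warped[:, :2] / warped[:, 2:3]   # dehomogenize
        return (warped[:, 0].min(), warped[:, 1].min(),
                warped[:, 0].max(), warped[:, 1].max())

    # Pure image-plane translation by (5, -3): the box shifts accordingly.
    H = np.array([[1.0, 0.0, 5.0],
                  [0.0, 1.0, -3.0],
                  [0.0, 0.0, 1.0]])
    roi = propagate_box(H, (10, 10, 50, 40))  # → (15, 7, 55, 37)
    ```

    In the paper's setting, `H` would come from the known camera motion, and the propagated ROI would then be refined by the detector and fused over time with a recursive Bayesian filter.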

    Technology 2003: The Fourth National Technology Transfer Conference and Exposition, volume 2

    Proceedings from symposia of the Technology 2003 Conference and Exposition, Dec. 7-9, 1993, Anaheim, CA, are presented. Volume 2 features papers on artificial intelligence, CAD&E, computer hardware, computer software, information management, photonics, robotics, test and measurement, video and imaging, and virtual reality/simulation.

    Representational Redundancy Reduction Strategies for Efficient Neural Network Architectures for Visual and Language Tasks

    Deep neural networks have transformed a wide variety of domains, including natural language processing, image and video processing, and robotics. However, the computational cost of training and inference with these models is high, and the rise of unsupervised pretraining has allowed ever larger networks to be used to further improve performance. Running these large neural networks in compute-constrained environments such as edge devices is infeasible, and the alternative of doing inference using cloud compute can be exceedingly expensive, with the largest language models needing to be distributed across multiple GPUs. Because of these constraints, size reduction and improved inference speed have been a main focus of neural network research. A wide variety of techniques have been proposed to improve the efficiency of existing neural networks, including pruning, quantization, and knowledge distillation. In addition, there is extensive effort on creating more efficient networks through hand design or through an automated process called neural architecture search. However, there remain key domains with significant room for improvement, as we demonstrate in this thesis. In this thesis we aim to improve the efficiency of deep neural networks in terms of inference latency, model size, and latent representation size. We take an alternative approach to previous research and instead investigate redundant representations in neural networks. Across three domains, text classification, image classification, and generative models, we hypothesize that current neural networks contain representational redundancy, and show that by removing this redundancy we can improve their efficiency.
    For image classification we hypothesize that convolution kernels contain redundancy in the form of unnecessary channel-wise flexibility, and test this by introducing additional weight sharing into the network, preserving or even increasing classification performance while requiring fewer parameters. We show the benefits of this approach on convolution layers on the CIFAR and ImageNet datasets, on both standard models and models explicitly designed to be parameter efficient. For generative models we show it is possible to reduce the size of the model's latent representation while preserving the quality of the generated images, through the unsupervised disentanglement of shape and orientation. To do this we introduce the affine variational autoencoder, a novel training procedure, and demonstrate its effectiveness on generating two-dimensional images as well as three-dimensional voxel representations of objects. Finally, looking at the transformer model, we note that there is a mismatch between the tasks used for pretraining and the downstream tasks models are fine-tuned on, such as text classification. We hypothesize that this results in redundancy in the form of unnecessary spatial information, and remove it through the introduction of learned sequence-length bottlenecks. We aim to create task-specific networks, given a dataset and performance requirements, through the use of a neural architecture search method and learned downsampling. We show that these task-specific networks achieve a superior inference latency and accuracy tradeoff compared to standard models, without requiring additional pretraining.
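    The thesis's exact weight-sharing scheme is not given in the abstract; as a rough illustration of how sharing one spatial kernel across channel pairs trades channel-wise flexibility for parameters, consider this hypothetical variant (all names and the specific factorization are assumptions, not the thesis's method):

    ```python
    import numpy as np

    c_in, c_out, k = 64, 64, 3

    # Standard convolution: one k×k kernel per (input, output) channel pair.
    standard_params = c_in * c_out * k * k           # 64*64*3*3 = 36,864

    # Hypothetical shared-kernel variant: one k×k spatial kernel per output
    # channel, reused across all input channels, plus a scalar mixing weight
    # per (input, output) pair to keep some cross-channel flexibility.
    shared_params = c_out * k * k + c_in * c_out     # 576 + 4,096 = 4,672

    # The shared layer's effective weight tensor can still be materialized
    # in the standard (c_out, c_in, k, k) layout:
    spatial = np.random.randn(c_out, k, k)           # shared spatial kernels
    mix = np.random.randn(c_in, c_out)               # per-pair scalar weights
    effective = np.einsum('io,okl->oikl', mix, spatial)
    ```

    Under this factorization the layer keeps the same input/output shapes as an ordinary convolution while using roughly 8× fewer parameters, which is the general flavor of the redundancy-removal argument made in the thesis.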

    A Survey of Surface Reconstruction from Point Clouds

    The area of surface reconstruction has seen substantial progress in the past two decades. The traditional problem addressed by surface reconstruction is to recover the digital representation of a physical shape that has been scanned, where the scanned data contains a wide variety of defects. While much of the earlier work focused on reconstructing a piecewise-smooth representation of the original shape, recent work has taken on more specialized priors to address significantly challenging data imperfections, where the reconstruction can take on different representations – not necessarily the explicit geometry. We survey the field of surface reconstruction and provide a categorization with respect to priors, data imperfections, and reconstruction output. By considering a holistic view of surface reconstruction, we show a detailed characterization of the field, highlight similarities between diverse reconstruction techniques, and provide directions for future work in surface reconstruction.

    On the power of message passing for learning on graph-structured data

    This thesis proposes novel approaches for machine learning on irregularly structured input data such as graphs, point clouds, and manifolds. Specifically, we break with the regularity restriction of conventional deep learning techniques, and propose solutions for designing, implementing, and scaling up deep end-to-end representation learning on graph-structured data via Graph Neural Networks (GNNs). GNNs capture local graph structure and feature information by following a neural message passing scheme, in which node representations are recursively updated in a trainable and purely local fashion. In this thesis, we demonstrate the generality of message passing through a unified framework suitable for a wide range of operators and learning tasks. Specifically, we analyze the limitations and inherent weaknesses of GNNs and propose efficient solutions to overcome them, both theoretically and in practice, e.g., by conditioning messages via continuous B-spline kernels, by utilizing hierarchical message passing, or by leveraging positional encodings. In addition, we ensure that our proposed methods scale naturally to large input domains. In particular, we propose novel methods to fully eliminate the exponentially increasing dependency of nodes over layers inherent to message passing GNNs. Lastly, we introduce PyTorch Geometric, a deep learning library for implementing and working with graph-based neural network building blocks, built upon PyTorch.
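    The message passing scheme the abstract describes can be sketched without any GNN library: each node aggregates its neighbours' features and combines the result with its own representation through learned maps. This is a minimal numpy sketch with mean aggregation and a ReLU update; the function name, aggregation choice, and the tiny path graph are illustrative, not taken from the thesis or PyTorch Geometric.

    ```python
    import numpy as np

    def message_passing_layer(x, edge_index, w_self, w_neigh):
        """One layer of neural message passing: every node mean-pools the
        features of its in-neighbours, then combines the aggregate with its
        own representation through two linear maps and a ReLU."""
        n = x.shape[0]
        agg = np.zeros_like(x)
        deg = np.zeros(n)
        for src, dst in edge_index:          # messages flow src -> dst
            agg[dst] += x[src]
            deg[dst] += 1
        agg /= np.maximum(deg, 1)[:, None]   # mean aggregation
        return np.maximum(x @ w_self + agg @ w_neigh, 0.0)

    # Tiny undirected path graph 0 - 1 - 2 with 2-dimensional node features.
    x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
    h = message_passing_layer(x, edges, np.eye(2), np.eye(2))
    # → [[1.0, 1.0], [1.0, 1.5], [1.0, 2.0]]
    ```

    The purely local update is what makes the scheme trainable and generalizable: stacking such layers grows each node's receptive field one hop at a time, which is also the source of the layer-wise dependency blow-up the thesis proposes to eliminate.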

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging from the interdisciplinary area between technologies for effective visual features and the human-brain cognition process. Effective visual features are made possible by rapid developments in appropriate sensor equipment, novel filter designs, and viable information-processing architectures, while an understanding of the human-brain cognition process broadens the ways in which computers can perform pattern recognition tasks. The present book is intended to collect representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology, and applications of pattern recognition.

    International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book

    The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions. This book comprises the full conference program. It contains, in particular, the scientific program in survey style as well as with all details, and information on the social program, the venue, special meetings, and more.

    Generalized averaged Gaussian quadrature and applications

    A simple numerical method for constructing the optimal generalized averaged Gaussian quadrature formulas is presented. These formulas exist in many cases in which real positive Gauss-Kronrod formulas do not exist, and can be used as an adequate alternative to estimate the error of a Gaussian rule. We also investigate the conditions under which the optimal averaged Gaussian quadrature formulas and their truncated variants are internal.
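    The averaged formulas themselves are not given in the abstract, but the role they play, estimating the error of a Gauss rule by comparing it against a higher-degree companion rule, can be illustrated with standard Gauss-Legendre quadrature. The following sketch uses a 6-point rule as the stand-in companion; it is an illustration of the error-estimation idea, not the generalized averaged construction of the paper.

    ```python
    import numpy as np

    def gauss_legendre(f, n):
        """n-point Gauss-Legendre approximation of the integral of f on [-1, 1]."""
        nodes, weights = np.polynomial.legendre.leggauss(n)
        return weights @ f(nodes)

    # Estimate the error of the 3-point rule for exp(x) by comparing it
    # against a higher-degree rule (the role an averaged or Gauss-Kronrod
    # formula plays for a plain Gauss rule).
    f = np.exp
    exact = np.e - 1.0 / np.e                   # ∫_{-1}^{1} e^x dx
    g3 = gauss_legendre(f, 3)
    err_est = abs(gauss_legendre(f, 6) - g3)    # computable error estimate
    err_true = abs(exact - g3)                  # actual error, for comparison
    ```

    For smooth integrands like this one, the companion rule is accurate enough that the estimate tracks the true 3-point error closely; the paper's contribution is constructing such companions (averaged formulas) in cases where real positive Gauss-Kronrod rules do not exist.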