Hybrid Bayesian Eigenobjects: Combining Linear Subspace and Deep Network Methods for 3D Robot Vision
We introduce Hybrid Bayesian Eigenobjects (HBEOs), a novel representation for
3D objects designed to allow a robot to jointly estimate the pose, class, and
full 3D geometry of a novel object observed from a single viewpoint in a single
practical framework. By combining both linear subspace methods and deep
convolutional prediction, HBEOs efficiently learn nonlinear object
representations without directly regressing into high-dimensional space. HBEOs
also remove the onerous and generally impractical necessity of input data
voxelization prior to inference. We experimentally evaluate the suitability of
HBEOs to the challenging task of joint pose, class, and shape inference on
novel objects and show that, compared to preceding work, HBEOs offer
dramatically improved performance in all three tasks along with several orders
of magnitude faster runtime performance.
Comment: To appear in the International Conference on Intelligent Robots
(IROS) - Madrid, 201
PyCUDA and PyOpenCL: A Scripting-Based Approach to GPU Run-Time Code Generation
High-performance computing has recently seen a surge of interest in
heterogeneous systems, with an emphasis on modern Graphics Processing Units
(GPUs). These devices offer tremendous potential for performance and efficiency
in important large-scale applications of computational science. However,
exploiting this potential can be challenging, as one must adapt to the
specialized and rapidly evolving computing environment currently exhibited by
GPUs. One way of addressing this challenge is to embrace better techniques and
develop tools tailored to their needs. This article presents one simple
technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL,
two open-source toolkits that support this technique.
In introducing PyCUDA and PyOpenCL, this article proposes the combination of
a dynamic, high-level scripting language with the massive performance of a GPU
as a compelling two-tiered computing platform, potentially offering significant
performance and productivity advantages over conventional single-tier, static
systems. The concept of RTCG is simple and easily implemented using existing,
robust infrastructure. Nonetheless it is powerful enough to support (and
encourage) the creation of custom application-specific tools by its users. The
premise of the paper is illustrated by a wide range of examples where the
technique has been applied with considerable success.
Comment: Submitted to Parallel Computing, Elsevie
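As a rough illustration of the run-time code generation idea described in this abstract, the sketch below builds CUDA C kernel source from a Python string template. It covers only the source-generation step and needs no GPU. The kernel, the template, and the helper name `generate_scale_kernel` are hypothetical and are not taken from the article.

```python
from string import Template

# Hypothetical kernel template: the element type and the scale factor are
# substituted into the CUDA C source text at run time.
KERNEL_TEMPLATE = Template("""
__global__ void scale(${dtype} *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = x[i] * (${dtype})${factor};
}
""")


def generate_scale_kernel(dtype: str, factor: float) -> str:
    """Return CUDA C source specialised for one dtype and one constant."""
    if dtype not in ("float", "double"):
        raise ValueError(f"unsupported dtype: {dtype}")
    return KERNEL_TEMPLATE.substitute(dtype=dtype, factor=repr(float(factor)))


# Generate and inspect a single-precision variant of the kernel.
source = generate_scale_kernel("float", 2.5)
print(source)
```

On a machine with a CUDA-capable GPU and PyCUDA installed, a string like `source` could then be compiled with `pycuda.compiler.SourceModule(source)` and the kernel retrieved with `get_function("scale")`. That step is omitted here so the sketch stays runnable without GPU hardware.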
Hardware accelerated real-time Linux video anonymizer
Master's dissertation in Industrial Electronics and Computers Engineering.
Embedded Systems are currently present in a wide range of everyday equipment, from TV-boxes,
televisions and routers to the indispensable smartphone.
The Linux Operating System, with its “one-size-fits-all” distribution philosophy, has become a
viable alternative, providing extensive support for hardware, debugging techniques, and network communication protocols, among other functionalities, which have become the standard set of requirements in most modern embedded systems.
This operating system is appealing due to its open-source philosophy, which provides the
user with a vast set of software libraries that enable development in a given domain with greater
speed and easier integration of complex software.
Machine Learning algorithms are developed to execute tasks autonomously, i.e., without
human supervision, and are present in the most varied technologies, from the image focus system
on the smartphone to the detection system of the lane limits of an autonomous driving system.
These are algorithms that, when compiled for embedded systems platforms, demand processing effort and resources, such as memory footprint, which in most cases far
exceed the resources available to the system's application. Components that need greater processing power must therefore be implemented in hardware to
ensure that the timing requirements are met.
This dissertation proposes the creation of a video anonymization system that acquires, processes, and manipulates the frames in order to guarantee anonymity, even during transmission.
Its implementation includes Object Detection techniques, making use of the combination
of hardware acceleration technologies: parallelization and execution in specialized hardware.
An implementation is then proposed, restricted both in time and in resource consumption at
hardware and software levels.
Machine Learning for Microcontroller-Class Hardware -- A Review
The advancements in machine learning opened a new opportunity to bring
intelligence to the low-end Internet-of-Things nodes such as microcontrollers.
Conventional machine learning deployment has high memory and compute footprint
hindering their direct deployment on ultra resource-constrained
microcontrollers. This paper highlights the unique requirements of enabling
onboard machine learning for microcontroller class devices. Researchers use a
specialized model development workflow for resource-limited applications to
ensure the compute and latency budget is within the device limits while still
maintaining the desired performance. We characterize a closed-loop widely
applicable workflow of machine learning model development for microcontroller
class devices and show that several classes of applications adopt a specific
instance of it. We present both qualitative and numerical insights into
different stages of model development by showcasing several use cases. Finally,
we identify the open research challenges and unsolved questions demanding
careful considerations moving forward.
Comment: Accepted for publication at IEEE Sensors Journa
Carnegie Mellon Team Tartan: Mission-level Robustness with Rapidly Deployed Autonomous Aerial Vehicles in the MBZIRC 2020
For robotics systems to be used in high risk, real-world situations, they
have to be quickly deployable and robust to environmental changes,
under-performing hardware, and mission subtask failures. Robots are often
designed to consider a single sequence of mission events, with complex
algorithms lowering individual subtask failure rates under some critical
constraints. Our approach is to leverage common techniques in vision and
control and encode robustness into mission structure through outcome monitoring
and recovery strategies, aided by a system infrastructure that allows for quick
mission deployments under tight time constraints and no central communication.
We also detail lessons in rapid field robotics development and testing. Systems
were developed and evaluated through real-robot experiments at an outdoor test
site in Pittsburgh, Pennsylvania, USA, as well as in the 2020 Mohamed Bin Zayed
International Robotics Challenge. All competition trials were completed in
fully autonomous mode without RTK-GPS. Our system led to 4th place in Challenge
2 and 7th place in the Grand Challenge, and achievements like popping five
balloons (Challenge 1), successfully picking and placing a block (Challenge 2),
and dispensing the most water autonomously with a UAV of all teams onto an
outdoor, real fire (Challenge 3).
Comment: 28 pages, 26 figures. To appear in Field Robotics, Special Issues on
MBZIRC 202
A robust patch-based synthesis framework for combining inconsistent images
Current methods for combining different images produce visible artifacts when the sources have very different textures and structures, come from distant viewpoints, or capture dynamic scenes with motion. In this thesis, we propose a patch-based synthesis algorithm to plausibly combine different images that have color, texture, structural, and geometric inconsistencies. For some applications such as cloning and stitching where a gradual blend is required, we present a new method for synthesizing a transition region between two source images, such that inconsistent properties change gradually from one source to the other. We call this process image melding. For gradual blending, we extend the patch-based optimization foundation with three key generalizations: First, we enrich the patch search space with additional geometric and photometric transformations. Second, we integrate image gradients into the patch representation and replace the usual color averaging with a screened Poisson equation solver. Third, we propose a new energy based on mixed L2/L0 norms for colors and gradients that produces a gradual transition between sources without sacrificing texture sharpness. Together, all three generalizations enable patch-based solutions to a broad class of image melding problems involving inconsistent sources: object cloning, stitching challenging panoramas, hole filling from multiple photos, and image harmonization. We also demonstrate another application which requires us to address inconsistencies across the images: high dynamic range (HDR) reconstruction using sequential exposures. In this application, the results will suffer from objectionable artifacts for dynamic scenes if the inconsistencies caused by significant scene motions are not handled properly. In this thesis, we propose a new approach to HDR reconstruction that uses information in all exposures while being more robust to motion than previous techniques.
Our algorithm is based on a novel patch-based energy-minimization formulation that integrates alignment and reconstruction in a joint optimization through an equation we call the HDR image synthesis equation. This allows us to produce an HDR result that is aligned to one of the exposures yet contains information from all of them. These two applications (image melding and high dynamic range reconstruction) show that patch-based methods like the one proposed in this dissertation can address inconsistent images and could open the door to many new image editing applications in the future.
Learning from minimally labeled data with accelerated convolutional neural networks
The main objective of an Artificial Vision Algorithm is to design a mapping function that takes an image as an input and correctly classifies it into one of the user-determined categories. There are several important properties to be satisfied by the mapping function for visual understanding. First, the function should produce good representations of the visual world, which will be able to recognize images independently of pose, scale and illumination. Furthermore, the designed artificial vision system has to learn these representations by itself. Recent studies on Convolutional Neural Networks (ConvNets) produced promising advancements in visual understanding. These networks attain significant performance upgrades by relying on hierarchical structures inspired by biological vision systems. In my research, I work mainly in two areas: 1) how ConvNets can be programmed to learn the optimal mapping function using the minimum amount of labeled data, and 2) how these networks can be accelerated for practical purposes. In this work, algorithms that learn from unlabeled data are studied. A new framework that exploits unlabeled data is proposed. The proposed framework obtains state-of-the-art performance results in different tasks.
Furthermore, this study presents an optimized streaming method for ConvNets’ hardware accelerator on an embedded platform. It is tested on object classification and detection applications using ConvNets. Experimental results indicate high computational efficiency, and significant performance upgrades over all other existing platforms.
A Survey of Techniques for Improving Security of GPUs
The graphics processing unit (GPU), although a powerful performance booster, also
has many security vulnerabilities. Because of these, the GPU can act as a
safe haven for stealthy malware and the weakest 'link' in the security 'chain'.
In this paper, we present a survey of techniques for analyzing and improving
GPU security. We classify the works on key attributes to highlight their
similarities and differences. More than informing users and researchers about
GPU security techniques, this survey aims to increase their awareness about GPU
security vulnerabilities and potential countermeasures.