Linux user interface using camera
The goal of this project was to create a fully functional C++ program capable of real-time object detection and mouse cursor control on Linux. Object detection is based on recognizing a desired color and shape in webcam input, in this case a red circle. The main part of the source code was generated with Harpia, an application created specifically for image processing tasks such as object tracking and edge detection. Most of the functions used come from the OpenCV computer vision library. This document covers edge detection methods, color filtering, and smoothing (noise reduction) filters. The program meets its specification: it moves the mouse cursor according to the detected position of the object in the image.
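The color-based part of the approach described above can be sketched as follows. This is a minimal illustration and not the author's Harpia/OpenCV code: it uses plain Python lists instead of webcam frames, the threshold values and the function names `is_red`, `red_centroid` and `to_screen` are assumptions made for the example, and the shape (circle) check is omitted.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def is_red(pixel: Pixel, min_red: int = 150, max_other: int = 80) -> bool:
    """Crude color test; the thresholds are illustrative assumptions."""
    r, g, b = pixel
    return r >= min_red and g <= max_other and b <= max_other


def red_centroid(image: List[List[Pixel]]) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centroid of all red pixels, or None if there are none."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if is_red(pixel):
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count


def to_screen(centroid: Tuple[float, float],
              image_size: Tuple[int, int],
              screen_size: Tuple[int, int]) -> Tuple[int, int]:
    """Scale image coordinates to screen coordinates, clamped to the screen."""
    cx, cy = centroid
    img_w, img_h = image_size
    scr_w, scr_h = screen_size
    sx = min(scr_w - 1, max(0, int(cx / img_w * scr_w)))
    sy = min(scr_h - 1, max(0, int(cy / img_h * scr_h)))
    return sx, sy


# Tiny 4x4 test frame with a 2x2 red blob in the middle.
RED, BLACK = (255, 0, 0), (0, 0, 0)
frame = [[BLACK] * 4 for _ in range(4)]
for yy in (1, 2):
    for xx in (1, 2):
        frame[yy][xx] = RED
```

In a real program the resulting screen position would be handed to whatever cursor-control mechanism the system offers; that step is left out here because the abstract does not say how it was done.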
Perception for detection and grasping
The final publication is available at link.springer.com. This research presents a methodology for detecting the crawler used in the AEROARMS project. The approach uses a two-step progressive strategy, going from rough detection and tracking, for approach maneuvers, to an accurate positioning step based on fiducial markers. Two methods are explained for the first step: one uses an efficient image segmentation approach, and the other uses deep learning techniques to detect the center of the crawler. The fiducial markers are used for precise localization of the crawler, in a similar way as explained in earlier chapters. The methods can run in real time. Peer reviewed. Postprint (author's final draft).
Synthesizing Training Data for Object Detection in Indoor Scenes
Detection of objects in cluttered indoor environments is one of the key
enabling functionalities for service robots. The best performing object
detection approaches in computer vision exploit deep Convolutional Neural
Networks (CNN) to simultaneously detect and categorize the objects of interest
in cluttered scenes. Training of such models typically requires large amounts
of annotated training data which is time consuming and costly to obtain. In
this work we explore the ability of using synthetically generated composite
images for training state-of-the-art object detectors, especially for object
instance detection. We superimpose 2D images of textured object models into
images of real environments at a variety of locations and scales. Our experiments
evaluate different superimposition strategies ranging from purely image-based
blending all the way to depth and semantics informed positioning of the object
models into real scenes. We demonstrate the effectiveness of these object
detector training strategies on two publicly available datasets, the
GMU-Kitchens and the Washington RGB-D Scenes v2. As one observation, augmenting
some hand-labeled training data with synthetic examples carefully composed onto
scenes yields object detectors with comparable performance to using much more
hand-labeled data. Broadly, this work charts new opportunities for training
detectors for new objects by exploiting existing object model repositories in
either a purely automatic fashion or with only a very small number of
human-annotated examples. Comment: Added more experiments and link to project webpage.
An Immersive Telepresence System using RGB-D Sensors and Head Mounted Display
We present a tele-immersive system that enables people to interact with each
other in a virtual world using body gestures in addition to verbal
communication. Beyond the obvious applications, including general online
conversations and gaming, we hypothesize that our proposed system would be
particularly beneficial to education by offering rich visual contents and
interactivity. One distinct feature is the integration of egocentric pose
recognition that allows participants to use their gestures to demonstrate and
manipulate virtual objects simultaneously. This functionality enables the
instructor to effectively and efficiently explain and illustrate complex
concepts or sophisticated problems in an intuitive manner. The highly
interactive and flexible environment can capture and sustain more student
attention than the traditional classroom setting and, thus, delivers a
compelling experience to the students. Our main focus here is to investigate
possible solutions for the system design and implementation and devise
strategies for fast, efficient computation suitable for visual data processing
and network transmission. We describe the technique and experiments in detail
and provide quantitative performance results, demonstrating our system can be
run comfortably and reliably for different application scenarios. Our
preliminary results are promising and demonstrate the potential for more
compelling directions in cyberlearning. Comment: IEEE International Symposium on Multimedia 201
IMPLEMENTATION OF A LOCALIZATION-ORIENTED HRI FOR WALKING ROBOTS IN THE ROBOCUP ENVIRONMENT
This paper presents the design and implementation of a human–robot interface capable of evaluating robot localization performance and maintaining full control of robot behaviors in the RoboCup domain. The system consists of legged robots, behavior modules, an overhead visual tracking system, and a graphical user interface. A human–robot communication framework is designed for executing cooperative and competitive processing tasks between users and robots, using an object-oriented, modular software architecture with an emphasis on operability and functionality. Experimental results based on simulated and real-time information are presented to show the performance of the proposed system.
Robust 3D People Tracking and Positioning System in a Semi-Overlapped Multi-Camera Environment
People positioning and tracking in 3D indoor environments are challenging tasks due to background clutter and occlusions. Current works focus on solving occlusions between people against low-clutter backgrounds, but fail in highly cluttered scenarios, especially when foreground objects occlude people. In this paper, a novel 3D people positioning and tracking system is presented that is robust to both possible occlusion sources: static scene objects and other people. The system relies on a set of multiple cameras with partially overlapping fields of view. Moving regions are segmented independently in each camera stream by means of a new background modeling strategy based on Gabor filters. People detection is carried out on these segmentations through a template-based correlation strategy. Detected people are tracked independently in each camera view by means of a graph-based matching strategy, which estimates the best correspondences between consecutive people segmentations. Finally, 3D tracking and positioning of people is achieved by geometrical consistency analysis over the tracked 2D candidates, using head position (instead of object centroids) to increase robustness to foreground occlusions.
3D Tracking Using Multi-view Based Particle Filters
Visual surveillance and monitoring of indoor environments using multiple cameras has become a field of great activity in computer vision. Typical 3D tracking and positioning systems rely on several independent 2D tracking modules applied to individual camera streams, fused using geometrical relationships across cameras. As 2D tracking systems suffer inherent difficulties due to point-of-view limitations (perceptually similar foreground and background regions causing fragmentation of moving objects, occlusions), 3D tracking based on partially erroneous 2D tracks is likely to fail when handling interactions among multiple people. To overcome this problem, this paper proposes a Bayesian framework for combining 2D low-level cues from multiple cameras directly in the 3D world through 3D Particle Filters. This method makes it possible to estimate the probability of a certain volume being occupied by a moving object, and thus to segment and track multiple people across the monitored area. The proposed method is built on simple, binary 2D moving-region segmentation in each camera, with each segmentation treated as a separate state observation. In addition, the method proves well suited to integrating additional 2D low-level cues to increase system robustness to occlusions: along these lines, a naïve color-based (HSI) appearance model has been integrated, resulting in clear performance improvements when dealing with complex scenarios.
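For readers unfamiliar with particle filters, the predict, weight and resample cycle that such Bayesian trackers build on can be sketched as below. This is a generic one-dimensional bootstrap filter, not the multi-camera 3D method of the paper: the Gaussian motion and observation models, their standard deviations, and the names `particle_filter_step` and `estimate` are assumptions chosen for illustration.

```python
import math
import random
from typing import List


def particle_filter_step(particles: List[float],
                         observation: float,
                         rng: random.Random,
                         motion_std: float = 0.5,
                         obs_std: float = 1.0) -> List[float]:
    """One bootstrap-filter iteration over scalar position hypotheses."""
    # Predict: diffuse each particle with random-walk motion noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Weight: Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)

    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles: List[float]) -> float:
    """Point estimate of the state: the mean of the particle set."""
    return sum(particles) / len(particles)


# Usage: start from a uniform prior on [0, 10] and feed repeated
# observations of a target assumed to sit at position 7.0.
rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
for _ in range(10):
    particles = particle_filter_step(particles, 7.0, rng)
```

In the setting the abstract describes, each particle would instead be a 3D hypothesis and the weight would come from how well its projections agree with the binary segmentations in each camera; the abstract does not specify those details, so they are not modeled here.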