Realtime Object Detection via Deep Learning-based Pipelines
Ever wonder how the Tesla Autopilot system works (or why it fails)? In this tutorial we will look under the hood of self-driving cars and other applications of computer vision, and review state-of-the-art pipelines for object detection such as two-stage approaches (e.g., Faster R-CNN) and single-stage approaches (e.g., YOLO/SSD). This is accomplished via a series of Jupyter Notebooks that use Python, OpenCV, Keras, and TensorFlow. No prior knowledge of computer vision is assumed (although it will help!). To this end we begin this tutorial with a review of computer vision and traditional approaches to object detection, such as Histograms of Oriented Gradients (HOG).
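The traditional HOG approach mentioned above can be sketched in a few lines: compute per-pixel gradient magnitudes and orientations, then accumulate magnitude-weighted orientation histograms over small cells. This is a deliberately minimal toy version; real pipelines (e.g., OpenCV's `HOGDescriptor`) add block normalization and a sliding-window classifier on top.

```python
import numpy as np

def hog_descriptor(image, cell=8, bins=9):
    """Toy HOG: magnitude-weighted orientation histograms over square cells.

    A simplified sketch of the idea only; omits the block normalization
    and sliding-window SVM used in real HOG detectors.
    """
    gy, gx = np.gradient(image.astype(float))          # image gradients
    mag = np.hypot(gx, gy)                             # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0       # unsigned orientation
    h, w = image.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            m = mag[y:y + cell, x:x + cell].ravel()
            a = ang[y:y + cell, x:x + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# A 16x16 horizontal ramp yields 4 cells x 9 bins = 36 features,
# all gradient energy in the 0-degree orientation bin.
desc = hog_descriptor(np.tile(np.arange(16.0), (16, 1)))
```

The descriptor vector, not the raw pixels, is what a linear classifier scores at each window position in the classical detection pipeline.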
LabelFusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes
Deep neural network (DNN) architectures have been shown to outperform
traditional pipelines for object segmentation and pose estimation using RGBD
data, but the performance of these DNN pipelines is directly tied to how
representative the training data is of the true data. Hence a key requirement
for employing these methods in practice is to have a large set of labeled data
for your specific robotic manipulation task, a requirement that is not
generally satisfied by existing datasets. In this paper we develop a pipeline
to rapidly generate high quality RGBD data with pixelwise labels and object
poses. We use an RGBD camera to collect video of a scene from multiple
viewpoints and leverage existing reconstruction techniques to produce a 3D
dense reconstruction. We label the 3D reconstruction using a human assisted
ICP-fitting of object meshes. By reprojecting the results of labeling the 3D
scene we can produce labels for each RGBD image of the scene. This pipeline
enabled us to collect over 1,000,000 labeled object instances in just a few
days. We use this dataset to answer questions related to how much training data
is required, and of what quality the data must be, to achieve high performance
from a DNN architecture.
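The reprojection step described above — transferring labels from the reconstructed 3D scene back into each RGBD frame — can be sketched as projecting labeled 3D points through the camera model into a pixelwise label image. The intrinsics and pose below are hypothetical placeholders, and a real pipeline would also use the depth image to resolve occlusions.

```python
import numpy as np

def reproject_labels(points, labels, K, T_cam_world, shape):
    """Project labeled 3D world points into one frame, producing a
    pixelwise label image (0 = background).

    A minimal sketch of the reprojection idea; K (3x3 intrinsics) and
    T_cam_world (4x4 world-to-camera pose) are assumed inputs, and
    occlusion handling via the depth channel is omitted.
    """
    h, w = shape
    label_img = np.zeros((h, w), dtype=np.int32)
    # Homogeneous world -> camera transform.
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = (T_cam_world @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 0                       # keep points ahead of camera
    uv = (K @ cam[in_front].T).T
    uv = (uv[:, :2] / uv[:, 2:3]).round().astype(int)  # perspective divide
    for (u, v), lab in zip(uv, labels[in_front]):
        if 0 <= u < w and 0 <= v < h:
            label_img[v, u] = lab
    return label_img

# One labeled point on the optical axis at depth 1 m projects to the
# principal point of this toy camera.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
img = reproject_labels(np.array([[0.0, 0.0, 1.0]]), np.array([5]),
                       K, np.eye(4), (48, 64))
```

Labeling once in 3D and reprojecting into every frame is what lets a single human-assisted ICP fit yield thousands of labeled images.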
Perception for detection and grasping
The final publication is available at link.springer.com. This research presents a methodology for the detection of the crawler used in the AEROARMS project. The approach uses a two-step progressive strategy, going from rough detection and tracking for approximation maneuvers to an accurate positioning step based on fiducial markers. Two different methods are explained for the first step: one using an efficient image-segmentation approach, and the second using deep learning techniques to detect the center of the crawler. The fiducial markers are used for precise localization of the crawler, in a similar way as explained in earlier chapters. The methods can run in real time.
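The coarse-to-fine switching logic implied by the abstract can be sketched as a small mode selector: rough detection/tracking while far from the crawler, fiducial-marker pose estimation once close. The switching threshold and mode names below are illustrative assumptions, not the AEROARMS implementation.

```python
from dataclasses import dataclass

@dataclass
class CrawlerLocalizer:
    """Two-step progressive strategy (sketch): coarse detection during
    approach, then precise fiducial-based positioning up close.

    switch_dist_m is a hypothetical hand-off distance, not a value
    from the paper.
    """
    switch_dist_m: float = 1.5

    def mode(self, distance_m: float) -> str:
        if distance_m > self.switch_dist_m:
            return "coarse"    # segmentation- or CNN-based centre detection
        return "fiducial"      # precise marker-based pose estimation

loc = CrawlerLocalizer()
```

Splitting the problem this way keeps the expensive, accurate estimator for the final positioning phase, where the markers are actually resolvable in the image.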
The Palomar Transient Factory: System Overview, Performance and First Results
The Palomar Transient Factory (PTF) is a fully-automated, wide-field survey
aimed at a systematic exploration of the optical transient sky. The transient
survey is performed using a new 8.1 square degree camera installed on the
48-inch Samuel Oschin telescope at Palomar Observatory; colors and light curves
for detected transients are obtained with the automated Palomar 60-inch
telescope. PTF uses eighty percent of the 1.2-m and fifty percent of the 1.5-m
telescope time. With an exposure of 60-s the survey reaches a depth of
approximately 21.3 in g' and 20.6 in R (5 sigma, median seeing). Four major
experiments are planned for the five-year project: 1) a 5-day cadence supernova
search; 2) a rapid transient search with cadences between 90 seconds and 1 day;
3) a search for eclipsing binaries and transiting planets in Orion; and 4) a
3-pi sr deep H-alpha survey. PTF provides automatic, realtime transient
classification and follow up, as well as a database including every source
detected in each frame. This paper summarizes the PTF project, including
several months of on-sky performance tests of the new survey camera, the
observing plans and the data reduction strategy. We conclude by detailing the
first 51 PTF optical transient detections, found in commissioning data. Comment: 12 pages, 11 figures, 3 tables, submitted to PAS
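The quoted survey depth (about 21.3 mag in g' at 60 s, 5 sigma) can be extrapolated to other exposure times with a back-of-envelope scaling: in the background-limited regime SNR grows as the square root of exposure time, so the limiting magnitude deepens by 1.25 log10(t/60) magnitudes. This is a rough sketch, not a PTF calibration.

```python
import math

def limiting_mag(t_exp_s, m_ref=21.3, t_ref_s=60.0):
    """5-sigma limiting magnitude vs. exposure time, assuming the
    background-limited regime (SNR ~ sqrt(t)).

    m_ref/t_ref_s default to the g'-band depth quoted in the abstract;
    the scaling itself is a generic approximation, not PTF-specific.
    """
    return m_ref + 2.5 * math.log10(math.sqrt(t_exp_s / t_ref_s))

# Quadrupling the exposure (240 s) buys about 0.75 mag of depth.
m240 = limiting_mag(240.0)
```

Real depths also depend on seeing, moon phase, and airmass, which is why the abstract quotes a median-seeing figure.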
Scene Text Eraser
Text in natural scene images can contain various kinds of personal information, such as telephone numbers and home addresses, and there is a high risk of leaking that information if the images are published. In this paper, we propose a scene text erasing method that properly hides such information via an inpainting convolutional neural network (CNN) model. The input is a scene text image, and the output is expected to be a text-erased image in which all character regions are filled with the colors of the surrounding background pixels. This is accomplished by a CNN model with a convolution-to-deconvolution structure and interconnections between the two. The training samples and the corresponding inpainted images serve as teaching signals for training. To evaluate text erasing performance, the output images are processed by a novel scene text detection method. The same text detection measurement is then applied to the images in the ICDAR2013 benchmark dataset. Compared with direct text detection, the scene text erasing process yields a drastic decrease in precision, recall, and F-score, which demonstrates the effectiveness of the proposed method for erasing text in natural scene images.
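The evaluation protocol above inverts the usual reading of detection metrics: a detector is run on the erased images, and lower precision, recall, and F-score mean better erasure. A minimal sketch of those three scores, with illustrative (not reported) counts:

```python
def detection_scores(tp, fp, fn):
    """Precision, recall, and F-score from true-positive, false-positive,
    and false-negative detection counts. Guards against empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Hypothetical counts: before erasing the detector finds most text;
# after erasing, almost none of it survives to be detected.
before = detection_scores(tp=90, fp=10, fn=10)
after = detection_scores(tp=5, fp=5, fn=95)
```

Under this protocol, the gap between the "before" and "after" scores is the measure of erasing effectiveness.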