Realtime Object Detection via Deep Learning-based Pipelines
Ever wonder how the Tesla Autopilot system works (or why it fails)? In this tutorial we will look under the hood of self-driving cars and other applications of computer vision, and review state-of-the-art pipelines for object detection such as two-stage approaches (e.g., Faster R-CNN) and single-stage approaches (e.g., YOLO/SSD). This is accomplished via a series of Jupyter Notebooks that use Python, OpenCV, Keras, and TensorFlow. No prior knowledge of computer vision is assumed (although it will help!). To this end we begin this tutorial with a review of computer vision and traditional approaches to object detection, such as Histograms of Oriented Gradients (HOG).
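The traditional HOG approach mentioned above can be sketched in a few lines: compute per-pixel gradient magnitudes and orientations, then accumulate magnitude-weighted orientation histograms over small cells. This is a deliberately minimal toy version; real pipelines (e.g., OpenCV's `HOGDescriptor`) add block normalization and a sliding-window classifier on top.

```python
import numpy as np

def hog_descriptor(image, cell=8, bins=9):
    """Toy HOG: magnitude-weighted orientation histograms over square cells.

    A simplified sketch of the idea only; omits the block normalization
    and sliding-window SVM used in real HOG detectors.
    """
    gy, gx = np.gradient(image.astype(float))          # image gradients
    mag = np.hypot(gx, gy)                             # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0       # unsigned orientation
    h, w = image.shape
    feats = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            m = mag[y:y + cell, x:x + cell].ravel()
            a = ang[y:y + cell, x:x + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

# A 16x16 horizontal ramp yields 4 cells x 9 bins = 36 features,
# all gradient energy in the 0-degree orientation bin.
desc = hog_descriptor(np.tile(np.arange(16.0), (16, 1)))
```

The descriptor vector, not the raw pixels, is what a linear classifier scores at each window position in the classical detection pipeline.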
LabelFusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes
Deep neural network (DNN) architectures have been shown to outperform
traditional pipelines for object segmentation and pose estimation using RGBD
data, but the performance of these DNN pipelines is directly tied to how
representative the training data is of the true data. Hence a key requirement
for employing these methods in practice is to have a large set of labeled data
for your specific robotic manipulation task, a requirement that is not
generally satisfied by existing datasets. In this paper we develop a pipeline
to rapidly generate high quality RGBD data with pixelwise labels and object
poses. We use an RGBD camera to collect video of a scene from multiple
viewpoints and leverage existing reconstruction techniques to produce a 3D
dense reconstruction. We label the 3D reconstruction using a human assisted
ICP-fitting of object meshes. By reprojecting the results of labeling the 3D
scene we can produce labels for each RGBD image of the scene. This pipeline
enabled us to collect over 1,000,000 labeled object instances in just a few
days. We use this dataset to answer questions related to how much training data
is required, and of what quality the data must be, to achieve high performance
from a DNN architecture.
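The reprojection step described above — transferring labels from the reconstructed 3D scene back into each RGBD frame — can be sketched as projecting labeled 3D points through the camera model into a pixelwise label image. The intrinsics and pose below are hypothetical placeholders, and a real pipeline would also use the depth image to resolve occlusions.

```python
import numpy as np

def reproject_labels(points, labels, K, T_cam_world, shape):
    """Project labeled 3D world points into one frame, producing a
    pixelwise label image (0 = background).

    A minimal sketch of the reprojection idea; K (3x3 intrinsics) and
    T_cam_world (4x4 world-to-camera pose) are assumed inputs, and
    occlusion handling via the depth channel is omitted.
    """
    h, w = shape
    label_img = np.zeros((h, w), dtype=np.int32)
    # Homogeneous world -> camera transform.
    pts_h = np.hstack([points, np.ones((len(points), 1))])
    cam = (T_cam_world @ pts_h.T).T[:, :3]
    in_front = cam[:, 2] > 0                       # keep points ahead of camera
    uv = (K @ cam[in_front].T).T
    uv = (uv[:, :2] / uv[:, 2:3]).round().astype(int)  # perspective divide
    for (u, v), lab in zip(uv, labels[in_front]):
        if 0 <= u < w and 0 <= v < h:
            label_img[v, u] = lab
    return label_img

# One labeled point on the optical axis at depth 1 m projects to the
# principal point of this toy camera.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
img = reproject_labels(np.array([[0.0, 0.0, 1.0]]), np.array([5]),
                       K, np.eye(4), (48, 64))
```

Labeling once in 3D and reprojecting into every frame is what lets a single human-assisted ICP fit yield thousands of labeled images.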
Perception for detection and grasping
The final publication is available at link.springer.com. This research presents a methodology for the detection of the crawler used in the AEROARMS project. The approach uses a two-step progressive strategy, going from rough detection and tracking for approximation maneuvers to an accurate positioning step based on fiducial markers. Two different methods are explained for the first step: one using an efficient image-segmentation approach, and the second using deep learning techniques to detect the center of the crawler. The fiducial markers are used for precise localization of the crawler, in a similar way as explained in earlier chapters. The methods can run in real time.
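The coarse-to-fine switching logic implied by the abstract can be sketched as a small mode selector: rough detection/tracking while far from the crawler, fiducial-marker pose estimation once close. The switching threshold and mode names below are illustrative assumptions, not the AEROARMS implementation.

```python
from dataclasses import dataclass

@dataclass
class CrawlerLocalizer:
    """Two-step progressive strategy (sketch): coarse detection during
    approach, then precise fiducial-based positioning up close.

    switch_dist_m is a hypothetical hand-off distance, not a value
    from the paper.
    """
    switch_dist_m: float = 1.5

    def mode(self, distance_m: float) -> str:
        if distance_m > self.switch_dist_m:
            return "coarse"    # segmentation- or CNN-based centre detection
        return "fiducial"      # precise marker-based pose estimation

loc = CrawlerLocalizer()
```

Splitting the problem this way keeps the expensive, accurate estimator for the final positioning phase, where the markers are actually resolvable in the image.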
The Palomar Transient Factory: System Overview, Performance and First Results
The Palomar Transient Factory (PTF) is a fully-automated, wide-field survey
aimed at a systematic exploration of the optical transient sky. The transient
survey is performed using a new 8.1 square degree camera installed on the
48-inch Samuel Oschin telescope at Palomar Observatory; colors and light curves
for detected transients are obtained with the automated Palomar 60-inch
telescope. PTF uses eighty percent of the 1.2-m and fifty percent of the 1.5-m
telescope time. With an exposure of 60-s the survey reaches a depth of
approximately 21.3 in g' and 20.6 in R (5 sigma, median seeing). Four major
experiments are planned for the five-year project: 1) a 5-day cadence supernova
search; 2) a rapid transient search with cadences between 90 seconds and 1 day;
3) a search for eclipsing binaries and transiting planets in Orion; and 4) a
3-pi sr deep H-alpha survey. PTF provides automatic, realtime transient
classification and follow up, as well as a database including every source
detected in each frame. This paper summarizes the PTF project, including
several months of on-sky performance tests of the new survey camera, the
observing plans and the data reduction strategy. We conclude by detailing the
first 51 PTF optical transient detections, found in commissioning data. Comment: 12 pages, 11 figures, 3 tables, submitted to PAS
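The quoted survey depth (about 21.3 mag in g' at 60 s, 5 sigma) can be extrapolated to other exposure times with a back-of-envelope scaling: in the background-limited regime SNR grows as the square root of exposure time, so the limiting magnitude deepens by 1.25 log10(t/60) magnitudes. This is a rough sketch, not a PTF calibration.

```python
import math

def limiting_mag(t_exp_s, m_ref=21.3, t_ref_s=60.0):
    """5-sigma limiting magnitude vs. exposure time, assuming the
    background-limited regime (SNR ~ sqrt(t)).

    m_ref/t_ref_s default to the g'-band depth quoted in the abstract;
    the scaling itself is a generic approximation, not PTF-specific.
    """
    return m_ref + 2.5 * math.log10(math.sqrt(t_exp_s / t_ref_s))

# Quadrupling the exposure (240 s) buys about 0.75 mag of depth.
m240 = limiting_mag(240.0)
```

Real depths also depend on seeing, moon phase, and airmass, which is why the abstract quotes a median-seeing figure.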
Scene Text Eraser
Text in natural scene images can contain various kinds of personal information, such as telephone numbers and home addresses, and there is a high risk of leaking that information if the images are published. In this paper, we propose a scene text erasing method that properly hides such information via an inpainting convolutional neural network (CNN) model. The input is a scene text image, and the output is expected to be a text-erased image in which all character regions are filled with the colors of the surrounding background pixels. This is accomplished by a CNN model with a convolution-to-deconvolution structure and interconnections between the two. The training samples and the corresponding inpainted images serve as teaching signals for training. To evaluate text erasing performance, the output images are processed by a novel scene text detection method. The same text detection measurement is then applied to the images in the ICDAR2013 benchmark dataset. Compared with direct text detection, the scene text erasing process yields a drastic decrease in precision, recall, and F-score, which demonstrates the effectiveness of the proposed method for erasing text in natural scene images.
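The evaluation protocol above inverts the usual reading of detection metrics: a detector is run on the erased images, and lower precision, recall, and F-score mean better erasure. A minimal sketch of those three scores, with illustrative (not reported) counts:

```python
def detection_scores(tp, fp, fn):
    """Precision, recall, and F-score from true-positive, false-positive,
    and false-negative detection counts. Guards against empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Hypothetical counts: before erasing the detector finds most text;
# after erasing, almost none of it survives to be detected.
before = detection_scores(tp=90, fp=10, fn=10)
after = detection_scores(tp=5, fp=5, fn=95)
```

Under this protocol, the gap between the "before" and "after" scores is the measure of erasing effectiveness.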