Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be generated from the controller's internal state (introspective explanations) or produced post hoc from the controller's output (rationalizations). Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolutional network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of the controller and the explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment: strong and weak alignment. These explainable systems represent an externalization of tacit knowledge. The network's opaque reasoning is simplified to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match the training data.
In Chapter 5, we propose to address this issue by augmenting training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts both its visual attention over the scene and its control outputs (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
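The causal filtering step described above can be sketched as follows. The controller here is a stand-in function, and the occlusion-based test is a simplified illustration of the idea (mask a candidate attention region, re-run the controller, and keep only regions whose removal changes the output), not the dissertation's actual implementation:

```python
import numpy as np

def controller(image):
    # Stand-in for the trained steering network: here, "steering" is a
    # hypothetical sum of pixel intensities (illustration only).
    return float(image.sum())

def causal_filter(image, regions, threshold=0.1):
    """Keep only the attended regions whose occlusion actually changes
    the controller output (a toy version of the causal filtering step).

    regions: list of (row0, row1, col0, col1) candidate attention boxes.
    """
    baseline = controller(image)
    causal = []
    for (r0, r1, c0, c1) in regions:
        masked = image.copy()
        masked[r0:r1, c0:c1] = 0.0           # occlude the candidate region
        effect = abs(controller(masked) - baseline)
        if effect / (abs(baseline) + 1e-8) > threshold:
            causal.append((r0, r1, c0, c1))  # region causally influences output
    return causal

img = np.zeros((8, 8))
img[2:4, 2:4] = 1.0                           # the only informative patch
regions = [(2, 4, 2, 4), (5, 7, 5, 7)]        # one true, one spurious region
print(causal_filter(img, regions))            # -> [(2, 4, 2, 4)]
```

Occluding the spurious region leaves the output unchanged, so only the truly influential region survives the filter.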
Extracting 3D parametric curves from 2D images of helical objects
Helical objects occur in medicine, biology, cosmetics, nanotechnology, and engineering. Extracting a 3D parametric curve from a 2D image of a helical object has many practical applications, in particular being able to extract metrics such as tortuosity, frequency, and pitch. We present a method that straightens the image object and derives a robust 3D helical curve from peaks in the object boundary. The algorithm has a small number of stable parameters that require little tuning, and the curve is validated against both synthetic and real-world data. The results show that the extracted 3D curve lies within a small Hausdorff distance of the ground truth, and has near-identical tortuosity for helical objects with a circular profile. Parameter insensitivity and robustness against high levels of image noise are demonstrated thoroughly and quantitatively.
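One common arc-chord definition of tortuosity (total curve length divided by the straight-line distance between endpoints) can be computed directly from an extracted 3D polyline. This is a generic sketch of that metric and may not match the paper's exact formulation:

```python
import numpy as np

def tortuosity(points):
    """Arc-chord tortuosity of a 3D polyline: total path length divided
    by the straight-line distance between its endpoints (one common
    definition; the paper's exact metric may differ)."""
    points = np.asarray(points, dtype=float)
    segments = np.diff(points, axis=0)              # consecutive displacement vectors
    arc = np.linalg.norm(segments, axis=1).sum()    # total polyline length
    chord = np.linalg.norm(points[-1] - points[0])  # endpoint-to-endpoint distance
    return arc / chord

# A unit-radius helix x = cos t, y = sin t, z = t sampled over two turns;
# its analytic tortuosity is sqrt(2) ~ 1.414.
t = np.linspace(0, 4 * np.pi, 400)
helix = np.column_stack([np.cos(t), np.sin(t), t])
print(round(tortuosity(helix), 3))  # -> 1.414
```

A straight line has tortuosity 1, and the value grows as the curve winds more, which is what makes the metric useful for characterizing helical structures.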
Combining Multiple Algorithms for Road Network Tracking from Multiple Source Remotely Sensed Imagery: a Practical System and Performance Evaluation
In light of the increasing availability of commercial high-resolution imaging sensors, automatic interpretation tools are needed to extract road features. Many approaches to road extraction are currently available, but it is acknowledged that no single method can successfully extract all types of road from all remotely sensed imagery. In this paper, a novel classification of roads is proposed, based on both the roads' geometric and radiometric properties and the characteristics of the sensors. Subsequently, a general road-tracking framework is proposed, and one or more suitable road trackers are designed or combined for each road type. Extensive experiments are performed to extract roads from aerial and satellite imagery, and the results show that a combination strategy can automatically extract more than 60% of the total road network from very-high-resolution imagery such as QuickBird and DMC images, with a time saving of approximately 20% and acceptable spatial accuracy. The results indicate that a combination of multiple algorithms is more reliable, more efficient, and more robust for extracting road networks from multi-source remotely sensed imagery than any individual algorithm.
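The classify-then-dispatch idea behind the combination strategy can be sketched as below. The road classes, threshold values, and tracker names are illustrative assumptions for the sketch, not the paper's actual taxonomy:

```python
# Minimal sketch: classify each road by simple properties, then hand it
# to a tracker suited to that class. Classes, thresholds, and tracker
# names are hypothetical placeholders.

def classify_road(width_m, contrast):
    """Assign a road class from its approximate width (meters) and the
    radiometric contrast against its surroundings (0..1)."""
    if width_m > 20:
        return "highway"
    return "main" if contrast > 0.5 else "minor"

# One (or more) trackers per road class; here just descriptive labels.
TRACKERS = {
    "highway": "parallel-edge tracker",
    "main": "profile-matching tracker",
    "minor": "least-squares template tracker",
}

def select_tracker(width_m, contrast):
    return TRACKERS[classify_road(width_m, contrast)]

print(select_tracker(25.0, 0.8))  # -> parallel-edge tracker
print(select_tracker(8.0, 0.3))   # -> least-squares template tracker
```

The point of the design is that each tracker only has to handle the road type it was built for, which is what lets the combined system cover more of the network than any single tracker.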
Local object gist: meaningful shapes and spatial layout at a very early stage of visual processing
In his introduction, Pinna (2010) quoted one of Wertheimer’s observations: “I stand at the window and see a house, trees, sky. Theoretically I might say there were 327 brightnesses and nuances of color. Do I have ‘327’? No. I have sky, house, and trees.” This seems quite remarkable, for Max Wertheimer, together with Kurt Koffka and Wolfgang Koehler, was a pioneer of Gestalt Theory: perceptual organisation was tackled by considering grouping rules of line and edge elements in relation to figure-ground segregation, i.e., a meaningful object (the figure) as perceived against a complex background (the ground).

At the lowest level – line and edge elements – Wertheimer (1923) himself formulated grouping principles on the basis of proximity, good continuation, convexity, symmetry and, often forgotten, past experience of the observer. Rubin (1921) formulated rules for figure-ground segregation using surroundedness, size and orientation, but also convexity and symmetry. Almost a century of research into Gestalt later, Pinna and Reeves (2006) introduced the notion of figurality, meant to represent the integrated set of properties of visual objects, from the principles of grouping and figure-ground to the colour and volume of objects with shading. Pinna, in 2010, went one important step further and studied perceptual meaning, i.e., the interpretation of complex figures on the basis of past experience of the observer. Re-establishing a link to Wertheimer’s rule about past experience, he formulated five propositions, three definitions and seven properties on the basis of observations made on graphically manipulated patterns. For example, he introduced the illusion of meaning by comics-like elements suggesting wind, thereby inducing a learned interpretation. His last figure shows a regular array of squares but with irregular positions on the right side. This pile of (ir)regular squares can be interpreted as the result of an earthquake which destroyed part of an apartment block. This is much more intuitive, direct and economical than describing the complexity of the array of squares.