3D semantic representation of actions from efficient stereo-image-sequence segmentation on GPUs
A novel real-time framework for model-free stereo-video segmentation and stereo-segment tracking is presented, combining real-time optical flow and stereo with image segmentation running separately on two GPUs. The stereo-segment tracking algorithm achieves a frame rate of 23 Hz for regular videos with a frame size of 256 x 320 pixels and nearly real time for stereo videos. The computed stereo segments are used to construct 3D segment graphs, from which main graphs, each representing a relevant change in the scene, are extracted. This allows a movie of, e.g., 396 original frames to be represented by only 12 graphs, each containing a small number of nodes, providing a condensed description of the scene while preserving data-intrinsic semantics. Using this method, human activities, e.g., handling of objects, can be encoded efficiently. The method has potential applications in manipulation-action recognition and learning, and provides a vision front end for applications in cognitive robotics.
Automatic Loop Tuning and Memory Management for Stencil Computations
The Texas Instruments C66x Digital Signal Processor (DSP) is an embedded processor technology targeted at real-time signal processing. It has also been developed with a high potential to become the next generation of coprocessor technology for high-performance embedded computing. Of particular interest is its performance for stencil computations, such as those found in signal processing and computer vision tasks. A stencil is a loop in which the output value is updated at each position of an array by taking a weighted function of its neighbors. Efficiently mapping stencil-based kernels to the C66x device presents two challenges. The first is how to optimize loops in order to facilitate the use of Single Instruction Multiple Data (SIMD) instructions; on this architecture, as on most others, SIMD instructions are not generated directly by the compiler. The second is how to manage on-chip memory in a way that minimizes off-chip memory access. Although this could theoretically be achieved by using a highly associative cache, the high rate of data reuse in stencil loops causes a high conflict-miss rate. One way to solve this problem is to configure the on-chip memory as a program-controlled scratchpad, which allows the user to buffer a 2D block of data and minimizes off-chip data access. For this dissertation, we have accomplished two goals: (1) develop a methodology for the optimization of arbitrary 2D stencils that fully utilizes SIMD instructions through microarchitecture-aware loop unrolling; (2) deliver an easy-to-use scratchpad buffer management system and use it to improve the memory efficiency of 2D stencils. We show in the results and analysis section that our stencil compiler is able to achieve up to a 2x speedup compared with the code generated by the industry-standard compiler developed by Texas Instruments, and that our memory management system is able to achieve up to a 10x speedup compared with cache.
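To make the stencil pattern concrete, here is a minimal plain-C sketch of a 2D weighted stencil, the loop shape the dissertation optimizes. This is our illustration, not the thesis's optimized C66x code; array sizes and the kernel are arbitrary:

```c
#include <stdio.h>

#define W 8
#define H 6

/* Apply a 3x3 weighted stencil to the interior of a 2D array:
   each output point is a weighted sum of its 3x3 neighborhood.
   Every input element is reused by up to nine output points,
   which is the data reuse that motivates scratchpad blocking. */
void stencil_3x3(float in[H][W], float out[H][W], float k[3][3]) {
    for (int y = 1; y < H - 1; y++) {
        for (int x = 1; x < W - 1; x++) {
            float acc = 0.0f;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    acc += k[dy + 1][dx + 1] * in[y + dy][x + dx];
            out[y][x] = acc;
        }
    }
}
```

An optimized version would unroll the inner loops and map the multiply-accumulates to SIMD instructions by hand or via intrinsics; the sketch shows only the access pattern that makes blocking into on-chip memory worthwhile.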
Optical flow estimation using steered-L1 norm
Motion is a very important part of understanding the visual picture of the surrounding environment. In image processing it involves the estimation of displacements for image points in an image sequence. In this context, dense optical flow estimation is concerned with the computation of pixel displacements in a sequence of images, and it has therefore been used widely in the fields of image processing and computer vision. Much research has been dedicated to enabling accurate and fast motion computation in image sequences. Despite recent advances, there is still room for improvement, and optical flow algorithms still suffer from several issues, such as motion discontinuities, occlusion handling, and robustness to illumination changes. This thesis investigates the topic of optical flow and its applications; it addresses several issues in the computation of dense optical flow and proposes solutions. Specifically, the thesis is divided into two main parts, each dedicated to one area of interest in optical flow.
In the first part, image registration using optical flow is investigated. Both local and global optical flow methods have previously been used for image registration. An image registration method based on an improved version of the combined local-global method of optical flow computation is proposed. A bilateral filter is used in this optical flow method to improve its edge-preserving performance. It is shown that image registration via this method gives more robust results than the local and global optical flow methods previously investigated.
The second part of this thesis encompasses the main contribution of this research: an improved total variation L1 norm. A smoothness term is used in the optical flow energy function to regularise it. The L1 norm is a plausible choice for such a term because of its edge-preserving performance; however, it is isotropic and hence decreases the penalisation near motion boundaries in all directions. The proposed improved L1 smoothness term (termed here the steered-L1 norm) demonstrates similar performance across motion boundaries but improves the penalisation along such boundaries.
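As a sketch of the objects involved (the thesis's exact formulation may differ; the notation below is ours), the classical TV-L1 optical flow energy combines an L1 data term with a total-variation smoothness term, and a steered variant weights the flow gradient anisotropically:

```latex
% Classical TV-L1 optical flow energy over the flow u = (u_1, u_2):
% L1 data fidelity plus isotropic total-variation smoothness.
E(\mathbf{u}) = \int_{\Omega}
  \big| I_1(\mathbf{x} + \mathbf{u}(\mathbf{x})) - I_0(\mathbf{x}) \big|
  + \lambda \big( |\nabla u_1| + |\nabla u_2| \big)\, d\mathbf{x}

% A steered (anisotropic) variant weights the flow gradient with a
% tensor D built from the image gradient, so that smoothing is reduced
% across motion boundaries but retained along them:
E_s(\mathbf{u}) = \int_{\Omega}
  \big| I_1(\mathbf{x} + \mathbf{u}(\mathbf{x})) - I_0(\mathbf{x}) \big|
  + \lambda \sum_{i=1}^{2} \big| D^{1/2}(\nabla I_0)\, \nabla u_i \big|\, d\mathbf{x}
```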
Optimization techniques for computationally expensive rendering algorithms
Realistic rendering in computer graphics simulates the interactions of light and surfaces. While many accurate models for surface reflection and lighting, including solid surfaces and participating media, have been described, most of them rely on intensive computation. Common practices such as adding constraints and assumptions can increase performance; however, they may compromise the quality of the resulting images or the variety of phenomena that can be accurately represented. In this thesis, we focus on rendering methods that require large amounts of computational resources. Our intention is to consider several conceptually different approaches capable of reducing these requirements with only limited implications for the quality of the results. The first part of this work studies the rendering of time-varying participating media. Examples of this type of matter are smoke, optically thick gases, and any material that, unlike the vacuum, scatters and absorbs the light that travels through it. We focus on a subset of algorithms that approximate realistic illumination using images of real-world scenes. Starting from the traditional ray marching algorithm, we suggest and implement different optimizations that allow the computation to run at interactive frame rates. This thesis also analyzes two different aspects of the generation of anti-aliased images. The first targets the rendering of screen-space anti-aliased images and the reduction of the artifacts generated in rasterized lines and edges; we describe an implementation that, working as a post-process, is efficient enough to be added to existing rendering pipelines with reduced performance impact. A third method takes advantage of the limitations of the human visual system (HVS) to reduce the resources required to render temporally anti-aliased images. While film and digital cameras naturally produce motion blur, rendering pipelines need to simulate it explicitly.
This process is known to be one of the most important burdens for every rendering pipeline. Motivated by this, we plan to run a series of psychophysical experiments aimed at identifying groups of motion-blurred images that are perceptually equivalent. A possible outcome is the proposal of criteria that may lead to reductions in rendering budgets.
Teaching a Robot to Drive - A Skill Learning Inspired Approach
Robots can make our lives easier by taking over tasks that are unpleasant, or even dangerous, for us. To deploy them efficiently, they should be autonomous, adaptive, and easy to instruct. Traditional 'white-box' approaches in robotics are based on the engineer's understanding of the underlying physical structure of the given problem. Starting from this understanding, the engineer can derive a possible solution and implement it in the system. This approach is very powerful but nevertheless limited. Its most important drawback is that systems built this way depend on predefined knowledge, so every new behaviour requires the same expensive development cycle. In contrast, humans and some other animals are not restricted to their innate behaviours but can acquire numerous additional skills during their lifetime. Moreover, they do not appear to need detailed knowledge of the (physical) workings of a given task in order to do so. These properties are also desirable for artificial systems. In this dissertation we therefore investigate the hypothesis that principles of human skill learning can lead to alternative methods for adaptive system control. We examine this hypothesis on the task of autonomous driving, which is a classic problem of system control and offers the potential for a wide range of applications. The concrete task is to learn a basic, anticipatory driving behaviour from a human teacher. After highlighting relevant aspects of human skill learning and introducing the concepts of 'internal models' and 'chunking', we describe their application to the given task. We realise chunking by means of a database in which examples of human driving behaviour are stored and linked to descriptions of the visually perceived road trajectory. This is first implemented in a laboratory environment with a robot and later, in the course of the European DRIVSCO project, transferred to a real car. We also investigate the learning of visual 'forward models', which belong to the internal models, and their effect on the robot's control performance. The main result of this interdisciplinary and application-oriented work is a system that is able to generate appropriate action plans in response to the visually perceived road trajectory without requiring metric information. The predicted actions in the laboratory environment are steering and speed; for the real car they are steering and acceleration, although the system's predictive capacity for the latter is limited. That is, the robot learns autonomous driving from a human teacher, and the car learns to predict human driving behaviour. The latter was successfully demonstrated during the project review by an international team of experts. The outcome of this work is relevant for applications in robot control, particularly in the field of intelligent driver assistance systems.
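The chunking mechanism described above, a database linking visually perceived road descriptors to stored human driving actions, could be sketched in its simplest form as a nearest-neighbour lookup. All names, the two-component descriptor, and the distance metric below are our illustrative assumptions, not the dissertation's actual representation:

```c
#include <math.h>

/* Hypothetical chunk database: each stored example pairs a compact
   road-shape descriptor with the steering and speed the human
   teacher produced in that situation. A query retrieves the actions
   of the stored example whose descriptor is closest (squared
   Euclidean distance) to the currently perceived road trajectory. */
typedef struct {
    float desc[2];   /* road-trajectory descriptor (illustrative) */
    float steer;     /* recorded steering action */
    float speed;     /* recorded speed action */
} Example;

const Example *nearest(const Example *db, int n, const float q[2]) {
    const Example *best = &db[0];
    float best_d = INFINITY;
    for (int i = 0; i < n; i++) {
        float dx = db[i].desc[0] - q[0];
        float dy = db[i].desc[1] - q[1];
        float d = dx * dx + dy * dy;
        if (d < best_d) { best_d = d; best = &db[i]; }
    }
    return best;
}
```

A real system would generalise between stored chunks rather than replay the single closest one, and would combine this lookup with the learned forward models to anticipate the road ahead; the sketch only shows the retrieval step.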