10 research outputs found

    3D semantic representation of actions from efficient stereo-image-sequence segmentation on GPUs

    A novel real-time framework for model-free stereo-video segmentation and stereo-segment tracking is presented, combining real-time optical flow and stereo with image segmentation running separately on two GPUs. The stereo-segment tracking algorithm achieves a frame rate of 23 Hz for regular videos with a frame size of 256 x 320 pixels and runs in nearly real time for stereo videos. The computed stereo segments are used to construct 3D segment graphs, from which main graphs, each representing a relevant change in the scene, are extracted. These allow a movie of, e.g., 396 original frames to be represented by only 12 graphs, each containing only a small number of nodes, providing a condensed description of the scene while preserving data-intrinsic semantics. Using this method, human activities, e.g., handling of objects, can be encoded efficiently. The method has potential applications in manipulation action recognition and learning, and provides a vision front end for applications in cognitive robotics.

    Automatic Loop Tuning and Memory Management for Stencil Computations

    The Texas Instruments C66x Digital Signal Processor (DSP) is an embedded processor technology targeted at real-time signal processing. It also has high potential to become the next generation of coprocessor technology for high-performance embedded computing. Of particular interest is its performance for stencil computations, such as those found in signal processing and computer vision tasks. A stencil is a loop in which the output value at each position of an array is updated by taking a weighted function of its neighbors. Efficiently mapping stencil-based kernels to the C66x device presents two challenges. The first is how to optimize loops so as to facilitate the use of Single Instruction Multiple Data (SIMD) instructions. On this architecture, as on most others, SIMD instructions are not directly generated by the compiler. The second is how to manage on-chip memory in a way that minimizes off-chip memory access. Although this could theoretically be achieved by using a highly associative cache, the high rate of data reuse in stencil loops causes a high conflict miss rate. One way to solve this problem is to configure the on-chip memory as a program-controlled scratchpad. This allows the user to buffer a 2D block of data and minimizes off-chip data access. For this dissertation, we have accomplished two goals: (1) to develop a methodology for optimizing arbitrary 2D stencils that fully utilizes SIMD instructions through microarchitecture-aware loop unrolling, and (2) to deliver an easy-to-use scratchpad buffer management system and use it to improve memory efficiency for 2D stencils. We show in the results and analysis section that our stencil compiler achieves up to a 2x speedup over the code generated by the industry-standard compiler developed by Texas Instruments, and that our memory management system achieves up to a 10x speedup compared with cache
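The stencil definition given in this abstract can be illustrated with a minimal sketch. The code below is not the dissertation's compiler output or C66x code; it is a plain Python illustration, and the function name `apply_stencil_2d`, the 3x3 neighborhood, and the choice to copy border cells unchanged are all assumptions made for the example.

```python
def apply_stencil_2d(grid, weights):
    """Apply a 3x3 weighted stencil to the interior of a 2D grid.

    grid    -- list of rows (lists of numbers), at least 3x3
    weights -- 3x3 list of weights, weights[1][1] is the center
    Border cells are copied unchanged (an assumption of this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # copy so reads always see the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            acc = 0.0
            # Weighted sum over the 3x3 neighborhood centered on (i, j).
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    acc += weights[di + 1][dj + 1] * grid[i + di][j + dj]
            out[i][j] = acc
    return out


if __name__ == "__main__":
    grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    ones = [[1.0] * 3 for _ in range(3)]
    print(apply_stencil_2d(grid, ones))
```

Each output cell reads nine neighboring inputs, so adjacent outputs share most of their inputs. That reuse is what makes on-chip buffering of a 2D block attractive, as the abstract describes.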

    Optical flow estimation using steered-L1 norm

    Motion is a very important part of understanding the visual picture of the surrounding environment. In image processing it involves estimating the displacements of image points in an image sequence. In this context, dense optical flow estimation is concerned with computing pixel displacements in a sequence of images, and it has therefore been used widely in image processing and computer vision. A lot of research has been dedicated to enabling accurate and fast motion computation in image sequences. Despite recent advances in the computation of optical flow, there is still room for improvement: optical flow algorithms still suffer from several issues, such as motion discontinuities, occlusion handling, and robustness to illumination changes. This thesis investigates optical flow and its applications. It addresses several issues in the computation of dense optical flow and proposes solutions. Specifically, the thesis is divided into two main parts, each addressing one main area of interest in optical flow. In the first part, image registration using optical flow is investigated. Both local and global optical flow methods have been used for image registration. An image registration method based on an improved version of the combined local-global method of optical flow computation is proposed. A bilateral filter is used in this optical flow method to improve its edge-preserving performance. It is shown that image registration via this method gives more robust results than the local and global optical flow methods previously investigated. The second part of this thesis encompasses the main contribution of this research, which is an improved total variation L1 norm. A smoothness term is used in the optical flow energy function to regularise this function. The L1 norm is a plausible choice for such a term because of its performance in preserving edges; however, this term is known to be isotropic and hence decreases the penalisation near motion boundaries in all directions. The proposed improved L1 (termed here the steered-L1 norm) smoothness term demonstrates similar performance across motion boundaries but improves the penalisation performance along such boundaries.
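For orientation, the role of the smoothness term can be seen in the widely used total variation L1 (TV-L1) energy. The abstract does not state its energy function, so the form below is the standard textbook formulation and not the steered-L1 norm proposed in the thesis; the symbols $I_0$, $I_1$, $\mathbf{u} = (u_1, u_2)$, $\lambda$, and $\Omega$ are assumptions introduced here.

```latex
E(\mathbf{u}) = \int_{\Omega}
  \underbrace{\lambda \,\bigl| I_1\bigl(\mathbf{x} + \mathbf{u}(\mathbf{x})\bigr) - I_0(\mathbf{x}) \bigr|}_{\text{data term}}
  \;+\;
  \underbrace{\bigl|\nabla u_1(\mathbf{x})\bigr| + \bigl|\nabla u_2(\mathbf{x})\bigr|}_{\text{smoothness term}}
  \, d\mathbf{x}
```

Here $I_0$ and $I_1$ are consecutive images, $\mathbf{u}$ is the flow field over the image domain $\Omega$, and $\lambda$ weights the data term against the smoothness term. The smoothness term penalises the gradient magnitude of each flow component equally in every direction, which is the isotropy the abstract refers to.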

    Optimization techniques for computationally expensive rendering algorithms

    Realistic rendering in computer graphics simulates the interactions of light and surfaces. While many accurate models for surface reflection and lighting, including solid surfaces and participating media, have been described, most of them rely on intensive computation. Common practices such as adding constraints and assumptions can increase performance. However, they may compromise the quality of the resulting images or the variety of phenomena that can be accurately represented. In this thesis, we will focus on rendering methods that require large amounts of computational resources. Our intention is to consider several conceptually different approaches capable of reducing these requirements with only limited impact on the quality of the results. The first part of this work will study the rendering of time-varying participating media. Examples of this type of matter are smoke, optically thick gases and any material that, unlike the vacuum, scatters and absorbs the light that travels through it. We will focus on a subset of algorithms that approximate realistic illumination using images of real-world scenes. Starting from the traditional ray marching algorithm, we will suggest and implement different optimizations that will allow the computation to be performed at interactive frame rates. This thesis will also analyze two different aspects of the generation of anti-aliased images. One is targeted at the rendering of screen-space anti-aliased images and the reduction of the artifacts generated in rasterized lines and edges. We expect to describe an implementation that, working as a post-process, is efficient enough to be added to existing rendering pipelines with reduced performance impact. A third method will take advantage of the limitations of the human visual system (HVS) to reduce the resources required to render temporally anti-aliased images. While film and digital cameras naturally produce motion blur, rendering pipelines need to simulate it explicitly. This process is known to be one of the most important burdens for every rendering pipeline. Motivated by this, we plan to run a series of psychophysical experiments targeted at identifying groups of motion-blurred images that are perceptually equivalent. A possible outcome is the proposal of criteria that may lead to reductions in rendering budgets.

    Teaching a Robot to Drive - A Skill Learning Inspired Approach

    Robots can make our lives easier by taking over tasks that are unpleasant or even dangerous for us. To be deployed efficiently, they should be autonomous, adaptive, and easy to instruct. Traditional 'white-box' approaches in robotics rest on the engineer's understanding of the underlying physical structure of the given problem. Starting from this understanding, the engineer can find a possible solution and implement it in the system. This approach is very powerful but nevertheless limited. Its most important drawback is that systems built this way depend on predefined knowledge, so every new behavior requires the same expensive development cycle. In contrast, humans and some other animals are not restricted to their innate behaviors but can acquire numerous additional skills during their lifetime. Moreover, they do not seem to need detailed knowledge of the (physical) course of a given task to do so. These properties are also desirable for artificial systems. In this dissertation we therefore investigate the hypothesis that principles of human skill learning can lead to alternative methods for adaptive system control. We investigate this hypothesis using the task of autonomous driving, which is a classic problem of system control and offers the possibility of diverse applications. The specific task is learning a basic, anticipatory driving behavior from a human teacher. After identifying relevant aspects of human skill learning and introducing the notions of 'internal models' and 'chunking', we describe how they are applied to the given task. We realize chunking by means of a database in which examples of human driving behavior are stored and linked to descriptions of the visually captured road trajectory. This is first implemented in a laboratory environment using a robot and later, in the course of the European DRIVSCO project, transferred to a real car. We also investigate the learning of visual 'forward models', which belong to the internal models, and their effect on the robot's control performance. The main result of this interdisciplinary, application-oriented work is a system that can generate appropriate action plans in response to the visually perceived road trajectory without requiring metric information. The predicted actions in the laboratory environment are steering and speed; for the real car they are steering and acceleration, although the system's predictive capacity for the latter is limited. That is, the robot learns autonomous driving from a human teacher, and the car learns to predict human driving behavior. The latter was demonstrated successfully during the project review by an international team of experts. The result of this work is relevant to applications in robot control, particularly in the field of intelligent driver assistance systems.