100 research outputs found

    Deep learning techniques for visual object tracking

    Visual object tracking plays a crucial role in various vision systems, including biometric analysis, medical imaging, smart traffic systems, and video surveillance. Despite notable advancements in visual object tracking over the past few decades, many tracking algorithms still face challenges due to factors like illumination changes, deformation, and scale variations. This thesis is divided into three parts. The first part introduces the visual object tracking problem and discusses the traditional approaches that have been used to study it. We then propose a novel method called Tracking by Iterative Multi-Refinements, which addresses the issue of locating the target by redefining the search for the ideal bounding box. This method utilizes an iterative process to forecast a sequence of bounding box adjustments, enabling the tracking algorithm to handle multiple non-conflicting transformations simultaneously. As a result, it achieves faster tracking and can handle a larger number of composite transformations. In the second part of this thesis we explore the application of reinforcement learning (RL) to visual tracking, presenting a general RL framework applicable to problems that require a sequence of decisions. We discuss various families of popular RL approaches, including value-based methods, policy gradient approaches, and actor-critic methods. Furthermore, we delve into the application of RL to visual tracking, where an RL agent predicts the target's location or selects hyperparameters, correlation filters, or the target appearance model. A comprehensive comparison of these approaches is provided, along with a taxonomy of state-of-the-art methods. The third part presents a novel method that addresses the need for online tuning of offline-trained tracking models. Typically, offline-trained models, whether trained through supervised learning or reinforcement learning, require additional tuning during online tracking to achieve optimal performance. The duration of this tuning process depends on the number of layers that need training for the new target. This thesis therefore proposes a pioneering approach that expedites the training of convolutional neural networks (CNNs) while preserving their high performance levels. In summary, this thesis extensively explores the area of visual object tracking and its related domains, covering traditional approaches, novel methodologies like Tracking by Iterative Multi-Refinements, the application of reinforcement learning, and a pioneering method for accelerating CNN training. By addressing the challenges faced by existing tracking algorithms, this research aims to advance the field of visual object tracking and contribute to the development of more robust and efficient tracking systems.
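
    As a rough sketch of the iterative multi-refinement idea (the adjustment vocabulary, step size, and the `predict` callable below are hypothetical placeholders, not the thesis's actual model), a tracker can repeatedly apply a predicted set of non-conflicting bounding-box adjustments until the predictor signals a stop:

    ```python
    # Hypothetical set of bounding-box adjustments over (x, y, w, h);
    # the thesis's actual transformation vocabulary is not specified here.
    ADJUSTMENTS = {
        "left":   lambda b, s: (b[0] - s * b[2], b[1], b[2], b[3]),
        "right":  lambda b, s: (b[0] + s * b[2], b[1], b[2], b[3]),
        "up":     lambda b, s: (b[0], b[1] - s * b[3], b[2], b[3]),
        "down":   lambda b, s: (b[0], b[1] + s * b[3], b[2], b[3]),
        "grow":   lambda b, s: (b[0], b[1], b[2] * (1 + s), b[3] * (1 + s)),
        "shrink": lambda b, s: (b[0], b[1], b[2] * (1 - s), b[3] * (1 - s)),
    }

    def iterative_refinement(frame, box, predict, step=0.05, max_iters=10):
        """Refine (x, y, w, h) by repeatedly applying predicted adjustments.

        `predict(frame, box)` is assumed to return a set of adjustment names.
        Because several non-conflicting moves (e.g. "left" and "grow") can be
        applied in one step, fewer iterations are needed than with a single
        move per step, which is the source of the claimed speed-up.
        """
        for _ in range(max_iters):
            moves = predict(frame, box)
            if not moves:          # an empty set is interpreted as "stop"
                break
            for name in moves:
                box = ADJUSTMENTS[name](box, step)
        return box
    ```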

    Folklore in the Digital Age: Collected Essays. Foreword by Andy Ross

    Online and digital cultures are among the most personally gripping effects of globalisation in our increasingly networked world. While global multimedia culture may seem to endanger traditional folklore, there is no doubt that it creates new folklore as well. Folklore in the Digital Age vividly illustrates the range of e-folklore studies in updated papers and essays from the author’s 21st-century research. The themes covered include not only the most serious issues of the day, such as the 9/11 attacks and natural disasters, but also cheerier topics, such as online dating and food culture. In these essays Professor Krawczyk-Wasilewska paints a convincing picture of digital folklore as a cultural heritage. She covers a wide range of issues from all levels of society and offers fascinating insights into how online culture affects our postmodern lives.

    Event-Driven Technologies for Reactive Motion Planning: Neuromorphic Stereo Vision and Robot Path Planning and Their Application on Parallel Hardware

    Robotics is increasingly becoming a key driver of technological progress. Despite impressive advances over the past decades, mammalian brains still outperform even the most powerful machines in vision and motion planning. Industrial robots are very fast and precise, but their planning algorithms are not capable enough for the highly dynamic environments required for human-robot collaboration (HRC). Without fast and adaptive motion planning, safe HRC cannot be guaranteed. Neuromorphic technologies, including visual sensors and hardware chips, operate asynchronously and therefore process spatio-temporal information very efficiently. Event-based visual sensors in particular already outperform conventional, synchronous cameras in many applications. Event-based methods therefore have great potential to enable faster and more energy-efficient motion control algorithms for HRC. This thesis presents an approach for flexible, reactive motion control of a robot arm, in which exteroception is achieved through event-based stereo vision and path planning is implemented in a neural representation of the configuration space. The multi-view 3D reconstruction is evaluated through a qualitative analysis in simulation and transferred to a stereo system of event-based cameras. A demonstrator with an industrial robot is used to evaluate the reactive, collision-free online planning and also serves as the basis for a comparative study of sampling-based planners. This is complemented by a benchmark of parallel hardware solutions, with robotic path planning as the test scenario. The results show that the proposed neural solutions are an effective way to realize robot control for dynamic scenarios. This work lays a foundation for neural solutions in adaptive manufacturing processes, including collaboration with humans, without compromising speed or safety. It thereby paves the way for integrating brain-inspired hardware and algorithms into industrial robotics and HRC.
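
    To give a rough intuition for planning over a discretized configuration space, the sketch below propagates a wavefront from the goal across a small occupancy grid and then descends the resulting field to extract a path. The thesis's actual neural, event-driven implementation on parallel hardware is considerably more involved; the grid size, obstacle, start, and goal here are invented for illustration.

    ```python
    import numpy as np

    def wavefront(occupancy, goal):
        """Breadth-first expansion from the goal: each free cell receives its
        step distance to the goal, obstacles stay at infinity."""
        dist = np.full(occupancy.shape, np.inf)
        dist[goal] = 0.0
        frontier = [goal]
        while frontier:
            nxt = []
            for (i, j) in frontier:
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < occupancy.shape[0] and 0 <= nj < occupancy.shape[1]
                            and not occupancy[ni, nj] and dist[ni, nj] == np.inf):
                        dist[ni, nj] = dist[i, j] + 1
                        nxt.append((ni, nj))
            frontier = nxt
        return dist

    def descend(dist, start):
        """Follow the steepest decrease of the distance field down to the goal."""
        path, cell = [start], start
        while dist[cell] > 0:
            i, j = cell
            nbrs = [(i + di, j + dj) for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < dist.shape[0] and 0 <= j + dj < dist.shape[1]]
            cell = min(nbrs, key=lambda c: dist[c])
            path.append(cell)
        return path

    occ = np.zeros((20, 20), dtype=bool)
    occ[5:15, 10] = True                      # a wall in configuration space
    d = wavefront(occ, goal=(18, 18))
    print(descend(d, start=(1, 1)))
    ```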

    Tightly-coupled manipulation pipelines: Combining traditional pipelines and end-to-end learning

    Traditionally, robot manipulation tasks are solved by engineering solutions in a modular fashion: typically object detection, pose estimation, grasp planning, motion planning, and finally a control algorithm that executes the planned motion. This traditional approach separates the hard problem of manipulation into several self-contained stages, which can be developed independently and give interpretable outputs at each stage of the pipeline. However, it comes with a plethora of issues, most notably limited generalisability to a broad range of tasks; it is common that as tasks get more difficult, the systems become increasingly complex. To combat the flaws of these systems, recent trends have seen robots visually learning to predict actions and grasp locations directly from sensor input in an end-to-end manner using deep neural networks, without the need to explicitly model the in-between modules. This thesis investigates a sample of methods that fall somewhere on a spectrum from fully pipelined to fully end-to-end, which we believe to be more advantageous for developing a general manipulation system; one that could eventually be used in highly dynamic and unpredictable household environments. The investigation starts at the far end of the spectrum, where we explore learning an end-to-end controller in simulation and then transferring it to the real world by employing domain randomisation, and finishes on the other end with a new pipeline whose individual modules bear little resemblance to the "traditional" ones. The thesis concludes with the proposition of a new paradigm: Tightly-coupled Manipulation Pipelines (TMP). Rather than learning all modules implicitly in one large, end-to-end network or, conversely, having individual, pre-defined modules that are developed independently, TMPs take the best of both worlds by tightly coupling actions to observations, whilst still maintaining structure via an undefined number of learned modules, which do not have to bear any resemblance to the modules seen in "traditional" systems.
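
    The two ends of this spectrum can be sketched in a few lines. Every stage name and signature below is illustrative only (none are the modules used in the thesis); the point is that a hand-engineered pipeline and an end-to-end network can expose the same pixels-to-action interface while differing in how much interpretable structure sits in between.

    ```python
    from typing import Callable

    def traditional_pipeline(detect: Callable, estimate_pose: Callable,
                             plan_grasp: Callable, plan_motion: Callable,
                             control: Callable) -> Callable:
        """Compose hand-engineered stages; every intermediate output stays
        interpretable (boxes, poses, grasps, trajectories)."""
        def policy(rgb_image):
            objects = detect(rgb_image)               # bounding boxes / masks
            pose = estimate_pose(rgb_image, objects)  # 6-DoF object pose
            grasp = plan_grasp(pose)                  # target gripper pose
            trajectory = plan_motion(grasp)           # collision-free path
            return control(trajectory)                # next joint command
        return policy

    def end_to_end(network: Callable) -> Callable:
        """A single learned mapping from pixels to actions, with no explicit
        intermediate stages."""
        def policy(rgb_image):
            return network(rgb_image)
        return policy
    ```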

    Designing a Griotte for the Global Village: Increasing the Evidentiary Value of Oral Histories for Use in Digital Libraries

    A griotte in West African culture is a female professional storyteller, responsible for preserving a tribe's history and genealogy by relaying its folklore in oral and musical recitations. Similarly, Griotte is an interdisciplinary project that seeks to foster collaboration between tradition bearers, subject experts, and computer specialists in an effort to build high quality digital oral history collections. To accomplish this objective, this project preserves the primary strength of oral history, namely its ability to disclose "our" intangible culture, and addresses its primary criticism, namely its dubious reliability due to reliance on human memory and integrity. For a theoretical foundation and a systematic model, William Moss's work on the evidentiary value of historical sources is employed. Using his work as a conceptual framework, along with Semantic Web technologies (e.g. Topic Maps and ontologies), a demonstrator system is developed to provide digital oral history tools to a "sample" of the target audience(s). This demonstrator system is evaluated via two methods: 1) a case study conducted to employ the system in the actual building of a digital oral history collection (this step also created sample data for the following assessment), and 2) a survey which involved a task-based evaluation of the demonstrator system. The results of the survey indicate that integrating oral histories with documentary evidence increases the evidentiary value of oral histories. Furthermore, the results imply that individuals are more likely to use oral histories in their work if their evidentiary value is increased. The contributions of this research – primarily in the area of organizing metadata on the World Wide Web – and considerations for future research are also provided.
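
    As a toy illustration of the kind of cross-linking the project relies on (the actual system uses Topic Maps and ontologies; the record, its fields, and the scoring heuristic below are entirely hypothetical), an oral-history segment can be tied to corroborating documents so that its evidentiary value becomes something that can be inspected and compared:

    ```python
    # Hypothetical record: not the project's Topic Map schema.
    oral_history_topic = {
        "id": "oh-0042",
        "type": "oral-history-segment",
        "claim": "The mill closed in the winter of 1931.",
        "narrator": "example-narrator",
        "supported_by": [
            {"type": "newspaper-article", "source": "example-archive/1931-12-04"},
            {"type": "county-record", "source": "example-archive/closure-notice"},
        ],
    }

    def evidentiary_score(topic):
        """Naive heuristic: more independent documentary links, higher score."""
        return len(topic.get("supported_by", []))

    print(evidentiary_score(oral_history_topic))  # -> 2
    ```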

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as low latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
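
    A minimal sketch of the event stream described above: each event carries a timestamp, pixel coordinates, and polarity (the sign of the brightness change), and accumulating events over a short time window is one common way to bridge the gap to frame-based algorithms. The sensor resolution and the sample events below are made up for illustration.

    ```python
    import numpy as np

    # Toy event stream: (timestamp [us], x, y, polarity).
    events = np.array([
        (1000, 120,  80,  1),
        (1004, 121,  80, -1),
        (1010,  40, 200,  1),
    ], dtype=[("t", np.int64), ("x", np.int32), ("y", np.int32), ("p", np.int8)])

    def accumulate(events, width=240, height=180, t_start=0, t_end=np.inf):
        """Integrate events in a time window into a signed 2D histogram."""
        img = np.zeros((height, width), dtype=np.int32)
        for e in events:
            if t_start <= e["t"] < t_end:
                img[e["y"], e["x"]] += int(e["p"])
        return img

    frame = accumulate(events)
    print(frame.sum())   # net polarity over the window
    ```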