
    Feature based dynamic intra-video indexing

    A thesis submitted in partial fulfillment for the degree of Doctor of Philosophy. With the advent of digital imagery and its widespread application in all areas of life, it has become an important component in the world of communication. Video content from broadcast news, sports, personal videos, surveillance, movies, entertainment and similar domains is increasing exponentially in quantity, and retrieving content of interest from these corpora is becoming a challenge. This has led to increased interest among researchers in video structure analysis, feature extraction, content annotation, tagging, video indexing, querying and retrieval. However, most previous work is confined to a specific domain and constrained by quality, processing and storage capabilities. This thesis presents a novel framework that combines established approaches, from feature extraction to browsing, in one system for content-based video retrieval. The proposed framework fills the identified gap while satisfying the imposed constraints on processing, storage, quality and retrieval times. The output comprises a framework, a methodology and a prototype application that allow the user to efficiently and effectively retrieve content of interest, such as age, gender and activity, by specifying the relevant query. Experiments have shown plausible results, with an average precision and recall of 0.91 and 0.92 respectively for face detection using a Haar wavelet based approach. Precision for age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. Gender recognition gives better precision for males (0.89) than for females, while recall is higher for females (0.92). The activity of the subject has been detected using the Hough transform and classified using a Hidden Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research process.
A Graphical User Interface (GUI) providing a friendly and intuitive interface has been integrated into the developed system to facilitate the retrieval process. The comparison results of the intraclass correlation coefficient (ICC) show that the performance of the system closely resembles that of the human annotator. The performance has been optimised for time and error rate.

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? In order to properly and efficiently reply to these questions, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models by sensor transformation, recognition, matching and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling

    Latitude, longitude, and beyond: mining mobile objects' behavior

    Rapid advancements in Micro-Electro-Mechanical Systems (MEMS) and wireless communications have resulted in a surge in data generation. Mobility data is one of the various forms of data that are ubiquitously collected by different location sensing devices. Extensive knowledge about the behavior of humans and wildlife is buried in raw mobility data. This knowledge can be used to realize numerous viable applications, ranging from wildlife movement analysis to various location-based recommendation systems, urban planning, and disaster relief. In this thesis, we therefore focus mainly on providing data analytics for understanding the behavior and interaction of mobile entities (humans and animals). To this end, the main research question to be addressed is: How can behaviors and interactions of mobile entities be determined from mobility data acquired by (mobile) wireless sensor nodes in an accurate and efficient manner? To answer this question, both application requirements and technological constraints are considered in this thesis. On the one hand, application requirements call for accurate data analytics to uncover hidden information about the individual behavior and social interaction of mobile entities, and to deal with the uncertainties in mobility data. Technological constraints, on the other hand, require these data analytics to be efficient in terms of energy consumption and to have a low memory footprint and low processing complexity.

    Cyclist Detection, Tracking, and Trajectory Analysis in Urban Traffic Video Data

    The major objective of this thesis is to examine computer vision and machine learning detection methods, tracking algorithms and trajectory analysis for cyclists in traffic video data, and to develop an efficient system for cyclist counting. Due to the growing number of cyclist accidents on urban roads, methods for collecting information on cyclists are of significant importance to the Department of Transportation. The collected information provides insights into solving critical problems related to transportation planning, implementing safety countermeasures, and managing traffic flow efficiently. Intelligent Transportation Systems (ITS) employ automated tools to collect traffic information from traffic video data. In comparison to other road users, such as cars and pedestrians, automated cyclist data collection is a relatively new research area. In this work, a vision-based method for gathering cyclist count data at intersections and road segments is developed. First, we develop a methodology for efficient detection and tracking of cyclists. A combination of classification features and motion-based properties is evaluated to detect cyclists in the test video data. A Convolutional Neural Network (CNN) based detector called You Only Look Once (YOLO) is implemented to increase the detection accuracy. In the next step, the detection results are fed into a tracker based on Kernelized Correlation Filters (KCF), which, in cooperation with a bipartite graph matching algorithm, allows multiple cyclists to be tracked concurrently. Then, a trajectory rebuilding method and a trajectory comparison model are applied to refine the accuracy of tracking and counting. The trajectory comparison is based on a semantic similarity approach. The proposed counting method is the first cyclist counting method able to count cyclists under different movement patterns.
The trajectory data obtained can be further utilized for cyclist behavioral modeling and safety analysis.
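
The association step mentioned in the abstract above (matching existing tracks to new detections with bipartite matching) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the cost function or the matching algorithm, so the sketch uses intersection over union (IoU) as the score and an exhaustive search that is only practical for a handful of boxes. The names `iou`, `match_tracks` and `min_iou` are hypothetical.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_tracks(tracks, detections, min_iou=0.3):
    """Return (track_index, detection_index) pairs that maximize total IoU.

    Assumption: exhaustive search over one-to-one assignments, which is
    exponential in the number of boxes and only suitable for small inputs.
    """
    # Enumerate assignments from the smaller set into the larger one.
    swapped = len(tracks) > len(detections)
    small, large = (detections, tracks) if swapped else (tracks, detections)

    best, best_score = (), -1.0
    for perm in permutations(range(len(large)), len(small)):
        score = sum(iou(small[i], large[j]) for i, j in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score

    # Discard pairs that barely overlap; they are treated as unmatched.
    pairs = [(i, j) for i, j in enumerate(best) if iou(small[i], large[j]) >= min_iou]
    return [(j, i) for i, j in pairs] if swapped else pairs


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
print(match_tracks(tracks, detections))
```

For realistic numbers of objects, a polynomial-time assignment solver such as the Hungarian algorithm would replace the exhaustive loop; the structure of the cost computation stays the same.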

    Proceedings of the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    This book is a collection of 15 reviewed technical reports summarizing the presentations at the 2011 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory. The covered topics include image processing, optical signal processing, visual inspection, pattern recognition and classification, human-machine interaction, world and situation modeling, autonomous system localization and mapping, information fusion, and trust propagation in sensor networks

    Hierarchische Modelle für das visuelle Erkennen und Lernen von Objekten, Szenen und Aktivitäten

    In many computer vision applications, objects have to be learned and recognized in images or image sequences. Most of these objects have a hierarchical structure. For example, 3D objects can be decomposed into object parts, and object parts, in turn, into geometric primitives. Furthermore, scenes are composed of objects. Activities or behaviors can likewise be divided hierarchically into actions, and these into individual movements, and so on. Hierarchical models are therefore ideally suited for the representation of a wide range of objects used in applications such as object recognition, human pose estimation, or activity recognition. In this work, new probabilistic hierarchical models are presented that allow an efficient representation of multiple objects of different categories, scales, rotations, and views. The idea is to exploit similarities between objects, object parts, or actions and movements in order to share calculations and avoid redundant information. We introduce online and offline learning methods that make it possible to create efficient hierarchies based on small or large training datasets, in which poses or articulated structures are given by instances. Furthermore, we present inference approaches for fast and robust detection. These new approaches combine the idea of compositional and similarity hierarchies and overcome limitations of previous methods. They are used in a unified hierarchical framework, spatially for object recognition as well as spatiotemporally for activity recognition. The unified generic hierarchical framework allows us to apply the proposed models in different projects. Besides classical object recognition, it is used for the detection of human poses in a project for gait analysis. The activity detection is used in a project for the design of environments for ageing, to identify activities and behavior patterns in smart homes.
In a project for parking spot detection using an intelligent vehicle, the proposed approaches are used to hierarchically model the environment of the vehicle for an efficient and robust interpretation of the scene in real time.

    Image-Based Scene Analysis for Computer-Assisted Laparoscopic Surgery

    This thesis is concerned with image-based scene analysis for computer-assisted laparoscopic surgery. The focus lies on how to extract different types of information from laparoscopic video data. Methods for semantic analysis can be used to determine what instruments and organs are currently visible and where they are located. Quantitative analysis provides numerical information on the sizes of and distances between structures. Workflow analysis uses information from previously seen images to estimate the progression of surgery. To demonstrate that the proposed methods function in real-world scenarios, multiple evaluations on actual laparoscopic image data recorded from surgeries were performed. The proposed methods for semantic and quantitative analysis were successfully evaluated in live phantom and animal studies and were also used during a live gastric bypass on a human patient.

    Automatic Behavior Analysis and Understanding of Collision Processes Using Video Sensors

    Traffic safety is one of the most important social issues due to the multiple impacts and costs of collisions. Traditionally, safety diagnosis depends mainly on historical collision data. This reactive approach remedies a safety problem only after its social cost has materialized. Safety analysts and decision makers must wait until a sufficient number of collisions (typically at least three years of collision data) has been collected before they can analyze the problem and devise countermeasures.
Surrogate safety analysis is an alternative, proactive approach that relies on the observation of traffic events without a collision, in particular “unsafe” events often called “near misses” or “conflicts”. Among these approaches, traffic conflict techniques (TCT) rely mainly on field observers to identify conflicts and interpret their severity. Consequently, TCTs suffer from variations in observer judgement, the cost of collecting conflict data, and the difficulty for observers of measuring safety indicators in real time.

    DATA-DRIVEN ANALYTICAL MODELS FOR IDENTIFICATION AND PREDICTION OF OPPORTUNITIES AND THREATS

    During the lifecycle of mega engineering projects such as energy facilities, infrastructure projects, or data centers, the executives in charge should take into account the potential opportunities and threats that could affect the execution of such projects. These opportunities and threats can arise from different domains, for example geopolitical, economic, or financial, and can have an impact on different entities, such as countries, cities, or companies. The goal of this research is to provide a new approach to identify and predict opportunities and threats using large and diverse data sets and ensemble Long Short-Term Memory (LSTM) neural network models to inform domain-specific foresights. In addition to predicting the opportunities and threats, this research proposes new techniques to support decision-makers in deduction and reasoning. The proposed models and results provide structured output to inform the executive decision-making process concerning large engineering projects (LEPs). This research proposes new techniques that provide not only reliable time-series predictions but also uncertainty quantification to help make more informed decisions. The proposed ensemble framework consists of the following components: first, processed domain knowledge is used to extract a set of entity-domain features; second, structured learning based on Dynamic Time Warping (DTW), to learn similarity between sequences, and Hierarchical Clustering Analysis (HCA) is used to determine which features are relevant for a given prediction problem; and finally, an automated decision based on the input and structured learning from the DTW-HCA is used to build a training dataset, which is fed into a deep LSTM neural network for time-series predictions. A set of deeper ensemble programs, such as Monte Carlo Simulations and Time Label Assignment, is proposed to offer a controlled setting for assessing the impact of external shocks and a temporal alert system, respectively.
The developed model can be used to inform decision makers about the set of opportunities and threats that their entities and assets face as a result of being engaged in an LEP, accounting for epistemic uncertainty.
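
The DTW step named in the abstract above measures similarity between sequences that may differ in length or timing. As a rough illustration only, here is a minimal sketch of the classic dynamic-programming DTW distance, assuming one-dimensional numeric sequences and absolute difference as the local cost. The abstract does not state the cost function, any warping window, or how the distances are passed to the clustering stage, so those choices and the names `dtw_distance` and `pairwise_dtw` are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Assumption: local cost is the absolute difference and no warping
    window is applied. Runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again with an earlier b
                cost[i][j - 1],      # b[j-1] is matched again with an earlier a
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def pairwise_dtw(series):
    """Symmetric matrix of DTW distances between all pairs of sequences."""
    k = len(series)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            dist[i][j] = dist[j][i] = dtw_distance(series[i], series[j])
    return dist


features = [[1, 2, 3], [1, 2, 2, 3], [0, 0]]
for row in pairwise_dtw(features):
    print(row)
```

A distance matrix of this kind is the usual input to a hierarchical clustering routine, which would then group features whose time series behave similarly.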