
    Detection and Generalization of Spatio-temporal Trajectories for Motion Imagery

    In today's world of vast information availability, users often confront large, unorganized amounts of data with limited tools for managing them. Motion imagery datasets have become an increasingly popular means for exposing and disseminating information. Commonly, moving objects are of primary interest in modeling such datasets. Users may require different levels of detail, mainly for visualization and further processing, according to the application at hand. In this thesis we exploit the geometric attributes of objects for dataset summarization, using a series of image processing and neural network tools. To form data summaries we select representative time instances through the segmentation of an object's spatio-temporal trajectory lines. Instances of high movement variation are selected through a new hybrid self-organizing map (SOM) technique to describe a single spatio-temporal trajectory. Multiple objects move in diverse yet classifiable patterns. To group corresponding trajectories, we utilize an abstraction mechanism that investigates a vague moving relevance between the data in space and time. Thus, we introduce the spatio-temporal neighborhood unit as a variable generalization surface; by altering the unit's dimensions, scaled generalization is accomplished. Common complications in tracking applications, including occlusion, noise, information gaps, and unconnected segments of data sequences, are addressed through the hybrid-SOM analysis. Nevertheless, entangled data sequences, where there is no information on which data entry belongs to which trajectory, are frequently encountered. A multidimensional classification technique that combines a geometric and a backpropagation neural network implementation is used to distinguish between trajectory data. Furthermore, modeling and summarization of two-dimensional phenomena evolving in time brings forward the novel concept of spatio-temporal helixes as compact event representations. The phenomena models are composed of SOM movement nodes (spines) and cardinality shape-change descriptors (prongs). While we focus on the analysis of MI datasets, the framework can be generalized to other types of spatio-temporal datasets. Multiple-scale generalization is performed at a dynamic, significance-based scale rather than a constant one. The constructed summaries are not just a visualization product; they also support further processing for metadata creation, indexing, and querying. Experimentation, comparisons, and error estimations for each technique support the analyses discussed.
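
    The abstract does not spell out the hybrid SOM construction; as a rough, minimal illustration of how a one-dimensional self-organizing map can pick representative time instances from a single trajectory, the sketch below (NumPy only) trains a small SOM on (x, y, t) samples and maps each node back to its nearest sample. The trajectory format, node count, and decay schedules are assumptions for illustration, not the thesis's actual hybrid-SOM technique.

```python
import numpy as np

def train_1d_som(points, n_nodes=10, n_iters=500, lr0=0.5, sigma0=3.0, seed=0):
    """Fit a 1-D self-organizing map to spatio-temporal trajectory samples.

    points : (N, D) array, e.g. columns (x, y, t) of one trajectory.
    Returns the (n_nodes, D) node weights, ordered along the map.
    """
    rng = np.random.default_rng(seed)
    # Initialize nodes on a straight line between the first and last sample.
    alphas = np.linspace(0, 1, n_nodes)[:, None]
    weights = (1 - alphas) * points[0] + alphas * points[-1]
    node_idx = np.arange(n_nodes)

    for t in range(n_iters):
        lr = lr0 * np.exp(-t / n_iters)          # decaying learning rate
        sigma = sigma0 * np.exp(-t / n_iters)    # decaying neighborhood width
        p = points[rng.integers(len(points))]    # random training sample
        bmu = np.argmin(np.linalg.norm(weights - p, axis=1))  # best-matching unit
        h = np.exp(-((node_idx - bmu) ** 2) / (2 * sigma ** 2))
        weights += lr * h[:, None] * (p - weights)
    return weights

def representative_instances(points, weights):
    """Map each SOM node to the index of its nearest trajectory sample."""
    d = np.linalg.norm(points[:, None, :] - weights[None, :, :], axis=2)
    return np.unique(d.argmin(axis=0))

# Example: a noisy spiral trajectory summarized by 8 representative time instances.
t = np.linspace(0, 4 * np.pi, 200)
traj = np.stack([t * np.cos(t), t * np.sin(t), t], axis=1)
nodes = train_1d_som(traj, n_nodes=8)
print(representative_instances(traj, nodes))
```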

    A Study on Human Motion Acquisition and Recognition Employing Structured Motion Database

    Doctoral dissertation, Kyushu Institute of Technology. Degree certificate number: 工博甲第332号. Date of conferral: March 23, 2012. Contents: 1 Introduction; 2 Human Motion Representation; 3 Human Motion Recognition; 4 Automatic Human Motion Acquisition; 5 Human Motion Recognition Employing Structured Motion Database; 6 Analysis on the Constraints in Human Motion Recognition; 7 Multiple Persons' Action Recognition; 8 Discussion and Conclusions. Human motion analysis is an emerging research field for video-based applications capable of acquiring and recognizing human motions or actions. The automaticity of a system with these capabilities is of vital importance in real-life scenarios. With the increasing number of applications, the demand for a human motion acquisition system is growing day by day. We develop such an acquisition system based on a body-parts modeling strategy. The system acquires motion by positioning body joints and interpreting those joints through inter-part inclination. Besides the development of the acquisition system, there is an increasing need for a reliable human motion recognition system. A number of studies on motion recognition have been carried out over the last two decades, and at the same time enormous bulk motion datasets are becoming available. It has therefore become an indispensable task to develop a motion database that can handle a large variability of motions efficiently. We have developed such a system based on the structured motion database concept. To gain perspective on this issue, we have analyzed various aspects of the motion database with a view to establishing a standard recognition scheme. The conventional structured database is improved by considering three aspects: directional organization, resolution of the nearest-neighbor searching problem, and prior direction estimation. To investigate and analyze the effect of these aspects on motion recognition comprehensively, we adopt two forms of motion representation: eigenspace-based motion compression and a B-tree structured database. Moreover, we analyze two important constraints in motion recognition: missing information and cluttered outdoor motions. Two separate systems based on these constraints are also developed, showing suitable handling of the constraints. In practical cases, however, several people occupy a scene. We have proposed a detection-tracking-recognition integrated action recognition system to deal with the multiple-people case. The system shows decent performance in outdoor scenarios. The experimental results empirically illustrate the suitability and compatibility of the various factors of the motion recognition approach.
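
    As a hedged illustration of the eigenspace-based motion compression with nearest-neighbor matching mentioned above, the sketch below projects flattened motion feature vectors onto a PCA basis and classifies a query by its nearest compressed neighbor. The feature layout, labels, and dimensionality are assumptions; the dissertation's structured (B-tree organized) database is not reproduced here.

```python
import numpy as np

def build_eigenspace(motions, n_components=10):
    """Project fixed-length motion feature vectors into a low-dimensional eigenspace.

    motions : (N, D) array, one flattened joint-feature sequence per row.
    Returns (mean, basis, projections) where basis is (n_components, D).
    """
    mean = motions.mean(axis=0)
    centered = motions - mean
    # Principal axes come from the SVD of the centered data matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    return mean, basis, centered @ basis.T

def recognize(query, mean, basis, projections, labels):
    """Nearest-neighbor classification in the compressed eigenspace."""
    q = (query - mean) @ basis.T
    nearest = np.argmin(np.linalg.norm(projections - q, axis=1))
    return labels[nearest]

# Toy example: 3 classes of synthetic "motions", 60-dimensional feature vectors.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(c, 0.3, size=(20, 60)) for c in (0.0, 1.0, 2.0)])
y = np.repeat(["walk", "run", "wave"], 20)
mean, basis, proj = build_eigenspace(X, n_components=5)
print(recognize(rng.normal(1.0, 0.3, size=60), mean, basis, proj, y))
```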

    Model and Appearance Based Analysis of Neuronal Morphology from Different Microscopy Imaging Modalities

    Neuronal morphology analysis is key to understanding how a brain works. This process requires a neuron imaging system with single-cell resolution; however, no such system is feasible for the human brain. Fortunately, knowledge can be transferred from the model organism Drosophila melanogaster to the human system. This dissertation explores the morphology analysis of Drosophila larvae at single-cell resolution in static images and image sequences, as well as across multiple microscopy imaging modalities. Our contributions cover both computational methods for morphology quantification and analysis of the influence of the anatomical aspect. We develop novel model- and appearance-based methods for morphology quantification and illustrate their significance in three neuroscience studies. Modeling the structure and dynamics of neuronal circuits creates understanding of how connectivity patterns are formed within a motor circuit and determines whether the connectivity map of neurons can be deduced from estimations of neuronal morphology. To address this problem, we study both boundary-based and centerline-based approaches for neuron reconstruction in static volumes. Neuronal mechanisms are related to morphology dynamics, so the patterns of neuronal morphology changes are analyzed along with other aspects. In this case, the relationship between neuronal activity and morphology dynamics is explored to analyze locomotion procedures. Our tracking method models the morphology dynamics in calcium image sequences designed for detecting neuronal activity. It follows a local-to-global design to handle calcium imaging issues and neuronal movement characteristics. Lastly, modeling the link between structural and functional development depicts the correlation between neuron growth and protein interactions. This requires morphology analysis across different imaging modalities, which can be solved using part-wise volume segmentation with artificial templates, a standardized representation of neurons. Our method follows a global-to-local approach to solve both part-wise segmentation and registration across modalities. Our methods address common issues in automated morphology analysis, from extracting morphological features to tracking neurons, as well as mapping neurons across imaging modalities. The quantitative analysis delivered by our techniques enables a number of new applications and visualizations for advancing the investigation of phenomena in the nervous system.
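
    The centerline-based reconstruction mentioned above can be pictured, in a much simplified 2-D setting, as skeletonizing a binary neuron mask; the sketch below uses scikit-image's skeletonize for that step. The mask format and the 2-D restriction are assumptions made only for illustration, not the dissertation's actual volumetric pipeline.

```python
import numpy as np
from skimage.morphology import skeletonize

def extract_centerline(binary_mask):
    """Reduce a binary neuron mask to a one-pixel-wide centerline.

    binary_mask : 2-D boolean array (a segmented image slice).
    Returns the skeleton mask and the (row, col) coordinates of centerline pixels.
    """
    skeleton = skeletonize(binary_mask)
    return skeleton, np.argwhere(skeleton)

# Toy example: a thick diagonal "branch" reduced to its centerline.
mask = np.zeros((64, 64), dtype=bool)
for i in range(60):
    mask[i, max(0, i - 2):i + 3] = True   # 5-pixel-wide stroke
skel, coords = extract_centerline(mask)
print(f"{mask.sum()} foreground pixels reduced to {skel.sum()} centerline pixels")
```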

    Gesture-based Numeral Extraction and Recognition Shree Prakash

    In this work, the extraction and recognition of numerals is performed using gestures. Gestures are elementary movements of a human body part and are the atomic components describing the meaningful motion of a person; they are of utmost importance in designing an intelligent and efficient human-computer interface. Two approaches are proposed for extracting numerals from gestures. In the first approach, numerals are formed using finger gestures. The movement of the finger gesture is identified using an optical flow method. A view-specific representation of movement is constructed, where movement is defined as motion over time, and a temporal encoding from different frames into a single frame is performed. To achieve this, we utilize the motion history image (MHI) scheme, which spans the time scale of the gesture. In the second approach, the gesture is performed with a pointer such as a pen whose tip is either red, green, or blue. Multiple persons are present in the scene performing various activities, but our scheme captures only the gesture made by the desired object. The HSI color model is used to segment the tip, followed by optical flow to segment the motion. After obtaining the temporal template, features are extracted and recognition is performed. Our second approach is invariant to uninteresting movements in the surroundings while capturing the gesture, so they do not affect the final recognition result.
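
    As a minimal sketch of the motion history image (MHI) scheme referred to above, the following code stamps recent frame-difference motion with a maximum value tau and lets older motion decay, so the single resulting image encodes where and how recently motion occurred. The differencing threshold, decay step, and synthetic frames are assumptions for illustration, not the parameters used in this work.

```python
import numpy as np

def update_mhi(mhi, frame, prev_frame, tau=30, diff_thresh=25):
    """One Motion History Image update step.

    Pixels whose frame-to-frame change exceeds diff_thresh are set to tau;
    all other pixels decay by 1, so recent motion appears brightest.
    """
    motion = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > diff_thresh
    mhi = np.maximum(mhi - 1, 0)          # fade old motion
    mhi[motion] = tau                      # stamp new motion
    return mhi

# Toy example: a bright square moving left to right across 10 synthetic frames.
h, w, tau = 48, 48, 10
frames = []
for t in range(10):
    f = np.zeros((h, w), dtype=np.uint8)
    f[20:28, 4 * t:4 * t + 8] = 255
    frames.append(f)

mhi = np.zeros((h, w), dtype=np.int16)
for prev, cur in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, cur, prev, tau=tau)
print("rightmost column of the most recent motion:", np.argwhere(mhi == tau)[:, 1].max())
```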

    Human action recognition using saliency-based global and local features

    Recognising human actions from video sequences is one of the most important topics in computer vision and has been extensively researched over the last few decades; however, it is still regarded as a challenging task, especially in real scenarios, due to difficulties mainly arising from background clutter, partial occlusion, and changes in scale, viewpoint, lighting, and appearance. Human action recognition is used in many applications, including video surveillance systems, human-computer interaction, and robotics for human behaviour characterisation. In this thesis, we aim to introduce new features and methods to enhance and develop human action recognition systems. Specifically, we introduce three methods for human action recognition. In the first approach, we present a novel framework for human action recognition based on salient object detection and a combination of local and global descriptors. Saliency Guided Feature Extraction (SGFE) is proposed to detect salient objects and extract features on the detected objects. We then propose a simple strategy to identify and process only those video frames that contain salient objects. Processing salient objects instead of all frames not only makes the algorithm more efficient but, more importantly, also suppresses the interference of background pixels. We couple this approach with a new combination of local and global descriptors, namely 3D SIFT and Histograms of Oriented Optical Flow (HOOF). The resulting Saliency Guided 3D SIFT and HOOF (SGSH) feature is used along with a multi-class support vector machine (SVM) classifier for human action recognition. The second proposed method is a novel 3D extension of Gradient Location and Orientation Histograms (3D GLOH), which provides discriminative local features representing both the gradient orientations and their relative locations. We further propose a human action recognition system based on the Bag of Visual Words model, combining the new 3D GLOH local features with Histograms of Oriented Optical Flow (HOOF) global features. Along with the idea from our first work of extracting features only in salient regions, our overall system outperforms existing feature descriptors for human action recognition on challenging video datasets. Finally, we propose to extract minimal representative information, namely deforming skeleton graphs corresponding to foreground shapes, to effectively represent actions and remove the influence of changes in illumination, subject appearance, and backgrounds. We propose a novel approach to action recognition based on the matching of skeleton graphs, combining a static pairwise graph similarity measure using Optimal Subsequence Bijection with Dynamic Time Warping to robustly handle topological and temporal variations. We have evaluated the proposed methods by conducting extensive experiments on widely used human action datasets including the KTH, UCF Sports, TV Human Interaction (TVHI), Olympic Sports, and UCF11 datasets. Experimental results show the effectiveness of our methods for action recognition.
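
    As a hedged illustration of the HOOF global descriptor mentioned above, the sketch below builds a magnitude-weighted orientation histogram from a dense optical flow field and L1-normalizes it. The bin layout is a simplification for illustration; the exact binning scheme used in the thesis may differ.

```python
import numpy as np

def hoof(flow, n_bins=8):
    """Simplified Histogram of Oriented Optical Flow (HOOF) descriptor.

    flow : (H, W, 2) array of per-pixel (dx, dy) displacements.
    Each flow vector votes into an orientation bin with a weight equal to
    its magnitude; the histogram is L1-normalized for scale invariance.
    """
    dx, dy = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(dx, dy)
    ang = np.arctan2(dy, dx)                               # in (-pi, pi]
    bins = ((ang + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins, weights=mag, minlength=n_bins)
    return hist / (hist.sum() + 1e-9)

# Toy example: a flow field moving uniformly to the right concentrates
# all of its mass in the bin covering orientation ~0 radians.
flow = np.zeros((32, 32, 2))
flow[..., 0] = 1.0
print(hoof(flow).round(2))
```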

    Articulated people detection and pose estimation in challenging real world environments

    In this thesis we are interested in the problem of articulated people detection and pose estimation as key ingredients towards understanding visual scenes containing people. First, we investigate how statistical 3D human shape models from computer graphics can be leveraged to ease training data generation. Second, we develop expressive models for 2D single- and multi-person pose estimation. Third, we introduce a novel human pose estimation benchmark that makes a significant advance in terms of diversity and difficulty. Thorough experimental evaluation on standard benchmarks demonstrates significant improvements due to the proposed data augmentation techniques and novel body models, while detailed performance analysis of competing approaches on our novel benchmark allows us to identify the most promising directions of improvement. In this thesis we study the problem of articulated people detection and pose estimation as key components of understanding visual scenes containing people. Although extensive efforts have been made to address these problems, we have identified three promising directions that, in our view, have not received sufficient attention so far. First, we investigate how statistical 3D human shape models originating from computer graphics can be effectively leveraged to ease training data generation. We propose a set of automatic data generation techniques that allow relevant variations to be represented directly in the training data. By sampling from the underlying distribution of human shape and from a large dataset of human poses, we generate a new task-relevant selection with controllable variations of shape and pose. Furthermore, we improve the state-of-the-art 3D human shape model itself by rebuilding it from a large commercially available dataset of 3D bodies. Second, we develop expressive spatial and appearance models for 2D single- and multi-person pose estimation. We propose an expressive single-person model that incorporates higher-order part dependencies while remaining efficient. We strengthen this model with several types of strong appearance representations to considerably improve the body part hypotheses. Finally, we propose an expressive model for the joint pose estimation of multiple people. To this end, we develop strong deep-learning-based body part detectors and an expressive fully connected spatial model. The proposed approach treats multi-person pose estimation as a problem of jointly partitioning and labeling a set of body part hypotheses: it infers the number of people in a scene, identifies occluded body parts, and unambiguously distinguishes body parts of people that are close to each other. Third, we carry out a thorough evaluation and performance analysis of leading methods for human pose estimation and activity recognition. To this end, we introduce a new benchmark that constitutes a significant advance in diversity and difficulty compared to previous datasets and contains over 40,000 annotated body poses and more than 1.5 million frames.
In addition, we provide a rich set of annotations that are used for a detailed analysis of competing approaches, yielding insights into the successes and failures of these methods. In summary, this thesis presents a new approach to articulated people detection and pose estimation. A thorough experimental evaluation on standard benchmark datasets shows significant improvements due to the proposed data augmentation techniques and new body models, while a detailed performance analysis of competing approaches on our newly introduced large benchmark allows us to identify the most promising directions for improvement.
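
    Pose estimation benchmarks of the kind described above are commonly reported with a PCK-style metric (percentage of correct keypoints), where a predicted joint counts as correct if it falls within a threshold scaled by a per-person reference length. The sketch below is a minimal, assumed version of such a metric, not the thesis's exact evaluation protocol; the reference-length normalization and threshold are illustrative choices.

```python
import numpy as np

def pck(pred, gt, ref_lengths, alpha=0.5):
    """Percentage of Correct Keypoints (PCK), a common pose-estimation metric.

    pred, gt    : (N, K, 2) arrays of predicted / ground-truth joint positions.
    ref_lengths : (N,) per-person reference lengths (e.g. head or torso size)
                  used to normalize the matching threshold.
    A joint counts as correct if its error is below alpha * reference length.
    """
    errors = np.linalg.norm(pred - gt, axis=2)            # (N, K) pixel errors
    correct = errors < alpha * ref_lengths[:, None]
    return correct.mean()

# Toy example: 2 people, 4 joints each, small random localization errors.
rng = np.random.default_rng(0)
gt = rng.uniform(0, 100, size=(2, 4, 2))
pred = gt + rng.normal(0, 3, size=gt.shape)
print(f"PCK@0.5: {pck(pred, gt, ref_lengths=np.array([20.0, 25.0])):.2f}")
```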