20 research outputs found

    Exploring techniques for vision based human activity recognition: Methods, systems, and evaluation

    With the wide applications of vision based intelligent systems, image and video analysis technologies have attracted the attention of researchers in the computer vision field. In image and video analysis, human activity recognition is an important research direction. By interpreting and understanding human activity, we can recognize and predict the occurrence of crimes and help the police or other agencies react immediately. In the past, a large number of papers have been published on human activity recognition in video and image sequences. In this paper, we provide a comprehensive survey of the recent development of the techniques, including methods, systems, and quantitative evaluation towards the performance of human activity recognitio

    Robust density modelling using the student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show that experiments over two well-known datasets (Weizmann, MuHAVi) yielded a remarkable improvement in classification accuracy. © 2011 IEEE
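    The robustness argument in the abstract above can be illustrated with a small sketch. It is not the paper's HMM; it only compares the Gaussian maximum-likelihood location (the sample mean) with an EM-style location estimate under a univariate Student's t model. The degrees of freedom `nu`, the fixed `scale`, the function name `t_location` and the sample values are all assumptions made for illustration.

```python
import statistics


def t_location(samples, nu=3.0, scale=1.0, iterations=50):
    """EM-style location estimate under a Student's t model.

    nu (degrees of freedom) and scale are held fixed; both are
    illustrative assumptions, not values taken from the paper.
    """
    mu = statistics.median(samples)  # robust starting point
    for _ in range(iterations):
        # E-step: samples far from the current location get small weights.
        weights = [(nu + 1.0) / (nu + ((x - mu) / scale) ** 2) for x in samples]
        # M-step: the new location is the weighted mean.
        mu = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return mu


# Made-up one-dimensional "features"; 50.0 plays the role of an outlier.
features = [0.1, -0.2, 0.0, 0.3, -0.1, 50.0]
gaussian_mean = statistics.mean(features)  # pulled strongly toward the outlier
robust_mean = t_location(features)         # stays close to the inliers
```

    The down-weighting in the E-step is what makes the t model less sensitive to the outlier: the sample mean gives every point equal weight, whereas here the weight of a point shrinks with its squared distance from the current estimate.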

    Object Tracking

    Object tracking consists of estimating the trajectory of moving objects in a sequence of images. Automating object tracking is a difficult task. Changes in the many parameters that represent the features and motion of the objects, as well as temporary partial or full occlusion of the tracked objects, have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both the state of the art in object tracking methods and new trends in research are described in this book. Fourteen chapters are split into two sections. Section 1 presents new theoretical ideas whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph, it constitutes a consistent body of knowledge in the field of computer object tracking. The intention of the editor was to follow the very quick progress in the development of methods as well as the extension of the application

    Domain Transfer Learning for Object and Action Recognition

    Visual recognition has always been a fundamental problem in computer vision. Its task is to learn visual categories using labeled training data and then identify unlabeled new instances of those categories. However, due to the large variations in visual data, visual recognition is still a challenging problem. Handling the variations in captured images is important for real-world applications where unconstrained data acquisition scenarios are widely prevalent. In this dissertation, we first address the variations between training and testing data. Particularly, for cross-domain object recognition, we propose a Grassmann manifold-based domain adaptation approach to model the domain shift using the geodesic connecting the source and target domains. We further measure the distance between two data points from different domains by integrating the distance of their projections through all the intermediate subspaces along the geodesic. Our proposed approach, which exploits all the intermediate subspaces along the geodesic, produces a more accurate metric. For cross-view action recognition, we present two effective approaches to learn transferable dictionaries and view-invariant sparse representations. In the first approach, we learn a set of transferable dictionaries where each dictionary corresponds to one camera view. The set of dictionaries is learned simultaneously from sets of correspondence videos taken at different views with the aim of encouraging each video in the set to have the same sparse representation. In the second approach, we relax this constraint by encouraging correspondence videos to have similar sparse representations. In addition, we learn a common dictionary that is incoherent to view-specific dictionaries for cross-view action recognition. The set of view-specific dictionaries is learned for specific views while the common dictionary is shared across different views. In this way, we can align view-specific features in the sparse feature spaces spanned by the view-specific dictionary set and transfer the view-shared features in the sparse feature space spanned by the common dictionary. In order to handle the more general variations in captured images, we also exploit the semantic information to learn discriminative feature representations for visual recognition. Class labels are often organized in a hierarchical taxonomy based on their semantic meanings. We propose a novel multi-layer hierarchical dictionary learning framework for region tagging. Specifically, we learn a node-specific dictionary for each semantic label in the taxonomy and preserve the hierarchical semantic structure in the relationship among these node-dictionaries. Our approach can also transfer knowledge from semantic labels at higher levels to help learn the classifiers for semantic labels at lower levels. Moreover, we exploit the semantic attributes for boosting the performance of visual recognition. We encode objects or actions based on attributes that describe them as high-level concepts. We consider two types of attributes. One type of attributes is generated by humans, while the second type is data-driven attributes extracted from data using dictionary learning methods. Attribute-based representation may exhibit variations due to noisy and redundant attributes. We propose a discriminative and compact attribute-based representation by selecting a subset of discriminative attributes from a large attribute set. Three attribute selection criteria are proposed and formulated as a submodular optimization problem. A greedy optimization algorithm is presented and its solution is guaranteed to be at least a (1-1/e)-approximation to the optimum
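    The greedy strategy mentioned at the end of the abstract above can be sketched on a toy problem. The dissertation's three attribute selection criteria are not given here, so this sketch substitutes a different, standard monotone submodular objective (set coverage) under a cardinality budget, the setting in which the greedy rule is known to achieve a (1-1/e)-approximation. The function name `greedy_select`, the attribute names and the sets of covered examples are hypothetical.

```python
def greedy_select(candidates, budget):
    """Greedily pick up to `budget` candidates that maximise coverage.

    `candidates` maps a hypothetical attribute name to the set of
    training examples it covers. Coverage stands in for the
    dissertation's selection criteria, which are not reproduced here.
    """
    chosen, covered = [], set()
    for _ in range(budget):
        best_name, best_gain = None, 0
        for name in sorted(candidates):  # sorted for deterministic tie-breaking
            if name in chosen:
                continue
            gain = len(candidates[name] - covered)  # marginal gain of adding `name`
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # no remaining candidate adds anything
            break
        chosen.append(best_name)
        covered |= candidates[best_name]
    return chosen, covered


attributes = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}
selected, covered_examples = greedy_select(attributes, budget=2)
```

    Each round re-evaluates the marginal gain of every remaining candidate against what is already covered, which is the step that relies on the diminishing-returns property of submodular functions.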

    Efficient Human Activity Recognition in Large Image and Video Databases

    Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research deals with building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions such as the use of motion cues to determine video descriptors, application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we deal with important but overlooked issues such as efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g. indoor surveillance), unconstrained videos (e.g. YouTube), depth or skeletal data (e.g. captured by Kinect), and person images (e.g. Flickr). In particular, we are interested in answering questions like (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of a high dimension low sample size (HDLSS) nature, how can human actions be efficiently recognized in such data? (c) considering the rich 3D motion information available from depth or motion capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of underlying activities? and (d) can motion information from monocular videos be used for automatically determining saliency regions for recognizing actions in still images

    Human action recognition and mobility assessment in smart environments with RGB-D sensors

    This research activity is focused on the development of algorithms and solutions for smart environments exploiting RGB and depth sensors. In particular, the addressed topics refer to the mobility assessment of a subject and to human action recognition. Regarding the first topic, the goal is to implement algorithms for the extraction of objective parameters that can support the assessment of mobility tests performed by healthcare staff. The first proposed algorithm concerns the extraction of six joints on the sagittal plane using depth data provided by the Kinect sensor. The accuracy of the estimated torso and knee angles in the sit-to-stand phase is evaluated using a marker-based stereophotogrammetric system as a reference. A second algorithm is proposed to simplify carrying out the test in a home environment and to allow the extraction of a greater number of parameters from the execution of the Timed Up and Go test. Kinect data are combined with those of an accelerometer through a synchronization algorithm, constituting a setup that can also be used for other applications that benefit from the joint usage of RGB, depth and inertial data. Fall detection algorithms exploiting the same configuration as the Timed Up and Go test are then proposed. Regarding the second topic addressed, the goal is to classify human actions that can be carried out in a home environment. Two algorithms for human action recognition are therefore proposed, which exploit Kinect skeleton joints and a multi-class SVM for the recognition of actions belonging to publicly available datasets, achieving results comparable with the state of the art on the CAD-60, KARD and MSR Action3D datasets.
    Information Engineering. Cippitelli, Enea
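    As a rough illustration of the angle estimation described in the abstract above, the sketch below computes the angle at one joint from three 2D joint positions on the sagittal plane. It is not the thesis's algorithm; the function name `joint_angle_degrees` and the coordinates are hypothetical, and real Kinect data would need filtering and calibration that are not shown.

```python
import math


def joint_angle_degrees(a, b, c):
    """Angle at joint b (in degrees) formed by the segments b->a and b->c.

    a, b, c are (x, y) positions on the sagittal plane. The segments are
    assumed to have non-zero length.
    """
    ux, uy = a[0] - b[0], a[1] - b[1]
    vx, vy = c[0] - b[0], c[1] - b[1]
    cosine = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cosine))


# Hypothetical seated posture: hip, knee and ankle positions in metres.
hip, knee, ankle = (0.0, 0.5), (0.4, 0.5), (0.4, 0.0)
knee_angle = joint_angle_degrees(hip, knee, ankle)
```

    Tracking this angle frame by frame during a sit-to-stand movement would give the kind of objective parameter the abstract refers to.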

    Theory and Algorithms for Reliable Multimodal Data Analysis, Machine Learning, and Signal Processing

    Modern engineering systems collect large volumes of data measurements across diverse sensing modalities. These measurements can naturally be arranged in higher-order arrays of scalars which are commonly referred to as tensors. Tucker decomposition (TD) is a standard method for tensor analysis with applications in diverse fields of science and engineering. Despite its success, TD is severely sensitive to outliers, i.e., heavily corrupted entries that appear sporadically in modern datasets. We study L1-norm TD (L1-TD), a reformulation of TD that promotes robustness. For 3-way tensors, we show, for the first time, that L1-TD admits an exact solution via combinatorial optimization and present algorithms for its solution. We propose two novel algorithmic frameworks for approximating the exact solution to L1-TD, for general N-way tensors. We propose a novel algorithm for dynamic L1-TD, i.e., efficient and joint analysis of streaming tensors. Principal-Component Analysis (PCA), a special case of TD, is also sensitive to outliers. We consider Lp-quasinorm PCA (Lp-PCA) for
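    To give a flavour of what an exact combinatorial solution looks like, the sketch below treats only the simplest related case, a single L1-norm principal component of a small matrix, and not the thesis's L1-TD algorithms. It relies on the known equivalence that maximising the sum of absolute projections over unit vectors q is the same as maximising the Euclidean norm of the signed sum of the samples over all sign vectors b, with q equal to that normalised signed sum. The function name and the data are hypothetical, and the search costs 2^N for N samples, so it is only usable on tiny inputs.

```python
import itertools
import math


def l1_principal_component(points):
    """Exact rank-1 L1-norm principal component by exhaustive sign search.

    `points` is a list of equal-length samples, assumed to contain at
    least one non-zero sample. Returns a unit vector q that maximises
    sum_i |x_i . q|.
    """
    dim = len(points[0])
    best_norm, best_vec = -1.0, None
    for signs in itertools.product((-1.0, 1.0), repeat=len(points)):
        # Signed sum of the samples for this candidate sign vector.
        combo = [sum(s * p[d] for s, p in zip(signs, points)) for d in range(dim)]
        norm = math.sqrt(sum(c * c for c in combo))
        if norm > best_norm:
            best_norm, best_vec = norm, combo
    return [c / best_norm for c in best_vec]


def l1_objective(points, q):
    """Sum of absolute projections of the samples onto q."""
    return sum(abs(sum(x * w for x, w in zip(p, q))) for p in points)


samples = [(1.0, 0.0), (2.0, 0.0), (-3.0, 0.0)]
component = l1_principal_component(samples)
```

    The sign of the returned vector is arbitrary, since q and -q give the same objective value.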

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing is done in real time or takes hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition

    Novel Aspects of Interference Alignment in Wireless Communications

    Interference alignment (IA) is a promising joint-transmission technology that essentially enables the maximum achievable degrees-of-freedom (DoF) in K-user interference channels. Fundamentally, wireless networks are interference-limited since the spectral efficiency of each user in the network degrades as the number of users increases. IA breaks through this barrier, which is imposed by traditional interference management techniques, and promises large gains in spectral efficiency and DoF, notably in interference-limited environments. This dissertation concentrates on overcoming the challenges as well as exploiting the opportunities of IA in K-user multiple-input multiple-output (MIMO) interference channels. In particular, we consider IA in K-user MIMO interference channels in three novel aspects. In the first aspect, we develop a new IA solution by designing transmit precoding and interference suppression matrices through a novel iterative algorithm based on the Min-Maxing strategy. The Min-Maxing IA optimization problem is formulated such that each receiver maximizes the power of the desired signal while keeping the minimum leakage interference as a constraint. This optimization problem is solved by relaxing it into a standard semidefinite programming form, and its convergence is also proved. Furthermore, we propose a simplified Min-Maxing IA algorithm for rank-deficient interference channels to achieve the targeted performance with less complexity. Our numerical results show that the Min-Maxing IA algorithm offers a significant sum-rate improvement in K-user MIMO interference channels compared to the existing algorithms in the literature in the high signal-to-noise ratio (SNR) regime. Moreover, the simplified algorithm matches the optimal performance in systems with rank-deficient channels. In the second aspect, we deal with the practical challenges of IA under realistic channels, where IA is highly affected by spatial correlation. The sum-rate and symbol error-rate of IA are dramatically degraded in real-world scenarios since the correlation between channels decreases the SNR of the received signal after alignment. For this reason, an acceptable sum-rate of IA in MIMO orthogonal frequency-division-multiplexing (MIMO-OFDM) interference channels was obtained in the literature by modifying the locations of network nodes and the separation between the antennas within each node in order to minimize the correlation between channels. In this regard, we apply transmit antenna selection to MIMO-OFDM IA systems, through either bulk or per-subcarrier selection, aiming at improving the sum-rate and/or error-rate performance under real-world channel conditions while keeping the minimum spatial antenna separation at half a wavelength. A constrained per-subcarrier antenna selection is performed to avoid the subcarrier imbalance across the antennas of each user that is caused by per-subcarrier selection. Furthermore, we propose a sub-optimal antenna selection algorithm to reduce the computational complexity of the exhaustive search. An experimental testbed of MIMO-OFDM IA with antenna selection in indoor wireless network scenarios is implemented to collect measured channels. The performance of antenna selection in MIMO IA systems is evaluated using measured and deterministic channels, where antenna selection achieves considerable improvements in sum-rate and error-rate under real-world channels. The third aspect of this work exploits the opportunity IA offers for the resource management problem in OFDM-based MIMO cognitive radio systems that coexist with primary systems. We propose to perform IA-based resource allocation to improve the spectral efficiency of cognitive systems without affecting the quality of service (QoS) of the primary system. IA plays a vital role in the proposed algorithm, enabling the secondary users (SUs) to cooperate and share the available spectrum with the aim of increasing the DoF of the cognitive system. Nevertheless, the number of SUs that can share a given subcarrier is restricted by the IA feasibility conditions, and this limitation is taken into account in the problem formulation. As the optimal resource allocation problem is mixed-integer, we propose an efficient two-phase sub-optimal algorithm to handle it. In the first phase, frequency-clustering with throughput fairness among SUs is performed to satisfy the IA feasibility conditions, where each subcarrier is assigned to a feasible number of SUs. In the second phase, the power is allocated among subcarriers and SUs without violating the interference constraint to the primary system. Simulation results show that IA with frequency-clustering achieves a significant sum-rate increase compared to cognitive radio systems with orthogonal multiple access transmission techniques. The considered aspects and the corresponding achievements give IA a powerful role in future wireless communication systems. The contributions lead to significant improvements in the spectral efficiency of IA-based wireless systems and the reliability of IA under real-world channels.