701 research outputs found

    Cross View Action Recognition

    Get PDF
    openCross View Action Recognition (CVAR) appraises a system's ability to recognise actions from viewpoints that are unfamiliar to the system. The state of the art methods that train on large amounts of training data rely on variation in the training data itself to increase their ability to tackle viewpoints changes. Therefore, these methods not only require a large scale dataset of appropriate classes for the application every time they train, but also correspondingly large amount of computation power for the training process leading to high costs, in terms of time, effort, funds and electrical energy. In this thesis, we propose a methodological pipeline that tackles change in viewpoint, training on small datasets and employing sustainable amounts of resources. Our method uses the optical flow input with a stream of a pre-trained model as-is to obtain a feature. Thereafter, this feature is used to train a custom designed classifier that promotes view-invariant properties. Our method only uses video information as input, in contrast to another set of methods that approach CVAR by using depth or pose input at the expense of increased sensor costs. We present a number of comparative analysis that aided the design of the pipelines, farther assessing the power of each component in the pipeline. The technique can also be adopted to existing, trained classifiers, with minimal fine-tuning, as this work demonstrates by comparing classifiers including shallow classifiers, deep pre-trained classifiers and our proposed classifier trained from scratch. Additionally, we present a set of qualitative results that promote our understanding of the relationship between viewpoints in the feature-space.openXXXII CICLO - INFORMATICA E INGEGNERIA DEI SISTEMI/ COMPUTER SCIENCE AND SYSTEMS ENGINEERING - InformaticaGoyal, Gaurv

    Statistical/Geometric Techniques for Object Representation and Recognition

    Get PDF
    Object modeling and recognition are key areas of research in computer vision and graphics with wide range of applications. Though research in these areas is not new, traditionally most of it has focused on analyzing problems under controlled environments. The challenges posed by real life applications demand for more general and robust solutions. The wide variety of objects with large intra-class variability makes the task very challenging. The difficulty in modeling and matching objects also vary depending on the input modality. In addition, the easy availability of sensors and storage have resulted in tremendous increase in the amount of data that needs to be processed which requires efficient algorithms suitable for large-size databases. In this dissertation, we address some of the challenges involved in modeling and matching of objects in realistic scenarios. Object matching in images require accounting for large variability in the appearance due to changes in illumination and view point. Any real world object is characterized by its underlying shape and albedo, which unlike the image intensity are insensitive to changes in illumination conditions. We propose a stochastic filtering framework for estimating object albedo from a single intensity image by formulating the albedo estimation as an image estimation problem. We also show how this albedo estimate can be used for illumination insensitive object matching and for more accurate shape recovery from a single image using standard shape from shading formulation. We start with the simpler problem where the pose of the object is known and only the illumination varies. We then extend the proposed approach to handle unknown pose in addition to illumination variations. We also use the estimated albedo maps for another important application, which is recognizing faces across age progression. Many approaches which address the problem of modeling and recognizing objects from images assume that the underlying objects are of diffused texture. But most real world objects exhibit a combination of diffused and specular properties. We propose an approach for separating the diffused and specular reflectance from a given color image so that the algorithms proposed for objects of diffused texture become applicable to a much wider range of real world objects. Representing and matching the 2D and 3D geometry of objects is also an integral part of object matching with applications in gesture recognition, activity classification, trademark and logo recognition, etc. The challenge in matching 2D/3D shapes lies in accounting for the different rigid and non-rigid deformations, large intra-class variability, noise and outliers. In addition, since shapes are usually represented as a collection of landmark points, the shape matching algorithm also has to deal with the challenges of missing or unknown correspondence across these data points. We propose an efficient shape indexing approach where the different feature vectors representing the shape are mapped to a hash table. For a query shape, we show how the similar shapes in the database can be efficiently retrieved without the need for establishing correspondence making the algorithm extremely fast and scalable. We also propose an approach for matching and registration of 3D point cloud data across unknown or missing correspondence using an implicit surface representation. Finally, we discuss possible future directions of this research

    Computational Methods for Task-Directed Sensor Data Fusion and Sensor Planning

    Get PDF
    In this paper, we consider the problem of task-directed information gathering. We first develop a decision-theoretic model of task-directed sensing in which sensors are modeled as noise-contaminated, uncertain measurement systems and sensing tasks are modeled by a transformation describing the type of information required by the task, a utility function describing sensitivity to error, and a cost function describing time or resource constraints on the system. This description allows us to develop a standard conditional Bayes decision-making model where the value of information, or payoff, of an estimate is defined as the average utility (the expected value of some function of decision or estimation error) relative to the current probability distribution and the best estimate is that which maximizes payoff. The optimal sensor viewing strategy is that which maximizes the net payoff (decision value minus observation costs) of the final estimate. The advantage of this solution is generality--it does not assume a particular sensing modality or sensing task. However, solutions to this updating problem do not exist in closed-form. This, motivates the development of an approximation to the optimal solution based on a grid-based implementation of Bayes\u27 theorem. We describe this algorithm, analyze its error properties, and indicate how it can be made robust to errors in the description of sensors and discrepancies between geometric models and sensed objects. We also present the results of this fusion technique applied to several different information gathering tasks in simulated situations and in a distributed sensing system we have constructed

    A Review on Human Activity Recognition Using Vision-Based Method

    Get PDF

    Facial Expression Recognition System

    Get PDF
    A key requirement for developing any innovative system in a computing environment is to integrate a sufficiently friendly interface with the average end user. Accurate design of such a user-centered interface, however, means more than just the ergonomics of the panels and displays. It also requires that designers precisely define what information to use and how, where, and when to use it. Facial expression as a natural, non-intrusive and efficient way of communication has been considered as one of the potential inputs of such interfaces. The work of this thesis aims at designing a robust Facial Expression Recognition (FER) system by combining various techniques from computer vision and pattern recognition. Expression recognition is closely related to face recognition where a lot of research has been done and a vast array of algorithms have been introduced. FER can also be considered as a special case of a pattern recognition problem and many techniques are available. In the designing of an FER system, we can take advantage of these resources and use existing algorithms as building blocks of our system. So a major part of this work is to determine the optimal combination of algorithms. To do this, we first divide the system into 3 modules, i.e. Preprocessing, Feature Extraction and Classification, then for each of them some candidate methods are implemented, and eventually the optimal configuration is found by comparing the performance of different combinations. Another issue that is of great interest to facial expression recognition systems designers is the classifier which is the core of the system. Conventional classification algorithms assume the image is a single variable function of a underlying class label. However this is not true in face recognition area where the appearance of the face is influenced by multiple factors: identity, expression, illumination and so on. To solve this problem, in this thesis we propose two new algorithms, namely Higher Order Canonical Correlation Analysis and Simple Multifactor Analysis which model the image as a multivariable function. The addressed issues are challenging problems and are substantial for developing a facial expression recognition system

    Uniscale and multiscale gait recognition in realistic scenario

    Get PDF
    The performance of a gait recognition method is affected by numerous challenging factors that degrade its reliability as a behavioural biometrics for subject identification in realistic scenario. Thus for effective visual surveillance, this thesis presents five gait recog- nition methods that address various challenging factors to reliably identify a subject in realistic scenario with low computational complexity. It presents a gait recognition method that analyses spatio-temporal motion of a subject with statistical and physical parameters using Procrustes shape analysis and elliptic Fourier descriptors (EFD). It introduces a part- based EFD analysis to achieve invariance to carrying conditions, and the use of physical parameters enables it to achieve invariance to across-day gait variation. Although spatio- temporal deformation of a subject’s shape in gait sequences provides better discriminative power than its kinematics, inclusion of dynamical motion characteristics improves the iden- tification rate. Therefore, the thesis presents a gait recognition method which combines spatio-temporal shape and dynamic motion characteristics of a subject to achieve robust- ness against the maximum number of challenging factors compared to related state-of-the- art methods. A region-based gait recognition method that analyses a subject’s shape in image and feature spaces is presented to achieve invariance to clothing variation and carry- ing conditions. To take into account of arbitrary moving directions of a subject in realistic scenario, a gait recognition method must be robust against variation in view. Hence, the the- sis presents a robust view-invariant multiscale gait recognition method. Finally, the thesis proposes a gait recognition method based on low spatial and low temporal resolution video sequences captured by a CCTV. The computational complexity of each method is analysed. Experimental analyses on public datasets demonstrate the efficacy of the proposed methods

    A Study on Human Motion Acquisition and Recognition Employing Structured Motion Database

    Get PDF
    九州工業大学博士学位論文 学位記番号:工博甲第332号 学位授与年月日:平成24年3月23日1 Introduction||2 Human Motion Representation||3 Human Motion Recognition||4 Automatic Human Motion Acquisition||5 Human Motion Recognition Employing Structured Motion Database||6 Analysis on the Constraints in Human Motion Recognition||7 Multiple Persons’ Action Recognition||8 Discussion and ConclusionsHuman motion analysis is an emerging research field for the video-based applications capable of acquiring and recognizing human motions or actions. The automaticity of such a system with these capabilities has vital importance in real-life scenarios. With the increasing number of applications, the demand for a human motion acquisition system is gaining importance day-by-day. We develop such kind of acquisition system based on body-parts modeling strategy. The system is able to acquire the motion by positioning body joints and interpreting those joints by the inter-parts inclination. Besides the development of the acquisition system, there is increasing need for a reliable human motion recognition system in recent years. There are a number of researches on motion recognition is performed in last two decades. At the same time, an enormous amount of bulk motion datasets are becoming available. Therefore, it becomes an indispensable task to develop a motion database that can deal with large variability of motions efficiently. We have developed such a system based on the structured motion database concept. In order to gain a perspective on this issue, we have analyzed various aspects of the motion database with a view to establishing a standard recognition scheme. The conventional structured database is subjected to improvement by considering three aspects: directional organization, nearest neighbor searching problem resolution, and prior direction estimation. In order to investigate and analyze comprehensively the effect of those aspects on motion recognition, we have adopted two forms of motion representation, eigenspace-based motion compression, and B-Tree structured database. Moreover, we have also analyzed the two important constraints in motion recognition: missing information and clutter outdoor motions. Two separate systems based on these constraints are also developed that shows the suitable adoption of the constraints. However, several people occupy a scene in practical cases. We have proposed a detection-tracking-recognition integrated action recognition system to deal with multiple people case. The system shows decent performance in outdoor scenarios. The experimental results empirically illustrate the suitability and compatibility of various factors of the motion recognition

    Electromechanical Dynamics of High Photovoltaic Power Grids

    Get PDF
    This dissertation study focuses on the impact of high PV penetration on power grid electromechanical dynamics. Several major aspects of power grid electromechanical dynamics are studied under high PV penetration, including frequency response and control, inter-area oscillations, transient rotor angle stability and electromechanical wave propagation.To obtain dynamic models that can reasonably represent future power systems, Chapter One studies the co-optimization of generation and transmission with large-scale wind and solar. The stochastic nature of renewables is considered in the formulation of mixed-integer programming model. Chapter Two presents the development procedures of high PV model and investigates the impact of high PV penetration on frequency responses. Chapter Three studies the impact of PV penetration on inter-area oscillations of the U.S. Eastern Interconnection system. Chapter Four presents the impacts of high PV on other electromechanical dynamic issues, including transient rotor angle stability and electromechanical wave propagation. Chapter Five investigates the frequency response enhancement by conventional resources. Chapter Six explores system frequency response improvement through real power control of wind and PV. For improving situation awareness and frequency control, Chapter Seven studies disturbance location determination based on electromechanical wave propagation. In addition, a new method is developed to generate the electromechanical wave propagation speed map, which is useful to detect system inertia distribution change. Chapter Eight provides a review on power grid data architectures for monitoring and controlling power grids. Challenges and essential elements of data architecture are analyzed to identify various requirements for operating high-renewable power grids and a conceptual data architecture is proposed. Conclusions of this dissertation study are given in Chapter Nine

    High Performance Video Stream Analytics System for Object Detection and Classification

    Get PDF
    Due to the recent advances in cameras, cell phones and camcorders, particularly the resolution at which they can record an image/video, large amounts of data are generated daily. This video data is often so large that manually inspecting it for object detection and classification can be time consuming and error prone, thereby it requires automated analysis to extract useful information and meta-data. The automated analysis from video streams also comes with numerous challenges such as blur content and variation in illumination conditions and poses. We investigate an automated video analytics system in this thesis which takes into account the characteristics from both shallow and deep learning domains. We propose fusion of features from spatial frequency domain to perform highly accurate blur and illumination invariant object classification using deep learning networks. We also propose the tuning of hyper-parameters associated with the deep learning network through a mathematical model. The mathematical model used to support hyper-parameter tuning improved the performance of the proposed system during training. The outcomes of various hyper-parameters on system's performance are compared. The parameters that contribute towards the most optimal performance are selected for the video object classification. The proposed video analytics system has been demonstrated to process a large number of video streams and the underlying infrastructure is able to scale based on the number and size of the video stream(s) being processed. The extensive experimentation on publicly available image and video datasets reveal that the proposed system is significantly more accurate and scalable and can be used as a general purpose video analytics system.N/
    corecore