17 research outputs found

    Human behavior analysis in video surveillance: A Social Signal Processing perspective

    The analysis of human activities is one of the most intriguing and important open issues for the automated video surveillance community. Until a few years ago, it was handled from a purely Computer Vision and Pattern Recognition perspective, where an activity corresponded to a temporal sequence of explicit actions (run, stop, sit, walk, etc.). Even under this simplistic assumption, the issue is hard, due to the strong diversity in people's appearance, the number of individuals considered (we may monitor single individuals, groups, or crowds), the variability of the environmental conditions (indoor/outdoor, different weather conditions), and the kinds of sensors employed. More recently, the automated surveillance of human activities has been approached from a new perspective that brings in notions and principles from the social, affective, and psychological literature, and that is called Social Signal Processing (SSP). SSP primarily employs nonverbal cues, most of which lie outside conscious awareness, such as facial expressions and gaze, body posture and gestures, vocal characteristics, relative distances in space, and the like. This paper is the first review analyzing this new trend, proposing a structured snapshot of the state of the art and envisaging novel challenges in the surveillance domain where the cross-pollination of Computer Science technologies and Sociology theories may offer valid investigation strategies.

    Human skeleton estimation by multi-task deep learning considering self- and mutual occlusion

    Degree type: Master's. University of Tokyo (東京大学)

    Intelligent Sensors for Human Motion Analysis

    The book "Intelligent Sensors for Human Motion Analysis" contains 17 articles published in the Special Issue of the Sensors journal. These articles deal with many aspects related to the analysis of human movement. New techniques and methods for pose estimation, gait recognition, and fall detection have been proposed and verified. Some of them will trigger further research, and some may become the backbone of commercial systems.

    Dynamic reconfiguration methods for active camera networks

    [no abstract]

    FROM VISUAL SALIENCY TO VIDEO BEHAVIOUR UNDERSTANDING

    In a world of ever-increasing amounts of video data, we are forced to abandon traditional methods of scene interpretation by fully manual means. Under such circumstances, some form of automation is highly desirable, but this can be a very open-ended issue with high complexity. Dealing with such large amounts of data is a non-trivial task that requires efficient selective extraction of parts of a scene which have the potential to develop a higher semantic meaning, alone or in combination with others. In particular, the types of video data that are in need of automated analysis tend to be outdoor scenes with high levels of activity generated from either foreground or background. Such dynamic scenes add considerable complexity to the problem since we cannot rely on motion energy alone to detect regions of interest. Furthermore, the behaviour of these regions of motion can differ greatly, while still being highly dependent, both spatially and temporally, on the movement of other objects within the scene. Modelling these dependencies, whilst eliminating as much redundancy from the feature extraction process as possible, are the challenges addressed by this thesis. In the first half, the thesis investigates the right mechanism to extract and represent meaningful features from dynamic scenes with no prior knowledge. Meaningful or salient information is treated as the parts of a scene that stand out or seem unusual or interesting to us. The novelty of the work is that it is able to select salient scales in both space and time in which a particular spatio-temporal volume is considered interesting relative to the rest of the scene. By quantifying the temporal saliency values of regions of motion, it is possible to consider their importance in terms of both the long and the short term. Variations in entropy over spatio-temporal scales are used to select a context-dependent measure of the local scene dynamics.
A method of quantifying temporal saliency is devised based on the variation of the entropy of the intensity distribution in a spatio-temporal volume over increasing scales. Entropy is used over traditional filter methods since the stability or predictability of the intensity distribution over scales of a local spatio-temporal region can be defined more robustly relative to the context of its neighbourhood, even for regions exhibiting high intensity variation due to being extremely textured. Results show that it is possible to extract both locally salient features and globally salient temporal features from contrasting scenarios. In the second part of the thesis, the focus shifts towards binding these spatio-temporally salient features together so that some semantic meaning can be inferred from their interaction. Interaction, in this sense, refers to any form of temporally correlated behaviour between any salient regions of motion in a scene. Feature binding as a mechanism for interactive behaviour understanding is particularly important if we consider that regions of interest may not be treated as particularly significant individually, but represent much more semantically when considered in combination. Temporally correlated behaviour is identified and classified using accumulated co-occurrences of salient features at two levels. Firstly, co-occurrences are accumulated for spatio-temporally proximate salient features to form a local representation. Then, at the next level, the co-occurrences of these locally spatio-temporally bound features are accumulated again in order to discover unusual behaviour in the scene. The novelty of this work is that there are no assumptions made about whether interacting regions should be spatially proximate. Furthermore, no prior knowledge of the scene topology is used. Results show that it is possible to detect unusual interactions between regions of motion, from which higher levels of semantics can be visually inferred.
In the final part of the thesis, a more specific investigation of human behaviour is addressed through classification and detection of interactions between two human subjects. Here, further modifications are made to the feature extraction process in order to quantify the spatio-temporal saliency of a region of motion. These features are then grouped to find the people in the scene. Then, a loose pose distribution model is extracted for each person to find salient correlations between the poses of two interacting people using canonical correlation analysis. These canonical factors can be formed into trajectories and used for classification. Levenshtein distance is then used to categorise the features. The novelty of the work is that the interactions do not have to be spatially connected or proximate for them to be recognised. Furthermore, the data used is outdoors and cluttered with non-stationary background. Results show that co-occurrence techniques have the potential to provide a more generalised, compact, and meaningful representation of dynamic interactive scene behaviour. Funded by EPSRC, part-funded by QinetiQ Ltd; a travel grant was also contributed by RAEng.
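
The abstract above mentions Levenshtein distance as the means of categorising feature trajectories. A minimal Python sketch of that idea follows, assuming each interaction has already been encoded as a string of discrete symbols. The abstract does not describe the encoding or the decision rule, so the symbol strings, the labels, and the nearest-neighbour `classify` helper are hypothetical illustrations rather than the thesis's actual method.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute: cost 1 each)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [i]  # deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if x == y else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def classify(query, labelled_examples):
    """Hypothetical nearest-neighbour rule: return the label of the closest example."""
    closest = min(labelled_examples, key=lambda pair: levenshtein(query, pair[0]))
    return closest[1]


# Hypothetical symbol strings standing in for quantised trajectories.
examples = [("aabbb", "handshake"), ("ccddd", "push")]
print(levenshtein("kitten", "sitting"))
print(classify("aabbc", examples))
```

Edit distance tolerates sequences of different lengths and small local timing differences, which is one plausible reason to prefer it over a fixed-length vector comparison for trajectories of this kind.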

    Using semantic drift on social media for event detection, differentiation and segmentation

    With the observable paradigm shift in computer science from predictive modeling to generative modeling, it has become important to maximise exploration of the pathways towards useful data production. With statistical and compositional data augmentation strategies currently dominating, opportunities have also emerged for more application-driven routes. The main value of such approaches lies in their capacity to offer insights into context- or event-specific data production, currently overlooked by more topologically neutral machine learning approaches. The purpose of this thesis is therefore to provide empirical evidence for useful data generation by dynamic event-specific lexical semantic resources. Various Web 2.0 applications, due to their popularity, have been accumulating large amounts of semantically rich metadata, which has become readily available and easily exploitable. Tags, usually consisting of a single word, are one type of such data. Tag use can vary widely across systems and platforms. Also known under the term folksonomy, tags are usually non-hierarchical and open-ended, thus reflecting users' unique perspectives regarding various contexts or resources. This platform-enabled liberty of expression, however, has led to situations of frequent semantic ambiguity due to spelling mistakes, morphological variations, polysemy, multilingualism, or inaccurate tag-to-resource associations. As a consequence, tag spaces are often regarded as inconsistent, noisy, and hardly reliable data sources. A recent surge of interest amongst distributional semanticists in long- and short-term fluctuations of word meanings on social media has suggested routes for successful temporal sense disambiguation, thus inviting discussions around useful real-world applications for such emerging data resources.
One such application, event analytics from the crowd behaviour perspective, is gaining increasing attention from researchers and practitioners, especially in the fields of operations and situational management. Pursuing the pragmatic aims of event detection, differentiation and segmentation, this application domain is represented predominantly by repetitive catastrophic events (such as natural hazards), during which directly or indirectly exposed populations tend to share their situational experiences on social media. This thesis consists of three main parts, each corresponding to a specific problem in event analytics: (i) detection, (ii) differentiation and (iii) segmentation. In the first part I applied the concept of ontological semantic proximity to candidate words for semantic drift in order to highlight the dynamics of their semantic oscillations within an event-specific category (i.e., flooding). In my second experiment I followed up on these initial findings and performed an analysis verifying whether semantically unstable lexical material can augment our knowledge about the main sub-types of floods, such as "slow" (e.g., groundwater and pluvial floods) and "fast" (surface water and riverine floods) ones. In my third experiment I employed combined lexico-visual modalities of the crowdsourced material to reconstruct changing perceptions of flood events in order to understand how event severity can or cannot determine situationally resilient behaviours.

    Semantic discovery and reuse of business process patterns

    Patterns currently play an important role in modern information systems (IS) development, but their use has mainly been restricted to the design and implementation phases of the development lifecycle. Given the increasing significance of business modelling in IS development, patterns have the potential to provide a viable solution for promoting the reusability of recurrent generalized models in the very early stages of development. As a statement of research-in-progress, this paper focuses on business process patterns and proposes an initial methodological framework for the discovery and reuse of business process patterns within the IS development lifecycle. The framework borrows ideas from the domain engineering literature and proposes the use of semantics to drive both the discovery of patterns and their reuse.