
    Time-slice analysis of dyadic human activity

    Recognizing human activities from video data is routinely leveraged for surveillance and human-computer interaction applications. The main focus has been classifying videos into one of k action classes from fully observed videos. However, intelligent systems must make decisions under uncertainty and from incomplete information. This need motivates us to introduce the problem of analysing the uncertainty associated with human activities and to move to a new level of generality in the action analysis problem. We also present the problem of time-slice activity recognition, which aims to explore human activity at a small temporal granularity. Time-slice recognition is able to infer human behaviours from a short temporal window, and temporal slice analysis has been shown to be helpful for motion characterization and for video content representation in general. These studies motivate us to consider time-slices for analysing the uncertainty associated with human activities. We report to what degree of certainty each activity is occurring throughout the video, from definitely not occurring to definitely occurring. In this research, we propose three frameworks for time-slice analysis of dyadic human activity under uncertainty. i) We present a new family of spatio-temporal descriptors which are optimized for early prediction with time-slice action annotations. Our predictive spatio-temporal interest point (Predict-STIP) representation is based on the intuition of temporal contingency between time-slices. ii) We exploit state-of-the-art techniques to extract interest points in order to represent time-slices. We also present an accumulative uncertainty measure to depict the uncertainty associated with partially observed videos for the task of early activity recognition. iii) We use Convolutional Neural Network-based unary and pairwise relations between human body joints in each time-slice. The unary term captures the local appearance of the joints, while the pairwise term captures the local contextual relations between the parts. We extract these features from each frame in a time-slice and examine different temporal aggregations to generate a descriptor for the whole time-slice. Furthermore, we create a novel dataset which is annotated at multiple short temporal windows, allowing the modelling of the inherent uncertainty in time-slice activity recognition. All three methods have been evaluated on the TAP dataset. Experimental results demonstrate the effectiveness of our framework in the analysis of dyadic activities under uncertainty.
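    The per-frame feature extraction and temporal aggregation step in (iii) can be illustrated with a short sketch. This is not the thesis code: the feature dimension, slice length, aggregation modes, and the per-activity sigmoid certainty scoring are assumptions chosen for the example.

```python
# Illustrative sketch: collapse per-frame features into one time-slice
# descriptor, then score each activity's certainty. Dimensions and the
# stand-in classifier are hypothetical, not from the thesis.
import numpy as np

def aggregate_time_slice(frame_features: np.ndarray, mode: str = "mean") -> np.ndarray:
    """Collapse a (num_frames, feat_dim) array into a single descriptor."""
    if mode == "mean":    # average pooling over the slice
        return frame_features.mean(axis=0)
    if mode == "max":     # max pooling keeps the strongest response per dimension
        return frame_features.max(axis=0)
    if mode == "concat":  # concatenation preserves frame order (larger descriptor)
        return frame_features.reshape(-1)
    raise ValueError(f"unknown aggregation mode: {mode}")

# Example: a 15-frame slice with hypothetical 256-d per-frame features.
rng = np.random.default_rng(0)
slice_feats = rng.normal(size=(15, 256))
descriptor = aggregate_time_slice(slice_feats, mode="mean")

# A classifier could then map the descriptor to per-activity certainty
# scores in [0, 1], from "definitely not occurring" to "definitely occurring".
logits = rng.normal(size=5)                # stand-in for classifier output
certainty = 1.0 / (1.0 + np.exp(-logits))  # independent sigmoid per activity
print(descriptor.shape, certainty.round(2))
```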

    Hierarchical Task Recognition and Planning in Smart Homes with Partial Observability

    Older adults with cognitive impairment place a significant burden on their families and on society because of costly care and lost labour. Developing intelligent assistant agents (IAAs) in smart homes that can help these people accomplish activities of daily living (ADLs) independently has attracted tremendous attention from both academia and industry. Ideally, IAAs should recognize older adults’ goals and reason about the further steps needed to achieve them. This paper proposes a goal recognition and planning algorithm to support an IAA in a smart home. The algorithm addresses several important issues. First, it deals with partial observability through Bayesian inference for step recognition. Even advanced sensors are not guaranteed to be 100% reliable, and due to limited accessibility or privacy concerns, not all attributes of physical objects can be measured by sensors. The proposed algorithm can reason about ongoing goals even when some sensors are missing or unreliable. Second, the algorithm reasons about concurrent goals. In everyday life, a person is typically involved in multiple tasks, switching back and forth between them. Based on the context, the proposed algorithm can assign a step to the correct goal and keep track of the goal’s ongoing status. The context comprises the status of ongoing goals inferred from a recognition procedure, together with the desired next steps and tasks obtained through a planning procedure. Last but not least, the algorithm can handle incorrectly executed steps. For older adults with cognitive impairment, executing unrelated or wrong steps towards certain goals is common in daily life. A module is designed to handle wrong steps by detecting them and then prompting the person with the correct steps. The algorithm is based on Hierarchical Task Networks (HTNs), in which the knowledge base is composed of methods (for tasks) and operators (for steps). This hierarchical modeling of tasks and steps enables the algorithm to deal with partially ordered subtasks and alternative plans. Furthermore, the preconditions of methods and operators make it possible to generate feasible hints about next steps and tasks by considering the uncertainties in belief states. In the experiments, a simulator provides virtual sensors and a virtual human executing a sequence of steps predefined in a test case. The algorithm is tested on many simulated cases of varying difficulty: a single goal with correct steps is an easy case, while multiple goals with wrong steps make the problem harder; cases with missing sensors are also examined. The results show that the algorithm works very well on simple cases, achieving nearly 100% accuracy, and even for the hardest cases performance is acceptable when sensor reliabilities are above 0.95. The test cases with missing sensors also provide meaningful guidelines for setting up sensors for an intelligent assistant agent.
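    As an illustration of Bayesian step recognition under partial observability as described above, here is a minimal sketch assuming independent binary sensors with known reliabilities; the sensor names, prior, and reliability values are hypothetical, not taken from the paper.

```python
# Minimal sketch (assumed model, not the paper's implementation):
# posterior probability that a step occurred, given noisy and possibly
# missing binary sensor readings.

def step_posterior(prior: float, readings: dict, reliabilities: dict) -> float:
    """P(step occurred | sensor readings) for independent noisy sensors.

    reliabilities[s] = P(sensor s fires | step occurred)
                     = P(sensor s silent | step did not occur)
    Sensors absent from `readings` (e.g. unavailable due to privacy or
    limited accessibility) are simply skipped.
    """
    p_occ, p_not = prior, 1.0 - prior
    for sensor, fired in readings.items():
        r = reliabilities[sensor]
        p_occ *= r if fired else (1.0 - r)          # likelihood if step occurred
        p_not *= (1.0 - r) if fired else r          # likelihood if it did not
    return p_occ / (p_occ + p_not)                  # normalize (Bayes' rule)

# Example: two of three kitchen sensors report; "stove_on" is missing.
post = step_posterior(
    prior=0.3,
    readings={"cup_moved": True, "kettle_lifted": True},
    reliabilities={"cup_moved": 0.95, "kettle_lifted": 0.9, "stove_on": 0.98},
)
print(f"P(step occurred | readings) = {post:.3f}")  # ~0.987
```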

    A hierarchical human activity recognition framework based on automated reasoning

    Conventional human activity recognition approaches are mainly based on machine learning methods, which do not work well for composite activity recognition because of the complexity and uncertainty of real scenarios. In this paper we propose an automated-reasoning-based hierarchical framework for human activity recognition. The approach constructs a hierarchical structure that represents a composite activity as a composition of lower-level actions and gestures according to its semantic meaning. This hierarchical structure is then transformed into logical formulas and rules, on which resolution-based automated reasoning is applied to recognize the composite activity, given the lower-level actions recognized by machine learning methods.
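    To make the hierarchical idea concrete, here is a minimal sketch in which composite activities are encoded as Horn-style rules over recognized lower-level actions and evaluated by simple forward chaining. The paper itself applies resolution-based automated reasoning; the rule set and action names below are invented for illustration.

```python
# Illustrative sketch (forward chaining stands in for the paper's
# resolution-based reasoning): composite activities as rules over
# lower-level actions and gestures output by ML recognizers.

RULES = {
    # composite activity: set of lower-level facts it requires
    "prepare_coffee": {"grasp_cup", "pour_water", "press_button"},
    "drink_coffee":   {"prepare_coffee", "raise_cup", "sip"},
}

def infer(observed: set) -> set:
    """Forward-chain until no new composite activity can be derived."""
    derived = set(observed)
    changed = True
    while changed:
        changed = False
        for activity, body in RULES.items():
            if activity not in derived and body <= derived:
                derived.add(activity)   # all premises hold: derive the activity
                changed = True
    return derived - observed           # return only the derived composites

# Lower-level actions/gestures as recognized by machine learning methods.
observed = {"grasp_cup", "pour_water", "press_button", "raise_cup", "sip"}
print(infer(observed))  # -> {'prepare_coffee', 'drink_coffee'}
```

    Note how the hierarchy lets "drink_coffee" build on the derived "prepare_coffee" rather than only on raw observations.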