577 research outputs found

    Fusion of non-visual and visual sensors for human tracking

    Get PDF
    Human tracking is an extensively researched yet still challenging area of computer vision, with a wide range of applications such as surveillance and healthcare. In challenging cases such as long-term occlusion, people may not be tracked successfully using visual information alone. We therefore propose to combine information from other sensors with surveillance cameras to persistently localize and track humans, an approach made increasingly practical by the pervasiveness of mobile devices such as cellphones, smart watches, and smart glasses embedded with sensors including accelerometers, gyroscopes, magnetometers, GPS, and WiFi modules. In this thesis, we first investigate the application of Inertial Measurement Units (IMUs) from mobile devices to human activity recognition and human tracking. We then develop novel persistent human tracking and indoor localization algorithms based on the fusion of non-visual and visual sensors, which not only overcome the occlusion challenge in visual tracking but also alleviate the calibration and drift problems in IMU tracking --Abstract, page iii
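    The abstract does not describe the fusion algorithm itself. As a rough illustration of the general idea only, the sketch below fuses a camera-based position estimate with an IMU-derived one by inverse-variance weighting and falls back to the IMU when the camera loses the target; all function names and noise values are hypothetical and not taken from the thesis.

```python
import numpy as np

def fuse_position(visual_pos, visual_var, imu_pos, imu_var):
    """Fuse a camera-based and an IMU-based position estimate.

    Inverse-variance weighting: the less noisy source dominates.
    Hypothetical illustration; not the thesis's actual algorithm.
    """
    if visual_pos is None:          # e.g. the target is occluded in the video
        return imu_pos, imu_var
    w = imu_var / (imu_var + visual_var)
    fused = w * visual_pos + (1.0 - w) * imu_pos
    fused_var = (visual_var * imu_var) / (visual_var + imu_var)
    return fused, fused_var

# Example: camera sees the target, then loses it; the IMU keeps tracking.
print(fuse_position(np.array([2.0, 1.5]), 0.05, np.array([2.2, 1.4]), 0.30))
print(fuse_position(None, 0.05, np.array([2.4, 1.3]), 0.30))
```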

    Visual Human Tracking and Group Activity Analysis: A Video Mining System for Retail Marketing

    Get PDF
    Thesis (PhD) - Indiana University, Computer Sciences, 2007. In this thesis we present a system for automatic human tracking and activity recognition from video sequences. The problem of automated analysis of visual information in order to derive descriptors of high-level human activities has intrigued the computer vision community for decades and is considered to be largely unsolved. Part of this interest derives from the vast range of applications in which such a solution may be useful. We attempt to find efficient formulations of these tasks as applied to extracting customer behavior information in a retail marketing context. Based on these formulations, we present a system that visually tracks customers in a retail store and performs a number of activity analysis tasks based on the output from the tracker. In tracking, we introduce new techniques for pedestrian detection, initialization of the body model, and a formulation of temporal tracking as a global trans-dimensional optimization problem. Initial human detection is addressed by a novel method for head detection, which incorporates knowledge of the camera projection model. The initialization of the human body model is addressed by newly developed shape and appearance descriptors. Temporal tracking of customer trajectories is performed by a human body tracking system designed as a Bayesian jump-diffusion filter. This approach demonstrates the ability to overcome model dimensionality ambiguities as people leave and enter the scene. Following the tracking, we developed a two-stage group activity formulation based on ideas from swarming research. For modeling purposes, all moving actors in the scene are viewed as simplistic agents in the swarm. This allows us to effectively define a set of inter-agent interactions, which combine into a distance metric used for swarm clustering. In the first stage, shoppers that belong to the same group are identified by deterministically clustering bodies to detect short-term events; in the second stage, these events are post-processed to form clusters of group activities with fuzzy memberships. Quantitative analysis of the tracking subsystem shows an improvement over state-of-the-art methods used under similar conditions. Finally, based on the output from the tracker, the activity recognition procedure achieves over 80% correct shopper group detection, as validated against human-generated ground truth.
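    As a rough, hypothetical illustration of the swarm-style grouping stage described above (not the thesis's actual metric or clustering procedure), the sketch below combines spatial proximity and velocity similarity into an inter-agent distance and groups shoppers whose pairwise distance falls below a threshold; all weights and thresholds are made up.

```python
import numpy as np

def interaction_distance(p1, v1, p2, v2, w_pos=1.0, w_vel=2.0):
    """Combine spatial proximity and velocity alignment into one metric.

    Hypothetical stand-in for the swarm-style inter-agent interactions
    described in the abstract.
    """
    return w_pos * np.linalg.norm(p1 - p2) + w_vel * np.linalg.norm(v1 - v2)

def group_shoppers(positions, velocities, threshold=2.0):
    """Greedy single-link grouping: agents closer than `threshold` share a group."""
    n = len(positions)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            d = interaction_distance(positions[i], velocities[i],
                                     positions[j], velocities[j])
            if d < threshold:
                parent[find(i)] = find(j)
    return [find(i) for i in range(n)]

pos = np.array([[0.0, 0.0], [0.5, 0.2], [5.0, 5.0]])
vel = np.array([[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]])
print(group_shoppers(pos, vel))   # first two shoppers end up in the same group
```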

    Spatiotemporal Learning of Multivehicle Interaction Patterns in Lane-Change Scenarios

    Full text link
    Interpretation of common yet challenging interaction scenarios can support well-founded decisions for autonomous vehicles. Previous research achieved this using prior knowledge of specific scenarios and predefined models, which limits adaptive capability. This paper describes a Bayesian nonparametric approach that leverages continuous (i.e., Gaussian processes) and discrete (i.e., Dirichlet processes) stochastic processes to reveal the underlying interaction patterns of the ego vehicle with other nearby vehicles. Our model relaxes the dependency on the number of surrounding vehicles by developing an acceleration-sensitive velocity field based on Gaussian processes. The experimental results demonstrate that the velocity field can represent the spatial interactions between the ego vehicle and its surroundings. A discrete Bayesian nonparametric model, integrating Dirichlet processes and hidden Markov models, is then developed to learn the interaction patterns over the temporal space by automatically segmenting and clustering the sequential interaction data into interpretable granular patterns. We evaluate our approach in highway lane-change scenarios using the highD dataset collected from real-world settings. Results demonstrate that our proposed Bayesian nonparametric approach provides insight into the complicated lane-change interactions of the ego vehicle with multiple surrounding traffic participants, based on the interpretable interaction patterns and their transition properties in temporal relationships. The proposed approach also sheds light on efficiently analyzing other kinds of multi-agent interactions, such as vehicle-pedestrian interactions. View the demos via https://youtu.be/z_vf9UHtdAM. Comment: for the supplements, see https://chengyuan-zhang.github.io/Multivehicle-Interaction
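    A minimal sketch of the velocity-field idea, under assumptions: it fits one Gaussian process per velocity component with scikit-learn on synthetic relative-position and velocity data and then queries the learned field on a grid around the ego vehicle. This is not the paper's acceleration-sensitive formulation; all data, kernels, and hyperparameters here are invented for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic observations: positions (x, y) of surrounding vehicles relative
# to the ego vehicle, and their observed longitudinal/lateral velocities.
rng = np.random.default_rng(0)
positions = rng.uniform(-20, 20, size=(40, 2))
velocities = np.column_stack([
    25.0 - 0.3 * positions[:, 1] + rng.normal(0, 0.5, 40),  # vx varies with lateral offset
    0.1 * positions[:, 0] + rng.normal(0, 0.2, 40),          # small lateral drift
])

# One GP per velocity component; kernel hyperparameters are fit by maximum likelihood.
kernel = RBF(length_scale=10.0) + WhiteKernel(noise_level=0.25)
gps = [GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(positions, velocities[:, d])
       for d in range(2)]

# Query the learned velocity field on a coarse grid around the ego vehicle.
grid = np.array([[x, y] for x in (-10.0, 0.0, 10.0) for y in (-4.0, 0.0, 4.0)])
field = np.column_stack([gp.predict(grid) for gp in gps])
print(np.round(field, 2))
```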

    Crowd simulation and visualization

    Get PDF
    Large-scale simulation and visualization are essential topics in areas as different as sociology, physics, urbanism, training, and entertainment, among others. This kind of system requires vast computational power and memory resources, commonly available on High Performance Computing (HPC) platforms. Currently, the most powerful clusters have heterogeneous architectures with hundreds of thousands and even millions of cores, and industry trends suggest that exascale clusters will have billions. The technical challenges of the simulation and visualization process in the exascale era are intertwined with difficulties in other areas of research, including storage, communication, programming models, and hardware. For this reason, it is necessary to prototype, test, and deploy a variety of approaches to address the technical challenges identified and to evaluate the advantages and disadvantages of each proposed solution. The focus of this research is interactive large-scale crowd simulation and visualization that exploits the capacity of the current HPC infrastructure to the fullest and is prepared to take advantage of the next generation. The project develops a new approach to scale crowd simulation and visualization on heterogeneous computing clusters using a task-based technique. Its main characteristic is that it is hardware agnostic: it abstracts away the difficulties implied by the use of heterogeneous architectures, such as memory management, scheduling, communications, and synchronization, facilitating development, maintenance, and scalability. With the goal of flexibility and of making the best possible use of computing resources, the project explores different configurations for connecting the simulation with the visualization engine. This kind of system has an essential use in emergencies; therefore, urban scenes were implemented as realistically as possible so that users will be ready to face real events. Path planning for large-scale crowds is a challenging problem due to the inherent dynamism of the scenes and the vast search space. A new path-finding algorithm was developed. Its hierarchical approach offers several advantages: it divides the search space, reducing the problem complexity; it can return a partial path instead of waiting for the complete one, which allows a character to start moving while the rest is computed asynchronously; and it can reprocess only a part of the path if necessary, at different levels of abstraction. A case study is presented for a crowd simulation in urban scenarios. Geolocated data produced by mobile devices are used to predict individual and crowd behavior and to detect abnormal situations in the presence of specific events. The challenge of combining all these individual locations with a 3D rendering of the urban environment is also addressed. The data processing and simulation are computationally expensive and time-critical, so the system relies on a hybrid Cloud-HPC architecture to produce an efficient solution. Within the project, new behavior models based on data analytics were developed, along with the infrastructure to query various data sources such as social networks, government agencies, or transport companies such as Uber. Ever more geolocation data and better computing resources are becoming available, allowing deeper analysis; this lays the foundations for improving current crowd simulation models. The use of simulations and their visualization allows crowds to be observed and organized in real time, and analysis before, during, and after daily mass events can reduce the risks and the associated logistics costs. Postprint (published version)
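    The hierarchical path-finding idea described above (divide the search space, return a partial path so a character can start moving, refine the rest later) can be sketched roughly as follows. This is a generic coarse-to-fine A* illustration under assumed grid sizes and block factors, not the project's actual algorithm.

```python
import heapq
import numpy as np

def astar(grid, start, goal):
    """4-connected A* on a boolean occupancy grid (True = blocked)."""
    h = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
    open_set = [(h(start, goal), start)]
    came_from, g = {}, {start: 0}
    while open_set:
        _, cur = heapq.heappop(open_set)
        if cur == goal:
            path = [cur]
            while cur in came_from:
                cur = came_from[cur]
                path.append(cur)
            return path[::-1]
        for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + d[0], cur[1] + d[1])
            if not (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]):
                continue
            if grid[nxt]:
                continue
            ng = g[cur] + 1
            if ng < g.get(nxt, np.inf):
                g[nxt], came_from[nxt] = ng, cur
                heapq.heappush(open_set, (ng + h(nxt, goal), nxt))
    return None

def hierarchical_path(grid, start, goal, block=4):
    """Plan on a coarse grid first, then refine only the first coarse leg.

    The character can start walking along the refined leg while the remaining
    coarse waypoints are refined later (here they are simply returned as-is).
    """
    coarse = grid.reshape(grid.shape[0] // block, block,
                          grid.shape[1] // block, block).any(axis=(1, 3))
    c_start = (start[0] // block, start[1] // block)
    c_goal = (goal[0] // block, goal[1] // block)
    coarse_plan = astar(coarse, c_start, c_goal)
    if coarse_plan is None:
        return None, None
    first_cell = coarse_plan[min(1, len(coarse_plan) - 1)]
    leg_goal = (first_cell[0] * block + block // 2, first_cell[1] * block + block // 2)
    return astar(grid, start, leg_goal), coarse_plan[1:]

occupancy = np.zeros((16, 16), dtype=bool)
occupancy[4:12, 8] = True                      # a wall with a gap around it
leg, remaining = hierarchical_path(occupancy, (2, 2), (14, 14))
print(leg)        # fine-grained path for the first leg
print(remaining)  # coarse waypoints still to be refined asynchronously
```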

    Survey on video anomaly detection in dynamic scenes with moving cameras

    Full text link
    The increasing popularity of compact and inexpensive cameras, e.g., dash cameras, body cameras, and cameras equipped on robots, has sparked a growing interest in detecting anomalies within dynamic scenes recorded by moving cameras. However, existing reviews primarily concentrate on Video Anomaly Detection (VAD) methods that assume static cameras. The VAD literature with moving cameras remains fragmented, lacking comprehensive reviews to date. To address this gap, we present the first comprehensive survey on Moving Camera Video Anomaly Detection (MC-VAD). We delve into the research papers related to MC-VAD, critically assessing their limitations and highlighting the associated challenges. Our exploration encompasses three application domains: security, urban transportation, and marine environments, which in turn cover six specific tasks. We compile an extensive list of 25 publicly available datasets spanning four distinct environments: underwater, water surface, ground, and aerial. We summarize the types of anomalies these datasets correspond to or contain, and present five main categories of approaches for detecting such anomalies. Lastly, we identify future research directions and discuss novel contributions that could advance the field of MC-VAD. With this survey, we aim to offer a valuable reference for researchers and practitioners striving to develop and advance state-of-the-art MC-VAD methods. Comment: Under review

    Evaluating indoor positioning systems in a shopping mall : the lessons learned from the IPIN 2018 competition

    Get PDF
    The Indoor Positioning and Indoor Navigation (IPIN) conference holds an annual competition in which indoor localization systems from different research groups worldwide are evaluated empirically. The objective of this competition is to establish a systematic evaluation methodology with rigorous metrics, both for real-time (on-site) and post-processing (off-site) situations, in a realistic environment unfamiliar to the prototype developers. For the IPIN 2018 conference, this competition was held on September 22nd, 2018, in Atlantis, a large shopping mall in Nantes (France). Four competition tracks (two on-site and two off-site) were designed. They consisted of several 1 km routes traversing several floors of the mall. Along these paths, 180 points were topographically surveyed with 10 cm accuracy to serve as ground-truth landmarks, combining theodolite measurements, differential global navigation satellite system (GNSS) measurements, and 3D scanner systems. In total, 34 teams competed. The accuracy score corresponds to the third quartile (75th percentile) of an error metric that combines the horizontal positioning error and the floor detection. The best results for the on-site tracks showed an accuracy score of 11.70 m (Track 1) and 5.50 m (Track 2), while the best results for the off-site tracks showed an accuracy score of 0.90 m (Track 3) and 1.30 m (Track 4). These results show that it is possible to obtain high-accuracy indoor positioning solutions in large, realistic environments using lightweight wearable sensors without deploying any beacons. This paper describes the organization of the tracks, analyzes the methodology used to quantify the results, reviews the lessons learned from the competition, and discusses its future.
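    As a rough illustration of the scoring described above, the sketch below computes the third quartile of a combined per-landmark error. The 15 m penalty per missed floor follows the common EvAAL convention and is an assumption here; the excerpt above does not state the exact penalty used.

```python
import numpy as np

def accuracy_score(horizontal_errors_m, floor_errors, floor_penalty_m=15.0):
    """Third quartile (75th percentile) of a combined error metric.

    Each point's error adds a fixed penalty per missed floor to the horizontal
    positioning error. The 15 m penalty is an assumed value (EvAAL convention),
    not a figure taken from this paper.
    """
    combined = np.asarray(horizontal_errors_m) + floor_penalty_m * np.abs(floor_errors)
    return np.percentile(combined, 75)

# Example: four ground-truth landmarks, one of them with a missed floor.
print(accuracy_score([1.2, 0.8, 2.5, 1.0], [0, 0, 1, 0]))
```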

    The Prevalence of Correspondence Computations in Visual Processing

    Get PDF
    Correspondence computation is the general process during which received perceptual signals are assigned to the same or different sources. It is a pervasive process involved in many visual cognitive tasks. People need to make this computation both for simultaneously received signals and for temporally separated signals. Due to the noisy nature of visual input signals, it is an error-prone process and therefore the cause of many limits in human performance. However, its role in human cognitive abilities has often been ignored or underestimated. Moreover, how correspondence computations are carried out, and how they relate to other computations in the visual system, have not been explored in depth. In this dissertation, I combined evidence from human behavioral experiments, computational modeling, and testing of a brain-damaged patient to investigate these questions. In Chapter 2, I showed that human participants could not accurately report the number of presented objects in a typical visual working memory task. I then showed that a clustering algorithm with noise in the correspondence process could simulate human performance very well. Therefore, part of the limits on human memory ability can be explained by imperfect correspondence computations. In Chapter 3, I explored how different correspondence computation algorithms are combined to solve the problem of motion direction judgment. I found that a lower-level luminance transient detection system and a higher-level position comparison system complement each other, with the relative contribution of each system depending on its signal strength. In Chapter 4, I showed that the limit on object tracking ability is a result of noisy visual inputs, a suboptimal eye-movement strategy, and probabilistic correspondence computations. No external resource-like limits are needed to understand humans' limited capacity for tracking multiple objects at the same time. In sum, these results suggest that correspondence computations play important roles in visual cognition. The process is pervasive, and a failure in it can lead to further failures in related tasks. Human visual cognition should be understood in terms of the computations involved and the possible errors that can arise during them.
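    A toy sketch of the Chapter 2 idea, not the dissertation's actual model: perceptual noise is added to item locations and nearby noisy samples are merged by a naive correspondence step, so the reported set size can fall below the true number of items. All parameter values are hypothetical.

```python
import numpy as np

def reported_set_size(positions, noise_sd=0.5, merge_radius=1.0, rng=None):
    """Noisy samples of item locations are merged when they fall within
    `merge_radius`, so nearby items can be counted as one.

    Hypothetical illustration of how correspondence noise can make the
    reported number of objects smaller than the number actually shown.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = positions + rng.normal(0, noise_sd, positions.shape)
    remaining = list(noisy)
    clusters = []
    while remaining:
        seed = remaining.pop()
        members = [p for p in remaining if np.linalg.norm(p - seed) < merge_radius]
        remaining = [p for p in remaining if np.linalg.norm(p - seed) >= merge_radius]
        clusters.append([seed] + members)
    return len(clusters)

rng = np.random.default_rng(1)
display = np.array([[0.0, 0.0], [0.6, 0.1], [5.0, 5.0], [9.0, 1.0]])  # 4 items, two close together
print([reported_set_size(display, rng=rng) for _ in range(5)])  # counts vary; close items sometimes merge
```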

    Understanding Complex Human Behaviour in Images and Videos.

    Full text link
    Understanding human motions and activities in images and videos is an important problem in many application domains, including surveillance, robotics, video indexing, and sports analysis. Although much progress has been made in classifying a single person's activities in simple videos, little effort has been made toward the interpretation of the behaviors of multiple people in natural videos. In this thesis, I present my research toward understanding the behaviors of multiple people in natural images and videos. I identify four major challenges in this problem: i) identifying the individual properties of people in videos, ii) modeling and recognizing the behavior of multiple people, iii) understanding human activities at multiple levels of resolution, and iv) learning characteristic patterns of interaction between people, or between people and the surrounding environment. I discuss how we solve these challenging problems using various computer vision and machine learning technologies. I conclude with final remarks, observations, and possible future research directions. PhD. Electrical Engineering: Systems. University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/99956/1/wgchoi_1.pd

    High Resolution Near-Infrared Imaging Observations of the Galactic Centre

    Get PDF
    The aim of this doctoral thesis was to gain new insights into the structure, composition, and dynamics of the central star cluster of our Milky Way. Our analyses focused above all on the nature of the concentration of several million solar masses of dark matter at the centre of this cluster, which is presumably a supermassive black hole. For decades it has been suspected that the compact, non-thermal radio source Sagittarius A* (Sgr A*), discovered in 1974, is associated with such an object. This work is largely based on observations of the Galactic Centre with the new near-infrared camera CONICA and its adaptive optics system, NAOS, at the Very Large Telescope of the European Southern Observatory. This combined system was commissioned in late 2001/early 2002 and offers ideal conditions for deep, high-resolution near-infrared observations of the Galactic Centre. A fundamental problem that had to be solved was the astrometry of the images of the star field in the Galactic Centre. An accurate astrometric system is an essential prerequisite for identifying Sgr A* in infrared images and for measuring the relative positions and motions of the stars in its vicinity. Using SiO maser stars, whose positions can be determined by radio interferometry to better than 1 milliarcsecond, we were able to determine the position of the non-thermal radio source Sagittarius A* (Sgr A*), associated with the suspected black hole, relative to the stars in its vicinity with an accuracy of < 10 mas. Through star counts in deep, high-resolution images we were able to show that the stellar density increases toward Sgr A* following a power law, i.e., that the cluster exhibits a so-called cusp within a radius of about 100 or 40 mpc around the suspected black hole. At a distance of < 4 mpc from Sgr A*, the mass density of the cluster rises to more than 10^8 M_⊙. The stellar population in the cusp shows a deficit of giant stars and of horizontal-branch stars relative to the surrounding cluster; stellar collisions and/or mass segregation could be responsible for this.