23 research outputs found

    Tasking networked CCTV cameras and mobile phones to identify and localize multiple people

    We present a method to identify and localize people by leveraging existing CCTV camera infrastructure along with the inertial sensors (accelerometer and magnetometer) in each person's mobile phone. Since a person's motion path, as observed by the camera, must match the local motion measurements from their phone, we are able to uniquely identify people by their phones' IDs by detecting the statistical dependence between the phone and camera measurements. For this, we express the problem as consisting of a two-measurement HMM for each person, with one camera measurement and one phone measurement. Then we use a maximum a posteriori formulation to find the most likely ID assignments. Through sensor fusion, our method largely bypasses the motion correspondence problem from computer vision and is able to track people across large spatial or temporal gaps in sensing. We evaluate the system through simulations and experiments in a real camera network testbed.
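
    A minimal sketch of the matching step this abstract describes, assuming a simplified Gaussian likelihood in place of the paper's two-measurement HMM: per-pair log-likelihoods between camera tracks and phone motion streams are computed, and the maximum a posteriori ID assignment is found with the Hungarian algorithm. All names and the synthetic data are illustrative, not the authors' implementation.

        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def pairwise_log_likelihood(cam_motion, phone_motion, sigma=1.0):
            """Log-likelihood that one phone's motion stream explains one camera
            track, using a crude Gaussian model on per-frame motion magnitudes."""
            residual = cam_motion - phone_motion
            return -0.5 * np.sum((residual / sigma) ** 2)

        def map_assignment(camera_tracks, phone_streams):
            """camera_tracks, phone_streams: lists of equal-length 1-D arrays.
            Returns a dict mapping each camera track index to a phone index."""
            log_lik = np.zeros((len(camera_tracks), len(phone_streams)))
            for i, cam in enumerate(camera_tracks):
                for j, phone in enumerate(phone_streams):
                    log_lik[i, j] = pairwise_log_likelihood(cam, phone)
            # The Hungarian algorithm minimizes cost, so negate the log-likelihood.
            rows, cols = linear_sum_assignment(-log_lik)
            return dict(zip(rows.tolist(), cols.tolist()))

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            motion = [rng.normal(size=50) for _ in range(3)]
            cams = [m + 0.1 * rng.normal(size=50) for m in motion]
            phones = [m + 0.1 * rng.normal(size=50) for m in reversed(motion)]
            print(map_assignment(cams, phones))  # should recover {0: 2, 1: 1, 2: 0}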

    TRUSS: Tracking Risk with Ubiquitous Smart Sensing

    We present TRUSS, or Tracking Risk with Ubiquitous Smart Sensing, a novel system that infers and renders safety context on construction sites by fusing data from wearable devices, distributed sensing infrastructure, and video. Wearables stream real-time levels of dangerous gases, dust, noise, light quality, altitude, and motion to base stations that synchronize the mobile devices, monitor the environment, and capture video. At the same time, low-power video collection and processing nodes track the workers as they move through the view of the cameras, identifying the tracks using information from the sensors. These processes together connect the context-mining wearable sensors to the video; information derived from the sensor data is used to highlight salient elements in the video stream. The augmented stream in turn provides users with better understanding of real-time risks, and supports informed decision-making. We tested our system in an initial deployment on an active construction site. (Sponsors: Intel Corporation; Massachusetts Institute of Technology Media Laboratory; Eni S.p.A.)
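
    A hedged sketch of the kind of sensor-to-video fusion TRUSS performs: wearable readings are reduced to a coarse per-worker risk level, which is then attached to that worker's track in the current video frame so the rendered stream can highlight it. The Reading fields, thresholds, and track format below are illustrative assumptions, not the deployed system's schema.

        from dataclasses import dataclass

        @dataclass
        class Reading:
            worker_id: str
            gas_ppm: float      # dangerous-gas concentration
            dust_mg_m3: float   # airborne dust
            noise_db: float     # sound level

        def risk_level(r: Reading) -> str:
            """Map raw readings to a coarse risk label (illustrative thresholds)."""
            if r.gas_ppm > 50 or r.dust_mg_m3 > 10 or r.noise_db > 100:
                return "high"
            if r.gas_ppm > 10 or r.dust_mg_m3 > 3 or r.noise_db > 85:
                return "elevated"
            return "normal"

        def annotate_tracks(tracks: dict, readings: list) -> dict:
            """tracks: worker_id -> bounding box in the current video frame.
            Returns worker_id -> (bounding box, risk label) for rendering."""
            latest = {r.worker_id: r for r in readings}
            return {wid: (box, risk_level(latest[wid]))
                    for wid, box in tracks.items() if wid in latest}

        if __name__ == "__main__":
            tracks = {"w1": (120, 80, 60, 150), "w2": (400, 90, 55, 140)}
            readings = [Reading("w1", 4.0, 1.2, 78.0), Reading("w2", 65.0, 2.0, 92.0)]
            print(annotate_tracks(tracks, readings))  # w2 flagged "high"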

    Automatic Synchronization of Wearable Sensors and Video-Cameras for Ground Truth Annotation -- A Practical Approach


    Design Considerations for Multi-Stakeholder Display Analytics

    Measuring viewer interactions through detailed analytics will be crucial to improving the overall performance of future open display networks. However, in contrast to traditional sign and web analytics systems, such display networks are likely to feature multiple stakeholders, each with the ability to collect a subset of the required analytics information. Combining analytics data from multiple stakeholders could lead to new insights, but stakeholders may have limited willingness to share information due to privacy concerns or commercial sensitivities. In this paper, we provide a comprehensive overview of analytics data that might be captured by different stakeholders in a display network, make the case for the synthesis of analytics data in such display networks, present design considerations for future architectures designed to enable the sharing of display analytics information, and offer an example of how such systems might be implemented.
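
    To make the data-synthesis argument concrete, here is a small illustrative sketch (not an architecture from the paper) in which each stakeholder shares only aggregated per-display counts for a time window and a coordinator joins them, yielding a cross-stakeholder metric no single party could compute alone. The stakeholder names, metrics, and record layout are assumptions.

        from collections import defaultdict

        # Each stakeholder exports: (display_id, hour, metric) -> count
        display_owner = {("lobby-1", 9, "impressions"): 240, ("lobby-1", 9, "touches"): 12}
        app_provider  = {("lobby-1", 9, "app_launches"): 9, ("lobby-1", 9, "dwell_s"): 1430}
        sensor_vendor = {("lobby-1", 9, "passersby"): 510}

        def merge(*sources):
            """Join per-stakeholder aggregates on (display_id, hour)."""
            merged = defaultdict(dict)
            for source in sources:
                for (display, hour, metric), value in source.items():
                    merged[(display, hour)][metric] = value
            return dict(merged)

        combined = merge(display_owner, app_provider, sensor_vendor)
        row = combined[("lobby-1", 9)]
        # A cross-stakeholder metric: passersby tracked by one party, touches by another.
        print(f"touch-through rate: {row['touches'] / row['passersby']:.1%}")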

    Moving Beyond Weak Identifiers for Proxemic Interaction


    Gestures Everywhere: A Multimodal Sensor Fusion and Analysis Framework for Pervasive Displays

    Gestures Everywhere is a dynamic framework for multimodal sensor fusion, pervasive analytics and gesture recognition. Our framework aggregates the real-time data from approximately 100 sensors, including RFID readers, depth cameras and RGB cameras, distributed across 30 interactive displays located in key public areas of the MIT Media Lab. Gestures Everywhere fuses the multimodal sensor data using radial basis function particle filters and performs real-time analysis on the aggregated data. This includes key spatio-temporal properties such as presence, location and identity, as well as higher-level analysis including social clustering and gesture recognition. We describe the algorithms and architecture of our system and discuss the lessons learned from the system's deployment.
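
    As a rough illustration of the fusion approach named above, the following sketch implements a basic particle filter whose observation likelihood is a radial basis function (Gaussian kernel) of the distance between each particle and a sensor's position estimate, so detections from an RFID reader and a depth camera can be fused with different kernel widths. The sensor models, bandwidths, and stationary-target scenario are illustrative assumptions, not the Gestures Everywhere implementation.

        import numpy as np

        rng = np.random.default_rng(1)

        def rbf_weight(particles, observation, bandwidth):
            """Gaussian RBF likelihood of each 2-D particle given one observation."""
            d2 = np.sum((particles - observation) ** 2, axis=1)
            return np.exp(-d2 / (2 * bandwidth ** 2))

        def step(particles, observations, motion_std=0.05):
            """One predict/update/resample cycle over all available observations."""
            particles = particles + rng.normal(0, motion_std, particles.shape)  # predict
            weights = np.ones(len(particles))
            for obs, bandwidth in observations:          # fuse each modality
                weights *= rbf_weight(particles, obs, bandwidth)
            weights /= weights.sum()
            idx = rng.choice(len(particles), len(particles), p=weights)  # resample
            return particles[idx]

        particles = rng.uniform(0, 10, size=(500, 2))    # unknown starting location
        for _ in range(20):
            observations = [(np.array([3.0, 4.0]), 1.5),   # coarse RFID reader fix
                            (np.array([3.2, 4.1]), 0.3)]   # tighter depth-camera fix
            particles = step(particles, observations)
        print("estimated location:", particles.mean(axis=0))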

    Who is where? Matching people in video to wearable acceleration during crowded mingling events

    We address the challenging problem of associating acceleration data from a wearable sensor with the corresponding spatio-temporal region of a person in video during crowded mingling scenarios. This is an important first step for multi-sensor behavior analysis using these two modalities. Clearly, as the number of people in a scene increases, there is also a need to robustly and automatically associate a region of the video with each person's device. We propose a hierarchical association approach which exploits the spatial context of the scene, outperforming the state-of-the-art approaches significantly. Moreover, we present experiments on matching from 3 to more than 130 acceleration and video streams which, to our knowledge, is significantly larger than prior works where only up to 5 device streams are associated.
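
    A hedged sketch of the association step, assuming a flat baseline rather than the hierarchical, spatial-context-aware method the paper actually proposes: each device's acceleration magnitude is correlated against every video track's motion magnitude, and devices are matched to tracks greedily by score. The synthetic streams and function names are illustrative only.

        import numpy as np

        def correlation_score(accel_mag, track_motion):
            """Pearson correlation between one device's acceleration magnitude
            and one video track's motion magnitude (equal-length arrays)."""
            return float(np.corrcoef(accel_mag, track_motion)[0, 1])

        def greedy_associate(accel_streams, track_motions):
            """Return device_index -> track_index, matching best scores first."""
            scores = [(correlation_score(a, t), i, j)
                      for i, a in enumerate(accel_streams)
                      for j, t in enumerate(track_motions)]
            assignment, used_devices, used_tracks = {}, set(), set()
            for score, i, j in sorted(scores, reverse=True):
                if i not in used_devices and j not in used_tracks:
                    assignment[i] = j
                    used_devices.add(i)
                    used_tracks.add(j)
            return assignment

        if __name__ == "__main__":
            rng = np.random.default_rng(2)
            motion = [np.abs(rng.normal(size=100)) for _ in range(4)]
            tracks = [m + 0.2 * rng.normal(size=100) for m in motion]
            devices = [m + 0.2 * rng.normal(size=100) for m in reversed(motion)]
            print(greedy_associate(devices, tracks))  # should recover {0: 3, 1: 2, 2: 1, 3: 0}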

    Recognising Complex Activities with Histograms of Relative Tracklets

    One approach to the recognition of complex human activities is to use feature descriptors that encode visual interactions by describing properties of local visual features with respect to trajectories of tracked objects. We explore an example of such an approach in which dense tracklets are described relative to multiple reference trajectories, providing a rich representation of complex interactions between objects of which only a subset can be tracked. Specifically, we report experiments in which reference trajectories are provided by tracking inertial sensors in a food preparation scenario. Additionally, we provide baseline results for HOG, HOF and MBH, and combine these features with others for multi-modal recognition. The proposed histograms of relative tracklets (RETLETS) showed better activity recognition performance than dense tracklets, HOG, HOF, MBH, or their combination. Our comparative evaluation of features from accelerometers and video highlighted a performance gap between visual and accelerometer-based motion features and showed a substantial performance gain when combining features from these sensor modalities. A considerable further performance gain was observed in combination with RETLETS and reference tracklet features.
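
    A minimal sketch of the descriptor idea, under the assumption that the essential step is quantizing dense tracklet positions relative to a reference trajectory into a histogram; the bin count, spatial extent, and normalization are illustrative choices rather than the paper's exact recipe.

        import numpy as np

        def relative_tracklet_histogram(tracklets, reference, bins=8, extent=100.0):
            """tracklets: list of (T, 2) arrays of image positions.
            reference:  (T, 2) array, the reference trajectory over the same frames.
            Returns a flattened, L1-normalized histogram of relative offsets."""
            offsets = np.concatenate([t - reference for t in tracklets], axis=0)
            hist, _, _ = np.histogram2d(offsets[:, 0], offsets[:, 1],
                                        bins=bins, range=[[-extent, extent]] * 2)
            hist = hist.ravel()
            return hist / max(hist.sum(), 1.0)

        if __name__ == "__main__":
            rng = np.random.default_rng(3)
            T = 15
            reference = np.cumsum(rng.normal(size=(T, 2)), axis=0)   # e.g. a tracked inertial sensor
            tracklets = [reference + rng.normal(20, 5, size=(T, 2)) for _ in range(30)]
            descriptor = relative_tracklet_histogram(tracklets, reference)
            print(descriptor.shape, round(float(descriptor.sum()), 2))  # (64,) 1.0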

    Beyond the lens: communicating context through sensing, video, and visualization

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 101-103).

    Responding to rapid growth in sensor network deployments that outpaces research efforts to understand or relate the new data streams, this thesis presents a collection of interfaces to sensor network data that encourage open-ended browsing while emphasizing saliency of representation. These interfaces interpret, visualize, and communicate context from sensors, through control panels and virtual environments that synthesize multimodal sensor data into interactive visualizations. This work extends previous efforts in cross-reality to incorporate augmented video as well as complex interactive animations, making use of sensor fusion to saliently represent contextual information to users in a variety of application domains, from building information management to real-time risk assessment to personal privacy. Three applications were developed as part of this work and are discussed here: DoppelLab, an immersive, cross-reality browsing environment for sensor network data; Flurry, an installation that composites video from multiple sources throughout a building in real time, to create an interactive and incorporative view of activity; and Tracking Risk with Ubiquitous Smart Sensing (TRUSS), an ongoing research effort aimed at applying real-time sensing, sensor fusion, and interactive visual analytic interfaces to construction site safety and decision support. Another project in active development, called the Disappearing Act, allows users to remove themselves from a set of live video streams using wearable sensor tags. Though these examples may seem disconnected, they share underlying technologies and research developments, as well as a common set of design principles, which are elucidated in this thesis. Building on developments in sensor networks, computer vision, and graphics, this work aims to create interfaces and visualizations that fuse perspectives, broaden contextual understanding, and encourage exploration of real-time sensor network data.

    by Gershon Dublon. S.M.