1,917 research outputs found

    SALSA: A Novel Dataset for Multimodal Group Behavior Analysis

    Get PDF
    Studying free-standing conversational groups (FCGs) in unstructured social settings (e.g., cocktail party ) is gratifying due to the wealth of information available at the group (mining social networks) and individual (recognizing native behavioral and personality traits) levels. However, analyzing social scenes involving FCGs is also highly challenging due to the difficulty in extracting behavioral cues such as target locations, their speaking activity and head/body pose due to crowdedness and presence of extreme occlusions. To this end, we propose SALSA, a novel dataset facilitating multimodal and Synergetic sociAL Scene Analysis, and make two main contributions to research on automated social interaction analysis: (1) SALSA records social interactions among 18 participants in a natural, indoor environment for over 60 minutes, under the poster presentation and cocktail party contexts presenting difficulties in the form of low-resolution images, lighting variations, numerous occlusions, reverberations and interfering sound sources; (2) To alleviate these problems we facilitate multimodal analysis by recording the social interplay using four static surveillance cameras and sociometric badges worn by each participant, comprising the microphone, accelerometer, bluetooth and infrared sensors. In addition to raw data, we also provide annotations concerning individuals' personality as well as their position, head, body orientation and F-formation information over the entire event duration. Through extensive experiments with state-of-the-art approaches, we show (a) the limitations of current methods and (b) how the recorded multiple cues synergetically aid automatic analysis of social interactions. SALSA is available at http://tev.fbk.eu/salsa.Comment: 14 pages, 11 figure

    Dynamic Occupancy Grid Prediction for Urban Autonomous Driving: A Deep Learning Approach with Fully Automatic Labeling

    Full text link
    Long-term situation prediction plays a crucial role in the development of intelligent vehicles. A major challenge still to overcome is the prediction of complex downtown scenarios with multiple road users, e.g., pedestrians, bikes, and motor vehicles, interacting with each other. This contribution tackles this challenge by combining a Bayesian filtering technique for environment representation, and machine learning as long-term predictor. More specifically, a dynamic occupancy grid map is utilized as input to a deep convolutional neural network. This yields the advantage of using spatially distributed velocity estimates from a single time step for prediction, rather than a raw data sequence, alleviating common problems dealing with input time series of multiple sensors. Furthermore, convolutional neural networks have the inherent characteristic of using context information, enabling the implicit modeling of road user interaction. Pixel-wise balancing is applied in the loss function counteracting the extreme imbalance between static and dynamic cells. One of the major advantages is the unsupervised learning character due to fully automatic label generation. The presented algorithm is trained and evaluated on multiple hours of recorded sensor data and compared to Monte-Carlo simulation

    Device Free Localisation Techniques in Indoor Environments

    Get PDF
    The location estimation of a target for a long period was performed only by device based localisation technique which is difficult in applications where target especially human is non-cooperative. A target was detected by equipping a device using global positioning systems, radio frequency systems, ultrasonic frequency systems, etc. Device free localisation (DFL) is an upcoming technology in automated localisation in which target need not equip any device for identifying its position by the user. For achieving this objective, the wireless sensor network is a better choice due to its growing popularity. This paper describes the possible categorisation of recently developed DFL techniques using wireless sensor network. The scope of each category of techniques is analysed by comparing their potential benefits and drawbacks. Finally, future scope and research directions in this field are also summarised

    Biosignal‐based human–machine interfaces for assistance and rehabilitation : a survey

    Get PDF
    As a definition, Human–Machine Interface (HMI) enables a person to interact with a device. Starting from elementary equipment, the recent development of novel techniques and unobtrusive devices for biosignals monitoring paved the way for a new class of HMIs, which take such biosignals as inputs to control various applications. The current survey aims to review the large literature of the last two decades regarding biosignal‐based HMIs for assistance and rehabilitation to outline state‐of‐the‐art and identify emerging technologies and potential future research trends. PubMed and other databases were surveyed by using specific keywords. The found studies were further screened in three levels (title, abstract, full‐text), and eventually, 144 journal papers and 37 conference papers were included. Four macrocategories were considered to classify the different biosignals used for HMI control: biopotential, muscle mechanical motion, body motion, and their combinations (hybrid systems). The HMIs were also classified according to their target application by considering six categories: prosthetic control, robotic control, virtual reality control, gesture recognition, communication, and smart environment control. An ever‐growing number of publications has been observed over the last years. Most of the studies (about 67%) pertain to the assistive field, while 20% relate to rehabilitation and 13% to assistance and rehabilitation. A moderate increase can be observed in studies focusing on robotic control, prosthetic control, and gesture recognition in the last decade. In contrast, studies on the other targets experienced only a small increase. Biopotentials are no longer the leading control signals, and the use of muscle mechanical motion signals has experienced a considerable rise, especially in prosthetic control. Hybrid technologies are promising, as they could lead to higher performances. However, they also increase HMIs’ complex-ity, so their usefulness should be carefully evaluated for the specific application

    Event-based Vision: A Survey

    Get PDF
    Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world

    Measuring human-induced vibrations of civil engineering structures via vision-based motion tracking

    Get PDF
    We present a novel framework for measuring the body motion of multiple individuals in a group or crowd via a vision-based tracking algorithm, thus to enable studies of humaninduced vibrations of civil engineering structures, such as floors and grandstands. To overcome the difficulties typically observed in this scenario, such as illumination change and object deformation, an online ensemble learning algorithm, which is adaptive to the non-stationary environment, is adopted. Incorporated with an easily carried and installed hardware, the system can capture the characteristics of displacements or accelerations for multiple individuals in a group of various sizes and in a real-world setting. To demonstrate the efficacy of the proposed system, measured displacements and calculated accelerations are compared to the simultaneous measurements obtained by two widely used motion tracking systems. Extensive experiments illustrate that the proposed system achieves equivalent performance as popular wireless inertial sensors and a marker-based optical system, but without limitations commonly associated with such traditional systems. The comparable experiments can also be used to guide the application of our proposed syste
    • 

    corecore