1,917 research outputs found
SALSA: A Novel Dataset for Multimodal Group Behavior Analysis
Studying free-standing conversational groups (FCGs) in unstructured social
settings (e.g., cocktail party ) is gratifying due to the wealth of information
available at the group (mining social networks) and individual (recognizing
native behavioral and personality traits) levels. However, analyzing social
scenes involving FCGs is also highly challenging due to the difficulty in
extracting behavioral cues such as target locations, their speaking activity
and head/body pose due to crowdedness and presence of extreme occlusions. To
this end, we propose SALSA, a novel dataset facilitating multimodal and
Synergetic sociAL Scene Analysis, and make two main contributions to research
on automated social interaction analysis: (1) SALSA records social interactions
among 18 participants in a natural, indoor environment for over 60 minutes,
under the poster presentation and cocktail party contexts presenting
difficulties in the form of low-resolution images, lighting variations,
numerous occlusions, reverberations and interfering sound sources; (2) To
alleviate these problems we facilitate multimodal analysis by recording the
social interplay using four static surveillance cameras and sociometric badges
worn by each participant, comprising the microphone, accelerometer, bluetooth
and infrared sensors. In addition to raw data, we also provide annotations
concerning individuals' personality as well as their position, head, body
orientation and F-formation information over the entire event duration. Through
extensive experiments with state-of-the-art approaches, we show (a) the
limitations of current methods and (b) how the recorded multiple cues
synergetically aid automatic analysis of social interactions. SALSA is
available at http://tev.fbk.eu/salsa.Comment: 14 pages, 11 figure
Dynamic Occupancy Grid Prediction for Urban Autonomous Driving: A Deep Learning Approach with Fully Automatic Labeling
Long-term situation prediction plays a crucial role in the development of
intelligent vehicles. A major challenge still to overcome is the prediction of
complex downtown scenarios with multiple road users, e.g., pedestrians, bikes,
and motor vehicles, interacting with each other. This contribution tackles this
challenge by combining a Bayesian filtering technique for environment
representation, and machine learning as long-term predictor. More specifically,
a dynamic occupancy grid map is utilized as input to a deep convolutional
neural network. This yields the advantage of using spatially distributed
velocity estimates from a single time step for prediction, rather than a raw
data sequence, alleviating common problems dealing with input time series of
multiple sensors. Furthermore, convolutional neural networks have the inherent
characteristic of using context information, enabling the implicit modeling of
road user interaction. Pixel-wise balancing is applied in the loss function
counteracting the extreme imbalance between static and dynamic cells. One of
the major advantages is the unsupervised learning character due to fully
automatic label generation. The presented algorithm is trained and evaluated on
multiple hours of recorded sensor data and compared to Monte-Carlo simulation
Device Free Localisation Techniques in Indoor Environments
The location estimation of a target for a long period was performed only by device based localisation technique which is difficult in applications where target especially human is non-cooperative. A target was detected by equipping a device using global positioning systems, radio frequency systems, ultrasonic frequency systems, etc. Device free localisation (DFL) is an upcoming technology in automated localisation in which target need not equip any device for identifying its position by the user. For achieving this objective, the wireless sensor network is a better choice due to its growing popularity. This paper describes the possible categorisation of recently developed DFL techniques using wireless sensor network. The scope of each category of techniques is analysed by comparing their potential benefits and drawbacks. Finally, future scope and research directions in this field are also summarised
Biosignalâbased humanâmachine interfaces for assistance and rehabilitation : a survey
As a definition, HumanâMachine Interface (HMI) enables a person to interact with a device. Starting from elementary equipment, the recent development of novel techniques and unobtrusive devices for biosignals monitoring paved the way for a new class of HMIs, which take such biosignals as inputs to control various applications. The current survey aims to review the large literature of the last two decades regarding biosignalâbased HMIs for assistance and rehabilitation to outline stateâofâtheâart and identify emerging technologies and potential future research trends. PubMed and other databases were surveyed by using specific keywords. The found studies were further screened in three levels (title, abstract, fullâtext), and eventually, 144 journal papers and 37 conference papers were included. Four macrocategories were considered to classify the different biosignals used for HMI control: biopotential, muscle mechanical motion, body motion, and their combinations (hybrid systems). The HMIs were also classified according to their target application by considering six categories: prosthetic control, robotic control, virtual reality control, gesture recognition, communication, and smart environment control. An everâgrowing number of publications has been observed over the last years. Most of the studies (about 67%) pertain to the assistive field, while 20% relate to rehabilitation and 13% to assistance and rehabilitation. A moderate increase can be observed in studies focusing on robotic control, prosthetic control, and gesture recognition in the last decade. In contrast, studies on the other targets experienced only a small increase. Biopotentials are no longer the leading control signals, and the use of muscle mechanical motion signals has experienced a considerable rise, especially in prosthetic control. Hybrid technologies are promising, as they could lead to higher performances. However, they also increase HMIsâ complex-ity, so their usefulness should be carefully evaluated for the specific application
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low-latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world
Measuring human-induced vibrations of civil engineering structures via vision-based motion tracking
We present a novel framework for measuring the body motion of multiple individuals in a
group or crowd via a vision-based tracking algorithm, thus to enable studies of humaninduced
vibrations of civil engineering structures, such as floors and grandstands. To overcome
the difficulties typically observed in this scenario, such as illumination change and
object deformation, an online ensemble learning algorithm, which is adaptive to the
non-stationary environment, is adopted. Incorporated with an easily carried and installed
hardware, the system can capture the characteristics of displacements or accelerations for
multiple individuals in a group of various sizes and in a real-world setting. To demonstrate
the efficacy of the proposed system, measured displacements and calculated accelerations
are compared to the simultaneous measurements obtained by two widely used motion
tracking systems. Extensive experiments illustrate that the proposed system achieves
equivalent performance as popular wireless inertial sensors and a marker-based optical
system, but without limitations commonly associated with such traditional systems. The
comparable experiments can also be used to guide the application of our proposed syste
- âŠ