3D Refuse-derived Fuel Particle Tracking-by-Detection Using a Plenoptic Camera System
Multiple particle tracking-by-detection is a widely investigated problem in image processing. This paper presents approaches to detecting and tracking various refuse-derived fuel particles in an industrial environment using a plenoptic camera system, which yields 2D gray value information and 3D point clouds subject to noticeable fluctuations. The presented approaches, comprising an innovative combined detection method and a post-processing framework for multiple particle tracking, aim to make the most of the acquired 2D and 3D information in order to cope with the fluctuations of the measuring system. The proposed detection method fuses the captured 2D gray value information and 3D point clouds, and outperforms using either type of information alone. Subsequently, the particles are tracked using a linear Kalman filter combined with 2.5D global nearest neighbor (GNN) and joint probabilistic data association (JPDA) approaches, respectively. Because the measuring system occasionally produces inaccurate detections, the initial tracking results contain faulty and incomplete tracklets that necessitate post-processing. The developed post-processing approach, based solely on particle motion similarity, yields precise tracking performance by eliminating faulty tracklets, deleting outliers, connecting tracklets, and fusing trajectories. The proposed approaches are quantitatively assessed on manually labeled ground truth datasets to demonstrate their validity and adequacy. The presented combined detection method provides the highest F1-score, and the proposed post-processing framework significantly enhances the tracking performance with regard to several recommended evaluation indices.
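For intuition, here is a minimal sketch of the tracking stage described above: one constant-velocity linear Kalman filter per particle, with a greedy nearest-neighbour approximation of the GNN association step. All parameter values (frame interval, gating threshold, noise covariances) are illustrative assumptions, not the authors' settings.

```python
import numpy as np

DT = 1.0      # frame interval (illustrative)
GATE = 5.0    # association gating distance; an assumed value, not the paper's

class Track:
    """Constant-velocity linear Kalman filter over a 3D particle position."""
    def __init__(self, pos):
        self.x = np.hstack([pos, np.zeros(3)])              # state [x y z vx vy vz]
        self.P = np.eye(6) * 10.0                           # state covariance
        self.F = np.eye(6); self.F[:3, 3:] = np.eye(3) * DT # constant-velocity model
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])   # we measure position only
        self.Q = np.eye(6) * 0.01                           # process noise (assumed)
        self.R = np.eye(3)                                  # measurement noise (assumed)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]

    def update(self, z):
        y = z - self.H @ self.x                             # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)            # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P

def associate(tracks, detections):
    """Greedy nearest-neighbour assignment with distance gating."""
    detections = np.asarray(detections, dtype=float)
    preds = np.array([t.predict() for t in tracks])
    pairs = []
    if len(preds) and len(detections):
        d = np.linalg.norm(preds[:, None] - detections[None], axis=2)
        while np.isfinite(d).any():
            i, j = np.unravel_index(np.argmin(d), d.shape)
            if d[i, j] > GATE:
                break
            pairs.append((i, j))                            # track i <- detection j
            d[i, :] = np.inf
            d[:, j] = np.inf
    return pairs
```

A JPDA variant would replace the hard assignment above with probability-weighted updates over all detections inside the gate.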
RGB-D And Thermal Sensor Fusion: A Systematic Literature Review
In the last decade, the computer vision field has seen significant progress
in multimodal data fusion and learning, where multiple sensors, including
depth, infrared, and visual, are used to capture the environment across diverse
spectral ranges. Despite these advancements, there has been no systematic and
comprehensive evaluation of fusing RGB-D and thermal modalities to date. While
autonomous driving using LiDAR, radar, RGB, and other sensors has garnered
substantial research interest, along with the fusion of RGB and depth
modalities, the integration of thermal cameras and, specifically, the fusion of
RGB-D and thermal data, has received comparatively less attention. This might
be partly due to the limited number of publicly available datasets for such
applications. This paper provides a comprehensive review of both
state-of-the-art and traditional methods used in fusing RGB-D and thermal
camera data for various applications, such as site inspection, human tracking,
fault detection, and others. The reviewed literature has been categorised into
technical areas, such as 3D reconstruction, segmentation, object detection,
available datasets, and other related topics. Following a brief introduction
and an overview of the methodology, the study delves into calibration and
registration techniques, then examines thermal visualisation and 3D
reconstruction, before discussing the application of classic feature-based
techniques as well as modern deep learning approaches. The paper concludes with
a discourse on current limitations and potential future research directions. It
is hoped that this survey will serve as a valuable reference for researchers
looking to familiarise themselves with the latest advancements and contribute
to the RGB-DT research field.
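As a concrete example of the calibration-and-registration step such surveys cover, the sketch below pairs each depth pixel with a thermal reading by back-projecting it to 3D and reprojecting into the thermal camera. It assumes the intrinsics (K_d, K_t) and depth-to-thermal extrinsics (R, t) are already known from a stereo calibration; all names are illustrative, not from any specific reviewed method.

```python
import numpy as np

def register_thermal_to_depth(depth, K_d, K_t, R, t, thermal):
    """Pair every depth pixel with a thermal reading (NaN where unmapped).

    depth   : HxW depth map in metres (0 marks missing depth)
    K_d,K_t : 3x3 intrinsics of the depth and thermal cameras
    R, t    : rotation/translation from the depth to the thermal frame
    thermal : thermal image to sample from
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth.ravel()
    valid = z > 0                                         # skip missing depth
    uv1 = np.vstack([u.ravel()[valid], v.ravel()[valid],
                     np.ones(valid.sum())])
    pts = np.linalg.inv(K_d) @ (uv1 * z[valid])           # back-project to 3D
    pts_t = R @ pts + t.reshape(3, 1)                     # move into thermal frame
    front = pts_t[2] > 1e-6                               # keep points in front
    proj = K_t @ pts_t[:, front]                          # perspective projection
    ut = np.round(proj[0] / proj[2]).astype(int)
    vt = np.round(proj[1] / proj[2]).astype(int)
    inb = (ut >= 0) & (ut < thermal.shape[1]) & (vt >= 0) & (vt < thermal.shape[0])
    out = np.full(H * W, np.nan)
    out[np.flatnonzero(valid)[front][inb]] = thermal[vt[inb], ut[inb]]
    return out.reshape(H, W)
```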
Face-from-Depth for Head Pose Estimation on Depth Images
Depth cameras enable reliable solutions for people monitoring and behavior understanding, especially when unstable or poor illumination conditions render common RGB sensors unusable.
Therefore, we propose a complete framework for the estimation of the head and shoulder pose based on depth images only.
A head detection and localization module is also included, in order to develop a complete end-to-end system.
The core element of the framework is a Convolutional Neural Network, called POSEidon+, that receives as input three types of images and provides the 3D angles of the pose as output.
Moreover, a Face-from-Depth component based on a Deterministic Conditional GAN model is able to hallucinate a face from the corresponding depth image. We empirically demonstrate that this positively impacts the system performance.
We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup.
Experimental results show that our method outperforms several recent state-of-the-art approaches based on both intensity and depth input data, running in real time at more than 30 frames per second.
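As a rough illustration of the kind of regressor at the core of such a framework, the sketch below maps a cropped depth patch to three pose angles. It is a single-stream stand-in, not the actual POSEidon+ architecture (which fuses three input streams); all layer sizes are assumed.

```python
import torch
import torch.nn as nn

class HeadPoseRegressor(nn.Module):
    """Minimal CNN regressing (yaw, pitch, roll) from a cropped depth patch."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),   # 64x64 -> 32x32
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),  # 32x32 -> 16x16
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU()  # 16x16 -> 8x8
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, 256), nn.ReLU(),
            nn.Linear(256, 3)                  # yaw, pitch, roll
        )

    def forward(self, depth_patch):            # (B, 1, 64, 64) normalised depth crop
        return self.head(self.features(depth_patch))

# Usage: angles = HeadPoseRegressor()(torch.randn(8, 1, 64, 64))  # -> (8, 3)
```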
Chapter From the Lab to the Real World: Affect Recognition Using Multiple Cues and Modalities
The interdisciplinary concept of the dissipative soliton is unfolded in connection with ultrafast fibre lasers. The different mode-locking techniques as well as experimental realizations of dissipative soliton fibre lasers are surveyed briefly, with an emphasis on their energy scalability. Basic topics of dissipative soliton theory are elucidated in connection with the concepts of energy scalability and stability. It is shown that the parametric space of the dissipative soliton has a reduced dimension and a comparatively simple structure, which simplifies the analysis and optimization of ultrafast fibre lasers. The main destabilization scenarios are described, and the limits of energy scalability are connected with the impact of optical turbulence and stimulated Raman scattering. The fast and slow dynamics of vector dissipative solitons are also presented.
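For reference, dissipative-soliton theory in this setting is typically built on the complex cubic-quintic Ginzburg-Landau equation; one standard form (the symbol conventions below are a common choice, not necessarily this chapter's) reads:

```latex
% One standard form of the complex cubic-quintic Ginzburg-Landau equation:
%   \psi - field envelope,  z - propagation distance,  t - retarded time
%   D - net dispersion,  \nu - quintic refraction,
%   \delta - linear gain/loss,  \beta - spectral filtering,
%   \epsilon - cubic (nonlinear) gain,  \mu - quintic gain saturation
\begin{equation}
  i\psi_z + \frac{D}{2}\,\psi_{tt} + |\psi|^{2}\psi + \nu|\psi|^{4}\psi
  = i\left(\delta\psi + \beta\,\psi_{tt} + \epsilon|\psi|^{2}\psi
           + \mu|\psi|^{4}\psi\right)
\end{equation}
```

The balance of the dissipative terms on the right-hand side against dispersion and the Kerr nonlinearity is what yields the reduced parametric space mentioned above.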
Event-based Vision: A Survey
Event cameras are bio-inspired sensors that differ from conventional frame
cameras: Instead of capturing images at a fixed rate, they asynchronously
measure per-pixel brightness changes, and output a stream of events that encode
the time, location and sign of the brightness changes. Event cameras offer
attractive properties compared to traditional cameras: high temporal resolution
(in the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low
power consumption, and high pixel bandwidth (on the order of kHz) resulting in
reduced motion blur. Hence, event cameras have a large potential for robotics
and computer vision in challenging scenarios for traditional cameras, such as
low latency, high speed, and high dynamic range. However, novel methods are
required to process the unconventional output of these sensors in order to
unlock their potential. This paper provides a comprehensive overview of the
emerging field of event-based vision, with a focus on the applications and the
algorithms developed to unlock the outstanding properties of event cameras. We
present event cameras from their working principle, the actual sensors that are
available and the tasks that they have been used for, from low-level vision
(feature detection and tracking, optic flow, etc.) to high-level vision
(reconstruction, segmentation, recognition). We also discuss the techniques
developed to process events, including learning-based techniques, as well as
specialized processors for these novel sensors, such as spiking neural
networks. Additionally, we highlight the challenges that remain to be tackled
and the opportunities that lie ahead in the search for a more efficient,
bio-inspired way for machines to perceive and interact with the world.
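To make the sensor output concrete: each event is a tuple encoding the time, pixel location, and polarity (sign) of a brightness change, as described above. The sketch below shows the simplest way to aggregate a stream of such events into a signed frame; the resolution is an assumed example, and richer representations (time surfaces, voxel grids) used by the surveyed learning-based methods build on the same idea.

```python
import numpy as np

W, H = 346, 260   # assumed example resolution (DAVIS346-like)

def accumulate(events, t0, t1):
    """Build a signed event frame from all events in the window [t0, t1).

    events : iterable of (t [s], x, y, polarity in {-1, +1})
    """
    frame = np.zeros((H, W), dtype=np.int32)
    for t, x, y, p in events:
        if t0 <= t < t1:
            frame[y, x] += p          # +1 brightness increase, -1 decrease
    return frame

# Usage with synthetic events:
events = [(0.001, 10, 20, 1), (0.002, 10, 20, -1), (0.003, 50, 60, 1)]
img = accumulate(events, 0.0, 0.005)
```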
Design and Characterization of a Dust Injector for Future Studies of Tungsten Dust in the STOR-M Plasma
Dust generation from Plasma Facing Components (PFC) is a problem for tokamaks as they
approach suitable reactor conditions. Tungsten dust is especially detrimental in the core,
due to the associated high-Z bremsstrahlung power losses. As tungsten is a primary candidate
for PFC materials in large projects such as ITER, this remains a pressing issue. In order
to better understand dust dynamics in tokamaks, a dust injection experiment is proposed
for the Saskatchewan Torus-Modified (STOR-M). This experiment will utilize calibrated,
spherical tungsten micro-particles. A known mass of these micro-particles is
to be injected into STOR-M with control over the position of the dust plume. This will
enable future observation and study of dust dynamics within STOR-M.
In preparation for this experiment, a new dust injector has been designed, based on the
fast gas valve for the University of Saskatchewan Compact Torus Injector. An experimental
test apparatus was developed to characterize the dust injector. In the experiment, nitrogen
gas and dust particles are injected into the test vacuum chamber under various dust
injector parameters. Vacuum chamber pressures range from 10⁻⁴ to 10⁻⁵ Torr, which is
within the operation range of STOR-M. These particles are then imaged with a high-speed
camera via laser light scattering. Collected 12-bit raw image data was then processed and
analysed. This analysis fully characterizes the dust injector in terms of the time evolution
of the injector's dust plume, the amount of gas injected, and the injected dust mass.
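As a rough sketch of the kind of image analysis described (not the thesis's actual pipeline), the time evolution of the dust plume can be summarised by thresholding each 12-bit frame and counting scattering pixels; the threshold fraction below is an assumed value.

```python
import numpy as np

BIT_DEPTH = 12
THRESH = 0.05 * (2**BIT_DEPTH - 1)   # scattering threshold; assumed fraction

def plume_evolution(frames, fps):
    """Rough time evolution of a dust plume from high-speed frames.

    frames : sequence of HxW uint16 arrays holding 12-bit raw data
    fps    : camera frame rate
    Returns (times [s], scattering-pixel counts), a simple proxy for the
    amount of dust crossing the laser sheet in each frame.
    """
    counts = np.array([(f > THRESH).sum() for f in frames])
    times = np.arange(len(frames)) / fps
    return times, counts
```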
Computer vision-based posture estimation and fall detection
Falls are a major health problem, especially in the elderly population. The increasing number of fall events demands a high quality of service and dedicated medical treatment, which is an economic burden. Serious injuries due to falls can cost lives in the absence of immediate care and support. Therefore, a monitoring system that can accurately detect fall events and generate instant alerts for immediate care is extremely necessary. To address this problem, this research aims to develop a computer vision-based fall detection system. This study proposes fall detection in three stages: (A) detection of the human silhouette and recognition of the pose, (B) detection of the human as three regions for different postures including falls, and (C) recognition of fall and non-fall events using the locations of human body regions as distinguishing features.
The first stage of this work comprises human silhouette detection and identification of activities in the form of different poses. Identifying a pose is important for understanding a fall event, since a change of pose defines its characteristics. A fall event comprises a sequential change of poses and ends in a lying pose. The initial pose during a fall can be standing, sitting or bending, but the final pose is usually a lying pose. It would therefore be beneficial if the lying pose were recognised more accurately than other normal activities such as standing, sitting, bending or crawling. Hence, in the first stage, Background Subtraction (BS) is used to detect the human silhouette. After background subtraction, the foreground images are fed to a Convolutional Neural Network (CNN) to recognise different poses. The RGB and depth images were captured with a Kinect sensor, and their fusion was explored as input to the CNN. Depth and RGB complemented each other, overcoming their respective weaknesses, and the fusion proved to be a significant strategy. The classification was performed using a CNN to recognise different activities with 81% accuracy on validation.
The other challenge in fall detection is tracking a person during a fall. Background subtraction is not sufficient to track a fallen person, especially when there are lighting and viewpoint variations in the environment and other objects are present, such as furniture, a pet or even another person. Furthermore, tracking becomes tougher during a fall than during normal activities like walking or sitting, because the rate of pose change is higher. To overcome this, the idea is to locate the body regions in every frame and use this as a stable tracking strategy. The locations of the body parts provide crucial information to distinguish falls from other normal activities, as the person is detected at all times. Hence, the second stage of this research consists of posture detection using a pose estimation technique. This research proposes CNN-based pose estimation using simplified human postures. The available joints are grouped into three regions: Head, Torso and Leg, which are fed to the CNN model as just three inputs instead of the several available joints. This strategy added stability to pose detection and proved more effective for the complex poses observed during a fall. To train the CNN model, a transfer learning technique was used. The model achieved 96.7% accuracy in detecting the three regions on different human postures on a publicly available dataset.
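A minimal sketch of the three-region simplification described above, assuming an 18-keypoint skeleton whose joint indexing is illustrative (the thesis does not specify it here):

```python
import numpy as np

# Assumed joint grouping for an 18-keypoint skeleton (illustrative).
HEAD  = [0, 1, 2, 3, 4]        # nose, eyes, ears
TORSO = [5, 6, 11, 12]         # shoulders and hips
LEG   = [13, 14, 15, 16, 17]   # knees, ankles, feet

def three_regions(joints):
    """Collapse per-joint (x, y) estimates into Head/Torso/Leg centroids.

    joints : (18, 2) array of image coordinates; NaN for undetected joints.
    Returns a (3, 2) array: the simplified three-region posture used as
    the tracking-stable feature in stages (B) and (C).
    """
    return np.vstack([np.nanmean(joints[idx], axis=0)
                      for idx in (HEAD, TORSO, LEG)])
```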
A system that treats every lying pose as a fall can also generate a high false-alarm rate: lying on a bed or sofa would easily trigger a fall alarm. Hence, it is important to recognise an actual fall by considering the sequence of frames that defines a fall, not just the lying pose. In the third and final stage, this study proposes fall detection based on Long Short-Term Memory (LSTM) recurrent networks. The proposed LSTM model uses the detected locations of the three regions as input features. An LSTM can exploit contextual information from sequential input patterns, so the model was trained on sequences of location features for different postures. The model was able to learn fall patterns and distinguish them from other activities with 88.33% accuracy. Furthermore, the precision of the fall class was 1.0. This is highly desirable for fall detection, as it implies no false alarms, so the cost of calling medical support for a false alarm can be avoided entirely.
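A minimal sketch of the stage (C) classifier, assuming sequences of the three region locations (three (x, y) pairs per frame) as input; the layer sizes are illustrative, not the thesis's configuration.

```python
import torch
import torch.nn as nn

class FallLSTM(nn.Module):
    """Sequence classifier over three-region locations, as in stage (C)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=6, hidden_size=hidden, batch_first=True)
        self.cls = nn.Linear(hidden, 2)     # fall / non-fall logits

    def forward(self, seq):                 # seq: (B, T, 6) = 3 regions x (x, y)
        out, _ = self.lstm(seq)
        return self.cls(out[:, -1])         # classify from the last time step

# Usage: logits = FallLSTM()(torch.randn(4, 30, 6))  # 30-frame windows
```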