
    An assistive model of obstacle detection based on deep learning: YOLOv3 for visually impaired people

    The World Health Organization (WHO) reported in 2019 that at least 2.2 billion people had a visual impairment or blindness. A major problem for visually impaired people is the difficulty of moving around, whether indoors or outdoors, which makes their daily lives unsafe. In this paper, we propose an assistive application model for visually impaired people that runs on a smartphone and is based on deep learning: YOLOv3 with a Darknet-53 base network. The Pascal VOC2007 and Pascal VOC2012 datasets were used for training, and the Pascal VOC2007 test set was used for validation. The assistive model was installed on a smartphone with an eSpeak synthesizer, which generates the audio output for the user. The experimental results showed high speed and high detection accuracy. The proposed application can be an effective way to help visually impaired people interact with their surrounding environment in daily life.

    Object Detection and Tracking in Wide Area Surveillance Using Thermal Imagery

    The main objective of this thesis is to examine how existing vision-based detection and tracking algorithms perform in thermal imagery-based video surveillance. While color-based surveillance has been extensively studied, these techniques cannot be used in low illumination, at night, or with lighting changes and shadows, which limits their applicability. The main contributions of this thesis are (1) the creation of a new color-thermal dataset, (2) a detailed performance comparison of different color-based detection and tracking algorithms on thermal data and (3) the proposal of an adaptive neural network for false detection rejection. Since there are not many publicly available datasets for thermal-video surveillance, a new UNLV Thermal Color Pedestrian Dataset was collected to evaluate the performance of popular color-based detection and tracking methods on thermal images. The dataset provides an overhead view of humans walking through a courtyard and is appropriate for aerial surveillance scenarios such as unmanned aerial systems (UAS). Three popular detection schemes are studied for thermal pedestrian detection: (1) Haar-like features, (2) local binary patterns (LBP) and (3) background subtraction motion detection. Two methods are used for tracking: (1) a Kalman filter predictor and (2) optical flow. Results show that combining Haar and LBP detections with a 50% overlap rule and tracking using Kalman filters can improve the true positive rate (TPR) of detection by 20%. However, motion-based methods are better at rejecting false positives in non-moving camera scenarios. The Kalman filter with LBP detection is the most efficient tracker, but optical flow better rejects false noise detections. This thesis also presents a technique for learning and characterizing pedestrian detections with heat maps and an object-centric motion compensation method for UAS. Finally, an adaptive method is presented that rejects false detections using a neural network trained with error backpropagation. The adaptive rejection scheme successfully learns to identify static false detections, improving detection performance.
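The abstract above does not spell out how the 50% overlap rule is applied. As a minimal sketch, assuming the rule means that a Haar box and an LBP box count as the same pedestrian when their intersection over union (IoU) is at least 0.5, detections from the two detectors could be pooled as follows. The function names, the `(x1, y1, x2, y2)` box format, and the choice to keep the Haar box for a matched pair are all illustrative assumptions, not the thesis's implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def combine_detections(haar_boxes, lbp_boxes, min_overlap=0.5):
    """Pool two detectors' boxes, treating overlapping pairs as duplicates.

    Assumption: an LBP box whose IoU with any Haar box reaches min_overlap
    is considered the same detection and is dropped in favor of the Haar box.
    """
    merged = list(haar_boxes)
    for box in lbp_boxes:
        if all(iou(box, kept) < min_overlap for kept in haar_boxes):
            merged.append(box)
    return merged


haar = [(0, 0, 10, 10)]
lbp = [(1, 1, 10, 10), (20, 20, 30, 30)]
combined = combine_detections(haar, lbp)
```

In this example the first LBP box overlaps the Haar box with an IoU of 0.81 and is treated as a duplicate, while the second LBP box is kept as an additional detection.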

    Development and Application of Fire Video Image Detection Technology in China’s Road Tunnels

    A large number of highway tunnels, urban road tunnels and underwater tunnels have been constructed throughout China over the last two decades. With the rapid increase in vehicle traffic, the number of fire incidents in road tunnels has also substantially increased. This paper reviews the development and application of fire video image detection (VID) technology and its impact on fire safety in China’s road tunnels. The challenges of fire safety in China’s road tunnels are analyzed. The capabilities and limitations of fire detection technologies currently used in China’s road tunnels are discussed. The research and development of fire VID technology in road tunnels is reviewed, including various detection algorithms, the evolution of VID systems and the evaluation of their performance in various tunnel tests. Some cases involving VID applications in China’s road tunnels are reported. The studies show that fire VID systems have unique features in providing fire protection, and that their detection capability and reliability have been enhanced over the decades with advances in detection algorithms, hardware and integration with other tunnel systems. They have become an important safety system in China’s road tunnels.

    Ball Trajectory Inference from Multi-Agent Sports Contexts Using Set Transformer and Hierarchical Bi-LSTM

    As artificial intelligence spreads to numerous fields, its application to sports analytics is also in the spotlight. However, one of the major challenges is the difficulty of automatically acquiring continuous movement data during sports matches. In particular, it is hard to reliably track a tiny ball on a wide soccer pitch with obstacles such as occlusion and imitations. To tackle this problem, this paper proposes a framework for inferring the ball trajectory from player trajectories as a cost-efficient alternative to ball tracking. We combine Set Transformers, which give permutation-invariant and equivariant representations of the multi-agent contexts, with a hierarchical architecture that predicts player ball possession as an intermediate step to support the final trajectory inference. We also introduce a reality loss term and postprocessing to ensure that the estimated trajectories are physically realistic. The experimental results show that our model provides natural and accurate trajectories as well as admissible player ball possession at the same time. Lastly, we suggest several practical applications of our framework, including missing trajectory imputation, semi-automated pass annotation, automated zoom-in for match broadcasting, and calculation of possession-wise running performance metrics.

    Estimating Level of Engagement from Ocular Landmarks

    E-learning offers many advantages, such as being economical, flexible and customizable, but it also has challenging aspects such as a lack of social interaction, which results in contemplation and a sense of remoteness. To overcome these and sustain learners’ motivation, various stimuli can be incorporated. Nevertheless, such adjustments first require an assessment of engagement level. In this respect, we propose estimating engagement level from facial landmarks, exploiting the facts that (i) perceptual decoupling is promoted by blinking during mentally demanding tasks; (ii) eye strain increases blinking rate, which also scales with task disengagement; (iii) eye aspect ratio is closely connected with attentional state; and (iv) users’ head position is correlated with their level of involvement. Building empirical models of these actions, we devise a probabilistic estimation framework. Our results indicate that high and low levels of engagement are identified with considerable accuracy, whereas medium levels are inherently more challenging, which is also confirmed by the inter-rater agreement of expert coders.
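The abstract above names the eye aspect ratio without defining it. A minimal sketch follows, assuming the commonly used six-landmark definition, EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|), where p1 and p4 are the eye corners and the other points lie on the upper and lower eyelids. The landmark ordering and the function name are assumptions made for illustration; the paper may compute the ratio differently.

```python
import math


def eye_aspect_ratio(p1, p2, p3, p4, p5, p6):
    """Eye aspect ratio from six (x, y) eye landmarks.

    Assumed ordering: p1 and p4 are the horizontal eye corners,
    p2 and p3 lie on the upper eyelid, p6 and p5 on the lower eyelid.
    The value drops toward 0 as the eye closes.
    """
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    if horizontal == 0:
        raise ValueError("eye corners p1 and p4 coincide")
    return vertical / (2.0 * horizontal)


# Hypothetical landmark coordinates for an open and a fully closed eye.
open_eye = eye_aspect_ratio((0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1))
closed_eye = eye_aspect_ratio((0, 0), (1, 0), (2, 0), (3, 0), (2, 0), (1, 0))
```

Tracking this value over video frames and counting how often it dips below a chosen threshold is one plausible way to obtain the blink rate that the abstract relates to engagement; the threshold itself would have to be tuned and is not given in the source.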