
    Adaptive Deep Learning Detection Model for Multi-Foggy Images

    Fog has different features and effects in every environment. Detecting whether there is fog in an image is considered a challenge, and identifying the type of fog has a substantial effect on image defogging. Foggy scenes come in different types, such as scenes categorized by fog density level and scenes categorized by fog type. Machine learning techniques have contributed significantly to the detection of foggy scenes. However, most existing detection models are based on traditional machine learning, and only a few studies have adopted deep learning models. Furthermore, most existing machine learning detection models address fog density-level scenes; to the best of our knowledge, no detection model based on multi-fog-type scenes has been presented yet. Therefore, the main goal of our study is to propose an adaptive deep learning model for the detection of multi-fog types of images. Moreover, due to the lack of a publicly available dataset for inhomogeneous, homogeneous, dark, and sky foggy scenes, a dataset for multi-fog scenes is presented in this study (https://github.com/Karrar-H-Abdulkareem/Multi-Fog-Dataset). Experiments were conducted in three stages. First, the data collection phase drew on eight resources to obtain the multi-fog scene dataset. Second, a classification experiment was conducted based on the ResNet-50 deep learning model to obtain detection results. Third, in the evaluation phase, the performance of the ResNet-50 detection model was compared against three different models. Experimental results show that the proposed model delivers stable classification performance for different foggy images, with a 96% score for each of Classification Accuracy Rate (CAR), Recall, Precision, and F1-Score, which has both theoretical and practical significance. The proposed model is suitable as a pre-processing step and might be considered in different real-time applications.
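    As a rough illustration of the classification stage described above, the sketch below fine-tunes an ImageNet-pretrained ResNet-50 for the four scene types named in the abstract. The training loop, data pipeline, and hyperparameters are omitted, and the class ordering is an assumption.

```python
# Minimal sketch of the fog-type classification stage, assuming PyTorch/torchvision.
# Class names follow the four scene types named in the abstract; ordering is assumed.
import torch
import torch.nn as nn
from torchvision import models

FOG_CLASSES = ["inhomogeneous", "homogeneous", "dark", "sky"]

def build_fog_classifier(num_classes: int = len(FOG_CLASSES)) -> nn.Module:
    # Start from an ImageNet-pretrained ResNet-50 and replace the final layer.
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

model = build_fog_classifier()
logits = model(torch.randn(1, 3, 224, 224))      # one 224x224 RGB image
predicted = FOG_CLASSES[logits.argmax(dim=1).item()]
```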

    Computational Media Aesthetics for Media Synthesis

    Ph.D. (Doctor of Philosophy) thesis.

    The relationship between wayfinding performance, spatial layout and landmarks in virtual environments

    Environmental factors, including landmarks, that affect people's wayfinding performance in unfamiliar environments have been discussed in a great number of studies. However, there is still no consensus on the factors that shape people's performance or on what makes a landmark preferable during wayfinding. Hence, this study aims to understand the impact of different spatial layouts, environmental conditions and landmarks on people's wayfinding performance, and the factors that make landmarks salient. Sea Hero Quest (SHQ), an online game that has been played by more than 4.3 million people from 2016 to date, is selected as a case study to investigate the impact of different environments and other factors, in particular landmarks. Forty-five wayfinding levels of SHQ are analysed and compared using Geographic Information System (GIS) and space syntax axial, segment and visibility graph analyses. A cluster analysis is conducted to examine the relationship between levels. Varying conditions associated with landmarks, weather and maps were taken into consideration. In order to investigate the process of selecting landmarks, visual, structural (whether landmarks are global or local) and cognitive saliency are analysed using web-based surveys, saliency algorithms and the visibility of landmarks. Results of this study show that the complexity of layouts plays a major role in wayfinding; as the complexity of a layout increases, so does the time taken to complete the wayfinding task. Similarly, the weather condition has an effect; as the weather becomes foggy and visibility decreases, the time taken to complete the wayfinding task increases. It is discovered that landmarks that are visible for more than 25% of a journey can be defined as global landmarks, whereas the rest can be defined as local landmarks. Findings also show that landmarks that are visually salient (objects with a unique colour and size) and structurally salient (objects that are closer to people) are registered more by people in unfamiliar environments. This study contributes to the existing literature by exploring the factors that affect people's wayfinding performance using the largest dataset in the field (thus providing more accurate results), focusing on 45 different layouts (while current research studies mostly focus on one or two layouts), proposing a threshold to distinguish global from local landmarks, and analysing visual, structural and cognitive saliency through various measures.
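    The 25% visibility threshold proposed above reduces to a simple rule, sketched below over per-step visibility flags along a journey. The boolean visibility input and the step sampling are assumptions, since the study derives visibility from visibility graph analysis.

```python
# Hedged sketch of the global/local landmark rule: a landmark visible for more
# than 25% of a journey is treated as global. Per-step visibility flags are an
# assumed input format.
from typing import Sequence

GLOBAL_VISIBILITY_THRESHOLD = 0.25

def classify_landmark(visible_at_step: Sequence[bool]) -> str:
    """Classify a landmark from per-step visibility along one journey."""
    fraction_visible = sum(visible_at_step) / len(visible_at_step)
    return "global" if fraction_visible > GLOBAL_VISIBILITY_THRESHOLD else "local"

# Example: visible in 3 of 10 sampled steps -> 30% -> global landmark.
print(classify_landmark([True, True, True] + [False] * 7))
```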

    Semi-supervised wildfire smoke detection based on smoke-aware consistency

    The semi-transparency of smoke blends it strongly with the background contextual information in an image, which results in great visual differences across different areas. In addition, the limited annotation of smoke images from real forest scenarios brings further challenges for model training. In this paper, we design a semi-supervised learning strategy, named smoke-aware consistency (SAC), to maintain pixel and context perceptual consistency across different backgrounds. Furthermore, we propose a smoke detection strategy with triple classification assistance for discriminating smoke from smoke-like objects. Finally, we simplify the LFNet fire-smoke detection network to LFNet-v2, since the proposed SAC and triple classification assistance can perform the functions of some specific modules. Extensive experiments validate that the proposed method significantly outperforms state-of-the-art object detection algorithms on wildfire smoke datasets and achieves satisfactory performance under challenging weather conditions.
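    As a hedged illustration of the consistency idea behind SAC, the sketch below shows a generic pixel-level consistency loss between predictions on a clean view and an augmented view of the same unlabeled image. This is a textbook formulation, not the authors' exact objective; `model` and `augment` are assumed callables.

```python
# Generic pixel-level consistency objective (illustrative, not the SAC loss):
# predictions on an unlabeled image and on an augmented view are pushed to agree.
import torch
import torch.nn.functional as F

def consistency_loss(model, image: torch.Tensor, augment) -> torch.Tensor:
    with torch.no_grad():
        teacher_probs = torch.sigmoid(model(image))   # pseudo-target on the clean view
    student_logits = model(augment(image))            # prediction on the perturbed view
    return F.mse_loss(torch.sigmoid(student_logits), teacher_probs)
```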

    Biologically-inspired robust motion segmentation using mutual information

    This paper presents a neuroscience-inspired, information-theoretic approach to motion segmentation. Robust motion segmentation is a fundamental first stage in many surveillance tasks. As an alternative to widely adopted individual segmentation approaches, which are challenged in different ways by imagery exhibiting a wide range of environmental variation and irrelevant motion, this paper presents a new biologically-inspired approach which computes the multivariate mutual information between multiple complementary motion segmentation outputs. Performance evaluation across a range of datasets and against competing segmentation methods demonstrates robust performance.
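    The core quantity here is mutual information between segmentation outputs. A minimal pairwise version over binary masks, estimated from a joint histogram, is sketched below; the paper's multivariate extension over several complementary segmenters is not reproduced.

```python
# Sketch: mutual information between two binary segmentation masks via their
# 2x2 joint histogram. Pairwise only; the paper uses a multivariate form.
import numpy as np

def mutual_information(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    joint, _, _ = np.histogram2d(mask_a.ravel(), mask_b.ravel(),
                                 bins=2, range=[[-0.5, 1.5], [-0.5, 1.5]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of mask_a
    py = pxy.sum(axis=0, keepdims=True)   # marginal of mask_b
    nz = pxy > 0                          # skip zero cells to avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
```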

    Expression-dependent susceptibility to face distortions in processing of facial expressions of emotion

    Our capability to recognize facial expressions of emotion under different viewing conditions implies the existence of an invariant expression representation. As natural visual signals are often distorted, and our perceptual strategy changes with external noise level, it is essential to understand how expression perception is susceptible to face distortion and whether the same facial cues are used to process high- and low-quality face images. We systematically manipulated face image resolution (experiment 1) and blur (experiment 2), and measured participants' expression categorization accuracy, perceived expression intensity and associated gaze patterns. Our analysis revealed a reasonable tolerance to face distortion in expression perception. Reducing image resolution down to 48 × 64 pixels or increasing image blur up to 15 cycles/image had little impact on expression assessment and associated gaze behaviour. Further distortion led to decreased expression categorization accuracy and intensity ratings, increased reaction times and fixation durations, and a stronger central fixation bias that was not driven by distortion-induced changes in local image saliency. Interestingly, the observed distortion effects were expression-dependent, with less deterioration for happy and surprise expressions, suggesting that this distortion-invariant facial expression perception might be achieved through a categorical model involving a non-linear configural combination of local facial features.
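    The two stimulus manipulations can be sketched as simple image operations. The following is a rough illustration assuming grayscale inputs; the mapping from a cycles/image cutoff to a Gaussian sigma is a heuristic approximation, not the study's exact filter.

```python
# Illustrative versions of the two manipulations: resolution reduction
# (e.g. to 48x64 pixels) and low-pass blurring with a cycles/image cutoff.
import numpy as np
from scipy import ndimage
from skimage.transform import resize

def reduce_resolution(face: np.ndarray, size=(64, 48)) -> np.ndarray:
    # Downsample to the target pixel grid (height, width).
    return resize(face, size, anti_aliasing=True)

def low_pass(face: np.ndarray, cutoff_cycles: float = 15.0) -> np.ndarray:
    # Rough heuristic converting a cycles/image cutoff into a Gaussian sigma
    # in pixels; the study's actual filter parameters may differ.
    sigma = face.shape[0] / (2 * np.pi * cutoff_cycles)
    return ndimage.gaussian_filter(face, sigma=sigma)
```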

    Extraction and Investigation of Biosignal Features for the Assessment of Visual Discomfort

    Comfortable stereoscopic perception continues to be an essential area of research. Growing interest in virtual reality content and the expanding market for head-mounted displays (HMDs) still raise the issue of balancing depth perception against comfortable viewing. Stereoscopic views stimulate binocular cues, one of several available human visual depth cues, which become conflicting cues when stereoscopic displays are used. Depth perception by binocular cues is based on matching image features from one retina with corresponding features from the second retina. It is known that our eyes can tolerate small amounts of retinal defocus, also known as depth of focus; when the magnitudes are larger, visual discomfort arises. The research object of this doctoral dissertation is the level of visual discomfort, and the work aimed at the objective evaluation of visual discomfort based on physiological signals. Different levels of disparity and different numbers of details in stereoscopic views can make it difficult to quickly find the focus point for comfortable depth perception. During this investigation, a tendency for differences in single-sensor electroencephalography (EEG) signal activity at specific frequencies was found, and changes in gaze signals collected by an eye tracker were also observed. A dataset of EEG and gaze signal records from 28 control subjects was collected and used for further evaluation. The dissertation consists of an introduction, three chapters and general conclusions. The first chapter reviews objective and subjective methods of measuring visual discomfort. The second chapter presents theoretical research investigating methods that use physiological signals to detect changes in the level of sense of presence. The third chapter presents experimental research aimed at finding differences in the collected physiological signals when the level of visual discomfort changes; an experiment with 28 control subjects was conducted to collect these signals. The results of the thesis were published in six scientific publications (three in peer-reviewed scientific journals, three in conference proceedings) and presented at 8 conferences.
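    A typical feature for comparing "EEG signal activity at specific frequencies" is band power. The sketch below computes it with Welch's method; the sampling rate, band edges and window length are assumptions, not the dissertation's exact settings.

```python
# Standard band-power feature from a single-sensor EEG trace (assumed settings).
import numpy as np
from scipy.signal import welch

def band_power(eeg: np.ndarray, fs: float, low: float, high: float) -> float:
    # Welch power spectral density, then integrate over the requested band.
    freqs, psd = welch(eeg, fs=fs, nperseg=int(2 * fs))
    band = (freqs >= low) & (freqs <= high)
    return float(np.trapz(psd[band], freqs[band]))

# Example: alpha-band (8-13 Hz) power from one minute of data at 256 Hz.
alpha = band_power(np.random.randn(256 * 60), fs=256, low=8, high=13)
```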

    Three-dimensional imaging with multiple degrees of freedom using data fusion

    This paper presents an overview of research work, along with some novel strategies and results, on using data fusion in 3-D imaging with multiple information sources. We examine a variety of approaches and applications, such as 3-D imaging integrated with polarimetric and multispectral imaging, low levels of photon flux for photon-counting 3-D imaging, and image fusion in both multiwavelength 3-D digital holography and 3-D integral imaging. Results demonstrate the benefits data fusion provides for different purposes, including visualization enhancement under different conditions and improved 3-D reconstruction quality.
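    As one concrete piece of the photon-counting setting mentioned above, the sketch below shows a standard Poisson observation model that thins a normalized irradiance image into per-pixel photon counts. `expected_photons` is an assumed parameter, and this is illustrative rather than the paper's pipeline.

```python
# Standard photon-counting observation model (illustrative): the irradiance image
# is normalized to a probability map, scaled to a photon budget, and each pixel's
# count is drawn from an independent Poisson distribution.
import numpy as np

def photon_count_image(irradiance: np.ndarray, expected_photons: float,
                       rng=None) -> np.ndarray:
    rng = rng or np.random.default_rng()
    rate = expected_photons * irradiance / irradiance.sum()
    return rng.poisson(rate)
```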

    Sensor fusion in distributed cortical circuits

    The substantial motion of nature is to balance, to survive, and to reach perfection. Evolution in biological systems is a key signature of this quintessence. Survival cannot be achieved without understanding the surrounding world. How could a fruit fly live without searching for food, and hence without any form of perception to guide its behavior? The nervous system of the fruit fly, with its hundred thousand neurons, can perform very complicated tasks that are beyond the power of an advanced supercomputer. Recently developed computing machines are made of billions of transistors and are remarkably fast at precise calculations, yet they are unable to perform a single task that an insect accomplishes with thousands of neurons. The complexity of information processing and data compression in a single biological neuron and in neural circuits is not comparable with what has been achieved to date in transistors and integrated circuits. Moreover, the style of information processing in neural systems is very different from that employed by microprocessors, which is mostly centralized. Almost all cognitive functions are generated by the combined effort of multiple brain areas. In mammals, cortical regions are organized hierarchically and are reciprocally interconnected, exchanging information from multiple senses. This hierarchy at the circuit level also preserves the sensory world at different levels of complexity and within the scope of multiple modalities. The main behavioral advantage of this is to understand the real world through multiple sensory systems, thereby providing a robust and coherent form of perception. When the quality of a sensory signal drops, the brain can alternatively employ other information pathways to handle cognitive tasks, or even to calibrate the error-prone sensory node. The mammalian brain also takes good advantage of multimodal processing in learning and development, where one sensory system helps another sensory modality to develop. Multisensory integration is considered one of the main factors that generate consciousness in humans, although we still do not know where exactly the information is consolidated into a single percept, or what the underpinning neural mechanism of this process is. One straightforward hypothesis suggests that unisensory signals are pooled in a poly-sensory convergence zone, which creates a unified form of perception; but it is hard to believe that a single dedicated region realizes this functionality. Using a set of realistic neuro-computational principles, I have explored theoretically how multisensory integration can be performed within a distributed hierarchical circuit. I argue that the interaction of cortical populations can be interpreted as a specific form of relation satisfaction, in which the information preserved in one neural ensemble must agree with incoming signals from connected populations according to a relation function. This relation function can be seen as a coherency function that is implicitly learnt through synaptic strength. Apart from the fact that the real world is composed of multisensory attributes, sensory signals are subject to uncertainty. This requires a cortical mechanism to incorporate the statistical parameters of the sensory world in neural circuits and to deal with inaccuracy in perception.
    I argue in this thesis that the intrinsic stochasticity of neural activity enables a systematic mechanism to encode probabilistic quantities, such as reliability and prior probability, within neural circuits. The systematic benefit of neural stochasticity is well illustrated by the Duns Scotus paradox: imagine a donkey with a deterministic brain exposed to two identical food rewards; the animal may suffer and die starving because of indecision. In this thesis, I introduce an optimal encoding framework that can describe the probability function of a Gaussian-like random variable in a pool of Poisson neurons. A distributed neural model is then proposed that can optimally combine conditional probabilities over sensory signals in order to compute Bayesian multisensory causal inference. This is known as a complex multisensory function in the cortex, and it has recently been found to be performed within a distributed hierarchy in sensory cortex. Our work is among the first successful attempts to put a mechanistic spotlight on the neural mechanism underlying multisensory causal perception in the brain and, more generally, on the theory of decentralized multisensory integration in sensory cortex. Interest in engineering the brain's information-processing concepts into new computing technologies has been growing recently; neuromorphic engineering is a new branch that undertakes this mission. In a dedicated part of this thesis, I propose a neuromorphic algorithm for event-based stereoscopic fusion. The algorithm is anchored in the idea of cooperative computing, imposing the epipolar and temporal constraints of the stereoscopic setup on the neural dynamics. The performance of this algorithm is tested using a pair of silicon retinas.
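    As a generic illustration of the Bayesian multisensory causal inference mentioned above, the sketch below compares a common-cause hypothesis against an independent-causes hypothesis for two noisy cues via numerical integration, and fuses the cues by reliability weighting under the common-cause hypothesis. This is a standard textbook formulation with assumed parameters, not the thesis's distributed neural implementation.

```python
# Generic two-cue Bayesian causal inference (illustrative; assumed parameters).
# x_v, x_a: noisy visual and auditory measurements of a source location.
import numpy as np
from scipy.stats import norm

def causal_inference(x_v, x_a, sigma_v=1.0, sigma_a=2.0, sigma_p=10.0, p_common=0.5):
    s = np.linspace(-60.0, 60.0, 4001)        # candidate source locations
    prior = norm.pdf(s, 0.0, sigma_p)
    lv = norm.pdf(x_v, s, sigma_v)            # visual likelihood over locations
    la = norm.pdf(x_a, s, sigma_a)            # auditory likelihood over locations
    like_c1 = np.trapz(lv * la * prior, s)    # one common cause explains both cues
    like_c2 = np.trapz(lv * prior, s) * np.trapz(la * prior, s)  # independent causes
    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1.0 - p_common))
    # Reliability-weighted fusion of the cues under the common-cause hypothesis.
    w_v, w_a = 1.0 / sigma_v**2, 1.0 / sigma_a**2
    s_fused = (w_v * x_v + w_a * x_a) / (w_v + w_a)
    return post_c1, s_fused

posterior_common, fused_location = causal_inference(x_v=1.0, x_a=2.5)
```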