
562 research outputs found

Unsupervised Deep Single-Image Intrinsic Decomposition using Illumination-Varying Image Sequences

Author: Bonneel N.
Garces E.
Lalonde J.‐F.
Meka A.
Publication date: 03/09/2018

Machine-learning-based Single Image Intrinsic Decomposition (SIID) methods decompose a captured scene into its albedo and shading images by using the knowledge of a large set of known and realistic ground truth decompositions. Collecting and annotating such a dataset is an approach that cannot scale to sufficient variety and realism. We free ourselves from this limitation by training on unannotated images. Our method leverages the observation that two images of the same scene but with different lighting provide useful information on their intrinsic properties: by definition, albedo is invariant to lighting conditions, and cross-combining the estimated albedo of a first image with the estimated shading of a second one should lead back to the second one's input image. We transcribe this relationship into a siamese training scheme for a deep convolutional neural network that decomposes a single image into albedo and shading. The siamese setting allows us to introduce a new loss function including such cross-combinations, and to train solely on (time-lapse) images, discarding the need for any ground truth annotations. As a result, our method has the good properties of i) taking advantage of the time-varying information of image sequences in the (pre-computed) training step, ii) not requiring ground truth data to train on, and iii) being able to decompose single images of unseen scenes at runtime. To demonstrate and evaluate our work, we additionally propose a new rendered dataset containing illumination-varying scenes and a set of quantitative metrics to evaluate SIID algorithms. Despite its unsupervised nature, our results compete with state-of-the-art methods, including supervised and non-data-driven methods. Comment: To appear in Pacific Graphics 201
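The cross-combination idea in this abstract can be illustrated with a small sketch. This is not the authors' loss function: it assumes the common pixel-wise model image = albedo * shading, represents images as flat lists of floats, and uses hypothetical helper names (mse, recombine, cross_combination_loss) with equal weighting of the three terms.

```python
def mse(xs, ys):
    """Mean squared error between two equal-length lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def recombine(albedo, shading):
    """Pixel-wise product, assuming the model image = albedo * shading."""
    return [a * s for a, s in zip(albedo, shading)]


def cross_combination_loss(pred_1, pred_2, image_1, image_2):
    """Illustrative loss for two images of one scene under different lighting.

    pred_k is an (albedo, shading) pair estimated from image_k.
    The term weights (all 1.0) are an assumption made for this sketch.
    """
    albedo_1, shading_1 = pred_1
    albedo_2, shading_2 = pred_2
    # Albedo from one image combined with shading from the other
    # should reproduce the other image.
    cross_12 = mse(recombine(albedo_1, shading_2), image_2)
    cross_21 = mse(recombine(albedo_2, shading_1), image_1)
    # Albedo should be invariant to the lighting change.
    albedo_consistency = mse(albedo_1, albedo_2)
    return cross_12 + cross_21 + albedo_consistency
```

With a perfect decomposition every term vanishes, so no ground truth albedo or shading is needed to compute the loss; only the two input images are compared against recombined predictions.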

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref

Gaussian mixture model classifiers for detection and tracking in UAV video streams.

Author: Pillay Treshan
Publication date: 01/01/2017

Masters Degree. University of KwaZulu-Natal, Durban. Manual visual surveillance systems are subject to a high degree of human error and operator fatigue. The automation of such systems often employs detectors, trackers and classifiers as fundamental building blocks. Detection, tracking and classification are especially useful and challenging in Unmanned Aerial Vehicle (UAV) based surveillance systems. Previous solutions have addressed challenges via complex classification methods. This dissertation proposes less complex Gaussian Mixture Model (GMM) based classifiers that can simplify the process: data is represented as a reduced set of model parameters, and classification is performed in the low-dimensional parameter space. The specification and adoption of GMM based classifiers on the UAV visual tracking feature space formed the principal contribution of the work. This methodology can be generalised to other feature spaces. This dissertation presents two main contributions in the form of submissions to ISI accredited journals. In the first paper, objectives are demonstrated with a vehicle detector incorporating a two-stage GMM classifier, applied to a single feature space, namely Histogram of Oriented Gradients (HoG). The second paper demonstrates objectives with a vehicle tracker using colour histograms (in RGB and HSV), with GMM classifiers and a Kalman filter. The proposed works are comparable to related works, with testing performed on benchmark datasets. In the tracking domain for such platforms, tracking alone is insufficient. Adaptive detection and classification can assist in search space reduction, building of knowledge priors and improved target representations. Results show that the proposed approach improves performance and robustness. Findings also indicate potential further enhancements such as a multi-mode tracker with global and local tracking based on a combination of both papers.

ResearchSpace@UKZN

Cortical Dynamics of Navigation and Steering in Natural Scenes: Motion-Based Object Segmentation, Heading, and Obstacle Avoidance

Author: Browning Andrew N.
Grossberg Stephen
Mingolla Ennio
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/12/2008

Visually guided navigation through a cluttered natural scene is a challenging problem that animals and humans accomplish with ease. The ViSTARS neural model proposes how primates use motion information to segment objects and determine heading for purposes of goal approach and obstacle avoidance in response to video inputs from real and virtual environments. The model produces trajectories similar to those of human navigators. It does so by predicting how computationally complementary processes in cortical areas MT-/MSTv and MT+/MSTd compute object motion for tracking and self-motion for navigation, respectively. The model retina responds to transients in the input stream. Model V1 generates a local speed and direction estimate. This local motion estimate is ambiguous due to the neural aperture problem. Model MT+ interacts with MSTd via an attentive feedback loop to compute accurate heading estimates in MSTd that quantitatively simulate properties of human heading estimation data. Model MT- interacts with MSTv via an attentive feedback loop to compute accurate estimates of speed, direction and position of moving objects. This object information is combined with heading information to produce steering decisions wherein goals behave like attractors and obstacles behave like repellers. These steering decisions lead to navigational trajectories that closely match human performance. National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial Intelligence Agency (NMA201-01-1-2016)

Boston University Institutional Repository (OpenBU)


Mapping the natural visual world of the zebrafish (Danio rerio): from sensory input to behavioural output

Author: Nevala Noora Emilia
Publication date: 13/08/2020

Vision is one of the most crucial senses for animals to catch prey, find mates and stay alive. The tetrachromatic zebrafish (Danio rerio) is a widely used model animal in visual neuroscience with four cone photoreceptors sensitive to UV, blue, green and red light. However, a detailed understanding of how their visual system is adapted to the natural environment, and what is important for the fish to see in their shallow freshwater habitats of the Indian subcontinent, has been missing. Therefore, it also has not been possible to carefully assess the importance of different parts of the light spectrum for their natural behaviours. In this thesis I introduce a new method for natural imaging, characterise the spectral composition of the zebrafish’s natural visual world and demonstrate the role of UV light in their prey capture behaviours. To characterise the light conditions in natural environments, I developed and built two hyperspectral scanners to take spectrally detailed light measurements in shallow ponds and slowly moving streams in North-East India. As expected, the spectral profile becomes increasingly monochromatic and red-shifted when moving from the surface to the bottom. However, the short-wavelength-dominated surface and long-wavelength-dominated bottom are separated by a colour-rich horizon. These spectral statistics closely match the cone densities and colour processing abilities of the bipolar cells in the larval zebrafish retina. Previous work has demonstrated how prey capture behaviours in larval zebrafish can be triggered by small, bright spots. The short-wavelength-dominated upper part of the visual field projects light from UV-bright prey items to the ventro-temporal part of the retina (“strike zone”), which has a high density of UV cones. Finally, with my behaviour experiments I demonstrate how prey capture behaviours are strongly driven by UV-bright paramecia detected with the strike zone.

Sussex Research Online

Shadow removal utilizing multiplicative fusion of texture and colour features for surveillance image

Author: Teo Kah Ming
Publication date: 01/01/2018

Automated surveillance systems often identify shadows as parts of a moving object, which jeopardizes subsequent image processing tasks such as object identification and tracking. In this thesis, an improved shadow elimination method for an indoor surveillance system is presented. The developed method is a fusion of several image processing methods. First, the image is segmented using the Statistical Region Merging algorithm to obtain the segmented potential shadow regions. Next, multiple shadow identification features, which include Normalized Cross-Correlation, Local Color Constancy and Hue-Saturation-Value shadow cues, are applied to the images to generate feature maps. These feature maps are used for identifying and removing cast shadows according to the segmented regions. The video dataset used is the Autonomous Agents for On-Scene Networked Incident Management, which covers both indoor and outdoor video scenes. The benchmarking result indicates that the developed method is on par with several commonly used shadow detection methods. The developed method yields a mean score of 85.17% for the video sequence in which the strongest shadow is present and a mean score of 89.93% for the video having the most complex textured background. This research contributes to the development and improvement of a functioning shadow elimination method that is able to cope with image noise and various illumination changes.
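As a rough illustration of how a texture cue and a brightness cue might be fused multiplicatively for one segmented region, consider the sketch below. It is not the thesis's method: the normalized cross-correlation here is the plain (non-zero-mean) form, the darkening cue stands in for the colour features, and the function names and the thresholds low=0.4 and high=0.95 are hypothetical choices made for this example.

```python
import math


def ncc(frame_patch, background_patch):
    """Normalized cross-correlation of two grayscale patches (flat lists).

    A cast shadow darkens the background roughly uniformly, so the texture
    is preserved and the value stays close to 1.
    """
    dot = sum(f * b for f, b in zip(frame_patch, background_patch))
    norm = math.sqrt(
        sum(f * f for f in frame_patch) * sum(b * b for b in background_patch)
    )
    return dot / norm if norm > 0 else 0.0


def darkening_cue(frame_patch, background_patch, low=0.4, high=0.95):
    """1.0 if the patch is darker than the background by a plausible
    shadow ratio, else 0.0. The ratio bounds are assumed values."""
    background_sum = sum(background_patch)
    if background_sum == 0:
        return 0.0
    ratio = sum(frame_patch) / background_sum
    return 1.0 if low <= ratio <= high else 0.0


def fused_shadow_score(frame_patch, background_patch):
    """Multiplicative fusion: both cues must agree for a high score."""
    return ncc(frame_patch, background_patch) * darkening_cue(
        frame_patch, background_patch
    )
```

Multiplying the cues means a region is scored as shadow only when every cue supports it, which is stricter than averaging them.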

Universiti Teknologi Malaysia Institutional Repository

Neuromorphic perception for greenhouse technology using event-based sensors

Author: El Arja Sami
Publication venue: 'American Psychological Association (APA)'
Publication date: 01/01/2022

Event-Based Cameras (EBCs), unlike conventional cameras, feature independent pixels that asynchronously generate outputs upon detecting changes in their field of view. Short calculations are performed on each event to mimic the brain. The output is a sparse sequence of events with high temporal precision. Conventional computer vision algorithms do not leverage these properties. Thus, a new paradigm has been devised. While event cameras are very efficient in representing sparse sequences of events with high temporal precision, many approaches are challenged in applications where a large amount of spatio-temporally rich information must be processed in real time. In reality, most tasks in everyday life take place in complex and uncontrollable environments, which require sophisticated models and intelligent reasoning. Typical hard problems in real-world scenes are detecting various non-uniform objects or navigating in an unknown and complex environment. In addition, colour perception is an essential property for distinguishing objects in natural scenes. Colour is a new aspect of event-based sensors, which work fundamentally differently from standard cameras, measuring per-pixel brightness changes per colour filter asynchronously rather than measuring “absolute” brightness at a constant rate. This thesis explores neuromorphic event-based processing methods for high-noise and cluttered environments with imbalanced classes. A fully event-driven processing pipeline was developed for agricultural applications to perform fruit detection and classification and to unlock the outstanding properties of event cameras. The nature of features in such data was explored, and methods to represent and detect features were demonstrated. A framework for detecting and classifying features was developed and evaluated on the N-MNIST and Dynamic Vision Sensor (DVS) gesture datasets.
The same network was evaluated on laboratory-recorded and real-world data with various internal variations for fruit detection, such as overlap and variation in size and appearance. In addition, a method to handle highly imbalanced data was developed. We examined the characteristics of spatio-temporal patterns for each colour filter to help expand our understanding of this novel data and explored their applications in classification tasks where colours were more relevant features than shapes and appearances. The results presented in this thesis demonstrate the potential and efficacy of event-based systems by demonstrating the applicability of colour event data and the viability of event-driven classification.
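The difference between measuring brightness changes and measuring absolute brightness can be sketched with an idealized single-pixel event model. This is a commonly used simplification and not the pipeline developed in the thesis: it assumes events fire whenever the log intensity moves by a fixed contrast threshold, and the function name, the sample format and the threshold value are hypothetical.

```python
import math


def generate_events(samples, threshold=0.5):
    """Idealized event generation for one pixel.

    samples: list of (timestamp, intensity) pairs with intensity > 0.
    Returns a list of (timestamp, polarity) events, polarity +1 or -1.
    """
    events = []
    _, first_intensity = samples[0]
    reference = math.log(first_intensity)  # last log intensity that fired
    for timestamp, intensity in samples[1:]:
        change = math.log(intensity) - reference
        # Emit one event per threshold crossing since the last event.
        while abs(change) >= threshold:
            polarity = 1 if change > 0 else -1
            events.append((timestamp, polarity))
            reference += polarity * threshold
            change = math.log(intensity) - reference
    return events
```

A pixel watching a static scene produces no output at all under this model, which is why the resulting event stream is sparse.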

Western Sydney ResearchDirect

Advances in video motion analysis research for mature and emerging application areas

Author: Heinrich A.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2015

Repository TU/e

Pure OAI Repository

Feature-based image patch classification for moving shadow detection

Author: Russell Mosin
Publication venue: 'American Psychological Association (APA)'
Publication date: 01/01/2019

Moving object detection is a first step towards many computer vision applications, such as human interaction and tracking, video surveillance, and traffic monitoring systems. Accurate estimation of the target object’s size and shape is often required before higher-level tasks (e.g., object tracking or recognition) can be performed. However, these properties can be derived only when the foreground object is detected precisely. Background subtraction is a common technique to extract foreground objects from image sequences. The purpose of background subtraction is to detect changes in pixel values within a given frame. The main problem with background subtraction and other related object detection techniques is that cast shadows tend to be misclassified as either parts of the foreground objects (if objects and their cast shadows are bonded together) or independent foreground objects (if objects and shadows are separated). The reason for this phenomenon is the presence of similar characteristics between the target object and its cast shadow, i.e., shadows have similar motion, attitude, and intensity changes as the moving objects that cast them. Detecting shadows of moving objects is challenging because of problematic situations related to shadows, for example, chromatic shadows, shadow color blending, foreground-background camouflage, nontextured surfaces and dark surfaces. Various methods for shadow detection have been proposed in the literature to address these problems. Many of these methods use general-purpose image feature descriptors to detect shadows. These feature descriptors may be effective in distinguishing shadow points from the foreground object in a specific problematic situation; however, such methods often fail to distinguish shadow points from the foreground object in other situations.
In addition, many of these moving shadow detection methods require prior knowledge of the scene conditions and/or impose strong assumptions, which make them excessively restrictive in practice. The aim of this research is to develop an efficient method capable of addressing possible environmental problems associated with shadow detection while simultaneously improving the overall accuracy and detection stability. In this research study, possible problematic situations for dynamic shadows are addressed and discussed in detail. On the basis of the analysis, a robust method, including change detection and shadow detection, is proposed to address these environmental problems. A new set of two local feature descriptors, namely, binary patterns of local color constancy (BPLCC) and light-based gradient orientation (LGO), is introduced to address the identified problematic situations by incorporating intensity, color, texture, and gradient information. The feature vectors are concatenated in a column-by-column manner to construct one dictionary for the objects and another dictionary for the shadows. A new sparse representation framework is then applied to find the nearest neighbor of the test image segment by computing a weighted linear combination of the reference dictionary. Image segment classification is then performed based on the similarity between the test image and the sparse representations of the two classes. The performance of the proposed framework on common shadow detection datasets is evaluated, and the method shows improved performance compared with state-of-the-art methods in terms of the shadow detection rate, discrimination rate, accuracy, and stability. By achieving these significant improvements, the proposed method demonstrates its ability to handle various problems associated with image processing and accomplishes the aim of this thesis.

Western Sydney ResearchDirect