Assessing neural network scene classification from degraded images
Scene recognition is an essential component of both machine and biological vision. Recent advances in computer vision using deep convolutional neural networks (CNNs) have demonstrated impressive sophistication in scene recognition, through training on large datasets of labeled scene images (Zhou et al. 2018, 2014). One criticism of CNN-based approaches is that performance may not generalize well beyond the training image set (Torralba and Efros 2011), and may be hampered by minor image modifications, which in some cases are barely perceptible to the human eye (Goodfellow et al. 2015; Szegedy et al. 2013). While these “adversarial examples” may be unlikely in natural contexts, during many real-world visual tasks scene information can be degraded or limited due to defocus blur, camera motion, sensor noise, or occluding objects. Here, we quantify the impact of several image degradations (some common, and some more exotic) on indoor/outdoor scene classification using CNNs. For comparison, we use human observers as a benchmark, and also evaluate performance against classifiers using limited, manually selected descriptors. While the CNNs outperformed the other classifiers and rivaled human accuracy for intact images, our results show that their classification accuracy is more affected by image degradations than that of human observers. On a practical level, however, accuracy of the CNNs remained well above chance for a wide range of image manipulations that disrupted both local and global image statistics. We also examine the level of image-by-image agreement with human observers, and find that the CNNs' agreement with observers varied as a function of the nature of the image manipulation. In many cases, this agreement was not substantially different from the level one would expect to observe for two independent classifiers. Together, these results suggest that CNN-based scene classification techniques are relatively robust to several image degradations. However, the pattern of classifications obtained for ambiguous images does not appear to closely reflect the strategies employed by human observers.
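The degrade-then-classify benchmark procedure described in the abstract can be sketched in a few lines. The degradations and the luminance-threshold "classifier" below are illustrative stand-ins (the paper uses trained CNNs and human observers, not this toy rule), assuming grayscale images as float arrays in [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_gaussian_noise(img, sigma):
    """Additive sensor-noise stand-in, clipped to the valid intensity range."""
    noisy = img + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)

def box_blur(img, k):
    """Crude defocus-blur stand-in: a k x k mean filter."""
    padded = np.pad(img, k // 2, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

# Toy "scene": bright upper half vs. dark lower half, and a trivial
# luminance-based indoor/outdoor rule (hypothetical, not the paper's CNN).
img = np.vstack([np.full((16, 32), 0.9), np.full((16, 32), 0.2)])

def classify(im):
    return "outdoor" if im[:16].mean() > im[16:].mean() else "indoor"

for name, degraded in [("intact", img),
                       ("noise", add_gaussian_noise(img, 0.3)),
                       ("blur", box_blur(img, 5))]:
    print(name, classify(degraded))
```

Comparing accuracy on intact versus degraded inputs, for both the model and a human benchmark, is the core of the evaluation the abstract describes.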
Multi-line Adaptive Perimetry (MAP): A New Procedure for Quantifying Visual Field Integrity for Rapid Assessment of Macular Diseases.
Purpose: To monitor visual defects associated with macular degeneration (MD), we present a new psychophysical assessment, multi-line adaptive perimetry (MAP), that measures visual field integrity by simultaneously estimating regions associated with perceptual distortions (metamorphopsia) and visual sensitivity loss (scotoma). Methods: We first ran simulations of MAP with a computerized model of a human observer to determine optimal test design characteristics. In experiment 1, predictions of the model were assessed by simulating metamorphopsia with an eye-tracking device in 20 participants with healthy vision. In experiment 2, eight patients (16 eyes) with macular disease completed two MAP assessments separated by about 12 weeks, while a subset (10 eyes) also completed repeated Macular Integrity Assessment (MAIA) microperimetry and Amsler grid exams. Results: Results revealed strong repeatability of MAP and high accuracy, sensitivity, and specificity (0.89, 0.81, and 0.90, respectively) in classifying patient eyes with severe visual impairment. We also found a significant relationship between the spatial patterns of performance across visual field loci derived from MAP and MAIA microperimetry. However, there was a lack of correspondence between MAP and subjective Amsler grid reports in isolating perceptually distorted regions. Conclusions: These results highlight the validity and efficacy of MAP in producing quantitative maps of visual field disturbances, including simultaneous mapping of metamorphopsia and sensitivity impairment. Translational relevance: Future work will be needed to assess the applicability of this examination for potential early detection of MD symptoms and/or portable assessment on a home device or computer.
A Neural Model of How the Brain Computes Heading from Optic Flow in Realistic Scenes
Animals avoid obstacles and approach goals in novel cluttered environments using visual information, notably optic flow, to compute heading, or direction of travel, with respect to objects in the environment. We present a neural model of how heading is computed that describes interactions among neurons in several visual areas of the primate magnocellular pathway, from retina through V1, MT+, and MSTd. The model produces outputs which are qualitatively and quantitatively similar to human heading estimation data in response to complex natural scenes. The model estimates heading to within 1.5° in random dot or photo-realistically rendered scenes and within 3° in video streams from driving in real-world environments. Simulated rotations of less than 1° per second do not affect model performance, but faster simulated rotation rates deteriorate performance, as in humans. The model is part of a larger navigational system that identifies and tracks objects while navigating in cluttered environments. National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial-Intelligence Agency (NMA201-01-1-2016).
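The geometric core of heading estimation from optic flow can be illustrated outside the neural model: under pure observer translation, flow vectors radiate from the focus of expansion (the heading point in the image), which can be recovered by least squares. The synthetic flow field and function names below are illustrative assumptions, not part of the paper's model:

```python
import numpy as np

def estimate_foe(points, flows):
    """Least-squares focus-of-expansion estimate for a translating observer.

    Under pure translation, each flow vector (u, v) at image point (x, y)
    points along the line from the FoE (x0, y0), so
    v*(x - x0) - u*(y - y0) = 0.  Stacking one such equation per point
    gives the linear system A @ [x0, y0] = b.
    """
    u, v = flows[:, 0], flows[:, 1]
    x, y = points[:, 0], points[:, 1]
    A = np.stack([v, -u], axis=1)
    b = v * x - u * y
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe

# Synthetic radial flow field expanding from a true FoE at (3.0, -2.0),
# with small measurement noise on each flow vector.
rng = np.random.default_rng(1)
true_foe = np.array([3.0, -2.0])
pts = rng.uniform(-10, 10, size=(200, 2))
flow = 0.1 * (pts - true_foe)             # radial expansion
flow += rng.normal(0, 0.005, flow.shape)  # measurement noise

print(estimate_foe(pts, flow))
```

Rotational flow components violate the pure-translation assumption, which is one reason the paper's neural model (and human performance) degrades at higher simulated rotation rates.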
Predicting glaucomatous visual field deterioration through short multivariate time series modelling
In bio-medical domains there are many applications involving the modelling of multivariate time series (MTS) data. One area that has been largely overlooked so far is the particular type of time series where the data set consists of a large number of variables but only a small number of observations. In this paper we describe the development of a novel computational method based on genetic algorithms that bypasses the size restrictions of traditional statistical MTS methods, makes no distributional assumptions, and locates the model order and associated parameters in a single step. We apply this method to the prediction and modelling of glaucomatous visual field deterioration.
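A toy version of evolutionary model selection for short MTS can be sketched as follows. This is a minimal illustration under stated simplifications, not the paper's algorithm: it evolves only the VAR lag order against a penalized least-squares fit, whereas the paper's method locates the order and the associated parameters jointly:

```python
import numpy as np

rng = np.random.default_rng(7)

def fit_var(X, p):
    """Least-squares VAR(p) fit on a short multivariate series X (T x d).
    Returns the mean squared one-step residual."""
    T, d = X.shape
    rows = [X[t - p:t][::-1].ravel() for t in range(p, T)]  # lag 1 first
    Z = np.column_stack([np.ones(T - p), np.array(rows)])
    Y = X[p:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    return np.mean((Y - Z @ B) ** 2)

def ga_select_order(X, max_p=6, pop=8, gens=15):
    """Toy genetic search over the VAR order p with an AIC-style penalty."""
    T, d = X.shape
    def cost(p):
        return np.log(fit_var(X, p)) + p * d * d / T  # fit + complexity
    population = rng.integers(1, max_p + 1, pop)
    for _ in range(gens):
        scores = np.array([cost(p) for p in population])
        parents = population[np.argsort(scores)[:pop // 2]]  # keep the fittest
        children = parents + rng.integers(-1, 2, parents.size)  # mutate +/-1
        population = np.clip(np.concatenate([parents, children]), 1, max_p)
    return int(min(population, key=cost))

# Short VAR(2) series: many variables relative to the observation count,
# the regime the abstract highlights.
d, T = 3, 40
A1, A2 = 0.5 * np.eye(d), -0.6 * np.eye(d)
X = np.zeros((T, d))
X[0] = X[1] = rng.normal(size=d)
for t in range(2, T):
    X[t] = A1 @ X[t - 1] + A2 @ X[t - 2] + 0.1 * rng.normal(size=d)

print(ga_select_order(X))
```

With so few observations, maximum-likelihood order selection is unstable; a population-based search with an explicit complexity penalty is one way to sidestep that, which is the spirit of the approach the abstract describes.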
A neural network approach to audio-assisted movie dialogue detection
A novel framework for audio-assisted dialogue detection based on indicator functions and neural networks is investigated. An indicator function specifies whether an actor is present at a particular time instant. The cross-correlation function of a pair of indicator functions and the magnitude of the corresponding cross-power spectral density are fed as input to neural networks for dialogue detection. Several types of artificial neural networks are tested, including multilayer perceptrons, voted perceptrons, radial basis function networks, support vector machines, and particle swarm optimization-based multilayer perceptrons. Experiments are carried out to validate the feasibility of the approach using ground-truth indicator functions determined by human observers on 6 different movies. A total of 41 dialogue instances and another 20 non-dialogue instances are employed. The average detection accuracy achieved is high, ranging between 84.78% ± 5.499% and 91.43% ± 4.239%.
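The feature-extraction step described above (cross-correlation plus cross-power spectral density magnitude of two actor-presence indicator functions) can be sketched as follows. The indicator signals are synthetic and the downstream neural network classifier is omitted; function names are illustrative:

```python
import numpy as np

def dialogue_features(ind_a, ind_b):
    """Feature vector from two actor-presence indicator functions:
    their full cross-correlation plus the magnitude of the cross-power
    spectral density (a minimal sketch of the abstract's input features)."""
    a = ind_a - ind_a.mean()
    b = ind_b - ind_b.mean()
    xcorr = np.correlate(a, b, mode="full") / len(a)
    # Cross-power spectral density via the FFT of each signal.
    cpsd_mag = np.abs(np.fft.rfft(a) * np.conj(np.fft.rfft(b)))
    return np.concatenate([xcorr, cpsd_mag])

# Alternating speakers (dialogue-like): actor A speaks in even 10-frame
# blocks, actor B fills the gaps.
t = np.arange(100)
actor_a = (t // 10 % 2 == 0).astype(float)
actor_b = 1.0 - actor_a
features = dialogue_features(actor_a, actor_b)
print(features.shape)
```

Alternating presence produces a strongly structured cross-correlation (anticorrelated at lag 0, peaks near the turn-taking period), which is the regularity the downstream classifiers exploit.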