2,471 research outputs found

A Neural Model of How the Brain Computes Heading from Optic Flow in Realistic Scenes

Author: Browning Andrew N.
Grossberg Stephen
Mingolla Ennio
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/12/2008

Animals avoid obstacles and approach goals in novel cluttered environments using visual information, notably optic flow, to compute heading, or direction of travel, with respect to objects in the environment. We present a neural model of how heading is computed that describes interactions among neurons in several visual areas of the primate magnocellular pathway, from retina through V1, MT+, and MSTd. The model produces outputs that are qualitatively and quantitatively similar to human heading estimation data in response to complex natural scenes. The model estimates heading to within 1.5° in random dot or photo-realistically rendered scenes and within 3° in video streams from driving in real-world environments. Simulated rotations of less than 1° per second do not affect model performance, but faster simulated rotation rates degrade performance, as in humans. The model is part of a larger navigational system that identifies and tracks objects while navigating in cluttered environments. Funding: National Science Foundation (SBE-0354378, BCS-0235398); Office of Naval Research (N00014-01-1-0624); National Geospatial-Intelligence Agency (NMA201-01-1-2016)

Boston University Institutional Repository (OpenBU)

Sparse variational regularization for visual motion estimation

Author: Nawaz Muhammad Wasim
Publication venue: School of electrical, computer and telecommunications engineering
Publication date: 01/01/2016

The computation of visual motion is a key component in numerous computer vision tasks such as object detection, visual object tracking and activity recognition. Despite extensive research effort, efficient handling of motion discontinuities, occlusions and illumination changes remains elusive in visual motion estimation. The work presented in this thesis uses variational methods to handle these problems because such methods allow various mathematical concepts to be integrated into a single energy minimization framework. This thesis applies concepts from signal sparsity to variational regularization for visual motion estimation. The regularization is designed so that it handles motion discontinuities and can detect object occlusions

Research Online

Bedload transport analysis using image processing techniques

Author: Baranya S.
Conevski S.
Ermilov A. A.
Fleit G.
Guerrero M.
Ruther N.
Publication date: 01/01/2022

Bedload transport is an important factor in describing the hydromorphological processes of fluvial systems. However, conventional bedload sampling methods have large uncertainty, making it harder to understand this notoriously complex phenomenon. In this study, a novel, image-based approach, the Video-based Bedload Tracker (VBT), is implemented to quantify gravel bedload transport by combining two different techniques: a Statistical Background Model and Large-Scale Particle Image Velocimetry. For testing purposes, we use underwater videos captured in a laboratory flume, with future field adaptation as an overall goal. VBT provides full statistics on the velocity and grain size of the individual moving particles. The paper presents tests of the method, which requires minimal preprocessing (a simple and quick 2D Gaussian filter) to retrieve and calculate the bedload transport rate. A detailed sensitivity analysis is also carried out to introduce the parameters of the method; it showed that the parameters can be set to correct values simply by relying on the literature and on visual evaluation of the resulting segmented videos. Practical aspects of the applicability of VBT in the field are also discussed, and a statistical filter that accounts for suspended sediment and air bubbles is provided

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Learning to See with Minimal Human Supervision

Author: Cheng Zezhou
Publication venue: ScholarWorks@UMass Amherst
Publication date: 14/11/2023

Deep learning has significantly advanced computer vision in the past decade, paving the way for practical applications such as facial recognition and autonomous driving. However, current techniques depend heavily on human supervision, limiting their broader deployment. This dissertation tackles this problem by introducing algorithms and theories to minimize human supervision in three key areas: data, annotations, and neural network architectures, in the context of various visual understanding tasks such as object detection, image restoration, and 3D generation. First, we present self-supervised learning algorithms to handle in-the-wild images and videos that traditionally require time-consuming manual curation and labeling. We demonstrate that when a deep network is trained to be invariant to geometric and photometric transformations, representations from its intermediate layers are highly predictive of object semantic parts such as eyes and noses. This insight offers a simple unsupervised learning framework that significantly improves the efficiency and accuracy of few-shot landmark prediction and matching. We then present a technique for learning single-view 3D object pose estimation models by utilizing in-the-wild videos where objects turn (e.g., cars in roundabouts). This technique achieves competitive performance with respect to the existing state of the art without requiring any manual labels during training. We also contribute an Accidental Turntables Dataset, a challenging set of 41,212 images of cars with cluttered backgrounds, motion blur, and illumination changes that serves as a benchmark for 3D pose estimation. Second, we address variations in labeling styles across different annotators, which lead to a type of noisy label referred to as heterogeneous labels. This variability in human annotation can cause subpar performance during both the training and testing phases.
To mitigate this, we have developed a framework that models the labeling styles of individual annotators, reducing the impact of human annotation variations and enhancing the performance of standard object detection models. We have also applied this framework to analyze ecological data, which are often collected opportunistically across different case studies without consistent annotation guidelines. Through this application, we have obtained several insightful observations into large-scale bird migration behaviors and their relationship to climate change. Our next study explores the challenges of designing neural networks, an area that lacks a comprehensive theoretical understanding. By linking deep neural networks with Gaussian processes, we propose a novel Bayesian interpretation of the deep image prior, which parameterizes a natural image as the output of a convolutional network with random parameters and random input. This approach offers valuable insights to optimize the design of neural networks for various image restoration tasks. Lastly, we introduce several machine-learning techniques to reconstruct and edit 3D shapes from 2D images with minimal human effort. We first present a generic multi-modal generative model that bridges 2D images and 3D shapes via a shared latent space, and demonstrate its applications on versatile 3D shape generation and manipulation tasks. Additionally, we develop a framework for joint estimation of 3D neural scene representation and camera poses. This approach outperforms prior works and allows us to operate in the general SE(3) camera pose setting, unlike the baselines. The results also indicate this method can be complementary to classical structure-from-motion (SfM) pipelines as it compares favorably to SfM on low-texture and low-resolution images

ScholarWorks@UMass Amherst

Research on Image Quality Improvement for Underwater Imaging Systems

Author: Lu Huimin
Publication venue: 芹川, 聖一
Publication date: 01/03/2014

Underwater survey systems have numerous scientific and industrial applications in geology, biology, mining, and archaeology. These fields involve tasks such as ecological studies, environmental damage assessment, and archaeological prospection. Over the past two decades, underwater imaging systems have mainly been carried by underwater vehicles (UVs) for surveying in water or the ocean. Obtaining clear visibility of objects has been difficult because of the physical properties of the medium. During the same period, sonar has usually been used to detect and recognize targets in the ocean or other underwater environments. However, because sonar images are of low quality, optical vision sensors are used instead for short-range identification. Optical imaging provides short-range, high-resolution visual information about the ocean floor. However, because of the physical properties of light transmission in water, optical underwater images usually have poor visibility. Light is highly attenuated as it travels through the ocean. Consequently, the imaged scenes are poorly contrasted and hazy. Underwater image processing techniques are therefore important for improving the quality of underwater images. In contrast to common photographs, underwater optical images suffer from poor visibility owing to the medium, which causes scattering, color distortion, and absorption. Large suspended particles cause scattering similar to the scattering of light in fog, as in turbid water that contains many suspended particles. Color distortion occurs because different wavelengths are attenuated to different degrees in water; consequently, images of ambient underwater environments are dominated by a bluish tone, because longer wavelengths are attenuated more quickly.
Absorption of light in water substantially reduces its intensity. The random attenuation of light causes a hazy appearance, as the light backscattered by water along the line of sight considerably degrades image contrast. In particular, objects more than 10 meters from the observation point are almost indistinguishable because colors fade as characteristic wavelengths are filtered out according to the distance the light travels in water. Traditional image processing methods therefore do not handle such images well. This thesis proposes strategies and solutions to tackle the above-mentioned problems of underwater survey systems. We contribute image pre-processing, denoising, dehazing, inhomogeneity correction, color correction, and fusion technologies for underwater image quality improvement. The main content of this thesis is as follows. First, Chapter 1 provides a comprehensive review of the current and most prominent underwater imaging systems. A classification criterion based on main features and performance is presented for the existing systems. After that, by analyzing the challenges of underwater imaging systems, hardware-based and non-hardware-based approaches are introduced. This thesis is concerned with image-processing-based technologies, which are one of the non-hardware approaches, and applies the most recent methods to process low-quality underwater images. Different sonar imaging systems, such as side-scan sonar and multi-beam sonar, are used in many kinds of equipment, and each acquires images with different characteristics. Side-scan sonar acquires high-quality imagery of the seafloor with very high spatial resolution but poor locational accuracy. In contrast, multi-beam sonar obtains high-precision position and depth for points on the seafloor.
To fully use the information from these two types of sonar, Chapter 2 fuses the two kinds of sonar data. Considering the sonar image forming principle, for the low-frequency curvelet coefficients we use the maximum local energy method, computing the local energy of the two sonar images. For the high-frequency curvelet coefficients, we use the absolute maximum method as the measure. The main attributes are as follows. Firstly, the multi-resolution analysis method is well adapted to curved singularities and point singularities, which is useful for sonar intensity image enhancement. Secondly, the maximum local energy method performs well on sonar intensity images and can achieve a very good fusion result [42]. Chapter 3 analyzes the underwater laser imaging system and proposes a denoising algorithm based on a Bayesian Contourlet Estimator of Bessel K Form (BCE-BKF). We use the BCE-BKF probability density function (PDF) to model the neighborhood of contourlet coefficients. Then, according to the proposed PDF model, we design a maximum a posteriori (MAP) estimator, which relies on a Bayesian statistical representation of the contourlet coefficients of noisy images. The denoised laser images have better contrast than those produced by other methods. The proposed method has three clear advantages. Firstly, the contourlet transform decomposition, which uses an ellipse sampling grid, is preferable to the curvelet transform and the wavelet transform. Secondly, the BCE-BKF model is more effective at representing the contourlet coefficients of noisy images. Thirdly, the BCE-BKF model takes full account of the correlation between coefficients [107]. In Chapter 4, we describe a novel method to enhance underwater images by dehazing. Absorption, scattering, and color distortion are three major issues in underwater optical imaging. Light rays traveling through water are scattered and absorbed according to their wavelength.
Scattering is caused by large suspended particles that degrade optical images captured underwater. Color distortion occurs because different wavelengths are attenuated to different degrees in water; consequently, images of ambient underwater environments are dominated by a bluish tone. Our key contribution is a fast image and video dehazing algorithm that compensates for the attenuation discrepancy along the propagation path and takes into consideration the possible presence of an artificial lighting source [108]. In Chapter 5, we describe a novel method of enhancing underwater optical images or videos using a guided multilayer filter and wavelength compensation. In certain circumstances, the underwater environment must be monitored immediately using disaster recovery support robots or other underwater survey systems. However, owing to the inherent optical properties and the complex underwater environment, the captured images or videos are seriously distorted. Our key contributions include a novel depth- and wavelength-based underwater imaging model that compensates for the attenuation discrepancy along the propagation path, and a fast guided multilayer filtering enhancement algorithm. The enhanced images are characterized by a reduced noise level, better exposure of the dark regions, and improved global contrast, with the finest details and edges enhanced significantly [109]. Chapter 6 summarizes the performance and benefits of the proposed approaches. Comprehensive experiments and extensive comparison with existing related techniques demonstrate the accuracy and effectiveness of our proposed methods. Kyushu Institute of Technology doctoral dissertation. Degree number: 工博甲第367号 (Kō No. 367). Date conferred: March 25, 2014 (Heisei 26). Contents: Chapter 1 Introduction | Chapter 2 Multi-Source Images Fusion | Chapter 3 Laser Images Denoising | Chapter 4 Optical Image Dehazing | Chapter 5 Shallow Water De-Scattering | Chapter 6 Conclusions. Kyushu Institute of Technology, 2013 (Heisei 25)
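
The two fusion rules named in the abstract for Chapter 2 (maximum local energy for low-frequency coefficients, absolute maximum for high-frequency coefficients) can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes the coefficient subbands of the two sonar images have already been computed by some curvelet transform and are available as equally shaped 2D arrays. The function names, the window radius, the zero padding at the borders, and the tie-breaking in favor of the first image are all hypothetical choices made for illustration.

```python
import numpy as np


def local_energy(coeffs, radius=1):
    """Sum of squared coefficients in a (2*radius+1)^2 window around each entry.

    Borders are zero-padded (an assumption; the abstract does not specify this).
    """
    squared = np.pad(np.asarray(coeffs, dtype=float) ** 2, radius)
    size = 2 * radius + 1
    windows = np.lib.stride_tricks.sliding_window_view(squared, (size, size))
    return windows.sum(axis=(-2, -1))


def fuse_low_frequency(a, b, radius=1):
    """Per entry, keep the coefficient from the subband with larger local energy."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Ties go to the first input (hypothetical tie-breaking rule).
    return np.where(local_energy(a, radius) >= local_energy(b, radius), a, b)


def fuse_high_frequency(a, b):
    """Per entry, keep the coefficient with the larger absolute value."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.where(np.abs(a) >= np.abs(b), a, b)


if __name__ == "__main__":
    # Toy subbands standing in for curvelet coefficients of two sonar images.
    low_a = np.array([[4.0, 0.0], [0.0, 0.0]])
    low_b = np.array([[1.0, 1.0], [1.0, 1.0]])
    high_a = np.array([[1.0, -5.0], [0.0, 2.0]])
    high_b = np.array([[-3.0, 2.0], [0.0, -1.0]])
    print(fuse_low_frequency(low_a, low_b))
    print(fuse_high_frequency(high_a, high_b))
```

In a full pipeline, the fused subbands would be passed to the inverse transform to produce the fused image; that step is omitted here because it depends on the transform library in use.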

Kyutacar : Kyushu Institute of Technology Academic Repository

Deep into the Eyes: Applying Machine Learning to improve Eye-Tracking

Author: Chaudhary Aayush Kumar
Publication venue: RIT Scholar Works
Publication date: 15/04/2022

Eye-tracking has been an active research area with applications in personal and behavioral studies, medical diagnosis, virtual reality, and mixed reality applications. Improving the robustness, generalizability, accuracy, and precision of eye-trackers while maintaining privacy is crucial. Unfortunately, many existing low-cost portable commercial eye trackers suffer from signal artifacts and a low signal-to-noise ratio. These trackers are highly dependent on low-level features such as pupil edges or diffused bright spots in order to precisely localize the pupil and corneal reflection. As a result, they are not reliable for studying eye movements that require high precision, such as microsaccades, smooth pursuit, and vergence. Additionally, these methods suffer from reflective artifacts and occlusion of the pupil boundary by the eyelid, and often require a manual update of person-dependent parameters to identify the pupil region. In this dissertation, I demonstrate (I) a new method to improve precision while maintaining the accuracy of head-fixed eye trackers by combining velocity information from iris textures across frames with position information, (II) a generalized semantic segmentation framework for identifying eye regions with a further extension to identify ellipse fits on the pupil and iris, (III) a data-driven rendering pipeline to generate a temporally contiguous synthetic dataset for use in many eye-tracking applications, and (IV) a novel strategy to preserve privacy in eye videos captured as part of the eye-tracking process. My work also provides the foundation for future research by addressing critical questions like the suitability of using synthetic datasets to improve eye-tracking performance in real-world applications, and ways to improve the precision of future commercial eye trackers with improved camera specifications

RIT Scholar Works

Full 3D Reconstruction of Non-Rigidly Deforming Objects

Author: Afzal Hassan
Aouada Djamila
Mirbach Bruno
Ottersten Björn
Publication date: 01/01/2018

Open Repository and Bibliography - Luxembourg