
    Parallel software implementation of recursive multidimensional digital filters for point-target detection in cluttered infrared scenes

    A technique for the enhancement of point targets in clutter is described. The local 3-D spectrum at each pixel is estimated recursively. An optical flow field for the textured background is then generated using the 3-D autocorrelation function, and the local velocity estimates are used to apply high-pass, velocity-selective spatiotemporal filters with finite impulse responses (FIRs) that subtract the background clutter signal, leaving the foreground target signal plus noise. Parallel software implementations using a multicore central processing unit (CPU) and a graphics processing unit (GPU) are investigated. Comment: To appear in Proc. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
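
As a rough illustration of the velocity-selective background suppression described above, the following is a minimal sketch assuming a simple FIR scheme that differences each frame against an average of previous frames shifted along the estimated background velocity; the function name, parameters, and differencing design are illustrative and much simpler than the recursive 3-D filters in the paper.

```python
import numpy as np

def velocity_highpass_fir(frames, vx, vy, taps=3):
    """Suppress background moving at (vx, vy) pixels/frame by differencing
    each frame against the mean of previous frames shifted along that
    velocity. frames: (T, H, W) float array. Returns residual frames in
    which a point target that does not follow the background motion stands out."""
    T, H, W = frames.shape
    out = np.zeros_like(frames)
    for t in range(taps, T):
        background = np.zeros((H, W))
        for k in range(1, taps + 1):
            # shift frame t-k forward along the background velocity so it
            # aligns with frame t, then accumulate it into the background model
            shifted = np.roll(frames[t - k],
                              shift=(int(round(k * vy)), int(round(k * vx))),
                              axis=(0, 1))
            background += shifted / taps
        # high-pass output: current frame minus motion-compensated background
        out[t] = frames[t] - background
    return out

# Toy scene: clutter drifting at 1 px/frame in x, plus a faster point target
rng = np.random.default_rng(0)
clutter = rng.normal(size=(64, 64))
frames = np.stack([np.roll(clutter, (0, t), axis=(0, 1)) for t in range(10)])
for t in range(10):
    frames[t, 32, (5 + 3 * t) % 64] += 5.0   # target moving at 3 px/frame
residual = velocity_highpass_fir(frames, vx=1.0, vy=0.0)
```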

    Augmented Terrain-Based Navigation to Enable Persistent Autonomy for Underwater Vehicles in GPS-Denied Environments

    Aquatic robots, such as Autonomous Underwater Vehicles (AUVs), play a major role in the study of ocean processes that require long-term sampling efforts, and they commonly navigate by dead reckoning using an accelerometer, a magnetometer, a compass, an IMU, and a depth sensor for feedback. However, these instruments are subject to large drift, leading to unbounded uncertainty in location. Moreover, the spatio-temporal dynamics of the ocean environment, coupled with limited communication capabilities, make navigation and localization difficult, especially in coastal regions where the majority of interesting phenomena occur. In addition, the interesting features are themselves spatio-temporally dynamic, and effective sampling requires a good understanding of vehicle localization relative to the sampled feature. Our work is therefore motivated by the desire to enable intelligent data collection of the complex dynamics and processes that occur in coastal ocean environments, to further our understanding and prediction capabilities. The study originated from the need to localize and navigate aquatic robots in a GPS-denied environment and to examine the role of the spatio-temporal dynamics of the ocean in the localization and navigation processes. The methods and techniques involved range from data collection to the localization and navigation algorithms run on board the aquatic vehicles. The focus of this work is to develop algorithms for localization and navigation of AUVs in GPS-denied environments. We developed an augmented terrain-based framework that incorporates physical science data (e.g., temperature, salinity, pH) to enhance the topographic map that the vehicle uses to navigate. In this navigation scheme, the bathymetric data are combined with the physical science data to enrich the uniqueness of the underlying terrain map and increase the accuracy of underwater localization. Another technique developed in this work addresses the problem of tracking an underwater vehicle when the GPS signal suddenly becomes unavailable; the method whitens the data to reveal the true statistical distance between data points and likewise incorporates physical science data to enhance the topographic map. Simulations were performed at Lake Nighthorse, Colorado, USA, between April 25 and May 2, 2018, and at Big Fisherman's Cove, Santa Catalina Island, California, USA, on July 13 and 14, 2016. Different missions were executed in different environmental conditions (snow, rain, and the presence of plumes). Results showed that these two methodologies for localization and tracking work with reference maps recorded within the previous week, and that the average localization error is comparable to the error found when using GPS, provided the observations are taken during the same period of the day (morning, afternoon, or night). Whitening the data yielded better results than localizing without whitening.
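
The whitening step mentioned above, which reveals the true statistical distance between data points, can be illustrated with a minimal sketch. This assumes ZCA/Mahalanobis-style whitening over an augmented terrain map whose per-cell features stack bathymetric depth with physical science variables; all names, feature choices, and the nearest-cell matching are illustrative rather than the thesis's actual pipeline.

```python
import numpy as np

def whitening_transform(features, eps=1e-6):
    """Fit a ZCA whitening matrix from map features of shape (n_cells, n_dims),
    e.g., columns = [depth, temperature, salinity]."""
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return mean, W

def localize(measurement, map_features, mean, W):
    """Return the map cell whose whitened feature vector is closest to the
    observation (Euclidean distance after whitening equals Mahalanobis
    distance before whitening)."""
    zm = (measurement - mean) @ W
    zmap = (map_features - mean) @ W
    d = np.linalg.norm(zmap - zm, axis=1)
    return int(np.argmin(d)), float(d.min())

# Hypothetical augmented terrain map: depth, temperature, salinity per cell
rng = np.random.default_rng(1)
map_features = rng.normal(size=(500, 3)) * np.array([30.0, 2.0, 0.5])
mean, W = whitening_transform(map_features)
obs = map_features[123] + rng.normal(scale=0.1, size=3)  # noisy AUV observation
cell, dist = localize(obs, map_features, mean, W)
```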

    From Pixels to Spikes: Efficient Multimodal Learning in the Presence of Domain Shift

    Computer vision aims to provide computers with a conceptual understanding of images or video by learning a high-level representation. This representation is typically derived from the pixel domain (i.e., RGB channels) for tasks such as image classification or action recognition. In this thesis, we explore how RGB inputs can either be pre-processed or supplemented with other compressed visual modalities in order to improve the accuracy-complexity tradeoff for various computer vision tasks. Beginning with RGB-domain data only, we propose a multi-level, Voronoi-based spatial partitioning of images, with each partition processed individually by a convolutional neural network (CNN), to improve the scale invariance of the embedding. We combine this with a novel and efficient approach for optimal bit allocation within the quantized cell representations. We evaluate this proposal on the content-based image retrieval task, which consists of finding images in a dataset that are similar to a given query. We then move to the more challenging domain of action recognition, where a video sequence is classified according to its constituent action. In this case, we demonstrate how the RGB modality can be supplemented with a flow modality comprising motion vectors extracted directly from the video codec. The motion vectors (MVs) are used both as input to a CNN and as an activity sensor that enables selective macroblock (MB) decoding of RGB frames instead of full-frame decoding. We independently train two CNNs on RGB and MV correspondences and then fuse their scores during inference, demonstrating faster end-to-end processing and classification accuracy competitive with recent work. In order to explore the use of more efficient sensing modalities, we replace the MV stream with a neuromorphic vision sensing (NVS) stream for action recognition. NVS hardware mimics the biological retina and operates with substantially lower power and at significantly higher sampling rates than conventional active pixel sensing (APS) cameras. Due to the lack of training data in this domain, we generate emulated NVS frames directly from consecutive RGB frames and use these to train a teacher-student framework that additionally leverages the abundance of optical flow training data. In the final part of this thesis, we introduce a novel unsupervised domain adaptation method for further minimizing the domain shift between the emulated (source) and real (target) NVS data domains.
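
The score-level fusion of the independently trained RGB and motion-vector streams can be sketched as follows. This is a minimal PyTorch illustration assuming two pre-trained classifiers and an equal-weight averaging of softmax scores; the thesis's fusion weights and network architectures may differ.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def late_fusion_predict(rgb_model, mv_model, rgb_clip, mv_clip, w_rgb=0.5):
    """Fuse two independently trained streams at the score level.
    rgb_clip / mv_clip: input tensors in whatever layout each model expects.
    The 0.5/0.5 weighting is an assumption, not a tuned value."""
    rgb_scores = F.softmax(rgb_model(rgb_clip), dim=1)
    mv_scores = F.softmax(mv_model(mv_clip), dim=1)
    fused = w_rgb * rgb_scores + (1.0 - w_rgb) * mv_scores
    return fused.argmax(dim=1)   # predicted action class per clip
```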

    A Comprehensive Literature Review on Convolutional Neural Networks

    The fields of computer vision and image processing have, from their earliest days, dealt with the problem of visual recognition. Convolutional Neural Networks (CNNs) are deep feed-forward architectures inspired by research on visual processing in the visual cortex of mammals such as cats. This work gives a detailed analysis of CNNs for computer vision tasks, natural language processing, problems in the fundamental sciences and engineering, and other miscellaneous tasks. The general CNN structure, its mathematical intuition and operation, and a brief critical commentary on its advantages and on the disadvantages that lead researchers to search for alternatives to CNNs are also covered. The paper also serves as an appreciation of the work of past researchers that produced such a fecund architecture for handling multidimensional data, and surveys approaches to improving its performance further.
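
As a concrete companion to the general CNN structure the review discusses (convolution, nonlinearity, pooling, and a fully connected classifier), here is a minimal PyTorch sketch; the layer sizes and input resolution are arbitrary illustrative choices, not taken from the paper.

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Minimal convolutional network: conv -> ReLU -> pool, twice, then a
    fully connected classifier. Sizes are illustrative only."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # 32x32 -> 32x32
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                              # -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

logits = TinyCNN()(torch.randn(4, 3, 32, 32))  # (4, 10) class scores
```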

    Event-based Simultaneous Localization and Mapping: A Comprehensive Survey

    In recent decades, visual simultaneous localization and mapping (vSLAM) has gained significant interest in both academia and industry. It estimates camera motion and reconstructs the environment concurrently using visual sensors on a moving robot. However, conventional cameras suffer from hardware limitations, including motion blur and low dynamic range, which can negatively impact performance in challenging scenarios such as high-speed motion and high-dynamic-range illumination. Recent studies have demonstrated that event cameras, a new type of bio-inspired visual sensor, offer advantages such as high temporal resolution, high dynamic range, low power consumption, and low latency. This paper presents a timely and comprehensive review of event-based vSLAM algorithms that exploit the benefits of asynchronous and irregular event streams for localization and mapping tasks. The review covers the working principle of event cameras and various event representations for preprocessing event data. It also categorizes event-based vSLAM methods into four main categories: feature-based, direct, motion-compensation, and deep learning methods, with detailed discussions and practical guidance for each approach. Furthermore, the paper evaluates the state-of-the-art methods on various benchmarks, highlighting current challenges and future opportunities in this emerging research area. A public repository will be maintained to keep track of the rapid developments in this field at https://github.com/kun150kun/ESLAM-survey.
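
One common event representation covered by such surveys is the voxel grid, which bins the asynchronous event stream into a fixed number of temporal slices. The following minimal sketch assumes events supplied as (timestamp, x, y, polarity) arrays and uses simple nearest-bin accumulation rather than the bilinear temporal interpolation some methods apply.

```python
import numpy as np

def events_to_voxel_grid(t, x, y, p, num_bins, height, width):
    """Accumulate polarity into a (num_bins, H, W) voxel grid.
    t: timestamps (any unit), x/y: pixel coordinates, p: polarity in {-1, +1}."""
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    if t.size == 0:
        return grid
    # normalize timestamps to [0, num_bins) and place the last event in the final bin
    tn = (t - t[0]) / max(t[-1] - t[0], 1e-9) * num_bins
    b = np.clip(tn.astype(int), 0, num_bins - 1)
    np.add.at(grid, (b, y, x), p)   # scatter-add handles repeated pixels
    return grid

# Hypothetical event packet (sensor resolution chosen arbitrarily)
rng = np.random.default_rng(0)
n = 10_000
t = np.sort(rng.uniform(0, 0.05, n))          # 50 ms of events
x = rng.integers(0, 346, n)
y = rng.integers(0, 260, n)
p = rng.choice([-1, 1], n).astype(np.float32)
voxels = events_to_voxel_grid(t, x, y, p, num_bins=5, height=260, width=346)
```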

    Deep learning for inverse problems in remote sensing: super-resolution and SAR despeckling

    The abstract is in the attachment.

    Learning of Dense Optical Flow, Motion and Depth from Sparse Event Cameras

    With recent advances in the field of autonomous driving, autonomous agents need to safely navigate around humans and other moving objects in unconstrained, highly dynamic environments. In this thesis, we demonstrate the feasibility of reconstructing dense depth, optical flow, and motion information from a neuromorphic imaging device called the Dynamic Vision Sensor (DVS). The DVS records only sparse, asynchronous events when lighting changes occur at camera pixels. Our work is the first monocular pipeline that generates dense depth and optical flow from sparse event data only. To tackle this problem of reconstructing dense information from sparse information, we introduce the Evenly-Cascaded convolutional Network (ECN), a bio-inspired multi-level, multi-resolution neural network architecture. The network features an evenly shaped design and makes use of both high- and low-level features. With just 150k parameters, our self-supervised pipeline is able to surpass pipelines that are 100x larger. We evaluate our pipeline on the MVSEC self-driving dataset and present results for depth, optical flow, and egomotion estimation in wild outdoor scenes. Owing to the lightweight design, the inference part of the network runs at 250 FPS on a single GPU, making the pipeline ready for real-time robotics applications. Our experiments demonstrate significant improvements upon previous works that used deep learning on event data, as well as the ability of our pipeline to perform well during both day and night. We also extend our pipeline to dynamic indoor scenes with independently moving objects. In addition to camera egomotion and a dense depth map, the network utilizes a mixture model to segment moving objects and compute their per-object 3D translational velocities. For this indoor task we are able to train a shallow network with just 40k parameters, which computes qualitative depth and egomotion. Our analysis of the training shows that modern neural networks are trained on tangled signals; this tangling effect can be imagined as a blurring introduced both by nature and by the training process. We propose to untangle the data with network deconvolution. We observe significantly better convergence without using any standard normalization techniques, which suggests that deconvolution is what we need.
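
The network deconvolution mentioned above removes correlations from a layer's input before the convolution is applied. A minimal sketch of the idea follows, assuming im2col patch extraction and ZCA whitening of the patch covariance on a single input; the published technique adds details (running covariance estimates, channel grouping, integration into training) that are omitted here.

```python
import numpy as np

def im2col(x, k):
    """x: (C, H, W) -> patches of shape (num_patches, C*k*k), valid positions only."""
    C, H, W = x.shape
    cols = []
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            cols.append(x[:, i:i + k, j:j + k].ravel())
    return np.array(cols)

def deconvolved_patches(x, k, eps=1e-5):
    """Whiten the k x k patch covariance (ZCA) so the subsequent convolution,
    here expressed as a matrix product with the kernel, sees decorrelated inputs."""
    cols = im2col(x, k)
    cols = cols - cols.mean(axis=0)
    cov = cols.T @ cols / cols.shape[0]
    evals, evecs = np.linalg.eigh(cov)
    D = evecs @ np.diag(1.0 / np.sqrt(evals + eps)) @ evecs.T
    return cols @ D          # decorrelated patches, ready for kernel @ patch

# Toy usage: a 3x3 "conv layer" applied to whitened patches of one input
x = np.random.default_rng(0).normal(size=(3, 16, 16))
kernel = np.random.default_rng(1).normal(size=(8, 3 * 3 * 3))   # 8 output channels
out = deconvolved_patches(x, k=3) @ kernel.T   # (196, 8) responses
```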

    Multisensor Fusion Remote Sensing Technology For Assessing Multitemporal Responses In Ecohydrological Systems

    Earth's ecosystems and environment have been changing rapidly as a result of human technologies and development, and the impacts caused by human activities are difficult to evaluate because of these rapid changes. Remote sensing (RS) technology has been adopted for environmental management and is a promising means of measuring and monitoring the Earth's environment and its changes. RS allows large-scale measurements over a wide region within a very short period of time, and continuous, repeatable measurements are among its indispensable features. Soil moisture is a critical element in the hydrological cycle, especially in semiarid or arid regions. Point measurements cannot easily capture the continuous soil moisture distribution of a vast watershed because soil moisture patterns may vary greatly in time and space. Space-borne radar imaging satellites have been popular because they provide all-weather observations, yet methods for estimating soil moisture from active or passive satellite imagery remain uncertain. This study presents a systematic soil moisture estimation method for the Choke Canyon Reservoir Watershed (CCRW), a semiarid watershed with an area of over 14,200 km² in south Texas. With the aid of five corner reflectors, RADARSAT-1 Synthetic Aperture Radar (SAR) images of the study area acquired in April and September 2004 were first processed with both radiometric and geometric calibration. New soil moisture estimation models derived with a genetic programming (GP) technique were then developed and applied to support the analysis of the soil moisture distribution. The GP-based nonlinear function derived in the evolutionary process uniquely links a series of crucial topographic and geographic features, including slope, aspect, vegetation cover, and soil permeability, to complement the well-calibrated SAR data. The research indicates that this novel application of GP proved useful for generating a highly nonlinear regression structure that exhibits statistically strong correlations between the model estimates and the ground-truth measurements (volumetric water content) on unseen data sets. Producing soil moisture distributions over the seasons ultimately allows local- to regional-scale soil moisture variability to be characterized and the water storage of the terrestrial hydrosphere to be estimated. A new evolutionary computational, supervised classification scheme, the Riparian Classification Algorithm (RICAL), was developed and used to identify temporal and spatial changes of riparian zones in a semi-arid watershed. The case study demonstrates the incorporation of both vegetation index and soil moisture estimates, based on Landsat 5 TM and RADARSAT-1 imagery, to improve riparian classification in the Choke Canyon Reservoir Watershed (CCRW), South Texas. The CCRW, which contributes to the reservoir and is mostly agricultural and range land in a semi-arid coastal environment, was selected as the study area. This makes change detection of riparian buffers significant, given their capacity to intercept non-point source impacts within the buffer zones and to maintain ecosystem integrity region-wide.
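
The GP-derived soil moisture model is, in effect, a nonlinear per-pixel expression over co-registered SAR backscatter, terrain, and land-cover layers. The sketch below shows how such an expression might be applied to raster tiles; the expression itself is a purely hypothetical stand-in, not the formula the study's GP runs actually produced.

```python
import numpy as np

def apply_gp_soil_moisture_model(sigma0, slope, aspect, veg, perm):
    """Evaluate a hypothetical GP-evolved expression on co-registered rasters.
    sigma0: calibrated SAR backscatter (dB); slope and aspect (degrees);
    veg: vegetation cover fraction; perm: soil permeability class.
    Returns volumetric water content (fraction). The expression below is an
    illustrative stand-in, NOT the model reported in the study."""
    vwc = (0.04 * sigma0
           + 0.12 * np.tanh(perm)
           - 0.002 * slope * veg
           + 0.01 * np.cos(np.radians(aspect)))
    return np.clip(vwc + 0.25, 0.0, 0.6)

# Toy co-registered layers for a small tile
shape = (128, 128)
rng = np.random.default_rng(0)
vwc_map = apply_gp_soil_moisture_model(
    sigma0=rng.uniform(-20, -5, shape),
    slope=rng.uniform(0, 15, shape),
    aspect=rng.uniform(0, 360, shape),
    veg=rng.uniform(0, 1, shape),
    perm=rng.uniform(0, 3, shape),
)
```
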
The soil moisture estimation based on RADARSAT-1 Synthetic Aperture Radar (SAR) satellite imagery, as previously developed, was used. Eight commonly used vegetation indices were calculated from the reflectance obtained from Landsat 5 TM satellite images, and each index was used individually to classify vegetation cover in association with a genetic programming algorithm. The soil moisture and vegetation indices were integrated with the Landsat TM images on a per-pixel channel basis for riparian classification. Two different classification algorithms were used: genetic programming, and a combination of ISODATA and maximum likelihood supervised classification. The white-box nature of genetic programming revealed the comparative contribution of each input parameter. The GP algorithm yielded more than 90% accuracy, based on unseen ground data, using a vegetation index and Landsat reflectance bands 1, 2, 3, and 4. Detecting changes in the buffer zone proved technically feasible with high accuracy. Overall, the development of the RICAL algorithm may lead to the formulation of more effective management strategies for non-point source pollution control, bird habitat monitoring, and grazing and livestock management in the future. Soil properties, landscapes, channels, fault lines, erosion/deposition patches, and bedload transport history reflect the geologic and geomorphologic features of a variety of watersheds. Owing to these unique characteristics, the hydrology of large-scale watersheds is often very complex: precipitation, infiltration and percolation, streamflow, plant transpiration, soil moisture changes, and groundwater recharge are intimately related to one another and together form the water-balance dynamics at the watershed surface. This chapter presents an optimal site-selection technology that uses a grey integer programming (GIP) model to assimilate remote sensing-based geo-environmental patterns in an uncertain environment subject to technical and resource constraints. It enables us to retrieve hydrological trends and pinpoint the most critical locations for deploying monitoring stations in a vast watershed. The geo-environmental information amassed in this study includes soil permeability, surface temperature, soil moisture, precipitation, leaf area index (LAI), and normalized difference vegetation index (NDVI). With the aid of the remote sensing-based GIP analysis, only five locations out of more than 800 candidate sites were selected by the spatial analysis and then confirmed by a field investigation. The methodology developed in this remote sensing-based GIP analysis will significantly advance the state of the art in the optimal arrangement and distribution of water sensor platforms for maximum sensing coverage and information-extraction capacity. Effective water resources management is a critically important priority across the globe. While water scarcity limits water use in many ways, floods have also caused extensive damage and loss of life. To use the limited amount of water more efficiently, and to provide adequate lead time for flood warnings, we sought advanced techniques for improving streamflow forecasting. The objective of this section of the research is to incorporate sea surface temperature (SST), Next Generation Radar (NEXRAD), and meteorological characteristics with historical stream data to forecast actual streamflow using genetic programming.
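
The per-pixel channel stacking and the maximum likelihood branch of the classification can be sketched as follows. This assumes labelled training pixels, a Gaussian class-conditional model for the maximum likelihood step, and placeholder band and class names, so it illustrates the general approach rather than the RICAL implementation.

```python
import numpy as np

def stack_channels(bands, ndvi, soil_moisture):
    """bands: (4, H, W) Landsat reflectance bands 1-4; ndvi, soil_moisture: (H, W).
    Returns per-pixel features of shape (H*W, 6)."""
    layers = np.concatenate([bands, ndvi[None], soil_moisture[None]], axis=0)
    return layers.reshape(layers.shape[0], -1).T

def fit_ml_classifier(X, y):
    """Per-class mean and covariance for Gaussian maximum likelihood."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (Xc.mean(axis=0),
                     np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1]))
    return params

def ml_classify(X, params):
    """Assign each pixel to the class with the highest Gaussian log-likelihood."""
    classes = sorted(params)
    ll = np.empty((X.shape[0], len(classes)))
    for i, c in enumerate(classes):
        mu, cov = params[c]
        diff = X - mu
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        ll[:, i] = -0.5 * (np.einsum('ij,jk,ik->i', diff, inv, diff) + logdet)
    return np.array(classes)[np.argmax(ll, axis=1)]

# Usage outline with hypothetical labelled pixels:
# params = fit_ml_classifier(X_train, y_train)          # classes e.g. riparian / non-riparian
# labels = ml_classify(stack_channels(bands, ndvi, sm), params).reshape(H, W)
```
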
This case study concerns forecasting the stream discharge of a complex-terrain, semi-arid watershed. It relates microclimatological factors to the resulting streamflow rate in the river system, given the influence of dynamic basin features such as soil moisture, soil temperature, ambient relative humidity, air temperature, sea surface temperature, and precipitation. The forecasting results are evaluated in terms of the percentage error (PE), the root-mean-square error (RMSE), and the square of the Pearson product-moment correlation coefficient (r-squared value). The developed models can predict streamflow with very good accuracy, with an r-squared value of 0.84 and a PE of 1% for a 30-day prediction.
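
The three skill measures quoted above are straightforward to compute. The sketch below assumes PE is the mean absolute percentage deviation of forecasts from observations, which may differ slightly from the exact definition used in the study.

```python
import numpy as np

def forecast_metrics(obs, pred):
    """obs, pred: 1-D arrays of observed and forecast streamflow."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    rmse = np.sqrt(np.mean((pred - obs) ** 2))
    pe = 100.0 * np.mean(np.abs(pred - obs) / obs)   # percentage error (assumed definition)
    r = np.corrcoef(obs, pred)[0, 1]                 # Pearson correlation coefficient
    return {"PE_%": pe, "RMSE": rmse, "r_squared": r ** 2}

# Example with made-up discharge values
obs = np.array([10.2, 12.5, 9.8, 15.0, 14.1])
pred = np.array([10.0, 12.9, 9.5, 15.4, 13.8])
print(forecast_metrics(obs, pred))
```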