
    A Multi-scale colour and Keypoint Density-based Approach for Visual Saliency Detection.

    In the first seconds of observing an image, several visual attention processes are involved in identifying the visual targets that pop out from the scene. Saliency is the quality that makes certain regions of an image stand out from the visual field and grab our attention. Saliency detection models, inspired by visual cortex mechanisms, employ both colour and luminance features; moreover, both pixel locations and the presence of objects influence visual attention processes. In this paper, we propose a new saliency method based on combining the distribution of interest points in the image with multi-scale analysis, a centre-bias module, and a machine-learning approach. We use perceptually uniform colour spaces to study how colour impacts the extraction of saliency. To investigate eye movements and assess the performance of saliency methods on object-based images, we conduct experimental sessions on our dataset ETTO (Eye Tracking Through Objects). Experiments show our approach to be accurate in saliency detection with respect to state-of-the-art methods and accessible eye-movement datasets. Performance on object-based images is excellent and remains consistent on generic pictures. In addition, our work reveals interesting relationships between saliency and perceptually uniform colour spaces.
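The combination the abstract describes (keypoint density, centre bias, fusion) can be sketched roughly as below. The Gaussian centre-bias form, the cell size, and the multiplicative fusion are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def centre_bias(h, w, sigma=0.3):
    """Gaussian centre-bias map, peaking at 1.0 at the image centre.
    sigma is a fraction of the image diagonal (a hypothetical default)."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    d2 = (ys - cy) ** 2 + (xs - cx) ** 2
    s2 = (sigma * np.hypot(h, w)) ** 2
    return np.exp(-d2 / (2.0 * s2))

def keypoint_density(points, h, w, cell=8):
    """Coarse density map: count keypoints per cell, upsample, normalise."""
    dens = np.zeros((h // cell + 1, w // cell + 1))
    for y, x in points:
        dens[int(y) // cell, int(x) // cell] += 1
    dens = np.kron(dens, np.ones((cell, cell)))[:h, :w]
    return dens / dens.max() if dens.max() > 0 else dens

def saliency(points, h, w):
    """Combine density and centre bias multiplicatively (one simple choice)."""
    return keypoint_density(points, h, w) * centre_bias(h, w)
```

Regions dense in interest points near the image centre score highest, which is the intuition behind combining a density map with a centre-bias prior.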

    Unsupervised image saliency detection with Gestalt-laws guided optimization and visual attention based refinement.

    Visual attention is a fundamental cognitive capability that allows human beings to focus on regions of interest (ROIs) in complex natural environments. Which ROIs we attend to depends mainly on two distinct attentional mechanisms. The bottom-up mechanism guides our detection of salient objects and regions through externally driven factors, e.g. colour and location, whilst the top-down mechanism biases attention according to prior knowledge and cognitive strategies provided by the visual cortex. However, how to practically use and fuse both attentional mechanisms for salient object detection has not been sufficiently explored. To this end, we propose an integrated framework consisting of bottom-up and top-down attention mechanisms that enables attention to be computed at the level of salient objects and/or regions. Within our framework, the bottom-up model is guided by the Gestalt laws of perception. We interpret the Gestalt laws of homogeneity, similarity, proximity, and figure-ground in terms of colour and spatial contrast at the level of regions and objects to produce a feature contrast map. The top-down model uses a formal computational model to describe the background connectivity of attention and produce a priority map. Integrating both mechanisms and applying them to salient object detection, our results demonstrate that the proposed method consistently outperforms a number of existing unsupervised approaches on five challenging and complicated datasets, achieving higher precision and recall rates, AP (average precision), and AUC (area under curve) values.

    Target detection, tracking, and localization using multi-spectral image fusion and RF Doppler differentials

    It is critical for defense and security applications to have a high probability of detection and a low false-alarm rate while operating over a wide variety of conditions. Sensor fusion, the process of combining data from two or more sensors, has been utilized to improve the performance of a system by exploiting the strengths of each sensor. This dissertation presents algorithms that fuse multi-sensor data to improve system performance by increasing detection rates, lowering false alarms, and improving track performance. It also presents a framework for comparing algorithm error in image registration, a critical pre-processing step for multi-spectral image fusion. First, I present an algorithm to improve detection and tracking performance for moving targets in a cluttered urban environment by fusing foreground maps from multi-spectral imagery. Most research in image fusion considers the visible and long-wave infrared bands; I examine these bands along with near-infrared and mid-wave infrared. To localize and track a particular target of interest, I present an algorithm that fuses output from the multi-spectral image tracker with a constellation of RF sensors measuring a specific cellular emanation. The fusion algorithm matches the Doppler differential from the RF sensors with the theoretical Doppler differential of the video tracker output by selecting the sensor pair that minimizes either the absolute difference or the root-mean-square difference. Finally, a framework to quantify shift-estimation error for both area- and feature-based algorithms is presented. By exploiting synthetically generated visible and long-wave infrared imagery, error metrics are computed and compared for a number of area- and feature-based shift-estimation algorithms. Several key results are presented in this dissertation.
The multi-spectral image tracker improves location accuracy while improving the detection rate and lowering false alarms for most spectral bands. All 12 moving targets were tracked through the video sequence, with only one lost track that was later recovered. Targets from the multi-spectral tracking algorithm were correctly associated with their corresponding cellular emanation for all targets at lower measurement uncertainty using the root-mean-square difference, while also achieving a high confidence ratio for selecting the true target among background targets. For the area-based algorithms on the synthetic airfield image pair, the DFT and ECC algorithms produce sub-pixel shift-estimation error in regions such as shadows and high-contrast painted lines. The edge-orientation feature descriptors increase the number of sub-field estimates while improving the shift-estimation error compared to the Lowe descriptor.
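The sensor-pair selection step can be sketched as follows; the function name, the dictionary input format, and the use of the root-mean-square difference alone are illustrative assumptions, not the dissertation's exact interface.

```python
import numpy as np

def match_track_to_rf(track_doppler_diff, rf_doppler_diffs):
    """Select the RF sensor pair whose measured Doppler differential best
    matches the differential predicted from the video tracker output.

    track_doppler_diff : predicted Doppler differentials over time (Hz)
    rf_doppler_diffs   : dict {pair_name: measured differential array}
    Returns (best_pair, rms_error).
    """
    best_pair, best_rms = None, np.inf
    for pair, measured in rf_doppler_diffs.items():
        err = np.asarray(measured) - np.asarray(track_doppler_diff)
        rms = np.sqrt(np.mean(err ** 2))   # root-mean-square difference
        if rms < best_rms:
            best_pair, best_rms = pair, rms
    return best_pair, best_rms
```

The pair with the smallest residual is taken as the association between the video track and the cellular emanation.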

    Digital Image Access & Retrieval

    The 33rd Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections, in three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation, with the bulk of the conference focusing on indexing and retrieval.

    Two and three dimensional segmentation of multimodal imagery

    The role of segmentation in image understanding/analysis, computer vision, pattern recognition, remote sensing, and medical imaging has been significantly augmented in recent years by accelerated scientific advances in the acquisition of image data. This low-level analysis protocol is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework for facilitating meaningful segregation of 2-D/3-D image data across multiple modalities (color, remote sensing, and biomedical imaging) into non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits information obtained from detecting edges inherent in the data. Using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition some initial portion of the input image content. Pixels with higher gradient densities are then included by dynamically generating segments as the algorithm progresses, yielding an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture, and intensity information, along with the aforementioned initial partition map, are used in a multivariate refinement procedure that fuses groups with similar characteristics to yield the final output segmentation. Experimental results, compared against published state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery, demonstrate the advantages of the proposed method. Furthermore, to achieve improved computational efficiency, we propose an extension of this methodology in a multi-resolution framework, demonstrated on color images.
Finally, this research also encompasses a 3-D extension of the algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.
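A minimal sketch of the first stage (labelling low-gradient pixels as initial seed regions), assuming a simple scalar image gradient in place of the vector gradient detector and a hypothetical threshold value:

```python
import numpy as np
from scipy import ndimage

def initial_region_map(img, grad_thresh=0.1):
    """Label connected groups of low-gradient pixels as seed regions.
    img is a float array in [0, 1]; grad_thresh is an assumed parameter."""
    gy, gx = np.gradient(img)
    grad_mag = np.hypot(gx, gy)
    flat = grad_mag < grad_thresh      # pixels without strong edges
    labels, n = ndimage.label(flat)    # connected low-gradient groups
    return labels, n
```

Higher-gradient pixels left unlabelled here would be absorbed later by the dynamic segment generation and multivariate refinement the abstract describes.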

    Object detection in dual-band infrared

    Dual-Band Infrared (DBIR) offers the advantage of combining Mid-Wave Infrared (MWIR) and Long-Wave Infrared (LWIR) within a single field of view (FoV), providing additional information for each spectral band. DBIR camera systems find applications in both military and civilian contexts. This work introduces a novel labeled DBIR dataset that includes civilian vehicles, aircraft, birds, and people. The dataset is designed for use in object detection and tracking algorithms. It comprises 233 objects with tracks spanning up to 1,300 frames, encompassing images in both MW and LW. This research reviews pertinent literature on object detection, object detection in the infrared spectrum, and data fusion. Two sets of experiments were conducted using this DBIR dataset: motion detection and CNN-based object detection. For motion detection, a parallel implementation of the Visual Background Extractor (ViBe) was developed, employing connected-components analysis to generate bounding boxes. These bounding boxes were assessed with Intersection-over-Union (IoU) calculations. The results demonstrate that DBIR enhances the IoU of bounding boxes in 6.11% of cases within sequences where the camera's field of view remains stationary. A size analysis reveals ViBe's effectiveness in detecting small and dim objects within this dataset. A subsequent experiment employed You Only Look Once (YOLO) versions 4 and 7 to perform inference on this dataset after image preprocessing. The inference models were trained on visible-spectrum MS COCO data. The findings confirm that YOLOv4/7 effectively detect objects within the infrared spectrum in this dataset. An assessment of these CNNs' performance relative to the size of the detected object highlights the significance of object size in detection capabilities.
Notably, DBIR substantially enhances detection capabilities in both YOLOv4 and YOLOv7; however, in the latter case the number of false-positive detections increases. Consequently, while DBIR improves the recall of YOLOv4/7, the introduction of DBIR information reduces the precision of YOLOv7. This study also demonstrates the complementary nature of ViBe and YOLO in their detection capabilities with respect to object size in this dataset: ViBe excels at detecting small, distant objects, while YOLO excels at detecting larger, closer objects. Though this combination is known prior art, a hybrid configuration using the two approaches together is discussed. The research underscores that DBIR offers multiple advantages over MW or LW alone in modern computer-vision algorithms, warranting further research investment.
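The Intersection-over-Union measure used to assess the generated bounding boxes is standard, and for axis-aligned boxes given as corner coordinates it can be computed as:

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Intersection rectangle (empty if the boxes do not overlap).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    # Union = sum of areas minus the overlap counted twice.
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

IoU ranges from 0 (disjoint boxes) to 1 (identical boxes), which makes it a convenient per-detection score for comparing DBIR against single-band results.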

    Application of Multi-Sensor Fusion Technology in Target Detection and Recognition

    The application of multi-sensor fusion technology has drawn considerable industrial and academic interest in recent years. Multi-sensor fusion methods are widely used in many applications, such as autonomous systems, remote sensing, video surveillance, and the military. These methods can capture the complementary properties of targets by considering multiple sensors, and they can achieve a detailed environment description and accurate detection of targets of interest based on information from different sensors. This book collects novel developments in the field of multi-sensor, multi-source, and multi-process information fusion. Articles emphasize one or more of three facets: architectures, algorithms, and applications. The published papers deal with fundamental theoretical analyses as well as demonstrations of their application to real-world problems.

    Hypothesis Testing Using Spatially Dependent Heavy-Tailed Multisensor Data

    The detection of spatially dependent heavy-tailed signals is considered in this dissertation. While the central limit theorem, and its implication of asymptotic normality for interacting random processes, is generally useful for the theoretical characterization of a wide variety of natural and man-made signals, sensor data from many different applications are in fact characterized by non-Gaussian distributions. A common characteristic observed in non-Gaussian data is the presence of heavy (or fat) tails: the probability density function (p.d.f.) of extreme values decays at a slower-than-exponential rate, implying that extreme events occur with greater probability. When these events are observed simultaneously by several sensors, their observations are also spatially dependent. In this dissertation, we develop the theory of detection for such data obtained through heterogeneous sensors. To validate our theoretical results and proposed algorithms, we collect and analyze indoor footstep data using a linear array of seismic sensors. We characterize the inter-sensor dependence using copula theory. Copulas are parametric functions that bind univariate p.d.f.s to generate a valid joint p.d.f. We model the heavy-tailed data using the class of alpha-stable distributions. We consider a two-sided test in the Neyman-Pearson framework and present an asymptotic analysis of the generalized likelihood ratio test (GLRT). Both nested and non-nested models are considered in the analysis. We also use a likelihood-maximization-based copula selection scheme as an integral part of the detection process; since many types of copula functions are available in the literature, selecting the appropriate copula becomes an important component of the detection problem. The performance of the proposed scheme is evaluated numerically on simulated data as well as on indoor seismic data.
With appropriately selected models, our results demonstrate that a high probability of detection can be achieved for false-alarm probabilities of the order of 10^-4. These results, using dependent alpha-stable signals, are presented for a two-sensor case. We identify the computational challenges associated with dependent alpha-stable modeling and propose alternative schemes to extend the detector design to a multisensor (multivariate) setting. We use a hierarchical tree-based approach, called vines, to model the multivariate copulas, i.e., the spatial dependence between multiple sensors. The performance of the proposed detectors under the vine-based scheme is evaluated on the indoor footstep data, and significant improvement is observed compared to the case when only two sensors are deployed. Some open research issues are identified and discussed.
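A copula binds marginal c.d.f. values into a joint density. As one standard illustration (the dissertation selects among several copula families via likelihood maximization; the Gaussian copula here is just one common choice), a bivariate Gaussian copula density can be evaluated as:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def gaussian_copula_density(u, v, rho):
    """Density of a bivariate Gaussian copula at (u, v) in (0, 1)^2,
    with correlation parameter rho.
    c(u, v) = phi2(x, y; rho) / (phi(x) * phi(y)),
    where x = Phi^{-1}(u), y = Phi^{-1}(v)."""
    x, y = norm.ppf(u), norm.ppf(v)
    cov = [[1.0, rho], [rho, 1.0]]
    joint = multivariate_normal.pdf([x, y], mean=[0.0, 0.0], cov=cov)
    return joint / (norm.pdf(x) * norm.pdf(y))
```

With rho = 0 the copula density is identically 1 (independence); positive rho inflates the density in the joint tails, which is exactly the inter-sensor dependence the detector needs to capture.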

    Adaptive detection and tracking using multimodal information

    This thesis describes work on fusing data from multiple sources of information and focuses on two main areas: adaptive detection and adaptive object tracking in automated vision scenarios. The work on adaptive object detection explores a new paradigm in dynamic parameter selection, choosing thresholds for object detection so as to maximise agreement between pairs of sources. Object tracking, a technique complementary to object detection, is also explored in a multi-source context, and an efficient framework for robust tracking, termed the Spatiogram Bank tracker, is proposed as a means to overcome the difficulties of traditional histogram tracking. As well as providing theoretical analysis of the proposed methods, specific example applications are given for both the detection and tracking aspects, using thermal infrared and visible-spectrum video data, as well as other multi-modal information sources.
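The threshold-selection idea (choosing per-source detection thresholds to maximise agreement between a pair of sources) can be sketched as below; the IoU agreement measure and the brute-force grid search are illustrative assumptions rather than the thesis's actual procedure.

```python
import numpy as np

def select_thresholds(map_a, map_b, candidates):
    """Pick the threshold pair maximising agreement (here, IoU of the two
    binary foreground masks) between two detection sources.
    map_a, map_b : per-pixel scores from each source
    candidates   : iterable of candidate threshold values
    """
    best, best_score = (None, None), -1.0
    for ta in candidates:
        for tb in candidates:
            a, b = map_a > ta, map_b > tb
            inter = np.logical_and(a, b).sum()
            union = np.logical_or(a, b).sum()
            score = inter / union if union > 0 else 0.0
            if score > best_score:
                best, best_score = (ta, tb), score
    return best, best_score
```

Because the criterion is agreement between sources rather than a ground-truth label, the selection can adapt online as scene conditions change.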