
    Detection of Salient Objects in Images Using Frequency Domain and Deep Convolutional Features

    In image processing and computer vision tasks such as object-of-interest image segmentation, adaptive image compression, object-based image retrieval, seam carving, and medical imaging, the cost of information storage and computational complexity is generally a great concern. Therefore, for these and other applications, identifying and focusing only on the parts of the image that are visually most informative is highly desirable. These most informative parts or regions, which also have more contrast with the rest of the image, are called the salient regions of the image, and the process of identifying them is referred to as salient object detection. The main challenges in devising a salient object detection scheme are in extracting the image features that correctly differentiate the salient objects from the non-salient ones, and then utilizing them to detect the salient objects accurately. Several salient object detection methods have been developed in the literature using spatial domain image features. However, these methods generally cannot detect the salient objects uniformly or with clear boundaries between the salient and non-salient regions. This is due to the fact that in these methods, unnecessary frequency content of the image is retained or useful content from the original image is suppressed. Frequency domain features can address these limitations by providing a better representation of the image. Some salient object detection schemes have been developed based on features extracted using the Fourier or Fourier-like transforms. While these methods are more successful in detecting the entire salient object in images with small salient regions, in images with large salient regions they have a tendency to highlight the boundaries of the salient region rather than the entire salient region. This is due to the fact that in the Fourier transform of an image, global contrast is more dominant than local contrast.
Moreover, it is known that the Fourier transform cannot provide simultaneous spatial and frequency localization. Multi-resolution feature extraction techniques can provide more accurate features for different image processing tasks, since features that might not be extracted at one resolution may be detected at another. However, not much work has been done to employ multi-resolution feature extraction techniques for salient object detection. In view of this, the objective of this thesis is to develop schemes for image salient object detection using multi-resolution feature extraction techniques in both the frequency domain and the spatial domain. The first part of this thesis is concerned with developing salient object detection methods using multi-resolution frequency domain features. The wavelet transform has the ability to perform multi-resolution, simultaneously spatially and frequency localized analysis, which makes it a better feature extraction tool than the Fourier or other Fourier-like transforms. In this part of the thesis, first a salient object detection scheme is developed by extracting features from the high-pass coefficients of the wavelet decompositions of the three color channels of images, and devising a scheme for the weighted linear combination of the color channel features. Despite the advantages of the wavelet transform in image feature extraction, it is not very effective in capturing line discontinuities, which correspond to directional information in the image. In order to circumvent the lack of directional flexibility of the wavelet-based features, another salient object detection scheme is also presented that extracts local and global features from the non-subsampled contourlet coefficients of the image color channels.
The local features are extracted from the local variations of the low-pass coefficients, whereas the global features are obtained from the distribution of the subband coefficients afforded by the directional flexibility of the non-subsampled contourlet transform. In the past few years, there has been a surge of interest in employing deep convolutional neural networks to extract image features for different applications. These networks provide a platform for automatically extracting low-level appearance features and high-level semantic features at different resolutions from the raw images. The second part of this thesis is, therefore, concerned with the investigation of salient object detection using multi-resolution deep convolutional features. The existing deep salient object detection schemes are based on the standard convolution. However, performing the standard convolution is computationally expensive, especially when the number of channels increases through the layers of a deep network. In this part of the thesis, using a lightweight depthwise separable convolution, a deep salient object detection network is developed that exploits the fusion of multi-level and multi-resolution image features through judicious skip connections between the layers. The proposed network is aimed at providing good performance with a much reduced complexity compared to the existing deep salient object detection methods. Extensive experiments are conducted to evaluate the performance of the proposed salient object detection methods by applying them to natural images from several datasets. It is shown that the performance of the proposed methods is superior to that of the existing salient object detection methods.
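The complexity saving that motivates the depthwise separable convolution mentioned above can be made concrete with a parameter count comparison. The following is a minimal sketch; the channel and kernel sizes are hypothetical and not taken from the proposed network:

```python
def conv_params(c_in, c_out, k):
    """Parameter count of a standard k x k convolution layer."""
    return c_in * c_out * k * k

def sep_conv_params(c_in, c_out, k):
    """Depthwise separable convolution: one k x k filter per input
    channel (depthwise), then a 1 x 1 pointwise convolution that
    mixes the channels."""
    return c_in * k * k + c_in * c_out

# Hypothetical layer sizes, for illustration only
c_in, c_out, k = 256, 256, 3
standard = conv_params(c_in, c_out, k)       # 589824
separable = sep_conv_params(c_in, c_out, k)  # 67840
print(round(standard / separable, 1))        # 8.7
```

The saving grows with the kernel size and the number of output channels, which is why the factorization pays off most in the deeper, wider layers of a network.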

    Infrared and visible image fusion with edge detail implantation

    Infrared and visible image fusion aims to integrate complementary information from images of the same scene captured by different types of sensors into a single fused image with richer information. Recently, deep learning-based infrared and visible image fusion methods have been widely used. However, how to preserve the edge detail information of the source images more effectively remains a difficult problem. To address this problem, we propose a novel infrared and visible image fusion method with edge detail implantation. Unlike traditional methods, which improve the rendering of edge details in the fused image by making the extracted features contain edge detail information, the proposed method processes source image information and edge detail information separately and supplements the main framework with edge details. Technically, we propose a two-branch feature representation framework. One branch is used to extract features directly from the input source image, while the other is utilized to extract features from the edge map. The edge detail branch mainly provides edge detail features for the source image branch, ensuring that the output features contain rich edge detail information. In the fusion of multi-source features, we fuse the source image features and the edge detail features separately, and use the fused edge details to guide and enhance the fused source image features so that they contain richer edge detail information. Extensive experimental results demonstrate the effectiveness of the proposed method.
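The two-branch idea above can be illustrated with a deliberately simple sketch: a gradient-magnitude edge map stands in for the learned edge-detail branch, and a weighted addition stands in for the guided fusion. The function names and the weight `alpha` are hypothetical, not part of the actual deep method:

```python
import numpy as np

def edge_map(img):
    """Gradient-magnitude edge map: a simple stand-in for the input
    to the edge-detail branch (the actual method learns features)."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def fuse_with_edge_implantation(ir, vis, alpha=0.5):
    """Schematic fusion: average the source images, then implant the
    stronger of the two edge-detail maps back into the result.
    `alpha` weights the implanted detail."""
    base = 0.5 * (ir + vis)                           # fused source content
    detail = np.maximum(edge_map(ir), edge_map(vis))  # fused edge detail
    return base + alpha * detail

ir = np.random.rand(64, 64)
vis = np.random.rand(64, 64)
fused = fuse_with_edge_implantation(ir, vis)
print(fused.shape)  # (64, 64)
```

The point of the separation is visible even here: the detail term can be tuned or guided independently of the base fusion, which is what the paper's edge-detail branch does with learned features.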

    Sparse and Redundant Representations for Inverse Problems and Recognition

    Sparse and redundant representation of data enables the description of signals as linear combinations of a few atoms from a dictionary. In this dissertation, we study applications of sparse and redundant representations to inverse problems and object recognition. Furthermore, we propose two novel imaging modalities based on the recently introduced theory of Compressed Sensing (CS). This dissertation consists of four major parts. In the first part, we study a new type of deconvolution algorithm that is based on estimating the image from a shearlet decomposition. Shearlets provide a multi-directional and multi-scale decomposition that has been mathematically shown to represent distributed discontinuities such as edges better than traditional wavelets. We develop a deconvolution algorithm that allows the approximate inversion operator to be controlled on a multi-scale and multi-directional basis. Furthermore, we develop a method for the automatic determination of the threshold values for noise shrinkage at each scale and direction, without explicit knowledge of the noise variance, using a generalized cross-validation method. In the second part, we study a reconstruction method that recovers highly undersampled images, assumed to have a sparse representation in a gradient domain, from partial measurement samples collected in the Fourier domain. Our method makes use of a robust generalized Poisson solver that greatly aids in achieving a significantly improved performance over similar proposed methods. We demonstrate by experiments that this new technique is more flexible than its competitors in working with either random or restricted sampling scenarios.
In the third part, we introduce a novel Synthetic Aperture Radar (SAR) imaging modality that can provide a high-resolution map of the spatial distribution of targets and terrain using a significantly reduced number of transmitted and/or received electromagnetic waveforms. We demonstrate that this new imaging scheme requires no new hardware components and allows the aperture to be compressed. It also presents many new applications and advantages, including strong resistance to countermeasures and interception, imaging of much wider swaths, and reduced on-board storage requirements. The last part of the dissertation deals with object recognition based on learning dictionaries for simultaneous sparse signal approximation and feature extraction. A dictionary is learned for each object class from the given training examples by minimizing the representation error under a sparseness constraint. A novel test image is then projected onto the span of the atoms in each learned dictionary. The residual vectors, along with the coefficients, are then used for recognition. Applications to illumination-robust face recognition and automatic target recognition are presented.
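The residual-based recognition rule in the last part can be sketched compactly. Here plain least squares stands in for the sparse approximation step (the actual scheme would use a sparsity-constrained solver such as OMP), and the toy dictionaries are illustrative, not learned:

```python
import numpy as np

def classify_by_residual(y, dictionaries):
    """Assign signal y to the class whose dictionary represents it
    with the smallest residual. Least squares stands in for the
    sparse coding step of the actual scheme."""
    residuals = []
    for D in dictionaries:
        x, *_ = np.linalg.lstsq(D, y, rcond=None)  # coefficients over D's atoms
        residuals.append(np.linalg.norm(y - D @ x))
    return int(np.argmin(residuals))

rng = np.random.default_rng(0)
D0 = rng.standard_normal((20, 5))   # toy dictionary for class 0
D1 = rng.standard_normal((20, 5))   # toy dictionary for class 1
y = D1 @ rng.standard_normal(5)     # signal lying in class 1's span
print(classify_by_residual(y, [D0, D1]))  # 1
```

Because `y` lies exactly in the span of `D1`'s atoms, its residual under `D1` is essentially zero, while a random 5-atom dictionary in a 20-dimensional space leaves a nonzero residual.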

    Sonar image interpretation for sub-sea operations

    Mine Counter-Measure (MCM) missions are conducted to neutralise underwater explosives. Automatic Target Recognition (ATR) assists operators by increasing the speed and accuracy of data review. ATR embedded on vehicles enables adaptive missions, which increase the speed of data acquisition. This thesis addresses three challenges: the speed of data processing, the robustness of ATR to environmental conditions, and the large quantities of data required to train an algorithm. The main contribution of this thesis is a novel ATR algorithm. The algorithm uses features derived from the projection of 3D boxes to produce a set of 2D templates. The template responses are independent of grazing angle, range and target orientation. Integer skewed integral images are derived to accelerate the calculation of the template responses. The algorithm is compared to the Haar cascade algorithm. For a single model of sonar and cylindrical targets, the algorithm reduces the Probability of False Alarm (PFA) by 80% at a Probability of Detection (PD) of 85%. When the algorithm is trained on target data from another model of sonar, the PD is only 6% lower, even though no representative target data was used for training. The second major contribution is an adaptive ATR algorithm that uses local sea-floor characteristics to address the problem of ATR robustness with respect to the local environment. A dual-tree wavelet decomposition of the sea-floor and a Markov Random Field (MRF) based graph-cut algorithm are used to segment the terrain. A Neural Network (NN) is then trained to filter ATR results based on the local sea-floor context. It is shown, for the Haar cascade algorithm, that the PFA can be reduced by 70% at a PD of 85%. The speed of data processing is addressed using novel pre-processing techniques. The standard three-class MRF for sonar image segmentation is formulated using graph cuts. Consequently, a 1.2 million pixel image is segmented in 1.2 seconds.
Additionally, local estimation of class models is introduced to remove range-dependent variation in segmentation quality. Finally, an A* graph search is developed to remove the surface return, a line of saturated pixels often detected as false alarms by ATR. The A* search identifies the surface return in 199 of the 220 images tested, with a runtime of 2.1 seconds. The algorithm is robust to the presence of ripples and rocks.
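The speed-up from integral images, which the thesis extends to an integer skewed variant, comes from reducing any box sum to four table lookups. The following sketch shows only the standard axis-aligned form that the skewed variant generalizes:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: I[r, c] = sum(img[:r, :c])."""
    return np.pad(img, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

def box_sum(I, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1), independent of box size."""
    return I[r1, c1] - I[r0, c1] - I[r1, c0] + I[r0, c0]

img = np.arange(16).reshape(4, 4)
I = integral_image(img)
print(box_sum(I, 1, 1, 3, 3))  # 5 + 6 + 9 + 10 = 30
```

Template responses built from such box sums cost the same regardless of template size, which is what makes cascades of box-like filters fast enough for embedded ATR.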

    Advances in Image Processing, Analysis and Recognition Technology

    For many decades, researchers have been trying to make computers’ analysis of images as effective as human vision. For this purpose, many algorithms and systems have been created. The whole process covers various stages, including image processing, representation and recognition. The results of this work can be applied to many computer-assisted areas of everyday life. They improve particular activities and provide handy tools, which are sometimes only for entertainment, but quite often significantly increase our safety. In fact, the range of practical implementations of image processing algorithms is particularly wide. Moreover, the rapid growth of computing power and efficiency has allowed for the development of more sophisticated and effective algorithms and tools. Although significant progress has been made so far, many issues remain, resulting in the need for the development of novel approaches.

    Quantitative analysis with machine learning models for multi-parametric brain imaging data

    Gliomas are considered to be the most common primary adult malignant brain tumor. With the dramatic increases in computational power and improvements in image analysis algorithms, computer-aided medical image analysis has been introduced into clinical applications. Precise tumor grading and genotyping play an indispensable role in clinical diagnosis, treatment and prognosis. Glioma diagnostic procedures include histopathological imaging tests, molecular imaging scans and tumor grading. Pathologic review of tumor morphology in histologic sections is the traditional method for cancer classification and grading, yet human review has limitations that can result in low reproducibility and poor inter-observer agreement. Compared with histopathological images, Magnetic Resonance (MR) images present different structural and functional features, which might serve as noninvasive surrogates for tumor genotypes. Therefore, computer-aided image analysis has been adopted in clinical applications, as it might partially overcome these shortcomings owing to its capacity to quantitatively and reproducibly measure multi-level features from multi-parametric medical information. Imaging features obtained from a single imaging modality do not fully represent the disease, so quantitative imaging features, including morphological, structural, cellular and molecular level features, derived from multi-modality medical images should be integrated into computer-aided medical image analysis. The difference in image quality between multi-modality images is a challenge in the field of computer-aided medical image analysis. In this thesis, we aim to integrate quantitative imaging data obtained from multiple modalities into mathematical models of tumor prediction response to achieve additional insights of practical predictive value. Our major contributions in this thesis are: 1.
To resolve the difference in imaging quality and the observer dependence of histological image diagnosis, we propose an automated machine-learning brain tumor-grading platform to investigate the contributions of multiple parameters from multimodal data, including imaging parameters or features from Whole Slide Images (WSI) and the proliferation marker KI-67. For each WSI, we extract both visual parameters, such as morphology parameters, and sub-visual parameters, including first-order and second-order features. A quantitative interpretable machine learning approach (Local Interpretable Model-Agnostic Explanations) is then followed to measure the contribution of the features for each single case. Most grading systems based on machine learning models are considered “black boxes,” whereas with this system the clinically trusted reasoning can be revealed. The quantitative analysis and explanation may assist clinicians to better understand the disease and accordingly choose optimal treatments for improving clinical outcomes. 2. Based on the proposed automated brain tumor-grading platform, multimodal Magnetic Resonance Images (MRIs) are introduced into our research. A new imaging–tissue correlation based approach called RA-PA-Thomics is proposed to predict the IDH genotype. Inspired by the concept of image fusion, we integrate multimodal MRIs and scans of histopathological images for indirect, fast, and cost-saving IDH genotyping. The proposed model has been verified by multiple evaluation criteria on the integrated data set and compared to results in the prior art. The experimental data set includes public data sets and image information from two hospitals. Experimental results indicate that the model improves the accuracy of glioma grading and genotyping.
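The "first-order" sub-visual parameters mentioned above are histogram statistics of a patch's intensities. As a minimal sketch (the feature set and the function name are illustrative; the actual platform extracts many more features, including second-order texture):

```python
import numpy as np

def first_order_features(patch, bins=32):
    """First-order (histogram) features of an image patch: mean,
    variance, and Shannon entropy of the intensity distribution."""
    p = patch.astype(float).ravel()
    hist, _ = np.histogram(p, bins=bins)
    hist = hist[hist > 0] / hist.sum()          # normalized probabilities
    entropy = -np.sum(hist * np.log2(hist))      # bits
    return {"mean": p.mean(), "variance": p.var(), "entropy": entropy}

patch = np.random.rand(32, 32)
feats = first_order_features(patch)
print(sorted(feats))  # ['entropy', 'mean', 'variance']
```

Such per-patch feature vectors are what an interpretability method like LIME can then attribute grading decisions to, feature by feature, for a single case.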

    Intelligent Computational Transportation

    Transportation is commonplace around our world. Numerous researchers dedicate great efforts to vast transportation research topics. The purpose of this dissertation is to investigate and address a couple of transportation problems with respect to geographic discretization, automatic pavement surface examination, and traffic flow simulation, using advanced computational technologies. Many applications require a discretized 2D geographic map such that local information can be accessed efficiently. For example, map matching, which aligns a sequence of observed positions to a real-world road network, needs to find all the road segments near the individual positions. To this end, the map is discretized into cells, and each cell retains a list of the road segments coincident with that cell. An efficient method is proposed to form such lists for the cells without costly overlap tests. Furthermore, the method can be easily extended to 3D scenarios for fast triangle mesh voxelization. Pavement surface distress conditions are critical inputs for quantifying roadway infrastructure serviceability. Existing computer-aided automatic examination techniques are mainly based on 2D image analysis or 3D georeferenced data sets. Their disadvantages of information loss or extremely high cost impede their effectiveness and applicability. In this study, a cost-effective Kinect-based approach is proposed for 3D pavement surface reconstruction and cracking recognition. Various cracking measurements, such as alligator cracking, transverse cracking, longitudinal cracking, etc., are identified and recognized for severity examination based on associated geometrical features. Smart transportation is one of the core components in modern urbanization processes. In this context, the Connected Autonomous Vehicle (CAV) system presents a promising solution towards enhanced traffic safety and mobility through state-of-the-art wireless communications and autonomous driving techniques.
Due to the different nature of CAVs and conventional Human-Driven Vehicles (HDVs), it is believed that CAV-enabled transportation systems will revolutionize the existing understanding of network-wide traffic operations and re-establish traffic flow theory. This study presents a new continuum dynamics model for the future CAV-enabled traffic system, realized by encapsulating mutually coupled vehicle interactions using virtual internal and external forces. A Smoothed Particle Hydrodynamics (SPH)-based numerical simulation and an interactive traffic visualization framework are also developed.
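The cell-to-segment-list structure used for map matching above can be sketched as follows. This naive version buckets each segment by its bounding box; the dissertation's contribution is building the same structure without per-cell overlap tests, so treat this only as an illustration of the target data structure (the function name is hypothetical):

```python
from collections import defaultdict

def bucket_segments(segments, cell_size):
    """Map each grid cell (cx, cy) to the ids of road segments whose
    bounding box touches it, so nearby segments can be looked up in
    O(1) per query position."""
    grid = defaultdict(list)
    for seg_id, ((x0, y0), (x1, y1)) in enumerate(segments):
        cx0, cx1 = sorted((int(x0 // cell_size), int(x1 // cell_size)))
        cy0, cy1 = sorted((int(y0 // cell_size), int(y1 // cell_size)))
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                grid[(cx, cy)].append(seg_id)
    return grid

segments = [((0.5, 0.5), (2.5, 0.5)),   # horizontal segment, id 0
            ((1.5, 1.5), (1.5, 2.5))]   # vertical segment, id 1
grid = bucket_segments(segments, cell_size=1.0)
print(grid[(1, 0)])  # [0] -- segment 0 passes through this cell
```

A map-matching query then hashes an observed position to its cell and considers only the short segment list there, instead of scanning the whole network.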

    Recent Advances in Signal Processing

    Signal processing is a critical issue in the majority of new technological inventions and challenges, in a variety of applications across both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian, and have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories address, in order, image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.