
    The aceToolbox: low-level audiovisual feature extraction for retrieval and classification

    In this paper we present an overview of a software platform developed within the aceMedia project, termed the aceToolbox, which provides global and local low-level feature extraction from audio-visual content. The toolbox is based on the MPEG-7 eXperimental Model (XM), with extensions to provide descriptor extraction from arbitrarily shaped image segments, thereby supporting local descriptors that reflect real image content. We describe the architecture of the toolbox and provide an overview of the descriptors supported to date. We also briefly describe the segmentation algorithm provided. We then demonstrate the usefulness of the toolbox in the context of two different content processing scenarios: similarity-based retrieval in large collections and scene-level classification of still images.
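
    The retrieval scenario reduces to extracting a fixed-length descriptor per image and ranking the collection by a distance in descriptor space. As a minimal sketch, the Python below uses a global colour histogram with an L1 distance; the histogram descriptor, bin count, and distance are illustrative assumptions, not the aceToolbox's MPEG-7 descriptors or its API.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Global colour descriptor: a joint RGB histogram, L1-normalised.

    `image` is an (H, W, 3) uint8 array. The MPEG-7 descriptors the
    toolbox actually implements are richer, but the retrieval loop is
    the same for any fixed-length feature vector.
    """
    hist, _ = np.histogramdd(
        image.reshape(-1, 3), bins=(bins,) * 3, range=((0, 256),) * 3
    )
    hist = hist.ravel()
    return hist / hist.sum()

def retrieve(query_desc, collection_descs, k=5):
    """Rank a collection (rows of `collection_descs`) by L1 distance."""
    dists = np.abs(collection_descs - query_desc).sum(axis=1)
    return np.argsort(dists)[:k]
```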

    Bayesian Model Based Tracking with Application to Cell Segmentation and Tracking

    The goal of this research is to develop a model-based tracking framework with biomedical imaging applications. This is an interdisciplinary area of research with interests in machine vision, image processing, and biology. This thesis presents methods of image modeling, tracking, and data association applied to problems in multi-cellular image analysis, at the current stage focused on hematopoietic stem cell (HSC) images. The focus of this research is the development of a robust image analysis interface capable of detecting, locating, and tracking individual HSCs, which proliferate and differentiate into different blood cell types continuously during their lifetime, and are of substantial interest in gene therapy, cancer, and stem-cell research. Such a system could potentially be employed in the future to track different groups of HSCs extracted from bone marrow and recognize the best candidates based on biomedical criteria. Selected candidates could further be used for bone marrow transplantation (BMT), a medical procedure for the treatment of various otherwise incurable diseases such as leukemia, lymphomas, aplastic anemia, immune deficiency disorders, multiple myeloma, and some solid tumors. Tracking HSCs over time is a localization-based tracking problem, one of the most challenging kinds of tracking problem. The proposed cell tracking system consists of three inter-related stages, discussed in detail in the thesis: (i) cell detection/localization, (ii) association of detected cells across frames, and (iii) background estimation/subtraction.
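
    The association stage matches detections in one frame to detections in the next. As a minimal sketch under simple assumptions (detections summarised by centroids, cost taken as Euclidean distance, a fixed gating radius), the Python below solves the frame-to-frame assignment with the Hungarian algorithm; the thesis's Bayesian model-based association is more elaborate than this.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(prev_centroids, curr_centroids, max_dist=20.0):
    """Match detected cells across two frames by minimising total
    Euclidean cost (Hungarian algorithm). Returns paired index arrays
    (prev_idx, curr_idx); matches farther apart than `max_dist` pixels
    are rejected, leaving those cells as candidate exits/entries.
    """
    # Pairwise distance matrix between the two sets of detections.
    diff = prev_centroids[:, None, :] - curr_centroids[None, :, :]
    cost = np.linalg.norm(diff, axis=2)
    rows, cols = linear_sum_assignment(cost)
    keep = cost[rows, cols] <= max_dist  # gate implausible matches
    return rows[keep], cols[keep]
```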

    Directional edge and texture representations for image processing

    An efficient representation for natural images is of fundamental importance in image processing and analysis. Commonly used separable transforms such as wavelets are not well suited to images because they cannot exploit directional regularities such as edges and oriented textural patterns, while most recently proposed directional schemes cannot represent these two types of feature in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the case in which the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage, and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) is also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, to the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature. Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.
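
    The MFT-based analysis rests on examining localised frequency spectra, in which a linear feature concentrates energy along a line through the spectrum's origin. As a fixed-scale toy stand-in, the sketch below computes windowed FFT magnitudes over image blocks; the block size and Hann window are assumptions, and the actual MFT additionally varies the window scale across resolution levels.

```python
import numpy as np

def local_magnitude_spectrum(image, block=32):
    """Tile a 2-D image into blocks, apply a 2-D Hann window, and
    return each block's centred FFT magnitude, keyed by the block's
    top-left corner. Oriented structure in a block shows up as energy
    concentrated along one direction in its spectrum.
    """
    win = np.hanning(block)[:, None] * np.hanning(block)[None, :]
    h, w = image.shape
    spectra = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            patch = image[y:y + block, x:x + block] * win
            spectra[(y, x)] = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
    return spectra
```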

    Signal processing algorithms for enhanced image fusion performance and assessment

    The dissertation presents several signal processing algorithms for image fusion in noisy multimodal conditions. It introduces a novel image fusion method which performs well for image sets heavily corrupted by noise. As opposed to current image fusion schemes, the method requires no a priori knowledge of the noise component. The image is decomposed with Chebyshev polynomials (CP) used as basis functions to perform fusion at feature level. The properties of CP, namely fast convergence and smooth approximation, render it ideal for heuristic and indiscriminate denoising fusion tasks. Quantitative evaluation using objective fusion assessment methods shows favourable performance of the proposed scheme compared to previous efforts on image fusion, notably for heavily corrupted images. The approach is further improved by combining the advantages of CP with a state-of-the-art fusion technique based on independent component analysis (ICA), for joint fusion processing based on region saliency. Whilst CP fusion is robust under severe noise conditions, it is prone to eliminating high-frequency information from the images involved, thereby limiting image sharpness. Fusion using ICA, on the other hand, performs well in transferring edges and other salient features of the input images into the composite output. The combination of both methods, coupled with several mathematical morphological operations in an algorithm fusion framework, is considered a viable solution. Again, according to the quantitative metrics, the results of our proposed approach are very encouraging as far as joint fusion and denoising are concerned. Another focus of this dissertation is a novel metric for image fusion evaluation that is based on texture. The conservation of background textural detail is considered important in many fusion applications, as it helps define image depth and structure, which may prove crucial in many surveillance and remote sensing applications. Our work aims to evaluate the performance of image fusion algorithms based on their ability to retain textural details through the fusion process. This is done by utilising the gray-level co-occurrence matrix (GLCM) model to extract second-order statistical features for the derivation of an image textural measure, which is then used to replace the edge-based calculations in an objective fusion metric. Performance evaluation on established fusion methods verifies that the proposed metric is viable, especially for multimodal scenarios.
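
    The textural measure starts from exactly the GLCM statistics named above. A minimal sketch of that feature-extraction step, using scikit-image's co-occurrence routines: the particular offsets, angles, and the four properties chosen here are assumptions, and the thesis's metric goes on to substitute such a measure for the edge terms of an objective fusion metric.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_texture_features(image, distances=(1,), angles=(0, np.pi / 2)):
    """Second-order texture statistics from a grey-level co-occurrence
    matrix, averaged over the given pixel offsets. `image` is a 2-D
    uint8 array; the returned vector is the kind of textural measure
    that can replace gradient strength in a fusion-quality metric.
    """
    glcm = graycomatrix(image, distances=list(distances),
                        angles=list(angles), levels=256,
                        symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.array([graycoprops(glcm, p).mean() for p in props])
```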

    Automated Complexity-Sensitive Image Fusion

    To construct a complete representation of a scene with environmental obstacles such as fog, smoke, darkness, or textural homogeneity, multisensor video streams captured in different modalities are considered. A computational method for automatically fusing multimodal image streams into a highly informative and unified stream is proposed. The method consists of the following steps:
    1. Image registration is performed to align video frames in the visible band over time, adapting to the nonplanarity of the scene by automatically subdividing the image domain into regions approximating planar patches.
    2. Wavelet coefficients are computed for each of the input frames in each modality.
    3. Corresponding regions and points are compared using spatial and temporal information across various scales.
    4. Decision rules based on the results of multimodal image analysis are used to combine the wavelet coefficients from different modalities.
    5. The combined wavelet coefficients are inverted to produce an output frame containing useful information gathered from the available modalities.
    Experiments show that the proposed system is capable of producing fused output containing the characteristics of color visible-spectrum imagery while adding information exclusive to infrared imagery, with attractive visual and informational properties.
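
    Steps 2, 4, and 5 amount to decompose, merge, and invert. The sketch below does this for one registered frame pair with PyWavelets, using a simple max-absolute-coefficient rule as the decision stage; that rule, the wavelet, and the level count are assumptions standing in for the paper's multiscale spatio-temporal decision rules.

```python
import numpy as np
import pywt

def fuse_pair(a, b):
    """Keep, at each position, the coefficient with larger magnitude."""
    return np.where(np.abs(a) >= np.abs(b), a, b)

def wavelet_fuse(visible, infrared, wavelet="db2", level=3):
    """Fuse two registered single-channel frames: decompose both
    (step 2), merge coefficients band by band (step 4), and invert
    the combined decomposition (step 5).
    """
    ca = pywt.wavedec2(visible.astype(float), wavelet, level=level)
    cb = pywt.wavedec2(infrared.astype(float), wavelet, level=level)
    fused = [fuse_pair(ca[0], cb[0])]  # coarse approximation band
    for (ha, va, da), (hb, vb, db) in zip(ca[1:], cb[1:]):
        # horizontal / vertical / diagonal detail bands per level
        fused.append((fuse_pair(ha, hb), fuse_pair(va, vb), fuse_pair(da, db)))
    return pywt.waverec2(fused, wavelet)
```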

    Joint Spatial-Angular Sparse Coding, Compressed Sensing, and Dictionary Learning for Diffusion MRI

    Neuroimaging provides a window into the inner workings of the human brain to diagnose and prevent neurological diseases and understand biological brain function, anatomy, and psychology. Diffusion Magnetic Resonance Imaging (dMRI) is an emerging medical imaging modality used to study the anatomical network of neurons in the brain, which form cohesive bundles, or fiber tracts, that connect various parts of the brain. Since about 73% of the brain is water, measuring the flow, or diffusion, of water molecules in the presence of fiber bundles allows researchers to estimate the orientation of fiber tracts and reconstruct the internal wiring of the brain, in vivo. Diffusion MRI signals can be modeled within two domains: the spatial domain, consisting of voxels in a brain volume, and the diffusion or angular domain, where fiber orientation is estimated in each voxel. Researchers aim to estimate the probability distribution of fiber orientation in every voxel of a brain volume in order to trace paths of fiber tracts from voxel to voxel over the entire brain. Therefore, the traditional framework for dMRI processing and analysis has been from a voxel-wise vantage point with spatial regularization added post-hoc. In contrast, we propose a new joint spatial-angular representation of dMRI data which pairs the signal in each voxel with its global spatial environment, jointly. This has the ability to improve many aspects of dMRI processing and analysis and to re-envision the core representation of dMRI data from a local perspective to a global one. In this thesis, we propose three main contributions which take advantage of such joint spatial-angular representations to improve major machine learning tasks applied to dMRI: sparse coding, compressed sensing, and dictionary learning. First, we show that we can achieve sparser representations of dMRI by utilizing a global spatial-angular dictionary instead of a purely voxel-wise angular dictionary. As dMRI data is very large in size, we provide a number of novel extensions to popular sparse coding algorithms that perform efficient optimization on a global scale by exploiting the separability of our dictionaries over the spatial and angular domains. Next, compressed sensing is used to accelerate signal acquisition based on an underlying sparse representation of the data. We show that our proposed representation has the potential to push the limits of current scanner acceleration within a new compressed sensing model for dMRI. Finally, sparsity can be further increased by learning dictionaries directly from datasets of interest. Prior dictionary learning methods for dMRI learn angular dictionaries alone. Our third contribution is to learn spatial-angular dictionaries jointly from dMRI data directly, to better represent the global structure. Traditionally, the problem of dictionary learning is non-convex with no guarantees of finding a globally optimal solution. We derive the first theoretical results of global optimality for this class of dictionary learning problems. We hope the core foundation of a joint spatial-angular representation will open a new perspective on dMRI with respect to many other processing tasks and analyses. In addition, our contributions are applicable to any general signal types that can benefit from separable dictionaries. We hope the contributions in this thesis may be adopted in the larger signal processing, computer vision, and machine learning communities.
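
    The efficiency claim rests on the separability of the joint dictionary: for a spatial-angular dictionary of Kronecker form D = Gamma ⊗ Psi, applying D reduces to two small matrix products via the identity (Gamma ⊗ Psi) vec(C) = vec(Psi C Gammaᵀ), so the full Kronecker matrix never has to be formed. A small NumPy check of that identity, with toy sizes chosen as an assumption so the naive product fits in memory (at real dMRI scale it would not):

```python
import numpy as np

rng = np.random.default_rng(0)
Psi = rng.standard_normal((8, 12))     # angular (per-voxel) dictionary
Gamma = rng.standard_normal((10, 15))  # spatial dictionary
C = rng.standard_normal((12, 15))      # joint coefficient matrix

# Naive: build the Kronecker matrix explicitly (infeasible at scale).
naive = np.kron(Gamma, Psi) @ C.flatten(order="F")  # column-major vec
# Separable: two small matrix products, same result.
separable = (Psi @ C @ Gamma.T).flatten(order="F")
assert np.allclose(naive, separable)
```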

    Sparse image approximation with application to flexible image coding

    Natural images are often modeled through piecewise-smooth regions. Region edges, which correspond to the contours of objects, become, in this model, the main information of the signal. Contours have the property of being smooth functions along the direction of the edge, with irregularities in the perpendicular direction. Modeling edges with the minimum possible number of terms is of key importance for numerous applications, such as image coding, segmentation, or denoising. Standard separable bases fail to provide sparse enough representations of contours, because bases of this kind do not exploit the regularity of edges. In order to detect this regularity, a new method is needed, based on (possibly redundant) sets of basis functions able to capture the geometry of images. This thesis presents, in a first stage, a study of the features that basis functions should have in order to provide sparse representations of a piecewise-smooth image. This study emphasizes the need for edge-adapted basis functions, capable of accurately capturing local orientation and anisotropic scaling of image structures. The need for different anisotropy degrees and orientations in the basis function set leads to the use of redundant dictionaries. However, redundant dictionaries have the drawback that sparse image decompositions are no longer unique, and among all possible decompositions of a signal in a redundant dictionary, only the sparsest is wanted. Several algorithms can find sparse decompositions over redundant dictionaries, but most do not guarantee that the optimal approximation has been recovered. To cope with this problem, a mathematical study of the properties of sparse approximations is performed. From this, a test to check whether a given sparse approximation is the sparsest is provided. The second part of this thesis presents a novel image approximation scheme based on the use of a redundant dictionary. This scheme gives a good approximation of an image with a number of terms much smaller than the dimension of the signal. The novel approximation scheme is based on a dictionary formed by a combination of anisotropically refined and rotated wavelet-like mother functions and Gaussians. An efficient Full Search Matching Pursuit algorithm to perform the image decomposition over such a dictionary is designed. Finally, a geometric image coding scheme based on the image approximated over the anisotropic and rotated dictionary of basis functions is designed, and the coding performance of this dictionary is studied. Coefficient quantization appears to be of crucial importance in the design of a Matching Pursuit based coding scheme. Thus, a quantization scheme for the MP coefficients has been designed, based on the theoretical energy upper bound of the MP algorithm and empirical observations of the coefficient distribution and evolution. Thanks to this quantization, our image coder provides low to medium bit-rate image approximations, while allowing on-the-fly resolution switching and several other affine image transformations to be performed directly in the transformed domain.
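
    The decomposition engine here is Matching Pursuit: greedily pick the dictionary atom most correlated with the current residual, subtract its contribution, and repeat. A minimal sketch of that loop, flattened to 1-D vectors for clarity; the thesis applies it to 2-D images over a dictionary of anisotropically refined, rotated atoms, and its Full Search variant and quantization stage are not shown.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_terms=50):
    """Greedy Matching Pursuit over a redundant dictionary.

    `dictionary` is an (n_samples, n_atoms) array assumed to have
    unit-norm columns. Returns chosen atom indices, coefficients,
    and the final residual; the approximation is the sum of
    coeffs[i] * dictionary[:, atoms[i]].
    """
    residual = signal.astype(float).copy()
    atoms, coeffs = [], []
    for _ in range(n_terms):
        correlations = dictionary.T @ residual
        k = int(np.argmax(np.abs(correlations)))  # best-matching atom
        c = correlations[k]
        residual -= c * dictionary[:, k]          # peel it off
        atoms.append(k)
        coeffs.append(c)
    return np.array(atoms), np.array(coeffs), residual
```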

    Unsupervised multi-scale change detection from SAR imagery for monitoring natural and anthropogenic disasters

    Thesis (Ph.D.) University of Alaska Fairbanks, 2017. Radar remote sensing can play a critical role in operational monitoring of natural and anthropogenic disasters. Despite its all-weather capabilities and its high performance in mapping and monitoring change, the application of radar remote sensing in operational monitoring activities has been limited. This has largely been due to: (1) the historically high costs associated with obtaining radar data; (2) slow data processing and delivery procedures; and (3) the limited temporal sampling provided by spaceborne radar satellites. Recent advances in the capabilities of spaceborne Synthetic Aperture Radar (SAR) sensors have created an environment that now allows SAR to make significant contributions to disaster monitoring. New SAR processing strategies that can take full advantage of these new sensor capabilities are currently being developed. Hence, with this PhD dissertation, I aim to: (i) investigate unsupervised change detection techniques that can reliably extract change signatures from time series of SAR images and provide the flexibility needed for application to a variety of natural and anthropogenic hazard situations; (ii) investigate effective methods to reduce the effects of speckle and other noise on change detection performance; (iii) automate change detection algorithms using probabilistic Bayesian inferencing; and (iv) ensure that the developed technology is applicable to current and future SAR sensors to maximize temporal sampling of a hazardous event. This is achieved by developing new algorithms that rely on image amplitude information only, the sole image parameter that is available for every SAR acquisition. The motivation and implementation of the change detection concept are described in detail in Chapter 3. In the same chapter, I demonstrate the technique's performance using synthetic data as well as a real-data application to mapping wildfire progression. I applied Radiometric Terrain Correction (RTC) to the data to increase the sampling frequency, while the developed multiscale-driven approach reliably identified changes embedded in largely stationary background scenes. With this technique, I was able to identify the extent of burn scars with high accuracy. I then applied the change detection technology to oil spill mapping. The analysis highlights that the approach described in Chapter 3 can be applied to this drastically different change detection problem with little modification. While the core of the change detection technique remained unchanged, I modified the pre-processing step to enable change detection in scenes with continuously varying background. I introduced the Lipschitz regularity (LR) transformation as a technique to normalize the typically dynamic ocean surface, facilitating high-performance oil spill detection independent of the environmental conditions during image acquisition. For instance, I show that LR processing reduces the sensitivity of change detection performance to variations in surface winds, a known limitation in oil spill detection from SAR. Finally, I applied the change detection technique to aufeis flood mapping along the Sagavanirktok River. Due to the complex nature of aufeis-flooded areas, I substituted curvelet filters for the resolution-preserving speckle filter used in Chapter 3. In addition to validating the performance of the change detection results, I also provide evidence of the wealth of information that can be extracted about aufeis flooding events once a time series of change detection results has been extracted from SAR imagery. A summary of the developed change detection techniques and suggestions for future work are presented in Chapter 6.
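
    The dissertation's premise is that change detection must work from amplitude alone. A common amplitude-only baseline, shown below as a hedged sketch, is the log-ratio of two co-registered acquisitions, locally averaged to tame speckle and then thresholded; the window size and k-sigma threshold are assumptions, and the dissertation's actual method replaces this single-scale test with a multiscale, Bayesian-automated analysis.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def log_ratio_change(amp_before, amp_after, window=5, k=3.0):
    """Baseline amplitude-only change detection for a co-registered
    SAR image pair: log-ratio, local averaging to suppress speckle,
    then a global k-sigma threshold. Returns a boolean change mask.
    """
    eps = 1e-6  # guard against zero-amplitude pixels
    lr = np.log((amp_after + eps) / (amp_before + eps))
    lr = uniform_filter(lr, size=window)   # crude speckle reduction
    z = (lr - lr.mean()) / lr.std()        # standardise the ratio image
    return np.abs(z) > k
```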