
    Use of Coherent Point Drift in computer vision applications

    This thesis presents the novel use of Coherent Point Drift (CPD) to improve the robustness of a number of computer vision applications. The CPD approach registers two point sets under either a rigid or a non-rigid transformation model. The key characteristic of a rigid transformation is that the distance between points is preserved, so it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations, including affine transforms, additionally provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set; the CPD method finds both the transformation and the correspondence between the two point sets at the same time, without requiring an a priori declaration of the transformation model used.

    The first part of this thesis focuses on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented, which concentrates on the video analysis side rather than the audio analysis that is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives when the face detection algorithm fails to perform.

    The second part of the thesis focuses on multi-exposure and multi-focus image fusion with compensation for camera shake. The Scale Invariant Feature Transform (SIFT) is first used to detect keypoints in the images being fused. This point set is then reduced to remove outliers using RANSAC (RANdom SAmple Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts.
    The thesis evaluates the performance of the algorithm in comparison with a number of state-of-the-art approaches, including the key commercial products available on the market at present, showing significantly improved subjective quality in the fused images.

    The final part of the thesis presents a novel approach to Vehicle Make and Model Recognition (VMMR) in CCTV video footage. CPD is used to effectively remove the skew of detected vehicles, as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approach angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results show that the proposed system achieves an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
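The rigid variant of CPD described in the abstract above can be viewed as an EM-style loop: an E-step computes soft correspondences from Gaussian affinities, and an M-step solves a weighted Procrustes problem for the rotation and translation. The following is a minimal, self-contained numpy sketch of that idea; the function names, the deterministic annealing schedule, and the fixed unit scale are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def rigid_update(X, Y, P):
    """Closed-form rigid (R, t) update given soft correspondences P (M x N).

    X: (N, D) fixed point set; Y: (M, D) moving point set.
    Solves the weighted Procrustes problem of the CPD M-step (scale fixed to 1).
    """
    Np = P.sum()
    mu_x = P.sum(axis=0) @ X / Np           # weighted centroid of X
    mu_y = P.sum(axis=1) @ Y / Np           # weighted centroid of Y
    A = (X - mu_x).T @ P.T @ (Y - mu_y)     # D x D weighted cross-covariance
    U, _, Vt = np.linalg.svd(A)
    C = np.eye(A.shape[0])
    C[-1, -1] = np.linalg.det(U @ Vt)       # guard against reflections
    R = U @ C @ Vt
    t = mu_x - R @ mu_y
    return R, t

def register_rigid(X, Y, iters=60, anneal=0.93):
    """Annealed soft-assignment registration of Y onto X (CPD-style E/M loop)."""
    T = Y.copy()
    sigma2 = ((X[None, :, :] - T[:, None, :]) ** 2).sum(-1).mean()  # wide start
    for _ in range(iters):
        d2 = ((X[None, :, :] - T[:, None, :]) ** 2).sum(-1)         # (M, N)
        # E-step: column-normalised Gaussian affinities (stabilised per column)
        P = np.exp(-(d2 - d2.min(axis=0)) / (2 * sigma2))
        P /= P.sum(axis=0, keepdims=True) + 1e-12
        R, t = rigid_update(X, Y, P)                                # M-step
        T = Y @ R.T + t
        sigma2 *= anneal                                            # annealing
    return T, R, t
```

With hard one-to-one correspondences (P an identity matrix) the M-step reduces to the classical Kabsch solution and recovers an exact rigid transform; the annealing simply lets the soft assignments sharpen as the point sets come into alignment.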

    Video and Imaging, 2013-2016


    Video surveillance systems: current status and future trends

    This survey attempts to document the present status of video surveillance systems. The main components of a surveillance system are presented and studied thoroughly. Algorithms for image enhancement, object detection, object tracking, object recognition and item re-identification are presented. The most common modalities utilized by surveillance systems are discussed, with emphasis on video, in terms of available resolutions and new imaging approaches such as High Dynamic Range video. The most important features and analytics are presented, along with the most common approaches for image/video quality enhancement. Distributed computational infrastructures (Cloud, Fog and Edge Computing) are discussed, describing the advantages and disadvantages of each approach. The most important deep learning algorithms are presented, along with the smart analytics they utilize. Augmented reality and the role it can play in a surveillance system are reported, before discussing the challenges and future trends of surveillance.

    Detection of Salient Objects in Images Using Frequency Domain and Deep Convolutional Features

    In image processing and computer vision tasks such as segmentation of objects of interest, adaptive image compression, object-based image retrieval, seam carving, and medical imaging, the cost of information storage and computational complexity is generally a great concern. Therefore, for these and other applications, identifying and focusing only on the parts of the image that are visually most informative is highly desirable. These most informative parts, or regions, which also have more contrast with the rest of the image, are called the salient regions of the image, and the process of identifying them is referred to as salient object detection. The main challenges in devising a salient object detection scheme are extracting the image features that correctly differentiate the salient objects from the non-salient ones, and then utilizing those features to detect the salient objects accurately. Several salient object detection methods have been developed in the literature using spatial domain image features. However, these methods generally cannot detect the salient objects uniformly or with clear boundaries between the salient and non-salient regions, because unnecessary frequency content of the image gets retained or useful content from the original image gets suppressed. Frequency domain features can address these limitations by providing a better representation of the image. Some salient object detection schemes have been developed based on features extracted using the Fourier or Fourier-like transforms. While these methods are more successful in detecting the entire salient object in images with small salient regions, in images with large salient regions they tend to highlight the boundaries of the salient region rather than the entire region. This is because, in the Fourier transform of an image, the global contrast is more dominant than the local contrast.
    Moreover, the Fourier transform cannot provide simultaneous spatial and frequency localization. Multi-resolution feature extraction techniques can provide more accurate features for different image processing tasks, since features that might not be extracted at one resolution may be detected at another. However, not much work has been done to employ multi-resolution feature extraction techniques for salient object detection. In view of this, the objective of this thesis is to develop schemes for salient object detection using multi-resolution feature extraction techniques in both the frequency domain and the spatial domain. The first part of this thesis is concerned with developing salient object detection methods using multi-resolution frequency domain features. The wavelet transform can perform multi-resolution, simultaneously spatially and frequency localized analysis, which makes it a better feature extraction tool than the Fourier or other Fourier-like transforms. In this part of the thesis, a salient object detection scheme is first developed by extracting features from the high-pass coefficients of the wavelet decompositions of the three color channels of images, and devising a scheme for the weighted linear combination of the color channel features. Despite its advantages in image feature extraction, the wavelet transform is not very effective in capturing line discontinuities, which correspond to directional information in the image. In order to circumvent this lack of directional flexibility, another salient object detection scheme is also presented, which extracts local and global features from the non-subsampled contourlet coefficients of the image color channels.
    The local features are extracted from the local variations of the low-pass coefficients, whereas the global features are obtained from the distribution of the subband coefficients afforded by the directional flexibility of the non-subsampled contourlet transform. In the past few years, there has been a surge of interest in employing deep convolutional neural networks to extract image features for different applications. These networks provide a platform for automatically extracting low-level appearance features and high-level semantic features at different resolutions from raw images. The second part of this thesis is, therefore, concerned with the investigation of salient object detection using multi-resolution deep convolutional features. The existing deep salient object detection schemes are based on the standard convolution. However, performing the standard convolution is computationally expensive, especially when the number of channels increases through the layers of a deep network. In this part of the thesis, using a lightweight depthwise separable convolution, a deep salient object detection network is developed that exploits the fusion of multi-level and multi-resolution image features through judicious skip connections between the layers. The proposed network is aimed at providing good performance with much reduced complexity compared to the existing deep salient object detection methods. Extensive experiments are conducted in order to evaluate the performance of the proposed salient object detection methods by applying them to natural images from several datasets. It is shown that the performance of the proposed methods is superior to that of the existing methods of salient object detection.
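As a toy illustration of the first idea above, pooling high-pass wavelet energy over the color channels at several resolutions, the following numpy sketch uses a hand-rolled one-level Haar transform. The plain unweighted sum over channels and the pixel-repetition upsampling are assumptions for brevity; the thesis instead learns a weighted channel combination and later replaces the wavelet with the non-subsampled contourlet transform.

```python
import numpy as np

def haar_level(c):
    """One level of the 2-D orthonormal Haar transform (even-sized input assumed)."""
    a = (c[0::2] + c[1::2]) / np.sqrt(2)            # row low-pass
    d = (c[0::2] - c[1::2]) / np.sqrt(2)            # row high-pass
    LL = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    LH = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    HL = (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2)
    HH = (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2)
    return LL, (LH, HL, HH)

def wavelet_saliency(img, levels=3):
    """Sum high-pass Haar energy over channels and scales as a crude saliency map.

    img: (H, W, 3) float array, H and W divisible by 2**levels; assumes the
    image is not perfectly flat (otherwise the final normalisation divides by 0).
    """
    H, W, _ = img.shape
    sal = np.zeros((H, W))
    for ch in range(3):
        c = img[..., ch].astype(float)
        for lev in range(levels):
            c, (LH, HL, HH) = haar_level(c)         # keep LL for the next level
            e = LH ** 2 + HL ** 2 + HH ** 2         # detail energy at this scale
            rep = 2 ** (lev + 1)
            sal += np.kron(e, np.ones((rep, rep)))[:H, :W]  # back to full size
    return sal / sal.max()
```

On a synthetic image containing a uniform bright square, the map is high along the square's boundary and near zero both in the background and deep inside the square, which mirrors the boundary-highlighting behaviour the abstract attributes to purely global-contrast methods.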

    Visual Quality Assessment and Blur Detection Based on the Transform of Gradient Magnitudes

    Digital imaging and image processing technologies have revolutionized the way in which we capture, store, receive, view, utilize, and share images. In image-based applications, through different processing stages (e.g., acquisition, compression, and transmission), images are subjected to different types of distortions which degrade their visual quality. Image Quality Assessment (IQA) attempts to use computational models to automatically evaluate and estimate image quality in accordance with subjective evaluations. Moreover, with the fast development of computer vision techniques, it is important in practice to extract and understand the information contained in blurred images or regions. The work in this dissertation focuses on reduced-reference visual quality assessment of images and textures, as well as perceptual-based spatially-varying blur detection. A training-free, low-cost Reduced-Reference IQA (RRIQA) method that requires a very small number of reduced-reference (RR) features is proposed. Extensive experiments performed on different benchmark databases demonstrate that the proposed RRIQA method delivers highly competitive performance compared with state-of-the-art RRIQA models for both natural and texture images. In the context of texture, the effect of texture granularity on the quality of synthesized textures is studied, and two RR objective visual quality assessment methods that quantify the perceived quality of synthesized textures are proposed. Performance evaluations on two synthesized texture databases demonstrate that the proposed RR metrics outperform full-reference (FR), no-reference (NR), and RR state-of-the-art quality metrics in predicting the perceived visual quality of the synthesized textures.
    Last but not least, an effective approach is proposed to address the spatially-varying blur detection problem from a single image without requiring any knowledge about the blur type, level, or camera settings. Evaluations of the proposed approach on diverse sets of blurry images with different blur types, levels, and content demonstrate that the proposed algorithm performs favorably against the state-of-the-art methods, both qualitatively and quantitatively. (Doctoral dissertation, Electrical Engineering.)
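The blur detector above is built on a transform of gradient magnitudes. A much-simplified numpy sketch of that idea follows: the image's gradient magnitude is computed, and each patch is scored by the fraction of its spectral energy lying outside a small low-frequency band. The patch size, the 5x5 band, and the use of a plain FFT in place of the dissertation's transform are illustrative assumptions.

```python
import numpy as np

def gradient_magnitude(img):
    """Gradient magnitude via central differences."""
    gy, gx = np.gradient(img.astype(float))
    return np.hypot(gx, gy)

def sharpness_map(img, patch=16):
    """Per-patch high-frequency energy ratio of the gradient magnitude.

    Returns values in [0, 1]; low values indicate likely-blurred patches.
    """
    gm = gradient_magnitude(img)
    H, W = gm.shape
    out = np.zeros((H // patch, W // patch))
    c = patch // 2                                   # DC position after fftshift
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            p = gm[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch]
            F = np.fft.fftshift(np.abs(np.fft.fft2(p)))
            total = F.sum() - F[c, c]                          # exclude DC
            low = F[c - 2:c + 3, c - 2:c + 3].sum() - F[c, c]  # 5x5 low band
            out[i, j] = (total - low) / (total + 1e-12)
    return out

def box_blur(img, k=5):
    """Simple k x k box blur via shifted sums (circular border handling)."""
    out = np.zeros(img.shape, dtype=float)
    for dy in range(-(k // 2), k // 2 + 1):
        for dx in range(-(k // 2), k // 2 + 1):
            out += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return out / k ** 2
```

Blurring suppresses high-frequency structure in the gradient magnitude, so the same scene yields a uniformly lower map after a box blur, which is the property a spatially-varying detector thresholds per region.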

    Multi-scale approaches for texture description (original title: Abordagens multiescala para descrição de textura)

    Advisors: Hélio Pedrini, William Robson Schwartz. Master's thesis, Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Computer vision and image processing techniques play an important role in several fields, including object detection and image classification, which are very important tasks with applications in medical imagery, remote sensing, forensic analysis, skin detection, among others. These tasks strongly depend on visual information extracted from images that can be used to describe them efficiently. Texture is one of the main properties used to describe information such as spatial distribution, brightness and surface structural arrangements. For image recognition and classification, a large set of texture descriptors was investigated in this work, of which only a small fraction is actually multi-scale. Gray-level co-occurrence matrices (GLCM) have been widely used in the literature and are known to be an effective texture descriptor. However, this descriptor only discriminates information at a single scale, that is, the original image. Scales can offer important information in image analysis, since texture can be perceived as different patterns at distinct scales. To that end, two different strategies for extending the GLCM to multiple scales are presented: (i) a Gaussian scale-space representation, constructed by smoothing the image with a low-pass filter, and (ii) an image pyramid, defined by sampling the image in both space and scale. The texture descriptor is evaluated against others on different data sets. The proposed descriptor is then applied in a skin detection context as a means of improving the accuracy of the detection process.
    Experimental results demonstrate that the multi-scale GLCM extension yields remarkable improvements on the tested data sets, outperforming many other feature descriptors, including the original single-scale GLCM. (Master's thesis, Computer Science.)
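Strategy (i) above, a Gaussian scale-space GLCM, can be sketched in numpy as follows: repeatedly smooth the image and compute co-occurrence features at each scale, then concatenate. The binomial smoothing kernel, the single (0, 1) offset, and the three Haralick-style features are illustrative choices, not the thesis's exact configuration.

```python
import numpy as np

def glcm_features(img, levels=8):
    """Contrast, energy and homogeneity from a GLCM with offset (0, 1).

    img: 2-D array with values in [0, 1].
    """
    q = np.minimum((img * levels).astype(int), levels - 1)   # quantize
    a, b = q[:, :-1].ravel(), q[:, 1:].ravel()               # adjacent pairs
    M = np.zeros((levels, levels))
    np.add.at(M, (a, b), 1)                                  # co-occurrence counts
    P = M / M.sum()                                          # normalised GLCM
    i, j = np.indices(P.shape)
    contrast = (P * (i - j) ** 2).sum()
    energy = (P ** 2).sum()
    homogeneity = (P / (1 + np.abs(i - j))).sum()
    return np.array([contrast, energy, homogeneity])

def smooth(img):
    """Separable [1, 2, 1]/4 binomial filter, a cheap Gaussian approximation."""
    k = np.array([1.0, 2.0, 1.0]) / 4.0
    img = np.apply_along_axis(np.convolve, 0, img, k, mode='same')
    img = np.apply_along_axis(np.convolve, 1, img, k, mode='same')
    return img

def multiscale_glcm(img, scales=3):
    """Concatenate GLCM features over a Gaussian-like scale-space."""
    feats = []
    for _ in range(scales):
        feats.append(glcm_features(img))
        img = smooth(img)                # next, coarser scale
    return np.concatenate(feats)
```

Because smoothing suppresses fine local variation, the contrast feature shrinks from scale to scale while the coarser scales expose larger-pattern structure, which is exactly the extra discriminative information the multi-scale extension is after.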

    Advancements and Breakthroughs in Ultrasound Imaging

    Ultrasonic imaging is a powerful diagnostic tool available to medical practitioners, engineers and researchers today. Owing to its relative safety and non-invasive nature, ultrasonic imaging has become one of the most rapidly advancing technologies. These rapid advances are directly related to parallel advancements in electronics, computing, and transducer technology, together with sophisticated signal processing techniques. This book focuses on state-of-the-art developments in ultrasonic imaging applications and underlying technologies, presented by leading practitioners and researchers from many parts of the world.