
    Multimodal feature fusion for video forgery detection


    Detection of partial occlusions of assembled components to simplify the disassembly tasks

    An automatic disassembly cell requires a computer vision system for the recognition and localization of products and each of their components. The detection of occlusions adds information to the knowledge base used to identify components and products and to generate a trustworthy and precise relational model (a generic graph of hierarchical relations among the different components that make up the product). In this paper, a method to detect partial occlusions in assembled components is presented. The method is based on the fusion of region and edge information, and it offers a degree of simplification for the recognition and modelling of the disassembly tasks for the set of components which compose the product. The proposed approach to detect regions is a hybrid of the RGB and HSV spaces. The bi-dimensional V/S histogram is employed to select appropriate thresholds, which helps to diminish the influence of highlights and shadows in the images. The goal of this paper is to present an approach for the detection of occlusions in assembled components based on a combination of the HSV and RGB spaces, a bi-dimensional histogram and an edge detector. This work was funded by the Spanish MCYT project “DESAURO: Desensamblado Automático Selectivo para Reciclado mediante Robots Cooperativos y Sistema Multisensorial” (MCYT DPI2002-02103)
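
    The use of a bi-dimensional V/S histogram to pick thresholds can be pictured with a short sketch. The code below is an illustrative outline only, not the authors' implementation: the bin count, the highlight/shadow cut-offs and the function name vs_histogram_mask are placeholder choices.

```python
import cv2
import numpy as np

def vs_histogram_mask(bgr_image, bins=64):
    """Illustrative sketch: build a 2D Value/Saturation histogram from the HSV
    representation of an RGB image, and mask out pixels likely to be highlights
    (high V, low S) or shadows (low V) before region segmentation."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    s = hsv[:, :, 1]
    v = hsv[:, :, 2]

    # Bi-dimensional V/S histogram (V on one axis, S on the other).
    hist_vs, v_edges, s_edges = np.histogram2d(
        v.ravel(), s.ravel(), bins=bins, range=[[0, 256], [0, 256]]
    )

    # Hypothetical threshold choice: very bright, unsaturated pixels are treated
    # as highlights, very dark pixels as shadows; both are removed from the mask.
    highlight = (v > 230) & (s < 30)
    shadow = v < 25
    mask = ~(highlight | shadow)
    return hist_vs, mask
```

    In practice the cut-offs would be chosen from the peaks and valleys of hist_vs rather than fixed constants; the fixed values above only make the mechanics concrete.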

    The Optimisation of Elementary and Integrative Content-Based Image Retrieval Techniques

    Image retrieval plays a major role in many image processing applications. However, a number of factors (e.g. rotation, non-uniform illumination, noise and lack of spatial information) can disrupt the outputs of image retrieval systems such that they cannot produce the desired results. In recent years, many researchers have introduced different approaches to overcome this problem. Colour-based CBIR (content-based image retrieval) and shape-based CBIR were the most commonly used techniques for obtaining image signatures. Although the colour histogram and shape descriptor have produced satisfactory results for certain applications, they still suffer from many theoretical and practical problems, a prominent one being the well-known “curse of dimensionality”. In this research, a new Fuzzy Fusion-based Colour and Shape Signature (FFCSS) approach for integrating colour-only and shape-only features has been investigated to produce an effective image feature vector for database retrieval. The proposed technique is based on an optimised fuzzy colour scheme and robust shape descriptors. Experimental tests were carried out to check the behaviour of the FFCSS-based system, including the sensitivity and robustness of the proposed signature of the sampled images, especially under varied conditions of rotation, scaling, noise and light intensity. To further improve the retrieval efficiency of the devised signature model, the target image repositories were clustered into several groups using the k-means clustering algorithm at system runtime, with the search beginning at the centre of each cluster. The FFCSS-based approach has proven superior to other benchmarked classic CBIR methods; hence this research makes a substantial contribution towards the corresponding theoretical and practical fronts
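
    The runtime clustering step can be sketched as follows: cluster the stored signatures with k-means, then compare a query against the cluster centres before searching inside the nearest clusters. This is a hedged outline of the general idea only; the function names, Euclidean distance and parameter values are illustrative assumptions, not the FFCSS system's settings.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_index(signatures, n_clusters=8, seed=0):
    """Cluster the repository's image signatures (one feature vector per row)."""
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
    labels = km.fit_predict(signatures)
    return km, labels

def retrieve(query, signatures, km, labels, top_clusters=2, k=5):
    """Return indices of the k nearest signatures, searching only the clusters
    whose centres are closest to the query."""
    # Rank clusters by the distance from the query to each centre.
    centre_dist = np.linalg.norm(km.cluster_centers_ - query, axis=1)
    nearest = np.argsort(centre_dist)[:top_clusters]

    # Search only inside the closest clusters.
    candidates = np.where(np.isin(labels, nearest))[0]
    dists = np.linalg.norm(signatures[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]
```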

    Medical Diagnosis with Multimodal Image Fusion Techniques

    Image fusion is an effective approach used to draw out all the significant information from the source images, which supports experts in evaluation and quick decision making. Multimodal medical image fusion produces a composite fused image from various sources to improve quality and extract complementary information. It is extremely challenging to gather every piece of information needed using just one imaging method; therefore, images obtained from different modalities are fused. Additional clinical information can be gleaned through the fusion of several types of medical image pairings. This study's main aim is to present a thorough review of medical image fusion techniques, covering the steps in the fusion process, the levels of fusion, the various imaging modalities with their pros and cons, and the major scientific difficulties encountered in the area of medical image fusion. This paper also summarizes the fusion quality assessment metrics. The various approaches used by image fusion algorithms presently available in the literature are classified into four broad categories: i) Spatial fusion methods, ii) Multiscale Decomposition based methods, iii) Neural Network based methods and iv) Fuzzy Logic based methods. The benefits and pitfalls of the existing literature are explored and future insights are suggested. Moreover, this study is anticipated to create a solid platform for the development of better fusion techniques in medical applications
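
    As a point of reference for the first category, a spatial-domain fusion can be as simple as a pixel-wise weighted average of two co-registered source images. The sketch below is a generic illustration of that idea, not a method proposed in the reviewed literature; the weight, the 8-bit range and the CT/MRI pairing mentioned in the comment are assumptions.

```python
import numpy as np

def weighted_average_fusion(img_a, img_b, w_a=0.5):
    """Minimal spatial-domain fusion: a pixel-wise weighted average of two
    co-registered, same-sized source images (e.g. a CT slice and an MRI slice)."""
    a = img_a.astype(np.float32)
    b = img_b.astype(np.float32)
    fused = w_a * a + (1.0 - w_a) * b
    return np.clip(fused, 0, 255).astype(np.uint8)
```

    More capable methods in the other categories replace the fixed weight with multiscale, learned or fuzzy rules, but the fuse-then-reconstruct structure stays the same.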

    A novel lip geometry approach for audio-visual speech recognition

    By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. Various methods have been studied by research groups around the world to incorporate lip movements into speech recognition in recent years; however, exactly how best to incorporate the additional visual information is still not known. This study aims to extend the knowledge of relationships between visual and speech information, specifically using lip geometry information due to its robustness to head rotation and the fewer number of features required to represent movement. A new method has been developed to extract lip geometry information, to perform classification and to integrate the visual and speech modalities. This thesis makes several contributions. First, this work presents a new method to extract lip geometry features using the combination of a skin colour filter, a border following algorithm and a convex hull approach. The proposed method was found to improve lip shape extraction performance compared to existing approaches. Lip geometry features including height, width, ratio, area, perimeter and various combinations of these features were evaluated to determine which performs best when representing speech in the visual domain. Second, a novel template matching technique able to adapt to dynamic differences in the way words are uttered by speakers has been developed, which determines the best fit of an unseen feature signal to those stored in a database template. Third, following an evaluation of integration strategies, a novel method has been developed based on an alternative decision fusion strategy, in which the outcome from the visual and speech modalities is chosen by measuring the quality of the audio based on kurtosis and skewness analysis and driven by white noise confusion. Finally, the performance of the new methods introduced in this work is evaluated using the CUAVE and LUNA-V data corpora under a range of different signal-to-noise ratio conditions using the NOISEX-92 dataset
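
    The lip geometry extraction step can be illustrated with a rough OpenCV pipeline: a colour filter, border following via contour extraction, a convex hull, and simple geometry measurements. This is a sketch of the general approach only; the HSV bounds, the choice of the largest contour and the feature set are illustrative assumptions rather than the thesis's tuned method.

```python
import cv2
import numpy as np

def lip_geometry_features(bgr_frame):
    """Illustrative pipeline: colour filter in HSV, border following via
    cv2.findContours, convex hull, then simple geometry features."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)

    # Rough reddish-hue mask intended to isolate the lip region (placeholder bounds).
    mask = cv2.inRange(hsv, (0, 60, 60), (15, 255, 255))

    # cv2.findContours implements a border-following algorithm.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    lip = max(contours, key=cv2.contourArea)

    # A convex hull smooths the extracted boundary before measuring it.
    hull = cv2.convexHull(lip)
    x, y, w, h = cv2.boundingRect(hull)

    return {
        "height": h,
        "width": w,
        "ratio": h / w if w else 0.0,
        "area": cv2.contourArea(hull),
        "perimeter": cv2.arcLength(hull, True),
    }
```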

    Iris recognition based on probability density functions and averaging

    In this thesis, the basic concepts of iris recognition and a new algorithm using probability distribution functions are introduced. The first section familiarizes the reader with the most important facts needed to grasp the idea behind different iris recognition algorithms. The second part introduces some more or less popular algorithms. Daugman's proposed method is the most widely used one nowadays and most real-life applications take advantage of it; however, it is computationally rather complex. There is also the conventional principal component analysis (PCA) approach, which creates eigenirises out of the initial database, and a method proposed by Anbarjafari et al. that uses the HSI colour space and majority voting to make the decision. The third part of the thesis proposes a novel iris recognition algorithm based on the mean rule. The algorithm converts iris images from the traditional RGB colour space to HSI and YCbCr and creates probability distribution functions (PDFs) from the channels H, S, Y, Cb and Cr for both the left and right iris. The Kullback-Leibler divergence (KLD) is used as the metric to calculate the difference between the corresponding channels. The recognition process involves calculating KLD values for all the channels of the left and right irises (i.e. 10 channels in total) and then using the mean rule to average them, so the probability that errors made by individual channels are compensated is quite high. To test the algorithm, the UPOL database was used, which includes three samples of both the left and right iris for 64 people. Even though the algorithm achieved a 100% recognition rate for both the left and right iris, there are theoretically several ways to enhance the performance even further, such as using a weighted average while calculating the KLD value
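
    A compact sketch of the matching stage might look as follows, assuming OpenCV and NumPy. OpenCV's HSV and YCrCb conversions stand in for the HSI and YCbCr spaces named above, the histogram settings and smoothing constant are illustrative, and the sketch covers a single eye, whereas the described algorithm averages KLD values over the ten channels of both irises.

```python
import cv2
import numpy as np

def channel_pdfs(bgr_iris, bins=256):
    """Normalised histograms (empirical PDFs) of the H, S, Y, Cb and Cr channels."""
    hsv = cv2.cvtColor(bgr_iris, cv2.COLOR_BGR2HSV)
    ycrcb = cv2.cvtColor(bgr_iris, cv2.COLOR_BGR2YCrCb)  # OpenCV orders Y, Cr, Cb
    channels = {
        "H": hsv[:, :, 0], "S": hsv[:, :, 1],
        "Y": ycrcb[:, :, 0], "Cb": ycrcb[:, :, 2], "Cr": ycrcb[:, :, 1],
    }
    pdfs = {}
    for name, ch in channels.items():
        hist, _ = np.histogram(ch, bins=bins, range=(0, 256), density=True)
        pdfs[name] = hist + 1e-12  # avoid zeros before taking logarithms
    return pdfs

def kld(p, q):
    """Kullback-Leibler divergence D(p || q) between two discrete PDFs."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def mean_rule_score(probe_pdfs, gallery_pdfs):
    """Average the per-channel KLD values (the mean rule); lower means closer."""
    return float(np.mean([kld(probe_pdfs[c], gallery_pdfs[c]) for c in probe_pdfs]))
```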

    JERS-1 SAR and LANDSAT-5 TM image data fusion: An application approach for lithological mapping

    Satellite image data fusion is a set of image processing procedures used either to optimise imagery for visual photointerpretation or for automated thematic classification with a low error rate and high accuracy. Lithological mapping using remote sensing image data relies on the spectral and textural information of the rock units of the area to be mapped; these pieces of information can be derived from the Landsat optical TM and JERS-1 SAR images respectively. Prior to extracting such information (spectral and textural) and fusing it together, geometric image co-registration between the TM and the SAR, atmospheric correction of the TM, and SAR despeckling are required. In this thesis, an appropriate atmospheric model is developed and implemented utilising the dark pixel subtraction method for atmospheric correction. For SAR despeckling, an efficient new method is also developed to test whether the SAR filter used removes the textural information or not. For image optimisation for visual photointerpretation, a new method of spectral coding of the six bands of the optical TM data is developed. The new spectral coding method is used to produce an efficient colour composite with high separability between the spectral classes, similar to that obtained if all six optical TM bands are used together. This spectrally coded colour composite is used as the spectral component, which is then fused with the textural component represented by the despeckled JERS-1 SAR using fusion tools including the colour transform and the PCT. The Grey Level Co-occurrence Matrix (GLCM) technique is used to build the textural data set from the speckle-filtered JERS-1 SAR data, yielding seven textural GLCM measures. For automated thematic mapping, and using both the six TM spectral bands and the seven textural GLCM measures, a new method of classification has been developed using the Maximum Likelihood Classifier (MLC). The method, named sequential maximum likelihood classification, works efficiently by comparing the classified textural pixels, the classified spectral pixels and the classified textural-spectral pixels, and provides the means of utilising both textural and spectral information for automated lithological mapping
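
    The GLCM texture step can be illustrated with recent scikit-image, which exposes a grey-level co-occurrence matrix and several standard measures. The window size, quantisation, distances, angles and the particular six measures below are illustrative choices, not necessarily the seven measures or the settings used in the thesis.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_texture_measures(sar_window):
    """Compute a handful of GLCM texture measures for one despeckled SAR window."""
    # Quantise to 32 grey levels to keep the co-occurrence matrix small.
    levels = 32
    img = (sar_window.astype(np.float64) / sar_window.max() * (levels - 1)).astype(np.uint8)

    # Co-occurrence matrix for unit pixel offsets in two directions.
    glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)

    props = ["contrast", "dissimilarity", "homogeneity", "energy", "correlation", "ASM"]
    return {p: float(graycoprops(glcm, p).mean()) for p in props}
```

    Applying this in a sliding window over the despeckled SAR scene produces one textural band per measure, which can then be stacked with the TM spectral bands for classification.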

    Enhancing spatial resolution of remotely sensed data for mapping freshwater environments

    Freshwater environments are important for ecosystem services and biodiversity. These environments are subject to many natural and anthropogenic changes which influence their quality; therefore, regular monitoring is required for their effective management. High biotic heterogeneity, elongated land/water interaction zones, and logistic difficulties with access make field-based monitoring on a large scale expensive, inconsistent and often impractical. Remote sensing (RS) is an established mapping tool that overcomes these barriers. However, complex and heterogeneous vegetation and spectral variability due to water make freshwater environments challenging to map using remote sensing technology. Satellite images available for New Zealand were reviewed in terms of cost and of spectral and spatial resolution. Particularly promising image data sets for freshwater mapping include QuickBird (QB) and SPOT-5. However, for mapping freshwater environments a combination of images is required to obtain high spatial, spectral, radiometric and temporal resolution.

    Data fusion (DF) is a framework of data processing tools and algorithms that combines images to improve their spectral and spatial qualities. A range of DF techniques were reviewed and tested for performance using panchromatic and multispectral QB images of a semi-aquatic environment on the southern shores of Lake Taupo, New Zealand. In order to discuss the mechanics of the different DF techniques, a classification consisting of three groups was used: (i) spatially-centric, (ii) spectrally-centric and (iii) hybrid. Subtract resolution merge (SRM) is a hybrid technique, and this research demonstrated that for a semi-aquatic QB image it outperformed Brovey transformation (BT), principal component substitution (PCS), local mean and variance matching (LMVM), and optimised high pass filter addition (OHPFA). However, some limitations were identified with SRM, including the requirement for predetermined band weights and the over-representation of spatial edges in the NIR bands due to their high spectral variance.

    This research developed three modifications to the SRM technique that addressed these limitations. These were tested on QB, SPOT-5 and Vexcel aerial digital images, as well as a scanned coloured aerial photograph. A visual qualitative assessment and a range of spectral and spatial quantitative metrics were used to evaluate the modifications, including spectral correlation and root mean squared error (RMSE), Sobel filter based spatial edge RMSE, and unsupervised classification. The first modification addressed the issue of predetermined spectral weights and explored two alternative regression methods, Least Absolute Deviation (LAD) and Ordinary Least Squares (OLS), to derive image-specific band weights for use in SRM. Both methods were found to be equally effective; however, OLS was preferred as it was more efficient than LAD in computing the band weights. The second modification used a pixel block averaging function on high resolution panchromatic images to derive spatial edges for data fusion. This eliminated the need for spectral band weights, minimised spectral infidelity, and enabled the fusion of multi-platform data. The third modification addressed the issue of over-represented spatial edges by introducing a sophisticated contrast and luminance index to develop a new normalising function. This improved the spatial representation of the NIR band, which is particularly important for mapping vegetation.

    A combination of the second and third modifications of SRM was effective in simultaneously minimising the overall spectral infidelity and the undesired spatial errors for the NIR band of the fused image. This new method has been labelled Contrast and Luminance Normalised (CLN) data fusion, and has been demonstrated to make a significant contribution to the fusion of multi-platform, multi-sensor, multi-resolution and multi-temporal data. This contributes to improvements in the classification and monitoring of freshwater environments using remote sensing
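
    The pixel-block-averaging idea behind the second modification can be sketched in a few lines: block-average the panchromatic band, treat the difference from the original as the spatial-edge component, and inject it into an upsampled multispectral band. This is a simplified illustration under the assumption that the image dimensions divide evenly by the resolution ratio; it is not the exact SRM or CLN formulation, and the nearest-neighbour upsampling is a placeholder choice.

```python
import numpy as np

def block_average(pan, block):
    """Average the panchromatic band over non-overlapping block x block windows,
    then expand back to the full grid; pan - block_average(pan) approximates
    the high-frequency spatial edges."""
    h, w = pan.shape
    assert h % block == 0 and w % block == 0, "sketch assumes divisible dimensions"
    means = pan.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    return np.kron(means, np.ones((block, block)))

def fuse_band(ms_band, pan, ratio):
    """Inject panchromatic edge detail into one upsampled multispectral band."""
    pan = pan.astype(np.float64)
    edges = pan - block_average(pan, ratio)

    # Nearest-neighbour upsampling of the coarse multispectral band to the pan grid.
    ms_up = np.kron(ms_band.astype(np.float64), np.ones((ratio, ratio)))
    return ms_up + edges
```

    Because the edge component is derived purely from the panchromatic image, no per-band spectral weights are needed, which is the property the second modification exploits.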