144 research outputs found

    Vector extension of monogenic wavelets for geometric representation of color images

    No full text
    14 pagesInternational audienceMonogenic wavelets offer a geometric representation of grayscale images through an AM/FM model allowing invariance of coefficients to translations and rotations. The underlying concept of local phase includes a fine contour analysis into a coherent unified framework. Starting from a link with structure tensors, we propose a non-trivial extension of the monogenic framework to vector-valued signals to carry out a non marginal color monogenic wavelet transform. We also give a practical study of this new wavelet transform in the contexts of sparse representations and invariant analysis, which helps to understand the physical interpretation of coefficients and validates the interest of our theoretical construction

    Food Recognition and Volume Estimation in a Dietary Assessment System

    Full text link
    Recently obesity has become an epidemic and one of the most serious worldwide public health concerns of the 21st century. Obesity diminishes the average life expectancy and there is now convincing evidence that poor diet, in combination with physical inactivity are key determinants of an individual s risk of developing chronic diseases such as cancer, cardiovascular disease or diabetes. Assessing what people eat is fundamental to establishing the link between diet and disease. Food records are considered the best approach for assessing energy intake. However, this method requires literate and highly motivated subjects. This is a particular problem for adolescents and young adults who are the least likely to undertake food records. The ready access of the majority of the population to mobile phones (with integrated camera, improved memory capacity, network connectivity and faster processing capability) has opened up new opportunities for dietary assessment. The dietary information extracted from dietary assessment provide valuable insights into the cause of diseases that greatly helps practicing dietitians and researchers to develop subsequent approaches for mounting intervention programs for prevention. In such systems, the camera in the mobile phone is used for capturing images of food consumed and these images are then processed to automatically estimate the nutritional content of the food. However, food objects are deformable objects that exhibit variations in appearance, shape, texture and color so the food classification and volume estimation in these systems suffer from lower accuracy. The improvement of the food recognition accuracy and volume estimation accuracy are challenging tasks. This thesis presents new techniques for food classification and food volume estimation. For food recognition, emphasis was given to texture features. The existing food recognition techniques assume that the food images will be viewed at similar scales and from the same viewpoints. However, this assumption fails in practical applications, because it is difficult to ensure that a user in a dietary assessment system will put his/her camera at the same scale and orientation to capture food images as that of the target food images in the database. A new scale and rotation invariant feature generation approach that applies Gabor filter banks is proposed. To obtain scale and rotation invariance, the proposed approach identifies the dominant orientation of the filtered coefficient and applies a circular shifting operation to place this value at the first scale of dominant direction. The advantages of this technique are it does not require the scale factor to be known in advance and it is scale/and rotation invariant separately and concurrently. This approach is modified to achieve improved accuracy by applying a Gaussian window along the scale dimension which reduces the impact of high and low frequencies of the filter outputs enabling better matching between the same classes. Besides automatic classification, semi automatic classification and group classification are also considered to have an idea about the improvement. To estimate the volume of a food item, a stereo pair is used to recover the structure as a 3D point cloud. A slice based volume estimation approach is proposed that converts the 3D point cloud to a series of 2D slices. The proposed approach eliminates the problem of knowing the distance between two cameras with the help of disparities and depth information from a fiducial marker. The experimental results show that the proposed approach can provide an accurate estimate of food volume

    A VISION-BASED QUALITY INSPECTION SYSTEM FOR FABRIC DEFECT DETECTION AND CLASSIFICATION

    Get PDF
    Published ThesisQuality inspection of textile products is an important issue for fabric manufacturers. It is desirable to produce the highest quality goods in the shortest amount of time possible. Fabric faults or defects are responsible for nearly 85% of the defects found by the garment industry. Manufacturers recover only 45 to 65% of their profits from second or off-quality goods. There is a need for reliable automated woven fabric inspection methods in the textile industry. Numerous methods have been proposed for detecting defects in textile. The methods are generally grouped into three main categories according to the techniques they use for texture feature extraction, namely statistical approaches, spectral approaches and model-based approaches. In this thesis, we study one method from each category and propose their combinations in order to get improved fabric defect detection and classification accuracy. The three chosen methods are the grey level co-occurrence matrix (GLCM) from the statistical category, the wavelet transform from the spectral category and the Markov random field (MRF) from the model-based category. We identify the most effective texture features for each of those methods and for different fabric types in order to combine them. Using GLCM, we identify the optimal number of features, the optimal quantisation level of the original image and the optimal intersample distance to use. We identify the optimal GLCM features for different types of fabrics and for three different classifiers. Using the wavelet transform, we compare the defect detection and classification performance of features derived from the undecimated discrete wavelet and those derived from the dual-tree complex wavelet transform. We identify the best features for different types of fabrics. Using the Markov random field, we study the performance for fabric defect detection and classification of features derived from different models of Gaussian Markov random fields of order from 1 through 9. For each fabric type we identify the best model order. Finally, we propose three combination schemes of the best features identified from the three methods and study their fabric detection and classification performance. They lead generally to improved performance as compared to the individual methods, but two of them need further improvement

    Pattern Recognition

    Get PDF
    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional, the processing is done in real- time or takes hours and days, some systems look for one narrow object class, others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition

    Directional edge and texture representations for image processing

    Get PDF
    An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets axe not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns; while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method into the situation when the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) are also proposed in order to represent textures. The MPCT can be regarded as real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the disperson of the magnitude spectrum, on the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature. Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations

    Sonar image interpretation for sub-sea operations

    Get PDF
    Mine Counter-Measure (MCM) missions are conducted to neutralise underwater explosives. Automatic Target Recognition (ATR) assists operators by increasing the speed and accuracy of data review. ATR embedded on vehicles enables adaptive missions which increase the speed of data acquisition. This thesis addresses three challenges; the speed of data processing, robustness of ATR to environmental conditions and the large quantities of data required to train an algorithm. The main contribution of this thesis is a novel ATR algorithm. The algorithm uses features derived from the projection of 3D boxes to produce a set of 2D templates. The template responses are independent of grazing angle, range and target orientation. Integer skewed integral images, are derived to accelerate the calculation of the template responses. The algorithm is compared to the Haar cascade algorithm. For a single model of sonar and cylindrical targets the algorithm reduces the Probability of False Alarm (PFA) by 80% at a Probability of Detection (PD) of 85%. The algorithm is trained on target data from another model of sonar. The PD is only 6% lower even though no representative target data was used for training. The second major contribution is an adaptive ATR algorithm that uses local sea-floor characteristics to address the problem of ATR robustness with respect to the local environment. A dual-tree wavelet decomposition of the sea-floor and an Markov Random Field (MRF) based graph-cut algorithm is used to segment the terrain. A Neural Network (NN) is then trained to filter ATR results based on the local sea-floor context. It is shown, for the Haar Cascade algorithm, that the PFA can be reduced by 70% at a PD of 85%. Speed of data processing is addressed using novel pre-processing techniques. The standard three class MRF, for sonar image segmentation, is formulated using graph-cuts. Consequently, a 1.2 million pixel image is segmented in 1.2 seconds. Additionally, local estimation of class models is introduced to remove range dependent segmentation quality. Finally, an A* graph search is developed to remove the surface return, a line of saturated pixels often detected as false alarms by ATR. The A* search identifies the surface return in 199 of 220 images tested with a runtime of 2.1 seconds. The algorithm is robust to the presence of ripples and rocks

    Computers in Biology and Medicine / Survey on computer aided decision support for diagnosis of celiac disease

    Get PDF
    Celiac disease (CD) is a complex autoimmune disorder in genetically predisposed individuals of all age groups triggered by the ingestion of food containing gluten. A reliable diagnosis is of high interest in view of embarking on a strict gluten-free diet, which is the CD treatment modality of first choice. The gold standard for diagnosis of CD is currently based on a histological confirmation of serology, using biopsies performed during upper endoscopy. Computer aided decision support is an emerging option in medicine and endoscopy in particular. Such systems could potentially save costs and manpower while simultaneously increasing the safety of the procedure. Research focused on computer-assisted systems in the context of automated diagnosis of CD has started in 2008. Since then, over 40 publications on the topic have appeared. In this context, data from classical flexible endoscopy as well as wireless capsule endoscopy (WCE) and confocal laser endomicrosopy (CLE) has been used. In this survey paper, we try to give a comprehensive overview of the research focused on computer-assisted diagnosis of CD.FWF 24366(VLID)223161

    Automated Complexity-Sensitive Image Fusion

    Get PDF
    To construct a complete representation of a scene with environmental obstacles such as fog, smoke, darkness, or textural homogeneity, multisensor video streams captured in diferent modalities are considered. A computational method for automatically fusing multimodal image streams into a highly informative and unified stream is proposed. The method consists of the following steps: 1. Image registration is performed to align video frames in the visible band over time, adapting to the nonplanarity of the scene by automatically subdividing the image domain into regions approximating planar patches 2. Wavelet coefficients are computed for each of the input frames in each modality 3. Corresponding regions and points are compared using spatial and temporal information across various scales 4. Decision rules based on the results of multimodal image analysis are used to combine thewavelet coefficients from different modalities 5. The combined wavelet coefficients are inverted to produce an output frame containing useful information gathered from the available modalities Experiments show that the proposed system is capable of producing fused output containing the characteristics of color visible-spectrum imagery while adding information exclusive to infrared imagery, with attractive visual and informational properties

    Directional multiresolution image representations

    Get PDF
    Efficient representation of visual information lies at the foundation of many image processing tasks, including compression, filtering, and feature extraction. Efficiency of a representation refers to the ability to capture significant information of an object of interest in a small description. For practical applications, this representation has to be realized by structured transforms and fast algorithms. Recently, it has become evident that commonly used separable transforms (such as wavelets) are not necessarily best suited for images. Thus, there is a strong motivation to search for more powerful schemes that can capture the intrinsic geometrical structure of pictorial information. This thesis focuses on the development of new "true" two-dimensional representations for images. The emphasis is on the discrete framework that can lead to algorithmic implementations. The first method constructs multiresolution, local and directional image expansions by using non-separable filter banks. This discrete transform is developed in connection with the continuous-space curvelet construction in harmonic analysis. As a result, the proposed transform provides an efficient representation for two-dimensional piecewise smooth signals that resemble images. The link between the developed filter banks and the continuous-space constructions is set up in a newly defined directional multiresolution analysis. The second method constructs a new family of block directional and orthonormal transforms based on the ridgelet idea, and thus offers an efficient representation for images that are smooth away from straight edges. Finally, directional multiresolution image representations are employed together with statistical modeling, leading to powerful texture models and successful image retrieval systems

    Introduction to frames

    Get PDF
    This survey gives an introduction to redundant signal representations called frames. These representations have recently emerged as yet another powerful tool in the signal processing toolbox and have become popular through use in numerous applications. Our aim is to familiarize a general audience with the area, while at the same time giving a snapshot of the current state-of-the-art
    corecore