29 research outputs found

    Colour-Texture Fusion In Image Segmentation For Content-Based Image Retrieval Systems

    With the advances in computer technologies and the popularity of the World Wide Web, the volume of digital images has grown rapidly. In parallel with this growth, content-based image retrieval (CBIR) has become a fast-growing research area in recent years. Image segmentation is an important pre-processing step which has a great influence on the performance of CBIR systems. In this research, a novel image segmentation framework, dedicated to region queries in CBIR, is presented. The underlying technique is based on the fusion of colour and texture features by a modified fuzzy c-means clustering (FCM) algorithm.
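    A minimal sketch of the kind of colour-texture fusion with fuzzy c-means described above. The feature choice (Lab colour plus a couple of Gabor texture responses) and the plain FCM update are illustrative assumptions; the thesis's specific modification to FCM is not reproduced here.

```python
import numpy as np
from skimage import color, filters

def pixel_features(rgb):
    """Fuse colour and texture: stack Lab channels with Gabor responses per pixel."""
    lab = color.rgb2lab(rgb)
    gray = color.rgb2gray(rgb)
    textures = [filters.gabor(gray, frequency=f)[0] for f in (0.1, 0.25)]
    feats = np.dstack([lab] + textures)
    return feats.reshape(-1, feats.shape[-1])

def fcm(X, c=3, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means; returns (centroids, membership matrix U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        centroids = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # distances of every pixel feature to every cluster centre
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + 1e-9
        U = 1.0 / d ** (2 / (m - 1))
        U /= U.sum(axis=1, keepdims=True)
    return centroids, U

# Segmentation labels for an H x W image (consider downsampling large images first):
# centroids, U = fcm(pixel_features(rgb_image)); labels = U.argmax(axis=1).reshape(H, W)
```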

    Image Area Reduction for Efficient Medical Image Retrieval

    Content-based image retrieval (CBIR) has been one of the most active areas in medical image analysis in the last two decades because of the steady increase in the number of digital images used. Efficient diagnosis and treatment planning can be supported by developing retrieval systems to provide high-quality healthcare. Extensive research has attempted to improve image retrieval efficiency. The critical factors when searching large databases are time and storage requirements. In general, although many methods have been suggested to increase accuracy, fast retrieval has been investigated only sporadically. In this thesis, two different approaches are proposed to reduce both the time and space requirements of medical image retrieval. The IRMA data set is used to validate the proposed methods. Both methods utilise Local Binary Pattern (LBP) histogram features extracted from 14,410 X-ray images of the IRMA dataset. The first method is image folding, which operates on salient regions in an image. Saliency is determined by a context-aware saliency algorithm, and the image is folded accordingly. After the folding process, the reduced image area is used to extract multi-block and multi-scale LBP features and to classify these features with a multi-class support vector machine (SVM). The other method combines classification with distance-based feature similarity. Images are first classified into general classes using LBP features, and retrieval is then performed within the class to locate the most similar images. Between the classification and retrieval steps, LBP features are pruned by using the error histogram of a shallow (n/p/n) autoencoder to quantify the retrieval relevance of image blocks: if a block is relevant, the autoencoder gives a large error when decoding it, so by examining the autoencoder error of image blocks, irrelevant regions can be detected and eliminated. To calculate similarity within general classes, the distance between the LBP features of the relevant regions is computed. The results show that retrieval time can be reduced and storage requirements lowered without a significant decrease in accuracy.
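    A hedged sketch of the LBP-histogram / SVM pipeline described above: multi-scale uniform LBP histograms, classification into a general class, then nearest-neighbour retrieval inside that class. The (P, R) scales, the chi-square-style distance and the RBF kernel are illustrative assumptions, not the values tuned on the IRMA data, and the saliency-folding and autoencoder pruning steps are omitted.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def multiscale_lbp_histogram(image, scales=((8, 1), (16, 2), (24, 3))):
    """Concatenate uniform-LBP histograms computed at several (P, R) scales."""
    hists = []
    for P, R in scales:
        codes = local_binary_pattern(image, P, R, method="uniform")
        hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
        hists.append(hist)
    return np.concatenate(hists)

def retrieve(query_feat, clf, db_feats, db_classes, k=5):
    """Classify the query, then rank same-class images by a chi-square-like distance."""
    cls = clf.predict(query_feat[None, :])[0]
    idx = np.where(db_classes == cls)[0]
    d = ((db_feats[idx] - query_feat) ** 2 /
         (db_feats[idx] + query_feat + 1e-10)).sum(axis=1)
    return idx[np.argsort(d)[:k]]

# clf = SVC(kernel="rbf").fit(train_feats, train_labels)   # multi-class SVM classifier
```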

    Development of a texture analysis application for magnetic resonance images in the MATLAB environment

    This thesis was based on the need to develop a generic software application frame for texture analysis of magnetic resonance (MR) images. In collaboration with the research group at the department of Medical Imaging Centre and Hospital Pharmacy (MICHP) at Tampere University Hospital (TAUH), the goal was to improve the user experience and workflow, as well as to implement a completely new user interface and key functionalities. The platform was required to be capable enough to handle image processing algorithms and to provide a high-level, easily modifiable software architecture. Since the research group had years of experience with the open-source, texture-analysis-oriented MaZda software, the focus of this thesis was to analyse and resolve the restrictions observed when using MaZda. MATLAB was chosen as the programming platform due to its high-level syntax and powerful built-in capabilities, e.g. the Image Processing Toolbox (IPT), which provide proficient support for computationally demanding processes. Another advantage of MATLAB was its interface support for languages such as Fortran, C and C++. MATLAB being a commercial software platform, it was acknowledged that achieving a standalone end product would not be possible. Computational performance was also left outside the scope of this thesis, not only due to MATLAB's limitations but also to keep the scale contained. The improvement suggestions provided by the research group were treated as a rough specification for the software to be implemented. These requirements included extensibility in terms of texture analysis algorithms and a simplified user interface to improve the workflow. Selecting MATLAB as the programming environment extended the group of people capable of contributing to the tool in the future. Implementing the frame from scratch allowed the texture analysis parameters and features to be fully configurable instead of static. The modular visual structure of the software allows the user to switch between image sets more easily, and removing the region of interest (ROI) limitation ensures that the same image set can be utilised more efficiently. The implemented MATLAB application provides a basic frame for a more convenient medical image processing workflow for texture analysis of MR images, but further testing and development are required to complete the tool.
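    For illustration only, a small Python analogue (the thesis itself targets MATLAB and the IPT) of the general idea of configurable texture-analysis parameters applied inside a user-defined ROI. Function names, the GLCM properties and the parameter defaults are assumptions, not taken from the application described above.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def roi_texture_features(image_u8, roi_mask, distances=(1, 2), angles=(0, np.pi / 2),
                         props=("contrast", "homogeneity", "energy")):
    """Compute configurable GLCM texture properties on the bounding box of a ROI."""
    ys, xs = np.nonzero(roi_mask)                         # ROI defined by a binary mask
    patch = image_u8[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    glcm = graycomatrix(patch, distances=distances, angles=angles,
                        levels=256, symmetric=True, normed=True)
    return {p: graycoprops(glcm, p).ravel() for p in props}
```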

    Bridging semantic gap: learning and integrating semantics for content-based retrieval

    Digital cameras have entered ordinary homes and produced an incredibly large number of photos. As a typical example of a broad image domain, unconstrained consumer photos vary significantly. Unlike professional or domain-specific images, the objects in the photos are ill-posed, occluded, and cluttered, with poor lighting, focus, and exposure. Content-based image retrieval research has yet to bridge the semantic gap between computable low-level information and high-level user interpretation. In this thesis, we address the issue of the semantic gap with a structured learning framework that allows modular extraction of visual semantics. Semantic image regions (e.g. face, building, sky) are learned statistically, detected directly from the image without segmentation, reconciled across multiple scales, and aggregated spatially to form a compact semantic index. To circumvent the ambiguity and subjectivity in a query, a new query method that allows spatial arrangement of visual semantics is proposed. A query is represented as a disjunctive normal form of visual query terms and processed using fuzzy set operators. A drawback of supervised learning is the manual labeling of regions as training samples. In this thesis, a new learning framework to discover local semantic patterns and to generate their training samples with minimal human intervention has been developed. The discovered patterns can be visualized and used in semantic indexing. In addition, three new class-based indexing schemes are explored. The winner-take-all scheme supports class-based image retrieval. The class-relative scheme and the local classification scheme compute inter-class memberships and local class patterns, respectively, as indexes for similarity matching. A Bayesian formulation is proposed to unify local and global indexes in image comparison and ranking, resulting in superior image retrieval performance over single indexes. Query-by-example experiments on 2,400 consumer photos with 16 semantic queries show that the proposed approaches achieve significantly better (18% to 55%) average precision than a high-dimensional feature-fusion approach. The thesis has paved two promising research directions, namely the semantics design approach and the semantics discovery approach. They form elegant dual frameworks that exploit pattern classifiers in learning and integrating local and global image semantics.
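    A minimal sketch of evaluating a query expressed in disjunctive normal form over fuzzy semantic-region memberships, using the standard min/max fuzzy operators. The operators, label names and index layout are assumptions for illustration; the thesis's actual index may be organised differently.

```python
import numpy as np

def score_dnf(query_dnf, memberships):
    """
    query_dnf:   list of conjunctions, each a list of semantic labels,
                 e.g. [["face", "building"], ["sky"]]  ==  (face AND building) OR sky
    memberships: dict label -> membership grade in [0, 1] for one image
    """
    conj_scores = [min(memberships.get(term, 0.0) for term in conj) for conj in query_dnf]
    return max(conj_scores) if conj_scores else 0.0           # fuzzy OR of fuzzy ANDs

# Rank a collection by descending fuzzy score for a query q:
# ranked = sorted(images, key=lambda im: score_dnf(q, im.memberships), reverse=True)
```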

    Spatial and temporal representations for multi-modal visual retrieval

    This dissertation studies the problem of finding relevant content within a visual collection according to a specific query by addressing three key modalities, depending on the kind of data to be processed: symmetric visual retrieval, asymmetric visual retrieval and cross-modal retrieval. In symmetric visual retrieval, the query object and the elements in the collection are from the same kind of visual data, i.e. images or videos. Inspired by the human visual perception system, we propose new techniques to estimate visual similarity in image-to-image retrieval datasets based on non-metric functions, improving image retrieval performance on top of state-of-the-art methods. On the other hand, asymmetric visual retrieval is the problem in which queries and elements in the dataset are from different types of visual data. We propose methods to aggregate the temporal information of video segments so that image-video comparisons can be computed using similarity functions. When compared on image-to-video retrieval datasets, our algorithms drastically reduce memory storage while maintaining high accuracy rates. Finally, we introduce new solutions for cross-modal retrieval, which is the task in which either the queries or the elements in the collection are non-visual objects. In particular, we study text-image retrieval in the domain of art by introducing new models for semantic art understanding, obtaining results close to human performance. Overall, this thesis advances the state of the art in visual retrieval by presenting novel solutions for some of the key tasks in the field. The contributions derived from this work have potential direct applications in the era of big data, as visual datasets are growing exponentially every day and new techniques for storing, accessing and managing large-scale visual collections are required.
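    A hedged sketch of the asymmetric (image-to-video) setting above: per-frame descriptors of a video segment are pooled into one vector so that image-video similarity reduces to a dot product. Mean pooling is only the simplest member of the temporal-aggregation family the dissertation explores, used here as an assumed baseline.

```python
import numpy as np

def aggregate_segment(frame_descriptors):
    """frame_descriptors: (n_frames, dim) L2-normalised frame descriptors.
    Returns one L2-normalised vector representing the whole segment."""
    pooled = frame_descriptors.mean(axis=0)
    return pooled / (np.linalg.norm(pooled) + 1e-12)

def image_to_video_scores(image_desc, segment_vectors):
    """Cosine similarity between one image descriptor and all pooled segments."""
    return segment_vectors @ image_desc

# Storing one pooled vector per segment (instead of every frame) is what yields
# the memory savings: ranking is then segment_vectors @ image_desc, highest first.
```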

    Providing effective memory retrieval cues through automatic structuring and augmentation of a lifelog of images

    Lifelogging is an area of research concerned with digitally capturing many aspects of an individual's life, and within this rapidly emerging field lies the significant challenge of managing images passively captured by an individual of their daily life. Possible applications vary from helping those with neurodegenerative conditions recall events from memory, to the maintenance and augmentation of extensive image collections of a tourist's trips. However, a large lifelog of images can accumulate quickly, with an average of 700,000 images captured each year using a device such as the SenseCam. We address the problem of managing this vast collection of personal images by investigating automatic techniques that:
    1. Identify distinct events within a full day of lifelog images (which typically consists of 2,000 images), e.g. breakfast, working on PC, meeting, etc.
    2. Find similar events to a given event in a person's lifelog, e.g. "show me other events where I was in the park".
    3. Determine those events that are more important or unusual to the user and select a relevant keyframe image for the visual display of an event, e.g. a "meeting" is more interesting to review than "working on PC".
    4. Augment the images from a wearable camera with higher-quality images from external "Web 2.0" sources, e.g. find pictures taken by others of the U2 concert in Croke Park.
    In this dissertation we discuss novel techniques to realise each of these facets and how effective they are. The significance of this work is of benefit not only to the lifelogging community, but also to cognitive psychology researchers studying the potential benefits of lifelogging devices to those with neurodegenerative diseases.
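    A hedged sketch of facet 1 above: splitting a day of lifelog images into events by thresholding the dissimilarity between consecutive image descriptors. The dissertation combines richer visual and sensor cues than this; the descriptor choice and threshold here are illustrative assumptions.

```python
import numpy as np

def segment_events(descriptors, threshold=0.35):
    """descriptors: (n_images, dim) L2-normalised features, in capture order.
    Returns a list of (start, end) index ranges, one per detected event."""
    sims = (descriptors[:-1] * descriptors[1:]).sum(axis=1)    # cosine similarity of neighbours
    boundaries = np.where(1.0 - sims > threshold)[0] + 1       # a boundary starts a new event
    edges = [0, *boundaries.tolist(), len(descriptors)]
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]

# events = segment_events(day_features)   # e.g. [(0, 312), (312, 840), ...]
```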

    Collaborative design and feasibility assessment of computational nutrient sensing for simulated food-intake tracking in a healthcare environment

    One in four older adults (65 years and over) is living with some form of malnutrition. This increases their odds of hospitalization four-fold and is associated with decreased quality of life and increased mortality. In long-term care (LTC), residents have more complex care needs, and the proportion affected is a staggering 54%, primarily due to low intake. Tracking intake is important for monitoring whether residents are meeting their nutritional needs; however, current methods are time-consuming, subjective, and prone to large margins of error. This reduces the utility of tracked data and makes it challenging to identify at-risk individuals in a timely fashion. While technologies exist for tracking food intake, they have not been designed for use within the LTC context and place a large time burden on the user. Especially in light of the machine learning boom, there is great opportunity to harness learnings from this domain and apply them to the field of nutrition for enhanced food-intake tracking. Additionally, current approaches to food-intake tracking are limited by the nutritional database to which they are linked, making generalizability a challenge. Drawing inspiration from current methods, the desires of end-users (primary users: personal support workers, registered staff, dietitians), and machine learning approaches suitable for this context in which limited data are available, we investigated novel methods for assessing needs in this environment and imagine an alternative approach. We leveraged image processing and machine learning to remove subjectivity while increasing accuracy and precision to support higher-quality food-intake tracking. This thesis presents the ideation, design, development, evaluation and feasibility assessment of collaboratively designed computational nutrient sensing for simulated food-intake tracking in the LTC environment. We sought to remove potential barriers to uptake through collaborative design and ongoing end-user engagement, developing solution concepts for a novel Automated Food Imaging and Nutrient Intake Tracking (AFINI-T) system while implementing the technology in parallel. More specifically, we demonstrated the effectiveness of applying a modified participatory iterative design process, modeled on the Google Sprint framework, in the LTC context, which identified priority areas and established functional criteria for usability and feasibility. Concurrently, we developed the novel AFINI-T system through the co-integration of image processing and machine learning, guided by the application of food-intake tracking in LTC, to address three questions: (1) where is there food? (i.e., food segmentation) and (2) how much food was consumed? (i.e., volume estimation), using a fully automatic imaging system for quantifying food intake. We proposed a novel deep convolutional encoder-decoder food network with depth-refinement (EDFN-D) using an RGB-D camera for quantifying a plate's remaining food volume relative to reference portions in whole and modified-texture foods. To determine (3) what foods are present (i.e., feature extraction and classification), we developed a convolutional autoencoder to learn meaningful food-specific features, and developed classifiers that leverage a priori information about when certain foods would be offered and the level of texture modification prescribed, to apply real-world constraints of LTC.
    We sought to address real-world complexity by assessing a wide variety of food items through the construction of a simulated food-intake dataset emulating various degrees of food intake and modified textures (regular, minced, puréed). To ensure that feasibility-related barriers to uptake were mitigated, we conducted a feasibility assessment using the collaboratively designed prototype. Finally, this thesis explores the feasibility of applying biophotonic principles to food as a first step towards enhancing food-database estimates. Motivated by a theoretical optical dilution model, a novel deep neural network (DNN) was evaluated for estimating the relative nutrient density of commercially prepared purées. For deeper analysis, we describe the link between color and two optically active nutrients, vitamin A and anthocyanins, and suggest it may be feasible to utilize optical properties of foods to enhance nutritional estimation. This research demonstrates a transdisciplinary approach to designing and implementing a novel food-intake tracking system which addresses several shortcomings of the current method. Upon translation, this system may provide additional insights for supporting more timely nutritional interventions through enhanced monitoring of nutritional intake status among LTC residents.
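    A hedged sketch of the volume-estimation step (question 2 above): given a food segmentation mask and a depth map from an RGB-D camera, per-pixel heights above an empty-plate reference are integrated into a volume, which is then compared with the reference (full) portion. In the actual AFINI-T pipeline the mask and refined depth come from the learned EDFN-D network; here both are assumed to be given, and the pixel-footprint conversion is an illustrative simplification.

```python
import numpy as np

def remaining_volume(depth_plate, depth_food, food_mask, pixel_area_cm2):
    """Approximate remaining food volume (cm^3) as sum of per-pixel height x pixel area.
    depth_plate / depth_food: depth maps (cm) of the empty plate and the current plate."""
    height_cm = np.clip(depth_plate - depth_food, 0, None)    # food rises above the plate
    return float((height_cm * food_mask).sum() * pixel_area_cm2)

def intake_fraction(vol_reference, vol_remaining):
    """Fraction of the reference portion that was consumed."""
    return max(0.0, 1.0 - vol_remaining / vol_reference)
```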

    3D multiresolution statistical approaches for accelerated medical image and volume segmentation

    Medical volume segmentation has attracted many researchers, and many techniques have been implemented for medical imaging, including segmentation and other imaging processes. This research focuses on the implementation of a segmentation system which uses several techniques, together or on their own, to segment medical volumes; the system takes a stack of 2D slices or full 3D volumes acquired from medical scanners as data input. Two main approaches have been implemented in this research for segmenting medical volumes: multi-resolution analysis and statistical modeling. Multi-resolution analysis has mainly been employed for extracting features. Higher dimensions of discontinuity (line or curve singularities) have been extracted from medical images using modified multi-resolution analysis transforms such as the ridgelet and curvelet transforms. The second approach implemented in this thesis is the use of statistical modeling in medical image segmentation; Hidden Markov models have been enhanced here to segment medical slices automatically, accurately and reliably, with lossless results. The problem with using Markov models, however, is the long computational time. This has been addressed by feature reduction techniques, which have also been implemented in this thesis. Several feature reduction and dimensionality reduction techniques, including Principal Component Analysis, Gaussian pyramids and other methods, have been used to accelerate the slowest block in the proposed system. The feature reduction techniques have been employed efficiently with the 3D volume segmentation techniques, such as 3D wavelets and 3D Hidden Markov models. The system has been tested and validated using several procedures, starting with a comparison against predefined results, continuing with validation by specialists, and ending with a survey filled in by the end users covering the techniques and the results. The conclusion is that the Markovian model segmentation results outperformed all other techniques in most patient cases. The curvelet transform also produced promising segmentation results; the end users rated it above the Markovian models because of the long time required by the Hidden Markov models.
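    A hedged sketch of the feature-extraction and feature-reduction stage described above: a single-level 3D wavelet decomposition of a volume followed by PCA on the per-voxel sub-band features, which would then feed the (much slower) Hidden Markov model stage. The ridgelet/curvelet variants, the Gaussian pyramid option and the HMM itself are not reproduced; the wavelet and the number of retained components are assumptions.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def wavelet_pca_features(volume, wavelet="db1", n_components=4):
    """volume: 3D ndarray. Returns a (n_coarse_voxels, n_components) feature matrix."""
    coeffs = pywt.dwtn(volume, wavelet)                # 8 sub-bands: 'aaa' ... 'ddd'
    subbands = np.stack([coeffs[k] for k in sorted(coeffs)], axis=-1)
    X = subbands.reshape(-1, subbands.shape[-1])       # one feature row per coarse voxel
    return PCA(n_components=n_components).fit_transform(X)

# reduced = wavelet_pca_features(ct_volume)   # compact features for the segmentation stage
```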