13 research outputs found

    Biased classification for relevance feedback in content-based image retrieval.

    Peng, Xiang. Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 98-115). Abstracts in English and Chinese.
    Contents: Abstract; Acknowledgement.
    Chapter 1 --- Introduction: 1.1 Problem Statement; 1.2 Major Contributions; 1.3 Thesis Outline.
    Chapter 2 --- Background Study: 2.1 Content-based Image Retrieval (2.1.1 Image Representation, 2.1.2 High Dimensional Indexing, 2.1.3 Image Retrieval Systems Design); 2.2 Relevance Feedback (2.2.1 Self-Organizing Map in Relevance Feedback, 2.2.2 Decision Tree in Relevance Feedback, 2.2.3 Bayesian Classifier in Relevance Feedback, 2.2.4 Nearest Neighbor Search in Relevance Feedback, 2.2.5 Support Vector Machines in Relevance Feedback); 2.3 Imbalanced Classification; 2.4 Active Learning (2.4.1 Uncertainty-based Sampling, 2.4.2 Error Reduction, 2.4.3 Batch Selection); 2.5 Convex Optimization (2.5.1 Overview of Convex Optimization, 2.5.2 Linear Program, 2.5.3 Quadratic Program, 2.5.4 Quadratically Constrained Quadratic Program, 2.5.5 Cone Program, 2.5.6 Semi-definite Program).
    Chapter 3 --- Imbalanced Learning with BMPM for CBIR: 3.1 Research Motivation; 3.2 Background Review (3.2.1 Relevance Feedback for CBIR, 3.2.2 Minimax Probability Machine, 3.2.3 Extensions of Minimax Probability Machine); 3.3 Relevance Feedback using BMPM (3.3.1 Model Definition, 3.3.2 Advantages of BMPM in Relevance Feedback, 3.3.3 Relevance Feedback Framework by BMPM); 3.4 Experimental Results (3.4.1 Experiment Datasets, 3.4.2 Performance Evaluation, 3.4.3 Discussions); 3.5 Summary.
    Chapter 4 --- BMPM Active Learning for CBIR: 4.1 Problem Statement and Motivation; 4.2 Background Review; 4.3 Relevance Feedback by BMPM Active Learning (4.3.1 Active Learning Concept, 4.3.2 General Approaches for Active Learning, 4.3.3 Biased Minimax Probability Machine, 4.3.4 Proposed Framework); 4.4 Experimental Results (4.4.1 Experiment Setup, 4.4.2 Performance Evaluation); 4.5 Summary.
    Chapter 5 --- Large Scale Learning with BMPM: 5.1 Introduction (5.1.1 Motivation, 5.1.2 Contribution); 5.2 Background Review (5.2.1 Second Order Cone Program, 5.2.2 General Methods for Large Scale Problems, 5.2.3 Biased Minimax Probability Machine); 5.3 Efficient BMPM Training (5.3.1 Proposed Strategy, 5.3.2 Kernelized BMPM and Its Solution); 5.4 Experimental Results (5.4.1 Experimental Testbeds, 5.4.2 Experimental Settings, 5.4.3 Performance Evaluation); 5.5 Summary.
    Chapter 6 --- Conclusion and Future Work: 6.1 Conclusion; 6.2 Future Work.
    Chapter A --- List of Symbols and Notations.
    Chapter B --- List of Publications.
    Bibliography.

    Indexing, learning and content-based retrieval for special purpose image databases

    This chapter deals with content-based image retrieval in special purpose image databases. As image data is amassed ever more effortlessly, building efficient systems for searching and browsing image databases becomes increasingly urgent. We provide an overview of the current state of the art by taking a tour along the entir

    Semantic image retrieval using relevance feedback and transaction logs

    Recent improvements in digital photography and storage capacity have made it possible to store large numbers of images, and efficient means of retrieving images that match a user's query are needed. Content-based Image Retrieval (CBIR) systems automatically extract image contents based on image features, i.e. color, texture, and shape. Relevance feedback methods are applied to CBIR to integrate users' perceptions and reduce the gap between high-level image semantics and low-level image features. This dissertation improves the precision of a CBIR system in retrieving semantically rich (complex) images by making advancements in three areas of the system: input, process, and output. The input of the system includes a mechanism that provides the user with the tools required to build and modify her query through feedback. Users' behavior in CBIR environments is studied, and a new feedback methodology is presented to efficiently capture users' image perceptions. The process element includes image learning and retrieval algorithms. A long-term learning algorithm (LTL), which learns image semantics from prior search results available in the system's transaction history, is developed using Factor Analysis. Another algorithm, a short-term learner (STL) that captures the user's image perceptions based on image features and the user's feedback in the ongoing transaction, is developed based on Linear Discriminant Analysis. A mechanism is then introduced to integrate these two algorithms into one retrieval procedure. Finally, a retrieval strategy that includes learning and searching phases is defined for arranging images in the output of the system. The developed relevance feedback methodology was shown to reduce the effect of human subjectivity in providing feedback for complex images. The retrieval algorithms were applied to images with different degrees of complexity. LTL is efficient in extracting the semantics of complex images that have a history in the system. STL is suitable for queries and images that can be effectively represented by their image features. Therefore, the performance of the system in retrieving images with visual and conceptual complexities improved when both algorithms were applied simultaneously. Finally, the strategy of retrieval phases demonstrated promising results as query complexity increased.

    Interactive content-based image retrieval using relevance feedback


    Using biased support vector machine in image retrieval with self-organizing map.

    Chan Chi Hang. Thesis submitted in: August 2004. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 105-114). Abstracts in English and Chinese.
    Contents: Abstract; Acknowledgement.
    Chapter 1 --- Introduction: 1.1 Problem Statement; 1.2 Major Contributions; 1.3 Publication List; 1.4 Thesis Organization.
    Chapter 2 --- Background Survey: 2.1 Relevance Feedback Framework (2.1.1 Relevance Feedback Types, 2.1.2 Data Distribution, 2.1.3 Training Set Size, 2.1.4 Inter-Query Learning and Intra-Query Learning); 2.2 History of Relevance Feedback Techniques; 2.3 Relevance Feedback Approaches (2.3.1 Vector Space Model, 2.3.2 Ad-hoc Re-weighting, 2.3.3 Distance Optimization Approach, 2.3.4 Probabilistic Model, 2.3.5 Bayesian Approach, 2.3.6 Density Estimation Approach, 2.3.7 Support Vector Machine); 2.4 Presentation Set Selection (2.4.1 Most-probable strategy, 2.4.2 Most-informative strategy).
    Chapter 3 --- Biased Support Vector Machine for Content-Based Image Retrieval: 3.1 Motivation; 3.2 Background (3.2.1 Regular Support Vector Machine, 3.2.2 One-class Support Vector Machine); 3.3 Biased Support Vector Machine; 3.4 Interpretation of parameters in BSVM; 3.5 Soft Label Biased Support Vector Machine; 3.6 Interpretation of parameters in Soft Label BSVM; 3.7 Relevance Feedback Using Biased Support Vector Machine (3.7.1 Advantages of BSVM in Relevance Feedback, 3.7.2 Relevance Feedback Algorithm By BSVM); 3.8 Experiments (3.8.1 Synthetic Dataset, 3.8.2 Real-World Dataset, 3.8.3 Experimental Results); 3.9 Conclusion.
    Chapter 4 --- Self-Organizing Map-based Inter-Query Learning: 4.1 Motivation; 4.2 Algorithm (4.2.1 Initialization and Replication of SOM, 4.2.2 SOM Training for Inter-Query Learning, 4.2.3 Incorporate with Intra-Query Learning); 4.3 Experiments (4.3.1 Synthetic Dataset, 4.3.2 Real-World Dataset, 4.3.3 Experimental Results); 4.4 Conclusion.
    Chapter 5 --- Conclusion.
    Bibliography.

    Multimedia

    The ubiquitous and effortless digital data capture and processing capabilities now offered by most devices have led to an unprecedented penetration of multimedia content into our everyday life. To make the most of this phenomenon, the rapidly increasing volume and usage of digitised content require constant re-evaluation and adaptation of multimedia methodologies to meet the relentlessly changing requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together research studies and surveys from different subfields that highlight these aspects. The main topics of the book include multimedia management in peer-to-peer structures and wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content, and novel multimedia applications.

    Content-Based Image Retrieval Using Self-Organizing Maps


    A picture is worth a thousand words : content-based image retrieval techniques

    In my dissertation I investigate techniques for improving the state of the art in content-based image retrieval. To place my work into context, I highlight the current trends and challenges in my field by analyzing over 200 recent articles. Next, I propose a novel paradigm called artificial imagination, which gives the retrieval system the power to imagine and think along with the user in terms of what she is looking for. I then introduce a new user interface for visualizing and exploring image collections, empowering the user to navigate large collections based on her own needs and preferences, while simultaneously providing her with an accurate sense of what the database has to offer. In the later chapters I present work dealing with millions of images and focus in particular on high-performance techniques that minimize memory and computational use for both near-duplicate image detection and web search. Finally, I show early work on a scene completion-based image retrieval engine, which synthesizes realistic imagery that matches what the user has in mind.
    LEI Universiteit Leiden; NWO; Imagin

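    Marc integrador de les capacitats de Soft-Computing i de Knowledge Discovery dels Mapes Autoorganitzatius en el Raonament Basat en Casos [A framework integrating the Soft-Computing and Knowledge Discovery capabilities of Self-Organizing Maps into Case-Based Reasoning]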

    Case-Based Reasoning (CBR) is a machine learning paradigm that solves new problems by drawing analogies with previously solved ones. The organization, access, and use of this prior knowledge are therefore crucial to its success. However, most real problems involve large volumes of complex, uncertain data with approximate knowledge, so CBR performance can suffer from the complexity of managing such knowledge. For this reason, a new line of research, Soft-Computing and Intelligent Information Retrieval, has emerged in recent years to mitigate these effects. This is the context in which this thesis arose.
    Among the wide range of Soft-Computing techniques for handling complex knowledge, Self-Organizing Maps (SOM) stand out for their ability to group data into patterns, which makes it possible to detect hidden relations in the data. This capability has been exploited in previous work by other researchers, in which the CBR case memory was organized with SOM to improve case retrieval.
    The goal of this thesis is to go a step beyond the simple combination of CBR and SOM by introducing the Soft-Computing and Knowledge Discovery capabilities of SOM into all phases of CBR, so that each phase benefits from the newly discovered knowledge. Furthermore, complexity measures appear in this context as a precise instrument for modeling the behavior of SOM according to the typology of the data. Achieving this integration can be divided into four milestones: (1) the definition of a methodology for determining the best way of retrieving cases, taking into account the data complexity and the user requirements; (2) the improvement of the reliability of the proposed solutions through the relations between clusters and cases; (3) the strengthening of the explanatory capabilities through the generation of symbolic explanations; (4) the incremental and semi-supervised maintenance of the case memory organized by SOM.
    All these points are integrated in the SOMCBR framework, which is extensively evaluated on datasets from the UCI Repository and from medical and telematic domains.
    Additionally, the thesis addresses, as a secondary matter, two research lines arising from the requirements of the projects in which it was carried out. First, it addresses the definition of domain-specific similarity functions that determine how to compare a solved case with a new one, using a variant of Evolutionary Computation called Grammatical Evolution (GE). Second, it studies how to define cooperation schemes between heterogeneous systems in order to improve the reliability of their joint response by means of GE. Both lines are developed in two frameworks, BRAIN and MGE respectively, which are also evaluated on the same datasets.