64 research outputs found
Multiple Instance Learning: A Survey of Problem Characteristics and Applications
Multiple instance learning (MIL) is a form of weakly supervised learning
where training instances are arranged in sets, called bags, and a label is
provided for the entire bag. This formulation is gaining interest because it
naturally fits various problems and allows to leverage weakly labeled data.
Consequently, it has been used in diverse application fields such as computer
vision and document classification. However, learning from bags raises
important challenges that are unique to MIL. This paper provides a
comprehensive survey of the characteristics which define and differentiate the
types of MIL problems. Until now, these problem characteristics have not been
formally identified and described. As a result, the variations in performance
of MIL algorithms from one data set to another are difficult to explain. In
this paper, MIL problem characteristics are grouped into four broad categories:
the composition of the bags, the types of data distribution, the ambiguity of
instance labels, and the task to be performed. Methods specialized to address
each category are reviewed. Then, the extent to which these characteristics
manifest themselves in key MIL application areas are described. Finally,
experiments are conducted to compare the performance of 16 state-of-the-art MIL
methods on selected problem characteristics. This paper provides insight on how
the problem characteristics affect MIL algorithms, recommendations for future
benchmarking and promising avenues for research
Effective Graph-Based Content--Based Image Retrieval Systems for Large-Scale and Small-Scale Image Databases
This dissertation proposes two novel manifold graph-based ranking systems for Content-Based Image Retrieval (CBIR). The two proposed systems exploit the synergism between relevance feedback-based transductive short-term learning and semantic feature-based long-term learning to improve retrieval performance. Proposed systems first apply the active learning mechanism to construct users\u27 relevance feedback log and extract high-level semantic features for each image. These systems then create manifold graphs by incorporating both the low-level visual similarity and the high-level semantic similarity to achieve more meaningful structures for the image space. Finally, asymmetric relevance vectors are created to propagate relevance scores of labeled images to unlabeled images via manifold graphs. The extensive experimental results demonstrate two proposed systems outperform the other state-of-the-art CBIR systems in the context of both correct and erroneous users\u27 feedback
Intelligent Image Retrieval Techniques: A Survey
AbstractIn the current era of digital communication, the use of digital images has increased for expressing, sharing and interpreting information. While working with digital images, quite often it is necessary to search for a specific image for a particular situation based on the visual contents of the image. This task looks easy if you are dealing with tens of images but it gets more difficult when the number of images goes from tens to hundreds and thousands, and the same content-based searching task becomes extremely complex when the number of images is in the millions. To deal with the situation, some intelligent way of content-based searching is required to fulfill the searching request with right visual contents in a reasonable amount of time. There are some really smart techniques proposed by researchers for efficient and robust content-based image retrieval. In this research, the aim is to highlight the efforts of researchers who conducted some brilliant work and to provide a proof of concept for intelligent content-based image retrieval techniques
Recommended from our members
Parallelizing support vector machines for scalable image annotation
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is notably a computationally intensive process especially when the training dataset is large.
In this thesis distributed computing paradigms have been investigated to speed up SVM training, by partitioning a large training dataset into small data chunks and process each chunk in parallel utilizing the resources of a cluster of computers. A resource aware parallel SVM algorithm is introduced for large scale image annotation in parallel using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments.
SVM was initially designed for binary classifications. However, most classification problems arising in domains such as image annotation usually involve more than two classes. A resource aware parallel multiclass SVM algorithm for large scale image annotation in parallel using a cluster of computers is introduced.
The combination of classifiers leads to substantial reduction of classification error in a wide range of applications. Among them SVM ensembles with bagging is shown to outperform a single SVM in terms of classification accuracy. However, SVM ensembles training are notably a computationally intensive process especially when the number replicated samples based on bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced which re-samples the training data based on bootstrapping and training SVM on each sample in parallel using a cluster of computers.
The above algorithms are evaluated in both experimental and simulation environments showing that the distributed SVM algorithm, distributed multiclass SVM algorithm, and distributed SVM ensemble algorithm, reduces the training time significantly while maintaining a high level of accuracy in classifications
Recuperação multimodal e interativa de informação orientada por diversidade
Orientador: Ricardo da Silva TorresTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: Os mĂ©todos de Recuperação da Informação, especialmente considerando-se dados multimĂdia, evoluĂram para a integração de mĂşltiplas fontes de evidĂŞncia na análise de relevância de itens em uma tarefa de busca. Neste contexto, para atenuar a distância semântica entre as propriedades de baixo nĂvel extraĂdas do conteĂşdo dos objetos digitais e os conceitos semânticos de alto nĂvel (objetos, categorias, etc.) e tornar estes sistemas adaptativos Ă s diferentes necessidades dos usuários, modelos interativos que consideram o usuário mais prĂłximo do processo de recuperação tĂŞm sido propostos, permitindo a sua interação com o sistema, principalmente por meio da realimentação de relevância implĂcita ou explĂcita. Analogamente, a promoção de diversidade surgiu como uma alternativa para lidar com consultas ambĂguas ou incompletas. Adicionalmente, muitos trabalhos tĂŞm tratado a ideia de minimização do esforço requerido do usuário em fornecer julgamentos de relevância, Ă medida que mantĂ©m nĂveis aceitáveis de eficácia. Esta tese aborda, propõe e analisa experimentalmente mĂ©todos de recuperação da informação interativos e multimodais orientados por diversidade. Este trabalho aborda de forma abrangente a literatura acerca da recuperação interativa da informação e discute sobre os avanços recentes, os grandes desafios de pesquisa e oportunidades promissoras de trabalho. NĂłs propusemos e avaliamos dois mĂ©todos de aprimoramento do balanço entre relevância e diversidade, os quais integram mĂşltiplas informações de imagens, tais como: propriedades visuais, metadados textuais, informação geográfica e descritores de credibilidade dos usuários. Por sua vez, como integração de tĂ©cnicas de recuperação interativa e de promoção de diversidade, visando maximizar a cobertura de mĂşltiplas interpretações/aspectos de busca e acelerar a transferĂŞncia de informação entre o usuário e o sistema, nĂłs propusemos e avaliamos um mĂ©todo multimodal de aprendizado para ranqueamento utilizando realimentação de relevância sobre resultados diversificados. Nossa análise experimental mostra que o uso conjunto de mĂşltiplas fontes de informação teve impacto positivo nos algoritmos de balanceamento entre relevância e diversidade. Estes resultados sugerem que a integração de filtragem e re-ranqueamento multimodais Ă© eficaz para o aumento da relevância dos resultados e tambĂ©m como mecanismo de potencialização dos mĂ©todos de diversificação. AlĂ©m disso, com uma análise experimental minuciosa, nĂłs investigamos várias questões de pesquisa relacionadas Ă possibilidade de aumento da diversidade dos resultados e a manutenção ou atĂ© mesmo melhoria da sua relevância em sessões interativas. Adicionalmente, nĂłs analisamos como o esforço em diversificar afeta os resultados gerais de uma sessĂŁo de busca e como diferentes abordagens de diversificação se comportam para diferentes modalidades de dados. Analisando a eficácia geral e tambĂ©m em cada iteração de realimentação de relevância, nĂłs mostramos que introduzir diversidade nos resultados pode prejudicar resultados iniciais, enquanto que aumenta significativamente a eficácia geral em uma sessĂŁo de busca, considerando-se nĂŁo apenas a relevância e diversidade geral, mas tambĂ©m o quĂŁo cedo o usuário Ă© exposto ao mesmo montante de itens relevantes e nĂvel de diversidadeAbstract: Information retrieval methods, especially considering multimedia data, have evolved towards the integration of multiple sources of evidence in the analysis of the relevance of items considering a given user search task. In this context, for attenuating the semantic gap between low-level features extracted from the content of the digital objects and high-level semantic concepts (objects, categories, etc.) and making the systems adaptive to different user needs, interactive models have brought the user closer to the retrieval loop allowing user-system interaction mainly through implicit or explicit relevance feedback. Analogously, diversity promotion has emerged as an alternative for tackling ambiguous or underspecified queries. Additionally, several works have addressed the issue of minimizing the required user effort on providing relevance assessments while keeping an acceptable overall effectiveness. This thesis discusses, proposes, and experimentally analyzes multimodal and interactive diversity-oriented information retrieval methods. This work, comprehensively covers the interactive information retrieval literature and also discusses about recent advances, the great research challenges, and promising research opportunities. We have proposed and evaluated two relevance-diversity trade-off enhancement work-flows, which integrate multiple information from images, such as: visual features, textual metadata, geographic information, and user credibility descriptors. In turn, as an integration of interactive retrieval and diversity promotion techniques, for maximizing the coverage of multiple query interpretations/aspects and speeding up the information transfer between the user and the system, we have proposed and evaluated a multimodal learning-to-rank method trained with relevance feedback over diversified results. Our experimental analysis shows that the joint usage of multiple information sources positively impacted the relevance-diversity balancing algorithms. Our results also suggest that the integration of multimodal-relevance-based filtering and reranking was effective on improving result relevance and also boosted diversity promotion methods. Beyond it, with a thorough experimental analysis we have investigated several research questions related to the possibility of improving result diversity and keeping or even improving relevance in interactive search sessions. Moreover, we analyze how much the diversification effort affects overall search session results and how different diversification approaches behave for the different data modalities. By analyzing the overall and per feedback iteration effectiveness, we show that introducing diversity may harm initial results whereas it significantly enhances the overall session effectiveness not only considering the relevance and diversity, but also how early the user is exposed to the same amount of relevant items and diversityDoutoradoCiĂŞncia da ComputaçãoDoutor em CiĂŞncia da ComputaçãoP-4388/2010140977/2012-0CAPESCNP
Discriminative learning with application to interactive facial image retrieval
The amount of digital images is growing drastically and advanced tools for searching in large image collections are therefore becoming urgently needed. Content-based image retrieval is advantageous for such a task in terms of automatic feature extraction and indexing without human labor and subjectivity in image annotations. The semantic gap between high-level semantics and low-level visual features can be reduced by the relevance feedback technique. However, most existing interactive content-based image retrieval (ICBIR) systems require a substantial amount of human evaluation labor, which leads to the evaluation fatigue problem that heavily restricts the application of ICBIR.
In this thesis a solution based on discriminative learning is presented. It extends an existing ICBIR system, PicSOM, towards practical applications. The enhanced ICBIR system allows users to input partial relevance which includes not only relevance extent but also relevance reason. A multi-phase retrieval with partial relevance can adapt to the user's searching intention in a from-coarse-to-fine manner.
The retrieval performance can be improved by employing supervised learning as a preprocessing step before unsupervised content-based indexing. In this work, Parzen Discriminant Analysis (PDA) is proposed to extract discriminative components from images. PDA regularizes the Informative Discriminant Analysis (IDA) objective with a greatly accelerated optimization algorithm. Moreover, discriminative Self-Organizing Maps trained with resulting features can easily handle fuzzy categorizations.
The proposed techniques have been applied to interactive facial image retrieval. Both a query example and a benchmark simulation study are presented, which indicate that the first image depicting the target subject can be retrieved in a small number of rounds
- …