9 research outputs found

    Indexation sémantique des images et des vidéos par apprentissage actif

    The general framework of this thesis is semantic indexing and information retrieval applied to multimedia documents. More specifically, we are interested in the semantic indexing of concepts in images and videos using active learning approaches, which we use to build annotated corpora. Throughout this thesis, we show that the main difficulties of this task are often related to the semantic gap. They are also related to the class-imbalance problem in large-scale datasets, where most concepts are rare. For corpus annotation, the main objective of active learning is to increase system performance using as few labeled samples as possible, thereby minimizing the cost of labeling data (e.g. money and time). We contribute at several levels of multimedia indexing and propose three approaches that outperform state-of-the-art systems: i) the multi-learner (ML) approach, which overcomes the class-imbalance problem in large-scale datasets; ii) a re-ranking method that improves video indexing; iii) an evaluation of power-law normalization and PCA, which shows their effectiveness in multimedia indexing. Furthermore, we propose the ALML approach, which combines the multi-learner with active learning, along with an incremental method that speeds up ALML. We also propose the active cleaning approach, which addresses the quality of annotations. The proposed methods were validated through several experiments conducted and evaluated on large-scale collections of the well-known international benchmark TRECVID. Finally, we present our real-world annotation system based on active learning, which was used to carry out the annotation of the development set of the TRECVID 2011 campaign, and our participation in the semantic indexing task of that campaign, in which we ranked 3rd out of 19 participants.
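
The idea of annotating as few samples as possible can be illustrated with a small selection step. The sketch below assumes uncertainty sampling (picking the unlabeled samples whose classifier score is closest to 0.5); the abstract does not say which selection strategy the thesis uses, and the function name `select_for_annotation` and the example scores are hypothetical.

```python
def select_for_annotation(scores, labeled, batch_size):
    """Return indices of the unlabeled samples whose scores are closest to 0.5.

    scores: classifier outputs in [0, 1], one per sample (assumed input).
    labeled: set of indices that already have an annotation.
    """
    candidates = [i for i in range(len(scores)) if i not in labeled]
    # The most uncertain samples (score nearest 0.5) come first.
    candidates.sort(key=lambda i: abs(scores[i] - 0.5))
    return candidates[:batch_size]


# Hypothetical scores for six video shots; shot 0 is already annotated.
scores = [0.95, 0.52, 0.10, 0.45, 0.70, 0.03]
picked = select_for_annotation(scores, labeled={0}, batch_size=2)
print(picked)
```

In a full loop, the picked samples would be sent to annotators, the classifier retrained, and the selection repeated until the annotation budget is spent.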

    Evaluation Methodologies for Visual Information Retrieval and Annotation

    Performance assessment plays a major role in research on Information Retrieval (IR) systems. Starting with the Cranfield experiments in the early 1960s, methodologies for system-based performance assessment emerged and established themselves, resulting in an active research field with a number of successful benchmarking activities. With the rise of the digital age, procedures for text retrieval evaluation were often transferred to multimedia retrieval evaluation without questioning their direct applicability. This thesis investigates the problem of system-based performance assessment of annotation approaches in generic image collections. It addresses three important parts of annotation evaluation: user requirements for the retrieval of annotated visual media, performance measures for multi-label evaluation, and visual test collections. Using the example of multi-label image annotation evaluation, I discuss which concepts to employ for indexing, how to obtain a reliable ground truth at moderate cost, and which evaluation measures are appropriate. This is accompanied by a thorough analysis of related work on system-based performance assessment in Visual Information Retrieval (VIR). Traditional performance measures are classified into four dimensions and investigated according to their appropriateness for visual annotation evaluation. One of the main ideas in this thesis concerns the common assumption that the score prediction dimension in annotation evaluation is binary: evaluation measures usually assign binary costs to correct and incorrect annotations. This assumption conflicts with the nature of image concepts, because the predicted concepts and the set of true indexed concepts interrelate with each other, and the co-occurrence of concepts determines their semantic relationship. This work shows how the semantic similarities of visual concepts can be estimated automatically and used in a fine-grained evaluation scenario. The outcomes of this thesis are a user model for concept-based image retrieval, a fully assessed image annotation test collection, and a number of novel performance measures for image annotation evaluation.
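
To make the contrast between binary and similarity-aware scoring concrete, here is a minimal sketch. It assumes a Jaccard co-occurrence similarity between concepts and a "soft" precision that gives partial credit to a wrong concept that is related to a true one. Both choices, and the names `cooccurrence_similarity` and `soft_precision`, are illustrative assumptions and not the measures proposed in the thesis.

```python
def cooccurrence_similarity(annotations, a, b):
    """Jaccard similarity of concepts a and b over per-image concept sets."""
    with_a = sum(1 for concepts in annotations if a in concepts)
    with_b = sum(1 for concepts in annotations if b in concepts)
    both = sum(1 for concepts in annotations if a in concepts and b in concepts)
    union = with_a + with_b - both
    return both / union if union else 0.0


def soft_precision(predicted, truth, annotations):
    """Mean credit per predicted concept: 1 if correct, else best similarity to a true concept."""
    if not predicted:
        return 0.0
    total = 0.0
    for p in predicted:
        if p in truth:
            total += 1.0
        else:
            total += max(
                (cooccurrence_similarity(annotations, p, t) for t in truth),
                default=0.0,
            )
    return total / len(predicted)


# Hypothetical annotated collection used to estimate concept similarity.
annotations = [
    {"beach", "sea"},
    {"beach", "sea", "sky"},
    {"sea", "sky"},
    {"car", "street"},
]
truth = {"sea", "sky"}
predicted = ["sea", "beach", "car"]

binary = sum(1 for p in predicted if p in truth) / len(predicted)
soft = soft_precision(predicted, truth, annotations)
print(binary, soft)
```

Under the binary measure, "beach" and "car" are equally wrong. Under the soft measure, "beach" earns partial credit because it frequently co-occurs with "sea", while "car" earns none.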

    Utilisation du contexte pour l’indexation sémantique des images et vidéos

    The automated indexing of images and videos is a difficult problem because of the "distance" between the arrays of numbers encoding these documents and the concepts (e.g. people, places, events or objects) with which we wish to annotate them. Methods exist for this, but their results are far from satisfactory in terms of generality and accuracy. Existing methods typically use a single set of examples and treat it uniformly. This is not optimal, because the same concept may appear in various contexts and its appearance may differ greatly depending on these contexts. In this thesis, we consider the use of context for indexing multimedia documents. Context has been widely used in the state of the art to address various problems. In our work, we use relationships between concepts as a source of semantic context. For videos, we exploit the temporal context, which models relationships between the shots of the same video. We propose several approaches that use both types of context, and their combination, at different levels of an indexing system. We also present the problem of detecting groups of concepts simultaneously, which we consider related to the problem of using context: detecting a group of concepts amounts to detecting one or more concepts of the group in a context where the others are present. To this end, we studied and compared two categories of approaches. All our proposals are generic and can be applied to any system for the detection of any concept. We evaluated our contributions on the TRECVID and VOC collections, which are international standards recognized by the community. We achieved good results, comparable to those of the best indexing systems evaluated in recent years in these evaluation campaigns.
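
One simple way to picture temporal context is to re-score each shot using the detection scores of its neighbouring shots in the same video. The sketch below is an assumption for illustration only (a weighted blend with the mean of adjacent shots); the abstract does not describe the actual approaches, and `temporal_rescore`, `window`, and `weight` are hypothetical names.

```python
def temporal_rescore(shot_scores, window=1, weight=0.5):
    """Blend each shot's score with the mean score of its neighbours in the same video.

    shot_scores: concept detection scores in temporal order (assumed input).
    window: number of shots considered on each side.
    weight: share of the final score taken from the temporal context.
    """
    n = len(shot_scores)
    rescored = []
    for i, score in enumerate(shot_scores):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbours = [shot_scores[j] for j in range(lo, hi) if j != i]
        if neighbours:
            context = sum(neighbours) / len(neighbours)
            rescored.append((1 - weight) * score + weight * context)
        else:
            # A single-shot video has no temporal context to use.
            rescored.append(score)
    return rescored


# Hypothetical scores for four consecutive shots of one video.
smoothed = temporal_rescore([0.9, 0.2, 0.8, 0.1])
print(smoothed)
```

A shot with a low score between two high-scoring shots is pulled upward, which reflects the intuition that a concept visible in adjacent shots is likely to be present in the shot between them.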

    Local selection of features and its applications to image search and annotation

    In multimedia applications, direct representations of data objects typically involve hundreds or thousands of features. Given a query object, the similarity between the query object and a database object can be computed as the distance between their feature vectors. The neighborhood of the query object consists of those database objects that are close to the query object. The semantic quality of the neighborhood, which can be measured as the proportion of neighboring objects that share the same class label as the query object, is crucial for many applications, such as content-based image retrieval and automated image annotation. However, due to the existence of noisy or irrelevant features, errors introduced into similarity measurements are detrimental to the neighborhood quality of data objects. One way to alleviate the negative impact of noisy features is to use feature selection techniques in data preprocessing. From the original vector space, feature selection techniques select a subset of features, which can be used subsequently in supervised or unsupervised learning algorithms for better performance. However, their performance on improving the quality of data neighborhoods is rarely evaluated in the literature. In addition, most traditional feature selection techniques are global, in the sense that they compute a single set of features across the entire database. As a consequence, the possibility that the feature importance may vary across different data objects or classes of objects is neglected. To compute a better neighborhood structure for objects in high-dimensional feature spaces, this dissertation proposes several techniques for selecting features that are important to the local neighborhood of individual objects. These techniques are then applied to image applications such as content-based image retrieval and image label propagation. Firstly, an iterative K-NN graph construction method for image databases is proposed. 
A local variant of the Laplacian Score is designed for the selection of features for individual images. Noisy features are detected and sparsified iteratively from the original standardized feature vectors. This technique is incorporated into an approximate K-NN graph construction method so as to improve the semantic quality of the graph. Secondly, in a content-based image retrieval system, a generalized version of the Laplacian Score is used to compute different feature subspaces for images in the database. For online search, a query image is ranked in the feature spaces of database images. Those database images for which the query image is ranked highly are selected as the query results. Finally, a supervised method for the local selection of image features is proposed, for refining the similarity graph used in an image label propagation framework. By using only the selected features to compute the edges leading from labeled image nodes to unlabeled image nodes, better annotation accuracy can be achieved. Experimental results on several datasets are provided in this dissertation, to demonstrate the effectiveness of the proposed techniques for the local selection of features, and for the image applications under consideration
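
The semantic quality of a neighborhood, defined above as the proportion of neighboring objects that share the query's class label, can be computed directly. This is a minimal sketch assuming Euclidean distance and a small in-memory database; the function name `neighborhood_quality` and the toy data are hypothetical.

```python
import math


def neighborhood_quality(query_vec, query_label, database, k):
    """Fraction of the k nearest database objects that share the query's label.

    database: list of (feature_vector, label) pairs (assumed layout).
    """
    # Rank database objects by Euclidean distance to the query.
    ranked = sorted(database, key=lambda item: math.dist(query_vec, item[0]))
    neighbours = ranked[:k]
    same = sum(1 for _, label in neighbours if label == query_label)
    return same / len(neighbours) if neighbours else 0.0


# Hypothetical two-dimensional feature vectors with class labels.
database = [
    ((0.0, 0.0), "cat"),
    ((0.1, 0.0), "cat"),
    ((0.2, 0.1), "dog"),
    ((5.0, 5.0), "dog"),
]
quality = neighborhood_quality((0.05, 0.0), "cat", database, k=3)
print(quality)
```

Noisy or irrelevant features distort the distances used for ranking, so objects of other classes can enter the top k and lower this proportion.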

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult because situations vary continuously, several actors are involved, and a vast number of information systems are in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data were collected in nine hospitals using stratified random sampling. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but that they neither improve access to information nor are tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer a single information system for accessing important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process. Peer reviewed.