7 research outputs found

    Tools for image annotation using context-awareness, NFC and image clustering

    Annotation of images is crucial for enabling keyword-based image search. However, the enormous amount of available digital photos makes manual annotation impractical and calls for methods for automatic image annotation. This paper describes two complementary approaches to automatic annotation of images depicting some public attraction. The LoTagr system provides annotation information for already captured, geo-positioned images by selecting nearby, previously tagged images from a source image collection and subsequently collecting the most frequently used tags from these images. The NfcAnnotate system enables annotation at image capture time, using NFC (Near Field Communication) and NFC information tags provided at the site of the attraction. NfcAnnotate enables clustering of topically related images, which makes it possible to annotate a set of images in one annotation operation. In cases where NFC information tags are not available, NfcAnnotate image clustering can be combined with LoTagr to conveniently annotate every image in the cluster in a single operation.
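A minimal sketch of the LoTagr idea described above: suggest tags for a geo-positioned photo by collecting the most frequent tags of previously tagged photos captured nearby. The function names, the search radius, and the example data are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of LoTagr-style tag suggestion from nearby tagged photos.
from collections import Counter
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def suggest_tags(query, collection, radius_km=0.5, top_k=3):
    """Rank the tags of photos within radius_km of the query position by frequency."""
    counts = Counter()
    for photo in collection:
        if haversine_km(query[0], query[1], photo["lat"], photo["lon"]) <= radius_km:
            counts.update(photo["tags"])
    return [tag for tag, _ in counts.most_common(top_k)]

photos = [
    {"lat": 48.8584, "lon": 2.2945, "tags": ["eiffel", "paris"]},
    {"lat": 48.8585, "lon": 2.2946, "tags": ["eiffel", "tower"]},
    {"lat": 51.5007, "lon": -0.1246, "tags": ["bigben", "london"]},
]
print(suggest_tags((48.8583, 2.2944), photos))  # nearby photos vote; "eiffel" wins
```

The distant photo never contributes, so its tags cannot leak into the suggestion list.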

    āļāļēāļĢāļ§āļīāđ€āļ„āļĢāļēāļ°āļŦāđŒāđ€āļŦāļĄāļ·āļ­āļ‡āļ”āļąāļŠāļ™āļĩāļ–āđ‰āļ­āļĒāļ„āļģ āļˆāļēāļāļ‚āđ‰āļ­āļĄāļđāļĨāļĢāļ°āļšāļļāļ•āļģāđāļŦāļ™āđˆāļ‡āđ€āļŠāļīāļ‡āļžāļ·āđ‰āļ™āļ—āļĩāđˆāļœāđˆāļēāļ™āļŠāļ·āđˆāļ­āļŠāļąāļ‡āļ„āļĄāļ­āļ­āļ™āđ„āļĨāļ™āđŒ

    This research aims to apply tag mining analysis to geotagged spatial data from the online social media application Flickr in Thailand. Geotagged data arises when users share posts, photos, or comments associated with a place. The study applies tag mining techniques, passing the data through a knowledge-extraction process to discover patterns and relationships in the data. The work proceeds in three main steps: 1) retrieving geotagged data and building a database; 2) processing the tags; 3) hierarchical clustering, mining association rules between tags, and analysing their spatial density. The results of the clustering, tag association-rule mining, and density analysis, which describe the spatial distribution of Flickr geotags, show that highly similar Flickr tags within each cluster have high confidence values. The tags fall into three major categories: natural tourist attractions, cultural tourist attractions, and other special activities. These categories reflect how online social media is used and reveal factors that shape user behaviour and response, such as the popularity of tourist attractions by terrain and season. The findings can support planning and development in areas such as site accessibility, infrastructure, internet services, transportation, and other activities important to each location. Keywords: tag mining, geotagging, geoinformatics

    Potential Indirect Relationships in Productive Networks

    Productive Networks, such as social network services, organize evidence about human behavior. This evidence is independent of the network's content type and may support the discovery of new relationships between users and content, or between users. These indirect relationships are important for recommendation systems and for systems where potential relationships between users and content (e.g., locations) are relevant, such as the emergency management domain, where discovering relationships between users and locations on productive networks may enable the identification of population-density variations, increasing the accuracy of emergency alerts. This thesis presents a Productive Networks model that enables the development of a methodology for indirect-relationship discovery using the metadata on the network, avoiding the computational cost of content analysis. We designed and conducted a set of experiments to evaluate our proposals. Our results are twofold: first, the Productive Networks model is sufficiently robust to represent a wide range of networks; second, the indirect-relationship discovery methodology successfully identifies relevant relationships between users and content. We also present applications of the model and methodology in several contexts.
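The core idea, discovering indirect user–content relationships from metadata alone, can be sketched as a two-hop traversal over the user–content links: content a user never touched is scored by how many metadata paths (user → shared content → peer → content) reach it. The graph, field names, and scoring are hypothetical, not the thesis' actual model.

```python
# Illustrative two-hop indirect-relationship discovery over link metadata only.
from collections import defaultdict

def indirect_content(direct):
    """direct: {user: set of content ids}. Returns {user: {content: path count}}
    for content the user is only indirectly connected to."""
    users_of = defaultdict(set)              # content -> users directly linked
    for user, items in direct.items():
        for item in items:
            users_of[item].add(user)
    scores = defaultdict(lambda: defaultdict(int))
    for user, items in direct.items():
        for item in items:
            for peer in users_of[item] - {user}:
                for candidate in direct[peer] - items:
                    scores[user][candidate] += 1
    return {u: dict(s) for u, s in scores.items()}

direct = {"ana": {"c1", "c2"}, "bob": {"c2", "c3"}, "eve": {"c3"}}
print(indirect_content(direct)["ana"])  # ana reaches c3 indirectly through bob
```

No content is ever inspected, only who is linked to what, which mirrors the thesis' point about avoiding the computational cost of content analysis.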

    Exploring the location and orientation of personal photographs for clustering-based point-of-interest discovery

    The discovery of knowledge from huge online photo repositories has been a very active research area in recent years, driven mainly by three factors: the incorporation of digital cameras and geolocation sensors into mobile devices; advances in Internet connectivity; and the evolution of social networks. The photos stored in these repositories carry contextual metadata that can be used in knowledge-discovery applications such as point-of-interest (POI) detection, travel-itinerary generation, and automatic photo organization. Most approaches to POI detection assume that a geographic area where many people captured photos indicates the existence of a point of interest. In many cases, however, the POI lies at some distance from the capture position, in the direction the camera was aimed, rather than at the exact shooting point. Most techniques proposed in the literature do not consider orientation in the POI-detection process. This work therefore proposes new algorithms and techniques for detecting points of interest in touristic cities from collections of oriented, georeferenced photographs, exploiting geographic orientation in several ways. The research confirmed the importance of orientation in the new POI-detection algorithms: in experiments on a real dataset of large cities, the orientation-aware algorithms outperformed orientation-agnostic ones in several scenarios. New evaluation metrics and a tool to support knowledge-discovery activities over large photo collections were also proposed.
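The orientation idea above can be sketched as follows: instead of clustering the shooting positions, project each photo's position a fixed distance along the camera bearing and cluster the projected points. The projection distance and the coordinates are assumed values for illustration, not parameters from the thesis.

```python
# Hedged sketch: project a photo position along its camera bearing so that
# photos taken from opposite sides of a monument agree on the POI location.
from math import cos, degrees, radians, sin

EARTH_RADIUS_M = 6371000.0

def project(lat, lon, bearing_deg, dist_m):
    """Move dist_m from (lat, lon) along bearing_deg (equirectangular
    approximation, adequate for the few hundred metres relevant here)."""
    dlat = dist_m * cos(radians(bearing_deg)) / EARTH_RADIUS_M
    dlon = dist_m * sin(radians(bearing_deg)) / (EARTH_RADIUS_M * cos(radians(lat)))
    return lat + degrees(dlat), lon + degrees(dlon)

# Two photos of the same monument, cameras facing it from opposite sides:
p1 = project(48.8570, 2.2945, 0.0, 150.0)    # south of the POI, facing north
p2 = project(48.8597, 2.2945, 180.0, 150.0)  # north of the POI, facing south
print(p1, p2)  # the projected points nearly coincide at the POI itself
```

Clustering the raw capture positions would place two separate density peaks beside the monument; clustering the projected points merges them into one peak at the POI.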

    Search-based automatic image annotation using geocoded community photos

    In the Web 2.0 era, platforms for sharing and collaboratively annotating images with keywords, called tags, became very popular. Tags are a powerful means for organizing and retrieving photos. However, manual tagging is time-consuming. Recently, the sheer amount of user-tagged photos available on the Web has encouraged researchers to explore new techniques for automatic image annotation. The idea is to annotate an unlabeled image by propagating the labels of community photos that are visually similar to it. Most recently, an ever-increasing amount of community photos is also associated with location information, i.e., geotagged. In this thesis, we exploit the location context and propose an approach for automatically annotating geotagged photos. Our objective is to address the main limitations of state-of-the-art approaches in terms of the quality of the produced tags and the speed of the complete annotation process. To achieve these goals, we first deal with the problem of collecting images with the associated metadata from online repositories. Accordingly, we introduce a strategy for data crawling that takes advantage of location information and of the social relationships among the contributors of the photos. To improve the quality of the collected user tags, we present a method for resolving their ambiguity based on tag-relatedness information. In this respect, we propose an approach for representing tags as probability distributions based on the algorithm of Laplacian score feature selection, and a new metric for calculating the distance between tag probability distributions that extends the Jensen-Shannon divergence to account for statistical fluctuations. To efficiently identify the visual neighbors, the thesis introduces two extensions to the state-of-the-art image-matching algorithm known as Speeded Up Robust Features (SURF): to speed up the matching, we present a solution for reducing the number of compared SURF descriptors based on classification techniques, while the accuracy of SURF is improved through an efficient method for iterative image matching. Furthermore, we propose a statistical model for ranking the mined annotations according to their relevance to the target image, combining multi-modal information in a statistical framework based on Bayes' rule. Finally, the effectiveness of each of these contributions, as well as the performance of the complete automatic annotation process, is evaluated through comprehensive experimental studies.
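The tag-distance component above builds on the Jensen-Shannon divergence between tag probability distributions. The sketch below is the textbook base-2 JSD over tag dictionaries; the thesis' extension for statistical fluctuations is omitted, and the example distributions are invented.

```python
# Plain Jensen-Shannon divergence between two tag distributions (dict tag -> prob).
from math import log2

def kl(p, q):
    """Kullback-Leibler divergence in bits; assumes q[k] > 0 wherever p[k] > 0."""
    return sum(pv * log2(pv / q[k]) for k, pv in p.items() if pv > 0)

def jsd(p, q):
    """Symmetric and, with base-2 logs, bounded in [0, 1]."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    pf = {k: p.get(k, 0.0) for k in keys}
    qf = {k: q.get(k, 0.0) for k in keys}
    return 0.5 * kl(pf, m) + 0.5 * kl(qf, m)

tower = {"paris": 0.6, "eiffel": 0.4}
bridge = {"paris": 0.5, "seine": 0.5}
print(jsd(tower, tower))   # identical distributions -> 0.0
print(jsd(tower, bridge))  # overlapping but distinct -> strictly between 0 and 1
```

Mixing through the midpoint distribution m is what keeps the divergence finite even when one distribution assigns zero probability to a tag the other uses.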

    Annotating images via their spatio-temporal context and Web metadata

    The documents processed by Information Retrieval (IR) systems are typically indexed according to their content, whether text or multimedia. Search engines based on these indexes aim to provide relevant answers to users' needs in the form of texts, images, sounds, videos, and so on. Our work concerns image documents. We are specifically interested in automatic image annotation systems, which automatically associate keywords with images so that they can subsequently be searched via textual queries. Automatic annotation is intended to overcome the limitations of manual and semi-automatic annotation, which are no longer feasible in today's context, where digital cameras and camera-equipped mobile phones let anyone take large numbers of photos at minimal cost. Among the different types of existing image collections (e.g., medical, satellite), we focus on landscape image collections, i.e., images illustrating tourist points of interest, for which we identified the following challenges: What are the most discriminative descriptors for this type of image? How should these descriptors be modeled and merged? Which sources of information should be considered? How can scalability be managed? Our contribution is threefold. First, we exploit different descriptors that influence the description of landscape images: a spatial descriptor (the latitude and longitude of the image), a temporal descriptor (the date and time of capture), and a thematic descriptor (the tags contributed to image-sharing platforms). We propose approaches to model these descriptors based on tag statistics related to frequency and rarity, and on spatial and temporal similarities. These choices rest on the following assumptions: a tag is all the more relevant for a query image the more it is associated with images located in its close geographical area; a tag is all the more relevant the more it is associated with images captured close in time to it; and a tag is all the more relevant the more frequently it has been contributed by users (the crowdsourcing concept). Second, we introduce a new image annotation process that recommends the terms that best describe a given query image provided by a user. For each query image, we apply spatial, temporal, and spatio-temporal filters to identify similar images along with their associated tags, and then merge the different descriptors in a probabilistic model to determine the terms that best describe each query image. Third, since the contributions above rely only on information from image-sharing platforms (i.e., subjective information), a further question arises: can information extracted from the Web provide objective terms to enrich the initial descriptions of the images? We tackle this question with an approach based on query-expansion techniques from IR, studying the impact of different expansion algorithms and the aggregation of the best algorithm's results with those of the image annotation process. As there is no standard evaluation protocol for automatic image annotation tailored to landscape image collections, we designed appropriate evaluation protocols to validate our contributions. We first evaluated the approaches for modeling the spatial, temporal, and thematic descriptors. We then validated the annotation process and showed that it yields significant improvements over two state-of-the-art baselines. Finally, we assessed the effectiveness of tag expansion through Web sources and showed its contribution to the annotation process. These experiments were supported by the prototype AnnoTaGT, which provides users with an operational framework for automatic image annotation.
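The filter-then-merge step described above can be sketched with a toy scorer: photos that pass a spatial and temporal filter contribute their tags, weighted by how close they are to the query in space and time. The 1-D position, thresholds, weighting scheme, and field names are all assumptions for illustration, not the thesis' probabilistic model.

```python
# Hypothetical spatio-temporal filtering plus proximity-weighted tag scoring.
def recommend_tags(query, photos, max_km=1.0, max_days=30, top_k=2):
    scores = {}
    for p in photos:
        d_km = abs(p["km"] - query["km"])      # 1-D stand-in for geographic distance
        d_days = abs(p["day"] - query["day"])
        if d_km > max_km or d_days > max_days:
            continue                            # fails the spatio-temporal filter
        w = (1 - d_km / max_km) * (1 - d_days / max_days)  # proximity weight
        for tag in p["tags"]:
            scores[tag] = scores.get(tag, 0.0) + w
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

query = {"km": 0.0, "day": 0}
photos = [
    {"km": 0.1, "day": 2, "tags": ["castle"]},
    {"km": 0.2, "day": 5, "tags": ["castle", "lake"]},
    {"km": 5.0, "day": 1, "tags": ["airport"]},  # too far away: filtered out
]
print(recommend_tags(query, photos))  # -> ['castle', 'lake']
```

Tags supported by several close-by, recent photos accumulate weight and rise to the top, which is the intuition behind the probabilistic merge of the spatial, temporal, and thematic descriptors.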

    Geo-based automatic image annotation

    No full text
    A huge number of user-tagged images are daily uploaded to the web. Recently, a growing number of those images are also geotagged. These provide new opportunities for solutions to automatically tag images so that efficient image management and retrieval can be achieved. In this paper an automatic image annotation approach is proposed. It is based on a statistical model that combines two different kinds of information: high level information represented by user tags of images captured in the same location as a new unlabeled image (input image); and low level information represented by the visual similarity between the input image and the collection of geographically similar images. To maximize the number of images that are visually similar to the input image, an iterative visual matching approach is proposed and evaluated. The results show that a significant recall improvement can be achieved with an increasing number of iterations. The quality of the recommended tags has also been evaluated and an overall good performance has been observed
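A loose sketch of the combination this abstract describes: tags mined from photos at the same location are ranked by a score mixing tag frequency (the high-level information) with the visual similarity of the photos carrying them (the low-level information). The linear mix, the similarity values, and the example tags are illustrative assumptions, not the paper's statistical model.

```python
# Hypothetical ranking of candidate tags by frequency plus visual support.
def rank_tags(neighbors, alpha=0.5, top_k=2):
    """neighbors: list of (visual_similarity in [0, 1], tag list) pairs."""
    freq, vis = {}, {}
    for sim, tags in neighbors:
        for t in tags:
            freq[t] = freq.get(t, 0) + 1
            vis[t] = max(vis.get(t, 0.0), sim)   # best visual evidence per tag
    n = len(neighbors)
    score = {t: alpha * freq[t] / n + (1 - alpha) * vis[t] for t in freq}
    return sorted(score, key=score.get, reverse=True)[:top_k]

neighbors = [
    (0.9, ["duomo"]),           # visually very similar neighbor
    (0.2, ["milano"]),          # same location, visually dissimilar
    (0.8, ["duomo", "milano"]),
]
print(rank_tags(neighbors))  # "duomo" has both frequency and visual support
```

A tag that is merely frequent at the location but never appears on visually similar photos is demoted relative to tags supported by both kinds of evidence.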