204 research outputs found

    Semantic multimedia analysis using knowledge and context

    Get PDF
    PhDThe difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and the difficulty to express them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed from the perspective of using overwhelming amounts of training data to capture all possible instantiations of a concept, or from the perspective of using explicit knowledge about the concepts’ relations to infer their presence. In this thesis we address three problems of pattern recognition and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a BNs modeling approach that defines a conceptual space where both domain related evi- dence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to sig- nificant gains in performance compared to analysis methods that can not handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as manually annotated images when working with large datasets, we significantly contribute towards scal- able object detection. Finally, we introduce a method that treats images, visual features and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both visual and tag information space. By showing that the cross-modal depen- dencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media

    Smartphone picture organization: a hierarchical approach

    Get PDF
    We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which typically are pre-processed by the user who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured and because smartphones are ubiquitous, they present a larger variability compared to pictures captured by a digital camera. To solve the need of organizing large smartphone photo collections automatically, we propose here a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neuronal Networks. To validate our approach, we ensemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate major user satisfaction with respect to state of the art solutions in terms of organization.Peer ReviewedPreprin

    Tag-Aware Recommender Systems: A State-of-the-art Survey

    Get PDF
    In the past decade, Social Tagging Systems have attracted increasing attention from both physical and computer science communities. Besides the underlying structure and dynamics of tagging systems, many efforts have been addressed to unify tagging information to reveal user behaviors and preferences, extract the latent semantic relations among items, make recommendations, and so on. Specifically, this article summarizes recent progress about tag-aware recommender systems, emphasizing on the contributions from three mainstream perspectives and approaches: network-based methods, tensor-based methods, and the topic-based methods. Finally, we outline some other tag-related works and future challenges of tag-aware recommendation algorithms.Comment: 19 pages, 3 figure

    Context based multimedia information retrieval

    Get PDF

    Learning Contextualized Semantics from Co-occurring Terms via a Siamese Architecture

    Get PDF
    One of the biggest challenges in Multimedia information retrieval and understanding is to bridge the semantic gap by properly modeling concept semantics in context. The presence of out of vocabulary (OOV) concepts exacerbates this difficulty. To address the semantic gap issues, we formulate a problem on learning contextualized semantics from descriptive terms and propose a novel Siamese architecture to model the contextualized semantics from descriptive terms. By means of pattern aggregation and probabilistic topic models, our Siamese architecture captures contextualized semantics from the co-occurring descriptive terms via unsupervised learning, which leads to a concept embedding space of the terms in context. Furthermore, the co-occurring OOV concepts can be easily represented in the learnt concept embedding space. The main properties of the concept embedding space are demonstrated via visualization. Using various settings in semantic priming, we have carried out a thorough evaluation by comparing our approach to a number of state-of-the-art methods on six annotation corpora in different domains, i.e., MagTag5K, CAL500 and Million Song Dataset in the music domain as well as Corel5K, LabelMe and SUNDatabase in the image domain. Experimental results on semantic priming suggest that our approach outperforms those state-of-the-art methods considerably in various aspects

    Building a semantic search engine with games and crowdsourcing

    Get PDF
    Semantic search engines aim at improving conventional search with semantic information, or meta-data, on the data searched for and/or on the searchers. So far, approaches to semantic search exploit characteristics of the searchers like age, education, or spoken language for selecting and/or ranking search results. Such data allow to build up a semantic search engine as an extension of a conventional search engine. The crawlers of well established search engines like Google, Yahoo! or Bing can index documents but, so far, their capabilities to recognize the intentions of searchers are still rather limited. Indeed, taking into account characteristics of the searchers considerably extend both, the quantity of data to analyse and the dimensionality of the search problem. Well established search engines therefore still focus on general search, that is, "search for all", not on specialized search, that is, "search for a few". This thesis reports on techniques that have been adapted or conceived, deployed, and tested for building a semantic search engine for the very specific context of artworks. In contrast to, for example, the interpretation of X-ray images, the interpretation of artworks is far from being fully automatable. Therefore artwork interpretation has been based on Human Computation, that is, a software-based gathering of contributions by many humans. The approach reported about in this thesis first relies on so called Games With A Purpose, or GWAPs, for this gathering: Casual games provide an incentive for a potentially unlimited community of humans to contribute with their appreciations of artworks. Designing convenient incentives is less trivial than it might seem at first. An ecosystem of games is needed so as to collect the meta-data on artworks intended for. One game generates the data that can serve as input of another game. This results in semantically rich meta-data that can be used for building up a successful semantic search engine. Thus, a first part of this thesis reports on a "game ecosystem" specifically designed from one known game and including several novel games belonging to the following game classes: (1) Description Games for collecting obvious and trivial meta-data, basically the well-known ESP (for extra-sensorial perception) game of Luis von Ahn, (2) the Dissemination Game Eligo generating translations, (3) the Diversification Game Karido aiming at sharpening differences between the objects, that is, the artworks, interpreted and (3) the Integration Games Combino, Sentiment and TagATag that generate structured meta-data. Secondly, the approach to building a semantic search engine reported about in this thesis relies on Higher-Order Singular Value Decomposition (SVD). More precisely, the data and meta-data on artworks gathered with the afore mentioned GWAPs are collected in a tensor, that is a mathematical structure generalising matrices to more than only two dimensions, columns and rows. The dimensions considered are the artwork descriptions, the players, and the artwork themselves. A Higher-Order SVD of this tensor is first used for noise reduction in This thesis reports also on deploying a Higher-Order LSA. The parallel Higher-Order SVD algorithm applied for the Higher-Order LSA and its implementation has been validated on an application related to, but independent from, the semantic search engine for artworks striven for: image compression. This thesis reports on the surprisingly good image compression which can be achieved with Higher-Order SVD. While compression methods based on matrix SVD for each color, the approach reported about in this thesis relies on one single (higher-order) SVD of the whole tensor. This results in both, better quality of the compressed image and in a significant reduction of the memory space needed. Higher-Order SVD is extremely time-consuming what calls for parallel computation. Thus, a step towards automatizing the construction of a semantic search engine for artworks was parallelizing the higher-order SVD method used and running the resulting parallel algorithm on a super-computer. This thesis reports on using Hestenes’ method and R-SVD for parallelising the higher-order SVD. This method is an unconventional choice which is explained and motivated. As of the super-computer needed, this thesis reports on turning the web browsers of the players or searchers into a distributed parallel computer. This is done by a novel specific system and a novel implementation of the MapReduce data framework to data parallelism. Harnessing the web browsers of the players or searchers saves computational power on the server-side. It also scales extremely well with the number of players or searchers because both, playing with and searching for artworks, require human reflection and therefore results in idle local processors that can be brought together into a distributed super-computer.Semantische Suchmaschinen dienen der Verbesserung konventioneller Suche mit semantischen Informationen, oder Metadaten, zu Daten, nach denen gesucht wird, oder zu den Suchenden. Bisher nutzt Semantische Suche Charakteristika von Suchenden wie Alter, Bildung oder gesprochene Sprache fĂŒr die Auswahl und/oder das Ranking von Suchergebnissen. Solche Daten erlauben den Aufbau einer Semantischen Suchmaschine als Erweiterung einer konventionellen Suchmaschine. Die Crawler der fest etablierten Suchmaschinen wie Google, Yahoo! oder Bing können Dokumente indizieren, bisher sind die FĂ€higkeiten eher beschrĂ€nkt, die Absichten von Suchenden zu erkennen. TatsĂ€chlich erweitert die BerĂŒcksichtigung von Charakteristika von Suchenden betrĂ€chtlich beides, die Menge an zu analysierenden Daten und die DimensionalitĂ€t des Such-Problems. Fest etablierte Suchmaschinen fokussieren deswegen stark auf allgemeine Suche, also "Suche fĂŒr alle", nicht auf spezialisierte Suche, also "Suche fĂŒr wenige". Diese Arbeit berichtet von Techniken, die adaptiert oder konzipiert, eingesetzt und getestet wurden, um eine semantische Suchmaschine fĂŒr den sehr speziellen Kontext von Kunstwerken aufzubauen. Im Gegensatz beispielsweise zur Interpretation von Röntgenbildern ist die Interpretation von Kunstwerken weit weg davon gĂ€nzlich automatisiert werden zu können. Deswegen basiert die Interpretation von Kunstwerken auf menschlichen Berechnungen, also Software-basiertes Sammeln von menschlichen BeitrĂ€gen. Der Ansatz, ĂŒber den in dieser Arbeit berichtet wird, beruht auf sogenannten "Games With a Purpose" oder GWAPs die folgendes sammeln: Zwanglose Spiele bieten einen Anreiz fĂŒr eine potenziell unbeschrĂ€nkte Gemeinde von Menschen, mit Ihrer WertschĂ€tzung von Kunstwerken beizutragen. Geeignete Anreize zu entwerfen in weniger trivial als es zuerst scheinen mag. Ein Ökosystem von Spielen wird benötigt, um Metadaten gedacht fĂŒr Kunstwerke zu sammeln. Ein Spiel erzeugt Daten, die als Eingabe fĂŒr ein anderes Spiel dienen können. Dies resultiert in semantisch reichhaltigen Metadaten, die verwendet werden können, um eine erfolgreiche Semantische Suchmaschine aufzubauen. Deswegen berichtet der erste Teil dieser Arbeit von einem "Spiel-Ökosystem", entwickelt auf Basis eines bekannten Spiels und verschiedenen neuartigen Spielen, die zu verschiedenen Spiel-Klassen gehören. (1) Beschreibungs-Spiele zum Sammeln offensichtlicher und trivialer Metadaten, vor allem dem gut bekannten ESP-Spiel (Extra Sensorische Wahrnehmung) von Luis von Ahn, (2) dem Verbreitungs-Spiel Eligo zur Erzeugung von Übersetzungen, (3) dem Diversifikations-Spiel Karido, das Unterschiede zwischen Objekten, also interpretierten Kunstwerken, schĂ€rft und (3) Integrations-Spiele Combino, Sentiment und Tag A Tag, die strukturierte Metadaten erzeugen. Zweitens beruht der Ansatz zum Aufbau einer semantischen Suchmaschine, wie in dieser Arbeit berichtet, auf SingulĂ€rwertzerlegung (SVD) höherer Ordnung. PrĂ€ziser werden die Daten und Metadaten ĂŒber Kunstwerk gesammelt mit den vorher genannten GWAPs in einem Tensor gesammelt, einer mathematischen Struktur zur Generalisierung von Matrizen zu mehr als zwei Dimensionen, Spalten und Zeilen. Die betrachteten Dimensionen sind die Beschreibungen der Kunstwerke, die Spieler, und die Kunstwerke selbst. Eine SingulĂ€rwertzerlegung höherer Ordnung dieses Tensors wird zuerst zur Rauschreduktion verwendet nach der Methode der sogenannten Latenten Semantischen Analyse (LSA). Diese Arbeit berichtet auch ĂŒber die Anwendung einer LSA höherer Ordnung. Der parallele Algorithmus fĂŒr SingulĂ€rwertzerlegungen höherer Ordnung, der fĂŒr LSA höherer Ordnung verwendet wird, und seine Implementierung wurden validiert an einer verwandten aber von der semantischen Suche unabhĂ€ngig angestrebten Anwendung: Bildkompression. Diese Arbeit berichtet von ĂŒberraschend guter Kompression, die mit SingulĂ€rwertzerlegung höherer Ordnung erzielt werden kann. Neben Matrix-SVD-basierten Kompressionsverfahren fĂŒr jede Farbe, beruht der Ansatz wie in dieser Arbeit berichtet auf einer einzigen SVD (höherer Ordnung) auf dem gesamten Tensor. Dies resultiert in beidem, besserer QualitĂ€t von komprimierten Bildern und einer signifikant geringeren des benötigten Speicherplatzes. SingulĂ€rwertzerlegung höherer Ordnung ist extrem zeitaufwĂ€ndig, was parallele Berechnung verlangt. Deswegen war ein Schritt in Richtung Aufbau einer semantischen Suchmaschine fĂŒr Kunstwerke eine Parallelisierung der verwendeten SVD höherer Ordnung auf einem Super-Computer. Diese Arbeit berichtet vom Einsatz der Hestenes’-Methode und R-SVD zur Parallelisierung der SVD höherer Ordnung. Diese Methode ist eine unkonventionell Wahl, die erklĂ€rt und motiviert wird. Ab nun wird ein Super-Computer benötigt. Diese Arbeit berichtet ĂŒber die Wandlung der Webbrowser von Spielern oder Suchenden in einen verteilten Super-Computer. Dies leistet ein neuartiges spezielles System und eine neuartige Implementierung des MapReduce Daten-Frameworks fĂŒr Datenparallelismus. Das Einspannen der Webbrowser von Spielern und Suchenden spart server-seitige Berechnungskraft. Ebenso skaliert die Berechnungskraft so extrem gut mit der Spieleranzahl oder Suchenden, denn beides, Spiel mit oder Suche nach Kunstwerken, benötigt menschliche Reflektion, was deswegen zu ungenutzten lokalen Prozessoren fĂŒhrt, die zu einem verteilten Super-Computer zusammengeschlossen werden können

    Infinite factorization of multiple non-parametric views

    Get PDF
    Combined analysis of multiple data sources has increasing application interest, in particular for distinguishing shared and source-specific aspects. We extend this rationale of classical canonical correlation analysis into a flexible, generative and non-parametric clustering setting, by introducing a novel non-parametric hierarchical mixture model. The lower level of the model describes each source with a flexible non-parametric mixture, and the top level combines these to describe commonalities of the sources. The lower-level clusters arise from hierarchical Dirichlet Processes, inducing an infinite-dimensional contingency table between the views. The commonalities between the sources are modeled by an infinite block model of the contingency table, interpretable as non-negative factorization of infinite matrices, or as a prior for infinite contingency tables. With Gaussian mixture components plugged in for continuous measurements, the model is applied to two views of genes, mRNA expression and abundance of the produced proteins, to expose groups of genes that are co-regulated in either or both of the views. Cluster analysis of co-expression is a standard simple way of screening for co-regulation, and the two-view analysis extends the approach to distinguishing between pre- and post-translational regulation

    Co-occurrence Models for Image Annotation and Retrieval

    Get PDF
    We present two models for content-based automatic image annotation and retrieval in web image repositories, based on the co-occurrence of tags and visual features in the images. In particular, we show how additional measures can be taken to address the noisy and limited tagging problems, in datasets such as Flickr, to improve performance. As in many state-of-the-art works, an image is represented as a bag of visual terms computed using edge and color information. The cooccurrence information of visual terms and tags is used to create models for image annotation and retrieval. The first model begins with a naive Bayes approach and then improves upon it by using image pairs as single documents to significantly reduce the noise and increase annotation performance. The second method models the visual terms and tags as a graph, and uses query expansion techniques to improve the retrieval performance. We evaluate our methods on the commonly used 150 concept Corel dataset, and a much harder 2000 concept Flickr dataset

    Tag-Aware Recommender Systems: A State-of-the-Art Survey

    Get PDF
    In the past decade, Social Tagging Systems have attracted increasing attention from both physical and computer science communities. Besides the underlying structure and dynamics of tagging systems, many efforts have been addressed to unify tagging information to reveal user behaviors and preferences, extract the latent semantic relations among items, make recommendations, and so on. Specifically, this article summarizes recent progress about tag-aware recommender systems, emphasizing on the contributions from three mainstream perspectives and approaches: network-based methods, tensor-based methods, and the topic-based methods. Finally, we outline some other tag-related studies and future challenges of tag-aware recommendation algorithm
    • 

    corecore