481 research outputs found

    Identification of user interests in social media

    Social media has become an important part of our lives in a short amount of time. People share their experiences, opinions, and interests with others in a timely fashion on these platforms. In recent years, the fast growth of the user population in social media has not only driven research towards analyzing its users in order to fulfil their expectations, but has also made social media a crucial information source for decision-making processes in societies and businesses. In this work, we propose methods for identifying users and their interests by using the multimedia data shared in social media. We show the effectiveness of these methods in three applications. Our first application extracts the political interests of Turkish Twitter users. We collect tweets that include a set of predefined words representing two different political stances in Turkey. We extract the profile images of the users who wrote those tweets and apply a computer vision technique called image context extraction to this set of images to obtain textual explanations for each picture. The main goal of this work is to infer the proportions of the two political stances in order to forecast the results of the March 2014 local elections. Our results show that the proportions obtained from our method are almost the same as the vote percentages of the two parties. In our second application, we find the Facebook profiles of people whose identification information (name, surname, and location) is given, by querying the Facebook Graph API. Each query returns a number of profiles because many people share the same name. We refine these results by checking the location on the profile pages online. Our method achieves a successful match rate of 88% (1332/1500 people). The third application deals with building a community about a given topic of interest by condensing existing communities in a social media platform.
We collect the members of the communities about the given topic in a set and apply our relevance scoring method to these members. Those who receive a score below a threshold value are assumed to be irrelevant to the given topic and are eliminated, so that the users remaining in the set are the ones relevant to the topic. We validated the results of our framework with a user study. There is a 76% match between the user-labelled and automated results.
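
The threshold step in the third application can be illustrated with a minimal Python sketch. The abstract does not describe the relevance scoring method itself, so the scores, the threshold value, and the function names below are hypothetical; only the rule "drop members scoring below the threshold" and the idea of comparing automated and user-assigned labels come from the text.

```python
from typing import Dict, List


def filter_relevant_members(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return the members whose relevance score is at or above the threshold."""
    # Members scoring below the threshold are treated as irrelevant and dropped.
    return sorted(user for user, score in scores.items() if score >= threshold)


def agreement_rate(automated: Dict[str, bool], labelled: Dict[str, bool]) -> float:
    """Fraction of users, present in both labelings, on which the labels agree."""
    shared = automated.keys() & labelled.keys()
    if not shared:
        raise ValueError("the two labelings have no users in common")
    matches = sum(automated[user] == labelled[user] for user in shared)
    return matches / len(shared)


if __name__ == "__main__":
    # Hypothetical scores; the real scoring function is not given in the abstract.
    scores = {"alice": 0.82, "bob": 0.31, "carol": 0.55, "dave": 0.12}
    kept = filter_relevant_members(scores, threshold=0.5)
    print(kept)

    automated = {user: user in kept for user in scores}
    labelled = {"alice": True, "bob": False, "carol": False, "dave": False}
    print(agreement_rate(automated, labelled))
```

A user study like the one mentioned above would supply the `labelled` dictionary; the agreement rate is then the share of users on which both labelings coincide.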

    Semantically-enhanced image tagging system

    In multimedia databases, the data are images, audio, video, text, and so on. Research interest in these types of databases has increased over the last decade or so, especially with the advent of the Internet and the Semantic Web. Fundamental research issues range from unified data modelling to the retrieval of data items and the dynamic nature of updates. This thesis builds on findings in Semantic Web and retrieval techniques and explores novel tagging methods for identifying data items. Tagging systems, which enable users to add tags to Internet resources such as images, video, and audio to make them more manageable, have become popular. Collaborative tagging is concerned with the relationship between people and resources. Most of these resources have metadata in a machine-processable format and enable users to use free-text keywords (so-called tags) as a search technique. This research references several tagging systems, e.g. Flickr, Delicious, and MyWeb 2.0. The limitations of such techniques include polysemy (one word with different meanings), synonymy (different words with one meaning), different lexical forms (singular, plural, and conjugated words), and misspellings or alternate spellings. The work presented in this thesis introduces a semantic characterization of web resources that describes the structure and organization of tagging, aiming to extend the existing Multimedia Query using similarity measures to cater for collaborative tagging. In addition, we discuss the semantic difficulties of tagging systems and suggest improvements to their accuracy. The scope of our work is as follows: (i) increase the accuracy and confidence of multimedia tagging systems; (ii) improve the similarity measures of images by integrating a variety of measures. To address the first goal, we use WordNet as a semantic linguistic ontology resource within a tagging system for the social sharing and retrieval of images.
For the second goal, we use the similarity measures in different ways within the multimedia tagging system. Fundamental to our work is the novel information model that we have constructed for our computation. It is based on the fact that an image is a rich object that can be characterised and formulated in n dimensions, where each dimension contains valuable information that helps to increase the accuracy of the search. For example, an image of a tree in a forest contains more information than an image of the same tree in a different environment. In this thesis we characterise a data item (an image) by a primary description followed by n secondary descriptions. As n increases, the accuracy of the search improves. We give various techniques for analysing the data and its associated query. To increase the accuracy of the tagging system, we performed different experiments on many images using similarity measures and various techniques from VoI (Value of Information). The findings show a linkage between the similarity measures, and that VoI improves searches and guides a tagger in choosing the most adequate tags.
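
The idea of describing an image by one primary description plus n secondary descriptions can be sketched as follows. This is only an illustration under stated assumptions: the thesis does not name its similarity measures in the abstract, so the Jaccard measure, the weighting scheme, and the names `jaccard`, `image_similarity`, and `primary_weight` are all hypothetical choices made here.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two tag sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def image_similarity(
    primary_a: Set[str],
    secondary_a: List[Set[str]],
    primary_b: Set[str],
    secondary_b: List[Set[str]],
    primary_weight: float = 0.5,
) -> float:
    """Blend primary-description similarity with the mean secondary similarity."""
    primary_score = jaccard(primary_a, primary_b)
    # Compare secondary descriptions dimension by dimension (assumed aligned).
    pairs = list(zip(secondary_a, secondary_b))
    if not pairs:
        return primary_score
    secondary_score = sum(jaccard(x, y) for x, y in pairs) / len(pairs)
    return primary_weight * primary_score + (1 - primary_weight) * secondary_score


if __name__ == "__main__":
    # Same tree (primary), different environments (one secondary dimension).
    forest = ({"tree"}, [{"forest", "green"}])
    park = ({"tree"}, [{"park", "green"}])
    print(image_similarity(forest[0], forest[1], park[0], park[1]))
```

With only the primary description the two images look identical; the secondary dimension is what separates the forest scene from the park scene, which is the intuition behind the tree example above.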

    Adapting information retrieval to user needs in an evolving web environment

    [no abstract]

    Text–to–Video: Image Semantics and NLP

    When automatically translating an arbitrary text into a visual story, the main challenge is to find a semantically close visual representation whose displayed meaning remains the same as that of the given text. In addition, the appearance of an image itself largely influences how its meaning is conveyed to an observer. This thesis demonstrates that investigating both image semantics and the semantic relatedness between visual and textual sources enables us to tackle the challenging semantic gap and to find a semantically close translation from natural language to a corresponding visual representation. In recent years, social networking has attracted great interest, leading to an enormous and still increasing amount of data available online. Photo sharing sites like Flickr allow users to associate textual information with their uploaded imagery. This thesis therefore exploits this huge knowledge source of user-generated data, which provides initial links between images and words, as well as other meaningful data. To approach visual semantics, this work presents various methods for analyzing the visual structure and the appearance of images in terms of meaningful similarities, aesthetic appeal, and emotional effect on an observer. In detail, our GPU-based approach efficiently finds visual similarities between images in large datasets across visual domains and identifies the various meanings of ambiguous words by exploring similarity in online search results. Further, we investigate the highly subjective aesthetic appeal of images and use deep learning to learn aesthetic rankings directly from a broad diversity of user reactions in social online behavior. To gain even deeper insights into the influence of visual appearance on an observer, we explore how simple image processing can actually change emotional perception, and we derive a simple but effective image filter.
To identify meaningful connections between written text and visual representations, we employ methods from Natural Language Processing (NLP). Extensive textual processing allows us to create semantically relevant illustrations for simple text elements as well as complete storylines. More precisely, we present an approach that resolves dependencies in textual descriptions to arrange 3D models correctly. Further, we develop a method that finds semantically relevant illustrations to texts of different types based on a novel hierarchical querying algorithm. Finally, we present an optimization-based framework that is capable of generating picture stories in different styles that are not only semantically relevant but also visually coherent.

    Semantic Interaction in Web-based Retrieval Systems : Adopting Semantic Web Technologies and Social Networking Paradigms for Interacting with Semi-structured Web Data

    Existing web retrieval models for exploring and interacting with web data do not take semantic information into account, nor do they allow for new forms of interaction that employ meaningful interaction and navigation metaphors in 2D/3D. This thesis researches means of introducing a semantic dimension into the search and exploration process of web content to enable a significantly positive user experience. To that end, it adopts an inherently dynamic view beyond single concepts and models from semantic information processing, information extraction, and human-machine interaction. Essential tasks for semantic interaction, such as semantic annotation, semantic mediation, and semantic human-computer interaction, were identified and elaborated for two general application scenarios in web retrieval: Web-based Question Answering in a knowledge-based dialogue system and semantic exploration of information spaces in 2D/3D.

    Information search and similarity based on Web 2.0 and semantic technologies

    The World Wide Web puts a huge amount of information, described in natural language, at society's disposal. Web search engines were born from the need to find a particular piece of that information. Their ease of use and their utility have turned these engines into one of the most used web tools on a daily basis. To make a query, users just have to enter a set of words (keywords) in natural language, and the engine answers with an ordered list of resources that contain those words. The order is given by ranking algorithms. These algorithms basically use two types of features: dynamic and static factors. The dynamic factor takes the query into account; that is, documents that contain the keywords used to describe the query are more relevant to that query. The hyperlink structure among documents is an example of a static factor in most current algorithms. For example, if most documents link to a particular document, this document may be more relevant than others because it is more popular. Even though there is currently wide consensus that most web search engines provide good results, these tools still suffer from some limitations, basically 1) the loneliness of the searching activity itself; and 2) the simple retrieval process, based mainly on offering the documents that contain the exact terms used to describe the query. Regarding the first problem, there is no doubt that searching for relevant information on the World Wide Web is a lonely and time-consuming process. Thousands of users repeat previously executed queries and spend time deciding which documents are relevant; those decisions may already have been made by others and could serve other users with similar or identical queries.
Regarding the second problem, the textual nature of the current Web restricts the reasoning capability of web search engines; queries and web resources are described in natural language, which in some cases leads to ambiguity or other semantic difficulties. Computers do not understand text; however, if semantics is incorporated into the text, meaning and sense are incorporated too. In this way, queries and web resources are no longer mere sets of terms, but lists of well-defined concepts. This thesis proposes a semantic layer, known as Itaca, which combines simplicity and effectiveness in order to endow with semantics both the resources stored on the World Wide Web and the queries users issue to find those resources. This is achieved through collaborative annotations and relevance feedback provided by the users themselves, which describe both the queries and the web resources by means of Wikipedia concepts. Itaca extends the functional capabilities of current web search engines, providing a new ranking algorithm without dispensing with traditional ranking models. Experiments show that this new architecture offers more precision in the final results while keeping the simplicity and usability of existing web search engines. Its design as a layer makes it feasible to include in current engines in a simple way. Official Postgraduate Programme in Telematics Engineering. Chair: Asunción Gómez Pérez. Secretary: Mario Muñoz Organero. Member: Anselmo Peñas Padill
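
The dynamic and static ranking factors described in this abstract can be illustrated with a minimal Python sketch. It is not Itaca's ranking algorithm, which the abstract does not specify; the keyword-overlap score, the normalized in-link count, the linear blend, and the names `dynamic_score`, `static_scores`, `rank`, and `alpha` are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def dynamic_score(query_terms: Set[str], doc_terms: Set[str]) -> float:
    """Dynamic factor: fraction of the query keywords found in the document."""
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def static_scores(links: Dict[str, Set[str]]) -> Dict[str, float]:
    """Static factor: in-link count per document, scaled by the largest count."""
    counts = {doc: 0 for doc in links}
    for source, targets in links.items():
        for target in targets:
            if target != source:  # ignore self-links
                counts[target] = counts.get(target, 0) + 1
    top = max(counts.values(), default=0)
    return {doc: (count / top if top else 0.0) for doc, count in counts.items()}


def rank(
    query_terms: Set[str],
    docs: Dict[str, Set[str]],
    links: Dict[str, Set[str]],
    alpha: float = 0.7,
) -> List[str]:
    """Order documents by a weighted blend of the dynamic and static factors."""
    popularity = static_scores(links)

    def score(doc: str) -> float:
        return alpha * dynamic_score(query_terms, docs[doc]) + (1 - alpha) * popularity.get(doc, 0.0)

    return sorted(docs, key=score, reverse=True)


if __name__ == "__main__":
    docs = {"a": {"semantic", "web"}, "b": {"semantic", "web"}, "c": {"cooking"}}
    links = {"a": {"b"}, "b": set(), "c": {"b"}}  # both a and c link to b
    print(rank({"semantic", "web"}, docs, links))
```

In the example, documents `a` and `b` match the query equally well, so the static factor decides: `b` is ranked first because more documents link to it.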

    Exploiting tag information for search and personalization

    [no abstract]

    Social and Semantic Contexts in Tourist Mobile Applications

    The ongoing growth of the World Wide Web, along with the increasing possibility of accessing information through a variety of mobile devices, has definitely changed the way users acquire, create, and personalize information, pushing innovative strategies for annotating and organizing it. In this scenario, Social Annotation Systems have quickly gained huge popularity, introducing millions of metadata items on different Web resources following a bottom-up approach and generating free and democratic classification mechanisms, namely folksonomies. Moving away from hierarchical classification schemas, folksonomies also represent a meaningful means of identifying similarities among users, resources, and tags. However, they suffer from several limitations, such as the lack of specialized tools to manage, modify, customize, and visualize them, as well as the lack of explicit semantics, which makes it difficult for users to benefit from them effectively. Despite the appealing promises of Semantic Web technologies, which were intended to formalize the knowledge within a particular domain explicitly and in a top-down manner in order to perform intelligent integration and reasoning on it, they are still far from reaching their objectives because of difficulties in knowledge acquisition and the annotation bottleneck. The main contribution of this dissertation is a novel conceptual framework that exploits both social and semantic contextual dimensions, focusing on the domain of tourism and cultural heritage. The primary aim of our assessment is to evaluate the overall user satisfaction and the perceived quality in use by means of two concrete case studies.
First, we concentrate on contextual information and navigation, and on an authoring tool; second, we provide a semantic mapping of the tags of the system folksonomy, contrasted and compared with the expert users' classification, building a bridge between social and semantic knowledge as the two grow together. The results of the user evaluations are promising, reporting a high level of agreement on the perceived quality in use of both applications and of the specific features analyzed, and demonstrating that a social-semantic contextual model improves general user satisfaction.