Text–to–Video: Image Semantics and NLP
When aiming to automatically translate an arbitrary text into a visual story, the main challenge is to find a semantically close visual representation whose displayed meaning remains the same as in the given text. Moreover, the appearance of an image itself largely influences how its meaningful information is conveyed to an observer. This thesis demonstrates that investigating both image semantics and the semantic relatedness between visual and textual sources enables us to tackle the challenging semantic gap and to find a semantically close translation from natural language to a corresponding visual representation.
In recent years, social networking has attracted great interest, leading to an enormous and still increasing amount of data available online. Photo sharing sites such as Flickr allow users to associate textual information with their uploaded imagery. This thesis therefore exploits this huge source of user-generated data, which provides initial links between images, words, and other meaningful data.
To approach visual semantics, this work presents various methods to analyze the visual structure and appearance of images in terms of meaningful similarities, aesthetic appeal, and emotional effect on an observer. In detail, our GPU-based approach efficiently finds visual similarities between images in large datasets across visual domains and identifies the various meanings of ambiguous words by exploring similarity in online search results. Further, we investigate the highly subjective aesthetic appeal of images and use deep learning to learn aesthetic rankings directly from a broad diversity of user reactions in online social behavior. To gain even deeper insights into the influence of visual appearance on an observer, we explore how simple image processing can actually change emotional perception, and derive a simple but effective image filter.
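The GPU-based similarity search itself is not detailed in this abstract, but its core idea of ranking images by feature-vector similarity can be illustrated with a minimal sketch; the 4-dimensional descriptors and the cosine measure below are illustrative stand-ins, not the thesis's actual features or implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two image feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(query, gallery):
    """Return (index, score) pairs ranked by similarity to the query."""
    scores = [(i, cosine_similarity(query, feat)) for i, feat in enumerate(gallery)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy 4-dimensional descriptors standing in for real image features.
gallery = [[1.0, 0.0, 0.0, 0.0],
           [0.9, 0.1, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0]]
ranking = most_similar([1.0, 0.05, 0.0, 0.0], gallery)
```

In practice such pairwise comparisons are embarrassingly parallel, which is what makes a GPU implementation attractive for large datasets.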
To identify meaningful connections between written text and visual representations, we employ methods from Natural Language Processing (NLP). Extensive textual processing allows us to create semantically relevant illustrations for simple text elements as well as complete storylines. More precisely, we present an approach that resolves dependencies in textual descriptions to arrange 3D models correctly. Further, we develop a method that finds semantically relevant illustrations for texts of different types based on a novel hierarchical querying algorithm. Finally, we present an optimization-based framework capable of generating picture stories in different styles that are not only semantically relevant but also visually coherent.
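The dependency-resolving text-to-scene step described above can be sketched as a toy rule-based arranger; the relation set, the offsets, and the sentence pattern below are hypothetical simplifications, not the thesis's actual NLP pipeline:

```python
import re

# Spatial offsets for a few supported relations (hypothetical values).
RELATION_OFFSETS = {
    "on": (0.0, 1.0, 0.0),       # subject sits above the reference object
    "next to": (1.0, 0.0, 0.0),  # subject sits beside it
    "under": (0.0, -1.0, 0.0),   # subject sits below it
}

PATTERN = re.compile(r"the (\w+) is (on|next to|under) the (\w+)")

def arrange(description):
    """Derive 3D positions from simple spatial sentences.

    Reference objects default to the origin; each matched sentence places
    the subject at a fixed offset from its reference object.
    """
    positions = {}
    for subject, relation, reference in PATTERN.findall(description.lower()):
        ref_pos = positions.setdefault(reference, (0.0, 0.0, 0.0))
        dx, dy, dz = RELATION_OFFSETS[relation]
        positions[subject] = (ref_pos[0] + dx, ref_pos[1] + dy, ref_pos[2] + dz)
    return positions

scene = arrange("The lamp is on the table. The chair is next to the table.")
```

A real system would use a dependency parser rather than regular expressions, but the mapping from extracted (subject, relation, object) triples to object placements follows the same shape.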
Leveraging Mobile App Classification and User Context Information for Improving Recommendation Systems
Mobile apps play a significant role in current online environments, where there is an overwhelming supply of information. Although mobile apps are part of our daily routine, searching for and finding them has become a nontrivial task due to the current volume, velocity, and variety of information. App recommender systems therefore suggest apps based on users' preferences. However, current recommender systems and their underlying techniques are limited in effectively leveraging app classification schemes and context information. In this thesis, I attempt to address this gap by proposing a text analytics framework for mobile app recommendation that leverages an app classification scheme incorporating the needs of users as well as the complexity of the user-item-context information in mobile app usage patterns. In this recommendation framework, I adopt and empirically test an app classification scheme based on textual information about mobile apps, using data from the Google Play store. In addition, I demonstrate how context information such as a user's social media status can be matched with app classification categories using tree-based and rule-based prediction algorithms. Methodologically, my research attempts to show the feasibility of textual data analysis for profiling apps based on app descriptions and other structured attributes, and explores mechanisms for matching user preferences and context information with app usage categories. Practically, the proposed text analytics framework can allow app developers to reach a wider user base through a better understanding of user motivation and context information.
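A minimal sketch of matching context information to app categories might look as follows; the keyword rules and category names are invented for illustration, whereas the thesis learns such mappings with tree-based and rule-based prediction algorithms rather than hand-written rules:

```python
# Hypothetical keyword rules mapping social media status text to app
# categories (illustrative only).
RULES = [
    ({"run", "running", "gym", "workout"}, "Health & Fitness"),
    ({"flight", "trip", "vacation", "hotel"}, "Travel"),
    ({"dinner", "recipe", "restaurant"}, "Food & Drink"),
]

def predict_category(status, default="Productivity"):
    """Return the first app category whose keywords overlap the status."""
    tokens = set(status.lower().split())
    for keywords, category in RULES:
        if tokens & keywords:
            return category
    return default

category = predict_category("Morning run before the gym session")
```

A learned rule-based classifier induces exactly this kind of decision list from labeled data instead of relying on manually curated keyword sets.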
Understanding and modeling of aesthetic response to shape and color in car body design
This study explored the phenomenon that a consumer's preference for the color of a car body may vary depending on its shape. First, the study attempted to establish a theoretical framework that can account for this phenomenon. This framework is based on the (modern) Darwinian approach of evolutionary psychology and aesthetics. It assumes that the human aesthetic sense works like an agent that seeks environmental patterns which potentially benefit the agent's underlying needs, and that this seeking process is shaped by evolutionary fitness. Second, by adopting the framework, a pattern called “fundamental aesthetic dimensions” was developed for identifying and modeling consumers' aesthetic responses to car body shape and color. Next, this study developed an effective tool capable of capturing and accommodating a consumer's color preference for a given car body shape. This tool was implemented by incorporating classic color theories and advanced digital technologies; it was named the “Color-Shape Synthesizer”. Finally, an experiment was conducted to verify some of the theoretical developments.
This study concluded that (1) the fundamental aesthetic dimensions can be used to describe aesthetics in terms of shape and color; (2) the Color-Shape Synthesizer tool can be readily applied in car body design practice; and (3) the mapping from semantic representations of aesthetic response to the fundamental aesthetic dimensions is likely a multiple-network structure.
A Semantics-based User Interface Model for Content Annotation, Authoring and Exploration
The Semantic Web and Linked Data movements, which aim to create, publish and interconnect machine-readable information, have gained traction in recent years.
However, the majority of information is still contained in and exchanged using unstructured documents, such as Web pages, text documents, images and videos.
Nor can this be expected to change, since text, images and videos are the natural way in which humans interact with information.
Semantic structuring of content on the other hand provides a wide range of advantages compared to unstructured information.
Semantically-enriched documents facilitate information search and retrieval, presentation, integration, reusability, interoperability and personalization.
Looking at the life-cycle of semantic content on the Web of Data, we see considerable progress on the backend side in storing structured content and in linking data and schemata.
Nevertheless, the currently least developed aspect of the semantic content life-cycle is, from our point of view, the user-friendly manual and semi-automatic creation of rich semantic content.
In this thesis, we propose a semantics-based user interface model, which aims to reduce the complexity of underlying technologies for semantic enrichment of content by Web users.
By surveying existing tools and approaches for semantic content authoring, we extracted a set of guidelines for designing efficient and effective semantic authoring user interfaces.
We applied these guidelines to devise a semantics-based user interface model called WYSIWYM (What You See Is What You Mean) which enables integrated authoring, visualization and exploration of unstructured and (semi-)structured content.
To assess the applicability of our proposed WYSIWYM model, we incorporated the model into four real-world use cases comprising two general and two domain-specific applications.
These use cases address four aspects of the WYSIWYM implementation:
1) Its integration into existing user interfaces,
2) Utilizing it for lightweight text analytics to incentivize users,
3) Dealing with crowdsourcing of semi-structured e-learning content,
4) Incorporating it for authoring semantic medical prescriptions.
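The core WYSIWYM idea of binding visible text to machine-readable semantics without changing what the user sees can be illustrated with a small sketch that emits an RDFa-annotated span; the DBpedia URIs below are example values, not part of the described model:

```python
def annotate_rdfa(text, entity_uri, type_uri):
    """Wrap a text span in an RDFa-annotated <span> element.

    WYSIWYM-style interfaces keep the visible text unchanged while the
    surrounding markup binds it to a machine-readable resource.
    """
    return ('<span about="{uri}" typeof="{type}">{text}</span>'
            .format(uri=entity_uri, type=type_uri, text=text))

# Example: annotating the word "Berlin" with a DBpedia resource and type.
markup = annotate_rdfa("Berlin",
                       "http://dbpedia.org/resource/Berlin",
                       "http://dbpedia.org/ontology/City")
```

The reader still sees only "Berlin"; the `about` and `typeof` RDFa attributes carry the semantic enrichment for machines.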
Human-Centered Technologies for Inclusive Collection and Analysis of Public-Generated Data
The meteoric rise in the popularity of public engagement platforms such as social media, customer review websites, and public input solicitation efforts strives to establish an inclusive environment for the public to share their thoughts, ideas, opinions, and experiences. Many decisions made at a personal, local, or national scale are fueled by data generated by the public. As such, inclusive collection, analysis, sensemaking, and utilization of public-generated data are crucial to successful decision-making processes. However, people often struggle to engage, participate, and share their opinions due to inaccessibility, the rigidity of traditional public engagement methods, and the lack of options to provide opinions while avoiding potential confrontations. Concurrently, data analysts and decision-makers grapple with the challenges of analyzing, making sense of, and making informed decisions based on public-generated data, which include high dimensionality, the ambiguity of human language, and a lack of tools and techniques catered to their needs. Novel technological interventions are therefore necessary to enable the public to share their input without barriers and to allow decision-makers to capture, forage, peruse, and sublimate public-generated data into concrete and actionable insights.
The goal of this dissertation is to demonstrate how human-centered approaches that involve stakeholders in the design, development, and evaluation of tools and techniques can lead to inclusive, effective, and efficient approaches to public-generated data collection and analysis in support of informed decision-making. To that end, I first addressed the challenges of empowering the public to share their opinions by exploring two major opinion-sharing avenues: social media and public consultation. To learn more about people's social media experiences and challenges, I built two technology probes and conducted a qualitative exploratory study with 16 participants. I followed this up by exploring the challenges of inclusive participation during public consultations such as town halls. Based on a formative study with 66 participants and 20 organizers, I designed and developed CommunityClick to enable reticent attendees to share their opinions silently and anonymously during town halls. Equipped with the knowledge and experience from these works, I designed, developed, and evaluated technologies and methods to facilitate and accelerate informed data-driven decision-making based on increased public-generated data. Based on interviews with 14 analysts and decision-makers in the civic domain, I built a visual analytics system, CommunityPulse, that facilitates public input analysis by surfacing hidden insights, people's reflections, and priorities. Leveraging the lessons learned during this work, I created a visual text analytics system that supports serendipitous discovery and balanced analysis of textual data to help make informed decisions.
In this work, I contribute an understanding of how people collect and analyze public-generated data to fuel their decisions when they have increased exposure to alternative avenues for opinion-sharing. Through a series of human-centered studies, I highlight the challenges that inhibit inclusivity in opinion sharing and the shortcomings of existing methods that prevent decision-makers from accounting for comprehensive public input, including marginalized or unpopular opinions. To address these challenges, I designed, developed, and evaluated a collection of interactive systems: CommunityClick, CommunityPulse, and Serendyze. Through a rigorous set of evaluation strategies, including creativity sessions, controlled lab studies, in-the-wild deployment, and field experiments, I involved stakeholders to assess the effectiveness and utility of the built systems. Through the empirical evidence from these studies, I demonstrate how alternative designs for social media could enhance people's social media experiences and enable them to make new connections with others to share opinions. In addition, I show how CommunityClick can be utilized to enable reticent attendees at public consultations to share their opinions while avoiding unwanted confrontation, and to allow organizers to capture and account for silent feedback. I highlight how CommunityPulse allowed analysts and decision-makers to examine public input from multiple angles for accelerated analysis and more informed decision-making. Furthermore, I demonstrate how supporting serendipitous discovery and balanced analysis using Serendyze can lead to more informed data-driven decision-making.
I conclude the dissertation with a discussion on future avenues to expand this research including the facilitation of multi-user collaborative analysis, integration of multi-modal signals in the analysis of public-generated data, and potential adoption strategies for decision-support systems designed for inclusive collection and analysis of public-generated data
Spatial and Temporal Sentiment Analysis of Twitter data
The public worldwide use Twitter to express opinions. This study focuses on the spatio-temporal variation of georeferenced Tweets' sentiment polarity, with a view to understanding how opinions evolve on Twitter over space and time and across communities of users. More specifically, the question this study tested is whether sentiment polarity on Twitter exhibits specific time-location patterns. The aim of the study is to investigate the spatial and temporal distribution of georeferenced Twitter sentiment polarity within a 1 km buffer around the Curtin Bentley campus boundary in Perth, Western Australia. Tweets posted on campus were assigned to six spatial zones and four time periods. A sentiment analysis was then conducted for each zone using the sentiment analyser tool in the Starlight Visual Information System software. The Feature Manipulation Engine was employed to convert non-spatial files into spatial and temporal feature classes. The distribution of Twitter sentiment polarity patterns over space and time was mapped using Geographic Information Systems (GIS). Some interesting results were identified. For example, the highest percentage of positive Tweets occurred in the social science area, while the science and engineering and dormitory areas had the highest percentages of negative postings. The number of negative Tweets increased in the library and the science and engineering areas as the end of the semester approached, reaching a peak around the exam period, while the percentage of negative Tweets dropped at the end of the semester in the entertainment and sport and dormitory areas. This study provides some insights into understanding students' and staff's sentiment variation on Twitter, which could be useful for university teaching and learning management.
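The per-zone, per-period aggregation described above can be sketched as follows; the zone names, time periods, and polarity scores are toy values, and the actual study used the Starlight sentiment analyser and GIS tooling rather than this simplified tally:

```python
from collections import defaultdict

def polarity_by_zone(tweets):
    """Percentage of positive tweets per (spatial zone, time period)."""
    counts = defaultdict(lambda: [0, 0])  # (zone, period) -> [positive, total]
    for zone, period, polarity in tweets:
        bucket = counts[(zone, period)]
        bucket[1] += 1
        if polarity > 0:
            bucket[0] += 1
    return {key: 100.0 * pos / total for key, (pos, total) in counts.items()}

# Toy records: (campus zone, time period, sentiment polarity in [-1, 1]).
tweets = [
    ("library", "exam", -0.6),
    ("library", "exam", -0.2),
    ("library", "exam", 0.4),
    ("social science", "semester", 0.8),
    ("social science", "semester", 0.5),
]
shares = polarity_by_zone(tweets)
```

Mapping such aggregates back onto the spatial zones is then a straightforward join against the zone geometries in a GIS.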