301 research outputs found
Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings
In this paper we present a novel interactive multimodal learning system,
which facilitates search and exploration in large networks of social multimedia
users. It allows the analyst to identify and select users of interest, and to
find similar users in an interactive learning setting. Our approach is based on
novel multimodal representations of users, words and concepts, which we
simultaneously learn by deploying a general-purpose neural embedding model. We
show these representations to be useful not only for categorizing users, but
also for automatically generating user and community profiles. Inspired by
traditional summarization approaches, we create the profiles by selecting
diverse and representative content from all available modalities, i.e. the
text, image and user modality. The usefulness of the approach is evaluated
using artificial actors, which simulate user behavior in a relevance feedback
scenario. Multiple experiments were conducted in order to evaluate the quality
of our multimodal representations, to compare different embedding strategies,
and to determine the importance of different modalities. We demonstrate the
capabilities of the proposed approach on two different multimedia collections
originating from the violent online extremism forum Stormfront and the
microblogging platform Twitter, which are particularly interesting due to the
high semantic level of the discussions they feature
Artificial Intelligence-Enabled Intelligent Assistant for Personalized and Adaptive Learning in Higher Education
This paper presents a novel framework, Artificial Intelligence-Enabled
Intelligent Assistant (AIIA), for personalized and adaptive learning in higher
education. The AIIA system leverages advanced AI and Natural Language
Processing (NLP) techniques to create an interactive and engaging learning
platform. This platform is engineered to reduce cognitive load on learners by
providing easy access to information, facilitating knowledge assessment, and
delivering personalized learning support tailored to individual needs and
learning styles. The AIIA's capabilities include understanding and responding
to student inquiries, generating quizzes and flashcards, and offering
personalized learning pathways. The research findings have the potential to
significantly impact the design, implementation, and evaluation of AI-enabled
Virtual Teaching Assistants (VTAs) in higher education, informing the
development of innovative educational tools that can enhance student learning
outcomes, engagement, and satisfaction. The paper presents the methodology,
system architecture, intelligent services, and integration with Learning
Management Systems (LMSs) while discussing the challenges, limitations, and
future directions for the development of AI-enabled intelligent assistants in
education.Comment: 29 pages, 10 figures, 9659 word
Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models
Text-to-image generative models have demonstrated remarkable capabilities in
generating high-quality images based on textual prompts. However, crafting
prompts that accurately capture the user's creative intent remains challenging.
It often involves laborious trial-and-error procedures to ensure that the model
interprets the prompts in alignment with the user's intention. To address the
challenges, we present Promptify, an interactive system that supports prompt
exploration and refinement for text-to-image generative models. Promptify
utilizes a suggestion engine powered by large language models to help users
quickly explore and craft diverse prompts. Our interface allows users to
organize the generated images flexibly, and based on their preferences,
Promptify suggests potential changes to the original prompt. This feedback loop
enables users to iteratively refine their prompts and enhance desired features
while avoiding unwanted ones. Our user study shows that Promptify effectively
facilitates the text-to-image workflow and outperforms an existing baseline
tool widely used for text-to-image generation
Human Resources Recommender system based on discrete variables
Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business IntelligenceNatural Language Processing and Understanding has become one of the most exciting and challenging
fields in the area of Artificial Intelligence and Machine Learning. With the rapidly changing business
environment and surroundings, the importance of having the data transformed in such a way that
makes it easy to interpret is the greatest competitive advantage a company can have. Having said this,
the purpose of this thesis dissertation is to implement a recommender system for the Human
Resources department in a company that will aid the decision-making process of filling a specific job
position with the right candidate. The recommender system fill be fed with applicants, each being
represented by their skills, and will produce a subset of most adequate candidates given a job position.
This work uses StarSpace, a novelty neural embedding model, whose aim is to represent entities in a
common vectorial space and further perform similarity measures amongst them
Improving fairness in machine learning systems: What do industry practitioners need?
The potential for machine learning (ML) systems to amplify social inequities
and unfairness is receiving increasing popular and academic attention. A surge
of recent work has focused on the development of algorithmic tools to assess
and mitigate such unfairness. If these tools are to have a positive impact on
industry practice, however, it is crucial that their design be informed by an
understanding of real-world needs. Through 35 semi-structured interviews and an
anonymous survey of 267 ML practitioners, we conduct the first systematic
investigation of commercial product teams' challenges and needs for support in
developing fairer ML systems. We identify areas of alignment and disconnect
between the challenges faced by industry practitioners and solutions proposed
in the fair ML research literature. Based on these findings, we highlight
directions for future ML and HCI research that will better address industry
practitioners' needs.Comment: To appear in the 2019 ACM CHI Conference on Human Factors in
Computing Systems (CHI 2019
Text-image synergy for multimodal retrieval and annotation
Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities may be trivial for humans, but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting scopes for future research involving text-image data understanding.Text and images are the two most common data modalities found on the Internet. Understanding the synergy between text and images, that is, seamlessly analyzing information from these modalities may be trivial for humans, but is challenging for software systems. In this dissertation we study problems where deciphering text-image synergy is crucial for finding solutions. We propose methods and ideas that establish semantic connections between text and images in multimodal contents, and empirically show their effectiveness in four interconnected problems: Image Retrieval, Image Tag Refinement, Image-Text Alignment, and Image Captioning. Our promising results and observations open up interesting scopes for future research involving text-image data understanding.Text und Bild sind die beiden häufigsten Arten von Inhalten im Internet. Während es für Menschen einfach ist, gerade aus dem Zusammenspiel von Text- und Bildinhalten Informationen zu erfassen, stellt diese kombinierte Darstellung von Inhalten Softwaresysteme vor große Herausforderungen. In dieser Dissertation werden Probleme studiert, für deren Lösung das Verständnis des Zusammenspiels von Text- und Bildinhalten wesentlich ist. Es werden Methoden und Vorschläge präsentiert und empirisch bewertet, die semantische Verbindungen zwischen Text und Bild in multimodalen Daten herstellen. Wir stellen in dieser Dissertation vier miteinander verbundene Text- und Bildprobleme vor: • Bildersuche. Ob Bilder anhand von textbasierten Suchanfragen gefunden werden, hängt stark davon ab, ob der Text in der Nähe des Bildes mit dem der Anfrage übereinstimmt. Bilder ohne textuellen Kontext, oder sogar mit thematisch passendem Kontext, aber ohne direkte Übereinstimmungen der vorhandenen Schlagworte zur Suchanfrage, können häufig nicht gefunden werden. Zur Abhilfe schlagen wir vor, drei Arten von Informationen in Kombination zu nutzen: visuelle Informationen (in Form von automatisch generierten Bildbeschreibungen), textuelle Informationen (Stichworte aus vorangegangenen Suchanfragen), und Alltagswissen. • Verbesserte Bildbeschreibungen. Bei der Objekterkennung durch Computer Vision kommt es des Öfteren zu Fehldetektionen und Inkohärenzen. Die korrekte Identifikation von Bildinhalten ist jedoch eine wichtige Voraussetzung für die Suche nach Bildern mittels textueller Suchanfragen. Um die Fehleranfälligkeit bei der Objekterkennung zu minimieren, schlagen wir vor Alltagswissen einzubeziehen. Durch zusätzliche Bild-Annotationen, welche sich durch den gesunden Menschenverstand als thematisch passend erweisen, können viele fehlerhafte und zusammenhanglose Erkennungen vermieden werden. • Bild-Text Platzierung. Auf Internetseiten mit Text- und Bildinhalten (wie Nachrichtenseiten, Blogbeiträge, Artikel in sozialen Medien) werden Bilder in der Regel an semantisch sinnvollen Positionen im Textfluss platziert. Wir nutzen dies um ein Framework vorzuschlagen, in dem relevante Bilder ausgesucht werden und mit den passenden Abschnitten eines Textes assoziiert werden. • Bildunterschriften. Bilder, die als Teil von multimodalen Inhalten zur Verbesserung der Lesbarkeit von Texten dienen, haben typischerweise Bildunterschriften, die zum Kontext des umgebenden Texts passen. Wir schlagen vor, den Kontext beim automatischen Generieren von Bildunterschriften ebenfalls einzubeziehen. Üblicherweise werden hierfür die Bilder allein analysiert. Wir stellen die kontextbezogene Bildunterschriftengenerierung vor. Unsere vielversprechenden Beobachtungen und Ergebnisse eröffnen interessante Möglichkeiten für weitergehende Forschung zur computergestützten Erfassung des Zusammenspiels von Text- und Bildinhalten
Semiotic Shortcuts. The Graphical Abstract Strategies of Engineering Students
Graphical abstracts are representative of the rising promotionalism, interdisciplinarity and changing researcher roles in the current dissemination of science and technology. Their design, moreover, amalgamates a number of transdisciplinary skills much valued in higher education, such as critical and lateral thinking, and cultural and audience awareness. In this study, I investigate a corpus of 56 samples of graphical abstracts devised by my aeronautical engineering students, to find out the ‘semiotic shortcuts’ or encoding strategies they deploy, without any previous instruction, to pack information and translate the verbal into the visual. Findings suggest that their ‘natural digital-native graphicacy’ is conservative as to the medium, format and type of representation, but versatile regarding particular meanings, although not always unambiguous or register-appropriate. Consequently, I claim the convenience of including graphicacy/visual literacy and some basic training on graphical abstract design in the English for Specific Purposes and the disciplinary English-medium curriculum.
- …