18 research outputs found

    Users' Traces for Enhancing Arabic Facebook Search

    This paper proposes an approach to Arabic Facebook search that exploits several user traces (e.g., comments, shares, reactions) left on Facebook posts to estimate their social importance. Our goal is to show how these social traces (signals) can play a vital role in improving Arabic Facebook search. First, we identify the polarities (positive or negative) carried by textual signals (e.g., comments) and non-textual ones (e.g., the reactions love and sad) for a given Facebook post. The polarity of each comment on a post is estimated using a neural sentiment model for Arabic. Second, we group signals according to their complementarity using feature selection algorithms. Third, we apply learning to rank (LTR) algorithms to re-rank Facebook search results based on the selected groups of signals. Finally, experiments are carried out on 13,500 Facebook posts collected from 45 topics in Arabic. Experimental results reveal that Random Forests combined with ReliefFAttributeEval (RLF) was the most effective LTR approach for this task.

    Detecting toxicity triggers in online discussions


    Novella 2.0: A Hypertextual Architecture for Interactive Narrative in Games

    The hypertext community has a history of research in Interactive Digital Narrative (IDN), including experimental works and systems to support authoring. Arguably the most prevalent contemporary form of IDN is within the world of computer games where a mixture of large-scale commercial works and smaller indie experimental pieces continue to develop new forms of interactive storytelling. We can explore these pieces through the lens of hypertextual theory and support them with hypertextual architectures, but there are unique challenges within modern game-based storytelling that these frameworks sometimes struggle to capture on a content level, leaving us in some cases with insufficient models and vocabulary. In this paper, we build upon previous work by presenting a discussion on techniques of modeling video game narrative. This is followed by thorough presentation and demonstration of our game-centric theoretical model of interactive narrative, Novella 2.0, which builds upon our previous contributions. This model is then positioned within a novel architecture for the authoring, interchange, integration, and simulation of video game narrative. We present alongside the architecture four key innovations towards supporting game narrative. We include support for Discoverable Narrative and other game narrative content alongside structural features in a deference of responsibility to game engines and our own approach to mixing calligraphic and sculptural hypertext structure.

    CrowdCO-OP : sharing risks and rewards in crowdsourcing

    Paid micro-task crowdsourcing has gained in popularity partly due to the increasing need for large-scale manually labelled datasets which are often used to train and evaluate Artificial Intelligence systems. Modern paid crowdsourcing platforms use a piecework approach to rewards, meaning that workers are paid for each task they complete, given that their work quality is considered sufficient by the requester or the platform. Such an approach creates risks for workers; their work may be rejected without being rewarded, and they may be working on poorly rewarded tasks, in light of the disproportionate time required to complete them. As a result, recent research has shown that crowd workers may tend to choose specific, simple, and familiar tasks and avoid new requesters to manage these risks. In this paper, we propose a novel crowdsourcing reward mechanism that allows workers to share these risks and achieve a standardized hourly wage equal for all participating workers. Reward-focused workers can thereby take up challenging and complex HITs without bearing the financial risk of not being rewarded for completed work. We experimentally compare different crowd reward schemes and observe their impact on worker performance and satisfaction. Our results show that 1) workers clearly perceive the benefits of the proposed reward scheme, 2) work effectiveness and efficiency are not impacted as compared to those of the piecework scheme, and 3) the presence of slow workers is limited and does not disrupt the proposed cooperation-based approaches

    An automatic participant detection framework for event tracking on twitter

    Topic Detection and Tracking (TDT) on Twitter emulates how humans identify developments in events from a stream of tweets, but while event participants are important for humans to understand what happens during events, machines have no knowledge of them. Our evaluation on football matches and basketball games shows that identifying event participants from tweets is a difficult problem exacerbated by Twitter’s noise and bias. As a result, traditional Named Entity Recognition (NER) approaches struggle to identify participants from the pre-event Twitter stream. To overcome these challenges, we describe Automatic Participant Detection (APD) to detect an event’s participants before the event starts and improve the machine understanding of events. We propose a six-step framework to identify participants and present our implementation, which combines information from Twitter’s pre-event stream and Wikipedia. In spite of the difficulties associated with Twitter and NER in the challenging context of events, our approach manages to restrict noise and consistently detects the majority of the participants. By empowering machines with some of the knowledge that humans have about events, APD lays the foundation not just for improved TDT.

    Predicting Knowledge Gain during Web Search based on Eye-movement Patterns

    Content on the internet is expanding exponentially, and the virtual space has become a messy place. Acquiring information to fulfill a learning need is therefore a difficult task. Search as Learning (SAL) is a new domain that investigates the importance of the learning process and supports individuals in acquiring information, with the aim of making it easier for knowledge seekers to obtain information from a web search. Prior work in this field focused extensively on resource data (e.g., text and multimedia resources) and behavioral data (e.g., search interactions) to predict knowledge gain (KG) during a web search. However, eye movement and reading pattern data are yet to be explored. In this work, we therefore introduce a set of eye-movement features that help predict knowledge gain from the reading patterns of participants. For this purpose, we relied on data from a prior study in which 114 participants had to acquire information about the formation of lightning and thunder from a web search. We used a cutting-edge approach for the evaluation. Moreover, we extended it with a word-level mapping of eye fixations on web pages, unlike prior work that relied on the eye’s central vision to map the eye fixations. Experimental results demonstrate the ability to predict knowledge gain based on reading patterns and eye movements.

    Automatic understanding of multimodal content for Web-based learning

    Web-based learning has become an integral part of everyday life for all ages and backgrounds. On the one hand, the advantages of this learning type, such as availability, accessibility, flexibility, and cost, are apparent. On the other hand, the oversupply of content can lead to learners struggling to find optimal resources efficiently. The interdisciplinary research field Search as Learning is concerned with the analysis and improvement of Web-based learning processes, both on the learner and the computer science side. So far, automatic approaches that assess and recommend learning resources in Search as Learning (SAL) focus on textual, resource, and behavioral features. However, these approaches commonly ignore multimodal aspects. This work addresses this research gap by proposing several approaches that address the question of how multimodal retrieval methods can help support learning on the Web. First, we evaluate whether textual metadata of the TIB AV-Portal can be exploited and enriched by semantic word embeddings to generate video recommendations and, in addition, a video summarization technique to improve exploratory search. Then we turn to the challenging task of knowledge gain prediction that estimates the potential learning success given a specific learning resource. We used data from two user studies for our approaches. The first one observes the knowledge gain when learning with videos in a Massive Open Online Course (MOOC) setting, while the second one provides an informal Web-based learning setting where the subjects have unrestricted access to the Internet. We then extend the purely textual features to include visual, audio, and cross-modal features for a holistic representation of learning resources. By correlating these features with the achieved knowledge gain, we can estimate the impact of a particular learning resource on learning success. 
We further investigate the influence of multimodal data on the learning process by examining how the combination of visual and textual content generally conveys information. For this purpose, we draw on work from linguistics and visual communication, which has investigated the relationship between image and text by means of different metrics and categorizations for several decades. We concretize these metrics to make them compatible with machine learning. This process includes the derivation of semantic image-text classes from these metrics. We evaluate all proposals with comprehensive experiments and discuss their impacts and limitations at the end of the thesis.

Web-based learning has become an integral part of everyday life for all ages and social groups. On the one hand, the advantages of this kind of learning, such as availability, accessibility, flexibility, and cost, are obvious. On the other hand, the oversupply of content can leave learners unable to find optimal resources efficiently. The interdisciplinary research field Search as Learning is concerned with the analysis and improvement of Web-based learning processes. So far, automatic approaches to assessing and recommending learning resources have focused on monomodal features such as text or document structure. The multimodal perspective, by contrast, has not yet been sufficiently explored. This work therefore addresses the question of how multimedia retrieval methods can help support learning on the Web. First, we evaluate whether textual metadata of the TIB AV-Portal, in combination with semantic word embeddings, can be used to generate video recommendations and to derive visualizations that summarize video content. We then turn to the challenging task of knowledge gain prediction, which estimates the potential learning success of a learning resource.
We used data from two user studies for our approaches. The first observes knowledge gain when learning with videos in a MOOC setting, while the second provides an informal Web-based learning environment in which the subjects have unrestricted Internet access. We then extend the purely textual features with visual, audio, and cross-modal features for a holistic representation of learning resources. By correlating these features with the achieved knowledge gain, we can predict the influence of a learning resource on learning success. We further investigate how different combinations of visual and textual content convey information in general. To this end, we draw on work from linguistics and visual communication that has investigated the relationship between image and text for several decades. We concretize existing metrics to enable their use for machine learning. This process includes the derivation of semantic image-text classes. We evaluate all approaches with comprehensive experiments and discuss their impacts and limitations at the end of the thesis.