22 research outputs found

    Challenges and perspectives of hate speech research

    Get PDF
    This book is the result of a conference that could not take place. It is a collection of 26 texts that address and discuss the latest developments in international hate speech research from a wide range of disciplinary perspectives. This includes case studies from Brazil, Lebanon, Poland, Nigeria, and India, theoretical introductions to the concepts of hate speech, dangerous speech, incivility, toxicity, extreme speech, and dark participation, as well as reflections on methodological challenges such as scraping, annotation, datafication, implicity, explainability, and machine learning. As such, it provides a much-needed forum for cross-national and cross-disciplinary conversations in what is currently a very vibrant field of research

    Software-based dialogue systems: Survey, taxonomy and challenges

    Get PDF
    The use of natural language interfaces in the field of human-computer interaction is undergoing intense study through dedicated scientific and industrial research. The latest contributions in the field, including deep learning approaches like recurrent neural networks, the potential of context-aware strategies and user-centred design approaches, have brought back the attention of the community to software-based dialogue systems, generally known as conversational agents or chatbots. Nonetheless, and given the novelty of the field, a generic, context-independent overview on the current state of research of conversational agents covering all research perspectives involved is missing. Motivated by this context, this paper reports a survey of the current state of research of conversational agents through a systematic literature review of secondary studies. The conducted research is designed to develop an exhaustive perspective through a clear presentation of the aggregated knowledge published by recent literature within a variety of domains, research focuses and contexts. As a result, this research proposes a holistic taxonomy of the different dimensions involved in the conversational agents’ field, which is expected to help researchers and to lay the groundwork for future research in the field of natural language interfaces.With the support from the Secretariat for Universities and Research of the Ministry of Business and Knowledge of the Government of Catalonia and the European Social Fund. The corresponding author gratefully acknowledges the Universitat Politècnica de Catalunya and Banco Santander for the inancial support of his predoctoral grant FPI-UPC. This paper has been funded by the Spanish Ministerio de Ciencia e Innovación under project / funding scheme PID2020-117191RB-I00 / AEI/10.13039/501100011033.Peer ReviewedPostprint (author's final draft

    Money & Trust in Digital Society, Bitcoin and Stablecoins in ML enabled Metaverse Telecollaboration

    Full text link
    We present a state of the art and positioning book, about Digital society tools, namely; Web3, Bitcoin, Metaverse, AI/ML, accessibility, safeguarding and telecollaboration. A high level overview of Web3 technologies leads to a description of blockchain, and the Bitcoin network is specifically selected for detailed examination. Suitable components of the extended Bitcoin ecosystem are described in more depth. Other mechanisms for native digital value transfer are described, with a focus on `money'. Metaverse technology is over-viewed, primarily from the perspective of Bitcoin and extended reality. Bitcoin is selected as the best contender for value transfer in metaverses because of it's free and open source nature, and network effect. Challenges and risks of this approach are identified. A cloud deployable virtual machine based technology stack deployment guide with a focus on cybersecurity best practice can be downloaded from GitHub to experiment with the technologies. This deployable lab is designed to inform development of secure value transaction, for small and medium sized companies

    Scalable and Quality-Aware Training Data Acquisition for Conversational Cognitive Services

    Full text link
    Dialog Systems (or simply bots) have recently become a popular human-computer interface for performing user's tasks, by invoking the appropriate back-end APIs (Application Programming Interfaces) based on the user's request in natural language. Building task-oriented bots, which aim at performing real-world tasks (e.g., booking flights), has become feasible with the continuous advances in Natural Language Processing (NLP), Artificial Intelligence (AI), and the countless number of devices which allow third-party software systems to invoke their back-end APIs. Nonetheless, bot development technologies are still in their preliminary stages, with several unsolved theoretical and technical challenges stemming from the ambiguous nature of human languages. Given the richness of natural language, supervised models require a large number of user utterances paired with their corresponding tasks -- called intents. To build a bot, developers need to manually translate APIs to utterances (called canonical utterances) and paraphrase them to obtain a diverse set of utterances. Crowdsourcing has been widely used to obtain such datasets, by paraphrasing the initial utterances generated by the bot developers for each task. However, there are several unsolved issues. First, generating canonical utterances requires manual efforts, making bot development both expensive and hard to scale. Second, since crowd workers may be anonymous and are asked to provide open-ended text (paraphrases), crowdsourced paraphrases may be noisy and incorrect (not conveying the same intent as the given task). This thesis first surveys the state-of-the-art approaches for collecting large training utterances for task-oriented bots. Next, we conduct an empirical study to identify quality issues of crowdsourced utterances (e.g., grammatical errors, semantic completeness). Moreover, we propose novel approaches for identifying unqualified crowd workers and eliminating malicious workers from crowdsourcing tasks. Particularly, we propose a novel technique to promote the diversity of crowdsourced paraphrases by dynamically generating word suggestions while crowd workers are paraphrasing a particular utterance. Moreover, we propose a novel technique to automatically translate APIs to canonical utterances. Finally, we present our platform to automatically generate bots out of API specifications. We also conduct thorough experiments to validate the proposed techniques and models

    Supporting Source Code Search with Context-Aware and Semantics-Driven Query Reformulation

    Get PDF
    Software bugs and failures cost trillions of dollars every year, and could even lead to deadly accidents (e.g., Therac-25 accident). During maintenance, software developers fix numerous bugs and implement hundreds of new features by making necessary changes to the existing software code. Once an issue report (e.g., bug report, change request) is assigned to a developer, she chooses a few important keywords from the report as a search query, and then attempts to find out the exact locations in the software code that need to be either repaired or enhanced. As a part of this maintenance, developers also often select ad hoc queries on the fly, and attempt to locate the reusable code from the Internet that could assist them either in bug fixing or in feature implementation. Unfortunately, even the experienced developers often fail to construct the right search queries. Even if the developers come up with a few ad hoc queries, most of them require frequent modifications which cost significant development time and efforts. Thus, construction of an appropriate query for localizing the software bugs, programming concepts or even the reusable code is a major challenge. In this thesis, we overcome this query construction challenge with six studies, and develop a novel, effective code search solution (BugDoctor) that assists the developers in localizing the software code of interest (e.g., bugs, concepts and reusable code) during software maintenance. In particular, we reformulate a given search query (1) by designing novel keyword selection algorithms (e.g., CodeRank) that outperform the traditional alternatives (e.g., TF-IDF), (2) by leveraging the bug report quality paradigm and source document structures which were previously overlooked and (3) by exploiting the crowd knowledge and word semantics derived from Stack Overflow Q&A site, which were previously untapped. Our experiment using 5000+ search queries (bug reports, change requests, and ad hoc queries) suggests that our proposed approach can improve the given queries significantly through automated query reformulations. Comparison with 10+ existing studies on bug localization, concept location and Internet-scale code search suggests that our approach can outperform the state-of-the-art approaches with a significant margin

    Calibración de un algoritmo de detección de anomalías marítimas basado en la fusión de datos satelitales

    Get PDF
    La fusión de diferentes fuentes de datos aporta una ayuda significativa en el proceso de toma de decisiones. El presente artículo describe el desarrollo de una plataforma que permite detectar anomalías marítimas por medio de la fusión de datos del Sistema de Información Automática (AIS) para seguimiento de buques y de imágenes satelitales de Radares de Apertura Sintética (SAR). Estas anomalías son presentadas al operador como un conjunto de detecciones que requieren ser monitoreadas para descubrir su naturaleza. El proceso de detección se lleva adelante primero identificando objetos dentro de las imágenes SAR a través de la aplicación de algoritmos CFAR, y luego correlacionando los objetos detectados con los datos reportados mediante el sistema AIS. En este trabajo reportamos las pruebas realizadas con diferentes configuraciones de los parámetros para los algoritmos de detección y asociación, analizamos la respuesta de la plataforma y reportamos la combinación de parámetros que reporta mejores resultados para las imágenes utilizadas. Este es un primer paso en nuestro objetivo futuro de desarrollar un sistema que ajuste los parámetros en forma dinámica dependiendo de las imágenes disponibles.XVI Workshop Computación Gráfica, Imágenes y Visualización (WCGIV)Red de Universidades con Carreras en Informática (RedUNCI

    Meme theory

    Get PDF
    The internet meme is one of contemporary online culture’s definitive media. They’re widely distributed online and, in the past few years, have had an increasingly large impact on offline culture as well. The premise of this thesis is that the internet meme poses a theoretical problem for media theory, because they’re difficult to conceptualise as media. This thesis uses this premise as the basis for a wide-ranging epistemological analysis of how we practice media theory in the present. The internet meme, it argues, exemplifies a wide-ranging problem in media theory: that the discipline has yet to adequately conceptualise circulation. This is problematic for the internet meme, because it’s defined by its capacity to mutate as it’s circulated by users. It’s also problematic more broadly, because the circulation of media is central to our contemporary media situation. This thesis frames this problem by arguing that our contemporary media situation is “indeterminate”; that is, that massive distribution and ubiquitous media challenge our capacity to think media in the present. In response, it uses the internet meme as the fulcrum for a series of propositions about how media theory might respond. To think circulation, it adopts a method from the history and philosophy of science known as “historical epistemology”. It uses this method to analyse circulation as a concept—rather than through its theoretical frameworks—and to establish why it remains undertheorised in media theory. It uses this analysis to argue that circulation is a foundational media theoretical concept; to reconstruct this concept; and to posit an approach to thinking media in the present that it calls “meme theory”. This approach is characterised by emphasising the epistemological influence that media exercise over our theories of them. By positing a new concept of circulation, a new method of analysis—media-historical epistemology—and a new approach to practicing media theory, this thesis argues that to think media in the present, we have to understand how they shape our media-theoretical epistemologies in turn. The circulating internet meme helps us to understand how we might do this
    corecore