71 research outputs found

    Artificial general intelligence: Proceedings of the Second Conference on Artificial General Intelligence, AGI 2009, Arlington, Virginia, USA, March 6-9, 2009

    Artificial General Intelligence (AGI) research focuses on the original and ultimate goal of AI: to create broad human-like and transhuman intelligence by exploring all available paths, including theoretical and experimental computer science, cognitive science, neuroscience, and innovative interdisciplinary methodologies. Because this task is so difficult, for the last few decades the majority of AI researchers have focused on what has been called narrow AI: the production of AI systems that display intelligence in specific, highly constrained tasks. In recent years, however, more and more researchers have recognized the necessity, and the feasibility, of returning to the original goals of the field. Increasingly, there is a call for a transition back to confronting the more difficult issues of human-level intelligence and, more broadly, artificial general intelligence.

    ‘IMPLICIT CREATION’ – NON-PROGRAMMER CONCEPTUAL MODELS FOR AUTHORING IN INTERACTIVE DIGITAL STORYTELLING

    Interactive Digital Storytelling (IDS) is a research field that emerged from several areas of art, creation and computer science. It investigates technologies and possible artefacts that allow ‘highly-interactive’ experiences of digital worlds with compelling stories. However, the situation for story creators approaching ‘highly-interactive’ storytelling is complex. There is a gap between the available technology, which requires programming and prior knowledge in Artificial Intelligence, and established models of storytelling, which are too linear to have the potential to be highly interactive. This thesis reports on research that lays the ground for bridging this gap, leading to novel creation philosophies in future work. A design research process has been pursued, which centred on the suggestion of conceptual models, explaining a) process structures of interdisciplinary development, b) interactive story structures including the user of the interactive story system, and c) the positioning of human authors within semi-automated creative processes. By means of ‘implicit creation’, storytelling and modelling of simulated worlds are reconciled. The conceptual models are informed by an exhaustive literature review in established neighbouring disciplines. These are a) creative principles in different storytelling domains, such as screenwriting, video game writing, role playing and improvisational theatre, b) narratological studies of story grammars and structures, and c) principles of designing interactive systems, in the areas of basic HCI design and models, discourse analysis in conversational systems, as well as game and simulation design. In a case study of artefact building, the initial models have been put into practice, evaluated and extended. These artefacts are a) a conceived authoring tool (‘Scenejo’) for the creation of digital conversational stories, and b) the development of a serious game (‘The Killer Phrase Game’) as an application development.
The study demonstrates how, starting out from linear storytelling, iterative steps of ‘implicit creation’ can lead to more variability and interactivity in the designed interactive story. In the concrete case, the steps included abstraction of dialogues into conditional actions, and creating a dynamic world model of the conversation. This process and artefact can be used as a model illustrating non-programmer approaches to ‘implicit creation’ in a learning process. The research demonstrates that the field of Interactive Digital Storytelling still has to be further advanced until general creative principles can be fully established, which is a long-term endeavour, dependent upon environmental factors. It also requires further technological developments. The gap is not yet closed, but it can be better explained. The research results build groundwork for education of prospective authors. Concluding the thesis, IDS-specific creative principles have been proposed for evaluation in future work.

    On challenges in training recurrent neural networks

    In a multi-step prediction problem, the prediction at each time step can depend on the input at any of the previous time steps far in the past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency. In practice, they can only model short-term dependencies due to the problem of vanishing and exploding gradients. This thesis explores the problem of vanishing gradients in recurrent neural networks and proposes novel solutions to it. Chapter 3 explores the idea of using external memory to store the hidden states of a Long Short-Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate at which gradients vanish in an LSTM. These discrete operations also enable the network to create dynamic skip connections across time. Chapter 4 attempts to characterize all the sources of vanishing gradients in a recurrent neural network and proposes a new recurrent architecture which has significantly better gradient flow than state-of-the-art recurrent architectures. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation functions and use additive cell updates instead of multiplicative cell updates. Chapter 5 discusses the challenges of using recurrent neural networks in the context of lifelong learning. In the lifelong learning setting, the network is expected to learn a series of tasks over its lifetime. The dependencies in lifelong learning are not just within a task, but also across the tasks. This chapter discusses the two fundamental problems in lifelong learning: (i) catastrophic forgetting of old tasks, and (ii) network capacity saturation.
Further, it proposes a solution to both of these problems when training a recurrent neural network.
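The contrast between multiplicative and additive updates mentioned in this abstract can be illustrated with a toy calculation. The sketch below is not the thesis's NRU architecture; it only assumes a scalar recurrence `h_t = tanh(w * h_{t-1})` (a hypothetical minimal RNN) and compares its gradient through time with that of a purely additive update `c_t = c_{t-1} + u_t`, where `u_t` is assumed not to depend on `c_{t-1}`. The function names are illustrative.

```python
import math

def saturating_gradient(w, h0, steps):
    """Return d h_T / d h_0 for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        grad *= w * (1.0 - h * h)
    return grad

def additive_gradient(steps):
    """Return d c_T / d c_0 for c_t = c_{t-1} + u_t, with u_t independent of c_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= 1.0  # each additive step contributes a Jacobian factor of exactly 1
    return grad

if __name__ == "__main__":
    print(saturating_gradient(w=0.9, h0=0.5, steps=50))  # shrinks toward zero
    print(additive_gradient(steps=50))                   # stays at 1.0
```

In the saturating case every factor is at most `|w|`, so with `|w| < 1` the product decays at least geometrically with sequence length; the additive path carries the gradient unchanged.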

    First steps in the study of cyber-psycho-cognitive operations

    Master's dissertation, Universidade de Brasília, Instituto de Relações Internacionais, Programa de Pós-Graduação em Relações Internacionais, 2019. This is an analysis of the ICT-based mechanisms involved in the articulation of lifeworlds that are strategically oriented to foster, prevent or undermine the development of psycho-cognitive conditions adequate for the construction or sustainability of an authority’s or a political action’s rational legitimacy. While grounding the theory in a historical context, the application of Foucauldian “archeological” instruments to the study of the political narratives giving birth to and springing from “Russiagate” also served to validate the premised convergence and incorporation of common agenda-setting trends and practices typical of traditional psychological operations.
However, the effects of both the commercial availability of deep-learning ICTs and the cognition-based structuration afforded by their ubiquity and economic centrality set this “dispositif” apart, so that it deserves a unique conceptualization and research framework. This study is a contribution to such an endeavor.

    Designing Embodied Interactive Software Agents for E-Learning: Principles, Components, and Roles

    Embodied interactive software agents are complex autonomous, adaptive, and social software systems with a digital embodiment that enables them to act on and react to other entities (users, objects, and other agents) in their environment through bodily actions, which include the use of verbal and non-verbal communicative behaviors in face-to-face interactions with the user. These agents have been developed for various roles in different application domains, in which they perform tasks that have been assigned to them by their developers or delegated to them by their users or by other agents. In computer-assisted learning, embodied interactive pedagogical software agents have the general task to promote human learning by working with students (and other agents) in computer-based learning environments, among them e-learning platforms based on Internet technologies, such as the Virtual Linguistics Campus (www.linguistics-online.com). In these environments, pedagogical agents provide contextualized, qualified, personalized, and timely assistance, cooperation, instruction, motivation, and services for both individual learners and groups of learners. This thesis develops a comprehensive, multidisciplinary, and user-oriented view of the design of embodied interactive pedagogical software agents, which integrates theoretical and practical insights from various academic and other fields. The research intends to contribute to the scientific understanding of issues, methods, theories, and technologies that are involved in the design, implementation, and evaluation of embodied interactive software agents for different roles in e-learning and other areas. 
For developers, the thesis provides sixteen basic principles (Added Value, Perceptible Qualities, Balanced Design, Coherence, Consistency, Completeness, Comprehensibility, Individuality, Variability, Communicative Ability, Modularity, Teamwork, Participatory Design, Role Awareness, Cultural Awareness, and Relationship Building) plus a large number of specific guidelines for the design of embodied interactive software agents and their components. Furthermore, it offers critical reviews of theories, concepts, approaches, and technologies from different areas and disciplines that are relevant to agent design. Finally, it discusses three pedagogical agent roles (virtual native speaker, coach, and peer) in the scenario of the linguistic fieldwork classes on the Virtual Linguistics Campus and presents detailed considerations for the design of an agent for one of these roles (the virtual native speaker).

    Scalable Text Mining with Sparse Generative Models

    The information age has brought a deluge of data. Much of this is in text form, insurmountable in scope for humans and incomprehensible in structure for computers. Text mining is an expanding field of research that seeks to utilize the information contained in vast document collections. General data mining methods based on machine learning face challenges with the scale of text data, posing a need for scalable text mining methods. This thesis proposes a solution to scalable text mining: generative models combined with sparse computation. A unifying formalization for generative text models is defined, bringing together research traditions that have used formally equivalent models, but ignored parallel developments. This framework allows the use of methods developed in different processing tasks such as retrieval and classification, yielding effective solutions across different text mining tasks. Sparse computation using inverted indices is proposed for inference on probabilistic models. This reduces the computational complexity of the common text mining operations according to sparsity, yielding probabilistic models with the scalability of modern search engines. The proposed combination provides sparse generative models: a solution for text mining that is general, effective, and scalable. Extensive experimentation on text classification and ranked retrieval datasets is conducted, showing that the proposed solution matches or outperforms the leading task-specific methods in effectiveness, with an order of magnitude decrease in classification times for Wikipedia article categorization with a million classes. The developed methods were further applied in two 2014 Kaggle data mining prize competitions with over a hundred competing teams, earning first and second places.
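One way to picture sparse inference over an inverted index is the following minimal sketch. It is an illustration under stated assumptions, not the thesis's formalization: it assumes a multinomial class model with Jelinek-Mercer smoothing against a background distribution (the smoothing weight `lam`, the uniform class prior, and the function names `build_index` and `score_sparse` are all hypothetical). Because a term unseen in a class falls back to `lam * background`, every class shares the same baseline log-score, and only the (class, term) pairs with non-zero counts need to be stored and visited.

```python
import math
from collections import Counter, defaultdict

def build_index(class_docs, lam=0.5):
    """Build term -> [(class, log-odds over the smoothed baseline)] postings.

    class_docs maps each class label to a list of tokens.
    """
    counts = {c: Counter(tokens) for c, tokens in class_docs.items()}
    total = Counter()
    for cnt in counts.values():
        total.update(cnt)
    n_total = sum(total.values())
    background = {w: n / n_total for w, n in total.items()}

    index = defaultdict(list)
    for c, cnt in counts.items():
        n_c = sum(cnt.values())
        for w, n in cnt.items():
            smoothed = (1 - lam) * n / n_c + lam * background[w]
            # Store only the difference from the class-independent baseline.
            delta = math.log(smoothed) - math.log(lam * background[w])
            index[w].append((c, delta))
    return index

def score_sparse(index, query_tokens):
    """Score classes relative to the shared baseline, touching only matching postings."""
    scores = defaultdict(float)
    for w, n in Counter(query_tokens).items():
        for c, delta in index.get(w, ()):
            scores[c] += n * delta
    return dict(scores)

if __name__ == "__main__":
    docs = {
        "sports": ["ball", "goal", "team", "ball"],
        "tech": ["code", "chip", "team", "code"],
    }
    idx = build_index(docs)
    print(score_sparse(idx, ["ball", "team"]))
```

The cost of scoring a document then scales with the number of postings for its terms rather than with the number of classes times the vocabulary, which is the same property that makes search-engine retrieval scale.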

    Training Datasets for Machine Reading Comprehension and Their Limitations

    Neural networks are a powerful model class to learn machine Reading Comprehension (RC), yet they crucially depend on the availability of suitable training datasets. In this thesis we describe methods for data collection, evaluate the performance of established models, and examine a number of model behaviours and dataset limitations. We first describe the creation of a data resource for the science exam QA domain, and compare existing models on the resulting dataset. The collected questions are plausible (non-experts can distinguish them from real exam questions with 55% accuracy), and using them as additional training data leads to improved model scores on real science exam questions. Second, we describe and apply a distant supervision dataset construction method for multi-hop RC across documents. We identify and mitigate several dataset assembly pitfalls (a lack of unanswerable candidates, label imbalance, and spurious correlations between documents and particular candidates), which often leave shallow predictive cues for the answer. Furthermore, we demonstrate that selecting relevant document combinations is a critical performance bottleneck on the datasets created. We thus investigate Pseudo-Relevance Feedback, which leads to improvements compared to TF-IDF-based document combination selection both in retrieval metrics and answer accuracy. Third, we investigate model undersensitivity: model predictions do not change when given adversarially altered questions in SQUAD2.0 and NEWSQA, even though they should. We characterise affected samples, and show that the phenomenon is related to a lack of structurally similar but unanswerable samples during training: data augmentation reduces the adversarial error rate, e.g. from 51.7% to 20.7% for a BERT model on SQUAD2.0, and improves robustness also in other settings.
Finally, we explore efficient formal model verification via Interval Bound Propagation (IBP) to measure and address model undersensitivity, and show that using an IBP-derived auxiliary loss can improve verification rates, e.g. from 2.8% to 18.4% on the SNLI test set.
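For readers unfamiliar with Interval Bound Propagation, the core step can be sketched in a few lines. This is a generic illustration of the technique, not the thesis's models or its auxiliary loss: it assumes inputs constrained to an elementwise box `[lower, upper]` and propagates that box through one affine layer followed by a ReLU. The function names and the example weights are hypothetical.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate elementwise bounds through y = W x + b.

    A positive weight takes its lower bound from the input's lower bound;
    a negative weight takes it from the input's upper bound (and vice versa).
    """
    out_lower, out_upper = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper

def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]

if __name__ == "__main__":
    lo, hi = affine_bounds([-1.0, 0.0], [1.0, 2.0],
                           weights=[[1.0, -1.0], [2.0, 0.0]],
                           bias=[0.0, -1.0])
    print(lo, hi)
    print(relu_bounds(lo, hi))
```

Repeating these two steps layer by layer yields guaranteed (if sometimes loose) output bounds for every input in the box, which is what makes the resulting verification check cheap to compute.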

    Ontic Occlusion and Exposure in Sociotechnical Systems

    Living inside built environments (infrastructure), it is easy to take for granted the things that we do not need to engage with but that are at work behind the scenes nonetheless. Well-designed systems become invisible, but to engage them, how do we know which perspectives, objects, and relationships are useful? I examine the University of Michigan Digital Library (UMDL), a mid-1990s interdisciplinary project attempting to build an agent-based digital library architecture. Through analyzing project data, I develop the concept of ontic occlusion and exposure: mechanisms of choice regarding objects and relationships that enter discourses and representations. By analyzing project artifacts, interview transcripts, and meeting records, this study identifies key sets of discursive elements that bridged concepts between disciplinary communities on the surface but were the fundamental sites of contestation between groups’ understanding of project goals. I examine narratives of project personnel to understand the positioning of terms and ideas relating to project design, execution, and assessment, and discuss the role of the ontic in interdisciplinary work. Using data from the UMDL project, I discuss the tension between occlusion (the hidden) and exposure (the revealed) in understanding the digital library as an object through meetings of the project operating committee, the primary engagement site between researchers from different departments, primarily computer engineering and library science. By examining interpretive differences, the use of fundamental terms, and observations about the contested responses toward resolution, we can better understand the outcomes of the project, the disciplinary positioning of institutional change, and perspectives on evaluating the project in subsequent years. This dissertation contributes to an understanding of discourse development in interdisciplinary projects where shared language is important to design, execution, and evaluation.
It combines perspectives in philosophy, digital libraries, and interdisciplinarity studies. The complementary mechanisms of ontic occlusion and exposure are useful devices to decode and describe change in sociotechnical systems, and highlight the need to examine more closely both what is rendered in accounts of infrastructure, and residual categories often left unaddressed. Ph.D., Information, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/78763/1/cknobel_1.pd