24,419 research outputs found

Inductive Biases for Deep Learning of Higher-Level Cognition

Authors: Bengio Yoshua; Goyal Anirudh
Publication date: 17/02/2021

A fascinating hypothesis is that human and animal intelligence could be explained by a few principles (rather than an encyclopedic list of heuristics). If that hypothesis were correct, we could more easily both understand our own intelligence and build intelligent machines. Just like in physics, the principles themselves would not be sufficient to predict the behavior of complex systems like brains, and substantial computation might be needed to simulate human-like intelligence. This hypothesis would suggest that studying the kind of inductive biases that humans and animals exploit could help both clarify these principles and provide inspiration for AI research and neuroscience theories. Deep learning already exploits several key inductive biases, and this work considers a larger list, focusing on those which concern mostly higher-level and sequential conscious processing. The objective of clarifying these particular principles is that they could potentially help us build AI systems benefiting from humans' abilities in terms of flexible out-of-distribution and systematic generalization, which is currently an area where a large gap exists between state-of-the-art machine learning and human intelligence.

Comment: This document contains a review of the authors' research as part of the requirement of AG's predoctoral exam, an overview of the main contributions of the authors' few recent papers (co-authored with several other co-authors) as well as a vision of proposed future researc

arXiv.org e-Print Archive

Structural Inductive Biases in Emergent Communication

Authors: Gupta Abhinav; Hamilton William L.; Holden Sean B.; Jamnik Mateja; Pal Christopher; Słowik Agnieszka
Publication date: 01/01/2021

In order to communicate, humans flatten a complex representation of ideas and their attributes into a single word or a sentence. We investigate the impact of representation learning in artificial agents by developing graph referential games. We empirically show that agents parametrized by graph neural networks develop a more compositional language compared to bag-of-words and sequence models, which allows them to systematically generalize to new combinations of familiar features.

Comment: The first two authors contributed equally. Poster presented at CogSci 202

arXiv.org e-Print Archive

eScholarship - University of California

Mathematically Modeling the Lexicon Entropy of Emergent Language

Authors: Boldt Brendon; Mortensen David
Publication date: 24/03/2023

We formulate a stochastic process, FiLex, as a mathematical model of lexicon entropy in deep learning-based emergent language systems. Defining a model mathematically allows it to generate clear predictions which can be directly and decisively tested. We empirically verify across four different environments that FiLex predicts the correct correlation between hyperparameters (training steps, lexicon size, learning rate, rollout buffer size, and Gumbel-Softmax temperature) and the emergent language's entropy in 20 out of 20 environment-hyperparameter combinations. Furthermore, our experiments reveal that different environments show diverse relationships between their hyperparameters and entropy, which demonstrates the need for a model which can make well-defined predictions at a precise level of granularity.

Comment: 12 pages, 3 figures; added link to GitHub rep

arXiv.org e-Print Archive

Pretrain on just structure: Understanding linguistic inductive biases using transfer learning

Authors: Jurafsky Dan; Papadimitriou Isabel
Publication date: 25/04/2023

Both humans and transformer language models are able to learn language without explicit structural supervision. What inductive learning biases make this learning possible? In this study, we examine the effect of different inductive learning biases by predisposing language models with structural biases through pretraining on artificial structured data, and then evaluating by fine-tuning on English. Our experimental setup gives us the ability to actively control the inductive bias of language models. With our experiments, we investigate the comparative success of three types of inductive bias: 1) an inductive bias for recursive, hierarchical processing, 2) an inductive bias for unrestricted token-token dependencies that can't be modeled by context-free grammars, and 3) an inductive bias for a Zipfian power-law vocabulary distribution. We show that complex token-token interactions form the best inductive biases, and that this is strongest in the non-context-free case. We also show that a Zipfian vocabulary distribution forms a good inductive bias independently from grammatical structure. Our study leverages the capabilities of transformer models to run controlled language learning experiments that are not possible to run in humans, and surfaces hypotheses about the structures that facilitate language learning in both humans and machines.

arXiv.org e-Print Archive

Simplicity and informativeness in semantic category systems

Authors: Carr Jon; Culbertson Jennifer; Kirby Simon; Smith Kenny
Publication venue: Elsevier BV
Publication date: 30/09/2020

Edinburgh Research Explorer

Architecting system of systems: artificial life analysis of financial market behavior

Author: Ergin Nil Hande
Publication venue: Scholars' Mine
Publication date: 01/01/2007

This research study focuses on developing a framework that can be utilized by system architects to understand the emergent behavior of system architectures. The objective is to design a framework that is modular and flexible in providing different ways of modeling sub-systems of System of Systems. At the same time, the framework should capture the adaptive behavior of the system since evolution is one of the key characteristics of System of Systems. Another objective is to design the framework so that humans can be incorporated into the analysis. The framework should help system architects understand the behavior as well as promoters or inhibitors of change in human systems. Computational intelligence tools have been successfully used in analysis of Complex Adaptive Systems. Since a System of Systems is a collection of Complex Adaptive Systems, a framework utilizing a combination of these tools can be developed. Financial markets are selected to demonstrate the various architectures developed from the analysis framework --Introduction, page 3

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Safe work practices in interdisciplinary surgical teamwork : model development and validation

Author: Høyland Sindre
Publication venue: University of Stavanger, Norway
Publication date: 01/01/2013

PhD thesis in Health, medicine and welfare

This thesis identifies needs in health research literature on quality and safety to explore the nature of teamwork and develop models that can be applied to, and integrate findings from, such explorations. This triggers an interest in exploring how safe work practices are achieved in interdisciplinary surgical teamwork, and also a related interest in developing and validating a model for such explorations. Accordingly, surgical operations are explored by means of observations, conversations, and interviews. As part of this exploration, a literature review process produces a framework for exploring safe work practices, comprising a knowledge and system dimension. These dimensions are operationalized through a field research protocol and a semi-structured interview guide to serve as a general frame of reference during the fieldwork. The emergent findings from the exploration, in turn, establish a scientific model by refining and validating the dimensions of the framework. The exploration and model development and validation efforts are supported by a balanced methodology, emphasizing both structure/transparency and in-depth descriptions in the gathering, analysis, and presentation of data. This thesis finds that safe work practices are achieved through the ability and variety that individuals demonstrate in handling multiple sources of information before reaching a particular decision; the variety of ways in which awareness or anticipation of future events are expressed; and the different ways in which the individual team members handle sudden and unexpected situations. Safe work practices are also achieved by means of the team’s ability to compensate, through system buffers and experience from exclusive exposure to one section, for vulnerabilities and disruptions that arise through various combinations of system factors.
Finally, safe work practices are achieved through the individual’s ability to disregard stress/pressure and properly apply the time and considerations necessary for the job, and sense and communicate patient-related problems, and the team’s reliance on the individual’s competency and ability to plan and improvise when challenged by a problem or unforeseen situation during an operation. Safe work practices can be defined as the dynamic and continuous effort by each individual team member and the overall team to combine and draw upon explicit and tacit knowledge repertoires to achieve a successful operation with minimal errors and complications. Safe work practices also can be viewed as the overall organization’s ability to maintain inner and outer (system) conditions that are strong enough to support individual and team abilities to combine and draw upon knowledge repertoires. This thesis’ theoretical contribution to safety research lies in establishing a scientific model for exploring safe work practices in teamwork that is of broad enough design to include existing findings and concepts, as well as new findings. By applying the model as a frame for exploration during a qualitative study, this thesis also contributes to safety research by producing a broader understanding of how safe work practices are achieved in surgical teamwork. The main implication is that safety researchers should emphasize the design of broader models to facilitate systemizing existing findings. The thesis also suggests that a broader model increases the potential for generalizability and transferability of model aspects, implying that safety researchers should consider research quality during model development. These contributions and related implications answer the identified needs for explorations into the nature of teamwork and for developing models that can be applied to, and integrate findings from, such explorations. 
Given the identified lack of explorations into the nature of teamwork within the health care sector, this thesis’ practical contributions lie in the broad yet in-depth approach to safety in surgical teamwork. This is potentially relevant to policy-makers, managers, researchers, and practitioners. Implications include system conditions that should be established to facilitate safe work practices in surgical teamwork, such as buffers in terms of personnel, operating rooms, and equipment and forums/seminars for sharing knowledge. Systems should also be established to formalize different types of tacit knowledge, such as by incorporating questions into checklists that trigger sharp-end/local reflections. Establishing favorable system conditions, not only physically (buffers) but also in terms of knowledge-sharing and formalizing, can reduce the likelihood of adverse events and improve patient safety.

NORA - Norwegian Open Research Archives

UiS Brage