Determinantal Point Process Attention Over Grid Codes Supports Out of Distribution Generalization
Deep neural networks have made tremendous gains in emulating human-like
intelligence, and have been used increasingly as ways of understanding how the
brain may solve the complex computational problems on which this relies.
However, these still fall short of, and therefore fail to provide insight into,
how the brain supports strong forms of generalization of which humans are
capable. One such case is out-of-distribution (OOD) generalization --
successful performance on test examples that lie outside the distribution of
the training set. Here, we identify properties of processing in the brain that
may contribute to this ability. We describe a two-part algorithm that draws on
specific features of neural computation to achieve OOD generalization, and
provide a proof of concept by evaluating performance on two challenging
cognitive tasks. First we draw on the fact that the mammalian brain represents
metric spaces using grid-like representations (e.g., in entorhinal cortex):
abstract representations of relational structure, organized in recurring motifs
that cover the representational space. Second, we propose an attentional
mechanism that operates over these grid representations using a determinantal
point process (DPP-A) -- a transformation that ensures maximum sparseness in
the coverage of that space. We show that a loss function that combines standard
task-optimized error with DPP-A can exploit the recurring motifs in grid codes,
and can be integrated with common architectures to achieve strong OOD
generalization performance on analogy and arithmetic tasks. This provides both
an interpretation of how grid codes in the mammalian brain may contribute to
generalization performance, and at the same time a potential means for
improving such capabilities in artificial neural networks. Comment: 24 pages (including Appendix), 19 figure
Learning Representations that Support Extrapolation
Extrapolation -- the ability to make inferences that go beyond the scope of
one's experiences -- is a hallmark of human intelligence. By contrast, the
generalization exhibited by contemporary neural network algorithms is largely
limited to interpolation between data points in their training corpora. In this
paper, we consider the challenge of learning representations that support
extrapolation. We introduce a novel visual analogy benchmark that allows the
graded evaluation of extrapolation as a function of distance from the convex
domain defined by the training data. We also introduce a simple technique,
temporal context normalization, that encourages representations that emphasize
the relations between objects. We find that this technique enables a
significant improvement in the ability to extrapolate, considerably
outperforming a number of competitive techniques. Comment: ICML 202
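The abstract names temporal context normalization but does not spell out the computation. One plausible reading, assumed here and not confirmed by the text above, is that each feature is z-scored across the time steps of a single sequence, so that only the relations between items in that sequence survive. The function name `temporal_context_normalize` and the toy sequences are hypothetical, and any learned gain or shift parameters are omitted.

```python
import math


def temporal_context_normalize(sequence, eps=1e-8):
    """Z-score each feature across the time steps of one sequence.

    Assumption: normalization statistics come from the sequence itself
    (its "temporal context"), not from a batch or a training set.
    """
    steps = len(sequence)
    dims = len(sequence[0])
    normalized = [[0.0] * dims for _ in range(steps)]
    for d in range(dims):
        column = [step[d] for step in sequence]
        mean = sum(column) / steps
        var = sum((x - mean) ** 2 for x in column) / steps
        scale = math.sqrt(var + eps)
        for t in range(steps):
            normalized[t][d] = (column[t] - mean) / scale
    return normalized


# Two hypothetical sequences with the same relational pattern (+1 per step),
# one inside an imagined training range and one far outside it.
near = [[1.0], [2.0], [3.0]]
far = [[101.0], [102.0], [103.0]]

near_norm = temporal_context_normalize(near)
far_norm = temporal_context_normalize(far)
```

Under this assumption the two sequences map to the same normalized values, which illustrates why a representation of this kind could make a test item far from the training range look like a familiar one.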
The Relational Bottleneck as an Inductive Bias for Efficient Abstraction
A central challenge for cognitive science is to explain how abstract concepts
are acquired from limited experience. This effort has often been framed in
terms of a dichotomy between empiricist and nativist approaches, most recently
embodied by debates concerning deep neural networks and symbolic cognitive
models. Here, we highlight a recently emerging line of work that suggests a
novel reconciliation of these approaches, by exploiting an inductive bias that
we term the relational bottleneck. We review a family of models that employ
this approach to induce abstractions in a data-efficient manner, emphasizing
their potential as candidate models for the acquisition of abstract concepts in
the human mind and brain.
Fetal alcohol exposure leads to abnormal olfactory bulb development and impaired odor discrimination in adult mice
Background: Children whose mothers consumed alcohol during pregnancy exhibit widespread brain abnormalities and a complex array of behavioral disturbances. Here, we used a mouse model of fetal alcohol exposure to investigate relationships between brain abnormalities and specific behavioral alterations during adulthood. Results: Mice drank a 10% ethanol so
Symptom-based stratification of patients with primary Sjögren's syndrome: multi-dimensional characterisation of international observational cohorts and reanalyses of randomised clinical trials
Background
Heterogeneity is a major obstacle to developing effective treatments for patients with primary Sjögren's syndrome. We aimed to develop a robust method for stratification, exploiting heterogeneity in patient-reported symptoms, and to relate these differences to pathobiology and therapeutic response.
Methods
We did hierarchical cluster analysis using five common symptoms associated with primary Sjögren's syndrome (pain, fatigue, dryness, anxiety, and depression), followed by multinomial logistic regression to identify subgroups in the UK Primary Sjögren's Syndrome Registry (UKPSSR). We assessed clinical and biological differences between these subgroups, including transcriptional differences in peripheral blood. Patients from two independent validation cohorts in Norway and France were used to confirm patient stratification. Data from two phase 3 clinical trials were similarly stratified to assess the differences between subgroups in treatment response to hydroxychloroquine and rituximab.
Findings
In the UKPSSR cohort (n=608), we identified four subgroups: Low symptom burden (LSB), high symptom burden (HSB), dryness dominant with fatigue (DDF), and pain dominant with fatigue (PDF). Significant differences in peripheral blood lymphocyte counts, anti-SSA and anti-SSB antibody positivity, as well as serum IgG, κ-free light chain, β2-microglobulin, and CXCL13 concentrations were observed between these subgroups, along with differentially expressed transcriptomic modules in peripheral blood. Similar findings were observed in the independent validation cohorts (n=396). Reanalysis of trial data stratifying patients into these subgroups suggested a treatment effect with hydroxychloroquine in the HSB subgroup and with rituximab in the DDF subgroup compared with placebo.
Interpretation
Stratification on the basis of patient-reported symptoms of patients with primary Sjögren's syndrome revealed distinct pathobiological endotypes with distinct responses to immunomodulatory treatments. Our data have important implications for clinical management, trial design, and therapeutic development. Similar stratification approaches might be useful for patients with other chronic immune-mediated diseases.
Funding
UK Medical Research Council, British Sjogren's Syndrome Association, French Ministry of Health, Arthritis Research UK, Foundation for Research in Rheumatology
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
Man Bites Dog: The Representation of Structured Meaning in Left-Mid Superior Temporal Cortex
Human brains flexibly combine the meanings of individual words to compose structured thoughts. For example, by combining the meanings of ‘bite’, ‘dog’, and ‘man’, we can think either of a dog biting a man, or the newsworthy case of a man biting a dog (Pinker, 1997). Here, in three functional Magnetic Resonance Imaging (fMRI) experiments, we identify a region of left-mid Superior Temporal Cortex (lmSTC) that represents the current values of abstract semantic variables (“Who did it?” and “To whom was it done?”) in anatomically distinct sub-regions. Experiment 1 first identifies a broad region of lmSTC whose activity patterns (a) facilitate decoding of who did what to whom and (b) predict affective amygdala responses that depend on this information (e.g. “the baby kicked the grandfather” vs. “the grandfather kicked the baby”). Experiment 2 then identifies distinct, but neighboring, sub-regions of lmSTC whose activity patterns carry information about the identity of the current agent (“Who did it?”) and the current patient (“To whom was it done?”). These neighboring sub-regions lie along the upper bank of the superior temporal sulcus and the lateral bank of the superior temporal gyrus, respectively. At a high level, these regions may function like topographically defined data registers, encoding the fluctuating values of abstract semantic variables. Experiment 3 replicates the agent/patient topography of Experiment 2, and further suggests that these variables do not represent the grammatical relations of the sentence, but the semantic relations of the participants in the event described. The code by which lmSTC encodes the values of these variables remains unclear, however. We find no positive evidence that it is either phonological or semantic, leaving open the possibility that lmSTC prioritizes distinctiveness and efficiency by using a compressed code.
This functional architecture, which in key respects resembles that of a classical computer, may play a critical role in enabling humans to flexibly generate complex thoughts. Psycholog
Determinantal Point Processes for Memory and Structured Inference
Determinantal Point Processes (DPPs) are probabilistic models of repulsion, capturing negative dependencies between states. Here, we show that a DPP in representation-space predicts inferential biases toward mutual exclusivity commonly observed in word learning (mutual exclusivity bias) and reasoning (disjunctive syllogism) tasks. It does so without requiring explicit rule representations, without supervision, and without explicit knowledge transfer. The DPP attempts to maximize the total “volume” spanned by the set of inferred code-vectors. In a representational system in which combinatorial codes are constructed by re-using components, a DPP will naturally favor the combination of previously un-used components. We suggest that this bias toward the selection of volume-maximizing combinations may exist to promote the efficient retrieval of individuals from memory. In support of this, we show the same algorithm implements efficient “hashing”, minimizing collisions between key/value pairs without expanding the required storage space. We suggest that the mechanisms that promote efficient memory search may also underlie cognitive biases in structured inference.
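The volume-maximizing idea in this abstract can be illustrated with a small sketch. It is not the authors' model and it does not sample from a DPP. It only picks, by exhaustive search, the subset of code vectors whose Gram determinant (the squared volume they span) is largest. The helper names `gram_determinant` and `max_volume_subset` and the three toy codes are hypothetical.

```python
from itertools import combinations


def gram_determinant(vectors):
    """Determinant of the Gram matrix G[i][j] = <v_i, v_j>.

    This equals the squared volume of the parallelepiped spanned by the
    vectors. Computed by Gaussian elimination with partial pivoting.
    """
    n = len(vectors)
    g = [[sum(a * b for a, b in zip(u, v)) for v in vectors] for u in vectors]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(g[r][col]))
        if abs(g[pivot][col]) < 1e-12:
            return 0.0  # linearly dependent vectors span zero volume
        if pivot != col:
            g[col], g[pivot] = g[pivot], g[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= g[col][col]
        for r in range(col + 1, n):
            factor = g[r][col] / g[col][col]
            for c in range(col, n):
                g[r][c] -= factor * g[col][c]
    return det


def max_volume_subset(codes, k):
    """Return the indices of the k codes with the largest Gram determinant."""
    return max(
        combinations(range(len(codes)), k),
        key=lambda idx: gram_determinant([codes[i] for i in idx]),
    )


# Hypothetical combinatorial codes built from three reusable components.
codes = [
    [1.0, 1.0, 0.0],  # uses components A and B
    [1.0, 0.9, 0.0],  # nearly the same combination as code 0
    [0.0, 0.0, 1.0],  # uses the previously unused component C
]
best = max_volume_subset(codes, 2)
```

In this toy case the search pairs code 0 with code 2 and skips the near-duplicate code 1, mirroring the preference for previously un-used components described above.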