56 research outputs found

    DISTO: Evaluating Textual Distractors for Multi-Choice Questions using Negative Sampling based Approach

    Multiple choice questions (MCQs) are an efficient and common way to assess reading comprehension (RC). Every MCQ needs a set of distractor answers that are incorrect, but plausible enough to test student knowledge. Distractor generation (DG) models have been proposed, and their performance is typically evaluated using machine translation (MT) metrics. However, MT metrics often misjudge the suitability of generated distractors. We propose DISTO: the first learned evaluation metric for generated distractors. We validate DISTO by showing its scores correlate highly with human ratings of distractor quality. At the same time, DISTO ranks the performance of state-of-the-art DG models very differently from MT-based metrics, showing that MT metrics should not be used for distractor evaluation.
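
    As a hedged illustration of the validation the abstract describes (correlating metric scores with human ratings of distractor quality), the Python sketch below computes rank and linear correlations with SciPy. It is not the authors' code, and the score and rating arrays are hypothetical placeholders.

        # Minimal sketch, not the authors' code: correlating a learned
        # distractor metric with human ratings, as the abstract describes.
        # The score and rating values below are hypothetical placeholders.
        from scipy.stats import pearsonr, spearmanr

        # Hypothetical metric scores and human plausibility ratings (1-5)
        # for the same set of (question, distractor) pairs.
        disto_scores = [0.91, 0.22, 0.67, 0.05, 0.78]
        human_ratings = [5, 2, 4, 1, 4]

        rho, rho_p = spearmanr(disto_scores, human_ratings)  # rank correlation
        r, r_p = pearsonr(disto_scores, human_ratings)       # linear correlation
        print(f"Spearman rho = {rho:.2f} (p = {rho_p:.3f})")
        print(f"Pearson r    = {r:.2f} (p = {r_p:.3f})")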

    Identifying Shared Decodable Concepts in the Human Brain Using Image-Language Foundation Models

    We introduce a method that takes advantage of high-quality pretrained multimodal representations to explore fine-grained semantic networks in the human brain. Previous studies have documented evidence of functional localization in the brain, with different anatomical regions preferentially activating for different types of sensory input. Many such localized structures are known, including the fusiform face area and parahippocampal place area. This raises the question of whether additional brain regions (or conjunctions of brain regions) are also specialized for other important semantic concepts. To identify such brain regions, we developed a data-driven approach to uncover visual concepts that are decodable from a massive functional magnetic resonance imaging (fMRI) dataset. Our analysis is broadly split into three sections. First, a fully connected neural network is trained to map brain responses to the outputs of an image-language foundation model, CLIP (Radford et al., 2021). Subsequently, a contrastive-learning dimensionality reduction method reveals the brain-decodable components of CLIP space. In the final section of our analysis, we localize shared decodable concepts in the brain using a voxel-masking optimization method to produce a shared decodable concept (SDC) space. The accuracy of our procedure is validated by comparing it to previous localization experiments that identify regions for faces, bodies, and places. In addition to these concepts, whose corresponding brain regions were already known, we localize novel concept representations which are shared across participants to other areas of the human brain. We also demonstrate how this method can be used to inspect fine-grained semantic networks for individual participants. We envisage that this extensible method can also be adapted to explore other questions at the intersection of AI and neuroscience.
    Comment: Under review.
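
    The first analysis step, mapping brain responses to CLIP outputs with a fully connected network, can be sketched roughly as follows. This is a minimal PyTorch sketch under placeholder assumptions (random data, an assumed voxel count and embedding size, a simple MSE objective), not the paper's implementation.

        # Minimal sketch, not the paper's implementation: a fully connected
        # network mapping fMRI voxel responses to CLIP image embeddings.
        # Shapes, data, and the training objective are placeholder assumptions.
        import torch
        import torch.nn as nn

        n_samples, n_voxels, clip_dim = 1024, 8000, 512
        X = torch.randn(n_samples, n_voxels)   # placeholder brain responses
        Y = torch.randn(n_samples, clip_dim)   # placeholder CLIP embeddings

        model = nn.Sequential(
            nn.Linear(n_voxels, 2048),
            nn.ReLU(),
            nn.Linear(2048, clip_dim),
        )
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
        loss_fn = nn.MSELoss()  # simple regression loss; the paper may use another

        for epoch in range(10):
            for i in range(0, n_samples, 64):
                xb, yb = X[i:i + 64], Y[i:i + 64]
                optimizer.zero_grad()
                loss = loss_fn(model(xb), yb)
                loss.backward()
                optimizer.step()
            print(f"epoch {epoch}: batch loss {loss.item():.4f}")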

    PA-GOSUB: a searchable database of model organism protein sequences with their predicted Gene Ontology molecular function and subcellular localization

    PA-GOSUB (Proteome Analyst: Gene Ontology Molecular Function and Subcellular Localization) is a publicly available, web-based, searchable and downloadable database that contains the sequences, predicted GO molecular functions and predicted subcellular localizations of more than 107 000 proteins from 10 model organisms (and growing), covering the major kingdoms and phyla for which annotated proteomes exist (http://www.cs.ualberta.ca/~bioinfo/PA/GOSUB). The PA-GOSUB database effectively expands the coverage of subcellular localization and GO function annotations by a significant factor (already more than fivefold for subcellular localization, compared with Swiss-Prot v42.7), and more model organisms are being added to PA-GOSUB as their sequenced proteomes become available. PA-GOSUB can be used in three main ways. First, a researcher can browse the pre-computed PA-GOSUB annotations on a per-organism and per-protein basis using annotation-based and text-based filters. Second, a user can perform BLAST searches against the PA-GOSUB database and use the annotations from the homologs as simple predictors for the new sequences. Third, the whole of PA-GOSUB can be downloaded in either FASTA or comma-separated values (CSV) formats.
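
    Since the whole database can be downloaded in CSV format, a downloaded file can be summarized with a few lines of Python. The file name and column name below are hypothetical illustrations; the actual download's schema should be checked first.

        # Hedged sketch of summarizing a PA-GOSUB CSV download.
        # "pa_gosub_yeast.csv" and "predicted_localization" are hypothetical
        # names; check the real download for its actual schema.
        import csv
        from collections import Counter

        localization_counts = Counter()
        with open("pa_gosub_yeast.csv", newline="") as fh:
            for row in csv.DictReader(fh):
                localization_counts[row.get("predicted_localization", "unknown")] += 1

        for localization, count in localization_counts.most_common():
            print(f"{localization}\t{count}")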

    Finding Language (and Language Learning) in the Brain

    Presented on August 29, 2018 at 3:30 p.m. in the Parker H. Petit Institute for Bioengineering and Bioscience, Room 1128. Runtime: 52:57 minutes.

    Alona Fyshe is a computer scientist by training, but her research straddles multiple areas including neuroscience, machine learning and computational linguistics. Her lab explores how the human brain processes and represents language meaning (semantics) and how it combines words to make complex meaning (semantic composition).

    Understanding a native language is near effortless for fluent adults, but learning a new language takes dedication and hard work. In this talk, I will describe an experiment during which adult participants learned a new (artificial) language through a reinforcement learning paradigm while we collected EEG (electroencephalography) data. We found that 1) we could detect a reward positivity (an EEG signal correlated with a participant receiving positive feedback) when participants correctly identified a symbol's meaning, and 2) the reward positivity diminishes for subsequent correct trials. Using a machine learning approach, we found that 3) we could detect neural correlates of word meaning as the mapping from the native to the new language is learned, and 4) the localization of the neural representations is heavily distributed throughout the brain. Together, this is evidence that learning can be detected in the brain using EEG, and that the contents of a newly learned concept can be detected.
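
    The machine learning result (detecting neural correlates of word meaning from EEG) can be illustrated with a generic decoding sketch: cross-validated classification of word labels from EEG feature vectors. The data shapes are assumptions and the data are random placeholders; this is not the lab's actual pipeline.

        # Generic decoding sketch, not the experiment's pipeline:
        # cross-validated classification of word identity from EEG features.
        # The data below are random placeholders with assumed shapes.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(0)
        n_trials, n_words = 200, 8
        n_features = 64 * 50                             # e.g. 64 channels x 50 time points
        X = rng.standard_normal((n_trials, n_features))  # placeholder EEG features
        y = rng.integers(0, n_words, size=n_trials)      # placeholder word labels

        clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        scores = cross_val_score(clf, X, y, cv=5)
        print(f"decoding accuracy: {scores.mean():.2f} (chance ~ {1 / n_words:.2f})")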