56 research outputs found
DISTO: Evaluating Textual Distractors for Multi-Choice Questions using Negative Sampling based Approach
Multiple choice questions (MCQs) are an efficient and common way to assess
reading comprehension (RC). Every MCQ needs a set of distractor answers that
are incorrect, but plausible enough to test student knowledge. Distractor
generation (DG) models have been proposed, and their performance is typically
evaluated using machine translation (MT) metrics. However, MT metrics often
misjudge the suitability of generated distractors. We propose DISTO: the first
learned evaluation metric for generated distractors. We validate DISTO by
showing its scores correlate highly with human ratings of distractor quality.
At the same time, DISTO ranks the performance of state-of-the-art DG models
very differently from MT-based metrics, showing that MT metrics should not be
used for distractor evaluation.
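The validation step described above can be sketched in a few lines: a learned metric is trusted only if its scores track human judgments. This is a minimal illustration with made-up numbers, not DISTO's actual scores or data, using a plain Pearson correlation.

```python
# Sketch: validating a learned distractor metric by correlating its scores
# with human quality ratings. All numbers below are illustrative stand-ins.
from math import sqrt

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

metric_scores = [0.9, 0.4, 0.7, 0.2, 0.8]  # hypothetical metric outputs
human_ratings = [5, 2, 4, 1, 4]            # hypothetical 1-5 human ratings

r = pearson(metric_scores, human_ratings)
print(f"correlation with human ratings: {r:.2f}")
```

A high correlation (near 1.0) is the evidence DISTO's authors report; an MT-style metric that ranks distractors differently from humans would score low on this check.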
Identifying Shared Decodable Concepts in the Human Brain Using Image-Language Foundation Models
We introduce a method that takes advantage of high-quality pretrained
multimodal representations to explore fine-grained semantic networks in the
human brain. Previous studies have documented evidence of functional
localization in the brain, with different anatomical regions preferentially
activating for different types of sensory input. Many such localized structures
are known, including the fusiform face area and parahippocampal place area.
This raises the question of whether additional brain regions (or conjunctions
of brain regions) are also specialized for other important semantic concepts.
To identify such brain regions, we developed a data-driven approach to uncover
visual concepts that are decodable from a massive functional magnetic resonance
imaging (fMRI) dataset. Our analysis is broadly split into three sections.
First, a fully connected neural network is trained to map brain responses to
the outputs of an image-language foundation model, CLIP (Radford et al., 2021).
Subsequently, a contrastive-learning dimensionality reduction method reveals
the brain-decodable components of CLIP space. In the final section of our
analysis, we localize shared decodable concepts in the brain using a
voxel-masking optimization method to produce a shared decodable concept (SDC)
space. The accuracy of our procedure is validated by comparing it to previous
localization experiments that identify regions for faces, bodies, and places.
In addition to these concepts, whose corresponding brain regions were already
known, we localize novel concept representations which are shared across
participants to other areas of the human brain. We also demonstrate how this
method can be used to inspect fine-grained semantic networks for individual
participants. We envisage that this extensible method can also be adapted to
explore other questions at the intersection of AI and neuroscience. Comment: Under review
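The first stage of the pipeline above (mapping brain responses to CLIP outputs) can be sketched as follows. The paper trains a fully connected network; for brevity this sketch substitutes a closed-form ridge regression, and the arrays are synthetic stand-ins rather than real fMRI responses or CLIP embeddings.

```python
# Sketch of mapping voxel responses to image-embedding space. Synthetic data;
# ridge regression stands in for the paper's fully connected network.
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_voxels, clip_dim = 200, 50, 16

# Synthetic ground-truth linear relationship plus noise.
W_true = rng.normal(size=(n_voxels, clip_dim))
X = rng.normal(size=(n_trials, n_voxels))                      # "brain responses"
Y = X @ W_true + 0.1 * rng.normal(size=(n_trials, clip_dim))   # "CLIP embeddings"

lam = 1.0  # ridge penalty
W = np.linalg.solve(X.T @ X + lam * np.eye(n_voxels), X.T @ Y)

pred = X @ W
r2 = 1 - ((Y - pred) ** 2).sum() / ((Y - Y.mean(0)) ** 2).sum()
print(f"in-sample R^2 of the voxel-to-embedding mapping: {r2:.3f}")
```

The later stages (contrastive dimensionality reduction and voxel-mask optimization) would then operate on the predicted embeddings; they are not reproduced here.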
PA-GOSUB: a searchable database of model organism protein sequences with their predicted Gene Ontology molecular function and subcellular localization
PA-GOSUB (Proteome Analyst: Gene Ontology Molecular Function and Subcellular Localization) is a publicly available, web-based, searchable and downloadable database that contains the sequences, predicted GO molecular functions and predicted subcellular localizations of more than 107,000 proteins from 10 model organisms (and growing), covering the major kingdoms and phyla for which annotated proteomes exist (http://www.cs.ualberta.ca/~bioinfo/PA/GOSUB). The PA-GOSUB database effectively expands the coverage of subcellular localization and GO function annotations by a significant factor (already over five for subcellular localization, compared with Swiss-Prot v42.7), and more model organisms are being added to PA-GOSUB as their sequenced proteomes become available. PA-GOSUB can be used in three main ways. First, a researcher can browse the pre-computed PA-GOSUB annotations on a per-organism and per-protein basis using annotation-based and text-based filters. Second, a user can perform BLAST searches against the PA-GOSUB database and use the annotations from the homologs as simple predictors for the new sequences. Third, the whole of PA-GOSUB can be downloaded in either FASTA or comma-separated values (CSV) formats.
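The third usage mode above (a CSV download) lends itself to simple scripted filtering. This sketch parses a small stand-in table with Python's standard csv module; the column names are hypothetical, so consult the actual PA-GOSUB export for the real header.

```python
# Sketch: filtering a downloaded PA-GOSUB-style CSV export.
# The header and rows below are hypothetical illustrations.
import csv
import io

csv_text = """accession,organism,go_molecular_function,subcellular_localization
P12345,Saccharomyces cerevisiae,ATP binding,cytoplasm
Q67890,Mus musculus,DNA binding,nucleus
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))
# Keep only proteins whose predicted localization is the nucleus.
nuclear = [r["accession"] for r in rows if r["subcellular_localization"] == "nucleus"]
print(nuclear)
```

In practice the same loop would run over the full downloaded file opened with `open(...)` instead of the inline `io.StringIO` stand-in.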
Finding Language (and Language Learning) in the Brain
Presented on August 29, 2018 at 3:30 p.m. in the Parker H. Petit Institute for Bioengineering and Bioscience, Room 1128. Alona Fyshe is a computer scientist by training, but her research straddles multiple areas including neuroscience, machine learning and computational linguistics. Her lab explores how the human brain processes and represents language meaning (semantics) and how it combines words to make complex meaning (semantic composition). Runtime: 52:57 minutes. Understanding a native language is near effortless for fluent adults. But learning a new language takes dedication and hard work. In this talk, I will describe an experiment during which adult participants learned a new (artificial) language through a reinforcement learning paradigm while we collected EEG (electroencephalography) data. We found that 1) we could detect a reward positivity (an EEG signal correlated with a participant receiving positive feedback) when participants correctly identified a symbol's meaning, and 2) the reward positivity diminishes for subsequent correct trials. Using a machine learning approach, we found that 3) we could detect neural correlates of word meaning as the mapping from native to new language is learned; and 4) the localization of the neural representations is heavily distributed throughout the brain. Together, this is evidence that learning can be detected in the brain using EEG, and that the contents of a newly learned concept can be detected.
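The reward-positivity measurement mentioned in the talk is conventionally computed as a difference wave: the average EEG epoch following positive feedback minus the average following negative feedback. This is an illustrative sketch with tiny synthetic arrays, not the talk's actual analysis.

```python
# Sketch: a reward-positivity-style difference wave from synthetic EEG epochs.
# Each "trial" is a short list of single-channel samples (microvolts).

def average(trials):
    # Element-wise mean across trials.
    n = len(trials)
    return [sum(t[i] for t in trials) / n for i in range(len(trials[0]))]

correct_trials = [[0.0, 1.2, 2.0, 0.5], [0.1, 1.0, 1.8, 0.4]]  # positive feedback
error_trials   = [[0.0, 0.3, 0.4, 0.2], [0.1, 0.2, 0.5, 0.1]]  # negative feedback

# A sustained positive deflection in this difference suggests a reward positivity.
diff = [c - e for c, e in zip(average(correct_trials), average(error_trials))]
print(diff)
```

Real analyses average hundreds of baseline-corrected epochs per condition and test the deflection statistically; the shape of the computation is the same.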
- …