Investigating the function of the ventral visual reading pathway and its involvement in acquired reading disorders
This thesis investigated the role of the left ventral occipitotemporal (vOT)
cortex and how damage to this area causes peripheral reading disorders.
Functional magnetic resonance imaging (fMRI) studies in volunteers
demonstrated that the left vOT is activated more by written words than by numbers
or perceptually-matched baselines, irrespective of the word's location in the
visual field. Mixed results were observed for the comparison of words versus
false font stimuli. This response profile suggests that the left vOT is
preferentially activated by words or word-like stimuli, due to either: (1)
bottom-up specialisation for processing familiar word-forms; (2) top-down
task-dependent modulation; or (3) a combination of the two. Further studies
are proposed to discriminate between these possibilities.
Thirteen patients with left occipitotemporal damage participated in the
rehabilitation and fMRI studies. The patients were impaired on word, text and
letter reading. A structural analysis showed that damage to the left
occipitotemporal white matter, in the vicinity of the inferior longitudinal
fasciculus, was associated with slow word reading speed. The fMRI study
showed that the patients had reduced activation of the bilateral posterior
superior temporal sulci relative to controls. Activity in this area correlated
with reading speed.
The efficacy of intensive whole-word recognition training was tested.
Immediately after the training, trained words were read faster than
untrained words, but the effects did not persist until the follow-up
assessment. Hence, damage to the left vOT white matter impairs rapid
whole-word recognition, an impairment that proved resistant to rehabilitation.
The final study investigated the role of spatial frequency (SF) in the
lateralisation of vOT function. Lateralisation of high and low SF processing
was demonstrated, concordant with the lateralisation for words and faces to
the left and right vOT respectively. A perceptual basis for the organisation of
vOT cortex might explain why left vOT damage is resistant to treatment.
Comparing and Validating Methods of Reading Instruction Using Behavioural and Neural Findings in an Artificial Orthography
There is strong scientific consensus that emphasizing print-to-sound relationships is critical when learning to read alphabetic languages. Nevertheless, reading instruction varies across English-speaking countries, from intensive phonic training to multicuing environments that teach sound- and meaning-based strategies. We sought to understand the behavioral and neural consequences of these differences in relative emphasis. We taught 24 English-speaking adults to read 2 sets of 24 novel words (e.g., /buv/, /sig/), written in 2 different unfamiliar orthographies. Following pretraining on oral vocabulary, participants learned to read the novel words over 8 days. Training in 1 language was biased toward print-to-sound mappings while training in the other language was biased toward print-to-meaning mappings. Results showed striking benefits of print-sound training on reading aloud, generalization, and comprehension of single words. Univariate analyses of fMRI data collected at the end of training showed that print-meaning relative to print-sound training increased neural effort in dorsal pathway regions involved in reading aloud. Conversely, activity in ventral pathway brain regions involved in reading comprehension was no different following print-meaning versus print-sound training. Multivariate analyses validated our artificial language approach, showing high similarity between the spatial distribution of fMRI activity during artificial and English word reading. Our results suggest that early literacy education should focus on the systematicities present in print-to-sound relationships in alphabetic languages, rather than teaching meaning-based strategies, in order to enhance both reading aloud and comprehension of written words.
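The generalisation advantage of print-to-sound training follows from compositionality: once each symbol maps to a sound, untrained words decompose into known parts. A minimal sketch of the idea, using placeholder symbols rather than the orthographies actually used in the study:

```python
# Hypothetical symbol-to-sound mapping; these symbols stand in for the
# unfamiliar orthographies used in the study and are not the real stimuli.
print_to_sound = {"ʘ": "b", "ʃ": "u", "ʒ": "v", "ʂ": "s", "ɕ": "i", "ʑ": "g"}

def read_aloud(word):
    """Decode a written word symbol by symbol, as print-to-sound
    training encourages; works for trained and untrained words alike."""
    return "".join(print_to_sound[symbol] for symbol in word)

print(read_aloud("ʘʃʒ"))  # a trained word -> "buv"
print(read_aloud("ʂʃʒ"))  # an untrained word is still decodable -> "suv"
```

By contrast, print-to-meaning training associates whole written forms with meanings, which offers no such route to untrained items.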
Grounding semantic cognition using computational modelling and network analysis
The overarching objective of this thesis is to further the field of grounded semantics using a range of computational and empirical studies. Over the past thirty years, there have been many algorithmic advances in the
modelling of semantic cognition. A commonality across these cognitive models is a reliance on hand-engineered 'toy models'. Despite incorporating newer
techniques (e.g. long short-term memory), the model inputs remain unchanged. We argue that the inputs to these traditional semantic models bear little resemblance to real human experiences. In this dissertation, we ground our neural network models by training them with real-world visual scenes using naturalistic photographs. Our approach is an alternative to both hand-coded
features and embodied raw sensorimotor signals.
We conceptually replicate the mutually reinforcing nature of hybrid (feature-based and grounded) representations using silhouettes of concrete concepts as model inputs. We next gradually develop a novel grounded cognitive semantic representation which we call scene2vec, starting with object co-occurrences and then adding emotions and language-based tags. Limitations of our scene-based representation are identified for more abstract concepts (e.g. freedom). We further present a large-scale human semantics study, which reveals small-world semantic network topologies are context-dependent and
that scenes are the most dominant cognitive dimension. This finding leads us to conclude that there is no meaning without context. Lastly, scene2vec shows
promising human-like context-sensitive stereotypes (e.g. gender role bias), and we explore how such stereotypes are reduced by targeted debiasing. In conclusion, this thesis provides support for a novel computational
viewpoint on investigating meaning: scene-based grounded semantics. Future research scaling scene-based semantic models to human levels through virtual grounding has the potential to unearth new insights into the human mind and
concurrently lead to advancements in artificial general intelligence by enabling robots, embodied or otherwise, to acquire and represent meaning directly from the environment.
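The abstract does not give scene2vec's implementation, but its stated starting point, object co-occurrence across scenes, can be sketched in a few lines. The scene annotations and labels below are invented for illustration only:

```python
from collections import Counter
from itertools import combinations

# Hypothetical scene annotations: each scene is a set of object labels,
# standing in for objects detected in naturalistic photographs.
scenes = [
    {"dog", "ball", "grass"},
    {"dog", "leash", "grass"},
    {"cup", "table", "book"},
]

# Count how often each unordered pair of objects co-occurs in a scene.
cooc = Counter()
for scene in scenes:
    for a, b in combinations(sorted(scene), 2):
        cooc[(a, b)] += 1

vocab = sorted({obj for scene in scenes for obj in scene})

def vector(obj):
    """Co-occurrence vector for one object over the whole vocabulary."""
    return [cooc[tuple(sorted((obj, other)))] if other != obj else 0
            for other in vocab]

print(vocab)
print(vector("dog"))  # -> [1, 0, 0, 0, 2, 1, 0]: twice with "grass", once each with "ball" and "leash"
```

The full model then layers emotions and language-based tags onto this co-occurrence backbone, which this sketch omits.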
Using fMRI and Behavioural Measures to Investigate Rehabilitation in Post-Stroke Aphasic Deficits
In this thesis I investigated whether an intensive computerised, home-based therapy programme could improve phonological discrimination ability in 19 patients with chronic post-stroke aphasia. One skill specifically targeted by the treatment improved with therapy. However, this improvement did not generalise to untreated items, and was only effective for participants without a lesion involving the frontal lobe, indicating a potentially important role for this region in determining the outcome of aphasia therapy.
Complementary functional imaging studies investigated activity in domain-general and domain-specific networks in both patients and healthy volunteers during listening and repeating simple sentences. One important consideration when comparing a patient group with a healthy population is the difference in task difficulty encountered by the two groups. Increased cognitive effort can be expected to increase activity in domain-general networks. I minimised the effect of this confound by manipulating task difficulty for the healthy volunteers to reduce their behavioural performance so that it was comparable to that of the patients. By this means I demonstrated that the activation patterns in domain-general regions were very similar in the two groups. Region-of-interest analysis demonstrated that activity within a domain-general network, the salience network, predicted residual language function in the patients with aphasia, even after accounting for lesion volume and their chronological age.
I drew two broad conclusions from these studies. First, that computer-based rehabilitation can improve disordered phonological discrimination in chronic aphasia, but that lesion distribution may influence the response to this training. Second, that the ability to activate domain-general cognitive control regions influences outcome in aphasia. This allows me to propose that in future work, therapeutic strategies, pharmacological or behavioural, targeting domain-general brain systems may benefit aphasic stroke rehabilitation.
A Model of the Network Architecture of the Brain that Supports Natural Language Processing
For centuries, neuroscience has proposed models of the neurobiology of language
processing that are static and localised to a few temporal and inferior frontal regions. Although
existing models have offered some insight into the processes underlying lower-level language
features, they have largely overlooked how language operates in the real world.
Here, we aimed to investigate the network organisation of the brain and how it
supports language processing in a naturalistic setting. We hypothesised that the brain is
organised in a multiple core-periphery and dynamic modular architecture, with canonical
language regions forming high-connectivity hubs. Moreover, we predicted that language
processing would be distributed to much of the rest of the brain, allowing it to perform more
complex tasks and to share information with other cognitive domains.
To test these hypotheses, we collected the Naturalistic Neuroimaging Database of
people watching full-length movies during functional magnetic resonance imaging. We
computed network algorithms to capture the voxel-wise architecture of the brain in individual
participants and inspected variations in activity distribution over different stimuli and over
more complex language features. Our results confirmed the hypothesis that the brain is
organised in a flexible multiple core-periphery architecture with large dynamic communities.
As predicted, language processing was distributed across much of the rest of the brain, together forming
multiple communities. Canonical language regions constituted hubs, explaining why they
consistently appear in various other neurobiology of language models. Moreover, language
processing was supported by other regions such as visual cortex and episodic memory regions, when processing more complex context-specific language features. Overall, our flexible and
distributed model of language comprehension and the brain points to additional brain regions
and pathways that could be exploited for novel and more individualised therapies for patients
suffering from speech impairments.
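The voxel-wise network analyses cannot be reproduced here, but the modularity score that community-detection algorithms of this kind maximise can be illustrated on a toy graph. All nodes, edges, and partitions below are invented; they are not brain data:

```python
# Dependency-free sketch of Newman modularity: two triangles (stand-ins
# for densely connected brain modules) joined by a single bridge edge.
edges = [(0, 1), (0, 2), (1, 2),   # module A: a triangle
         (3, 4), (3, 5), (4, 5),   # module B: a triangle
         (2, 3)]                   # bridge between the modules

nodes = sorted({n for e in edges for n in e})
m = len(edges)
degree = {n: sum(n in e for e in edges) for n in nodes}

def modularity(partition):
    """Newman modularity Q for a node -> community mapping:
    observed within-community edges minus the degree-based expectation."""
    q = 0.0
    for i in nodes:
        for j in nodes:
            if partition[i] == partition[j]:
                a_ij = 1.0 if (i, j) in edges or (j, i) in edges else 0.0
                q += a_ij - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)

good = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}  # split at the bridge
bad = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}   # arbitrary split

print(modularity(good) > modularity(bad))  # -> True
```

Community detection searches over partitions for high Q; core-periphery and hub analyses ask related questions about degree structure, with hubs being high-degree nodes like the bridge endpoints here.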
Understanding Semantic Implicit Learning through distributional linguistic patterns: A computational perspective
The research presented in this PhD dissertation provides a computational perspective on Semantic Implicit Learning (SIL). It puts forward the idea that SIL does not depend on semantic knowledge as classically conceived but upon semantic-like knowledge gained through distributional analysis of massive linguistic input. Using methods borrowed from the machine learning and artificial intelligence literature, we construct computational models, which can simulate the performance observed during behavioural tasks of semantic implicit learning in a human-like way. We link this methodology to the current literature on implicit learning, arguing that this behaviour is a necessary by-product of efficient language processing.
Chapter 1 introduces the computational problem posed by implicit learning in general, and semantic implicit learning, in particular, as well as the computational framework, used to tackle them.
Chapter 2 introduces distributional semantics models as a way to learn semantic-like representations from exposure to linguistic input.
Chapter 3 reports two studies on large datasets of semantic priming which seek to identify the computational model of semantic knowledge that best fits the data under conditions that resemble SIL tasks. We find that a model which acquires semantic-like knowledge gained through distributional analysis of massive linguistic input provides the best fit to the data.
Chapter 4 generalises the results of the previous two studies by looking at the performance of the same models in languages other than English.
Chapter 5 applies the results of the two previous Chapters on eight datasets of semantic implicit learning. Crucially, these datasets use various semantic manipulations and speakers of different L1s enabling us to test the predictions of different models of semantics.
Chapter 6 examines more closely two assumptions which we have taken for granted throughout this thesis. Firstly, we test whether a simpler model based on phonological information can explain the generalisation patterns observed in the tasks. Secondly, we examine whether our definition of the computational problem in Chapter 5 is reasonable.
Chapter 7 summarises and discusses the implications for implicit language learning and computational models of cognition. Furthermore, we offer one more study that seeks to bridge the literature on distributional models of semantics to 'deeper' models of semantics by learning semantic relations.
There are two main contributions of this dissertation to the general field of implicit learning research. Firstly, we highlight the superiority of distributional models of semantics in modelling unconscious semantic knowledge. Secondly, we question whether 'deep' semantic knowledge is needed to achieve above-chance performance in SIL tasks. We show how a simple model that learns through distributional analysis of the patterns found in the linguistic input can match the behavioural results in different languages. Furthermore, we link these models to more general problems faced in psycholinguistics such as language processing and learning of semantic relations.
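The distributional principle the dissertation builds on, that words occurring in similar contexts acquire similar representations, can be demonstrated in a few lines. The toy corpus and window size are assumptions for illustration, not the thesis materials:

```python
from collections import defaultdict
from math import sqrt

# Build sparse co-occurrence vectors from a tiny corpus with a
# symmetric context window of 2 words.
corpus = "the cat sat on the mat the dog sat on the rug".split()
window = 2

cooc = defaultdict(lambda: defaultdict(int))
for i, word in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            cooc[word][corpus[j]] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    return dot / (sqrt(sum(x * x for x in u.values())) *
                  sqrt(sum(x * x for x in v.values())))

# "cat" and "dog" occur in near-identical contexts, so their vectors
# are more similar to each other than to a function word like "on".
print(cosine(cooc["cat"], cooc["dog"]) > cosine(cooc["cat"], cooc["on"]))  # -> True
```

The models evaluated in Chapters 3 to 5 are far larger, but they rest on the same principle: semantic-like knowledge emerging from distributional analysis of linguistic input.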
Auditory comprehension: from the voice up to the single word level
Auditory comprehension, the ability to understand spoken language, consists of a
number of different auditory processing skills. In the five studies presented in this
thesis I investigated both intact and impaired auditory comprehension at different
levels: voice versus phoneme perception, as well as single word auditory
comprehension in terms of phonemic and semantic content.
In the first study, using sounds from different continua of 'male'-/pɛ/ to 'female'-/tɛ/
and 'male'-/tɛ/ to 'female'-/pɛ/, healthy participants (n=18) showed that phonemes
are categorised faster than voice, in contradistinction to the common hypothesis that
voice information is stripped away (or normalised) to access phonemic content.
Furthermore, reverse correlation analysis suggests that gender and phoneme are
processed on the basis of different perceptual representations. A follow-up study (same
paradigm) in stroke patients (n=25, right or left hemispheric brain lesions, both with
and without aphasia) showed that lesions of the right frontal cortex (likely ventral
inferior frontal gyrus) lead to systematic voice perception deficits, while left
hemispheric lesions can elicit both voice and phoneme deficits. Together these results
show that phoneme processing is lateralized while voice information processing
requires both hemispheres. Furthermore, this suggests that commencing Speech and
Language Therapy at a low level of acoustic processing/voice perception may be an
appropriate method in the treatment of phoneme perception impairments.
A longitudinal case study (CF) of crossed aphasia (a rare acquired communication
impairment secondary to a lesion ipsilateral to the dominant hand) is then presented
alongside a mini-review of the literature. Extensive clinical investigation showed that
CF presented with word-finding difficulties related to impaired auditory phonological
analysis, while functional Magnetic Resonance Imaging (fMRI) analyses showed right
hemispheric lateralization of language functions (reading, repetition and verb
generation). These results, together with the co-morbidity analysis from the mini-review,
suggest that crossed aphasia can be explained by developmental disorders
that cause a partial rightward shift in the lateralization of language processes. Interestingly, in CF
this process did not affect voice lateralization and information processing, suggesting
partial segregation of voice and speech processing.
In the last two studies, auditory comprehension was examined at the single word level
using a word-picture matching task with congruent (correct target) and incongruent
(semantic, phonological and unrelated foils) conditions. fMRI in healthy participants
(n=16) revealed a key role of the pars triangularis (phonological processing), the left
angular gyrus (semantic incongruency) and the left precuneus (semantic relatedness)
in this task, regions typically connected via the arcuate fasciculus and often impaired
in aphasia. Further investigation of stroke patients on the same task (n=15) suggested
that the connections between the angular gyrus and the pars triangularis serve a
fundamental role in semantic processing. The quality of a published word-picture
matching task was also investigated, with results questioning the clinical relevance of
this task as an assessment tool.
Finally, a pilot study looking at the effect of a computer-assisted auditory
comprehension therapy (React2©) in 6 stroke patients (vs. 6 healthy controls and 6
stroke patients without therapy) is presented. Results show that the more therapy
patients carry out, the greater the improvement seen in the semantic processing of single
nouns. However, these results need to be replicated on a larger scale before any
outcomes can be generalised.
Overall, the findings from these studies provide new insight into, and extend,
current cognitive and neuroanatomical models of voice perception, speech
perception and single word auditory comprehension. A combinatorial approach to
cognitive and neuroanatomical models is proposed to further research into
impaired auditory comprehension and thus improve clinical care.
Automatic Image Captioning with Style
This thesis connects two core topics in machine learning, vision
and language. The problem of choice is image caption generation:
automatically constructing natural language descriptions of image
content. Previous research into image caption generation has
focused on generating purely descriptive captions; I focus on
generating visually relevant captions with a distinct linguistic
style. Captions with style have the potential to ease
communication and add a new layer of personalisation.
First, I consider naming variations in image captions, and
propose a method for predicting context-dependent names that
takes into account visual and linguistic information. This method
makes use of a large-scale image caption dataset, which I also
use to explore naming conventions, reporting them for
hundreds of animal classes. Next I propose the SentiCap
model, which relies on recent advances in artificial neural
networks to generate visually relevant image captions with
positive or negative sentiment. To balance descriptiveness and
sentiment, the SentiCap model dynamically switches between two
recurrent neural networks, one tuned for descriptive words and
one for sentiment words. As the first published model for
generating captions with sentiment, SentiCap has influenced a
number of subsequent works. I then investigate the sub-task of
modelling styled sentences without images. The specific task
chosen is sentence simplification: rewriting news article
sentences to make them easier to understand.
For this task I design a neural sequence-to-sequence model that
can work with limited training data, using novel adaptations for
word copying and sharing word embeddings. Finally, I present
SemStyle, a system for generating visually relevant image
captions in the style of an arbitrary text corpus. A shared term
space allows a neural network for vision and content planning to
communicate with a network for styled language generation.
SemStyle achieves competitive results in human and automatic
evaluations of descriptiveness and style.
As a whole, this thesis presents two complete systems for styled
caption generation that are first of their kind and demonstrate,
for the first time, that automatic style transfer for image
captions is achievable. Contributions also include novel ideas
for object naming and sentence simplification. This thesis opens
up inquiries into highly personalised image captions; large scale
visually grounded concept naming; and more generally, styled text
generation with content control.
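The SentiCap switching mechanism can be caricatured without any neural machinery: replace the two RNNs with fixed unigram distributions and the learned gate with a fixed switch probability. Every word and number below is an invented illustration, not the published model:

```python
import random

# Two toy "language models": one for descriptive words, one for
# sentiment words, standing in for the two tuned RNNs.
descriptive = {"a": 0.3, "dog": 0.4, "park": 0.3}
sentiment = {"lovely": 0.6, "happy": 0.4}

def sample(dist):
    """Draw one word from a word -> probability distribution."""
    r, acc = random.random(), 0.0
    for word, p in dist.items():
        acc += p
        if r < acc:
            return word
    return word  # guard against floating-point rounding

def generate(length, switch_prob=0.25):
    """At each step, emit a sentiment word with probability switch_prob,
    otherwise a descriptive word (the real model learns this gate
    per step, conditioned on the image and the words so far)."""
    return [sample(sentiment if random.random() < switch_prob else descriptive)
            for _ in range(length)]

random.seed(0)
print(generate(6))
```

The full model conditions both word distributions and the switching decision on the image and generation history; this sketch only isolates the switching idea.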