Auditory speech perception can be described as the task of mapping an auditory
signal onto meaning. We routinely perform this task in an automatic and effortless
manner, which might conceal the complexity of the underlying process. The speech
signal, however, is highly variable, ambiguous, and usually perceived in noise.
One strategy the brain might use to handle this task is to generate
predictions about the incoming auditory stream.
Prediction plays a prominent role in cognitive functions ranging from perception
to motor control. In the specific case of speech perception, evidence shows
that listeners are able to make predictions about incoming speech stimuli. Word
processing, for example, is facilitated by the context of a sentence. Furthermore,
electroencephalography studies have shown neural correlates that behave like error
signals triggered when an unexpected word is encountered.
These examples of prediction in speech processing, however, occur between words
and rely on semantic and/or syntactic knowledge. Given the salient role of prediction
in other cognitive domains, we hypothesize that prediction might also operate in
speech processing at the phonological level (within words), independently
of higher-level information such as syntax or semantics. In other words, the brain
might use the first phonemes of a word to anticipate the phonemes that should
follow.
To test this hypothesis, we performed three electroencephalography experiments
with an oddball design. This approach allowed us to present individual words in
a context that contains neither semantic nor syntactic information. Additionally,
this type of experimental design is optimal for eliciting event-related
potentials that are well-established markers of prediction violation, such as
the Mismatch Negativity (MMN) and P3b responses.
In these experiments, participants heard repetitions of standard words, among
which deviant words were presented infrequently. Importantly, deviant words were
composed of the same syllables as standard words, but in different combinations.
For example, if XXX and YYY were two standard words in an experiment,
XXY could be a deviant word. We expected that if, as we proposed, the first
phonemes of a word are used to predict the following ones, encountering
a deviant of this kind would elicit a prediction error signal.
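The stimulus logic of such an oddball block can be sketched in a few lines of Python. This is an illustrative sketch only, not the actual stimulus-generation code of the experiments: the function name, trial count, and deviant probability are assumptions, and the labels XXX/YYY/XXY stand in for the spoken words used in the studies.

```python
import random

def make_oddball_sequence(standards, deviants, n_trials=400,
                          deviant_prob=0.1, seed=0):
    """Build a hypothetical trial list for an auditory oddball block.

    Each trial is a (condition, word) pair. Standards are frequent;
    deviants (here, recombinations of the standards' syllables, e.g.
    "XXY" from "XXX" and "YYY") are rare. Real designs typically add
    further constraints, such as a minimum number of standards
    between consecutive deviants.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible sequence
    trials = []
    for _ in range(n_trials):
        if rng.random() < deviant_prob:
            trials.append(("deviant", rng.choice(deviants)))
        else:
            trials.append(("standard", rng.choice(standards)))
    return trials

# Example: two standards and one recombined deviant, as in the text.
trials = make_oddball_sequence(["XXX", "YYY"], ["XXY"], n_trials=20, seed=1)
```

Because a deviant shares its initial syllables with the standards, it only becomes detectable as deviant at the point where the syllable sequence diverges, which is what makes this design suitable for probing within-word prediction.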
In Chapter 3, we establish that, as expected, the presentation of deviant
words, composed of an unexpected sequence of phonemes, generates a chain of
well-established prediction error signals, which we take as evidence of the prediction of
the forthcoming phonemes of a word. Furthermore, we show that the amplitude of
these error signals is modulated by the number of congruent syllables presented
before the point of deviance, which suggests that prediction strength can increase
within a word as previous predictions prove successful.
In Chapter 4, we study the modulating role of attentional set on the chain
of prediction error signals. In particular, we show that while high-level prediction
(indexed by the P3b response) is deployed strategically depending on the task at hand,
early prediction error signals such as the MMN response are generated automatically,
even when participants are simply instructed to listen to all the words. These results
imply that phonological predictions are automatically deployed while listening to
words, regardless of task demands.
In Chapter 5, we extend our results to a more complex stimulus set that resembles
natural speech more closely. Furthermore, we show that the amplitude of the
MMN and P3b prediction error signals correlates with participants' reaction times
in an online deviant-detection task. This provides a strong argument in favor of a
functional role for phonological predictions in speech processing.
Taken together, this work shows that phonological predictions can be generated
even in the absence of higher-level information such as syntax and semantics. This
might help the human brain accomplish the challenging task of mapping a signal as
variable and noisy as speech onto meaning in real time.