Towards an Integrative Information Society: Studies on Individuality in Speech and Sign
The flow of information within modern information society has increased rapidly over the last decade. The major part of this information flow relies on the individual’s ability to handle text or speech input. For the majority of us this presents no problems, but there are some individuals who would benefit from other means of conveying information, e.g. signed information flow. During the last decades, new results from various disciplines have all pointed towards a common background and common processing for sign and speech, and this was one of the key issues that I wanted to investigate further in this thesis. The basis of this thesis is firmly within speech research, and that is why I wanted to design test batteries for signers that are analogous to widely used speech perception tests, to find out whether the results for signers would be the same as in speakers’ perception tests. One of the key findings within biology, and more precisely in its effects on speech and communication research, is the mirror neuron system. That finding has enabled us to form new theories about the evolution of communication, and it all seems to converge on the hypothesis that all communication has a common core within humans.
In this thesis speech and sign are discussed as equal and analogous counterparts of communication, and all research methods used in speech are modified for sign. Both speech and sign are thus investigated using similar test batteries. Furthermore, both production and perception of speech and sign are studied separately. An additional framework for studying production is given by gesture research using cry sounds. Results of cry sound research are then compared to results from children acquiring sign language. These results show that individuality manifests itself from very early on in human development. Articulation in adults, both in speech and sign, is studied from two perspectives: normal production and re-learning production when the apparatus has been changed. Normal production is studied both in speech and sign, and the effects of changed articulation are studied with regard to speech. Both these studies are done using carrier sentences. Furthermore, sign production is studied giving the informants the possibility for spontaneous speech. The production data from the signing informants are also used as the basis for input in the sign synthesis stimuli used in the sign perception test battery.
Speech and sign perception were studied using the informants’ answers to questions using forced choice in identification and discrimination tasks. These answers were then compared across language modalities. Three different informant groups participated in the sign perception tests: native signers, sign language interpreters and Finnish adults with no knowledge of any signed language. This gave a chance to investigate which of the characteristics found in the results were due to the language per se and which were due to the changes in modality itself.
As the analogous test batteries yielded similar results over different informant groups, some common threads of results could be observed. Starting from very early on in acquiring speech and sign, the results were highly individual. However, the results were the same within one individual when the same test was repeated. This individuality of results followed the same patterns across different language modalities and, on some occasions, across language groups. As both modalities yield similar answers to analogous study questions, this has led us to provide methods for basic input for sign language applications, i.e. signing avatars. This has also given us answers to questions on the precision of the animation and intelligibility for the users: what are the parameters that govern the intelligibility of synthesised speech or sign, and how precise must the animation or synthetic speech be in order for it to be intelligible? The results also give additional support to the well-known fact that intelligibility is not the same as naturalness. In some cases, as shown within the sign perception test battery design, naturalness decreases intelligibility. This also has to be taken into consideration when designing applications.
All in all, results from each of the test batteries, be they for signers or speakers, yield strikingly similar patterns, which would indicate yet further support for the common core for all human communication. Thus, we can modify and deepen the phonetic framework models for human communication based on the knowledge obtained from the results of the test batteries within this thesis.
L2 speech learning of European Portuguese /l/ and /ɾ/ by L1-Mandarin learners: experimental evidence and theoretical modelling
It has long been recognized that the poor distinction between /l/ and /ɾ/ is one
of the most perceptible characteristics in Chinese-accented Portuguese. Recent
empirical research revealed that this notorious L2 speech learning difficulty
goes beyond the confusion between two L2 categories, as L1-Mandarin learners’
acquisition of Portuguese /l/ and /ɾ/ seems to be subject to the interaction
among different prosodic positions, speech modalities and representational
levels. This thesis aims to deepen our current understanding of this L2 speech
learning process, by exploring what constrains the development of L2
phonological categories across syllable positions and how different modalities
interact during this process. To achieve this goal, both experimental tasks and
theoretical modelling were employed.
The first study of this thesis explores the role of cross-linguistic influence
and orthography on L2 category formation. In order to elicit cross-linguistic
influence directly, a delayed-imitation task was performed with L1-Mandarin
naïve listeners. This task examined how the Mandarin phonology parses the
Portuguese input ([l], [ɾ]) in intervocalic onset and in word-internal coda
position. Moreover, whether orthography plays a role during the construction
of L2 phonological representation was tested by manipulating the input types
that were given in the experiment (auditory input alone vs. auditory + written
input). Our study shows that naïve Mandarin listeners’ responses were consistent
with those of L1-Mandarin learners, suggesting that cross-linguistic influence is
responsible for the observed L2 prosodic effects. Moreover, the Mandarin [ɻ] (a
repair strategy for /ɾ/) occurred almost exclusively when the written form was
given, providing evidence for the cross-linguistic interaction between
phonological categorization and orthography during the construction of L2
categories.
In the second study, we first investigate the interaction between speech
perception and production in L2 speech learning, by examining whether the L2
deviant productions stem from misperception and whether the order of
acquisition in L2 speech perception mirrors that in production. Secondly, we
test whether L2 phonological categories remain malleable at a mid-late stage of
L2 speech learning. Two perceptual experiments were performed to test L1-Mandarin learners on their discrimination ability between the target
Portuguese form and the deviant form employed in L2 production. Expanding
on prior research, in this study, the perceptual motivation for L2 speech
difficulties was assessed in different syllable constituents (onset and coda) and
at both segmental and suprasegmental levels (structural modification). The
results demonstrate that some deviant forms observed in L2 production indeed
have a perceptual motivation ([w] for the velarised lateral; [l] and [ɾə] for the
tap), while some others cannot be attributed to misperception (deletion of
syllable-final tap). Furthermore, learners confused the intervocalic /l/ and /ɾ/
bidirectionally in perception, while in production they never misproduced the
lateral (/ɾ/ → [l], */l/ → [ɾ]), revealing a mismatch between two speech
modalities. By contrast, the order of acquisition (/ɾ/coda > /ɾ/onset) was shown to
be consistent in L2 perception and production. The correspondence and
discrepancy between the two speech modalities signal a complex relationship
between L2 speech perception and production. To assess the plasticity of L2
categories /l/ and /ɾ/, two groups of L1-Mandarin learners who differ
substantially in terms of L2 experience were recruited in the perceptual tasks.
Our study shows that both groups behaved similarly in terms of the
discrimination performance. No evidence for a role of L2 experience was found.
The implication of this null result for L2 phonological development is discussed.
The third study of the thesis aims to contribute to bridging the gap between
the L2 experimental evidence and formal theories. Adopting the Bidirectional
Phonology and Phonetics Model, we formalise some of the experimental
findings that cannot be elucidated by current L2 speech theories, namely, the
between- and within-subject variation in L2 phonological categorization; the
interaction between phonological categorization and orthography during L2
category construction; and the asymmetry between L2 perception and
production.
Overall, this thesis sheds light on the complex nature of L2 phonological
acquisition and provides a formal account of how different modalities interact
in shaping L2 speech learning. Moreover, it puts forward testable predictions
for future research and suggestions for improving foreign language
teaching/training methodologies.
It is well known that substitutions involving /l/ and /ɾ/ are among the most
perceptible characteristics of Portuguese as spoken by Chinese learners. Recent
empirical studies reveal that the difficulty experienced by Chinese learners is
not restricted to the moderate discrimination between the two L2 categories,
since the acquisition of Portuguese /l/ and /ɾ/ by Chinese learners appears to
be subject to the interaction among different prosodic contexts, speech
modalities and representational levels. This thesis aims to deepen our
understanding of this process of L2 phonological acquisition, exploring what
conditions the development of L2 phonological categories in different syllable
constituents and how the modalities interact during this process, drawing on
both experimental tasks and theoretical formalisation.
The first study investigates the role of cross-linguistic influence and that of
orthography in the construction of L2 categories. To elicit cross-linguistic
influence directly, a delayed-imitation task was administered to native
speakers of Mandarin with no knowledge of Portuguese, thereby investigating how
Mandarin phonology categorizes the Portuguese input ([l], [ɾ]) in intervocalic
simple onset and in word-medial coda. In addition, the influence of orthography
on the construction of L2 phonological representations was examined by
manipulating the type of input presented in the experiment (auditory input vs.
auditory + orthographic input). The results of the experimental condition in
which participants received both types of input replicated the prosodic effect
observed in the literature, evidencing the interaction between phonological
categorization and orthography in the construction of L2 categories.
In the second study, we investigate the interaction between speech perception
and production in the acquisition of the European Portuguese liquids by Chinese
learners, and the plasticity of these phonological categories, by answering the
following questions: 1) do deviant L2 productions result from misperception?
2) is the order of acquisition in L2 consistent across perception and
production? 3) do L2 categories remain malleable at an intermediate stage of
acquisition? Two perceptual tasks were conducted to test the ability of
L1-Mandarin learners to discriminate between the Portuguese target form and the
deviant forms used in production. In the present study, the perceptual
motivation for L2 difficulties was tested in different syllable constituents
(simple onset and coda) and at the segmental and suprasegmental levels
(structural modification). The results demonstrate that some deviant forms
produced by Chinese learners have a perceptual motivation (i.e. [w] for the
velarised lateral; [l] and [ɾə] for the alveolar tap), while others cannot be
analysed as cases of misperception (such as deletion of the tap in coda).
Furthermore, in intervocalic position, learners have difficulty discriminating
between /l/ and /ɾ/ bidirectionally, but in production the lateral is never
produced incorrectly (/ɾ/ → [l], */l/ → [ɾ]). This reveals a divergence between
the two speech modalities. By contrast, the order of acquisition
(/ɾ/coda > /ɾ/onset) was shown to be consistent in L2 perception and
production. The correspondence and the discrepancy between the two speech
modalities signal a complex relationship between perception and production in
L2 phonological acquisition. Regarding the plasticity of L2 categories, two
groups of L1-Mandarin learners who differed substantially in L2 experience were
recruited for the perceptual tasks. No significant effect of L2 experience was
found. The implication of this null result for L2 phonological development is
discussed.
The third study of this thesis aims to help bridge the gaps between empirical
L2 studies and formal theories. Adopting the Bidirectional Phonology and
Phonetics Model, we formalise the experimental results that current theories of
L2 phonological acquisition cannot explain, namely, between- and within-subject
variation in L2 phonological categorization; the interaction between
phonological categorization and orthography in the construction of L2
categories; and the asymmetry between L2 perception and production.
In sum, this thesis contributes empirical data to the discussion of the complex
relationship among perception, production and orthography in L2 phonological
acquisition, and formalises the interaction among these modalities by means of
a generative linguistic model. In addition, it presents testable predictions
for future research and suggestions for improving non-native language
teaching/training methodologies.
A Dynamic Approach to Rhythm in Language: Toward a Temporal Phonology
It is proposed that the theory of dynamical systems offers appropriate tools
to model many phonological aspects of both speech production and perception. A
dynamic account of speech rhythm is shown to be useful for description of both
Japanese mora timing and English timing in a phrase repetition task. This
orientation contrasts fundamentally with the more familiar symbolic approach to
phonology, in which time is modeled only with sequentially arrayed symbols. It
is proposed that an adaptive oscillator offers a useful model for perceptual
entrainment (or `locking in') to the temporal patterns of speech production.
This helps to explain why speech is often perceived to be more regular than
experimental measurements seem to justify. Because dynamic models deal with
real time, they also help us understand how languages can differ in their
temporal detail---contributing to foreign accents, for example. The fact that
languages differ greatly in their temporal detail suggests that these effects
are not mere motor universals, but that dynamical models are intrinsic
components of the phonological characterization of language.
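The idea of perceptual entrainment can be illustrated with a minimal sketch. This is not the oscillator model of the paper above; it is a generic phase-and-period correction loop, and the function name `entrain`, the two gain parameters and the example onset times are all assumptions made for illustration. Each heard onset is compared with the expected beat time, and the timing error nudges both the next expectation and the internal period.

```python
def entrain(onsets, period, phase_gain=0.5, period_gain=0.2):
    """Track a sequence of onset times with a simple adaptive oscillator.

    Hypothetical illustration: `phase_gain` and `period_gain` are assumed
    correction strengths, not values taken from any published model.
    Returns the oscillator period after each onset following the first.
    """
    expected = onsets[0] + period  # first predicted beat time
    periods = []
    for onset in onsets[1:]:
        error = onset - expected          # positive if the onset came late
        period += period_gain * error     # adapt the internal tempo
        # shift the next expectation by part of the error, then advance one period
        expected = expected + phase_gain * error + period
        periods.append(period)
    return periods


# Isochronous input every 0.5 s; the oscillator starts too fast (0.4 s).
onsets = [0.5 * i for i in range(41)]
periods = entrain(onsets, period=0.4)
print(round(periods[0], 3), round(periods[-1], 3))
```

With these gains the period estimate oscillates slightly and then settles on the input interval, which is one way to picture "locking in" to a temporal pattern.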
The Effects of a Combined Output and Input-Oriented Approach in Teaching Reported Speech
The participants of the study were 74 first-year students of English philology, divided into four groups: three treatment groups and one control group. The study results do not mirror those reported in the vast majority of the relevant literature and indicate that, although input manipulation appears to have a more beneficial effect on the development of the interlanguage than the analysis of output, a combination of the two approaches turns out to be the most beneficial and economical
Recognizing Speech in a Novel Accent: The Motor Theory of Speech Perception Reframed
The motor theory of speech perception holds that we perceive the speech of
another in terms of a motor representation of that speech. However, when we
have learned to recognize a foreign accent, it seems plausible that recognition
of a word rarely involves reconstruction of the speech gestures of the speaker
rather than the listener. To better assess the motor theory and this
observation, we proceed in three stages. Part 1 places the motor theory of
speech perception in a larger framework based on our earlier models of the
adaptive formation of mirror neurons for grasping, and for viewing extensions
of that mirror system as part of a larger system for neuro-linguistic
processing, augmented by the present consideration of recognizing speech in a
novel accent. Part 2 then offers a novel computational model of how a listener
comes to understand the speech of someone speaking the listener's native
language with a foreign accent. The core tenet of the model is that the
listener uses hypotheses about the word the speaker is currently uttering to
update probabilities linking the sound produced by the speaker to phonemes in
the native language repertoire of the listener. This, on average, improves the
recognition of later words. This model is neutral regarding the nature of the
representations it uses (motor vs. auditory). It serves as a reference point for
the discussion in Part 3, which proposes a dual-stream neuro-linguistic
architecture to revisit claims for and against the motor theory of speech
perception and the relevance of mirror neurons, and extracts some implications
for the reframing of the motor theory
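The core tenet described in Part 2 of the abstract above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' model: the class name `AccentListener`, the smoothed count table, the one-to-one alignment of sounds with phonemes and the toy lexicon are all hypothetical. The listener picks the most probable word for a heard sound sequence, then uses that word hypothesis to strengthen the links between each heard sound and the phoneme it stood for.

```python
class AccentListener:
    """Toy listener that adapts sound-to-phoneme probabilities (assumed design)."""

    def __init__(self, lexicon, phonemes, prior=1.0, identity_bonus=2.0):
        self.lexicon = lexicon            # word -> tuple of native phonemes
        self.phonemes = phonemes
        self.prior = prior                # smoothing count for every pairing
        self.identity_bonus = identity_bonus  # initial bias: a sound is itself
        self.counts = {}                  # sound -> {phoneme: count}

    def _row(self, sound):
        # Lazily create the count row for a sound the first time it is heard.
        if sound not in self.counts:
            self.counts[sound] = {
                p: self.prior + (self.identity_bonus if p == sound else 0.0)
                for p in self.phonemes
            }
        return self.counts[sound]

    def prob(self, sound, phoneme):
        row = self._row(sound)
        return row[phoneme] / sum(row.values())

    def recognize(self, sounds):
        # Score every word of matching length; no learning happens here.
        best_word, best_score = None, 0.0
        for word, phones in self.lexicon.items():
            if len(phones) != len(sounds):
                continue
            score = 1.0
            for s, p in zip(sounds, phones):
                score *= self.prob(s, p)
            if score > best_score:
                best_word, best_score = word, score
        return best_word

    def hear(self, sounds):
        # Recognize, then update the links using the word hypothesis.
        word = self.recognize(sounds)
        if word is not None:
            for s, p in zip(sounds, self.lexicon[word]):
                self._row(s)[p] += 1.0
        return word


phonemes = ["ð", "z", "ɪ", "s", "æ", "t", "ɛ", "n"]
lexicon = {
    "this": ("ð", "ɪ", "s"),
    "that": ("ð", "æ", "t"),
    "then": ("ð", "ɛ", "n"),
    "zen": ("z", "ɛ", "n"),
}
listener = AccentListener(lexicon, phonemes)
before = listener.recognize(["z", "ɛ", "n"])
# A hypothetical speaker who produces [z] where the listener expects /ð/.
for utterance in (["z", "ɪ", "s"], ["z", "æ", "t"], ["z", "ɪ", "s"]):
    listener.hear(utterance)
after = listener.recognize(["z", "ɛ", "n"])
print(before, after)
```

In this toy run the ambiguous sequence is first heard as "zen" and, after three unambiguous words from the same speaker, as "then", showing how earlier word hypotheses can change the recognition of later words.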
PRESENCE: A human-inspired architecture for speech-based human-machine interaction
Recent years have seen steady improvements in the quality and performance of speech-based human-machine interaction driven by a significant convergence in the methods and techniques employed. However, the quantity of training data required to improve state-of-the-art systems seems to be growing exponentially and performance appears to be asymptotic to a level that may be inadequate for many real-world applications. This suggests that there may be a fundamental flaw in the underlying architecture of contemporary systems, as well as a failure to capitalize on the combinatorial properties of human spoken language. This paper addresses these issues and presents a novel architecture for speech-based human-machine interaction inspired by recent findings in the neurobiology of living systems. Called PRESENCE ("PREdictive SENsorimotor Control and Emulation"), this new architecture blurs the distinction between the core components of a traditional spoken language dialogue system and instead focuses on a recursive hierarchical feedback control structure. Cooperative and communicative behavior emerges as a by-product of an architecture that is founded on a model of interaction in which the system has in mind the needs and intentions of a user and a user has in mind the needs and intentions of the system
Self-, other-, and joint monitoring using forward models
In the psychology of language, most accounts of self-monitoring assume that it is based on comprehension. Here we outline and develop the alternative account proposed by Pickering and Garrod (2013), in which speakers construct forward models of their upcoming utterances and compare them with the utterance as they produce them. We propose that speakers compute inverse models derived from the discrepancy (error) between the utterance and the predicted utterance and use that to modify their production command or (occasionally) begin anew. We then propose that comprehenders monitor other people’s speech by simulating their utterances using covert imitation and forward models, and then comparing those forward models with what they hear. They use the discrepancy to compute inverse models and modify their representation of the speaker’s production command, or realize that their representation is incorrect and may develop a new production command. We then discuss monitoring in dialogue, paying attention to sequential contributions, concurrent feedback, and the relationship between monitoring and alignment
An integrated theory of language production and comprehension
Currently, production and comprehension are regarded as quite distinct in accounts of language processing. In rejecting this dichotomy, we instead assert that producing and understanding are interwoven, and that this interweaving is what enables people to predict themselves and each other. We start by noting that production and comprehension are forms of action and action perception. We then consider the evidence for interweaving in action, action perception, and joint action, and explain such evidence in terms of prediction. Specifically, we assume that actors construct forward models of their actions before they execute those actions, and that perceivers of others' actions covertly imitate those actions, then construct forward models of those actions. We use these accounts of action, action perception, and joint action to develop accounts of production, comprehension, and interactive language. Importantly, they incorporate well-defined levels of linguistic representation (such as semantics, syntax, and phonology). We show (a) how speakers and comprehenders use covert imitation and forward modeling to make predictions at these levels of representation, (b) how they interweave production and comprehension processes, and (c) how they use these predictions to monitor the upcoming utterances. We show how these accounts explain a range of behavioral and neuroscientific data on language processing and discuss some of the implications of our proposal
Towards a complete multiple-mechanism account of predictive language processing [Commentary on Pickering & Garrod]
Although we agree with Pickering & Garrod (P&G) that prediction-by-simulation and prediction-by-association are important mechanisms of anticipatory language processing, this commentary suggests that they: (1) overlook other potential mechanisms that might underlie prediction in language processing, (2) overestimate the importance of prediction-by-association in early childhood, and (3) underestimate the complexity and significance of several factors that might mediate prediction during language processing