1,021 research outputs found

    MISPRONUNCIATION DETECTION AND DIAGNOSIS IN MANDARIN ACCENTED ENGLISH SPEECH

    This work presents the development, implementation, and evaluation of a Mispronunciation Detection and Diagnosis (MDD) system, with application to pronunciation evaluation of Mandarin-accented English speech. A comprehensive detection and diagnosis of errors in the Electromagnetic Articulography corpus of Mandarin-Accented English (EMA-MAE) was performed using the expert phonetic transcripts and an Automatic Speech Recognition (ASR) system. Articulatory features derived from the parallel kinematic data available in the EMA-MAE corpus were used to identify the most significant articulatory error patterns seen in L2 speakers during common mispronunciations. Using both acoustic and articulatory information, an ASR-based MDD system was built and evaluated across different feature combinations and Deep Neural Network (DNN) architectures. The MDD system captured mispronunciation errors with a detection accuracy of 82.4%, a diagnostic accuracy of 75.8%, and a false rejection rate of 17.2%. The results demonstrate the advantage of using articulatory features both in revealing the significant contributors to mispronunciation and in improving the performance of MDD systems.
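    The three figures quoted in this abstract can be related to one another with a small sketch. The abstract does not define its metrics, so the code below assumes the hierarchical scoring commonly used in MDD evaluation (true acceptance, false rejection, false acceptance, and true rejection split into correct diagnoses and diagnosis errors). The class name, function name, and example counts are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class MDDCounts:
    """Phone-level outcome counts (hypothetical container, not from the paper)."""
    true_accept: int    # correctly pronounced phone, accepted as correct
    false_reject: int   # correctly pronounced phone, flagged as an error
    false_accept: int   # mispronounced phone, accepted as correct
    correct_diag: int   # mispronounced, flagged, error type identified correctly
    diag_error: int     # mispronounced, flagged, error type identified wrongly


def mdd_metrics(c: MDDCounts) -> dict:
    """Compute metrics under the assumed definitions; counts must be non-zero."""
    # Every flagged mispronunciation is a true rejection, whether or not
    # the diagnosis of the error type was right.
    true_reject = c.correct_diag + c.diag_error
    total = c.true_accept + c.false_reject + c.false_accept + true_reject
    return {
        # Share of all phones whose correct/incorrect status was judged right.
        "detection_accuracy": (c.true_accept + true_reject) / total,
        # Share of detected mispronunciations with the right error type.
        "diagnostic_accuracy": c.correct_diag / true_reject,
        # Share of correctly pronounced phones wrongly flagged as errors.
        "false_rejection_rate": c.false_reject / (c.true_accept + c.false_reject),
    }


# Made-up counts, chosen only to make the arithmetic easy to follow.
example = MDDCounts(true_accept=800, false_reject=200, false_accept=50,
                    correct_diag=150, diag_error=50)
metrics = mdd_metrics(example)
print(metrics)
```

    Under these assumed definitions, the false rejection rate is computed over correctly pronounced phones only, so it need not sum to 100% with the detection accuracy, which is computed over all phones.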

    Multi-View Multi-Task Representation Learning for Mispronunciation Detection

    The disparity in phonology between a learner's native (L1) and target (L2) languages poses a significant challenge for mispronunciation detection and diagnosis (MDD) systems. This challenge is further intensified by the lack of annotated L2 data. This paper proposes a novel MDD architecture that exploits multiple 'views' of the same input data, assisted by auxiliary tasks, to learn more distinctive phonetic representations in a low-resource setting. Using mono- and multilingual encoders, the model learns multiple views of the input and captures the sound properties across diverse languages and accents. These encoded representations are further enriched by learning articulatory features in a multi-task setup. Our reported results using the L2-ARCTIC data outperformed the SOTA models, with phoneme error rate reductions of 11.13% and 8.60% and absolute F1 score increases of 5.89% and 2.49% compared to the single-view mono- and multilingual systems, respectively, with a limited L2 dataset.

    Speaker Independent Acoustic-to-Articulatory Inversion

    Acoustic-to-articulatory inversion, the determination of articulatory parameters from acoustic signals, is a difficult but important problem for many speech processing applications, such as automatic speech recognition (ASR) and computer-aided pronunciation training (CAPT). In recent years, several approaches have been successfully implemented for speaker-dependent models with parallel acoustic and kinematic training data. However, in many practical applications inversion is needed for new speakers for whom no articulatory data is available. To address this problem, this dissertation introduces a novel speaker adaptation approach called Parallel Reference Speaker Weighting (PRSW), based on parallel acoustic and articulatory Hidden Markov Models (HMM). This approach uses a robust normalized articulatory space and palate-referenced articulatory features combined with speaker-weighted adaptation to form an inversion mapping for new speakers that can accurately estimate articulatory trajectories. The proposed PRSW method is evaluated on the newly collected Marquette electromagnetic articulography - Mandarin Accented English (EMA-MAE) corpus using 20 native English speakers. Cross-speaker inversion results show that, given a good selection of reference speakers with consistent acoustic and articulatory patterns, the PRSW approach gives good speaker-independent inversion performance even without kinematic training data.

    Sound structure and sound change: A modeling approach

    Research in linguistics, as in most other scientific domains, is usually approached in a modular way – narrowing the domain of inquiry in order to allow for increased depth of study. This is necessary and productive for a topic as wide-ranging and complex as human language. However, precisely because language is a complex system, tied to perception, learning, memory, and social organization, the assumption of modularity can also be an obstacle to understanding language at a deeper level. This book examines the consequences of enforcing non-modularity along two dimensions: the temporal, and the cognitive. Along the temporal dimension, synchronic and diachronic domains are linked by the requirement that sound changes must lead to viable, stable language states. Along the cognitive dimension, sound change and variation are linked to speech perception and production by requiring non-trivial transformations between acoustic and articulatory representations. The methodological focus of this work is on computational modeling. By formalising and implementing theoretical accounts, modeling can expose theoretical gaps and covert assumptions. To do so, it is necessary to formally assess the functional equivalence of specific implementational choices, as well as their mapping to theoretical structures. This book applies this analytic approach to a series of implemented models of sound change. As theoretical inconsistencies are discovered, possible solutions are proposed, incrementally constructing a set of sufficient properties for a working model. Because internal theoretical consistency is enforced, this model corresponds to an explanatorily adequate theory. And because explicit links between modules are required, this is a theory, not only of sound change, but of many aspects of phonological competence. 
The book highlights two aspects of modeling work that receive relatively little attention: the formal mapping from model to theory, and the scalability of demonstration models. Focusing on these aspects of modeling makes it clear that any theory of sound change in the specific is impossible without a more general theory of language: of the relationship between perception and production, the relationship between phonetics and phonology, the learning of linguistic units, and the nature of underlying representations. Theories of sound change that do not explicitly address these aspects of language are making tacit, untested assumptions about their properties. Addressing so many aspects of language may seem to complicate the linguist's task. However, as this book shows, it actually helps impose boundary conditions of ecological validity that reduce the theoretical search space.
