15,422 research outputs found

    Practical Hidden Voice Attacks against Speech and Speaker Recognition Systems

    Voice Processing Systems (VPSes), now widely deployed, have been made significantly more accurate through the application of recent advances in machine learning. However, adversarial machine learning has similarly advanced and has been used to demonstrate that VPSes are vulnerable to the injection of hidden commands: audio obscured by noise that is correctly recognized by a VPS but not by human beings. Such attacks, though, are often highly dependent on white-box knowledge of a specific machine learning model and limited to specific microphones and speakers, which limits their use across different acoustic hardware platforms (and thus their practicality). In this paper, we break these dependencies and make hidden command attacks more practical through model-agnostic (black-box) attacks, which exploit knowledge of the signal processing algorithms commonly used by VPSes to generate the data fed into machine learning systems. Specifically, we exploit the fact that multiple source audio samples have similar feature vectors when transformed by acoustic feature extraction algorithms (e.g., FFTs). We develop four classes of perturbations that create unintelligible audio and test them against 12 machine learning models, including 7 proprietary models (e.g., Google Speech API, Bing Speech API, IBM Speech API, Azure Speaker API), and demonstrate successful attacks against all targets. Moreover, we successfully use our maliciously generated audio samples in multiple hardware configurations, demonstrating effectiveness across both models and real systems. In so doing, we demonstrate that domain-specific knowledge of audio signal processing represents a practical means of generating successful hidden voice command attacks.
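The abstract's central observation, that different audio samples can map to similar feature vectors, can be illustrated with a small sketch. It assumes a feature extractor that keeps only the magnitude of a frame's discrete Fourier transform, and it uses time reversal as one convenient example of a many-to-one input; the abstract does not say that this is one of the paper's four perturbation classes. The names `dft_magnitudes` and `max_abs_diff` are hypothetical helpers, and the random frame is a stand-in for real audio.

```python
import cmath
import math
import random


def dft_magnitudes(frame):
    """Return |X[k]| for a naive O(N^2) DFT of a real-valued frame (hypothetical helper)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n)
    ]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length sequences."""
    return max(abs(p - q) for p, q in zip(a, b))


random.seed(0)  # arbitrary seed so the sketch is reproducible
frame = [random.uniform(-1.0, 1.0) for _ in range(64)]  # stand-in for one audio frame
reversed_frame = frame[::-1]  # same samples in reverse order

mags = dft_magnitudes(frame)
mags_reversed = dft_magnitudes(reversed_frame)

# The two waveforms differ sample by sample...
waveform_gap = max_abs_diff(frame, reversed_frame)
# ...but their magnitude spectra agree up to floating-point error.
feature_gap = max_abs_diff(mags, mags_reversed)

print(f"waveform difference: {waveform_gap:.3f}")
print(f"magnitude-spectrum difference: {feature_gap:.2e}")
```

Reversing a real-valued frame changes only the phase of each DFT coefficient, so any pipeline that discards phase cannot tell the two frames apart, even though they sound different to a listener.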

    Jazz Improvisation, the Body, and the Ordinary

    What is one doing when one improvises music, as one does in jazz? There are two sorts of account prominent in jazz literature. The traditional answer is that one is organizing sound materials in the only way they can be organized if they are to be musical. This implies that jazz solos are to be interpreted with the procedures of written music in mind. A second, more controversial answer is offered in David Sudnow's pioneering account of the phenomenology of improvisation, Ways of the Hand. Sudnow claims that learning to improvise at the piano is concerned centrally with copying the bodily ways of one's mentors and finding how one's instructable hands and the keyboard come to answer to one another, so that "to define jazz ... is to describe the body's ways." But despite being more sensitive than the traditional account, Sudnow's account is flawed both as a description of how improvisatory skill is acquired and as a model for describing the interest of jazz. My critique of Sudnow compares his account to Augustine's account of learning language, and finds that Wittgenstein's criticisms of Augustine extend to Sudnow. I offer a third approach to understanding improvised music, one which treats the procedures of improvisation as derived from, and importantly at play in, our everyday actions.

    Aims and methods in teaching typewriting

    Thesis (Ed.M.)--Boston University. Cover page is damaged

    Adversarial Black-Box Attacks on Automatic Speech Recognition Systems using Multi-Objective Evolutionary Optimization

    Fooling deep neural networks with adversarial input has exposed a significant vulnerability in the current state-of-the-art systems in multiple domains. Both black-box and white-box approaches have been used either to replicate the model itself or to craft examples which cause the model to fail. In this work, we propose a framework which uses multi-objective evolutionary optimization to perform both targeted and un-targeted black-box attacks on Automatic Speech Recognition (ASR) systems. We apply this framework to two ASR systems, Deepspeech and Kaldi-ASR, and increase the Word Error Rates (WER) of these systems by up to 980%, indicating the potency of our approach. During both un-targeted and targeted attacks, the adversarial samples maintain a high acoustic similarity of 0.98 and 0.97 with the original audio. Comment: Published in Interspeech 201
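For readers unfamiliar with the metric, the sketch below shows the standard definition of Word Error Rate as word-level edit distance divided by the reference length, together with a relative-increase helper. It is a generic illustration, not the paper's evaluation code: the function names are hypothetical, the transcripts and WER values are made up, and reading the reported percentage as a relative increase is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level Levenshtein distance, keeping only the previous row of the table.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def relative_increase_percent(before, after):
    """Relative change from `before` to `after`, in percent (assumed interpretation)."""
    return (after - before) / before * 100.0


# Made-up transcripts: two of four reference words are wrong.
clean = word_error_rate("turn on the lights", "turn on the lights")
attacked = word_error_rate("turn on the lights", "turn off the light")
print(clean, attacked)

# Hypothetical WER values, not taken from the paper.
print(relative_increase_percent(0.10, 0.25))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, which is one reason large relative increases are possible.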

    An exploration of how jazz improvisation is taught

    The purpose of this study was to explore how master jazz pedagogues and artist-level jazz musicians used pedagogical content knowledge to sequence their instructional methods when teaching jazz improvisation. Pedagogical content knowledge served as the theoretical framework for this study. To gain insights into how they used their knowledge when teaching jazz improvisation, I first sought to explore how they learned to improvise. An overarching research question, “How did the participants learn to improvise in jazz?”, helped me contextualize how they learned content and pedagogy when they began to improvise. Then, the following questions guided my investigation into how these participants used their pedagogical and content knowledge when they taught jazz improvisation: (1) How, if at all, did the participants’ curriculum knowledge influence their approaches to teaching jazz improvisation? (2) How, if at all, did the participants’ pedagogical knowledge influence their approaches to teaching jazz improvisation? (3) How, if at all, did the participants’ content knowledge influence their approaches to teaching jazz improvisation? In this study, the artist-level musicians and master jazz pedagogues all subscribed to an organic mode of teaching jazz improvisation rather than the one-size-fits-all approach that many published jazz materials espouse. Most of these participants did not use an established curriculum for teaching, but relied instead on their knowledge of their students and on their own content knowledge, drawing on what they knew and how they had learned it to determine best teaching practices. Based on the pedagogical content knowledge they provided in this study, I devised a model for teaching jazz improvisation to undergraduate students. I organized this model by developing an eight-semester, or four-year, sequence of pedagogy and content for instruction. 
For each academic year, I present a description of what I learned from the participants, and how this pedagogical content knowledge can be used with students to learn how to improvise in jazz. I then present a two-semester outline (one academic year) that demonstrates how the pedagogical principles and content knowledge shared by the participants in this study can be sequenced. Each of the participants in this study taught their students based on their own content knowledge and their knowledge of their students. In order to teach jazz and jazz improvisation, preservice teachers need more than just a casual experience with jazz pedagogy, and should look to increase their own content knowledge in the area of jazz through both formal and informal educational opportunities. Furthermore, the scope of this study was limited to world-renowned jazz musicians and educators who taught at the university level, and it considered only the perspectives of jazz educators. Additional studies could focus on active school music teachers who identify as jazz educators, or could examine students' perspectives on how they learn pedagogy and content and how they use and retain this knowledge when improvising. Keywords: jazz, jazz pedagogy, jazz improvisation, pedagogical content knowledge, jazz education

    Revising the Sound Value of Meroitic D: A Phonological Approach

    The Meroitic sign d and its cursive equivalent d have been the subject of a number of investigations into their origins and, in particular, into the sound value to attribute to them. In trying to deduce a corresponding sound value for this sign, Griffith used comparative forms from Greek and Egyptian, although these forms gave contradictory indications. This led to an unstable proposal that the Meroitic sign d d represents a retroflex consonant, although this proposal and subsequent affirmations of its retroflex nature did not consider empirical and typological phonological evidence for this association. This paper revisits the comparative forms used in proposing the retroflex nature of the sign d d and uses a phonological approach in proposing a revision of its sound value.

    Slips of the Ear Experienced by the English Department Students

    Abstract The aim of this study was to investigate slips of the ear, that is, students' misperceptions of a monologue narrated by a native English speaker. The study used a qualitative descriptive method with twenty-three students majoring in English Education at the State University of Surabaya, all from the same class and level, as research participants. They were instructed to write down what they heard from the monologue recording. Several slips of the ear were found in what the participants wrote. These data were classified into categories based on Bond’s theory of slips of the ear (2005). Bond classifies fifteen errors, grouped into five levels of knowledge: phonetic knowledge; phonological knowledge; lexical knowledge; syntactic knowledge; and pragmatics and semantics. The results indicate that errors occurred at three levels of knowledge: phonetic, phonological, and lexical. In addition, only seven of Bond's fifteen kinds of error were found. Keywords: slips of the ear, phonetic knowledge, phonological knowledge, lexical knowledge