
    Acoustic model adaptation for ortolan bunting (Emberiza hortulana L.) song-type classification

    Automatic systems for vocalization classification often require fairly large amounts of training data. However, collecting and transcribing animal vocalization data is difficult and time-consuming, which makes large data sets expensive to create. One natural solution to this problem is the use of acoustic adaptation methods. Such methods, common in human speech recognition systems, create initial models trained on speaker-independent data, then use small amounts of adaptation data to build individual-specific models. Since, as in human speech, individual vocal variability is a significant source of variation in bioacoustic data, acoustic model adaptation is naturally suited to classification in this domain as well. To demonstrate and evaluate the effectiveness of this approach, this paper presents the application of maximum likelihood linear regression adaptation to ortolan bunting (Emberiza hortulana L.) song-type classification. Classification accuracies for the adapted system are computed as a function of the amount of adaptation data and compared to caller-independent and caller-dependent systems. The experimental results indicate that, given the same amount of data, supervised adaptation significantly outperforms both caller-independent and caller-dependent systems.
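For orientation, the textbook form of maximum likelihood linear regression (MLLR) mean adaptation is sketched below. The abstract does not say which model parameters were adapted, so the mean-only form and the symbols ($\mu_m$, $A$, $b$, $W$, $\xi_m$) are assumptions for illustration, not the paper's stated configuration.

```latex
% Assumed mean-only MLLR: each Gaussian mean \mu_m of the caller-independent
% model is mapped to a caller-specific mean by a shared affine transform.
\hat{\mu}_m = A\,\mu_m + b = W\,\xi_m,
\qquad
W = \begin{bmatrix} b & A \end{bmatrix},
\qquad
\xi_m = \begin{bmatrix} 1 \\ \mu_m \end{bmatrix}
```

Because a single $W$ can be shared across many Gaussians, it can be estimated from a small amount of adaptation data, which is the property the abstract relies on.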

    Echo Cancellation: the generalized likelihood ratio test for double-talk vs. channel change

    Echo cancellers are required in both electrical (impedance mismatch) and acoustic (speaker-microphone coupling) applications. One of the main design problems is the control logic for adaptation. Basically, the algorithm weights should be frozen in the presence of double-talk and should adapt quickly in its absence. The optimum likelihood ratio test (LRT) for this problem was studied in a recent paper. The LRT requires a priori knowledge of the background noise and double-talk power levels. Instead, this paper derives a generalized log likelihood ratio test (GLRT) that does not require this knowledge. The probability density function of a sufficient statistic under each hypothesis is obtained, and the performance of the test is evaluated as a function of the system parameters. The receiver operating characteristics (ROCs) indicate that it is difficult to decide correctly between double-talk and a channel change based upon a single look. However, detection based on about 200 successive samples yields a detection probability close to unity (0.99) with a small false alarm probability (0.01) for the theoretical GLRT model. Application of a GLRT-based echo canceller (EC) to real voice data shows performance comparable to that of the LRT-based EC given in a recent paper.
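The control logic described in the abstract (freeze the weights during double-talk, adapt otherwise) can be illustrated with a minimal sketch. It assumes a normalized LMS (NLMS) adaptive filter, which the abstract does not name, and it takes the double-talk decision as a precomputed boolean `freeze` sequence rather than implementing the GLRT itself. The names `nlms_echo_canceller`, `simulate_echo`, `num_taps`, and `mu` are hypothetical.

```python
import random


def simulate_echo(far_end, echo_path):
    """Return the echo produced by passing far_end through an FIR echo path."""
    return [
        sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far_end))
    ]


def nlms_echo_canceller(far_end, mic, freeze, num_taps=4, mu=0.5, eps=1e-8):
    """NLMS echo canceller whose weights update only when freeze[n] is False.

    far_end: far-end (loudspeaker) samples
    mic:     microphone samples (echo plus any near-end speech)
    freeze:  per-sample booleans from a hypothetical double-talk detector
    Returns (residual error signal, final weights).
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent far-end samples, newest first
    errors = []
    for x, d, hold in zip(far_end, mic, freeze):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate
        errors.append(e)
        if not hold:
            # Assumed NLMS update; skipped entirely while the detector
            # reports double-talk, so the weights stay frozen.
            norm = sum(h * h for h in history) + eps
            weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
    return errors, weights
```

In a complete system the `freeze` flags would come from a detector such as the GLRT the abstract describes, possibly pooled over a window of successive samples; here they are simply supplied by the caller.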