5 research outputs found

    Development of the Koala Learning Media for Teaching Rhythm Patterns in Elementary School

    Technological developments in the digital era require elementary school teachers to integrate technology into the learning process, among other ways through digital media that are fun and engaging. One popular form of digital learning media, with appealing visuals and easy accessibility, is the digital comic. This study aims to produce an interactive digital comic focused on introducing the forms and variations of rhythm patterns in elementary school. The study uses a research and development design based on the Analyzing, Designing, Developing, Implementation, Evaluation (ADDIE) model. Data were collected using validation questionnaires and the System Usability Scale (SUS). The collected data were analyzed descriptively, both quantitatively and qualitatively. The results show that, averaged across all assessment aspects, the media expert assessment scored 93.55% (very valid), the material expert assessment scored 88.05% (very valid), and the language expert assessment scored 100.00% (very valid). Student responses to the product scored 79 (good) in the one-to-one trial, 77 (good) in the small-group trial, and 85 (excellent) in the field trial. Based on these results, the developed learning media, the interactive digital comic on rhythm patterns (Koala), is suitable for use in the learning process.
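The abstract reports SUS scores (79, 77, 85) without showing how a single questionnaire is scored. As a rough illustration only, the sketch below applies the standard SUS scoring rule to one respondent's ten answers on a 1 to 5 scale. The function name `sus_score` and the sample responses are hypothetical and are not taken from the study.

```python
def sus_score(responses):
    """Return the System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS uses exactly 10 items")
    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("each rating must be between 1 and 5")
        if item_number % 2 == 1:
            # Odd items are positively worded: contribution is rating - 1.
            total += rating - 1
        else:
            # Even items are negatively worded: contribution is 5 - rating.
            total += 5 - rating
    # The summed contributions (0-40) are scaled to the 0-100 range.
    return total * 2.5


# Hypothetical respondent, not data from the study.
example = [4, 2, 5, 1, 4, 2, 4, 2, 5, 1]
print(sus_score(example))
```

A study-level score such as those in the abstract would typically be the mean of these per-respondent scores; that aggregation step is an assumption here, since the abstract does not describe it.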

    Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect

    When speaking in the presence of background noise, humans reflexively change their way of speaking in order to improve the intelligibility of their speech. This reflex is known as the Lombard effect. Collecting speech in Lombard conditions is usually hard and costly. For this reason, speech enhancement systems are generally trained and evaluated on speech recorded in quiet to which noise is artificially added. Since these systems are often used in situations where Lombard speech occurs, in this work we analyse the impact that the Lombard effect has on audio, visual and audio-visual speech enhancement, focusing on deep-learning-based systems, since they represent the current state of the art in the field. We conduct several experiments using an audio-visual Lombard speech corpus consisting of utterances spoken by 54 different talkers. The results show that training deep-learning-based models with Lombard speech is beneficial in terms of both estimated speech quality and estimated speech intelligibility at low signal-to-noise ratios, where the visual modality can play an important role in acoustically challenging situations. We also find that a performance difference between genders exists due to the distinct Lombard speech exhibited by males and females, and we analyse it in relation to acoustic and visual features. Furthermore, listening tests conducted with audio-visual stimuli show that, at a signal-to-noise ratio of -5 dB, the speech quality of signals processed with systems trained using Lombard speech is statistically significantly better than that obtained using systems trained with non-Lombard speech. Regarding speech intelligibility, we find a general tendency towards a benefit from training the systems with Lombard speech.
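The abstract mentions that enhancement systems are usually trained on quiet speech with noise added artificially, and it reports results at a signal-to-noise ratio of -5 dB. A minimal sketch of that mixing step, under the assumption that both signals are equal-length lists of samples, might look like the following. The names `mix_at_snr` and `measured_snr_db` and the synthetic signals are illustrative and do not come from the paper.

```python
import math


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech`, scaled so the mixture has the requested SNR in dB."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")
    # SNR_dB = 10 * log10(P_speech / (gain^2 * P_noise)), solved for gain.
    gain = math.sqrt(mean_power(speech) / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """SNR of a mixture, given the clean speech it was built from."""
    residual = [y - s for s, y in zip(speech, noisy)]
    return 10 * math.log10(mean_power(speech) / mean_power(residual))


# Synthetic stand-ins for real recordings (assumptions for illustration).
speech = [math.sin(0.1 * i) for i in range(1000)]
noise = [0.2 * math.cos(0.37 * i) for i in range(1000)]
noisy = mix_at_snr(speech, noise, snr_db=-5.0)
print(round(measured_snr_db(speech, noisy), 3))
```

This mixing only changes the noise level. It does not reproduce the Lombard effect itself, which is a change in how the talker speaks and is the mismatch the abstract investigates.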

    Deep Learning-based Speech Enhancement for Real-life Applications

    Speech enhancement is the process of improving speech quality and intelligibility by suppressing noise. Inspired by the outstanding performance of the deep learning approach for speech enhancement, this thesis aims to add to this research area through the following contributions. The thesis presents an experimental analysis of different deep neural networks for speech enhancement, to compare their performance and investigate factors and approaches that improve the performance. The outcomes of this analysis facilitate the development of better speech enhancement networks in this work. Moreover, this thesis proposes a new deep convolutional denoising autoencoder-based speech enhancement architecture, in which strided and dilated convolutions are applied to improve the performance while keeping network complexity to a minimum. Furthermore, a two-stage speech enhancement approach is proposed that reduces distortion by performing a first speech denoising stage in the frequency domain, followed by a second speech reconstruction stage in the time domain. This approach was shown to reduce speech distortion, leading to better overall quality of the processed speech in comparison to state-of-the-art speech enhancement models. Finally, the work presents two deep neural network speech enhancement architectures for hearing aids and automatic speech recognition, as two real-world speech enhancement applications. A smart speech enhancement architecture is proposed for hearing aids, which is an integrated hearing aid and alert system. This architecture enhances both speech and important emergency noise, and only eliminates undesired noise. The results show that this idea can be applied to improve the performance of hearing aids. On the other hand, the architecture proposed for automatic speech recognition solves the mismatch issue between speech enhancement and automatic speech recognition systems, leading to a significant reduction in the word error rate of a baseline automatic speech recognition system, provided by Intelligent Voice for research purposes. In conclusion, the results presented in this thesis show promising performance for the proposed architectures for real-time speech enhancement applications.