17,528 research outputs found

    An Efficient Optimal Reconstruction Based Speech Separation Based on Hybrid Deep Learning Technique

    Conventional single-channel speech separation has two long-standing issues. The first issue, over-smoothing, is addressed, and estimated signals are used to expand the training data set. Second, a DNN generates prior knowledge to address the problem of incomplete separation and mitigate speech distortion. To overcome these issues, we propose an efficient optimal reconstruction-based speech separation (ERSS) method using a hybrid deep learning technique. First, we propose an integral fox ride optimization (IFRO) algorithm for spectral structure reconstruction with the help of multiple spectrum features: time dynamic information, binaural features, and mono features. Second, we introduce a hybrid retrieval-based deep neural network (RDNN) to reconstruct the spectrogram size of speech and noise directly. The input signals are passed to a Short-Time Fourier Transform (STFT), which converts a clean input signal into spectrograms; IFRO is then used as a feature extraction technique on the spectrograms. After feature extraction, the RDNN classification algorithm classifies the features, and the classified features are passed through a softmax. The inverse STFT (ISTFT) is then applied to the softmax output to separate the speech signals. Experiments show that our proposed method achieves the highest gains in SDR, SIR, SAR, STOI, and PESQ, with values of 10.9, 15.3, 10.8, 0.08, and 0.58, respectively. By comparison, Joint-DNN-SNMF obtains 9.6, 13.4, 10.4, 0.07, and 0.50. The proposed method is also compared with other methods and previous work, and it yields better results.
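The STFT, mask, ISTFT flow described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' ERSS method: IFRO and the RDNN are not reproduced, and an oracle ratio mask computed from the known sources stands in for the network output. The function names, window length, hop size, and test signals are all hypothetical choices.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Short-time Fourier transform with a Hann window; returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=256, hop=64):
    """Inverse STFT by weighted overlap-add, normalised by the summed squared window."""
    window = np.hanning(n_fft)
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        sl = slice(i * hop, i * hop + n_fft)
        out[sl] += frames[i] * window
        norm[sl] += window ** 2
    return out / np.maximum(norm, 1e-8)

def sdr_db(reference, estimate):
    """Plain signal-to-distortion ratio in dB (energy ratio, no projection step)."""
    err = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(err ** 2))

# Hypothetical test signals: a 440 Hz tone as "speech" and white noise as interference.
fs = 8000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 440 * t)
noise = 0.5 * rng.standard_normal(fs)
mixture = speech + noise

# Oracle ratio mask (assumption): a trained network would have to predict this.
S, N, M = stft(speech), stft(noise), stft(mixture)
mask = np.abs(S) / (np.abs(S) + np.abs(N) + 1e-8)
estimate = istft(mask * M)

# Compare on the interior, where every sample is covered by several frames.
inner = slice(256, -256)
sdr_mixture = sdr_db(speech[inner], mixture[inner])
sdr_estimate = sdr_db(speech[inner], estimate[inner])
roundtrip = istft(stft(speech))
```

Comparing `sdr_estimate` with `sdr_mixture` shows how much the mask helps on this toy signal; the SIR, SAR, STOI, and PESQ figures quoted in the abstract require dedicated evaluation tools and are not computed here.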

    The morphological and audiative interconnectedness of sound: Equivalence in a multidimensional soundscape

    This paper draws on the author's recent theoretical and practical research into the morphology of sound and audiation. In particular, it explores the notion of equivalence in a multidimensional soundscape. Correlations between the interconnectedness of sound-based morphologies emanating from extended guitar techniques and comprehending internal auditory imagination when sound is not physically present will be assessed. To express an all-encompassing mental and visual image of apprehending the value of sound from a morphological and audiative perspective, three-dimensional topological diagrams will be evaluated, a development of previous two-dimensional visualisations. In regard to morphologies, topics of interest are spectromorphology, spatiomorphology, spectral quality, performance space, and performance aspects. Studying these topics will help in the understanding of morphological value. Learning to comprehend morphologies in relation to the listening experience will deepen all-round musical abilities. We will therefore investigate audiation through encompassing deep listening, reduced listening, inherent and external qualities, psychological experience, imagination, and improvisation. As more mutual inclusivity is discovered, we can start to contemplate more adventurous pedagogical tools from which future nurturing of musicians may be drawn.

    Effects of errorless learning on the acquisition of velopharyngeal movement control

    Session 1pSC - Speech Communication: Cross-Linguistic Studies of Speech Sound Learning of the Languages of Hong Kong (Poster Session). The implicit motor learning literature suggests a benefit for learning if errors are minimized during practice. This study investigated whether the same principle holds for learning velopharyngeal movement control. Normal speaking participants learned to produce hypernasal speech in either an errorless learning condition (in which the possibility for errors was limited) or an errorful learning condition (in which the possibility for errors was not limited). The nasality level of the participants' speech was measured by nasometer and reflected by nasalance scores (in %). Errorless learners practiced producing hypernasal speech with a threshold nasalance score of 10% at the beginning, which gradually increased to a threshold of 50% at the end. The same set of threshold targets was presented to errorful learners but in reversed order. Errors were defined by the proportion of speech with a nasalance score below the threshold. The results showed that, relative to errorful learners, errorless learners displayed fewer errors (17.7% vs. 50.7% for errorful learners) and a higher mean nasalance score (46.7% vs. 31.3%) during the acquisition phase. Furthermore, errorless learners outperformed errorful learners in both retention and novel transfer tests. Acknowledgment: Supported by The University of Hong Kong Strategic Research Theme for Sciences of Learning. © 2012 Acoustical Society of America
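The error measure in this abstract, the proportion of speech with a nasalance score below the current threshold, is simple arithmetic and can be sketched as follows. This is a minimal illustration, not the study's procedure: the abstract gives only the 10% and 50% endpoints, so the intermediate threshold steps are an assumption, and the nasalance scores and the `error_rate` helper are hypothetical.

```python
def error_rate(nasalance_scores, threshold):
    """Proportion of samples whose nasalance score (in %) falls below the threshold."""
    if not nasalance_scores:
        raise ValueError("need at least one nasalance score")
    below = sum(1 for score in nasalance_scores if score < threshold)
    return below / len(nasalance_scores)

# Assumed intermediate steps; the abstract states only the 10% and 50% endpoints.
errorless_thresholds = [10, 20, 30, 40, 50]
# Errorful learners receive the same targets in reversed order.
errorful_thresholds = list(reversed(errorless_thresholds))

# Hypothetical nasalance scores (in %) for a single early practice block.
block_scores = [8, 12, 15, 22, 9, 30]
first_errorless = error_rate(block_scores, errorless_thresholds[0])
first_errorful = error_rate(block_scores, errorful_thresholds[0])
```

With the same early-practice scores, the low starting threshold of the errorless schedule counts far fewer samples as errors than the high starting threshold of the errorful schedule, which is the sense in which the possibility for errors is limited.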

    DMRN+16: Digital Music Research Network One-day Workshop 2021

    DMRN+16: Digital Music Research Network One-day Workshop 2021, Queen Mary University of London, Tuesday 21st December 2021. Keynote speakers. Keynote 1: Prof. Sophie Scott, Director, Institute of Cognitive Neuroscience, UCL. Title: "Sound on the brain - insights from functional neuroimaging and neuroanatomy". Abstract: In this talk I will use functional imaging and models of primate neuroanatomy to explore how sound is processed in the human brain. I will demonstrate that sound is represented cortically in different parallel streams. I will expand this to show how this can impact on the concept of auditory perception, which arguably incorporates multiple kinds of distinct perceptual processes. I will address the roles that subcortical processes play in this, and also the contributions from hemispheric asymmetries. Keynote 2: Prof. Gus Xia, Assistant Professor at NYU Shanghai. Title: "Learning interpretable music representations: from human stupidity to artificial intelligence". Abstract: Gus has been leading the Music X Lab in developing intelligent systems that help people better compose and learn music. In this talk, he will show us the importance of music representation for both humans and machines, and how to learn better music representations via the design of inductive bias. Once we have interpretable music representations, the potential applications are limitless.

    Waking up to the Present: Vipassana Meditation and the Body

    Using ethnographic methods, I examine the process of learning vipassana meditation, a form of meditation in which the practitioner focuses on their bodily sensations, and the ways in which learning this form of meditation affects the practitioner's daily life. I employ reflexivity alongside an ethnography of the particular to capture my experiences as the student of a Thai Theravada Buddhist monk who teaches at a temple in Portland, Oregon. Through this process I have found that learning vipassana meditation pervades numerous aspects of daily life, extending beyond direct instruction and meditation practice, bringing about perceptual changes in reality as learned concepts become embodied through both meditation and lived experience.