268 research outputs found

    Real Time QRS Detection Based on M-ary Likelihood Ratio Test on the DFT Coefficients

    This paper presents an adaptive statistical test for QRS detection in electrocardiography (ECG) signals. The method is based on an M-ary generalized likelihood ratio test (LRT) defined over a multiple observation window in the Fourier domain. The motivation for proposing another detection algorithm based on maximum a posteriori (MAP) estimation lies in the high complexity of the signal models used in previous approaches, which (i) makes them computationally infeasible or unsuitable for real-time applications such as intensive care monitoring, and (ii) makes overall performance dependent on parameter selection. We therefore propose an alternative model based on the independent Gaussian properties of the Discrete Fourier Transform (DFT) coefficients, which allows a simplified MAP probability function to be defined. In addition, the proposed approach defines an adaptive MAP statistical test in which a global hypothesis is defined on particular hypotheses of the multiple observation window. The observation interval is thus modeled as a discontinuous-transmission discrete-time stochastic process, avoiding the inclusion of parameters that constrain the morphology of the QRS complexes. This work has received research funding from the Spanish government (www.micinn.es) under project TEC2012 34306 (DiagnoSIS, Diagnosis by means of Statistical Intelligent Systems, 70K€) and projects P09-TIC-4530 (300K€) and P11-TIC-7103 (156K€) from the Andalusian government (http://www.juntadeandalucia.es/organismos/economiainnovacioncienciayempleo.html)
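The Gaussian model on the DFT coefficients mentioned in this abstract can be illustrated with a much simpler test than the one the paper proposes. The sketch below is a binary, single-window likelihood ratio test, not the authors' M-ary multiple-observation method. It assumes each DFT coefficient is zero-mean complex Gaussian with known variance `noise_var[k]` when only noise is present and `noise_var[k] + signal_var[k]` when a QRS complex is present. The function name and all parameter names are hypothetical.

```python
import math


def log_likelihood_ratio(power, noise_var, signal_var):
    """Mean per-bin log likelihood ratio (event present vs. noise only).

    power[k] is |X_k|^2 for DFT bin k. The variances are assumed known;
    in practice they would have to be estimated from the recording.
    """
    total = 0.0
    for p, ln, ls in zip(power, noise_var, signal_var):
        xi = ls / ln      # a priori signal-to-noise ratio for this bin
        gamma = p / ln    # a posteriori signal-to-noise ratio for this bin
        # log of the ratio of the two complex Gaussian densities
        total += gamma * xi / (1.0 + xi) - math.log(1.0 + xi)
    return total / len(power)


def detect(power, noise_var, signal_var, threshold=0.0):
    """Decide 'event present' when the mean log ratio exceeds the threshold."""
    return log_likelihood_ratio(power, noise_var, signal_var) > threshold
```

With this form, bins whose observed power is close to the noise variance push the statistic down, and bins with power well above it push the statistic up. The threshold of 0.0 is an arbitrary illustrative choice.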

    Multisensory mental representation of objects in typical and Gifted Word Learner dogs

    Little research has been conducted on dogs’ (Canis familiaris) ability to integrate information obtained through different sensory modalities during object discrimination and recognition tasks. Such a process would indicate the formation of multisensory mental representations. In Experiment 1, we tested the ability of 3 Gifted Word Learner (GWL) dogs that can rapidly learn the verbal labels of toys, and 10 Typical (T) dogs to discriminate an object recently associated with a reward, from distractor objects, under light and dark conditions. While the success rate did not differ between the two groups and conditions, a detailed behavioral analysis showed that all dogs searched for longer and sniffed more in the dark. This suggests that, when possible, dogs relied mostly on vision, and switched to using only other sensory modalities, including olfaction, when searching in the dark. In Experiment 2, we investigated whether, for the GWL dogs (N = 4), hearing the object verbal labels activates a memory of a multisensory mental representation. We did so by testing their ability to recognize objects based on their names under dark and light conditions. Their success rate did not differ between the two conditions, whereas the dogs’ search behavior did, indicating a flexible use of different sensory modalities. Little is known about the cognitive mechanisms involved in the ability of GWL dogs to recognize labeled objects. These findings supply the first evidence that for GWL dogs, verbal labels evoke a multisensory mental representation of the objects. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10071-022-01639-z

    About voice activity detection

    Advisors: Romis Ribeiro de Faissol Attux, Everton Zaccaria Nadalin. Dissertation (master's) - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Abstract: This work studies and evaluates voice activity detection (VAD) techniques applied to digital audio files, and proposes a new solution methodology. To this end, the fundamental concepts of digital speech processing were studied, in particular some classic approaches to the problem of distinguishing voice from non-voice. We began with the pioneering techniques, which use energy analysis and the zero-crossing rate of the speech signal, and proceeded to more recent approaches such as those exploiting the spectral entropy, the long-term variability, and the periodicity of the voice signal. Following the history of methodologies for detecting the presence of speech, we then focused on VAD classifiers based on statistical models and, finally, examined recent applications of pattern recognition and machine learning techniques to the problem. This scenario reveals a wide range of representative features of the voice that can be exploited to detect its presence, as well as methods for extracting these attributes. The selection of these features and the classification techniques to be used are therefore two complementary aspects that form the core of this study. At a high signal-to-noise ratio, voice activity detection can be performed satisfactorily by applying an energy threshold. At a low signal-to-noise ratio, however, it can be quite difficult to detect the signal of interest correctly, especially when it is corrupted by acoustically more complex signals such as those from urban roads and food courts. To evaluate the attributes and classification techniques used in the literature under different noise types and levels, the performance of several voice activity detection algorithms was assessed with the aid of an extensive noise database, QUT-NOISE-TIMIT. This study also presents a new proposal that exploits the quasi-periodic nature of the voice to detect the voiced part of speech, since that part is more robust to noise and the unvoiced part can be approximated with smoothing techniques. The proposal was investigated by developing VAD algorithms that apply cross-correlation between the spectra of consecutive frames to extract an attribute that can be exploited by different classification strategies. We discuss the performance of the proposal in comparison with the performance of features used in the literature, in combination with different classification techniques. Good results were obtained when the proposed feature was used in different classification approaches, especially in environments with babble noise. Master's; Computer Engineering; Master in Electrical Engineering; CAPE
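Two ideas from this abstract can be sketched in a few lines: an energy threshold, which the abstract says is adequate at a high signal-to-noise ratio, and a similarity measure between the spectra of consecutive frames. The code below is a minimal illustration under stated assumptions, not the dissertation's algorithm. The frame size, hop, and threshold are arbitrary values, all function names are hypothetical, and the similarity is a normalized zero-lag correlation of magnitude spectra, which is a simplification of the cross-correlation the abstract describes.

```python
import cmath
import math


def frames(signal, size, hop):
    """Split a sequence into frames of `size` samples, advancing by `hop`."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def magnitude_spectrum(frame):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_correlation(a, b):
    """Normalized zero-lag correlation between two magnitude spectra."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def energy_vad(signal, size=64, hop=64, threshold=0.01):
    """Flag each frame as active (True) when its energy exceeds the threshold."""
    return [energy(f) > threshold for f in frames(signal, size, hop)]
```

A quasi-periodic signal keeps its spectral peaks in roughly the same bins from one frame to the next, so `spectral_correlation` stays close to 1 for such frames, whereas frames with unrelated spectra score lower. How that value is turned into a decision is left to the classifier.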

    Studies on noise robust automatic speech recognition

    Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK

    A reliable past or a reliable pest? Testing canonical stimuli in speech perception research

    A growing body of research is exploring second language (L2) learners’ listening perception of vowel contrasts. Conventionally, researchers have estimated how well listeners differentiate between L2 vowels with isolated words (or syllables) in a fixed consonantal frame, such as b-vowel-t (e.g., beat-bit). However, there is a dearth of research that systematically examines how well results generalise beyond isolated frames or the suitability of employing more phonologically and sententially diverse listening prompt types for assessing L2 vowel perception. To address this gap, two studies investigated the effects of using b-vowel-t and more diverse prompt types for assessing intermediate-advanced adult L2 perception of English /i/-/ɪ/ and /ɛ/-/æ/ vowel pairs. Prompt performance was measured for internal consistency, congruence with the Perceptual Assimilation Model for L2 speech learning (Best & Tyler, 2007), and listeners’ subjective experiences with each prompt type. Mixed effects modelling investigated the predictive power of b-vowel-t performance on more diverse prompt types. Study 1 explored prompt performance using closed-set, forced choice tasks with first language (L1) Mandarin and Korean listeners. Study 2 investigated the effect of Mandarin and Spanish L1 listeners’ target word familiarity and associations with sentence prompts using transcription-response tasks and self-report surveys. Both studies found that diverse prompts had adequate internal consistency and aligned with PAM-L2 predictions. B-vowel-t prompts poorly generalised to diverse prompts and accorded less with PAM-L2 predictions. Survey results showed increased demands from more diverse prompt types based on participants’ ratings; however, this did not always correspond to lower performance. Collectively, results indicate utility in employing prompts beyond isolated words in a fixed consonantal frame for laboratory and at-home administrations. These findings contribute to the vowel perception literature by evaluating and extending the scope of prompts which may be used

    MEASURING POSTSECONDARY STUDENTS’ SENSE OF BELONGING: PSYCHOMETRIC INVESTIGATIONS INTO STUDENT DEMOGRAPHICS AND COURSE DELIVERY CONTEXTS

    Research suggests sense of belonging in academic contexts influences student academic outcomes and well-being. Instruments (i.e., surveys, questionnaires) developed to measure sense of belonging mainly focus on the experience of students in middle grades. Few instruments measure sense of belonging experienced by postsecondary students, despite many colleges and universities seeking to improve retention, persistence, and graduation by addressing this complex construct. Furthermore, the rapid growth of online courses necessitates and presents an opportunity to employ psychometric investigations to explore the sense of belonging experienced by both face-to-face and online students. The first of the two studies conducted for this dissertation extends a brief instrument originally tested on an adolescent sample for use among postsecondary students, testing for differential item functioning based on various groupings, including but not limited to degree level, gender, and ethnicity. The second study investigates if it is possible to similarly measure students’ sense of belonging to other students within the same course in face-to-face and online delivery methods using a common instrument. Employing modern measurement strategies, these studies demonstrate the value of rigorous analyses of internal structure to produce validity evidence for practical and reliable instruments—reflective of the diversity in student identities and learning contexts in higher education institutions—to measure postsecondary students’ sense of belonging

    Using Personality Detection Tools for Software Engineering Research: How Far Can We Go?

    Assessing the personality of software engineers may help to match individual traits with the characteristics of development activities such as code review and testing, as well as support managers in team composition. However, self-assessment questionnaires are not a practical solution for collecting multiple observations on a large scale. Automatic personality detection overcomes these limitations, but it relies on off-the-shelf solutions trained on non-technical corpora, which might not be readily applicable to technical domains like software engineering. In this paper, we first assess the performance of general-purpose personality detection tools when applied to a technical corpus of developers’ emails retrieved from the public archives of the Apache Software Foundation. We observe generally low prediction accuracy and overall disagreement among the tools. Second, we replicate two previous research studies in software engineering by replacing the personality detection tool used to infer developers’ personalities from pull-request discussions and emails. We observe that the original results are not confirmed, i.e., changing the tool used in the original study leads to diverging conclusions. Our results suggest a need for personality detection tools specifically targeted at the software engineering domain

    A comparative developmental approach to multimodal communication in chimpanzees (Pan troglodytes)

    Studying how the communication of our closest relatives, the great apes, develops can inform our understanding of the socio-ecological drivers shaping language evolution. However, despite a now recognized ability of great apes to produce multimodal signal combinations, a key feature of human language, we lack knowledge about when or how this ability manifests throughout ontogeny. In this thesis, I aimed to address this issue by examining the development of multimodal signal combinations (also referred to as multimodal combinations) in chimpanzees. To establish an ontogenetic trajectory of combinatorial signalling, my first empirical study examined age- and context-related variation in the production of multimodal combinations in relation to unimodal signals. Results showed that older individuals used multimodal combinations at significantly higher frequencies than younger individuals, although unimodal signalling remained dominant. In addition, I found a strong influence of playful and aggressive contexts on multimodal communication, supporting previous suggestions that combinations function to disambiguate messages in high-stakes interactions. Subsequently, I looked at influences in the social environment which may contribute to patterns of communication development. I turned first to the mother-infant relationship which characterises early infancy before moving on to interactive behaviour in the wider social environment and the role of multimodal combinations in communicative interactions. Results indicate that mothers support the development of communicative signalling in their infants, transitioning from more action-based to signalling behaviours with infant age. Furthermore, mothers responded more to communicative signals than physical actions overall, which may help young chimpanzees develop effective communication skills. Within the wider community, I found that interacting with a larger number of individuals positively influenced multimodal combination production. Moreover, in contrast to the literature surrounding unimodal signals, these multimodal signals appeared highly contextually specific. Finally, I found that within communicative interactions, young chimpanzees showed increasing awareness of recipient visual orientation with age, producing multimodal combinations most often when the holistic signal could be received. Moreover, multimodal combinations were more effective in soliciting recipient responses and satisfactory interactional outcomes irrespective of age. Overall, these findings highlight the relevance of studying ape communication development from a multimodal perspective and provide new evidence of developmental patterns that echo those seen in humans, while simultaneously highlighting important species differences. Multimodal communication development appears to be influenced by varying socio-environmental factors including the context and patterns of communicative interaction