4 research outputs found

    Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction

    Self-supervised learning (SSL) has emerged as a promising paradigm for learning flexible speech representations from unlabeled data. By designing pretext tasks that exploit statistical regularities, SSL models can capture useful representations that transfer to downstream tasks. This study provides an empirical analysis of Barlow Twins (BT), an SSL technique inspired by theories of redundancy reduction in human perception. On downstream tasks, BT representations accelerated learning and transferred across domains. However, limitations exist in disentangling key explanatory factors: redundancy reduction and invariance alone are insufficient to factorize the learned latents into modular, compact, and informative codes. Our ablation study isolated gains from invariance constraints, but the gains were context-dependent. Overall, this work substantiates the potential of Barlow Twins for sample-efficient speech encoding, although challenges remain in achieving fully hierarchical representations. The analysis methodology and insights pave a path for extensions that incorporate additional inductive priors and perceptual principles to enhance the BT self-supervision framework.
    Comment: 13 pages, 5 figures, in submission to MDPI Informatio
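The abstract names invariance and redundancy reduction as the two ingredients of Barlow Twins without stating the objective. The sketch below shows the commonly described form of that loss in plain Python: the cross-correlation matrix between the embeddings of two augmented views is pushed toward the identity, so that diagonal terms enforce invariance and off-diagonal terms penalize redundancy. The function names and the weight `lambd=5e-3` are assumptions made for illustration, not the configuration used in the study, and a practical implementation would use a tensor library rather than nested lists.

```python
import math


def standardize_columns(batch):
    """Scale each embedding dimension to zero mean and unit variance across the batch.

    Returns a list of columns (one per dimension), each of length len(batch).
    """
    n = len(batch)
    columns = []
    for d in range(len(batch[0])):
        col = [row[d] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        # Small constant avoids division by zero for a constant dimension.
        columns.append([(x - mean) / (std + 1e-12) for x in col])
    return columns


def barlow_twins_loss(z_a, z_b, lambd=5e-3):
    """Barlow Twins style loss for two batches of embeddings (lists of rows).

    lambd is an assumed trade-off weight between the two terms.
    """
    n = len(z_a)
    cols_a = standardize_columns(z_a)
    cols_b = standardize_columns(z_b)
    invariance = 0.0   # sum over i of (1 - C_ii)^2
    redundancy = 0.0   # sum over i != j of C_ij^2
    for i, col_a in enumerate(cols_a):
        for j, col_b in enumerate(cols_b):
            c_ij = sum(x * y for x, y in zip(col_a, col_b)) / n
            if i == j:
                invariance += (1.0 - c_ij) ** 2
            else:
                redundancy += c_ij ** 2
    return invariance + lambd * redundancy
```

With two identical views whose dimensions are uncorrelated, both terms vanish and the loss is close to zero; if the second view swaps the two dimensions, the invariance term and the redundancy term both become large.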

    Speaker Recognition Using Machine Learning Techniques

    Speaker recognition is the task of identifying the person talking to a machine from voice features and acoustics. It has applications in fields ranging from Human Computer Interaction (HCI) and biometrics to security and the Internet of Things (IoT). As technology advances, hardware is getting more powerful and software is becoming smarter. Consequently, devices are increasingly used to interact with humans and to perform complex calculations. This is where speaker recognition is important, as it facilitates seamless communication between humans and computers. Additionally, the field of security has seen a rise in biometrics. At present, multiple biometric techniques co-exist, for instance iris, fingerprint, voice, and facial recognition. Voice is one metric which, apart from being natural to users, provides comparable and sometimes even higher levels of security than some traditional biometric approaches. Hence, it is a widely accepted biometric technique and is constantly being studied for further improvements. This study evaluates different pre-processing, feature extraction, and machine learning techniques on audio recorded in unconstrained, natural environments to determine which combination works well for speaker recognition and classification. The report presents several methods of audio pre-processing, such as trimming, split and merge, noise reduction, and vocal enhancement, to improve the audio obtained from real-world situations. Additionally, a text-independent approach is used, which makes the model applicable across multiple languages. Mel Frequency Cepstral Coefficients (MFCC) are extracted for each recording, along with their differentials and accelerations, to evaluate machine learning classification techniques such as kNN, Support Vector Machines, and Random Forest Classifiers. Lastly, the approaches are evaluated against existing research to study which techniques perform well on these sets of audio recordings.
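The feature pipeline described above (MFCCs plus their differentials and accelerations, fed to a classifier such as kNN) can be sketched as follows. This is a minimal illustration in plain Python, not the report's implementation: it assumes the MFCC frames are already computed, uses a standard regression-style delta with edge frames repeated as padding, and averages frames into one vector per recording. The function names, `width=2`, `k=3`, and the mean pooling step are all assumptions.

```python
import math
from collections import Counter


def delta(frames, width=2):
    """Regression-based delta over +/- width frames; edge frames are repeated."""
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[0])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def utterance_vector(mfcc_frames):
    """Stack MFCC, delta, and delta-delta per frame, then average over frames."""
    d1 = delta(mfcc_frames)   # differentials
    d2 = delta(d1)            # accelerations
    stacked = [m + a + b for m, a, b in zip(mfcc_frames, d1, d2)]
    n = len(stacked)
    return [sum(col) / n for col in zip(*stacked)]


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest (vector, speaker_label) training pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

In use, each training recording would be reduced with `utterance_vector` and stored with its speaker label, and a new recording would be classified by passing its vector to `knn_predict`. Swapping in a Support Vector Machine or Random Forest would only change the final step.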

    Towards sustainable e-learning platforms in the context of cybersecurity: A TAM-driven approach

    The rapid growth of electronic learning (e-learning) platforms has raised concerns about cybersecurity risks. The vulnerability of university students to cyberattacks and privacy concerns within e-learning platforms presents a pressing issue. Students’ frequent and intense internet presence, coupled with their extensive computer usage, puts them at higher risk of becoming victims of cyberattacks. This problem calls for a deeper understanding in order to enhance cybersecurity measures and safeguard students’ privacy and intellectual property in educational environments. This dissertation addresses the following research questions: (a) To what extent do cybersecurity perspectives affect students’ intention to use e-learning platforms? (b) To what extent do students’ privacy concerns affect their intention to use e-learning platforms? (c) To what extent does students’ cybersecurity awareness affect their intention to use e-learning platforms? (d) To what extent do academic integrity concerns affect their intention to use e-learning platforms? and (e) To what extent does students’ computer self-efficacy affect their intention to use e-learning platforms? The study used an enhanced version of the technology acceptance model (TAM3) to examine the factors influencing students’ intention to use e-learning platforms. It involved undergraduate and graduate students at Eastern Michigan University, and data were collected through a web-based questionnaire. The questionnaire was developed using the Qualtrics tool and included validated measures and scales with close-ended questions. The collected data were analyzed using SPSS 28, and the significance level for hypothesis testing was set at 0.05. Out of 6,800 distributed surveys, 590 responses were received, and after data cleaning, 582 responses were included in the final sample. The findings revealed that cybersecurity perspectives, cybersecurity awareness, academic integrity concerns, and computer self-efficacy significantly influenced students’ intention to use e-learning platforms. The study has implications for practitioners, educators, and researchers involved in designing secure e-learning platforms: it emphasizes the importance of cybersecurity and recommends effective cybersecurity training programs to enhance user engagement. Overall, the study highlights the role of cybersecurity in promoting the adoption and usage of e-learning platforms, offering insights that developers and educators can use to create secure e-learning environments.