    COVID-19 Diagnosis Using Spectral and Statistical Analysis of Cough Recordings Based on the Combination of SVD and DWT

    Healthcare professionals routinely use audio signals generated by the human body to help diagnose disease or assess its progression. With new technologies, it is now possible to collect human-generated sounds, such as coughing. Audio-based machine learning techniques can then be adopted for automatic analysis of the collected data.
Valuable information can be obtained from the cough signal by extracting effective features from finite-duration intervals of a signal that varies over time. This article proposes an approach to the detection and diagnosis of COVID-19 through the processing of cough recordings collected from patients suffering from the most common symptoms of the disease. The proposed method is based on a combination of Singular Value Decomposition (SVD) and Discrete Wavelet Transform (DWT). The combination of these two signal processing techniques is gaining considerable interest in the field of speaker and speech recognition. As a cough recognition approach, we found it to perform well, as it generates and uses a small number of effective features. Mean and median frequencies, which are known to be the most useful features in the frequency domain, are used to generate an effective statistical measure for comparing the results. The hybrid structure of DWT and SVD adopted in this approach adds to its efficiency: a 200-fold reduction in the number of operations is achieved. Although the symptoms of the infected and non-infected people in the study have many similarities, the diagnosis results obtained with the proposed approach show a high diagnosis rate, which is confirmed by matching against the relevant PCR tests. The proposed approach is open to further improvement, and we believe better performance can be achieved by enlarging the dataset and including healthy people.
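The mean and median frequencies mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the usual textbook definitions (the power-weighted average frequency, and the frequency that splits the total spectral power in half) and uses a naive DFT so that it needs only the standard library. The function names are hypothetical and this is not the authors' implementation; the SVD and DWT stages are not shown.

```python
import cmath
import math


def power_spectrum(signal, sample_rate):
    """Naive O(n^2) DFT; returns (freqs, power) for the non-negative bins."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(signal)
        )
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def mean_frequency(freqs, power):
    """Power-weighted average frequency of the spectrum."""
    total = sum(power)
    return sum(f * p for f, p in zip(freqs, power)) / total


def median_frequency(freqs, power):
    """Lowest frequency at which cumulative power reaches half the total."""
    half = sum(power) / 2
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


if __name__ == "__main__":
    # Illustrative input only: a pure 50 Hz tone sampled at 400 Hz for 1 s.
    rate = 400
    tone = [math.sin(2 * math.pi * 50 * i / rate) for i in range(rate)]
    f, p = power_spectrum(tone, rate)
    print(mean_frequency(f, p), median_frequency(f, p))
```

For real recordings an FFT routine would replace the naive DFT; the two statistics themselves are unchanged.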

    Protection of Records and Data Authentication based on Secret Shares and Watermarking

    The rapid growth in communication technology benefits the health industry in many ways, from the transmission of sensor data to real-time diagnosis using cloud-based frameworks. However, the secure transmission of data and the verification of its authenticity become challenging tasks, especially for health-related applications. Medical information must be accessible only to the relevant healthcare staff to avoid any unfortunate circumstances for the patient as well as for the healthcare providers. Therefore, a method to protect the identity of a patient and to authenticate transmitted data is proposed in this study. The proposed method provides dual protection. First, it encrypts the identity using Shamir’s secret sharing scheme without increasing the dimension of the original identity. Second, the identity is watermarked using zero-watermarking to avoid any distortion of the host signal. The experimental results show that the proposed method encrypts, embeds and extracts identities reliably. Moreover, in the case of a malicious attack, the method distorts the embedded identity, which provides a clear indication of fabrication. An automatic disorder detection system using Mel-frequency cepstral coefficients and a Gaussian mixture model is also implemented, which shows that malicious attacks greatly impair the accurate diagnosis of disorders.
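As background for the secret-sharing step, the following is a minimal Python sketch of textbook Shamir secret sharing over a prime field. It is not the paper's scheme: the dimension-preserving variant and the zero-watermarking stage described in the abstract are not reproduced, and the prime, function names and parameters are assumptions chosen for illustration.

```python
import secrets

# Assumed field modulus for illustration: the Mersenne prime 2^127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares, prime=PRIME):
    """Split an integer secret into shares; any `threshold` of them recover it."""
    if not 0 <= secret < prime:
        raise ValueError("secret must lie in [0, prime)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo the prime
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def recover_secret(shares, prime=PRIME):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret
```

A patient identifier encoded as an integer could, for example, be split with `split_secret(patient_id, 3, 5)`, after which any three of the five shares passed to `recover_secret` return the original value.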

    Digital Watermarking for Verification of Perception-based Integrity of Audio Data

    In certain application fields, digital audio recordings contain sensitive content. Examples are historical archival material in public archives that preserve our cultural heritage, or digital evidence in the context of law enforcement and civil proceedings. Because of the powerful capabilities of modern editing tools for multimedia, such material is vulnerable to doctoring of the content and forgery of its origin with malicious intent. Inadvertent data modification and mistaken origin can also be caused by human error. Hence, the credibility and provenance of such audio content, in terms of an unadulterated and genuine state, and the confidence about its origin are critical factors. To address this issue, this PhD thesis proposes a mechanism for verifying the integrity and authenticity of digital sound recordings. It is designed and implemented to be insensitive to common post-processing operations on the audio data that influence the subjective acoustic perception only marginally (if at all). Examples of such operations include lossy compression that maintains a high sound quality of the audio media, or lossless format conversions. The objective is to avoid the de facto false alarms that would be expected in standard crypto-based authentication protocols in the presence of such legitimate post-processing. To achieve this, a feasible combination of digital watermarking and audio-specific hashing is investigated. First, a suitable secret-key-dependent audio hashing algorithm is developed. It incorporates and enhances so-called audio fingerprinting technology from the state of the art in content-based audio identification. The presented algorithm (denoted as the "rMAC" message authentication code) allows "perception-based" verification of integrity. This means that integrity breaches are classified as such only once they become audible.
As another objective, this rMAC is embedded and stored silently inside the audio media by means of audio watermarking technology. This approach allows the authentication code to be maintained across the above-mentioned admissible post-processing operations and made available for integrity verification at a later date. For this, an existing secret-key-dependent audio watermarking algorithm is used and enhanced in this thesis work. To some extent, the dependency of the rMAC and of the watermarking processing on a secret key also allows the origin of a protected audio file to be authenticated. To elaborate on this security aspect, this work also estimates the brute-force effort required of an adversary attacking this combined rMAC-watermarking approach. The experimental results show that the proposed method provides good distinction and classification performance for authentic versus doctored audio content. It also allows the temporal localization of audible data modification within a protected audio file. The experimental evaluation finally provides recommendations about technical configuration settings of the combined watermarking-hashing approach. Beyond the main topic of perception-based data integrity and data authenticity for audio, this PhD work provides new general findings in the fields of audio fingerprinting and digital watermarking. The main contributions of this PhD were published and presented mainly at conferences on multimedia security. These publications were cited by a number of other authors and hence had some impact on their work.

    Deep Transfer Learning for Automatic Speech Recognition: Towards Better Generalization

    Automatic speech recognition (ASR) has recently become an important challenge when using deep learning (DL). It requires large-scale training datasets and high computational and storage resources. Moreover, DL techniques, and machine learning (ML) approaches in general, assume that training and testing data come from the same domain, with the same input feature space and data distribution characteristics. This assumption, however, does not hold in some real-world artificial intelligence (AI) applications. Moreover, there are situations where real data is difficult or expensive to gather, or occurs only rarely, so that the data requirements of DL models cannot be met. Deep transfer learning (DTL) has been introduced to overcome these issues; it helps develop high-performing models using real datasets that are small or slightly different from, but related to, the training data. This paper presents a comprehensive survey of DTL-based ASR frameworks to shed light on the latest developments and help academics and professionals understand current challenges. Specifically, after presenting the DTL background, a well-designed taxonomy is adopted to organize the state of the art. A critical analysis is then conducted to identify the limitations and advantages of each framework. Finally, a comparative study is introduced to highlight the current challenges before deriving opportunities for future research.

    Multibiometric security in wireless communication systems

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 05/08/2010. This thesis explores an application of multibiometrics to secured wireless communications. The media studied for this purpose included Wi-Fi, 3G, and WiMAX, over which simulations and experimental studies were carried out to assess the performance. Specifically, restriction of access to authorized users only is provided by a technique referred to hereafter as a multibiometric cryptosystem. In brief, the system is built upon a complete challenge/response methodology in order to obtain a high level of security on the basis of user identification by fingerprint and further confirmation by verification of the user through text-dependent speaker recognition. First comes the enrolment phase, in which the database of fingerprints watermarked with memorable texts, along with the voice features based on the same texts, is created by sending them to the server through a wireless channel. Next comes the verification stage, in which claimed users (those who claim to be genuine) are verified against the database; it consists of five steps. At the identification level, the user is first asked to present a fingerprint and a memorable word, the latter being watermarked into the former, so that the system can authenticate the fingerprint, verify its validity, and retrieve the challenge for an accepted user. The following three steps then involve speaker recognition: the user responds to the challenge with a text-dependent voice sample, the server authenticates the response, and finally the server accepts or rejects the user. In order to implement fingerprint watermarking, i.e. incorporating the memorable word as a watermark message into the fingerprint image, an algorithm of five steps has been developed.
The first three novel steps, which concern fingerprint image enhancement (CLAHE with 'Clip Limit', standard deviation analysis and sliding neighborhood), are followed by two further steps for embedding the watermark into, and extracting it from, the enhanced fingerprint image using the Discrete Wavelet Transform (DWT). In the speaker recognition stage, the limitations of this technique in wireless communication have been addressed by sending voice features (cepstral coefficients) instead of raw samples. This scheme reduces the transmission time and the dependency of the data on the communication channel, and avoids packet loss. Finally, the obtained results have verified the claims.

    Recent Advances in Signal Processing

    Signal processing is a critical task in the majority of new technological inventions and poses challenges in a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages, respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Content-based music structure analysis

    Anonymizing Speech: Evaluating and Designing Speaker Anonymization Techniques

    The growing use of voice user interfaces has led to a surge in the collection and storage of speech data. While data collection allows for the development of efficient tools powering most speech services, it also poses serious privacy issues for users as centralized storage makes private personal speech data vulnerable to cyber threats. With the increasing use of voice-based digital assistants like Amazon's Alexa, Google's Home, and Apple's Siri, and with the increasing ease with which personal speech data can be collected, the risk of malicious use of voice-cloning and speaker/gender/pathological/etc. recognition has increased. This thesis proposes solutions for anonymizing speech and evaluating the degree of the anonymization. In this work, anonymization refers to making personal speech data unlinkable to an identity while maintaining the usefulness (utility) of the speech signal (e.g., access to linguistic content). We start by identifying several challenges that evaluation protocols need to consider to evaluate the degree of privacy protection properly. We clarify how anonymization systems must be configured for evaluation purposes and highlight that many practical deployment configurations do not permit privacy evaluation. Furthermore, we study and examine the most common voice conversion-based anonymization system and identify its weak points before suggesting new methods to overcome some limitations. We isolate all components of the anonymization system to evaluate the degree of speaker PPI associated with each of them. Then, we propose several transformation methods for each component to reduce as much as possible speaker PPI while maintaining utility. We promote anonymization algorithms based on quantization-based transformation as an alternative to the most-used and well-known noise-based approach. 
Finally, we explore a new attack method to invert anonymization. Comment: PhD Thesis Pierre Champion | Université de Lorraine - INRIA Nancy | for associated source code, see https://github.com/deep-privacy/SA-toolki

