105 research outputs found

    Towards Solving Cocktail-Party: The First Method to Build a Realistic Dataset with Ground Truths for Speech Separation

    Speech separation is very important in real-world applications such as human-machine interaction, hearing aid devices, and automatic meeting transcription. In recent years, deep learning has brought significant progress towards a solution. Much attention has been drawn to supervised learning methods that use synthetic mixture datasets, even though these are not representative of real-world mixtures. The difficulty in building a realistic dataset led researchers to use unsupervised learning methods, because of their ability to handle realistic mixtures directly. The results of unsupervised learning methods are still unconvincing. In this paper, a method is introduced to create a realistic dataset with ground truth sources for speech separation. The main challenge in designing a realistic dataset is the unavailability of ground truths for speakers’ signals. To address this, we propose a method for simultaneously recording two speakers and obtaining the ground truth for each. We present a methodology for benchmarking our realistic dataset using a deep learning model based on Bidirectional Gated Recurrent Units (BGRU) and a clustering algorithm. The experiments show that our proposed dataset improved SI-SDR (Scale Invariant Signal to Distortion Ratio) by 1.65 dB and PESQ (Perceptual Evaluation of Speech Quality) by approximately 0.5. We also evaluated the effectiveness of our method at different distances between the microphone and the speakers, and found that it improved the stability of the learned model.
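The SI-SDR metric reported in this abstract can be illustrated with a minimal sketch. It follows the commonly used definition (project the estimate onto the reference, then compare the energy of the projection with the energy of the residual) and assumes both signals are already zero-mean and time-aligned. The function name `si_sdr` and the toy signals are illustrative assumptions, not code from the paper.

```python
import math


def si_sdr(estimate, reference):
    """Scale-invariant signal-to-distortion ratio in dB.

    Both inputs are equal-length sequences of samples, assumed zero-mean.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")

    dot = sum(e * r for e, r in zip(estimate, reference))
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0:
        raise ValueError("reference signal has zero energy")

    # Optimal scaling of the reference, which makes the metric scale-invariant.
    alpha = dot / ref_energy
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)

    if residual_energy == 0:
        return math.inf   # estimate is an exact scaled copy of the reference
    if target_energy == 0:
        return -math.inf  # estimate is orthogonal to the reference
    return 10 * math.log10(target_energy / residual_energy)
```

Because of the projection step, multiplying the estimate by any non-zero constant leaves the score unchanged, which is why a gain mismatch between recording and ground truth does not affect the comparison.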

    Phonetic Segmentation using a Wavelet-based Speech Cepstral Features and Sparse Representation Classifier, Journal of Telecommunications and Information Technology, 2021, nr 4

    Speech segmentation is the process of dividing a speech signal into distinct acoustic blocks that could be words, syllables or phonemes. Phonetic segmentation is about finding the exact boundaries of the different phonemes that compose a specific speech signal. This problem is crucial for many applications, e.g. automatic speech recognition (ASR). In this paper we propose a new model-based, text-independent phonetic segmentation method based on wavelet packet speech parametrization features and the sparse representation classifier (SRC). Experiments were performed on two datasets: the first is an English one derived from the TIMIT corpus, while the second is an Arabic one derived from the Arabic speech corpus. Results showed that the proposed wavelet packet decomposition features outperform the MFCC features in the speech segmentation task, in terms of both F1-score and R-measure on both datasets. Results also indicate that the SRC gives a higher hit rate than the well-known k-Nearest Neighbors (k-NN) classifier on the TIMIT dataset.

    Incoherent Discriminative Dictionary Learning for Speech Enhancement, Journal of Telecommunications and Information Technology, 2018, nr 3

    Speech enhancement is one of the many challenging tasks in signal processing, especially in the case of nonstationary speech-like noise. In this paper a new incoherent discriminative dictionary learning algorithm is proposed to model both speech and noise, where the cost function accounts for both “source confusion” and “source distortion” errors, with a regularization term that penalizes the coherence between speech and noise sub-dictionaries. At the enhancement stage, we use sparse coding on the learnt dictionary to find an estimate of both the clean speech and the noise amplitude spectrum. In the final phase, the Wiener filter is used to refine the clean speech estimate. Experiments on the Noizeus dataset, using two objective speech enhancement measures, frequency-weighted segmental SNR and Perceptual Evaluation of Speech Quality (PESQ), demonstrate that the proposed algorithm outperforms the other speech enhancement methods tested.

    Long-term virological outcome in children on antiretroviral therapy in the UK and Ireland

    Objective: To assess factors at the start of antiretroviral therapy (ART) associated with long-term virological response in children. Design: Multicentre national cohort. Methods: Factors associated with viral load below 400 copies/ml by 12 months and virologic failure among children starting 3/4-drug ART in the UK/Irish Collaborative HIV Paediatric Study were assessed using Poisson models. Results: Nine hundred and ninety-seven children started ART at a median age of 7.7 years (inter-quartile range 2.9–11.7), 251 (25%) below 3 years: 411 (41%) with efavirenz and two nucleoside reverse transcriptase inhibitors (EFV+2NRTIs), 264 (26%) with nevirapine and two NRTIs (NVP+2NRTIs), 119 (12%; 106 NVP, 13 EFV) with non-nucleoside reverse transcriptase inhibitor and three NRTIs (NNRTI+3NRTIs), and 203 (20%) with boosted protease inhibitor-based regimens. Median follow-up after ART initiation was 5.7 (3.0–8.8) years. Viral load was less than 400 copies/ml by 12 months in 92% [95% confidence interval (CI) 91–94%] of the children. Time to suppression was similar across regimens (P=0.10), but faster over calendar time, with older age and lower baseline viral load. Three hundred and thirty-nine (34%) children experienced virological failure. Although progression to failure varied by regimen (P<0.001) and was fastest for NVP+2NRTIs regimens, risk after 2 years on therapy was similar for EFV+2NRTIs and NVP+2NRTIs, and lowest for NNRTI+3NRTIs regimens (P-interaction=0.03). Older age, earlier calendar periods and maternal ART exposure were associated with increased failure risk. Early treatment discontinuation for toxicity occurred more frequently for NVP-based regimens, but 5-year cumulative incidence was similar: 6.1% (95% CI 3.9–8.9%) NVP, 8.3% (95% CI 5.6–11.6%) EFV, and 9.8% (95% CI 5.7–15.3%) protease inhibitor-based regimens (P=0.48). Conclusion: Viral load suppression by 12 months was high with all regimens. NVP+3NRTIs regimens were particularly efficacious in the longer term and may be a good alternative to protease inhibitor-based ART in young children.

    BREATHER (PENTA 16) short-cycle therapy (SCT) (5 days on/2 days off) in young people with chronic human immunodeficiency virus infection: an open, randomised, parallel-group Phase II/III trial.

    BACKGROUND: For human immunodeficiency virus (HIV)-infected adolescents facing lifelong antiretroviral therapy (ART), short-cycle therapy (SCT) with long-acting agents offers the potential for drug-free weekends, less toxicity, better adherence and cost savings. OBJECTIVES: To determine whether or not efavirenz (EFV)-based ART in short cycles of 5 days on and 2 days off is as efficacious (in maintaining virological suppression) as continuous EFV-based ART (continuous therapy; CT). Secondary objectives included the occurrence of new clinical HIV events or death, changes in immunological status, emergence of HIV drug resistance, drug toxicity and changes in therapy. DESIGN: Open, randomised, non-inferiority trial. SETTING: Europe, Thailand, Uganda, Argentina and the USA. PARTICIPANTS: Young people (aged 8-24 years) on EFV plus two nucleoside reverse transcriptase inhibitors and with a HIV-1 ribonucleic acid level [viral load (VL)] of < 50 copies/ml for > 12 months. INTERVENTIONS: Young people were randomised to continue daily ART (CT) or change to SCT (5 days on, 2 days off ART). MAIN OUTCOME MEASURES: Follow-up was for a minimum of 48 weeks (0, 4 and 12 weeks and then 12-weekly visits). The primary outcome was the difference between arms in the proportion with VL > 50 copies/ml (confirmed) by 48 weeks, estimated using the Kaplan-Meier method (12% non-inferiority margin) adjusted for region and age. RESULTS: In total, 199 young people (11 countries) were randomised (n = 99 SCT group, n = 100 CT group) and followed for a median of 86 weeks. Overall, 53% were male; the median age was 14 years (21% ≥ 18 years); 13% were from the UK, 56% were black, 19% were Asian and 21% were Caucasian; and the median CD4% and CD4 count were 34% and 735 cells/mm³, respectively. By week 48, only one participant (CT) was lost to follow-up. 
The SCT arm had a 27% decreased drug exposure as measured by the adherence questionnaire and a MEMSCap™ Medication Event Monitoring System (MEMSCap Inc., Durham, NC, USA) substudy (median cap openings per week: SCT group, n = 5; CT group, n = 7). By 48 weeks, six participants in the SCT group and seven in the CT group had a confirmed VL > 50 copies/ml [difference -1.2%, 90% confidence interval (CI) -7.3% to 4.9%] and two in the SCT group and four in the CT group had a confirmed VL > 400 copies/ml (difference -2.1%, 90% CI -6.2% to 1.9%). All six participants in the SCT group with a VL > 50 copies/ml resumed daily ART, of whom five were resuppressed, three were on the same regimen and two with a switch; two others on SCT resumed daily ART for other reasons. Overall, three participants in the SCT group and nine in the CT group (p = 0.1) changed ART regimen, five because of toxicity, four for simplification reasons, two because of compliance issues and one because of VL failure. Seven young people (SCT group, n = 2; CT group, n = 5) had major non-nucleoside reverse transcriptase inhibitor mutations at VL failure, of whom two (n = 1 SCT group, n = 1 CT group) had the M184V mutation. Two young people had new Centers for Disease Control B events (SCT group, n = 1; CT group, n = 1). There were no significant differences between SCT and CT in grade 3/4 adverse events (13 vs. 14) or in serious adverse events (7 vs. 6); there were fewer ART-related adverse events in the SCT arm (2 vs. 14; p = 0.02). At week 48 there was no evidence that SCT led to increased inflammation using an extensive panel of markers. Young people expressed a strong preference for SCT in a qualitative substudy and in pre- and post-trial questionnaires. In total, 98% of the young people are taking part in a 2-year follow-up extension of the trial. 
CONCLUSIONS: Non-inferiority of VL suppression in young people on EFV-based first-line ART with a VL of < 50 copies/ml was demonstrated for SCT compared with CT, with similar resistance, safety and inflammatory marker profiles. The SCT group had fewer ART-related adverse events. Further evaluation of the immunological and virological impact of SCT is ongoing. A limitation of the trial is that the results cannot be generalised to settings where VL monitoring is either not available or infrequent, nor to use of low-dose EFV. Two-year extended follow-up of the trial is ongoing to confirm the durability of the SCT strategy. Further trials of SCT in settings with infrequent VL monitoring and with other antiretroviral drugs such as tenofovir alafenamide, which has a long intracellular half-life, and/or dolutegravir, which has a higher barrier to resistance, are planned. TRIAL REGISTRATION: Current Controlled Trials ISRCTN97755073; EUDRACT 2009-012947-40; and CTA 27505/0005/001-0001. FUNDING: This project was funded by the National Institute for Health Research (NIHR) Health Technology Assessment programme (projects 08/53/25 and 11/136/108), the European Commission through EuroCoord (FP7/2007/2015), the Economic and Social Research Council, the PENTA Foundation, the Medical Research Council and INSERM SC10-US19, France, and will be published in full in Health Technology Assessment; Vol. 20, No. 49. See the NIHR Journals Library website for further project information
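The non-inferiority conclusion in this abstract rests on a simple comparison: the upper bound of the confidence interval for the difference in failure proportions (SCT minus CT) must lie below the pre-specified margin. A minimal sketch of that decision rule is shown here, using the figures quoted in the abstract. The function name `is_non_inferior` is a hypothetical helper, and the sketch does not reproduce the adjusted Kaplan-Meier estimation that produced the interval.

```python
def is_non_inferior(ci_upper_pct, margin_pct):
    """Return True when the upper confidence bound for the difference
    (new arm minus control, in percentage points) is below the margin."""
    return ci_upper_pct < margin_pct


# Figures quoted in the abstract: difference -1.2%, 90% CI -7.3% to 4.9%,
# with a 12% non-inferiority margin.
breather_primary = is_non_inferior(ci_upper_pct=4.9, margin_pct=12.0)
```

With the quoted upper bound of 4.9% against a 12% margin, the rule is satisfied, which matches the conclusion stated in the abstract.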