6,837 research outputs found

    Automatic Detection of Laryngeal Pathology on Sustained Vowels Using Short-Term Cepstral Parameters: Analysis of Performance and Theoretical Justification

    Most speech signal analysis procedures for automatic detection of laryngeal pathologies rely on parameters extracted from time-domain processing. Moreover, calculating these parameters often requires prior pitch period estimation, so their validity depends heavily on the robustness of pitch detection. This paper presents an alternative approach based on cepstral-domain processing, which does not require pitch estimation and thus offers a gain in both simplicity and robustness. While the proposed scheme is similar to solutions based on Mel-frequency cepstral parameters already present in the literature, it has an easier physical interpretation while achieving similar performance.
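As a rough illustration of cepstral-domain processing without pitch estimation, the following minimal sketch computes a generic real cepstrum (the inverse DFT of the log magnitude spectrum) for one windowed frame. It is not the authors' scheme: the abstract does not give frame length, sampling rate, or number of coefficients, so `fs`, `f0`, `n`, the Hamming window, and the choice of 13 coefficients are assumptions made here for illustration.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum of a real frame."""
    n = len(frame)
    # Floor the magnitude so the logarithm is defined for empty bins.
    log_mag = [math.log(max(abs(v), floor)) for v in dft(frame)]
    # For a real frame the log magnitude is real and even, so the inverse
    # DFT is real; only the real part is kept.
    return [sum(log_mag[m] * cmath.exp(2j * math.pi * m * q / n)
                for m in range(n)).real / n
            for q in range(n)]


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


# Hypothetical synthetic "sustained vowel": ten harmonics of a 100 Hz tone.
fs, f0, n = 8000, 100, 256
signal = [sum(math.sin(2 * math.pi * f0 * h * t / fs) / h for h in range(1, 11))
          for t in range(n)]
frame = [s * w for s, w in zip(signal, hamming(n))]

ceps = real_cepstrum(frame)
short_term = ceps[:13]  # low-quefrency coefficients (assumed count)
```

No pitch period is estimated anywhere in the sketch: the low-quefrency coefficients in `short_term` are read directly from the frame.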

    Learned and handcrafted features for early-stage laryngeal SCC diagnosis

    Squamous cell carcinoma (SCC) is the most common and malignant laryngeal cancer. Early-stage diagnosis is of crucial importance to lower patient mortality and preserve both the laryngeal anatomy and vocal-fold function. However, this may be challenging because the initial larynx modifications, mainly concerning the mucosa vascular tree and the epithelium texture and color, are small and can pass unnoticed by the human eye. The primary goal of this paper was to investigate a learning-based approach to early-stage SCC diagnosis and to compare the use of (i) texture-based global descriptors, such as local binary patterns, and (ii) deep-learning-based descriptors. These features, extracted from endoscopic narrow-band images of the larynx, were classified with support vector machines to discriminate healthy, precancerous, and early-stage SCC tissues. When tested on a benchmark dataset, a median classification recall of 98% was obtained with the best feature combination, outperforming the state of the art (recall = 95%). Although further investigation is needed (e.g., testing on a larger dataset), the achieved results support the use of the developed methodology in clinical practice to provide accurate early-stage SCC diagnosis.
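For readers unfamiliar with local binary patterns, here is a minimal sketch of the basic 3x3 variant, which turns an image patch into a 256-bin texture histogram. This is an assumption-laden illustration: the abstract does not say which LBP variant, neighbourhood, or histogram normalisation was used, and the support vector machine stage is not shown. The helper names `lbp_code` and `lbp_histogram` are hypothetical.

```python
# Clockwise neighbour offsets, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit code for pixel (r, c): bit i is set if neighbour i >= centre."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256


# Hypothetical 4x4 grayscale patch (values are made up for illustration).
patch = [
    [10, 10, 10, 10],
    [10, 50, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
features = lbp_histogram(patch)  # 256-dimensional feature vector
```

A vector like `features` is the kind of global descriptor that could then be passed to a classifier.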

    Automatic Workflow for Narrow-Band Laryngeal Video Stitching

    In narrow-band (NB) laryngeal endoscopy, the clinician usually positions the endoscope near the tissue to inspect possible vascular pattern alterations, which are indicative of laryngeal malignancies. The video is usually reviewed many times to refine the diagnosis, which wastes time because the salient frames are mixed with blurred, noisy, and redundant frames caused by endoscope movements. The aim of this work is to provide the clinician with a single larynx panorama, obtained through an automatic frame selection strategy that discards non-informative frames. Anisotropic diffusion filtering was exploited to lower the noise level while encouraging the selection of meaningful image features, and a feature-based stitching approach was carried out to generate the panorama. The frame selection strategy, tested on six pathological NB endoscopic videos, was compared with standard strategies, such as uniform and random sampling, and yielded higher performance in the subsequent stitching procedure, both visually, in terms of vascular structure preservation, and numerically, through a blur estimation metric.

    Artificial neural network-statistical approach for PET volume analysis and classification

    Copyright © 2012 The Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. This article has been made available through the Brunel Open Access Publishing Fund. The increasing number of imaging studies and the prevailing application of positron emission tomography (PET) in clinical oncology have led to a real need for efficient PET volume handling and the development of new volume analysis approaches to aid the clinicians in the clinical diagnosis, planning of treatment, and assessment of response to therapy. A novel automated system for oncological PET volume analysis is proposed in this work. The proposed intelligent system deploys two types of artificial neural networks (ANNs) for classifying PET volumes. The first methodology is a competitive neural network (CNN), whereas the second one is based on learning vector quantisation neural network (LVQNN). Furthermore, Bayesian information criterion (BIC) is used in this system to assess the optimal number of classes for each PET data set and assist the ANN blocks to achieve accurate analysis by providing the best number of classes. The system evaluation was carried out using experimental phantom studies (NEMA IEC image quality body phantom), simulated PET studies using the Zubal phantom, and clinical studies representative of nonsmall cell lung cancer and pharyngolaryngeal squamous cell carcinoma. The proposed analysis methodology of clinical oncological PET data has shown promising results and can successfully classify and quantify malignant lesions. This study was supported by the Swiss National Science Foundation under Grant SNSF 31003A-125246, Geneva Cancer League, and the Indo Swiss Joint Research Programme ISJRP 138866.
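To make the role of the Bayesian information criterion concrete, the sketch below selects a number of classes by minimising BIC = p ln(n) - 2 ln(L) over candidate values of k. It is a generic illustration, not the system described in the abstract: the 1-D toy data, the k-means fit, the shared-variance hard-assignment Gaussian likelihood, and the parameter count p = 2k are all assumptions introduced here.

```python
import math


def kmeans_1d(data, k, iters=50):
    """Lloyd's algorithm in 1-D with deterministic, evenly spaced initial centres."""
    s = sorted(data)
    centres = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: (x - centres[i]) ** 2)
            groups[nearest].append(x)
        # An empty group keeps its previous centre.
        centres = [sum(g) / len(g) if g else centres[i]
                   for i, g in enumerate(groups)]
    return groups


def log_likelihood(groups, n, var_floor=1e-6):
    """Hard-assignment Gaussian log-likelihood with one shared variance."""
    sse, weight_term = 0.0, 0.0
    for g in groups:
        if not g:
            continue
        mu = sum(g) / len(g)
        sse += sum((x - mu) ** 2 for x in g)
        weight_term += len(g) * math.log(len(g) / n)
    var = max(sse / n, var_floor)
    return weight_term - 0.5 * n * math.log(2 * math.pi * var) - sse / (2 * var)


def bic(log_lik, n_params, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_samples) - 2.0 * log_lik


# Hypothetical toy data: three well-separated groups of 21 values each.
data = [c + 0.05 * (i - 10) for c in (0.0, 5.0, 10.0) for i in range(21)]

scores = {}
for k in range(1, 6):
    groups = kmeans_1d(data, k)
    # Assumed parameter count: k means, k - 1 free weights, 1 shared variance.
    scores[k] = bic(log_likelihood(groups, len(data)), 2 * k, len(data))

best_k = min(scores, key=scores.get)
```

On this toy data the criterion favours `best_k = 3`: adding classes beyond three improves the fit too little to offset the ln(n) penalty per extra parameter.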

    Analysis of a Modern Voice Morphing Approach using Gaussian Mixture Models for Laryngectomees

    This paper proposes a voice morphing system for people who have undergone laryngectomy, the surgical removal of all or part of the larynx (the voice box), performed particularly in cases of laryngeal cancer. A primitive method of achieving voice morphing is to extract the source speaker's vocal coefficients and convert them into the target speaker's vocal parameters. In this paper, we deploy Gaussian Mixture Models (GMM) for mapping the coefficients from source to destination. However, the conventional GMM-based mapping approach results in over-smoothening of the converted voice. We therefore propose a GMM-based method for voice morphing and conversion that overcomes the over-smoothening of the conventional method. It uses glottal waveform separation and prediction of excitations, and the results show that over-smoothening is eliminated and that the transformed vocal tract parameters match those of the target. Moreover, the synthesized speech thus obtained is found to be of sufficiently high quality. The proposed GMM approach is critically evaluated using various subjective and objective evaluation parameters. Further, an application of voice morphing for laryngectomees that deploys this approach is recommended. Comment: 6 pages, 4 figures, 4 tables; International Journal of Computer Applications Volume 49, Number 21, July 201

    Confident texture-based laryngeal tissue classification for early stage diagnosis support

    Moccia, Sara; De Momi, Elena; Guarnaschelli, Marco; Savazzi, Matteo; Laborai, Andrea; Guastini, Luca; Peretti, Giorgio; Mattos, Leonardo S.