    A Survey On Medical Digital Imaging Of Endoscopic Gastritis.

    This paper surveys research related to medical digital imaging of endoscopic gastritis.

    Vocal Folds Disorders Detection and Classification in Endoscopic Narrow-Band Images

    The diagnosis of vocal fold (VF) diseases is error-prone due to the large variety of diseases that can affect the folds. VF lesions can be divided into nodular (e.g., nodules, polyps, and cysts) and diffuse (e.g., hyperplastic laryngitis and carcinoma). In endoscopic examination, the clinician traditionally evaluates the presence of macroscopic formations and alterations of the mucosal vessels. Endoscopic narrow-band imaging (NBI) has recently come into use because it provides enhanced vessel contrast compared to classical white-light endoscopy. This work presents a preliminary study on the development of an automatic diagnostic tool based on the assessment of vocal cord symmetry in NBI images. The objective is to identify possible protruding mass lesions on which subsequent vessel analysis may be performed. The proposed method segments the glottal area (GA) from the endoscopic images; the right and left portions of the vocal folds are then detected and analyzed for protruding areas. The resulting information is used to classify the VF edges as healthy or pathological. Results from the analysis of 22 endoscopic NBI images show that the proposed algorithm is robust and effective, providing a 100% success rate in classifying VF edges as healthy or pathological. These results support further research to expand and improve the algorithm, potentially adding vessel analysis to determine the pathological classification of detected protruding areas.
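
    A minimal sketch of the symmetry analysis this abstract describes, assuming OpenCV: segment the dark glottal area, split its contour into left and right vocal-fold edges, and flag an edge whose deviation from a straight-line fit suggests a protruding area. The thresholding scheme and the protrusion threshold are illustrative assumptions, not the authors' actual pipeline.

        import cv2
        import numpy as np

        def classify_vf_edges(nbi_image, protrusion_thresh=5.0):
            """Label VF edges 'pathological' if either edge deviates from a
            straight-line fit by more than protrusion_thresh pixels."""
            gray = cv2.cvtColor(nbi_image, cv2.COLOR_BGR2GRAY)
            # The glottal area appears dark: Otsu threshold, inverted.
            _, mask = cv2.threshold(gray, 0, 255,
                                    cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_NONE)
            ga = max(contours, key=cv2.contourArea).squeeze()  # GA contour

            # Split contour points into left/right edges about the GA centroid.
            cx = ga[:, 0].mean()
            for edge in (ga[ga[:, 0] < cx], ga[ga[:, 0] >= cx]):
                # Fit each edge as x = a*y + b and measure residuals.
                a, b = np.polyfit(edge[:, 1], edge[:, 0], 1)
                residuals = np.abs(edge[:, 0] - (a * edge[:, 1] + b))
                if residuals.max() > protrusion_thresh:
                    return "pathological"
            return "healthy"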

    Interactive Development Of F0 As An Acoustic Cue For Korean Stop Contrast

    Korean stop contrasts (lenis, fortis, and aspirated) have traditionally been differentiated phonetically by voice onset time (VOT), but with a tonogenetic sound change in progress, the role of fundamental frequency (F0) in distinguishing the contrasts has been amplified as VOT differentiation is lost in young adults' production. The present study explores how F0 is perceptually acquired and how it operates phonetically in toddler speech in relation to Korean stop contrasts across ages. To determine the relationship between F0 developmental patterns and age in child stop production, this study uses a quantitative acoustic model to examine the word-initial stop productions of 58 Korean monolingual children aged 20 to 47 months. The production experiment confirmed that VOT is useful for distinguishing fortis stops, whereas F0 is required for distinguishing lenis from aspirated stops, and this tendency is significantly related to age. As F0 becomes a decisive acoustic parameter for articulatory distinction, the role of F0 in perceptual distinction was investigated through a perceptual identification test with an F0 continuum. Children were shown selected lenis-aspirated image pairs and asked to point to one image in response to synthetic sounds with different F0 values. This allowed us to observe how phonetic boundaries in the F0 dimension for aspirated stops change with age. A comparative analysis of children's production and perception of F0 indicates that articulatory skills depend on perceived F0 differences across phonemic categories. It also indicates that once F0 is acquired, VOT differentiation diminishes for the lenis-aspirated distinction, and that this trade-off between VOT and F0 occurs around the age of 3 years. These findings suggest that phonemic categorization of lenis and aspirated stops is processed in the F0 dimension and that phonemic processing in perceptual acoustic space is directly linked to phonetic discrimination between the non-fortis stops in production. This study provides experimental evidence for understanding the developmental trajectory of F0 as an acoustic cue for native phonological contrasts.
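
    As a hedged illustration of the kind of quantitative acoustic modelling described here, the sketch below fits a logistic classifier separating lenis from aspirated tokens using VOT and onset F0. The file and column names are hypothetical placeholders, not the study's data.

        import pandas as pd
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        tokens = pd.read_csv("stop_tokens.csv")  # hypothetical: one row per token
        X = tokens[["vot_ms", "f0_hz"]]          # acoustic cues
        y = tokens["category"]                   # 'lenis' or 'aspirated'

        model = LogisticRegression()
        print("mean CV accuracy:", cross_val_score(model, X, y, cv=5).mean())

        # Fitting per age group would expose the reported trade-off: the F0
        # coefficient should grow with age while the VOT coefficient shrinks.
        model.fit(X, y)
        print(dict(zip(X.columns, model.coef_[0])))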

    Models and analysis of vocal emissions for biomedical applications: 5th International Workshop: December 13-15, 2007, Firenze, Italy

    The proceedings of the MAVEBA Workshop, held every two years, collect the scientific papers presented as oral and poster contributions during the conference. The main subjects are the development of theoretical and mechanical models as an aid to the study of the main phonatory dysfunctions, as well as biomedical engineering methods for the analysis of voice signals and images in support of clinical diagnosis and the classification of vocal pathologies. The Workshop has the sponsorship of: Ente Cassa Risparmio di Firenze, COST Action 2103, the Biomedical Signal Processing and Control journal (Elsevier), and the IEEE Biomedical Engineering Society. Special issues of international journals have been, and will be, published collecting selected papers from the conference.

    Converging toward a common speech code: imitative and perceptuo-motor recalibration processes in speech production

    Auditory and somatosensory systems play a key role in speech motor control. In the act of speaking, segmental speech movements are programmed to reach phonemic sensory goals, which in turn are used to estimate actual sensory feedback in order to further control production. However, adults' tendency to automatically imitate a number of acoustic-phonetic characteristics of another speaker's speech suggests that speech production relies not only on the intended phonemic sensory goals and actual sensory feedback but also on the processing of external speech inputs. These online adaptive changes in speech production, or phonetic convergence effects, are thought to facilitate conversational exchange by helping to establish a common perceptuo-motor ground between speaker and listener. In line with previous studies on phonetic convergence, we demonstrate here, in a non-interactive communication situation, both unintentional and voluntary online imitative changes in relevant acoustic features of vowel targets (fundamental and first formant frequencies) during speech production and imitation. In addition, perceptuo-motor recalibration processes, or after-effects, occurred not only after vowel production and imitation but also after auditory categorization of the acoustic vowel targets. Altogether, these findings demonstrate adaptive plasticity of phonemic sensory-motor goals and suggest that, apart from sensory-motor knowledge, speech production continuously draws on perceptual learning from the external speech environment.
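
    One simple way to quantify the convergence effect reported here (an illustrative sketch, not the authors' analysis; all values are hypothetical) is the reduction in distance to the acoustic vowel target in (F0, F1) space:

        import numpy as np

        def distance_to_target(f0_hz, f1_hz, target):
            """Euclidean distance to the target in (F0, F1) space, in Hz.
            In practice each dimension is usually normalized first (e.g.,
            converted to semitones) so F0 and F1 contribute comparably."""
            return np.hypot(f0_hz - target[0], f1_hz - target[1])

        target = (130.0, 450.0)  # hypothetical vowel target (F0, F1)
        baseline = distance_to_target(155.0, 520.0, target)  # pre-exposure
        post = distance_to_target(140.0, 480.0, target)      # during exposure

        # A positive value indicates convergence toward the target; the
        # after-effect is the same measure on productions after exposure ends.
        print(f"convergence: {baseline - post:.1f} Hz")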

    Co-registration of paired histological sections and MRI scans of the rabbit larynx

    Co-registering images of different modalities, termed intermodal image registration, is an important tool for improving our understanding of how features detectable in one modality manifest in the other. However, structural changes, usually the result of tissue processing or noise in image acquisition, can make matching difficult. In this thesis, I outline a pre-processing protocol for co-registration of paired histological sections and MRI scans and discuss different co-registration strategies, using the rabbit larynx as a model system.
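
    A common intermodal strategy consistent with what this thesis discusses is mutual-information-based rigid registration; the sketch below uses SimpleITK and assumes both slices have already been pre-processed to grayscale 2-D images (file names and parameter values are placeholders, not the thesis's protocol):

        import SimpleITK as sitk

        fixed = sitk.ReadImage("mri_slice.nii", sitk.sitkFloat32)
        moving = sitk.ReadImage("histology_gray.nii", sitk.sitkFloat32)

        reg = sitk.ImageRegistrationMethod()
        # Mattes mutual information tolerates different intensity mappings
        # across modalities, unlike simple intensity-difference metrics.
        reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
        reg.SetOptimizerAsRegularStepGradientDescent(
            learningRate=1.0, minStep=1e-4, numberOfIterations=200)
        reg.SetInitialTransform(sitk.CenteredTransformInitializer(
            fixed, moving, sitk.Euler2DTransform(),
            sitk.CenteredTransformInitializerFilter.GEOMETRY))
        reg.SetInterpolator(sitk.sitkLinear)

        transform = reg.Execute(fixed, moving)
        resampled = sitk.Resample(moving, fixed, transform,
                                  sitk.sitkLinear, 0.0)
        sitk.WriteImage(resampled, "histology_registered.nii")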

    Computer-assisted liver graft steatosis assessment via learning-based texture analysis

    Purpose: Fast and accurate assessment of graft hepatic steatosis (HS) is of primary importance for lowering the risk of liver dysfunction after transplantation. Histopathological analysis of biopsied liver is the gold standard for assessing HS, despite being invasive and time consuming. Due to the short time available between liver procurement and transplantation, surgeons perform HS assessment through clinical evaluation (medical history, blood tests) and visual analysis of liver texture. Although visual analysis is recognized as challenging in the clinical literature, little effort has been invested in developing computer-assisted solutions for HS assessment. The objective of this paper is to investigate automatic analysis of liver texture with machine learning algorithms to automate the HS assessment process and support the surgeon's decision process. Methods: Forty RGB images of forty different donors were analyzed. The images were captured with an RGB smartphone camera in the operating room (OR). Twenty images refer to livers that were accepted and twenty to livers that were discarded. Fifteen randomly selected liver patches of 100 × 100 pixels were extracted from each image, yielding a balanced dataset of 600 patches. Intensity-based features (INT), the histogram of local binary patterns (HLBPriu2), and gray-level co-occurrence matrix features (FGLCM) were investigated, and blood-sample features (Blo) were included in the analysis as well. Supervised and semi-supervised learning approaches were investigated for feature classification, and leave-one-patient-out cross-validation was performed to estimate classification performance. Results: With the best-performing feature set (HLBPriu2 + INT + Blo) and semi-supervised learning, the achieved classification sensitivity, specificity, and accuracy were 95%, 81%, and 88%, respectively. Conclusions: This research represents the first attempt to use machine learning and automatic texture analysis of RGB images from ubiquitous smartphone cameras for graft HS assessment. The results suggest that this is a promising strategy for developing a fully automatic solution to assist surgeons in HS assessment inside the OR.
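
    The sketch below illustrates the kinds of texture features the abstract names (INT, HLBPriu2, FGLCM) for a single 100 × 100 grayscale patch, using scikit-image; the parameter choices are illustrative assumptions, not the paper's exact settings:

        import numpy as np
        from skimage.feature import local_binary_pattern, graycomatrix, graycoprops

        def patch_features(patch_gray):
            """patch_gray: 2-D uint8 array (one 100x100 liver patch)."""
            # Intensity-based features (INT).
            intensity = [patch_gray.mean(), patch_gray.std()]

            # Histogram of rotation-invariant uniform LBP (HLBPriu2).
            lbp = local_binary_pattern(patch_gray, P=8, R=1, method="uniform")
            hlbp, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

            # Gray-level co-occurrence matrix features (FGLCM).
            glcm = graycomatrix(patch_gray, distances=[1],
                                angles=[0, np.pi / 2], levels=256, normed=True)
            fglcm = [graycoprops(glcm, p).mean()
                     for p in ("contrast", "correlation", "energy", "homogeneity")]

            return np.concatenate([intensity, hlbp, fglcm])

    Per-patch feature vectors of this kind, concatenated with the blood-sample features, could then be fed to a classifier evaluated with scikit-learn's LeaveOneGroupOut, grouping patches by donor to reproduce leave-one-patient-out cross-validation.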

    Vocal imitation for query by vocalisation

    The human voice presents a rich and powerful medium for expressing sonic ideas such as musical sounds. This capability extends beyond the sounds used in speech, as evidenced, for example, by the art form of beatboxing and by recent studies highlighting the utility of vocal imitation for communicating sonic concepts. Meanwhile, the advance of digital audio has resulted in huge libraries of sounds at the disposal of music producers and sound designers. This presents a compelling search problem: with larger search spaces, the task of navigating sound libraries has become increasingly difficult. The versatility and expressive nature of the voice provides a seemingly ideal medium for querying sound libraries, raising the questions of how well humans are able to vocally imitate musical sounds and how we might use the voice as a tool for search. In this thesis we address these questions by investigating the ability of musicians to vocalise synthesised and percussive sounds, and we evaluate the suitability of different audio features for predicting the perceptual similarity between vocal imitations and the imitated sounds. In the first experiment, musicians were tasked with imitating synthesised sounds with one or two time-varying feature envelopes applied. The results show that participants were able to imitate pitch, loudness, and spectral centroid features accurately, and that imitation accuracy was generally preserved when the imitated stimuli combined two, not necessarily congruent, features. This demonstrates the viability of using the voice as a natural means of expressing time series of two features simultaneously. The second experiment consisted of two parts. In a vocal production task, musicians were asked to imitate drum sounds. Listeners were then asked to rate the similarity between the imitations and sounds from the same category (e.g., kick, snare). The results show that drum sounds received the highest similarity ratings when rated against their own imitations (as opposed to imitations of another sound), and overall more than half of the imitated sounds were correctly identified from their imitations with above-chance accuracy, although this varied considerably between drum categories. The findings from the vocal imitation experiments highlight the capacity of musicians to vocally imitate musical sounds, as well as some limitations of non-verbal vocal expression. Finally, we investigated the performance of different audio features as predictors of perceptual similarity between the imitations and imitated sounds from the second experiment. We show that features learned using convolutional auto-encoders outperform a number of popular heuristic features for this task, and that preservation of temporal information is more important than spectral resolution for differentiating between vocal imitations and same-category drum sounds.
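
    As a hedged sketch of the heuristic-feature baseline this kind of study evaluates, the code below compares a vocal imitation to an imitated sound via time-varying loudness and spectral centroid envelopes aligned with dynamic time warping, using librosa. The file names are placeholders, and the thesis's learned auto-encoder features are not shown.

        import librosa
        import numpy as np

        def feature_envelopes(path, sr=22050):
            y, sr = librosa.load(path, sr=sr)
            rms = librosa.feature.rms(y=y)[0]  # loudness proxy
            centroid = librosa.feature.spectral_centroid(y=y, sr=sr)[0]
            feats = np.vstack([rms, centroid])
            # z-score each envelope so voice and drum sounds are comparable
            return (feats - feats.mean(axis=1, keepdims=True)) / \
                   feats.std(axis=1, keepdims=True)

        imitation = feature_envelopes("imitation.wav")
        reference = feature_envelopes("kick_drum.wav")

        # DTW handles differing durations; a lower cost predicts higher
        # perceptual similarity between imitation and imitated sound.
        D, _ = librosa.sequence.dtw(X=imitation, Y=reference, metric="euclidean")
        print("DTW alignment cost:", D[-1, -1])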