Real-Time Vocal Tract Modelling

Authors: Benallal A.; Benkrid A.; Benkrid K.
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2007

To date, most speech synthesis techniques have relied upon the representation of the vocal tract by some form of filter, a typical example being linear predictive coding (LPC). This paper describes the development of a physiologically realistic model of the vocal tract using the well-established technique of transmission line modelling (TLM). This technique is based on the principle of wave scattering at transmission line segment boundaries and may be used in one, two, or three dimensions. This work uses the technique to model the vocal tract as a one-dimensional transmission line. A six-port scattering node is applied in the region separating the pharyngeal, oral, and nasal parts of the vocal tract

Directory of Open Access Journals

Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Authors: Barman Raphaël; Clematide Simon; Ehrmann Maud; Kaplan Frédéric; Oliveira Sofia Ares
Publication venue: 'Centre pour la Communication Scientifique Directe (CCSD)'
Publication date: 14/12/2020

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research work seeking to automatically process facsimiles and extract information thereby are multiplying with, as a first essential step, document layout analysis. If the identification and categorization of segments of interest in document images have seen significant progress over the last years thanks to deep learning techniques, many challenges remain with, among others, the use of finer-grained segmentation typologies and the consideration of complex, heterogeneous documents such as historical newspapers. Besides, most approaches consider visual features only, ignoring textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among others, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models in comparison to a strong visual baseline, as well as better robustness to high material variance

arXiv.org e-Print Archive

Measurement of retinal vessel widths from fundus images based on 2-D modeling

Authors: Basu A.; Hunter Andrew; Kennedy R. L.; Lowell J.; Ryder R.; Steel D.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2002

Changes in retinal vessel diameter are an important sign of diseases such as hypertension, arteriosclerosis and diabetes mellitus. Obtaining precise measurements of vascular widths is a critical and demanding process in automated retinal image analysis as the typical vessel is only a few pixels wide. This paper presents an algorithm to measure the vessel diameter to subpixel accuracy. The diameter measurement is based on a two-dimensional difference of Gaussian model, which is optimized to fit a two-dimensional intensity vessel segment. The performance of the method is evaluated against Brinchmann-Hansen's half height, Gregson's rectangular profile and Zhou's Gaussian model. Results from 100 sample profiles show that the presented algorithm is over 30% more precise than the compared techniques and is accurate to a third of a pixel

ResearchOnline@JCU