8 research outputs found

    Contrast and color balance enhancement for non-uniform illumination retinal images

    Color retinal images play an important role in supporting medical diagnosis. However, some retinal images are unsuitable for diagnosis because of non-uniform illumination. To solve this problem, we propose a method for correcting non-uniform illumination that enhances the quality of a color fundus photograph so that it is suitable for reliable visual diagnosis. First, hidden anatomical structure in dark regions of the retinal image is revealed by improving the image luminosity with gamma correction. Second, multi-scale tone manipulation is used to adjust the image contrast in the lightness channel of the L*a*b* color space. Finally, color balance is adjusted by specifying the image brightness based on Hubbard’s specification. The performance of the method has been evaluated on the DIARETDB1 dataset. The results show that the proposed algorithm performs well at correcting the non-uniform illumination of color retinal images.
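
The gamma-correction step can be illustrated with a minimal sketch. The abstract does not give the gamma value or say how the luminosity channel is extracted, so the function name `gamma_correct`, the default `gamma`, and the assumption that pixel values are already scaled to [0, 1] are illustrative choices only, not the authors' implementation.

```python
def gamma_correct(luminosity, gamma=2.2):
    """Brighten a 2D grid of luminosity values in [0, 1] with out = in ** (1 / gamma).

    For gamma > 1, dark values are lifted more than bright ones, which is
    why this kind of mapping can reveal structure in dark regions.
    The default gamma is an assumed value, not one taken from the paper.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [[pixel ** (1.0 / gamma) for pixel in row] for row in luminosity]


# Hypothetical 2x2 luminosity patch, mostly dark.
dark_patch = [[0.04, 0.16], [0.36, 1.0]]
brightened = gamma_correct(dark_patch, gamma=2.0)  # gamma = 2 is a square root
```

With `gamma=2.0` the darkest value (0.04) rises to about 0.2, while a fully bright pixel (1.0) is unchanged.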

    Evaluating biometrics fingerprint template protection for an emergency situation

    Biometric template protection approaches have been developed to secure stored biometric templates against image reconstruction. Two cancellable fingerprint template protection approaches, namely the minutiae-based bit-string cancellable fingerprint template and the modified minutiae-based bit-string cancellable fingerprint template, are selected for evaluation. Both approaches incorporate the geometric information of the fingerprint into the extracted minutiae. Six modified fingerprint data sets are derived from the original fingerprint images in FVC2002DB1_B and FVC2002DB2_B by rotating the images and changing their quality to reflect the environmental conditions of an emergency situation, such as wet or dry fingers and disoriented fingerprint angles. The experimental results show that the modified minutiae-based bit-string cancellable fingerprint template performs well under all of these conditions, achieving a matching accuracy between 83% and 100% on the FVC2002DB1_B data set and between 99% and 100% on the FVC2002DB2_B data set.

    SIGNAL MODELING WITH NON-UNIFORM TIME SAMPLING OF FEATURES FOR AUTOMATIC SPEECH RECOGNITION

    This dissertation presents an investigation of non-uniform time sampling methods for spectral/temporal feature extraction in speech. Frame-based features were computed based on an encoding of the global spectral shape using a Discrete Cosine Transform. In most current “standard” methods, trajectory (dynamic) features are determined from frame-based parameters using a fixed time sampling, i.e., fixed block length and fixed block spacing. In this research, new methods are proposed and investigated in which block length and/or block spacing are variable. The idea was initially tested with HMM-based isolated word recognition, and a significant performance improvement resulted when a variable block length and variable block method were applied. An accuracy of 97.9% was obtained with an alphabet recognition task using the ISOLET database. This result i
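
As a rough illustration of encoding spectral shape with a Discrete Cosine Transform, the sketch below computes the first few unnormalized DCT-II coefficients of one spectral frame. The abstract does not specify the scaling, amplitude compression, or frequency warping used, so the function name `dct_ii` and the toy spectra are assumptions made for illustration.

```python
import math


def dct_ii(values, num_coeffs):
    """Return the first num_coeffs unnormalized DCT-II coefficients of values.

    X_k = sum_i values[i] * cos(pi / N * (i + 0.5) * k), for k = 0 .. num_coeffs - 1.
    Low-order coefficients summarize the overall (global) shape of the input.
    """
    n = len(values)
    return [
        sum(v * math.cos(math.pi / n * (i + 0.5) * k) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


# Hypothetical 4-bin spectra: one flat, one tilted toward low frequencies.
flat_spectrum = [1.0, 1.0, 1.0, 1.0]
tilted_spectrum = [4.0, 3.0, 2.0, 1.0]
flat_coeffs = dct_ii(flat_spectrum, 3)
tilted_coeffs = dct_ii(tilted_spectrum, 3)
```

A flat spectrum puts all of its energy in coefficient 0, while the tilted spectrum also produces a positive coefficient 1, which is how a handful of coefficients can capture the overall spectral shape.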

    Person recognition using fingerprints and top-view finger images

    Our multimodal biometric system combines fingerprinting with a top-view finger image captured by a CCD camera without user intervention. The greyscale image is preprocessed to enhance its edges, skin furrows, and the nail shape before being manipulated by a bank of oriented filters. A square tessellation is applied to the filtered image to create a feature map, called a NailCode, which is employed in Euclidean distance computations. The NailCode reduces system errors by 17.68% in the verification mode, and by 6.82% in the identification mode.
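
The tessellation and matching steps can be sketched as follows. The abstract does not say which statistic is computed per square cell, so the mean absolute deviation used here is an assumption, and the function names and toy image are hypothetical. The oriented filter bank is omitted.

```python
import math


def tessellation_features(image, cell):
    """Split a 2D grid into cell x cell squares and return one feature per square.

    The per-square feature is the mean absolute deviation from the square's
    mean (an assumed choice). Partial squares at the borders are ignored.
    """
    rows, cols = len(image), len(image[0])
    features = []
    for top in range(0, rows - cell + 1, cell):
        for left in range(0, cols - cell + 1, cell):
            block = [
                image[r][c]
                for r in range(top, top + cell)
                for c in range(left, left + cell)
            ]
            mean = sum(block) / len(block)
            features.append(sum(abs(v - mean) for v in block) / len(block))
    return features


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical 4x4 filtered image, tessellated into four 2x2 squares.
filtered = [
    [0, 2, 1, 1],
    [2, 0, 1, 1],
    [5, 5, 0, 4],
    [5, 5, 4, 0],
]
code = tessellation_features(filtered, cell=2)
distance_to_self = euclidean_distance(code, code)
distance_to_blank = euclidean_distance(code, [0.0] * len(code))
```

In a verification setting, a small distance between an enrolled feature map and a query feature map would count as a match. The decision threshold is not given in the abstract.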

    AN INVESTIGATION OF VARIABLE BLOCK LENGTH METHODS FOR CALCULATION OF SPECTRAL/TEMPORAL FEATURES FOR AUTOMATIC SPEECH RECOGNITION

    This paper presents an investigation of non-uniform time sampling methods for spectral/temporal feature extraction for use in automatic speech recognition. In most current methods for signal modeling of speech information, “dynamic” features are determined from frame-based parameters using a fixed time sampling, i.e., fixed block length and fixed block spacing. This work explores new methods in which block length and/or block spacing are variable. Three methods are suggested, and each was tested with the TIMIT database using a standard HMM recognizer. Phone recognition experiments were conducted using the standard 39 phone set. The methods were also evaluated with various HMM model complexities. Experimental results indicated that none of the proposed non-uniform feature time sampling methods performs significantly better than fixed time sampling methods. However, the best results obtained with the front end are comparable to those obtained with current state-of-the-art systems. Also, the performance of our monophone system surpasses that of most reported context-dependent monophone systems.

    Signal Modeling for Isolated Word Recognition

    This paper presents speech signal modeling techniques which are well suited to high-performance and robust isolated word recognition. Speech is encoded by a discrete cosine transform of its spectra, after several preprocessing steps. Temporal information is then also explicitly encoded into the feature set. We present a new technique for incorporating this temporal information as a function of temporal position within each word. We tested features computed with this method using an alphabet recognition task based on the ISOLET database. The HTK toolkit was used to implement the isolated word recognizer with whole-word HMM models. The best result obtained, based on 50 features and speaker-independent alphabet recognition, was 98.0%. Gaussian noise was added to the original speech to simulate a noisy environment. We achieved a recognition accuracy of 95.8% at an SNR of 15 dB. We also tested our recognizer with simulated telephone-quality speech by adding noise and band-limiting the original speech. For this "telephone" speech, our recognizer achieved 89.6% recognition accuracy. The recognizer was also tested in a speaker-dependent mode, resulting in 97.4% accuracy on test data.
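
Adding Gaussian noise at a target SNR, as in the 15 dB condition above, can be sketched as follows. The abstract does not describe the noise-generation procedure, so scaling white noise against the average signal power is an assumed approach, and the function names and test tone are hypothetical.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, rng=None):
    """Return a copy of a non-empty signal with white Gaussian noise at snr_db.

    The noise variance is signal_power / 10 ** (snr_db / 10), where
    signal_power is the mean squared sample value.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_energy = sum(s * s for s in clean)
    noise_energy = sum((n - s) ** 2 for s, n in zip(clean, noisy))
    return 10 * math.log10(signal_energy / noise_energy)


# Hypothetical stand-in for speech: one second of a 440 Hz tone at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_gaussian_noise(clean, snr_db=15)
achieved = measured_snr_db(clean, noisy)
```

Because the noise is random, the achieved SNR only approximates the 15 dB target. Over a one-second signal the estimate should land very close to it.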

    Development of a notification delivery specimen system for perioperative Thai nurses via the LINE application

    Objective: The aim of the study was to develop and examine satisfaction in using a notification delivery specimen system for perioperative Thai nurses through the LINE application. Methods: Design and development research was used in the study and 100 perioperative nurses were recruited from the three operating theatres in hospital settings in Thailand. Data analysis was performed using descriptive statistics. Results: The overall satisfaction in using a notification delivery specimen system for perioperative Thai nurses through the LINE application was at the high level (M = 4.09, SD = 0.75). The perioperative nurses reported ease of use and safety scored high (M = 4.24, SD = 0.62), followed by sharpness of figures and the coloured light alert (M = 4.15, SD = 0.92), sending messages via LINE notification, and delivering the specimen quickly within the time period (M = 4.10, SD = 0.69). Conclusion: The notification delivery specimen system, designed specifically for perioperative Thai nurses and integrated with the LINE application, yielded exceptionally high levels of satisfaction among users. These promising results suggest the potential for widespread adoption in various hospital settings in the coming years.

    Open Source Multi-Language Audio Database for Spoken Language Processing Applications

    Over the past few decades, research in automatic speech recognition and automatic speaker recognition has been greatly facilitated by the sharing of large annotated speech databases such as those distributed by the Linguistic Data Consortium (LDC). Open sources, particularly web sites such as YouTube, contain vast and varied speech recordings in a variety of languages. However, these "open sources" for speech data are largely untapped as resources for speech research. In this paper, a project to collect, organize, and annotate a large group of this speech data is described. The data consists of approximately 30 hours of speech in each of three languages: English, Mandarin Chinese, and Russian. Each of 900 recordings has been orthographically transcribed at the sentence/phrase level by human listeners. Some of the issues related to working with this low-quality, varied, noisy speech data in three languages are described.