115 research outputs found

    A novel lip geometry approach for audio-visual speech recognition

    By identifying lip movements and characterizing their associations with speech sounds, the performance of speech recognition systems can be improved, particularly when operating in noisy environments. Various methods have been studied by research groups around the world to incorporate lip movements into speech recognition in recent years; however, exactly how best to incorporate the additional visual information is still not known. This study aims to extend the knowledge of relationships between visual and speech information, specifically using lip geometry information because of its robustness to head rotation and the smaller number of features required to represent movement. A new method has been developed to extract lip geometry information, to perform classification and to integrate visual and speech modalities. This thesis makes several contributions. First, this work presents a new method to extract lip geometry features using the combination of a skin colour filter, a border following algorithm and a convex hull approach. The proposed method was found to improve lip shape extraction performance compared to existing approaches. Lip geometry features including height, width, ratio, area, perimeter and various combinations of these features were evaluated to determine which performs best when representing speech in the visual domain. Second, a novel template matching technique able to adapt to dynamic differences in the way words are uttered by speakers has been developed, which determines the best fit of an unseen feature signal to those stored in a database template. Third, following an evaluation of integration strategies, a novel method has been developed based on an alternative decision fusion strategy, in which the outcome from the visual and speech modality is chosen by measuring the quality of audio based on kurtosis and skewness analysis and driven by white noise confusion. Finally, the performance of the new methods introduced in this work is evaluated using the CUAVE and LUNA-V data corpora under a range of different signal-to-noise ratio conditions using the NOISEX-92 dataset.
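The geometry features listed in the abstract (height, width, ratio, area, perimeter) can be illustrated with a minimal sketch. It assumes the lip border is already available as a list of (x, y) pixel coordinates; the function names and the use of Andrew's monotone chain for the convex hull are illustrative choices, not the thesis's actual implementation.

```python
import math

def convex_hull(points):
    """Convex hull via Andrew's monotone chain (an assumed, generic choice)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of the cross product (a - o) x (b - o)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Drop the last point of each half because it repeats the other half's start.
    return lower[:-1] + upper[:-1]

def lip_geometry_features(border_points):
    """Width, height, height/width ratio, area and perimeter of the hull."""
    hull = convex_hull(border_points)
    xs = [x for x, _ in hull]
    ys = [y for _, y in hull]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    n = len(hull)
    # Shoelace formula for the polygon area.
    area = 0.5 * abs(sum(
        hull[i][0] * hull[(i + 1) % n][1] - hull[(i + 1) % n][0] * hull[i][1]
        for i in range(n)
    ))
    perimeter = sum(math.dist(hull[i], hull[(i + 1) % n]) for i in range(n))
    return {
        "width": width,
        "height": height,
        "ratio": height / width if width else 0.0,
        "area": area,
        "perimeter": perimeter,
    }
```

Computing one such feature dictionary per video frame would give a feature signal over time, which is the kind of input a template matching step could compare against stored templates.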

    Statistical Analysis of Wind Power Density Based on the Weibull and Rayleigh Models of Selected Site in Malaysia

    The demand for electricity in Malaysia is growing in tandem with its Gross Domestic Product (GDP) growth. Malaysia is going to need even more energy as it strives to grow towards a high-income economy. Malaysia has taken steps to explore renewable energy (RE), including wind energy, as an alternative source for generating electricity. In the present study, the wind energy potential of each site is statistically analyzed based on 1-year measured hourly time-series wind speed data. Wind data were obtained from the Malaysian Meteorological Department (MMD) weather stations at nine selected sites in Malaysia. The data were processed in MATLAB to determine and generate the Weibull and Rayleigh distribution functions. Both the Weibull and Rayleigh models are fitted and compared to the field data probability distributions of the year 2011. The analysis shows that the Weibull distribution fits the field data better than the Rayleigh distribution for the whole of 2011. The wind power density of every site has been studied based on the Weibull and Rayleigh functions. The Weibull distribution gives a good approximation for estimating wind power density in Malaysia.
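As a rough illustration of the quantities involved, the sketch below evaluates the standard Weibull density and the resulting mean wind power density, with the Rayleigh model as the special case of shape k = 2. The study itself used MATLAB; the Python function names and the default air density of 1.225 kg/m³ are assumptions made here, not values taken from the paper.

```python
import math

def weibull_pdf(v, k, c):
    """Weibull density of wind speed v (m/s) with shape k and scale c (m/s).

    Returns 0.0 for v <= 0 for simplicity (an assumption of this sketch).
    """
    if v <= 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def weibull_power_density(k, c, rho=1.225):
    """Mean wind power density in W/m^2: 0.5 * rho * c^3 * Gamma(1 + 3/k)."""
    return 0.5 * rho * c ** 3 * math.gamma(1.0 + 3.0 / k)

def rayleigh_power_density(v_mean, rho=1.225):
    """Rayleigh case (k = 2) written in terms of mean speed: (3/pi) * rho * v_mean^3."""
    return (3.0 / math.pi) * rho * v_mean ** 3
```

The two power density functions agree when k = 2 and the mean speed is c·Γ(1.5), which is a quick consistency check on any fitted parameters.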

    Image segmentation of women’s salivary ferning patterns using harmony frangi filter

    Medical research proves that on entering the fertile period, especially during ovulation, all female body fluids contain ferning patterns in the form of salt crystallization shaped like a fern tree. So far, little research has addressed the segmentation of the salivary ferning pattern, owing to several problems. First, no database of salivary ferning pattern images is available online. Second, the salivary ferning pattern has several hidden layers and uneven intensity. The purpose of this study was to detect and determine the line shape of the salivary ferning crystal pattern using the Harmony Frangi Filter method based on the Hessian matrix operation. The segmentation results from this study are a crucial basis for determining the level of accuracy and precision at the next stage of research, namely the prediction of a woman’s ovulation in each menstrual cycle. The segmentation results have average values of MSE 2.25, PSNR 44.86 dB, FSIM 0.954, accuracy 99.88%, sensitivity 99.98% and specificity 99.88%.
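For readers unfamiliar with the reported measures, the following sketch shows how MSE, PSNR, accuracy, sensitivity and specificity are commonly computed for a segmentation result. It assumes images are given as nested lists of pixel values and masks as nested lists of 0/1, with a peak value of 255; these conventions and the function names are assumptions of this example, not details from the paper, and FSIM is omitted.

```python
import math

def mse(image_a, image_b):
    """Mean squared error between two equally sized greyscale images."""
    flat_a = [p for row in image_a for p in row]
    flat_b = [p for row in image_b for p in row]
    return sum((a - b) ** 2 for a, b in zip(flat_a, flat_b)) / len(flat_a)

def psnr(image_a, image_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(image_a, image_b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

def confusion_metrics(predicted_mask, truth_mask):
    """Pixel-wise accuracy, sensitivity and specificity for binary masks."""
    tp = fp = tn = fn = 0
    for pred_row, truth_row in zip(predicted_mask, truth_mask):
        for pred, truth in zip(pred_row, truth_row):
            if pred and truth:
                tp += 1
            elif pred and not truth:
                fp += 1
            elif not pred and truth:
                fn += 1
            else:
                tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }
```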

    A novel fern-like lines detection using a hybrid of pre-trained convolutional neural network model and Frangi filter

    Full ferning is the peak of the formation of a salt crystallization line pattern shaped like a fern tree in a woman’s saliva at the time of ovulation. The main problem in this study is how to detect the shape of the salivary ferning line patterns that are transparent, irregular and the surface lighting is uneven. This study aims to detect transparent and irregular lines on the salivary ferning surface using a comparison of 15 pre-trained convolutional neural network models. To detect fern-like lines on transparent and irregular layers, a pre-processing stage using the Frangi filter is required. The pre-trained convolutional neural network model is a promising framework with high precision and accuracy for detecting fern-like lines in salivary ferning. The results of this study using the fixed learning rate model ResNet50 showed the best performance with an error rate of 4.37% and an accuracy of 95.63%. Meanwhile, in implementing the automatic learning rate, ResNet18 achieved the best results with an error rate of 1.99% and an accuracy of 98.01%. The results of visual detection of fern-like lines in salivary ferning using a patch size of 34×34 pixels indicate that the ResNet34 model gave the best appearance.

    In-The-Wild deepfake detection using adaptable CNN models with visual class activation mapping for improved accuracy

    Deepfake technology has become increasingly sophisticated in recent years, making detecting fake images and videos challenging. This paper investigates the performance of adaptable convolutional neural network (CNN) models for detecting Deepfakes. The in-the-wild OpenForensics dataset was used to evaluate four different CNN models (DenseNet121, ResNet18, SqueezeNet, and VGG11) at different batch sizes and with various performance metrics. Results show that the adapted VGG11 model with a batch size of 32 achieved the highest accuracy of 94.46% in detecting Deepfakes, outperforming the other models, with DenseNet121 as the second-best performer, achieving an accuracy of 93.89% with the same batch size. Grad-CAM techniques are utilized to visualize the decision-making process within the models, aiding in understanding the Deepfake classification process. These findings provide valuable insights into the performance of different deep learning models and can guide the selection of an appropriate model for a specific application.

    An attention-augmented convolutional neural network with focal loss for mixed-type wafer defect classification

    Silicon wafer defect classification is crucial for improving fabrication and chip production. Although deep learning methods have been successful in single-defect wafer classification, the increasing complexity of the fabrication process has introduced the challenge of multiple defects on wafers, which requires more robust feature learning and classification techniques. Attention mechanisms have been used to enhance feature learning for multiple wafer defects. However, their use has been limited to a few mixed-type defect categories, and their performance declines as the number of mixed patterns increases. This work proposes an attention-augmented convolutional neural network (A2CNN) model for enhanced discriminative feature learning of complex defects. The A2CNN model emphasizes the features in the channel and spatial dimensions. Additionally, the model adopts the focal loss function to reduce misclassification and a global average pooling layer to enhance the network's generalization by reducing overfitting. The A2CNN model is evaluated on the MixedWM38 wafer defect dataset using 10-fold cross-validation. It achieves accuracy, precision, recall, and F1-score of 98.66%, 99.0%, 98.55%, and 98.82% respectively. Compared to existing works, the A2CNN model performs better by effectively learning valuable information for complex mixed-type wafer defects.
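The focal loss mentioned in the abstract scales the usual cross-entropy term by a factor that shrinks for well-classified examples: FL(p_t) = -α (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The sketch below is a minimal single-example, multi-class version; the abstract does not state the exact formulation or hyperparameters used in A2CNN, so the defaults γ = 2 and α = 1 and the function name are assumptions.

```python
import math

def focal_loss(class_probs, target_index, gamma=2.0, alpha=1.0):
    """Focal loss for one example.

    class_probs: predicted probabilities per class (assumed to sum to 1).
    target_index: index of the true class.
    gamma, alpha: focusing and weighting parameters (assumed defaults).
    """
    # Clamp to avoid log(0) when the model assigns zero probability.
    p_t = min(max(class_probs[target_index], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma set to 0 and alpha to 1 the expression reduces to plain cross-entropy, so the extra factor can be read as a per-example weight that focuses training on harder cases.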

    A Survey on Building Safety after Completing the Construction Process in Malaysia Using Statistical Approach

    Building condition is an important issue all over the world for enhancing the safety, health and sustainability of the built environment. The objective of this study is to determine the most frequent causes of building failures in order to prevent collapses, cracks and similar failures. A questionnaire was distributed among engineers, contractors and the public, with 100 respondents. The survey focuses on two main parts of safety: building design and building management. Building design is divided into four main criteria: building structure, service design, building fitting and hazard environment. Meanwhile, the building management part focuses on the management criteria. Results are analysed using a statistical approach. Structural equation modeling (SEM) is used to evaluate the models’ fitness and goodness. The survey shows that all criteria are important in maintaining the safety of a building after completing the construction process.

    Implementation of artificial neural network to recognize numbers from voice

    Speech recognition is a subjective phenomenon and an important part of human–machine interaction that still faces many problems. The purpose of this work is to investigate and apply the artificial neural network (ANN) to recognise numbers using voice. In this work, the MATLAB neural network toolbox is used to create, train and simulate the ANN. The dataset consisted of voice recordings of the numbers ‘one’ to ‘five’. Each recording underwent a windowing process to view a short time segment of the longer signal and analyse its frequency content, was then filtered with a band-pass filter to remove unwanted noise, and was converted into histograms as input for the network. From the experiments, the highest accuracy obtained was 72.5%, using histograms for feature extraction.
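The windowing and histogram steps described above can be sketched as follows. The abstract does not name the window type, frame length, or histogram settings, and the original work used MATLAB, so the Hamming window, the parameter names and the value range of -1 to 1 are all assumptions of this example; the band-pass filtering step is left out.

```python
import math

def hamming_window(length):
    """Hamming window coefficients (window type is an assumption)."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]

def windowed_frames(signal, frame_length, hop):
    """Split a signal into overlapping frames and apply the window to each."""
    window = hamming_window(frame_length)
    frames = []
    for start in range(0, len(signal) - frame_length + 1, hop):
        segment = signal[start:start + frame_length]
        frames.append([s * w for s, w in zip(segment, window)])
    return frames

def amplitude_histogram(values, bins, low=-1.0, high=1.0):
    """Normalised histogram of sample values, usable as a fixed-size feature vector."""
    counts = [0] * bins
    bin_width = (high - low) / bins
    for value in values:
        if low <= value <= high:
            # The top edge falls into the last bin.
            index = min(int((value - low) / bin_width), bins - 1)
            counts[index] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else [0.0] * bins
```

A fixed number of bins gives every recording a feature vector of the same length, which is what a feed-forward network input layer requires.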