204,646 research outputs found

    Efficient smile detection by Extreme Learning Machine

    Smile detection is a specialized task in facial expression analysis with applications such as photo selection, user experience analysis, and patient monitoring. As one of the most important and informative expressions, a smile conveys underlying emotional states such as joy, happiness, and satisfaction. In this paper, an efficient smile detection approach is proposed based on the Extreme Learning Machine (ELM). Faces are first detected and a holistic flow-based face registration is applied that does not need any manual labeling or key-point detection. ELM is then used to train the classifier. The proposed smile detector is tested with different feature descriptors on publicly available databases, including real-world face images. Comparisons against benchmark classifiers, including the Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA), suggest that the proposed ELM-based smile detector generally performs better and is very efficient. Compared to state-of-the-art smile detectors, the proposed method achieves competitive results without preprocessing and manual registration.
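
    The core of the approach above is the ELM training step itself: random, fixed input-to-hidden weights followed by a single least-squares solve for the output weights. Below is a minimal NumPy sketch of that step for a binary smile / non-smile classifier; the feature dimensions, hidden-layer size, and sigmoid activation are illustrative assumptions, not the paper's exact configuration.

    import numpy as np

    def train_elm(X, y, n_hidden=500, seed=0):
        """Train a basic ELM binary classifier.

        X: (n_samples, n_features) feature matrix (e.g. extracted face descriptors).
        y: (n_samples,) labels in {0, 1} (non-smile / smile).
        """
        rng = np.random.default_rng(seed)
        # Input weights and biases are drawn at random and never updated.
        W = rng.standard_normal((X.shape[1], n_hidden))
        b = rng.standard_normal(n_hidden)
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))      # hidden-layer activations
        # Output weights come from one least-squares solve (Moore-Penrose pseudo-inverse).
        beta = np.linalg.pinv(H) @ y
        return W, b, beta

    def predict_elm(X, W, b, beta, threshold=0.5):
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
        return (H @ beta >= threshold).astype(int)

    # Toy usage with random data standing in for extracted face descriptors.
    if __name__ == "__main__":
        X = np.random.rand(200, 128)
        y = np.random.randint(0, 2, 200)
        W, b, beta = train_elm(X, y)
        print(predict_elm(X[:5], W, b, beta))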

    Spoken language identification based on the enhanced self-adjusting extreme learning machine approach

    Spoken Language Identification (LID) is the process of determining and classifying the natural language of given content and datasets. Typically, the data must be processed to extract useful features before LID can be performed. According to the literature, feature extraction for LID is a mature process, with standard features already developed around Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral (SDC) features, the Gaussian Mixture Model (GMM), and, most recently, the i-vector based framework. However, the learning process applied to the extracted features still needs to be improved (i.e. optimised) to capture all of the knowledge embedded in them. The Extreme Learning Machine (ELM) is an effective learning model for classification and regression analysis and is particularly useful for training single-hidden-layer neural networks. Nevertheless, its learning process is not fully optimised because the weights between the input and hidden layer are selected at random. In this study, the ELM is selected as the learning model for LID based on standard feature extraction. One optimisation approach for ELM, the Self-Adjusting Extreme Learning Machine (SA-ELM), is taken as the benchmark and improved by altering the selection phase of its optimisation process. The selection phase is performed by incorporating both the Split-Ratio and K-Tournament methods; the improved SA-ELM is named the Enhanced Self-Adjusting Extreme Learning Machine (ESA-ELM). Results are generated for LID on datasets created from eight different languages and show a clear advantage for the Enhanced Self-Adjusting Extreme Learning Machine LID (ESA-ELM LID) over the SA-ELM LID, with ESA-ELM LID achieving an accuracy of 96.25% compared with 95.00% for SA-ELM LID.
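
    The stated improvement is confined to the selection phase of the optimisation loop, combining the Split-Ratio and K-Tournament methods. The sketch below illustrates one plausible reading of that combination (an elite fraction fixed by the split ratio, with the remainder filled by K-tournament draws); the function names, parameters, and interpretation are assumptions rather than the paper's exact procedure.

    import random

    def k_tournament(population, fitnesses, k=3):
        """Pick one parent: sample k candidates at random, keep the fittest."""
        contenders = random.sample(range(len(population)), k)
        best = max(contenders, key=lambda i: fitnesses[i])
        return population[best]

    def select_next_parents(population, fitnesses, split_ratio=0.5, k=3):
        """Split-Ratio + K-Tournament selection (illustrative interpretation).

        A fraction `split_ratio` of parents is taken directly from the elite
        (highest fitness); the remainder is filled by K-tournament draws.
        """
        n = len(population)
        ranked = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
        n_elite = int(split_ratio * n)
        parents = [population[i] for i in ranked[:n_elite]]
        while len(parents) < n:
            parents.append(k_tournament(population, fitnesses, k))
        return parents

    # Toy usage: individuals are candidate ELM configurations, fitness is LID accuracy.
    pop = list(range(10))
    fit = [random.random() for _ in pop]
    print(select_next_parents(pop, fit))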

    Air quality forecasting using neural networks

    In this thesis project, a special type of neural network, the Extreme Learning Machine (ELM), is implemented to predict air quality from the air-quality time series itself and external meteorological records. A regularized version of ELM with linear components is chosen as the main prediction model. To take full advantage of this model, its hyper-parameters are studied and optimized. A set of variables is then selected (or constructed) to maximize the performance of ELM, with two different variable selection methods (wrapper and filtering) evaluated; the wrapper method, ELM-based forward selection, is chosen for the variable selection. Meanwhile, a feature extraction method (Principal Component Analysis) is applied to reduce the number of candidate meteorological variables for feature selection, which proves helpful. Finally, with all parameters properly optimized, ELM is used for the prediction and produces satisfactory results.
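
    As an illustration of the wrapper approach described above, the following sketch performs greedy forward variable selection driven by a cross-validated score; a ridge regressor stands in for the regularized ELM with linear components, and the synthetic data, variable count, and stopping rule are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import cross_val_score

    def forward_selection(X, y, max_vars=5, cv=5):
        """Greedy wrapper forward selection.

        At each step, add the candidate variable that most improves the
        cross-validated R^2 of the wrapped model (Ridge here as a stand-in
        for the regularized ELM used in the thesis).
        """
        selected, remaining = [], list(range(X.shape[1]))
        best_score = -np.inf
        while remaining and len(selected) < max_vars:
            scores = {
                j: cross_val_score(Ridge(alpha=1.0), X[:, selected + [j]], y, cv=cv).mean()
                for j in remaining
            }
            j_best = max(scores, key=scores.get)
            if scores[j_best] <= best_score:
                break                           # no candidate improves the model further
            best_score = scores[j_best]
            selected.append(j_best)
            remaining.remove(j_best)
        return selected, best_score

    # Toy usage with synthetic data standing in for pollutant and meteorological series.
    X = np.random.rand(300, 12)
    y = X[:, 0] * 2.0 + X[:, 3] + 0.1 * np.random.randn(300)
    print(forward_selection(X, y))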

    A Different Traditional Approach for Automatic Comparative Machine Learning in Multimodality Covid-19 Severity Recognition

    In March 2020, the World Health Organization declared a new infectious pandemic, “novel coronavirus disease” (Covid-19), whose origin dates back to World War II (1939) and which spread from the city of Wuhan in China in 2019. The severity of the outbreak affected the health of a great many people worldwide. This spurred the emergence of unimodal artificial intelligence approaches to the diagnosis of coronavirus disease, which on their own, however, produced a significant percentage of false-negative results. In this paper, we combined 2500 multimodal Covid-19 records using an Early Fusion Type-I (EFT1) architecture as a severity recognition model for the classification task. We designed and implemented one-step systems for automatic comparative machine learning (AutoCML) and automatic comparative machine learning based on important feature selection (AutoIFSCML), and applied our proposed assessment metric, the “Descended Composite Scores Average (DCSA)”. In AutoCML, Extreme Gradient Boosting (DCSA = 0.998) and in AutoIFSCML, Random Forest (DCSA = 0.960) demonstrated the best performance for multimodality Covid-19 severity recognition, while 70% of the features with high DCSA were chosen by the internal important-feature-selection system (AutoIFS) and passed to the AutoCML system. The DCSA-based systems can be useful for implementing fine-tuned machine learning models in medical workflows by exploiting the capacities and performance of each model across all methods. In addition, ensemble learning performed well among the evaluated traditional models in both systems.
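
    The comparative-ranking idea behind AutoCML can be sketched as follows: cross-validate several candidate models and sort them by a composite score in descending order. The composite used here simply averages accuracy, F1, and ROC-AUC as a stand-in for DCSA, whose exact formula is not reproduced; the candidate models and data are likewise illustrative (scikit-learn's gradient boosting replaces XGBoost to keep the sketch self-contained).

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_validate

    # Candidate models to compare automatically (illustrative shortlist).
    MODELS = {
        "logreg": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=200),
        "grad_boost": GradientBoostingClassifier(),
    }

    def composite_score(model, X, y, cv=5):
        """Average several metrics into one score (a stand-in for DCSA)."""
        res = cross_validate(model, X, y, cv=cv,
                             scoring=["accuracy", "f1", "roc_auc"])
        return np.mean([res["test_accuracy"].mean(),
                        res["test_f1"].mean(),
                        res["test_roc_auc"].mean()])

    # Synthetic data standing in for the fused multimodal Covid-19 records.
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    ranking = sorted(((composite_score(m, X, y), name) for name, m in MODELS.items()),
                     reverse=True)              # descending, as in "Descended ... Average"
    for score, name in ranking:
        print(f"{name}: {score:.3f}")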

    Well-log attributes assist in the determination of reservoir formation tops in wells with sparse well-log data

    The manual picking of reservoir formation boundaries from the limited well-log data available in multiple wells across gas and oil reservoirs tends to be subjective and unreliable, typically because of the combined effects of spatial boundary complexity and limited well-log data availability. Formation boundary characterization and classification can be improved when treated as a binary classification task based on two or three recorded well logs assisted by their calculated derivative and volatility attributes and assessed by machine learning. Two example wellbores penetrating a complex reservoir boundary, one with gamma-ray, compressional-sonic, and bulk-density logs recorded, the other with just gamma-ray and bulk-density logs recorded, are used to illustrate the proposed, more rigorous methodology. By combining attribute calculation, optimized feature selection, multi-k-fold cross validation, confusion matrices, feature-influence analysis, and machine learning models, it is possible to improve the classification of the formation boundary with just the gamma-ray and bulk-density recorded well logs plus selected attributes. K-nearest neighbour, support vector classification, and extreme gradient boosting machine learning models achieve high binary classification accuracy: greater than 0.97 for training/validation in one well and greater than 0.94 for testing in the other well. Extreme gradient boosting feature-influence analysis reveals the attributes that are most important in the formation boundary predictions, but these are likely to vary from reservoir to reservoir. The results of the study suggest that well-log attribute analysis combined with machine learning has the potential to provide a more systematic formation boundary definition than relying only on a few recorded well-log curves.
    Cited as: Wood, D. A. Well-log attributes assist in the determination of reservoir formation tops in wells with sparse well-log data. Advances in Geo-Energy Research, 2023, 8(1): 45-60. https://doi.org/10.46690/ager.2023.04.0
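
    A minimal sketch of the attribute-calculation step is shown below: derivative and volatility attributes are computed from gamma-ray and bulk-density curves and fed to a boosted-tree classifier that labels each depth sample as above or below the formation top. The attribute definitions (first difference and rolling standard deviation), the synthetic log values, the boundary depth, and the use of scikit-learn's gradient boosting in place of extreme gradient boosting are all illustrative assumptions.

    import numpy as np
    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier

    # Synthetic depth-indexed curves standing in for recorded gamma-ray (GR)
    # and bulk-density (RHOB) logs; values are random and purely illustrative.
    depth = np.arange(1000.0, 1100.0, 0.5)
    logs = pd.DataFrame({
        "GR": np.random.rand(len(depth)) * 150,
        "RHOB": 2.0 + np.random.rand(len(depth)) * 0.7,
    }, index=depth)

    # Derivative and volatility attributes: simple illustrative definitions
    # (first difference with depth, and a rolling standard deviation).
    for curve in ["GR", "RHOB"]:
        logs[f"{curve}_deriv"] = logs[curve].diff()
        logs[f"{curve}_vol"] = logs[curve].rolling(window=10).std()
    logs = logs.dropna()

    # Binary target: 1 below the formation top, 0 above it
    # (a made-up boundary at 1050 m for demonstration).
    y = (logs.index >= 1050.0).astype(int)

    clf = GradientBoostingClassifier().fit(logs.values, y)
    print("training accuracy:", clf.score(logs.values, y))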

    Groundwater prediction using machine-learning tools

    Predicting groundwater availability is important to water sustainability and drought mitigation. Machine-learning tools have the potential to improve groundwater prediction, enabling resource planners to: (1) anticipate water quality in unsampled areas or depth zones; (2) design targeted monitoring programs; (3) inform groundwater protection strategies; and (4) evaluate the sustainability of groundwater sources of drinking water. This paper proposes a machine-learning approach to groundwater prediction with the following characteristics: (i) a regression-based approach that predicts full groundwater images from sequences of monthly groundwater maps; (ii) strategic automatic selection of features (both local and global) using extreme gradient boosting; and (iii) the use of several machine-learning techniques (extreme gradient boosting, multivariate linear regression, random forests, multilayer perceptron, and support vector regression). Of these techniques, support vector regression consistently performed best in terms of minimizing root mean square error and mean absolute error. Furthermore, including a global feature obtained from a Gaussian Mixture Model produced models with lower error than the best that could be obtained with local geographical features alone.
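
    The pipeline of characteristics (i)-(iii) can be sketched roughly as follows: gradient-boosting feature importances select the most useful local features, a Gaussian Mixture Model supplies a global feature, and support vector regression makes the final prediction. The synthetic data, the number of retained features, the GMM log-likelihood used as the global feature, and the substitution of scikit-learn's gradient boosting for extreme gradient boosting are assumptions for illustration only.

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.mixture import GaussianMixture
    from sklearn.svm import SVR

    # Synthetic data standing in for flattened monthly groundwater maps:
    # each row is one month, each column one grid cell / local feature.
    X = np.random.rand(120, 50)
    y = X[:, :5].sum(axis=1) + 0.1 * np.random.randn(120)

    # (ii) feature selection via gradient-boosting importances.
    gb = GradientBoostingRegressor().fit(X, y)
    top = np.argsort(gb.feature_importances_)[::-1][:10]   # keep the 10 most important cells

    # A "global" feature from a Gaussian Mixture Model fitted over the monthly maps,
    # here simply each month's log-likelihood under a 3-component GMM.
    gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
    global_feat = gmm.score_samples(X).reshape(-1, 1)

    # (iii) final regressor: support vector regression on selected local + global features.
    X_sel = np.hstack([X[:, top], global_feat])
    svr = SVR(C=10.0).fit(X_sel, y)
    print("in-sample R^2:", svr.score(X_sel, y))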

    Modified Genetic Algorithm for Feature Selection and Hyper Parameter Optimization: Case of XGBoost in Spam Prediction

    Recently, spam on online social networks has attracted attention in the research and business worlds, and Twitter has become the preferred medium for spreading spam content. Many research efforts have attempted to counter social-network spam; Twitter adds extra challenges in the form of a large feature space and imbalanced data distributions. Related work usually addresses only part of these challenges or produces black-box models. In this paper, we propose a modified genetic algorithm for simultaneous dimensionality reduction and hyperparameter optimization over imbalanced datasets. The algorithm initializes an eXtreme Gradient Boosting classifier and reduces the feature space of a tweets dataset to generate a spam prediction model. The model is validated using 50-times-repeated 10-fold stratified cross-validation and analyzed using nonparametric statistical tests. The resulting prediction model attains on average 82.32% geometric mean and 92.67% accuracy while utilizing less than 10% of the total feature space. The empirical results show that the modified genetic algorithm outperforms Chi-squared and PCA feature selection methods. In addition, eXtreme Gradient Boosting outperforms many machine learning algorithms, including a BERT-based deep learning model, in spam prediction. Furthermore, the proposed approach is applied to SMS spam modeling and compared to related works.
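
    A compressed sketch of the underlying idea, encoding a feature mask and hyperparameter choices in a single chromosome and scoring by the geometric mean under stratified cross-validation, is given below. The hyperparameter grids, GA settings, mutation rate, and the use of scikit-learn's gradient boosting in place of eXtreme Gradient Boosting are illustrative assumptions, not the modified algorithm itself.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import recall_score
    from sklearn.model_selection import StratifiedKFold

    rng = np.random.default_rng(0)
    # Synthetic imbalanced data standing in for the tweets dataset.
    X, y = make_classification(n_samples=600, n_features=40, weights=[0.85, 0.15], random_state=0)

    DEPTHS, RATES = [2, 3, 4, 6], [0.05, 0.1, 0.2, 0.3]   # illustrative hyperparameter grids

    def random_chromosome():
        # Chromosome = binary feature mask + two indices into the hyperparameter grids.
        return np.concatenate([rng.integers(0, 2, X.shape[1]), rng.integers(0, 4, 2)])

    def fitness(chrom, cv=3):
        mask = chrom[:X.shape[1]].astype(bool)
        if not mask.any():
            return 0.0
        clf = GradientBoostingClassifier(max_depth=DEPTHS[chrom[-2]], learning_rate=RATES[chrom[-1]])
        gmeans = []
        for tr, te in StratifiedKFold(cv, shuffle=True, random_state=0).split(X, y):
            pred = clf.fit(X[tr][:, mask], y[tr]).predict(X[te][:, mask])
            rec = recall_score(y[te], pred, average=None)  # per-class recall
            gmeans.append(np.sqrt(rec.prod()))             # geometric mean handles imbalance
        return float(np.mean(gmeans))

    # A very small GA loop: tournament selection, single-point crossover, bit-flip mutation.
    pop = [random_chromosome() for _ in range(10)]
    for _ in range(5):
        fits = [fitness(c) for c in pop]
        parents = [pop[max(rng.choice(10, 3), key=lambda i: fits[i])] for _ in range(10)]
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = int(rng.integers(1, a.size))
            for child in (np.concatenate([a[:cut], b[cut:]]),
                          np.concatenate([b[:cut], a[cut:]])):
                flip = rng.random(X.shape[1]) < 0.02       # mutate the feature mask only
                child[:X.shape[1]][flip] ^= 1
                children.append(child)
        pop = children
    print("best g-mean:", max(fitness(c) for c in pop))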