715 research outputs found

    An Evolutionary Optimization Algorithm for Automated Classical Machine Learning

    Get PDF
    Machine learning is an evolving branch of computational algorithms that allow computers to learn from experiences, make predictions, and solve different problems without being explicitly programmed. However, building a useful machine learning model is a challenging process, requiring human expertise to perform various proper tasks and ensure that the machine learning\u27s primary objective --determining the best and most predictive model-- is achieved. These tasks include pre-processing, feature selection, and model selection. Many machine learning models developed by experts are designed manually and by trial and error. In other words, even experts need the time and resources to create good predictive machine learning models. The idea of automated machine learning (AutoML) is to automate a machine learning pipeline to release the burden of substantial development costs and manual processes. The algorithms leveraged in these systems have different hyper-parameters. On the other hand, different input datasets have various features. In both cases, the final performance of the model is closely related to the final selected configuration of features and hyper-parameters. That is why they are considered as crucial tasks in the AutoML. The challenges regarding the computationally expensive nature of tuning hyper-parameters and optimally selecting features create significant opportunities for filling the research gaps in the AutoML field. This dissertation explores how to select the features and tune the hyper-parameters of conventional machine learning algorithms efficiently and automatically. To address the challenges in the AutoML area, novel algorithms for hyper-parameter tuning and feature selection are proposed. The hyper-parameter tuning algorithm aims to provide the optimal set of hyper-parameters in three conventional machine learning models (Random Forest, XGBoost and Support Vector Machine) to obtain best scores regarding performance. On the other hand, the feature selection algorithm looks for the optimal subset of features to achieve the highest performance. Afterward, a hybrid framework is designed for both hyper-parameter tuning and feature selection. The proposed framework can discover close to the optimal configuration of features and hyper-parameters. The proposed framework includes the following components: (1) an automatic feature selection component based on artificial bee colony algorithms and machine learning training, and (2) an automatic hyper-parameter tuning component based on artificial bee colony algorithms and machine learning training for faster training and convergence of the learning models. The whole framework has been evaluated using four real-world datasets in different applications. This framework is an attempt to alleviate the challenges of hyper-parameter tuning and feature selection by using efficient algorithms. However, distributed processing, distributed learning, parallel computing, and other big data solutions are not taken into consideration in this framework

    A Two-Stage Optimization Strategy for Fuzzy Object-Based Analysis Using Airborne LiDAR and High-Resolution Orthophotos for Urban Road Extraction

    Full text link
    Copyright © 2017 Maher Ibrahim Sameen and Biswajeet Pradhan. In the last decade, object-based image analysis (OBIA) has been extensively recognized as an effective classification method for very high spatial resolution images or integrated data from different sources. In this study, a two-stage optimization strategy for fuzzy object-based analysis using airborne LiDAR was proposed for urban road extraction. The method optimizes the two basic steps of OBIA, namely, segmentation and classification, to realize accurate land cover mapping and urban road extraction. This objective was achieved by selecting the optimum scale parameter to maximize class separability and the optimum shape and compactness parameters to optimize the final image segments. Class separability was maximized using the Bhattacharyya distance algorithm, whereas image segmentation was optimized using the Taguchi method. The proposed fuzzy rules were created based on integrated data and expert knowledge. Spectral, spatial, and texture features were used under fuzzy rules by implementing the particle swarm optimization technique. The proposed fuzzy rules were easy to implement and were transferable to other areas. An overall accuracy of 82% and a kappa index of agreement (KIA) of 0.79 were achieved on the studied area when results were compared with reference objects created via manual digitization in a geographic information system. The accuracy of road extraction using the developed fuzzy rules was 0.76 (producer), 0.85 (user), and 0.72 (KIA). Meanwhile, overall accuracy was decreased by approximately 6% when the rules were applied on a test site. A KIA of 0.70 was achieved on the test site using the same rules without any changes. The accuracy of the extracted urban roads from the test site was 0.72 (KIA), which decreased to approximately 0.16. Spatial information (i.e., elongation) and intensity from LiDAR were the most interesting properties for urban road extraction. The proposed method can be applied to a wide range of real applications through remote sensing by transferring object-based rules to other areas using optimization techniques

    Context-Aware Clustering and the Optimized Whale Optimization Algorithm: An Effective Predictive Model for the Smart Grid

    Get PDF
    For customers to participate in key peak pricing, period-of-use fees, and individualized responsiveness to demand programmes taken from multi-dimensional data flows, energy use projection and analysis must be done well. However, it is a difficult study topic to ascertain the knowledge of use of electricity as recorded in the electricity records' Multi-Dimensional Data Streams (MDDS). Context-Aware Clustering (CAC) and the Optimized Whale Optimization Algorithm were suggested by researchers as a fresh power usage knowledge finding model from the multi-dimensional data streams (MDDS) to resolve issue (OWOA). The proposed CAC-OWOA framework first performs the data cleaning to handle the noisy and null elements. The predictive features are extracted from the novel context-aware group formation algorithm using the statistical context parameters from the pre-processed MDDS electricity logs. To perform the energy consumption prediction, researchers have proposed the novel Artificial Neural Network (ANN) predictive algorithm using the bio-inspired optimization algorithm called OWOA. The OWOA is the modified algorithm of the existing WOA to overcome the problems of slow convergence speed and easily falling into the local optimal solutions. The ANN training method is used in conjunction with the suggested bio-inspired OWOA algorithm to lower error rates and boost overall prediction accuracy. The efficiency of the CAC-OWOA framework is evaluated using the publicly available smart grid electricity consumption logs. The experimental results demonstrate the effectiveness of the CAC-OWOA framework in terms of forecasting accuracy, precision, recall, and duration when compared to underlying approaches

    Computing Adaptive Feature Weights with PSO to Improve Android Malware Detection

    Get PDF
    © 2017 Yanping Xu et al. Android malware detection is a complex and crucial issue. In this paper, we propose a malware detection model using a support vector machine (SVM) method based on feature weights that are computed by information gain (IG) and particle swarm optimization (PSO) algorithms. The IG weights are evaluated based on the relevance between features and class labels, and the PSO weights are adaptively calculated to result in the best fitness (the performance of the SVM classification model). Moreover, to overcome the defects of basic PSO, we propose a new adaptive inertia weight method called fitness-based and chaotic adaptive inertia weight-PSO (FCAIW-PSO) that improves on basic PSO and is based on the fitness and a chaotic term. The goal is to assign suitable weights to the features to ensure the best Android malware detection performance. The results of experiments indicate that the IG weights and PSO weights both improve the performance of SVM and that the performance of the PSO weights is better than that of the IG weights

    FAKE NEWS DETECTION ON THE WEB: A DEEP LEARNING BASED APPROACH

    Get PDF
    The acceptance and popularity of social media platforms for the dispersion and proliferation of news articles have led to the spread of questionable and untrusted information (in part) due to the ease by which misleading content can be created and shared among the communities. While prior research has attempted to automatically classify news articles and tweets as credible and non-credible. This work complements such research by proposing an approach that utilizes the amalgamation of Natural Language Processing (NLP), and Deep Learning techniques such as Long Short-Term Memory (LSTM). Moreover, in Information System’s paradigm, design science research methodology (DSRM) has become the major stream that focuses on building and evaluating an artifact to solve emerging problems. Hence, DSRM can accommodate deep learning-based models with the availability of adequate datasets. Two publicly available datasets that contain labeled news articles and tweets have been used to validate the proposed model’s effectiveness. This work presents two distinct experiments, and the results demonstrate that the proposed model works well for both long sequence news articles and short-sequence texts such as tweets. Finally, the findings suggest that the sentiments, tagging, linguistics, syntactic, and text embeddings are the features that have the potential to foster fake news detection through training the proposed model on various dimensionality to learn the contextual meaning of the news content

    Optimization with Constraint Learning: A Framework and Survey

    Full text link
    Many real-life optimization problems frequently contain one or more constraints or objectives for which there are no explicit formulas. If data is however available, these data can be used to learn the constraints. The benefits of this approach are clearly seen, however there is a need for this process to be carried out in a structured manner. This paper therefore provides a framework for Optimization with Constraint Learning (OCL) which we believe will help to formalize and direct the process of learning constraints from data. This framework includes the following steps: (i) setup of the conceptual optimization model, (ii) data gathering and preprocessing, (iii) selection and training of predictive models, (iv) resolution of the optimization model, and (v) verification and improvement of the optimization model. We then review the recent OCL literature in light of this framework, and highlight current trends, as well as areas for future research

    Optimisation of Mobile Communication Networks - OMCO NET

    Get PDF
    The mini conference “Optimisation of Mobile Communication Networks” focuses on advanced methods for search and optimisation applied to wireless communication networks. It is sponsored by Research & Enterprise Fund Southampton Solent University. The conference strives to widen knowledge on advanced search methods capable of optimisation of wireless communications networks. The aim is to provide a forum for exchange of recent knowledge, new ideas and trends in this progressive and challenging area. The conference will popularise new successful approaches on resolving hard tasks such as minimisation of transmit power, cooperative and optimal routing

    Big Data - Supply Chain Management Framework for Forecasting: Data Preprocessing and Machine Learning Techniques

    Full text link
    This article intends to systematically identify and comparatively analyze state-of-the-art supply chain (SC) forecasting strategies and technologies. A novel framework has been proposed incorporating Big Data Analytics in SC Management (problem identification, data sources, exploratory data analysis, machine-learning model training, hyperparameter tuning, performance evaluation, and optimization), forecasting effects on human-workforce, inventory, and overall SC. Initially, the need to collect data according to SC strategy and how to collect them has been discussed. The article discusses the need for different types of forecasting according to the period or SC objective. The SC KPIs and the error-measurement systems have been recommended to optimize the top-performing model. The adverse effects of phantom inventory on forecasting and the dependence of managerial decisions on the SC KPIs for determining model performance parameters and improving operations management, transparency, and planning efficiency have been illustrated. The cyclic connection within the framework introduces preprocessing optimization based on the post-process KPIs, optimizing the overall control process (inventory management, workforce determination, cost, production and capacity planning). The contribution of this research lies in the standard SC process framework proposal, recommended forecasting data analysis, forecasting effects on SC performance, machine learning algorithms optimization followed, and in shedding light on future research

    Benchmark of Bayesian Optimization and Metaheuristics for Control Engineering Tuning Problems with Crash Constraints

    Full text link
    Controller tuning based on black-box optimization allows to automatically tune performance-critical parameters w.r.t. mostly arbitrary high-level closed-loop control objectives. However, a comprehensive benchmark of different black-box optimizers for control engineering problems has not yet been conducted. Therefore, in this contribution, 11 different versions of Bayesian optimization (BO) are compared with seven metaheuristics and other baselines on a set of ten deterministic simulative single-objective tuning problems in control. Results indicate that deterministic noise, low multimodality, and substantial areas with infeasible parametrizations (crash constraints) characterize control engineering tuning problems. Therefore, a flexible method to handle crash constraints with BO is presented. A resulting increase in sample efficiency is shown in comparison to standard BO. Furthermore, benchmark results indicate that pattern search (PS) performs best on a budget of 25 d objective function evaluations and a problem dimensionality d of d = 2. Bayesian adaptive direct search, a combination of BO and PS, is shown to be most sample efficient for 3 <= d <= 5. Using these optimizers instead of random search increases controller performance by on average 6.6% and up to 16.1%.Comment: 13 pages, 9 figure
    • …
    corecore