6 research outputs found

    Evolutionary Computation, Optimization and Learning Algorithms for Data Science

    A large number of engineering, science, and computational problems have yet to be solved in a computationally efficient way. One emerging challenge is how evolving technologies grow towards autonomy and intelligent decision making. This leads to the collection of large amounts of data from various sensing and measurement technologies, e.g., cameras, smartphones, health sensors, smart electricity meters, and environment sensors. Hence, it is imperative to develop efficient algorithms for the generation, analysis, classification, and illustration of data. Meanwhile, data is structured purposefully through different representations, such as large-scale networks and graphs. We focus on data science as a crucial area, and specifically on the curse of dimensionality (CoD), which arises from the large amount of generated, sensed, or collected data. This motivates researchers to think about optimization and to apply nature-inspired algorithms, such as evolutionary algorithms (EAs), to solve optimization problems. Although these algorithms are non-deterministic, they are robust enough to reach an optimal solution. Researchers typically adopt evolutionary algorithms only when they face a problem that tends to get trapped in a local optimum rather than reaching the global optimum. In this chapter, we first develop a clear and formal definition of the CoD problem; next, we focus on feature extraction techniques and categories; then we provide a general overview of meta-heuristic algorithms, their terminology, and the desirable properties of evolutionary algorithms.
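One commonly cited symptom of the curse of dimensionality mentioned in this abstract is that distances between random points become nearly indistinguishable as the number of dimensions grows. The sketch below is an illustration only and is not taken from the chapter; the function name `relative_contrast`, the uniform sampling, and the sample size are all assumptions made for the example.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Points are drawn uniformly from the unit hypercube [0, 1]^dim.
    A small value means the nearest and farthest neighbours are almost
    equally far away, which weakens distance-based methods.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    for dim in (2, 10, 100, 500):
        print(dim, round(relative_contrast(dim), 3))
```

Running it for increasing `dim` should show the contrast shrinking, which is one informal way to see why feature selection or extraction is attractive before applying distance-based learners.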

    Malware Detection using Artificial Bee Colony Algorithm

    Malware detection has become a challenging task due to the increase in the number of malware families. Universal malware detection algorithms that can detect all malware families are needed to make the whole process feasible. However, the more universal an algorithm is, the higher the number of feature dimensions it needs to work with, and that inevitably causes the Curse of Dimensionality (CoD) problem. Moreover, the real-time nature of malware analysis makes such a solution difficult to put into practice. In this paper, we address this problem and propose a feature-selection-based malware detection algorithm that uses an evolutionary algorithm referred to as Artificial Bee Colony (ABC). The proposed algorithm enables researchers to decrease the feature dimension and, as a result, speed up the process of malware detection. The experimental results reveal that the proposed method outperforms the state of the art.

    Improvement on KNN using genetic algorithm and combined feature extraction to identify COVID-19 sufferers based on CT scan image

    Coronavirus disease 2019 (COVID-19) has spread throughout the world. Detection of this disease is usually carried out using the reverse transcriptase polymerase chain reaction (RT-PCR) swab test. However, limited resources became an obstacle to carrying out testing at massive scale. To address this problem, computerized tomography (CT) scan images are used as one way to detect sufferers. This technique has been used by researchers, but mostly with classifiers that require high resources, such as convolutional neural networks (CNNs). In this study, we propose a way to classify CT scan images using a more efficient classifier, k-nearest neighbors (KNN), on images processed with a combination of three feature extraction methods: Haralick, histogram, and local binary pattern. A genetic algorithm is also used for feature selection. The results showed that the proposed method was able to improve KNN performance, with the best accuracy of 93.30% for the combination of Haralick and local binary pattern feature extraction, and the best area under the curve (AUC) of 0.948 for the combination of Haralick, histogram, and local binary pattern. The best accuracy of our models also outperforms CNN by a 4.3% margin.
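The pairing described in this abstract, a genetic algorithm choosing a feature subset that a KNN classifier then scores, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is synthetic (two informative features plus noise, standing in for the Haralick, histogram, and local binary pattern features), and the names `make_data`, `knn_accuracy`, and `ga_select`, along with every parameter value, are hypothetical.

```python
import random


def make_data(n_samples=40, n_noise=4, seed=1):
    """Synthetic stand-in data: 2 informative features followed by noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        informative = [label * 2.0 + rng.gauss(0, 0.5) for _ in range(2)]
        noise = [rng.gauss(0, 3.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


def knn_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN restricted to features with mask bit 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = [label for _, label in dists[:k]]
        pred = max(set(votes), key=votes.count)  # majority vote
        correct += pred == y[i]
    return correct / len(X)


def ga_select(X, y, pop_size=12, generations=10, p_mut=0.1, seed=0):
    """Evolve a 0/1 feature mask; fitness is the k-NN accuracy above."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        return knn_accuracy(X, y, mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        children = [ranked[0][:]]              # elitism: keep the top mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
        best = max(pop + [best], key=fitness)
    return best, fitness(best)


if __name__ == "__main__":
    X, y = make_data()
    mask, acc = ga_select(X, y)
    print("selected mask:", mask, "LOO accuracy:", acc)
```

The wrapper design (scoring each candidate mask with the classifier itself) is one common choice; the abstract does not say which selection, crossover, or mutation operators the study actually used.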

    Exploring the Time-efficient Evolutionary-based Feature Selection Algorithms for Speech Data under Stressful Work Condition

    A primary goal of Machine Learning (ML) advancements is shorter computation time and lower resource usage, yet the curse of dimensionality burdens both. This paper describes the benefits of Feature Selection Algorithms (FSAs) for speech data under workload stress. FSAs reduce both data dimension and computation time while retaining the speech information. We chose to use the robust Evolutionary Algorithm, Harmony Search, Principal Component Analysis, Genetic Algorithm, Particle Swarm Optimization, Ant Colony Optimization, and Bee Colony Optimization, which are then evaluated using hierarchical machine learning models. These FSAs are explored with conversational workload stress data from a Customer Service hotline, which receives daily complaints that trigger stress in speaking. Furthermore, we employed precisely 223 acoustic-based features. Using Random Forest, our evaluation showed that computation time had improved 3.6 faster than with the original 223 features. Evaluation using Support Vector Machine achieved the lowest computation time, 0.001 seconds.

    Integrating supercomputing clusters into education: a case study in biotechnology

    The integration of a supercomputer into the educational process improves students' technological skills. The aim of the paper is to study the interaction between science, technology, engineering, and mathematics (STEM) and non-STEM subjects in order to develop a course of study related to Supercomputing training. We propose a flowchart of the process to improve the performance of students attending courses related to Supercomputing. As a final result, this study analyzes information on the use of HPC infrastructures in higher-education courses, obtained through a questionnaire that captures students' attitudes, beliefs, and evaluations. The results help us to understand how collaboration between institutions enhances outcomes in the education context. The conclusion describes the resources needed to improve Supercomputing Education (SE) and proposes future research directions. 2018-1-ES01-KA201-05093SI; Comisión Europea; Ministerio de Ciencia e Innovación; Ministerio de Economía y Competitividad; Fundación Centro de Supercomputación de Castilla y Leó

    Análisis y evaluación del uso de la supercomputación en la mejora del desempeño formativo = Analysis and evaluation of supercomputing for training performance improvement

    205 p. Supercomputing resources are currently the fundamental pillar for the development of research in various fields. Their impact rests on their computing capacity, which makes it possible to run computational simulations that improve the precision of experiments. This doctoral thesis aims, first, to study the evolution of supercomputing and its application to various fields and, subsequently, to study the determining factors for analyzing the most relevant aspects of the relationship between supercomputing studies and pedagogical, knowledge, and content aspects, based on the TPACK model. The study was carried out with information from the student database of the Centro de Supercomputación de Castilla y León (SCAYLE), from which 97 participants were obtained. A factor analysis was performed to verify that the resulting data structure was consistent with the TPACK model used as a reference. The results of the analysis relate the technological dimensions to the knowledge, pedagogical, and content dimensions.