596 research outputs found

    Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives

    The term big data characterizes the massive amounts of data generated by advanced technologies in different domains. It is described by the 4Vs (volume, velocity, variety, and veracity), which indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) but a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features. Many domains, including finance, lack the analytic tools to mine the data for knowledge discovery because of the high dimensionality. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.

    Literature Review on Big Data Analytics Methods

    Companies and industries are faced with a huge amount of raw data, which hold information and knowledge in their hidden layers. The format, size, variety, and velocity of generated data also make it complex for industries to use them efficiently and effectively. This complexity in data analysis and interpretation leads organizations to deploy advanced tools and techniques to overcome the difficulties of managing raw data. Big data analytics is an advanced method capable of managing such data. It deploys machine learning (ML) techniques and deep learning (DL) methods to benefit from gathered data. In this research, both ML and DL methods are discussed, and an ML/DL deployment model for IoT data is proposed.

    Parallel ant colony optimization for the training of cell signaling networks

    Acquiring a functional comprehension of the deregulation of cell signaling networks in disease allows progress in the development of new therapies and drugs. Computational models are becoming increasingly popular as a systematic tool to analyze the functioning of complex biochemical networks, such as those involved in cell signaling. CellNOpt is a framework to build predictive logic-based models of signaling pathways by training a prior knowledge network to biochemical data obtained from perturbation experiments. This training can be formulated as an optimization problem that can be solved using metaheuristics. However, the genetic algorithm used so far in CellNOpt presents limitations in terms of execution time and quality of solutions when applied to large instances. Thus, in order to overcome those issues, in this paper we propose the use of a method based on ant colony optimization, adapted to the problem at hand and parallelized using a hybrid approach. The performance of this novel method is illustrated with several challenging benchmark problems in the study of new therapies for liver cancer.

    An Efficient Ant Colony Optimization Framework for HPC Environments

    Funded for open access publication: Universidade da Coruña/CISUG. Combinatorial optimization problems arise in many disciplines, both in the basic sciences and in applied fields such as engineering and economics. One of the most popular combinatorial optimization methods is the Ant Colony Optimization (ACO) metaheuristic. Its parallel nature makes it especially attractive for implementation and execution in High Performance Computing (HPC) environments. Here we present a novel parallel ACO strategy making use of efficient asynchronous decentralized cooperative mechanisms. This strategy seeks to fulfill two objectives: (i) acceleration of the computations by performing the ants’ solution construction in parallel; (ii) convergence improvement through the stimulation of the diversification in the search and the cooperation between different colonies. The two main features of the proposal, decentralization and desynchronization, enable a more effective and efficient response in environments where resources are highly coupled. Examples of such infrastructures include traditional HPC clusters as well as new distributed environments, such as cloud infrastructures, or even local computer networks. The proposal has been evaluated using the popular Traveling Salesman Problem (TSP), a well-known NP-hard problem widely used in the literature to test combinatorial optimization methods. An exhaustive evaluation has been carried out using three medium and large size instances from the TSPLIB library, and the experiments show encouraging results with superlinear speedups compared to the sequential algorithm (e.g. speedups of 18 with 16 cores), and very good scalability (experiments were performed with up to 384 cores, improving execution time even at that scale). This work was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00 / AEI / 10.13039/501100011033), and by Xunta de Galicia and FEDER funds of the EU (Centro de Investigación de Galicia accreditation 2019–2022, ref. ED431G 2019/01; Consolidation Program of Competitive Reference Groups, ref. ED431C 2021/30). JRB acknowledges funding from the Ministry of Science and Innovation of Spain MCIN / AEI / 10.13039/501100011033 through grant PID2020-117271RB-C22 (BIODYNAMICS), and from MCIN / AEI / 10.13039/501100011033 and “ERDF A way of making Europe” through grant DPI2017-82896-C2-2-R (SYNBIOCONTROL). Authors also acknowledge the Galician Supercomputing Center (CESGA) for the access to its facilities.

    DYNAMIC THRESHOLDING GA-BASED ECG FEATURE SELECTION IN CARDIOVASCULAR DISEASE DIAGNOSIS

    Electrocardiogram (ECG) data are usually used to diagnose cardiovascular disease (CVD) with the help of a revolutionary algorithm. Feature selection is a crucial step in the development of accurate and reliable diagnostic models for CVDs. This research introduces the dynamic threshold genetic algorithm (DTGA), a type of genetic algorithm used for optimization problems, and discusses its use in the context of feature selection. The research reports the success of DTGA in selecting relevant ECG features that ultimately enhance accuracy and efficiency in the diagnosis of CVD. The work also reports the benefits of employing DTGA in clinical practice, including a reduction in the time spent diagnosing patients and an increase in the precision with which individuals at risk of CVD can be identified.

    Task Scheduling with Altered Grey Wolf Optimization (AGWO) in Mobile Cloud Computing using Cloudlet

    Mobile devices can improve their battery life by offloading their tasks to a nearby cloudlet instead of executing them locally, because mobile devices have low-speed processors, small memory, and limited battery. As mobile devices move, they connect to and disconnect from cloudlets, so their tasks are offloaded to new cloudlets and migrated from one cloudlet to another until the tasks finish executing. Scheduling these tasks in the cloudlet with the proposed method (AGWO) reduces the tasks' execution time and the mobile device's power consumption. The GWO algorithm is modified to accept inputs from a two-dimensional array instead of sequential inputs, and to search for the prey within the two-dimensional array instead of an unknown circular area. The method takes into account the arrival time of the task, the task size, and big tasks. The dynamic migration of partially executed tasks to other VMs is also examined. The proposed method also reduces the average scheduling delay and increases the percentage of requests executed by the cloudlet relative to other variations of GWO and other research algorithms.

    Review and Classification of Bio-inspired Algorithms and Their Applications

    Scientists have long looked to nature and biology in order to understand and model solutions for complex real-world problems. The study of bionics bridges the functions, biological structures, and organizational principles found in nature with our modern technologies. Numerous mathematical and metaheuristic algorithms have been developed along with the process of transferring knowledge from lifeforms to human technologies. The output of bionics study includes not only physical products, but also various optimization computation methods that can be applied in different areas. Related algorithms can broadly be divided into four groups: evolutionary-based bio-inspired algorithms, swarm intelligence-based bio-inspired algorithms, ecology-based bio-inspired algorithms, and multi-objective bio-inspired algorithms. Bio-inspired algorithms such as neural networks, ant colony algorithms, particle swarm optimization, and others have been applied in almost every area of science, engineering, and business management, with a dramatic increase in the number of relevant publications. This paper provides a systematic, pragmatic, and comprehensive review of the latest developments in evolutionary-based, swarm intelligence-based, ecology-based, and multi-objective bio-inspired algorithms.

    A Lite Fireworks Algorithm with Fractal Dimension Constraint for Feature Selection

    As the use of robotics becomes more widespread, the huge amount of vision data leads to a dramatic increase in data dimensionality. Although deep learning methods can effectively process these high-dimensional vision data, some special scenarios still rely on traditional machine learning methods because of limited computational resources. High-dimensional visual data, however, pose great challenges for traditional machine learning methods. Therefore, we propose a Lite Fireworks Algorithm with Fractal Dimension constraint for feature selection (LFWA+FD) and use it to solve the feature selection problem driven by robot vision. LFWA+FD searches for the ideal feature subset by simplifying the fireworks algorithm and constraining the dimensionality of the selected features by fractal dimensionality, which in turn reduces the approximate features and the noise in the original data to improve the accuracy of the model. Comparative experimental results on two publicly available datasets from UCI show that the proposed method can effectively select a subset of features useful for model inference and remove a large amount of the noise present in the original data to improve performance. Comment: International Conference on Pharmaceutical Sciences 202