Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction
This paper proposes a genetic programming (GP) framework for two major data mining tasks, namely classification and generalized rule induction. The framework emphasizes the integration between a GP algorithm and relational database systems. In particular, the fitness of individuals is computed by submitting SQL queries to a (parallel) database server. From a data mining viewpoint, some advantages of this integration are scalability, data-privacy control and automatic parallelization.
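The idea of computing fitness through SQL queries can be illustrated with a minimal sketch. It is not the framework from the paper: the table `patients`, its columns, the helper `rule_fitness` and the choice of coverage and confidence as fitness components are all assumptions made for illustration, and an in-memory SQLite database stands in for the (parallel) database server.

```python
import sqlite3


def rule_fitness(conn, table, condition, target_column, target_value):
    """Score a candidate rule 'IF condition THEN target_column = target_value'.

    Returns (coverage, confidence). All counting is done by the database.
    The condition is interpolated into the SQL text, so it must be produced
    by the GP system's own grammar, never taken from untrusted input.
    """
    query = (
        f"SELECT COUNT(*), "
        f"COALESCE(SUM(CASE WHEN {target_column} = ? THEN 1 ELSE 0 END), 0) "
        f"FROM {table} WHERE {condition}"
    )
    covered, correct = conn.execute(query, (target_value,)).fetchone()
    confidence = correct / covered if covered else 0.0
    return covered, confidence


def demo():
    """Build a tiny hypothetical table and score one hand-written rule."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (age INTEGER, smoker INTEGER, risk TEXT)")
    rows = [
        (25, 0, "low"),
        (62, 1, "high"),
        (58, 1, "high"),
        (47, 1, "low"),
        (33, 0, "low"),
    ]
    conn.executemany("INSERT INTO patients VALUES (?, ?, ?)", rows)
    return rule_fitness(conn, "patients", "age > 40 AND smoker = 1", "risk", "high")


if __name__ == "__main__":
    print(demo())
```

Because the database does the counting, the GP process never needs to load the raw records, which is one way to read the scalability and data-privacy advantages the abstract mentions.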
Genetic Algorithm (GA) in Feature Selection for CRF Based Manipuri Multiword Expression (MWE) Identification
This paper deals with the identification of Multiword Expressions (MWEs) in
Manipuri, a highly agglutinative Indian Language. Manipuri is listed in the
Eighth Schedule of the Indian Constitution. MWEs play an important role in
applications of Natural Language Processing (NLP) such as Machine Translation, Part
of Speech tagging, Information Retrieval, Question Answering etc. Feature
selection is an important factor in the recognition of Manipuri MWEs using
Conditional Random Field (CRF). The drawbacks of manually selecting
appropriate features for the CRF motivate the use of a
Genetic Algorithm (GA). Using the GA, we are able to find the optimal features to
run the CRF. We ran fifty generations of feature selection, with
three-fold cross-validation as the fitness function. This model achieved
a Recall (R) of 64.08%, Precision (P) of 86.84% and F-measure (F) of 73.74%,
showing an improvement over the CRF based Manipuri MWE identification without
GA application.
Comment: 14 pages, 6 figures, see
http://airccse.org/journal/jcsit/1011csit05.pd
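The feature-selection loop described in the abstract above can be sketched roughly as follows. This is a generic GA over feature bitmasks, not the authors' implementation: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), the parameter defaults other than the fifty generations, and the stand-in `toy_fitness` are assumptions. In the paper's setting, `evaluate` would instead train a CRF on the selected features and return its three-fold cross-validation score.

```python
import random


def ga_feature_selection(evaluate, n_features, generations=50, pop_size=20,
                         mutation_rate=0.05, seed=0):
    """Search for a feature bitmask that maximises evaluate(mask).

    Requires n_features >= 2. Returns (best_mask, best_score, history),
    where history holds the best score at the start of each generation.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        history.append(evaluate(ranked[0]))
        next_pop = [ranked[0][:]]  # elitism: the best mask always survives
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(ranked, 3), key=evaluate)
            p2 = max(rng.sample(ranked, 3), key=evaluate)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n_features - 1)
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        population = next_pop
    best = max(population, key=evaluate)
    return best, evaluate(best), history


# Hypothetical stand-in for the cross-validated CRF score: features 0, 2 and 5
# are "useful", and every other selected feature costs half a point.
USEFUL = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(mask[i] for i in USEFUL)
    noise = sum(mask) - hits
    return hits - 0.5 * noise


if __name__ == "__main__":
    mask, score, _ = ga_feature_selection(toy_fitness, n_features=8)
    print(mask, score)
```

With a real classifier, each call to `evaluate` is expensive, so caching scores per bitmask would be a sensible addition.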
PasMoQAP: A Parallel Asynchronous Memetic Algorithm for solving the Multi-Objective Quadratic Assignment Problem
Multi-Objective Optimization Problems (MOPs) have attracted growing attention
during the last decades. Multi-Objective Evolutionary Algorithms (MOEAs) have
been extensively used to address MOPs because they are able to approximate a set of
non-dominated high-quality solutions. The Multi-Objective Quadratic Assignment
Problem (mQAP) is a MOP. The mQAP is a generalization of the classical QAP
which has been extensively studied, and used in several real-life applications.
The mQAP is defined as having as input several flows between the facilities
which generate multiple cost functions that must be optimized simultaneously.
In this study, we propose PasMoQAP, a parallel asynchronous memetic algorithm
to solve the Multi-Objective Quadratic Assignment Problem. PasMoQAP is based on
an island model that structures the population by creating sub-populations. The
memetic algorithm on each island individually evolves a reduced population of
solutions, and the islands asynchronously cooperate by sending selected solutions to
the neighboring islands. The experimental results show that our approach
significantly outperforms all the island-based variants of the
multi-objective evolutionary algorithm NSGA-II. We show that PasMoQAP is a
suitable alternative to solve the Multi-Objective Quadratic Assignment Problem.
Comment: 8 pages, 3 figures, 2 tables. Accepted at Conference on Evolutionary
Computation 2017 (CEC 2017)
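To make the problem statement in the abstract above concrete, here is a small sketch of how one assignment yields several costs, one per flow matrix, and how non-dominated solutions are identified. It covers only objective evaluation and Pareto dominance, not PasMoQAP's island model or memetic search. The matrices `DISTANCES` and `FLOWS` are made-up example data, and the cost formula is the usual QAP one (flow between two facilities times the distance between their assigned locations), taken here as an assumption about the formulation.

```python
from itertools import permutations

# Hypothetical 3-location instance with two flow matrices (two objectives).
DISTANCES = [[0, 1, 2],
             [1, 0, 3],
             [2, 3, 0]]
FLOWS = [
    [[0, 5, 0], [5, 0, 1], [0, 1, 0]],
    [[0, 0, 4], [0, 0, 2], [4, 2, 0]],
]


def mqap_costs(perm, distances, flows):
    """Return one cost per flow matrix; perm[i] is the location of facility i."""
    n = len(perm)
    return tuple(
        sum(flow[i][j] * distances[perm[i]][perm[j]]
            for i in range(n) for j in range(n))
        for flow in flows
    )


def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the cost vectors that no other vector in points dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Exhaustive enumeration is only feasible for toy sizes like this one.
    all_costs = {mqap_costs(p, DISTANCES, FLOWS) for p in permutations(range(3))}
    print(sorted(non_dominated(list(all_costs))))
```

For realistic instance sizes the permutation space cannot be enumerated, which is why the abstract turns to evolutionary search that only approximates the non-dominated set.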
A Survey on Software Testing Techniques using Genetic Algorithm
The overall aim of the software industry is to ensure delivery of high
quality software to the end user. To ensure high quality software, it is
required to test software. Testing ensures that software meets user
specifications and requirements. However, the field of software testing has a
number of underlying issues like effective generation of test cases,
prioritisation of test cases, etc., which need to be tackled. These issues add
to the effort, time and cost of testing. Different techniques and methodologies
have been proposed for taking care of these issues. Use of evolutionary
algorithms for automatic test generation has been an area of interest for many
researchers. The Genetic Algorithm (GA) is one such evolutionary
algorithm. In this paper, we present a survey of GA approaches for
addressing the various issues encountered during software testing.
Comment: 13 pages
A survey on utilization of data mining approaches for dermatological (skin) diseases prediction
Due to recent advances in technology, large volumes of medical data are being collected. These data contain valuable information, so data mining techniques can be used to extract useful patterns. This paper introduces data mining and its various techniques and surveys the available literature on medical data mining. We focus mainly on the application of data mining to skin diseases. A categorization is provided based on the different data mining techniques, and the utility of the various methodologies is highlighted. Generally, association mining is suitable for extracting rules; it has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we summarize the different uses of classification in dermatology. It is one of the most important methods for the diagnosis of erythemato-squamous diseases. Methods used for this purpose include Neural Networks, Genetic Algorithms and fuzzy classification. Clustering is a useful method in medical image mining. The purpose of clustering techniques is to find structure in the given data by identifying similarities between data points according to their characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we investigate some challenges that exist in mining skin data.
MaaSim: A Liveability Simulation for Improving the Quality of Life in Cities
Urbanism is no longer planned on paper thanks to powerful models and 3D
simulation platforms. However, current work is not open to the public and lacks
an optimisation agent that could help in decision making. This paper describes
the creation of an open-source simulation based on an existing Dutch
liveability score with a built-in AI module. Features are selected using
feature engineering and Random Forests. Then, a modified scoring function is
built based on the former liveability classes. The score is predicted using
Random Forest regression, which achieved a recall of 0.83 with 10-fold
cross-validation. Afterwards, Exploratory Factor Analysis is applied to select
the actions present in the model. The resulting indicators are divided into 5
groups, and 12 actions are generated. The performance of four optimisation
algorithms is compared, namely NSGA-II, PAES, SPEA2 and eps-MOEA, on three
established criteria of quality (cardinality, the spread of the solutions, and
spacing) as well as the resulting score and number of turns. Although all four
algorithms show different strengths, eps-MOEA is selected to be the most
suitable for this problem. Ultimately, the simulation incorporates the model
and the selected AI module in a GUI written in the Kivy framework for Python.
Tests performed on users show positive responses and encourage further
initiatives towards joining technology and public applications.
Comment: 16 pages