NeuroSVM: A Graphical User Interface for Identification of Liver Patients
Diagnosis of liver infection at preliminary stage is important for better
treatment. In today's scenario, devices such as sensors are used for detection of
infections. Accurate classification techniques are required for automatic
identification of disease samples. In this context, this study utilizes data
mining approaches for classification of liver patients from healthy
individuals. Four algorithms (Naive Bayes, Bagging, Random Forest and SVM) were
implemented for classification using the R platform. Further, to improve the
accuracy of classification, a hybrid NeuroSVM model was developed using SVM and a
feed-forward artificial neural network (ANN). The hybrid model was tested for
its performance using statistical parameters like root mean square error (RMSE)
and mean absolute percentage error (MAPE). The model resulted in a prediction
accuracy of 98.83%. The results suggested that development of the hybrid model
improved the accuracy of prediction. To serve the medical community in the
prediction of liver disease among patients, a graphical user interface (GUI)
has been developed using R. The GUI is deployed as a package in a local
repository of the R platform for users to perform prediction. Comment: 9 pages, 6 figures
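The two error measures named in this abstract, RMSE and MAPE, can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not the authors' R implementation; the function names and the sample values are hypothetical. MAPE is undefined when an actual value is zero, so the sketch rejects that case explicitly.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so that case raises.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the functions.
    observed = [100.0, 200.0, 400.0]
    estimated = [110.0, 180.0, 400.0]
    print("RMSE:", rmse(observed, estimated))
    print("MAPE (%):", mape(observed, estimated))
```

Both measures summarize prediction error but on different scales: RMSE is in the units of the target, while MAPE is relative to each actual value.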
Machine Learning and Integrative Analysis of Biomedical Big Data
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
A New Methodology for Generalizing Unweighted Network Measures
Several important complex network measures that have helped discover common
patterns across real-world networks ignore edge weights, an important source of
information in real-world networks. We propose a new methodology for
generalizing measures of unweighted networks through a generalization of the
cardinality concept of a set of weights. The key observation here is that many
measures of unweighted networks use the cardinality (the size) of some subset
of edges in their computation. For example, the node degree is the number of
edges incident to a node. We define the effective cardinality, a new metric
that quantifies how many edges are effectively being used, assuming that an
edge's weight reflects the amount of interaction across that edge. We prove
that a generalized measure, using our method, reduces to the original
unweighted measure if there is no disparity between weights, which ensures that
the laws that govern the original unweighted measure will also govern the
generalized measure when the weights are equal. We also prove that our
generalization ensures a partial ordering (among sets of weighted edges) that
is consistent with the original unweighted measure, unlike previously developed
generalizations. We illustrate the applicability of our method by generalizing
four unweighted network measures. As a case study, we analyze four real-world
weighted networks using our generalized degree and clustering coefficient. The
analysis shows that the generalized degree distribution is consistent with the
power-law hypothesis but with a steeper decline, and that there is a common
pattern governing the ratio between the generalized degree and the traditional
degree. The analysis also shows that nodes with more uniform weights tend to
cluster with nodes that also have more uniform weights among themselves. Comment: 23 pages, 10 figures
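The abstract does not state the formula for the effective cardinality, so the following is only a sketch under an assumed definition: the exponential of the Shannon entropy of the normalized weights. That choice is an assumption made here for illustration, picked because it satisfies the property the abstract describes, namely that the measure reduces to the plain count when all weights are equal. The function names and the toy adjacency structure are hypothetical.

```python
import math


def effective_cardinality(weights):
    """Effective number of edges for a collection of non-negative weights.

    ASSUMPTION: uses exp(Shannon entropy of normalized weights). The source
    abstract does not give the formula; this is one definition that equals
    len(weights) when all weights are equal and positive.
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    positive = [w for w in weights if w > 0]
    total = sum(positive)
    if total == 0:
        return 0.0  # no interaction at all
    entropy = -sum((w / total) * math.log(w / total) for w in positive)
    return math.exp(entropy)


def generalized_degree(adjacency, node):
    """Effective cardinality of the weights on edges incident to `node`.

    `adjacency` is a hypothetical dict-of-dicts: node -> {neighbor: weight}.
    """
    return effective_cardinality(list(adjacency.get(node, {}).values()))


if __name__ == "__main__":
    graph = {
        "a": {"b": 2.0, "c": 2.0, "d": 2.0, "e": 2.0},   # uniform weights
        "f": {"b": 97.0, "c": 1.0, "d": 1.0, "e": 1.0},  # one dominant edge
    }
    for node in ("a", "f"):
        print(node, len(graph[node]), generalized_degree(graph, node))
```

Under this assumed definition, node `a` has a generalized degree of approximately 4, matching its unweighted degree, while node `f` scores well below 4 because most of its interaction flows over a single edge.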
Evolving Ensemble Fuzzy Classifier
The concept of ensemble learning offers a promising avenue in learning from
data streams under complex environments because it addresses the bias and
variance dilemma better than its single model counterpart and features a
reconfigurable structure, which is well suited to the given context. While
various extensions of ensemble learning for mining non-stationary data streams
can be found in the literature, most of them are built on a static base
classifier and revisit preceding samples in the sliding window for a
retraining step. This design incurs computationally prohibitive complexity and
is not flexible enough to cope with rapidly changing environments. Their
complexity is often demanding because they involve a large collection of
offline classifiers, owing to the absence of structural complexity reduction
mechanisms and the lack of an online feature selection mechanism. A novel evolving
ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in
this paper. pENsemble differs from existing architectures in that it
is built upon an evolving classifier from data streams, termed Parsimonious
Classifier (pClass). pENsemble is equipped with an ensemble pruning mechanism,
which estimates a localized generalization error of a base classifier. A
dynamic online feature selection scenario is integrated into the pENsemble.
This method allows for dynamic selection and deselection of input features on
the fly. pENsemble adopts a dynamic ensemble structure to output a final
classification decision where it features a novel drift detection scenario to
grow the ensemble structure. The efficacy of pENsemble has been
demonstrated through rigorous numerical studies with dynamic and evolving data
streams, where it delivers the most encouraging performance in attaining a
tradeoff between accuracy and complexity. Comment: this paper has been published in IEEE Transactions on Fuzzy Systems