26,095 research outputs found
Datamining for Web-Enabled Electronic Business Applications
Web-Enabled Electronic Business is generating massive amount of data on customer purchases, browsing patterns, usage times and preferences at an increasing rate. Data mining techniques can be applied to all the data being collected for obtaining useful information. This chapter attempts to present issues associated with data mining for web-enabled electronic-business
A Survey of Parallel Data Mining
With the fast, continuous increase in the number and size of databases, parallel data mining is a natural and cost-effective approach to tackle the problem of scalability in data mining. Recently there has been a considerable research on parallel data mining. However, most projects focus on the parallelization of a single kind of data mining algorithm/paradigm. This paper surveys parallel data mining with a broader perspective. More precisely, we discuss the parallelization of data mining algorithms of four knowledge discovery paradigms, namely rule induction, instance-based learning, genetic algorithms and neural networks. Using the lessons
learned from this discussion, we also derive a set of heuristic principles for designing efficient parallel data mining algorithms
On the combination of omics data for prediction of binary outcomes
Enrichment of predictive models with new biomolecular markers is an important
task in high-dimensional omic applications. Increasingly, clinical studies
include several sets of such omics markers available for each patient,
measuring different levels of biological variation. As a result, one of the
main challenges in predictive research is the integration of different sources
of omic biomarkers for the prediction of health traits. We review several
approaches for the combination of omic markers in the context of binary outcome
prediction, all based on double cross-validation and regularized regression
models. We evaluate their performance in terms of calibration and
discrimination and we compare their performance with respect to single-omic
source predictions. We illustrate the methods through the analysis of two real
datasets. On the one hand, we consider the combination of two fractions of
proteomic mass spectrometry for the calibration of a diagnostic rule for the
detection of early-stage breast cancer. On the other hand, we consider
transcriptomics and metabolomics as predictors of obesity using data from the
Dietary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome
(DILGOM) study, a population-based cohort, from Finland
- …