92,708 research outputs found

On the role of pre and post-processing in environmental data mining

Author: Athanasiadis Ioannis
Comas Joaquim
Gibert Karina
Holmes Geoffrey
Izquierdo Joaquin
Sànchez-Marrè Miquel
Publication venue: International Environmental Modelling and Software Society
Publication date: 01/01/2008

The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data often contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex the reality being analyzed, the higher the risk of obtaining low-quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions based on KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved. The role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems is discussed.

Research Commons@Waikato

A Survey on Compiler Autotuning using Machine Learning

Author: Ashouri Amir H.
Cavazos John
Killian William
Palermo Gianluca
Silvano Cristina
Publication venue: Association for Computing Machinery (ACM)
Publication date: 03/09/2018

Since the mid-1990s, researchers have been trying to use machine-learning-based approaches to solve a number of different compiler optimization problems. These techniques primarily enhance the quality of the obtained results and, more importantly, make it feasible to tackle two main compiler optimization problems: optimization selection (choosing which optimizations to apply) and phase-ordering (choosing the order of applying optimizations). The compiler optimization space continues to grow due to the advancement of applications, the increasing number of compiler optimizations, and new target architectures. Generic optimization passes in compilers cannot fully leverage newly introduced optimizations and, therefore, cannot keep up with the pace of increasing options. This survey summarizes and classifies the recent advances in using machine learning for the compiler optimization field, particularly on the two major problems of (1) selecting the best optimizations and (2) the phase-ordering of optimizations. The survey highlights the approaches taken so far, the obtained results, the fine-grained classification among different approaches and, finally, the influential papers of the field.

Comment: version 5.0 (updated on September 2018) - Preprint Version For our Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated quarterly here (Send me your new published papers to be added in the subsequent version) History: Received November 2016; Revised August 2017; Revised February 2018; Accepted March 2018

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Politecnico di Milano

A data-driven functional projection approach for the selection of feature ranges in spectra with ICA or cluster analysis

Author: Alsberg
Barnes
Benoudjit
C. Krier
Caelen
D. François
F. Rossi
Geladi
Kraskov
M. Verleysen
Mevik
Moody
Pelckmans
R Development Core Team
Ralf
Rossi
Suykens
Van Dijk
Walczak
Publication venue: Elsevier BV
Publication date: 01/01/2008

Prediction problems from spectra are commonly encountered in chemometrics. In addition to accurate predictions, it is often necessary to extract information about which wavelengths in the spectra contribute effectively to the quality of the prediction. This requires selecting wavelengths (or wavelength intervals), a problem related to variable selection. In this paper, it is shown how this problem may be tackled in the specific case of smooth (for example infrared) spectra. The functional character of the spectra (their smoothness) is taken into account through a functional variable projection procedure. Contrary to standard approaches, the projection is performed on a basis that is driven by the spectra themselves, in order to best fit their characteristics. The methodology is illustrated by two examples of functional projection, using Independent Component Analysis and functional variable clustering, respectively. The performance on two standard infrared spectra benchmarks is illustrated. Comment: A paraitr

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

DIAL UCLouvain

Data mining as a tool for environmental scientists

Author: Athanasiadis Ioannis
Comas Joaquim
Frank Eibe
Gibert Karina
Letcher Rebecca
Spate Jessica
Sànchez-Marrè Miquel
Publication venue: International Environmental Modelling and Software Society
Publication date: 01/01/2006

Over recent years a huge library of data mining algorithms has been developed to tackle a variety of problems in fields such as medical imaging and network traffic analysis. Many of these techniques are far more flexible than more classical modelling approaches and could be usefully applied to data-rich environmental problems. Certain techniques such as Artificial Neural Networks, Clustering, Case-Based Reasoning and, more recently, Bayesian Decision Networks have found application in environmental modelling, while other methods, for example classification and association rule extraction, have not yet been taken up on any wide scale. We propose that these and other data mining techniques could be usefully applied to difficult problems in the field. This paper introduces several data mining concepts and briefly discusses their application to environmental modelling, where data may be sparse, incomplete, or heterogeneous.

Research Commons@Waikato