    Experimental Comparison of Classification Uncertainty for Randomised and Bayesian Decision Tree Ensembles

    Copyright © 2004 Springer-Verlag Berlin Heidelberg. The final publication is available at link.springer.com. Book title: Intelligent Data Engineering and Automated Learning – IDEAL 2004. 5th International Conference on Intelligent Data Engineering and Automated Learning – IDEAL 2004, Exeter, UK, August 25-27, 2004. In this paper we experimentally compare the classification uncertainty of the randomised Decision Tree (DT) ensemble technique and the Bayesian DT technique with a restarting strategy on a synthetic dataset as well as on some datasets commonly used in the machine learning community. For quantitative evaluation of classification uncertainty, we use an Uncertainty Envelope dealing with the class posterior distribution and a given confidence probability. Counting the classifier outcomes, this technique produces feasible evaluations of the classification uncertainty. Using this technique in our experiments, we found that the Bayesian DT technique is superior to the randomised DT ensemble technique.

    Cardinality constrained portfolio optimisation

    Copyright © 2004 Springer-Verlag Berlin Heidelberg. The final publication is available at link.springer.com. Book title: Intelligent Data Engineering and Automated Learning – IDEAL 2004. 5th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2004), Exeter, UK, August 25-27, 2004. The traditional quadratic programming approach to portfolio optimisation is difficult to implement when there are cardinality constraints. Recent approaches to resolving this have used heuristic algorithms to search for points on the cardinality constrained frontier. However, these can be computationally expensive when the practitioner does not know a priori exactly how many assets they may desire in a portfolio, or what level of return/risk they wish to be exposed to without recourse to analysing the actual trade-off frontier. This study introduces a parallel solution to this problem. By extending techniques developed in the multi-objective evolutionary optimisation domain, a set of portfolios representing estimates of all possible cardinality constrained frontiers can be found in a single search process, for a range of portfolio sizes and constraints. Empirical results are provided on emerging markets and US asset data, and compared to unconstrained frontiers found by quadratic programming.

    Adversarial Edit Attacks for Tree Data

    Many machine learning models can be attacked with adversarial examples, i.e. inputs close to correctly classified examples that are classified incorrectly. However, most research on adversarial attacks to date is limited to vectorial data, in particular image data. In this contribution, we extend the field by introducing adversarial edit attacks for tree-structured data with potential applications in medicine and automated program analysis. Our approach solely relies on the tree edit distance and a logarithmic number of black-box queries to the attacked classifier without any need for gradient information. We evaluate our approach on two programming and two biomedical data sets and show that many established tree classifiers, like tree-kernel-SVMs and recursive neural networks, can be attacked effectively. Comment: accepted at the 20th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL

    Spectral high resolution feature selection for retrieval of combustion temperature profiles

    Proceedings of: 7th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2006 (Burgos, Spain, September 20-23, 2006). Using high spectral resolution measurements to retrieve certain physical properties related to the radiative transfer of energy should, a priori, lead to better accuracy. This improvement is not easy to achieve, however, because the large volume of data and its redundancies make any processing difficult. To solve this problem, a pick selection based on principal component analysis has been adopted to perform the necessary feature selection over the different channels. In this paper, the capability to retrieve the temperature profile in a combustion environment using neural networks jointly with this spectral high resolution feature selection method is studied.

    10th International Conference, Burgos, Spain, September 23-26, 2009. Proceedings

    This book constitutes the refereed proceedings of the 10th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2009, held in Burgos, Spain, in September 2009. The 100 revised full papers presented were carefully reviewed and selected from over 200 submissions for inclusion in the book. The papers are organized in topical sections on learning and information processing; data mining and information management; neuro-informatics, bio-informatics, and bio-inspired models; agents and hybrid systems; soft computing techniques in data mining; recent advances on swarm-based computing; intelligent computational techniques in medical image processing; advances on ensemble learning and information fusion; financial and business engineering (modeling and applications); MIR day 2009 - Burgos; and nature inspired models for industrial applications.

    15th International Conference, Salamanca, Spain, September 10-12, 2014. Proceedings

    This book constitutes the refereed proceedings of the 15th International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2014, held in Salamanca, Spain, in September 2014. The 60 revised full papers presented were carefully reviewed and selected from about 120 submissions. These papers provided a valuable collection of recent research outcomes in data engineering and automated learning, from methodologies, frameworks, and techniques to applications. In addition, the conference provided a good sample of current topics, from methodologies, frameworks, and techniques to applications and case studies. The techniques include computational intelligence, big data analytics, social media techniques, multi-objective optimization, regression, classification, clustering, biological data processing, text processing, and image/video analysis.

    A Variable Metric Probabilistic k-Nearest-Neighbours Classifier

    Copyright © 2004 Springer Verlag. The final publication is available at link.springer.com. 5th International Conference, Exeter, UK, August 25-27, 2004. Proceedings. Book title: Intelligent Data Engineering and Automated Learning – IDEAL 2004. The k-nearest neighbour (k-nn) model is a simple, popular classifier. Probabilistic k-nn is a more powerful variant in which the model is cast in a Bayesian framework using (reversible jump) Markov chain Monte Carlo methods to average out the uncertainty over the model parameters. The k-nn classifier depends crucially on the metric used to determine distances between data points. However, scalings between features, and indeed whether some subset of features is redundant, are seldom known a priori. Here we introduce a variable metric extension to the probabilistic k-nn classifier, which permits averaging over all rotations and scalings of the data. In addition, the method permits automatic rejection of irrelevant features. Examples are provided on synthetic data, illustrating how the method can deform feature space and select salient features, and also on real-world data.

    Giving voice to the Internet by means of conversational agents

    Proceedings of: 15th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2014), Salamanca, Spain. In this paper we present a proposal to develop conversational agents that avoids the effort of manually defining the dialog strategy for the agent and also takes into account the benefits of using current standards. In our proposal the dialog manager is trained by means of a POMDP-based methodology using a labeled dialog corpus automatically acquired using a user modeling technique. The statistical dialog model automatically selects the next system response. Thus, system developers only need to define a set of files, each including a system prompt and the associated grammar to recognize user responses. We have applied this technique to develop a conversational agent in VoiceXML that provides information for planning a trip. This work has been supported in part by the Spanish Government under i-Support (Intelligent Agent Based Driver Decision Support) Project (TRA2011-29454-C03-03), and Projects MINECO TEC2012-37832-C02-01, CICYT TEC2011-28626-C02-02, and CAM CONTEXTS (S2009/TIC-1485

    Portfolio Optimization Using SPEA2 with Resampling

    Proceedings of: Intelligent Data Engineering and Automated Learning – IDEAL 2011: 12th International Conference, Norwich, UK, September 7-9, 2011. The subject of financial portfolio optimization under real-world constraints is a difficult problem that can be tackled using multiobjective evolutionary algorithms. One of the most problematic issues is the dependence of the results on the estimates for a set of parameters, that is, the robustness of solutions. These estimates are often inaccurate, and this may result in solutions that, in theory, offer an appropriate risk/return balance but, in practice, turn out to be very poor. In this paper we suggest that using a resampling mechanism may filter out the most unstable solutions. We test this idea on real data using SPEA2 as the optimization algorithm, and the results show that the use of resampling significantly increases the reliability of the resulting portfolios. The authors acknowledge financial support granted by the Spanish Ministry of Science under contract TIN2008-06491-C04-03 (MSTAR) and Comunidad de Madrid (CCG10-UC3M/TIC-5029).

    Studying the Effect of Measured Solar Power on Evolutionary Multi-objective Prediction Intervals

    This paper has been presented at: 19th Intelligent Data Engineering and Automated Learning (IDEAL 2018). While it is common to make point forecasts for solar energy generation, estimating the forecast uncertainty has received less attention. In this article, prediction intervals are computed within a multi-objective approach in order to obtain an optimal coverage/width tradeoff. In particular, it is studied whether using measured power as an additional input, alongside the meteorological forecast variables, can improve the properties of prediction intervals for short time horizons (up to three hours). Results show that the intervals tend to be narrower (i.e. less uncertain), and the ratio between coverage and width is larger. The method has been shown to obtain intervals with better properties than baseline Quantile Regression. This work has been funded by the Spanish Ministry of Science under contract ENE2014-56126-C2-2-R (AOPRIN-SOL project).