One-Class Classification: Taxonomy of Study and Review of Techniques

Authors: Khan Shehroz S.; Madden Michael G.
Publication venue: Cambridge University Press (CUP)
Publication date: 29/11/2013

One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled or not well defined. This situation constrains the learning of efficient classifiers, because the class boundary must be defined using knowledge of the positive class alone. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC by presenting a taxonomy of study for OCC problems, which is based on the availability of training data, the algorithms used and the application domains. We further delve into each of the categories of the proposed taxonomy and present a comprehensive literature review of the OCC algorithms, techniques and methodologies with a focus on their significance, limitations and applications. We conclude our paper by discussing some open research problems in the field of OCC and presenting our vision for future research.

Comment: 24 pages + 11 pages of references, 8 figures
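As a rough illustration of learning a boundary from positive examples only, the sketch below fits a centroid and a radius to the positive class and accepts anything inside that radius. This is not a method taken from the paper; the function names, the `quantile` parameter and the toy data are assumptions chosen for the example.

```python
import math

def fit_centroid_occ(positives, quantile=0.95):
    """Fit a centroid-plus-radius boundary from positive examples only.

    Hypothetical helper: the radius is the distance that covers the
    requested fraction of the training points.
    """
    dim = len(positives[0])
    centroid = [sum(p[i] for p in positives) / len(positives) for i in range(dim)]
    dists = sorted(math.dist(p, centroid) for p in positives)
    idx = min(len(dists) - 1, max(0, math.ceil(quantile * len(dists)) - 1))
    return centroid, dists[idx]

def predict(model, x):
    """Return True if x falls inside the learned boundary (positive class)."""
    centroid, radius = model
    return math.dist(x, centroid) <= radius

# Toy positive-class sample; no negative examples are used for training.
positives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
model = fit_centroid_occ(positives, quantile=1.0)

inside = predict(model, (0.6, 0.4))   # close to the training data
outside = predict(model, (3.0, 3.0))  # far from the training data
```

Lowering `quantile` tightens the boundary and rejects more of the training points as outliers, which is one simple way to trade false acceptances against false rejections.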

arXiv.org e-Print Archive

Access to Research at National University of Ireland, Galway

Cost-efficient modeling of antenna structures using Gradient Enhanced Kriging

Authors: Bekasiewicz A; Couckuyt Ivo; Dhaene Tom; Koziel S; Laermans Eric; Ulaganathan Selvakumar
Publication date: 01/01/2015

Reliable yet fast surrogate models are indispensable in the design of contemporary antenna structures. Data-driven models, e.g., based on Gaussian Processes or support-vector regression, offer sufficient flexibility and speed; however, their setup cost is large and grows very quickly with the dimensionality of the design space. In this paper, we propose cost-efficient modeling of antenna structures using Gradient-Enhanced Kriging. In our approach, the training data set contains, apart from the EM-simulation responses of the structure at hand, derivative data at the respective training locations, obtained at little extra cost using adjoint sensitivity techniques. We demonstrate that introducing the derivative information into the model allows for a considerable reduction of the model setup cost (in terms of the number of training points required) without compromising its predictive power. The Gradient-Enhanced Kriging technique is illustrated using a dielectric resonator antenna structure. A comparison with conventional Kriging interpolation is also provided.
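The benefit of derivative data can be seen in a much simpler setting than Kriging. The sketch below is an analogy only: it compares linear interpolation (values only) with cubic Hermite interpolation (values plus derivatives) on the same two training points. The test function and all names are assumptions for illustration and are unrelated to the antenna model in the paper.

```python
def linear_interp(x, x0, x1, f0, f1):
    """Interpolate using function values only."""
    t = (x - x0) / (x1 - x0)
    return (1 - t) * f0 + t * f1

def hermite_interp(x, x0, x1, f0, f1, d0, d1):
    """Cubic Hermite interpolation using values and derivatives."""
    h = x1 - x0
    t = (x - x0) / h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * f0 + h10 * h * d0 + h01 * f1 + h11 * h * d1

def f(x):
    return x**3          # toy response

def df(x):
    return 3 * x**2      # its derivative (stand-in for sensitivity data)

x0, x1, x = 0.0, 2.0, 1.0
values_only = linear_interp(x, x0, x1, f(x0), f(x1))
with_gradients = hermite_interp(x, x0, x1, f(x0), f(x1), df(x0), df(x1))
error_values_only = abs(values_only - f(x))
error_with_gradients = abs(with_gradients - f(x))
```

With the same two sample locations, the gradient-aware interpolant recovers this cubic at the midpoint while the values-only interpolant does not, which mirrors the claim that derivative information can reduce the number of training points needed.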

Ghent University Academic Bibliography

Multilevel Weighted Support Vector Machine for Classification on Healthcare Data with Missing Values

Authors: Marko Nicholas; Razzaghi Talayeh; Roderick Oleg; Safro Ilya
Publication venue: Public Library of Science (PLoS)
Publication date: 07/04/2016

This work is motivated by the needs of predictive analytics on healthcare data as represented by Electronic Medical Records. Such data is invariably problematic: noisy, with missing entries and with imbalance in the classes of interest, leading to serious bias in predictive modeling. Since standard data mining methods often produce poor performance measures, we argue for the development of specialized techniques of data preprocessing and classification. In this paper, we propose a new method to simultaneously classify large datasets and reduce the effects of missing values. It is based on a multilevel framework of the cost-sensitive SVM and the expectation-maximization imputation method for missing values, which relies on iterated regression analyses. We compare classification results of multilevel SVM-based algorithms on public benchmark datasets with imbalanced classes and missing values, as well as on real data in health applications, and show that our multilevel SVM-based method produces fast, more accurate and more robust classification results.

Comment: arXiv admin note: substantial text overlap with arXiv:1503.0625
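Cost-sensitive classifiers usually need a per-class misclassification weight. A minimal sketch of one common heuristic, weighting each class inversely to its frequency, is given below. It is not the weighting scheme of the paper, and the function name and toy labels are assumptions.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so errors on them cost more.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

# Toy imbalanced labels: 90 majority-class and 10 minority-class samples.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
```

Weights of this kind are typically passed to a weighted SVM as per-class penalty multipliers, so that the minority class is not ignored during training.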

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

An Algorithmic Framework for Computing Validation Performance Bounds by Using Suboptimal Models

Authors: Ogawa Kohei; Shinmura Yuki; Suzuki Yoshiki; Takeuchi Ichiro
Publication date: 10/02/2014

Practical model building processes are often time-consuming because many different models must be trained and validated. In this paper, we introduce a novel algorithm that can be used for computing lower and upper bounds of model validation errors without actually training the model itself. A key idea behind our algorithm is to use side information available from a suboptimal model. If a reasonably good suboptimal model is available, our algorithm can compute lower and upper bounds of many useful quantities for making inferences on the unknown target model. We demonstrate the advantage of our algorithm in the context of model selection for regularized learning problems.

arXiv.org e-Print Archive

CiteSeerX

A brief network analysis of Artificial Intelligence publication

Authors: Deng Yong; Li Yunpeng; Liu Jie
Publication date: 23/11/2013

In this paper, we illustrate the history of Artificial Intelligence (AI) with a statistical analysis of publications since 1940. We collected and mined the IEEE publication database to analyze the geographical and chronological variation in AI research activity. The connections between different institutes are shown. The results show that the leading AI research communities are mainly in the USA, China, Europe and Japan. The key institutes, authors and research hotspots are revealed. It is found that the research institutes in fields such as Data Mining, Computer Vision, Pattern Recognition and some other fields of Machine Learning are quite consistent, implying a strong interaction between the communities of these fields. It is also shown that research in Electronic Engineering and in Industrial or Commercial applications is very active in California. Japan also publishes many papers in robotics. Due to the limitations of the data source, the results might be overly influenced by the number of published articles; we mitigate this as far as we can by applying network key-node analysis to the research community instead of merely counting publications.

Comment: 18 pages, 7 figures

arXiv.org e-Print Archive

CiteSeerX