
8 research outputs found

Ensemble of a subset of kNN classifiers

Author: A Karatzoglou
Aris Perperoglou
Asma Gul
Berthold Lausen
C Müssel
D Mease
DF Nettleton
E Bauer
EW Steyerberg
J Hernández-Orallo
J Kruppa
L Breiman
L Lausser
Miftahuddin Miftahuddin
O Mahmoud
Osama Mahmoud
P Hall
P Melville
R Barandela
R Maclin
RJ Samworth
S Li
T Cover
T Hothorn
T Khoshgoftaar
Werner Adler
Z Liu
Zardad Khan
ZH Zhou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Combining multiple classifiers, known as ensemble methods, can substantially improve the prediction performance of learning algorithms, especially in the presence of non-informative features in the data sets. We propose an ensemble of a subset of kNN classifiers, ESkNN, for classification tasks, built in two steps. First, we choose classifiers based on their individual performance using out-of-sample accuracy. The selected classifiers are then combined sequentially, starting from the best model, and assessed for collective performance on a validation data set. We use benchmark data sets, both with their original features and with some added non-informative features, to evaluate our method. The results are compared with the usual kNN, bagged kNN, random kNN, the multiple feature subset method, random forest and support vector machines. Our experimental comparisons on benchmark classification problems and simulated data sets reveal that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forest and support vector machines.
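
The two-step selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random feature subsets, the separate `select` and `valid` splits, the "keep a model only if validation accuracy improves" rule, the toy data, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, features, k=3):
    """Majority vote of the k nearest training points, using only `features`."""
    nearest = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_predict(models, train_X, train_y, x, k=3):
    """Majority vote over base models; each model is a tuple of feature indices."""
    votes = Counter(knn_predict(train_X, train_y, x, feats, k) for feats in models)
    return votes.most_common(1)[0][0]


def accuracy(models, train, data, k=3):
    """Fraction of points in `data` that the ensemble `models` labels correctly."""
    (Xt, yt), (X, y) = train, data
    hits = sum(ensemble_predict(models, Xt, yt, x, k) == t for x, t in zip(X, y))
    return hits / len(y)


def esknn_sketch(train, select, valid, n_models=20, subset_size=2, k=3, seed=0):
    rng = random.Random(seed)
    n_features = len(train[0][0])
    # Assumption: each candidate kNN model sees a random subset of the features.
    candidates = [tuple(rng.sample(range(n_features), subset_size))
                  for _ in range(n_models)]
    # Step 1: rank candidates by individual out-of-sample accuracy.
    ranked = sorted(candidates,
                    key=lambda m: accuracy([m], train, select, k),
                    reverse=True)
    # Step 2: add models in rank order, starting from the best, and keep a
    # model only if it improves accuracy on the validation set (assumed rule).
    chosen = [ranked[0]]
    best = accuracy(chosen, train, valid, k)
    for model in ranked[1:]:
        score = accuracy(chosen + [model], train, valid, k)
        if score > best:
            chosen, best = chosen + [model], score
    return chosen, best


def make_split(n, rng, n_noise=4):
    """Toy data: two informative features plus `n_noise` pure-noise features."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.gauss(0, 0.4), label + rng.gauss(0, 0.4)]
                 + [rng.gauss(0, 1) for _ in range(n_noise)])
        y.append(label)
    return X, y


data_rng = random.Random(1)
train, select, valid = (make_split(60, data_rng) for _ in range(3))
chosen, best = esknn_sketch(train, select, valid)
print(len(chosen), "models kept, validation accuracy", best)
```

Because the first-ranked model is always kept and later models are added only when they help, the ensemble's validation accuracy in this sketch can never fall below that of the top-ranked single model.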

University of Essex Research Repository

Crossref

Springer - Publisher Connector

Explore Bristol Research

Learning differential diagnosis of erythemato-squamous diseases using voting feature intervals

Author: Demiroz G.
Guvenir H. A.
Ilter N.
Publication venue: 'Elsevier BV'
Publication date: 01/01/1998

A new classification algorithm, called VFI5 (for Voting Feature Intervals), is developed and applied to the problem of differential diagnosis of erythemato-squamous diseases. The domain contains records of patients with known diagnoses. Given a training set of such records, the VFI5 classifier learns how to differentiate a new case in the domain. VFI5 represents a concept in the form of feature intervals on each feature dimension separately. Classification in the VFI5 algorithm is based on real-valued voting. Each feature participates equally in the voting process, and the class that receives the most votes is declared to be the predicted class. The performance of the VFI5 classifier is evaluated empirically in terms of classification accuracy and running time. (C) 1998 Elsevier Science B.V. All rights reserved.

Bilkent University Institutional Repository

Text categorization using feature projections

Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2002

Crossref

A classification learning algorithm robust to irrelevant features

Author: Güvenir H.A.
Publication date: 01/01/1998

The presence of irrelevant features is a fact of life in many real-world applications of classification learning. Although nearest-neighbor classification algorithms have emerged as a promising approach to machine learning tasks with their high predictive accuracy, they are adversely affected by the presence of such irrelevant features. In this paper, we describe a recently proposed classification algorithm called VFI5, which achieves accuracy comparable to nearest-neighbor classifiers while being robust to irrelevant features. The paper compares the nearest-neighbor classifier and the VFI5 algorithm in the presence of irrelevant features on both artificially generated and real-world data sets selected from the UCI repository.

Bilkent University Institutional Repository

Classification by voting feature intervals

Author: Altay Güvenir H.
Demiröz G.
Publication venue: Springer Verlag
Publication date: 01/01/1997

A new classification algorithm called VFI (for Voting Feature Intervals) is proposed. A concept is represented by a set of feature intervals on each feature dimension separately. Each feature participates in the classification by distributing real-valued votes among classes. The class receiving the highest vote is declared to be the predicted class. VFI is compared with the Naive Bayesian Classifier (NBC), which also considers each feature separately. Experiments on real-world datasets show that VFI performs comparably to, and in some cases better than, NBC in terms of classification accuracy. Moreover, VFI is faster than NBC on all datasets. © Springer-Verlag Berlin Heidelberg 1997
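
The voting scheme described in this abstract can be sketched as follows. This is a simplified illustration, not the published VFI algorithm: placing interval boundaries at each class's minimum and maximum value, normalizing counts by class size, letting unseen intervals cast no vote, the toy data, and the function names are all assumptions.

```python
from bisect import bisect_right
from collections import defaultdict


def train_vfi(X, y):
    """Build per-feature intervals and the votes each interval gives to classes."""
    classes = sorted(set(y))
    class_size = {c: y.count(c) for c in classes}
    model = []
    for f in range(len(X[0])):
        # Assumed boundaries: the min and max value of each class on feature f.
        cuts = set()
        for c in classes:
            vals = [row[f] for row, t in zip(X, y) if t == c]
            cuts.update((min(vals), max(vals)))
        cuts = sorted(cuts)
        # Count training instances of each class falling in each interval.
        counts = defaultdict(lambda: defaultdict(float))
        for row, t in zip(X, y):
            counts[bisect_right(cuts, row[f])][t] += 1.0
        # Turn counts into real-valued votes that sum to 1 per interval.
        votes = {}
        for idx, per_class in counts.items():
            rel = {c: per_class[c] / class_size[c] for c in classes}
            total = sum(rel.values())
            votes[idx] = {c: v / total for c, v in rel.items()}
        model.append((cuts, votes))
    return classes, model


def predict_vfi(classes, model, x):
    """Sum each feature's votes and return the class with the highest total."""
    tally = dict.fromkeys(classes, 0.0)
    for f, (cuts, votes) in enumerate(model):
        interval_votes = votes.get(bisect_right(cuts, x[f]))
        if interval_votes:  # intervals unseen in training cast no vote
            for c, v in interval_votes.items():
                tally[c] += v
    return max(classes, key=lambda c: tally[c])


# Toy data: feature 0 separates the classes, feature 1 overlaps.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 4.5], [3.3, 3.5], [2.9, 5.5]]
y = [0, 0, 0, 1, 1, 1]
classes, model = train_vfi(X, y)
print(predict_vfi(classes, model, [1.1, 4.2]))
print(predict_vfi(classes, model, [3.1, 4.2]))
```

Each feature is handled independently, so training is a single pass per feature and an uninformative feature can only spread its one unit of vote across classes rather than dominate the decision.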

Bilkent University Institutional Repository

Modélisation multi-agent dans un processus de gestion multi acteur, application au maintien à domicile

Author: Rammal Ali
Publication date: 13/12/2010

Existing home-care and home-monitoring systems try to meet the needs of this domain but still suffer from some limitations, one being that they are centered on a single person and cannot monitor several people at the same time. Our goal is to build behavior patterns from information coming from the homes of the monitored people, gathered by motion sensors, physiological sensors, care logbooks, and other sources, in order to obtain a macroscopic view of the monitored people. To do this, we deploy a classification architecture that is usable at large scale and based on multi-agent technologies. We opted for a multi-agent classification method because classical centralized methods (statistical, neural, concept formation...) cannot be applied when the data needed for the classification are distributed. Such methods do not scale, since scaling requires taking into account many people located in different environments and monitored through many indicators whose number and value domains can change over time. Such scaling is possible with multi-agent methods, where each agent manages part of the data on a subset of the monitored population. A change in the number or domain of the indicators can lead to removing or adding an agent without having to redo the whole computation.
This research can be seen as a macroscopic approach to large-scale distributed data gathering. We propose a software architecture to monitor elderly or dependent people in their own homes. Many studies have been done on hardware aspects, resulting in operational products. But there is a lack of adaptive algorithms to handle all the data generated by these products, because such data is distributed and heterogeneous in a large-scale environment. We propose a multi-agent classification method to collect and aggregate data about the activity, movements and physiological information of the monitored people: each agent's know-how consists of a simple classification algorithm. Data generated at this local level are communicated and adjusted between agents to obtain a set of patterns. This data is dynamic; the system has to store the built patterns and create new patterns when new data is available. Therefore, the system is adaptive and can be deployed on a large scale. The generated data is used at a local level, for example to raise an alert, but also to evaluate global risks. We present the specification choices and the massively multi-agent architecture we developed.

Thèses en ligne de l'Université Toulouse III - Paul Sabatier

BUILDING DSS USING KNOWLEDGE DISCOVERY IN DATABASE APPLIED TO ADMISSION & REGISTRATION FUNCTIONS

Author: EL-RAGAL AHMED ABDEL HAMEED HASSAN
Publication venue: 'University of Plymouth'
Publication date: 01/01/2001

This research investigates the practical issues surrounding the development and implementation of Decision Support Systems (DSS). The research describes the traditional development approaches analyzing their drawbacks and introduces a new DSS development methodology. The proposed DSS methodology is based upon four modules; needs' analysis, data warehouse (DW), knowledge discovery in database (KDD), and a DSS module. The proposed DSS methodology is applied to and evaluated using the admission and registration functions in Egyptian Universities. The research investigates the organizational requirements that are required to underpin these functions in Egyptian Universities. These requirements have been identified following an in-depth survey of the recruitment process in the Egyptian Universities. This survey employed a multi-part admission and registration DSS questionnaire (ARDSSQ) to identify the required data sources together with the likely users and their information needs. The questionnaire was sent to senior managers within the Egyptian Universities (both private and government) with responsibility for student recruitment, in particular admission and registration. Further, access to a large database has allowed the evaluation of the practical suitability of using a data warehouse structure and knowledge management tools within the decision making framework. 1600 students' records have been analyzed to explore the KDD process, and another 2000 records have been used to build and test the data mining techniques within the KDD process. Moreover, the research has analyzed the key characteristics of data warehouses and explored the advantages and disadvantages of such data structures. This evaluation has been used to build a data warehouse for the Egyptian Universities that handle their admission and registration related archival data. The decision makers' potential benefits of the data warehouse within the student recruitment process will be explored. 
The design of the proposed admission and registration DSS (ARDSS) will be developed and tested using COOL:Gen (5.0) CASE tools by Computer Associates (CA), connected to an MSSQL Server (6.5), in a Windows NT (4.0) environment. Crystal Reports (4.6) by Seagate will be used as a report generation tool. CLUSTAN Graphics (5.0) by CLUSTAN software will also be used as a clustering package. Finally, the contribution of this research is found in the following areas: a new DSS development methodology; the development and validation of a new research questionnaire (i.e. ARDSSQ); the development of the admission and registration data warehouse; the evaluation and use of cluster analysis proximities and techniques in the KDD process to find knowledge in the students' records; and the development of the ARDSS software, which encompasses the advantages of KDD and DW and delivers these advantages to the senior admission and registration managers in the Egyptian Universities. The ARDSS software could be adjusted for use in different countries for the same purpose; it is also scalable to handle new decision situations and can be integrated with other systems.

Plymouth Electronic Archive and Research Library