10,880 research outputs found

Efficient Feature Subset Selection and Subset Size Optimization

Authors: Jana Novovicova; Pavel Pudil; Petr Somol
Publication venue: 'IntechOpen'
Publication date: 01/02/2010

Compression and Conditional Emulation of Climate Model Output

Authors: Guinness Joseph; Hammerling Dorit
Publication venue: 'Informa UK Limited'
Publication date: 30/10/2017

Numerical climate model simulations run at high spatial and temporal resolutions generate massive quantities of data. As our computing capabilities continue to increase, storing all of the data is not sustainable, and thus it is important to develop methods for representing the full datasets by smaller compressed versions. We propose a statistical compression and decompression algorithm based on storing a set of summary statistics as well as a statistical model describing the conditional distribution of the full dataset given the summary statistics. The statistical model can be used to generate realizations representing the full dataset, along with characterizations of the uncertainties in the generated data. Thus, the methods are capable of both compression and conditional emulation of the climate models. Considerable attention is paid to accurately modeling the original dataset (one year of daily mean temperature data), particularly with regard to the inherent spatial nonstationarity in global fields, and to determining the statistics to be stored, so that the variation in the original data can be closely captured, while allowing for fast decompression and conditional emulation on modest computers.
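The idea of storing summary statistics plus a conditional model can be illustrated with a deliberately simple case. This is only a sketch under assumptions the abstract does not state: the field x is taken to be jointly Gaussian with mean μ and covariance Σ, and the stored summaries are taken to be linear, s = A x (for example, block averages). Under those assumptions the conditional distribution used for decompression has a closed form:

```latex
\begin{align}
\mathbf{x} \mid \mathbf{s} &\sim \mathcal{N}\!\left(\boldsymbol{\mu}_{x \mid s},\ \boldsymbol{\Sigma}_{x \mid s}\right), \\
\boldsymbol{\mu}_{x \mid s} &= \boldsymbol{\mu} + \boldsymbol{\Sigma} A^{\top} \left(A \boldsymbol{\Sigma} A^{\top}\right)^{-1} \left(\mathbf{s} - A \boldsymbol{\mu}\right), \\
\boldsymbol{\Sigma}_{x \mid s} &= \boldsymbol{\Sigma} - \boldsymbol{\Sigma} A^{\top} \left(A \boldsymbol{\Sigma} A^{\top}\right)^{-1} A \boldsymbol{\Sigma}.
\end{align}
```

In this simplified setting the conditional mean serves as a point reconstruction, while draws from the conditional distribution play the role of emulated realizations whose spread reflects the information lost in compression.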

arXiv.org e-Print Archive

FigShare

Adaptive Multi-level Backward Tracking for Sequential Feature Selection

Authors: Chotchantarakun Knitchepon; Sornil Ohm
Publication venue: LPPM ITBis Lembah Dempo
Publication date: 29/06/2021

In the past few decades, the large amount of available data has become a major challenge in data mining and machine learning. Feature selection is a significant preprocessing step for selecting the most informative features by removing irrelevant and redundant features, especially for large datasets. These selected features play an important role in information searching and enhancing the performance of machine learning models. In this research, we propose a new technique called One-level Forward Multi-level Backward Selection (OFMB). The proposed algorithm consists of two phases. The first phase aims to create preliminarily selected subsets. The second phase improves on the previous result using an adaptive multi-level backward searching technique. Hence, the idea is to apply an improvement step during the feature addition and an adaptive search method on the backtracking step. We have tested our algorithm on twelve standard UCI datasets based on k-nearest neighbor and naive Bayes classifiers. Their accuracy was then compared with some popular methods. OFMB showed better results than the other sequential forward searching techniques for most of the tested datasets.
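The general pattern of adding one feature at a time and then backtracking over several removal depths can be sketched as follows. This is not the OFMB algorithm itself, whose details the abstract does not give. The function names, the `max_depth` parameter, the acceptance rule, and the toy criterion are all hypothetical choices made for illustration.

```python
from itertools import combinations
from typing import Callable, FrozenSet, List

Criterion = Callable[[FrozenSet[int]], float]  # higher is better

# Hypothetical toy criterion: integer relevance per feature, minus a
# penalty when two mutually redundant features are selected together.
RELEVANCE = [6, 5, 5, 0]
REDUNDANT_PAIRS = {(0, 1), (0, 2)}


def toy_score(subset: FrozenSet[int]) -> float:
    gain = sum(RELEVANCE[f] for f in subset)
    penalty = sum(4 for a, b in REDUNDANT_PAIRS if a in subset and b in subset)
    return gain - penalty


def greedy_add(selected: FrozenSet[int], n_features: int,
               score: Criterion, count: int) -> FrozenSet[int]:
    """Plain sequential forward selection: add `count` features one by one."""
    current = set(selected)
    for _ in range(count):
        candidates = [f for f in range(n_features) if f not in current]
        best = max(candidates, key=lambda f: score(frozenset(current | {f})))
        current.add(best)
    return frozenset(current)


def forward_with_backtracking(n_features: int, target_size: int,
                              score: Criterion, max_depth: int = 2) -> List[int]:
    """One forward step, then try dropping 1..max_depth features and refilling."""
    selected: FrozenSet[int] = frozenset()
    while len(selected) < target_size:
        selected = greedy_add(selected, n_features, score, 1)
        improved = True
        while improved:  # repeat until no same-size subset scores strictly higher
            improved = False
            for depth in range(1, min(max_depth, len(selected) - 1) + 1):
                for dropped in combinations(sorted(selected), depth):
                    trial = greedy_add(selected - set(dropped), n_features, score, depth)
                    if score(trial) > score(selected):
                        selected, improved = trial, True
                        break
                if improved:
                    break
    return sorted(selected)


print(forward_with_backtracking(4, 2, toy_score))  # [1, 2]
```

On this toy criterion, plain forward selection picks feature 0 first and ends with the subset [0, 1], whereas the backtracking step replaces feature 0 and reaches [1, 2], which scores higher.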

Journal of ICT Research and Applications

ITB Journal

Improving Floating Search Feature Selection using Genetic Algorithm

Authors: Homsapaya Kanyanut; Sornil Ohm
Publication venue: LPPM ITBis Lembah Dempo
Publication date: 01/12/2017

Classification, a process for predicting the class of given input data, is one of the most fundamental tasks in data mining. Classification performance is negatively affected by noisy data, and therefore selecting features relevant to the problem is a critical step in classification, especially when applied to large datasets. In this article, a novel filter-based floating search technique for feature selection is proposed to select an optimal set of features for classification purposes. A genetic algorithm is employed to improve the quality of the features selected by the floating search method in each iteration. A criterion function is applied to select relevant and high-quality features that can improve classification accuracy. The proposed method was evaluated using 20 standard machine learning datasets of various sizes and complexity. The results show that the proposed method is effective in general across different classifiers and performs well in comparison with recently reported techniques. In addition, applying the proposed method with a support vector machine provides the best performance among the classifiers studied and outperformed previous research on the majority of datasets.
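Using a genetic algorithm to refine a subset produced by another search can be sketched as below. This is a minimal illustration, not the authors' method. The subset encoding, the swap mutation, the union-based crossover, the truncation selection, all parameter values, and the toy criterion are assumptions introduced here.

```python
import random
from typing import Callable, FrozenSet, Iterable, List

Criterion = Callable[[FrozenSet[int]], float]  # higher is better


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical criterion: relevance minus a redundancy penalty."""
    relevance = [6, 5, 5, 0]
    redundant_pairs = {(0, 1), (0, 2)}
    gain = sum(relevance[f] for f in subset)
    penalty = sum(4 for a, b in redundant_pairs if a in subset and b in subset)
    return gain - penalty


def ga_refine(seed_subset: Iterable[int], n_features: int, score: Criterion,
              generations: int = 30, pop_size: int = 12,
              rng_seed: int = 0) -> List[int]:
    """Evolve same-size subsets starting from `seed_subset` (pop_size >= 4)."""
    rng = random.Random(rng_seed)
    seed = frozenset(seed_subset)
    k = len(seed)

    def mutate(subset: FrozenSet[int]) -> FrozenSet[int]:
        # Swap one selected feature for one unselected feature.
        outside = [f for f in range(n_features) if f not in subset]
        if not outside or not subset:
            return subset
        removed = rng.choice(sorted(subset))
        return (subset - {removed}) | {rng.choice(outside)}

    def crossover(a: FrozenSet[int], b: FrozenSet[int]) -> FrozenSet[int]:
        # The child draws k features from the union of both parents.
        return frozenset(rng.sample(sorted(a | b), k))

    population = [seed] + [mutate(seed) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # survivors, so the best is never lost
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = parents + children
    return sorted(max(population, key=score))


refined = ga_refine([0, 1], n_features=4, score=toy_score)
print(refined, toy_score(frozenset(refined)))
```

Because the top half of each generation survives unchanged, the refined subset never scores below the starting subset. In a floating search, a step like this could be applied to the subset held at each iteration.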

Crossref

Journal of ICT Research and Applications

Directory of Open Access Journals

ITB Journal

Feature Selection For The Fuzzy Artmap Neural Network Using A Hybrid Genetic Algorithm And Tabu Search

Author: Tang Weng Chin
Publication date: 01/07/2007

The performance of neural-network (NN)-based classifiers is strongly dependent on the data set used for learning.

Repository@USM

Feature Selection For The Fuzzy Artmap Neural Network Using A Hybrid Genetic Algorithm And Tabu Search [QA76.87. T164 2007 f rb].

Author: Tang Weng Chin
Publication date: 01/07/2007

The performance of neural-network (NN)-based classifiers is strongly dependent on the data set used for learning. In practice, a data set may contain noisy or redundant data items. Thus, feature selection is an important step in building an effective and efficient NN-based classifier.

Repository@USM

Adaptive Basis Function Construction: An Approach for Adaptive Building of Sparse Polynomial Regression Models

Author: Gints Jekabsons
Publication venue: 'IntechOpen'
Publication date: 01/02/2010

IntechOpen

Crossref