Linear and Order Statistics Combiners for Pattern Classification
Several researchers have experimentally shown that substantial improvements
can be obtained in difficult pattern recognition problems by combining or
integrating the outputs of multiple classifiers. This chapter provides an
analytical framework to quantify the improvements in classification results due
to combining. The results apply to both linear combiners and order statistics
combiners. We first show that, to a first-order approximation, the error rate
obtained over and above the Bayes error rate is directly proportional to the
variance of the actual decision boundaries around the Bayes optimum boundary.
Combining classifiers in output space reduces this variance, and hence reduces
the "added" error. If N unbiased classifiers are combined by simple averaging,
the added error rate can be reduced by a factor of N if the individual errors
in approximating the decision boundaries are uncorrelated. Expressions are then
derived for linear combiners which are biased or correlated, and the effect of
output correlations on ensemble performance is quantified. For order statistics
based non-linear combiners, we derive expressions that indicate how much the
median, the maximum and in general the ith order statistic can improve
classifier performance. The analysis presented here facilitates the
understanding of the relationships among error rates, classifier boundary
distributions, and combining in output space. Experimental results on several
public domain data sets are provided to illustrate the benefits of combining
and to support the analytical results.
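The central claim above, that averaging N unbiased classifiers with uncorrelated boundary errors reduces the added error variance by a factor of N, can be checked numerically. A minimal sketch (not the chapter's derivation; the Gaussian boundary noise is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
bayes_boundary = 0.0            # the (unknown) Bayes-optimal boundary
n_classifiers, n_trials = 10, 100_000

# Each unbiased classifier places its boundary at the optimum plus noise.
boundaries = bayes_boundary + rng.normal(0.0, 1.0, size=(n_trials, n_classifiers))

single_var = boundaries[:, 0].var()           # variance of one classifier
averaged_var = boundaries.mean(axis=1).var()  # variance of the averaged ensemble

print(single_var / averaged_var)  # ratio close to N (= 10)
```

With correlated errors the ratio drops below N, which is exactly the effect the abstract says its bias/correlation expressions quantify.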
Ensemble learning for blending gridded satellite and gauge-measured precipitation data
Regression algorithms are regularly used for improving the accuracy of
satellite precipitation products. In this context, ground-based measurements
are the dependent variable and the satellite data are the predictor variables,
together with topography factors. Alongside this, it is increasingly recognised
in many fields that combinations of algorithms through ensemble learning can
lead to substantial predictive performance improvements. Still, a sufficient
number of ensemble learners for improving the accuracy of satellite
precipitation products and their large-scale comparison are currently missing
from the literature. In this work, we fill this specific gap by proposing 11
new ensemble learners in the field and by extensively comparing them for the
entire contiguous United States and for a 15-year period. We use monthly data
from the PERSIANN (Precipitation Estimation from Remotely Sensed Information
using Artificial Neural Networks) and IMERG (Integrated Multi-satellitE
Retrievals for GPM) gridded datasets. We also use gauge-measured precipitation
data from the Global Historical Climatology Network monthly database, version 2
(GHCNm). The ensemble learners combine the predictions by six regression
algorithms (base learners), namely the multivariate adaptive regression splines
(MARS), multivariate adaptive polynomial splines (poly-MARS), random forests
(RF), gradient boosting machines (GBM), extreme gradient boosting (XGBoost) and
Bayesian regularized neural networks (BRNN), and each of them is based on a
different combiner. The combiners include the equal-weight combiner, the median
combiner, two best learners and seven variants of a sophisticated stacking
method. The latter stacks a regression algorithm on top of the base
learners to combine their independent predictions...
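The equal-weight, median, and stacking combiners mentioned above can be sketched with invented numbers. The base-learner predictions and gauge values below are hypothetical, and the stacking step is simplified to an ordinary least-squares linear stack rather than the paper's specific variants:

```python
import numpy as np

# Hypothetical predictions of six base learners for five grid cells (mm/month).
base_preds = np.array([
    [90.0, 92.0, 88.0, 95.0, 91.0, 89.0],
    [40.0, 38.0, 41.0, 37.0, 42.0, 39.0],
    [10.0, 12.0,  9.0, 11.0, 10.0, 13.0],
    [70.0, 71.0, 69.0, 72.0, 68.0, 70.0],
    [55.0, 54.0, 56.0, 53.0, 57.0, 55.0],
])

equal_weight = base_preds.mean(axis=1)        # equal-weight combiner
median_comb = np.median(base_preds, axis=1)   # median combiner

# Stacking: fit a linear combiner (with intercept) against gauge truth.
gauge = np.array([91.0, 39.5, 11.0, 70.0, 55.0])   # hypothetical gauge values
X = np.column_stack([np.ones(len(gauge)), base_preds])
weights, *_ = np.linalg.lstsq(X, gauge, rcond=None)
stacked = X @ weights
```

In practice the stacking weights would be fitted on held-out data, and the "best learner" combiners would simply select the base learner with the lowest validation error.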
Predicting business failure using artificial intelligence system
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Predicting business insolvency is considered one of the main supportive sources of information
for decision making for financial institutions, investors, creditors, and other participants in the
business market. Financial reporting systems provide relevant information that can be used to
assess the financial position of firms. It is crucial to have classification and prediction models
that can analyse this financial information and provide accurate assurance for users about
business health. Recent studies have explored the use of machine learning tools as substitutes
for traditional statistical methods to develop classification models that classify firm insolvency
according to financial statement information. However, none of these models is an ideal classifier,
since each produces a certain percentage of wrong outputs, which is a crucial consideration:
every wrong response can mean massive financial losses for stakeholders.
Therefore, this study proposes new insolvency classification and prediction models based on
machine learning modelling techniques to develop an improved classifier.
Individual modelling techniques using statistical methods and machine learning were used to
develop the classification model of business insolvency. The results showed that the machine
learning methods outperformed the statistical methods. Deep Learning (DPL) achieved the highest
performance on all performance measurements used in the study and was the best
individual classifier, with an average accuracy of 97.2% using the all-years dataset. The Ensemble-
Boosted Decision Tree classifier ranked second, followed by the Decision Tree classifier. Thus,
the DPL modelling approach has proven useful for business insolvency classification.
A key contribution to enhancing individual classifier outputs is the use of traditional combining
methods alongside two aggregation methods new to business insolvency (Fuzzy Logic and the
Consensus Approach). The Consensus Approach showed the best improvement in the results
of all individual classifiers, with an average accuracy of 97.7%, and it is considered the best
classification method in comparison not only with the individual classifiers but also with the
traditional combiners.
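The thesis does not spell out its Fuzzy Logic and Consensus aggregation rules here, but the traditional combiners it benchmarks against can be illustrated by the simplest one, majority voting over individual classifier outputs. A minimal sketch with invented labels:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine class labels from several classifiers by simple majority."""
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]

# Three hypothetical classifiers assessing one firm (1 = insolvent, 0 = solvent).
print(majority_vote([1, 1, 0]))  # -> 1
```

A consensus-style scheme would additionally require a qualified level of agreement (e.g. unanimity) before committing to a label, deferring otherwise; that refinement is omitted here.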
This study pioneers the development of a time-series business insolvency prediction model
with Big Data for UK businesses. The aim of the model is to provide early prediction of a
business's health. Three prediction models were developed, based on the Nonlinear Autoregressive
with Exogenous Input model (NARX), the Nonlinear Autoregressive Neural Network (NAR),
and the Deep Learning Time-series model (DPL-SA); they achieved average accuracy rates of
83.6%, 89.5%, and 91.35%, respectively. The results show relatively high performance in
comparison with the best individual classifier (deep learning).
A Network Topology for Composable Infrastructures
This paper proposes a passive optical backplane as a new network topology for composable computing infrastructures. The topology provides a high-capacity, low-latency and flexible fabric that interconnects disaggregated resource components. The network topology is dedicated to inter-resource communication between composed logical hosts to ensure effective performance. We formulated a mixed integer linear programming (MILP) model that dynamically creates logical networks to support intra-logical-host communication over the physical network topology. The MILP performs energy-efficient logical network instantiation given each application's resource demand. The topology can achieve 1 Tbps capacity per resource node given an appropriate wavelength transmission data rate and number of wavelengths per node.
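The 1 Tbps-per-node figure is a simple product of per-wavelength rate and wavelength count. As a back-of-the-envelope check (the per-wavelength rates below are assumed for illustration, not taken from the paper):

```python
# Wavelengths per node x per-wavelength rate must reach the target capacity.
target_gbps = 1000                       # 1 Tbps per resource node
for rate_gbps in (25, 50, 100):          # assumed per-wavelength data rates
    wavelengths = target_gbps // rate_gbps
    print(f"{rate_gbps} Gbps/wavelength -> {wavelengths} wavelengths per node")
```

So, for example, 40 wavelengths at 25 Gbps or 10 wavelengths at 100 Gbps would meet the stated node capacity.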
Neural-Based Ensembles and Unorganized Machines to Predict Streamflow Series from Hydroelectric Plants
Estimating future streamflows is a key step in producing electricity for countries with
hydroelectric plants. Accurate predictions are particularly important due to the environmental and economic impacts they entail. In order to analyze the forecasting capability of models on monthly seasonal streamflow series, we performed an extensive investigation considering six versions of unorganized machines: extreme learning machines (ELM) with and without a regularization coefficient (RC), and echo state networks (ESN) using the reservoirs of Jaeger and of Ozturk et al., with and without RC. Additionally, we addressed the ELM as the combiner of a neural-based ensemble, an investigation not yet accomplished in this context. A comparative analysis was performed utilizing two linear approaches (the autoregressive model (AR) and the autoregressive moving average model (ARMA)), four artificial neural networks (multilayer perceptron, radial basis function, Elman network, and Jordan network), and four ensembles. The tests were conducted at five hydroelectric plants, using horizons of 1, 3, 6, and 12 steps ahead. The results indicated that the unorganized machines and the ELM ensembles performed better than the linear models in all simulations. Moreover, the errors showed that the unorganized machines and the ELM-based ensembles achieved the best overall performance.
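An ELM of the kind described above is a single-hidden-layer network whose input weights are drawn at random and never trained; only the output weights are solved in closed form. A minimal numpy sketch on an invented toy series (the hidden-layer size, lag order, and RC value are illustrative assumptions, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(42)

def elm_fit(X, y, hidden=50, rc=1e-3):
    """Extreme learning machine: random hidden layer, ridge-solved output
    weights. `rc` plays the role of the regularization coefficient (RC)."""
    W = rng.normal(size=(X.shape[1], hidden))  # fixed random input weights
    H = np.tanh(X @ W)                         # hidden-layer activations
    # Regularized least squares for the output weights.
    beta = np.linalg.solve(H.T @ H + rc * np.eye(hidden), H.T @ y)
    return W, beta

def elm_predict(X, W, beta):
    return np.tanh(X @ W) @ beta

# Toy "streamflow" series: predict the next value from the previous three.
series = np.sin(np.linspace(0, 20, 200)) + 0.05 * rng.normal(size=200)
X = np.column_stack([series[i:-3 + i] for i in range(3)])
y = series[3:]
W, beta = elm_fit(X, y)
pred = elm_predict(X, W, beta)
```

Used as an ensemble combiner, the same model would take the individual forecasters' outputs as the columns of X instead of lagged series values.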
Deep Space Network information system architecture study
The purpose of this article is to describe an architecture for the Deep Space Network (DSN) information system in the years 2000-2010 and to provide guidelines for its evolution during the 1990s. The study scope is defined to be from the front-end areas at the antennas to the end users (spacecraft teams, principal investigators, archival storage systems, and non-NASA partners). The architectural vision provides guidance for major DSN implementation efforts during the next decade. A strong motivation for the study is an expected dramatic improvement in information-systems technologies, such as the following: computer processing, automation technology (including knowledge-based systems), networking and data transport, software and hardware engineering, and human-interface technology. The proposed Ground Information System has the following major features: unified architecture from the front-end area to the end user; open-systems standards to achieve interoperability; DSN production of level 0 data; delivery of level 0 data from the Deep Space Communications Complex, if desired; dedicated telemetry processors for each receiver; security against unauthorized access and errors; and highly automated monitor and control
Improving binary classification using filtering based on k-NN proximity graphs
© 2020, The Author(s). One of the ways of increasing recognition ability in a classification problem is removing outlier entries, as well as redundant and unnecessary features, from the training set. Filtering and feature selection can have a large impact on classifier accuracy and area under the curve (AUC), as noisy data can confuse a classifier and lead it to catch wrong patterns in the training data. The common approach to data filtering is using proximity graphs. However, the problem of selecting optimal filtering parameters is still insufficiently researched. In this paper a filtering procedure based on the k-nearest-neighbours proximity graph is used. Filtering parameter selection is cast as an outlier minimization problem: the k-NN proximity graph, power-of-distance, and threshold parameters are selected in order to minimize the outlier percentage in the training data. Then the performance of six commonly used classifiers (Logistic Regression, Naïve Bayes, Neural Network, Random Forest, Support Vector Machine and Decision Tree) and one heterogeneous classifier combiner (DES-LA) is compared with and without filtering. Dynamic ensemble selection (DES) systems work by estimating the level of competence of each classifier in a pool of classifiers; only the most competent ones are selected to classify a given test sample. This is achieved by defining a criterion to measure the level of competence of the base classifiers, such as their accuracy in local regions of the feature space around the query instance. In our case the combiner is based on the local accuracy of single classifiers, and its output is a linear combination of the single classifiers' rankings. After filtering, the accuracy of the DES-LA combiner shows a large increase on low-accuracy datasets, but filtering does not have a substantial impact on DES-LA performance on high-accuracy datasets.
The results are discussed, and the classifiers whose performance was most affected by the pre-processing filtering step are identified. The main contribution of the paper is introducing modifications to the DES-LA combiner, as well as a comparative analysis of the impact of filtering on classifiers of various types. Testing the filtering algorithm on a real-world dataset (the Taiwan default credit card dataset) confirmed the efficiency of the automatic filtering approach.
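The core idea of k-NN proximity-graph filtering can be sketched in a few lines: drop any training point whose k nearest neighbours mostly disagree with its label. This is a simplified edited-nearest-neighbour style filter with assumed parameter values, not the paper's optimized procedure (which also tunes the power of distance and the threshold):

```python
import numpy as np

def knn_filter(X, y, k=3, threshold=0.5):
    """Drop training points whose k nearest neighbours mostly disagree
    with their label (a simple edited-nearest-neighbour style filter)."""
    # Pairwise squared Euclidean distances.
    d = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]      # indices of the k nearest neighbours
    agree = (y[nn] == y[:, None]).mean(axis=1)
    keep = agree >= threshold              # keep points supported by neighbours
    return X[keep], y[keep]

# Two clusters, plus one mislabelled point sitting inside the other class.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0],
              [5.0, 0.0], [5.1, 0.0], [5.2, 0.0],
              [0.05, 0.05]])
y = np.array([0, 0, 0, 1, 1, 1, 1])        # last point has an outlier label
Xf, yf = knn_filter(X, y)                  # the mislabelled point is removed
```

The parameter-selection step described in the abstract would then search over k, the distance exponent, and the threshold to minimize the outlier percentage.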
Revisiting the Energy-Efficient Hybrid D-A Precoding and Combining Design For mm-Wave Systems
Hybrid digital-analog (D-A) precoding is widely used in millimeter-wave systems to reduce the power consumption and implementation complexity incurred by the radio frequency (RF) chains, which consume much of the transmitted power in such systems. In this paper, an optimal number of RF chains is proposed to achieve the desired energy efficiency (EE). The optimization problem is formulated as a fractional programming maximization, resulting in a method with a twofold novelty. First, the optimal number of RF chains is determined by the proposed bisection algorithm, which yields an optimized number of data streams. Second, the optimal analog precoders/combiners are designed by eigenvalue decomposition and a power iteration algorithm, followed by the digital precoders/combiners, which are designed based on the singular value decomposition of the proposed effective uplink and downlink channel gains. Furthermore, the proposed D-A systems are designed carefully to attain lower complexity than the existing D-A algorithms while achieving reasonable performance. Finally, the impact of utilizing different numbers of quantization bits of resolution on the EE is investigated. Simulation results show that the proposed algorithms outperform existing algorithms in terms of EE, spectral efficiency, and computational complexity.
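The power iteration step mentioned for the analog precoder/combiner design extracts the dominant eigenvector of a Hermitian matrix without a full decomposition. A minimal illustrative sketch on a random toy channel (the dimensions and the use of H^H H as the target matrix are assumptions, not the paper's exact formulation):

```python
import numpy as np

def power_iteration(A, iters=200):
    """Dominant eigenvector of a Hermitian matrix by power iteration."""
    v = np.ones(A.shape[0]) / np.sqrt(A.shape[0])  # arbitrary start vector
    for _ in range(iters):
        v = A @ v
        v /= np.linalg.norm(v)                     # renormalize each step
    return v

# Toy Hermitian "channel Gram matrix": H^H H for a random complex channel H.
rng = np.random.default_rng(1)
H = rng.normal(size=(8, 4)) + 1j * rng.normal(size=(8, 4))
A = H.conj().T @ H
v = power_iteration(A)
rayleigh = np.real(v.conj() @ A @ v)   # converges to the largest eigenvalue
```

In a hardware-constrained analog stage, the resulting vector would additionally be projected onto the feasible set (e.g. unit-modulus phase-shifter entries), a step omitted here.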