
    Linear and Order Statistics Combiners for Pattern Classification

    Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance and hence reduces the "added" error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners that are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order-statistics-based non-linear combiners, we derive expressions that indicate how much the median, the maximum, and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
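    To make the combining schemes concrete, here is a minimal sketch of the mean, median, maximum, and i-th order statistic combiners applied to classifier output scores (function and variable names are illustrative, not from the chapter). Under the chapter's assumptions, the "mean" mode over N unbiased, uncorrelated classifiers is the case whose added error shrinks by a factor of N.

        import numpy as np

        def combine_outputs(scores, mode="mean", i=None):
            """Combine per-classifier scores in output space.
            scores: array of shape (n_classifiers, n_samples, n_classes)."""
            if mode == "mean":
                return scores.mean(axis=0)          # simple averaging combiner
            if mode == "median":
                return np.median(scores, axis=0)    # median combiner
            if mode == "max":
                return scores.max(axis=0)           # maximum combiner
            if mode == "order":
                return np.sort(scores, axis=0)[i]   # i-th order statistic (0-based rank)
            raise ValueError(mode)

        # Example: N = 5 classifiers, 1 sample, 3 classes.
        rng = np.random.default_rng(0)
        scores = rng.dirichlet(np.ones(3), size=(5, 1))
        labels = combine_outputs(scores, "mean").argmax(axis=-1)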

    Ensemble learning for blending gridded satellite and gauge-measured precipitation data

    Regression algorithms are regularly used for improving the accuracy of satellite precipitation products. In this context, ground-based measurements are the dependent variable and the satellite data are the predictor variables, together with topography factors. Alongside this, it is increasingly recognised in many fields that combining algorithms through ensemble learning can lead to substantial predictive performance improvements. Still, a sufficient number of ensemble learners for improving the accuracy of satellite precipitation products, together with a large-scale comparison of such learners, is currently missing from the literature. In this work, we fill this specific gap by proposing 11 new ensemble learners in the field and by extensively comparing them over the entire contiguous United States and a 15-year period. We use monthly data from the PERSIANN (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks) and IMERG (Integrated Multi-satellitE Retrievals for GPM) gridded datasets. We also use gauge-measured precipitation data from the Global Historical Climatology Network monthly database, version 2 (GHCNm). The ensemble learners combine the predictions of six regression algorithms (base learners), namely multivariate adaptive regression splines (MARS), multivariate adaptive polynomial splines (poly-MARS), random forests (RF), gradient boosting machines (GBM), extreme gradient boosting (XGBoost) and Bayesian regularized neural networks (BRNN), and each of them is based on a different combiner. The combiners include the equal-weight combiner, the median combiner, two best-learner combiners and seven variants of a sophisticated stacking method. The latter stacks a regression algorithm on top of the base learners to combine their independent predictions...
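    As a sketch of how such combiners operate on base-learner predictions, the following minimal illustration uses a plain linear meta-model as a stand-in for the paper's stacking variants; the matrix names and the choice of meta-model are assumptions, not the paper's setup.

        import numpy as np
        from sklearn.linear_model import LinearRegression

        # P_train, P_test: base-learner predictions, shape (n_samples, n_base_learners);
        # y_train: gauge-measured precipitation at the training points.
        def equal_weight(P):
            return P.mean(axis=1)        # equal-weight combiner

        def median_combiner(P):
            return np.median(P, axis=1)  # median combiner

        def stacking(P_train, y_train, P_test):
            # Stack a regression algorithm on top of the base learners'
            # independent predictions; here a plain linear model.
            meta = LinearRegression().fit(P_train, y_train)
            return meta.predict(P_test)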

    A Network Topology for Composable Infrastructures

    This paper proposes a passive optical backplane as a new network topology for composable computing infrastructures. The topology provides a high-capacity, low-latency and flexible fabric that interconnects disaggregated resource components. The network topology is dedicated to inter-resource communication between composed logical hosts to ensure effective performance. We formulated a mixed integer linear programming (MILP) model that dynamically creates logical networks to support intra-logical-host communication over the physical network topology. The MILP performs energy-efficient logical network instantiation given each application's resource demand. The topology can achieve 1 Tbps of capacity per resource node given an appropriate per-wavelength transmission data rate and number of wavelengths per node.
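    As a back-of-the-envelope illustration of the per-node capacity claim (the operating points below are hypothetical, not taken from the paper):

        # capacity per node = number of wavelengths x per-wavelength rate
        for wavelengths, gbps in [(10, 100), (25, 40), (40, 25)]:
            print(f"{wavelengths} wavelengths x {gbps} Gb/s = "
                  f"{wavelengths * gbps / 1000:.1f} Tbps per node")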

    Neural-Based Ensembles and Unorganized Machines to Predict Streamflow Series from Hydroelectric Plants

    Estimating future streamflows is a key step in producing electricity for countries with hydroelectric plants. Accurate predictions are particularly important due to the environmental and economic impacts they entail. In order to analyze the forecasting capability of models on monthly seasonal streamflow series, we carried out an extensive investigation considering six versions of unorganized machines: extreme learning machines (ELM) with and without a regularization coefficient (RC), and echo state networks (ESN) using the reservoir designs of Jaeger and of Ozturk et al., with and without RC. Additionally, we addressed the ELM as the combiner of a neural-based ensemble, an investigation not previously undertaken in this context. A comparative analysis was performed using two linear approaches (the autoregressive model (AR) and the autoregressive moving average model (ARMA)), four artificial neural networks (multilayer perceptron, radial basis function, Elman network, and Jordan network), and four ensembles. The tests were conducted at five hydroelectric plants, using horizons of 1, 3, 6, and 12 steps ahead. The results indicated that the unorganized machines and the ELM ensembles performed better than the linear models in all simulations. Moreover, the errors showed that the unorganized machines and the ELM-based ensembles reached the best overall performance.
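    For readers unfamiliar with unorganized machines, the following is a minimal sketch of an ELM with a regularization coefficient, used here as an ensemble combiner; array names and hyperparameter values are illustrative, not the paper's.

        import numpy as np

        def elm_fit(X, y, n_hidden=50, rc=1e-2, seed=0):
            """Extreme learning machine: fixed random hidden layer, then
            ridge-regularized least squares for the output weights."""
            rng = np.random.default_rng(seed)
            W = rng.standard_normal((X.shape[1], n_hidden))
            b = rng.standard_normal(n_hidden)
            H = np.tanh(X @ W + b)  # random hidden projection (never trained)
            # Output weights with regularization coefficient rc (rc=0 drops it).
            beta = np.linalg.solve(H.T @ H + rc * np.eye(n_hidden), H.T @ y)
            return W, b, beta

        def elm_predict(X, W, b, beta):
            return np.tanh(X @ W + b) @ beta

        # As an ensemble combiner: rows of P hold the member models' forecasts
        # for each month, and the ELM maps them to the observed streamflow.
        # W, b, beta = elm_fit(P_train, flow_train)
        # flow_hat = elm_predict(P_test, W, b, beta)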

    Deep Space Network information system architecture study

    The purpose of this article is to describe an architecture for the Deep Space Network (DSN) information system in the years 2000-2010 and to provide guidelines for its evolution during the 1990s. The study scope is defined to be from the front-end areas at the antennas to the end users (spacecraft teams, principal investigators, archival storage systems, and non-NASA partners). The architectural vision provides guidance for major DSN implementation efforts during the next decade. A strong motivation for the study is an expected dramatic improvement in information-systems technologies, such as the following: computer processing, automation technology (including knowledge-based systems), networking and data transport, software and hardware engineering, and human-interface technology. The proposed Ground Information System has the following major features: unified architecture from the front-end area to the end user; open-systems standards to achieve interoperability; DSN production of level 0 data; delivery of level 0 data from the Deep Space Communications Complex, if desired; dedicated telemetry processors for each receiver; security against unauthorized access and errors; and highly automated monitor and control.

    Improving binary classification using filtering based on k-NN proximity graphs

    One of the ways of increasing recognition ability in a classification problem is removing outlier entries, as well as redundant and unnecessary features, from the training set. Filtering and feature selection can have a large impact on classifier accuracy and area under the curve (AUC), as noisy data can confuse a classifier and lead it to learn wrong patterns in the training data. A common approach to data filtering is the use of proximity graphs. However, the problem of selecting optimal filtering parameters is still insufficiently researched. In this paper, a filtering procedure based on a k-nearest-neighbour (k-NN) proximity graph is used. Selection of the filtering parameters is cast as an outlier-minimization problem: the k-NN proximity graph, the power of the distance, and the threshold parameters are selected so as to minimize the percentage of outliers in the training data. The performance of six commonly used classifiers (Logistic Regression, Naïve Bayes, Neural Network, Random Forest, Support Vector Machine and Decision Tree) and of one heterogeneous classifier combiner (DES-LA) is then compared with and without filtering. Dynamic ensemble selection (DES) systems work by estimating the level of competence of each classifier in a pool of classifiers; only the most competent ones are selected to classify a given test sample. This is achieved by defining a criterion to measure the level of competence of the base classifiers, such as their accuracy in local regions of the feature space around the query instance. In our case, the combiner is based on the local accuracy of the single classifiers, and its output is a linear combination of the single classifiers' rankings. After filtering, the accuracy of the DES-LA combiner shows a large increase on low-accuracy datasets, but filtering does not have a substantial impact on DES-LA performance on high-accuracy datasets. The results are discussed, and the classifiers whose performance was most affected by the pre-processing filtering step are identified. The main contribution of the paper is the introduction of modifications to the DES-LA combiner, as well as a comparative analysis of the impact of filtering on classifiers of various types. Testing the filtering algorithm on a real-world dataset (the Taiwan default credit card dataset) confirmed the efficiency of the automatic filtering approach.
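    A minimal sketch of the kind of k-NN-based filter described above follows; the parameter names mirror the paper's k, power of distance, and threshold, but their values and this exact disagreement rule are illustrative assumptions.

        import numpy as np
        from sklearn.neighbors import NearestNeighbors

        def knn_filter(X, y, k=5, p=2, threshold=0.5):
            """Drop training entries whose k nearest neighbours disagree
            with their label more often than `threshold`."""
            nn = NearestNeighbors(n_neighbors=k + 1, p=p).fit(X)  # p: Minkowski power
            _, idx = nn.kneighbors(X)               # column 0 is the point itself
            disagree = (y[idx[:, 1:]] != y[:, None]).mean(axis=1)
            keep = disagree <= threshold
            return X[keep], y[keep]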

    Revisiting the Energy-Efficient Hybrid D-A Precoding and Combining Design For mm-Wave Systems

    Hybrid digital-analog (D-A) precoding is widely used in millimeter-wave systems to reduce the power consumption and implementation complexity incurred by the radio frequency (RF) chains, which consume much of the transmit power in such systems. In this paper, an optimal number of RF chains is proposed to achieve the desired energy efficiency (EE). The optimization problem is formulated as a fractional-programming maximization, resulting in a method with twofold novelty. First, the optimal number of RF chains is determined by the proposed bisection algorithm, which yields an optimized number of data streams. Second, the optimal analog precoders/combiners are designed by eigenvalue decomposition and a power iteration algorithm, followed by the digital precoders/combiners, which are designed based on the singular value decomposition of the proposed effective uplink and downlink channel gains. Furthermore, the proposed D-A systems are designed carefully to attain lower complexity than existing D-A algorithms while achieving reasonable performance. Finally, the impact of different numbers of quantization bits on the EE is investigated. Simulation results show that the proposed algorithms outperform existing algorithms in terms of EE, spectral efficiency, and computational complexity.
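    As an illustration of one building block, here is a sketch of the power-iteration step for steering a constant-modulus analog precoder/combiner toward the channel's dominant eigendirection; the dimensions and normalization are assumptions, not the paper's exact algorithm.

        import numpy as np

        def power_iteration(A, n_iter=100, seed=0):
            """Dominant eigenvector of a Hermitian matrix A by power iteration."""
            rng = np.random.default_rng(seed)
            n = A.shape[0]
            v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
            for _ in range(n_iter):
                v = A @ v                 # amplify the dominant eigendirection
                v /= np.linalg.norm(v)    # renormalize each step
            return v

        # Constant-modulus analog weights from the channel covariance H H^H.
        rng = np.random.default_rng(1)
        H = rng.standard_normal((64, 16)) + 1j * rng.standard_normal((64, 16))
        v = power_iteration(H @ H.conj().T)
        f_rf = np.exp(1j * np.angle(v)) / np.sqrt(len(v))  # unit-modulus entries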