2,773 research outputs found
On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems
We present a new distributed fuzzy partitioning method to reduce the
complexity of multi-way fuzzy decision trees in Big Data classification
problems. The proposed algorithm builds a fixed number of fuzzy sets for all
variables and adjusts their shape and position to the real distribution of
training data. A two-step process is applied: 1) transformation of the
original distribution into a standard uniform distribution by means of the
probability integral transform. Since the original distribution is generally
unknown, the cumulative distribution function is approximated by computing the
q-quantiles of the training set; 2) construction of a Ruspini strong fuzzy
partition in the transformed attribute space using a fixed number of equally
distributed triangular membership functions. Despite the aforementioned
transformation, the definition of every fuzzy set in the original space can be
recovered by applying the inverse cumulative distribution function (also known
as quantile function). The experimental results reveal that the proposed
methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT)
induction algorithm to maintain classification accuracy with up to 6 million
fewer leaves.
Comment: Appeared in 2018 IEEE International Congress on Big Data (BigData
Congress). arXiv admin note: text overlap with arXiv:1902.0935
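The two-step partitioning described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' distributed implementation: the function names `make_empirical_cdf` and `triangular_memberships`, the defaults `q=100` and `m=5`, and the linear interpolation between quantiles are all hypothetical choices made for the example.

```python
from bisect import bisect_right


def make_empirical_cdf(train_values, q=100):
    """Approximate the CDF of one attribute from the q-quantiles of the training set."""
    xs = sorted(train_values)
    n = len(xs)
    # q + 1 cut points: cuts[k] approximates the quantile at level k / q.
    cuts = [xs[round(k * (n - 1) / q)] for k in range(q + 1)]

    def cdf(x):
        if x <= cuts[0]:
            return 0.0
        if x >= cuts[-1]:
            return 1.0
        i = bisect_right(cuts, x) - 1  # cuts[i] <= x < cuts[i + 1]
        lo, hi = cuts[i], cuts[i + 1]
        # Linear interpolation between neighbouring quantile levels (an assumption).
        return (i + (x - lo) / (hi - lo)) / q

    return cdf


def triangular_memberships(u, m=5):
    """Membership degrees of u in [0, 1] to m equally spaced triangular fuzzy sets."""
    width = 1.0 / (m - 1)  # distance between neighbouring set centres
    return [max(0.0, 1.0 - abs(u - j * width) / width) for j in range(m)]


# Usage: a skewed attribute is mapped to [0, 1], then fuzzified.
train = [v ** 2 for v in range(101)]
cdf = make_empirical_cdf(train, q=100)
degrees = triangular_memberships(cdf(2500), m=5)
```

Because neighbouring triangles overlap exactly at half width, the degrees returned for any value in [0, 1] sum to 1, which is the defining property of a Ruspini strong partition.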
Regional flood frequency analysis using the FCM-ANFIS algorithm: a case study in south-eastern Australia
Regional flood frequency analysis (RFFA) is widely used to estimate design floods in ungauged catchments. Both linear and non-linear methods are adopted in RFFA. This study presents the development of a non-linear RFFA method, the Adaptive Neuro-fuzzy Inference System (ANFIS), using data from 181 gauged catchments in south-eastern Australia. Three different types of ANFIS models, Fuzzy C-means (FCM), Subtractive Clustering (SC), and Grid Partitioning (GP), were adopted, and the results were compared with the Quantile Regression Technique (QRT). It was found that FCM performs better (with relative error (RE) values in the range of 38-60%) than the SC (RE of 44-69%) and GP (RE of 42-78%) models. The FCM model performs better for small to medium average recurrence intervals (ARIs) of 2 to 20 years, with the five-year ARI showing the best performance, and better in New South Wales than in Victoria. In many respects, the QRT and FCM models perform very similarly. The developed RFFA models can be used in south-eastern Australia to derive more accurate flood quantiles. The developed method can easily be adapted to other parts of Australia and other countries. The results of this study will assist in updating the RFFA technique recommended in Australian Rainfall and Runoff (the national guide).
Characterisation of large changes in wind power for the day-ahead market using a fuzzy logic approach
Wind power has become one of the fastest-growing renewable resources in the electricity market. However, due to its inherent variability, forecasting techniques are necessary for the optimum scheduling of the electric grid, especially during ramp events. These large changes in wind power may not be captured by wind power point forecasts, even with very high resolution Numerical Weather Prediction (NWP) models. In this paper, a fuzzy approach for wind power ramp characterisation is presented. The main benefit of this technique is that it avoids the binary definition of a ramp event, making it possible to identify changes in power output that can potentially turn into ramp events even when the total percentage of change required to be considered a ramp event is not met. To study the application of this technique, wind power forecasts were obtained and their corresponding errors estimated using Genetic Programming (GP) and Quantile Regression Forests. The error distributions were incorporated into the characterisation process, which, according to the results, significantly improves ramp capture. Results are presented using colour maps, which provide a useful way to interpret the characteristics of the ramp events.
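The idea of replacing a binary ramp definition with a graded one can be illustrated with a simple piecewise-linear membership function. This is only a sketch: the abstract does not give the membership shape or thresholds, so the function name `ramp_membership` and the `onset` and `full` values (in percent of rated capacity) are hypothetical.

```python
def ramp_membership(power_change_pct, onset=20.0, full=50.0):
    """Degree in [0, 1] to which a change in power output counts as a ramp event.

    onset and full are hypothetical thresholds, in percent of rated capacity:
    below onset the change is not a ramp at all, above full it is fully a ramp.
    """
    magnitude = abs(power_change_pct)  # up-ramps and down-ramps treated alike
    if magnitude <= onset:
        return 0.0
    if magnitude >= full:
        return 1.0
    # Linear transition between "no ramp" and "full ramp".
    return (magnitude - onset) / (full - onset)
```

Under a binary rule with a 50% threshold, a 35% change would be discarded; here it receives a degree of 0.5 and can be flagged as a potential ramp.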
Deep Generative Models for Reject Inference in Credit Scoring
Credit scoring models based on accepted applications may be biased and their
consequences can have a statistical and economic impact. Reject inference is
the process of attempting to infer the creditworthiness status of the rejected
applications. In this research, we use deep generative models to develop two
new semi-supervised Bayesian models for reject inference in credit scoring, in
which we model the data generating process to be dependent on a Gaussian
mixture. The goal is to improve the classification accuracy in credit scoring
models by adding rejected applications. Our proposed models infer the unknown
creditworthiness of the rejected applications by exact enumeration of the two
possible outcomes of the loan (default or non-default). The efficient
stochastic gradient optimization technique used in deep generative models makes
our models suitable for large data sets. Finally, the experiments in this
research show that our proposed models perform better than classical and
alternative machine learning models for reject inference in credit scoring.
A robust fault diagnosis and forecasting approach based on Kalman filter and interval type-2 fuzzy logic for efficiency improvement of centrifugal gas compressor system
The paper proposes a robust fault detection and forecasting approach for a centrifugal gas compressor system. The approach uses the Kalman filter to estimate and filter the unmeasured states of the studied system from input and output signal data collected experimentally on site. The intelligent fault detection expert system is designed using interval type-2 fuzzy logic. The work also addresses an important task: predicting the time remaining before the system under study reaches the danger and/or failure stage, based on the Auto-regressive Integrated Moving Average (ARIMA) model, with the objective, in industrial applications, of setting maintenance schedules precisely in time. The obtained results demonstrate the performance of the proposed fault diagnosis and detection approach, which can be used in several heavy industrial systems.
Peer Reviewed. Postprint (published version)
Development of Neurofuzzy Architectures for Electricity Price Forecasting
In the 20th century, many countries liberalized their electricity markets. This liberalization has exposed generation companies as well as wholesale buyers to greater risk than under the old centralized framework. In this setting, electricity price prediction has become crucial for market players in their decision-making and strategic planning. In this study, a prototype asymmetric-based neuro-fuzzy network (AGFINN) architecture has been implemented for short-term electricity price forecasting for the ISO New England market. The AGFINN framework has been designed with two different defuzzification schemes. Fuzzy clustering has been explored as an initial step for defining the fuzzy rules, while an asymmetric Gaussian membership function has been utilized in the fuzzification part of the model. Results for the minimum and maximum electricity prices for ISO New England emphasize the superiority of the proposed model over well-established learning-based models.
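An asymmetric Gaussian membership function of the kind mentioned in this abstract can be sketched as below. The abstract does not state the exact parameterisation used in AGFINN, so this assumes a common form with one centre and separate left and right spreads; the names `asymmetric_gaussian`, `sigma_left` and `sigma_right` are hypothetical.

```python
import math


def asymmetric_gaussian(x, centre, sigma_left, sigma_right):
    """Gaussian membership with different spreads on each side of the centre.

    Assumed form: exp(-(x - centre)^2 / (2 * sigma^2)), where sigma is
    sigma_left for x < centre and sigma_right otherwise.
    """
    sigma = sigma_left if x < centre else sigma_right
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))
```

With a wider right spread, prices above the centre keep a higher membership degree than prices the same distance below it, which lets a single fuzzy set follow a skewed price distribution.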
Sustainable Assessment in Supply Chain and Infrastructure Management
In a competitive business environment or the public domain, sustainability assessment in supply chain and infrastructure management is important for any organization. Organizations are currently striving to improve their sustainability strategies through preparedness, response, and recovery because of increasing competitiveness, community, and regulatory pressure. Thus, it is necessary to develop a meaningful and more focused understanding of sustainability in supply chain management and infrastructure management practices. In the context of a supply chain, sustainability implies that companies identify, assess, and manage impacts and risks in all the echelons of the supply chain, considering downstream and upstream activities. Similarly, sustainable infrastructure management refers to the ability of infrastructure to meet the requirements of the present without sacrificing the ability of future generations to address their needs. The complexities of sustainable supply chain and infrastructure management have driven managers and professionals to seek different solutions. This Special Issue aims to provide readers with the most recent research results on the aforementioned subjects. In addition, it offers some solutions and raises some questions for further research and development toward sustainable supply chain and infrastructure management.
Quantile regression forests-based modeling and environmental indicators for decision support in broiler farming
Efficient and sustainable animal production requires fine-tuning and control of all the parameters involved, but this is not a simple task. Animal farming is a complex biological system in which environmental parameters and management practices interact in a dynamic way. In addition, the typical non-linear response of biological processes implies that relationships across parameters that are critical to assure animal welfare and performance are difficult to determine. In this paper, a novel decision support system based on environmental indicators and on weights, leg problems and mortality rates is proposed to address this issue. The data-driven modeling process is performed by a quantile regression forests approach that allows estimating growth, welfare and mortality parameters on the basis of environmental deviations from optimal farm conditions. The resulting models also provide confidence intervals able to deal with uncertainty. They are deployed on the farm, offering an accessible tool for farmers, veterinarians and technical personnel. Experimental results involving 20 flocks of broiler meat chickens from different farms show the validity of the system, obtaining robust prediction intervals and high accuracy, namely over 81% for every model. The in-field use of the proposed approach will facilitate efficient and animal welfare-friendly production management.
This project was funded by the Spanish Ministry of Economy and Competitiveness, General Directorate for Science and Technology, National Research Program "Retos de la Sociedad" Project #AGL2013-49173-C2-1-R P.I. Inma Estevez and #AGL2013-49173-C2-2-R. The authors wish to thank AN and the farmers for facilitating access to their farms for data collection.