5 research outputs found

    A Metahierarchical Rule Decision System to Design Robust Fuzzy Classifiers Based on Data Complexity

    There is a wide variety of studies that propose different classifiers to solve a large number of problems in distinct classification scenarios. The no free lunch theorem states that if we use a big enough set of varied problems, all classifiers would be equivalent in performance. From another point of view, the performance of the classifiers is dependent on the scope and properties of the datasets. In this sense, new proposals on the topic often focus on a given context, aiming at improving the related state-of-the-art approaches. Data complexity metrics have been traditionally used to determine the inner characteristics of datasets. This way, researchers are able to categorize the problems in different scenarios. Then, this taxonomy can be applied to determine intervals of good and bad behavior for a given classifier. In this paper, we will take advantage of the data complexity metrics in order to design a fuzzy metaclassifier. The final goal is to create decision rules based on the inner characteristics of the data to apply a different version of the fuzzy classifier for a given problem. To do so, we will make use of the FARC-HD classifier, an evolutionary fuzzy system that has led to different extensions in the specialized literature. Experimental results show the merit of this novel approach, as it is able to outperform all versions of FARC-HD on a wide set of problems and to obtain competitive results (in terms of performance and interpretability) versus two selected state-of-the-art rule-based classification systems, C4.5 and FURIA.

    Revisiting Data Complexity Metrics Based on Morphology for Overlap and Imbalance: Snapshot, New Overlap Number of Balls Metrics and Singular Problems Prospect

    Data Science and Machine Learning have become fundamental assets for companies and research institutions alike. As one of their fields, supervised classification allows for class prediction of new samples, learning from given training data. However, some properties can cause datasets to be problematic to classify. In order to evaluate a dataset a priori, data complexity metrics have been used extensively. They provide information regarding different intrinsic characteristics of the data, which serves to evaluate classifier compatibility and to choose a course of action that improves performance. However, most complexity metrics focus on just one characteristic of the data, which can be insufficient to properly evaluate the dataset with respect to the classifiers' performance. In fact, class overlap, a very detrimental feature for the classification process (especially when imbalance among class labels is also present), is hard to assess. This research work focuses on revisiting complexity metrics based on data morphology. In accordance with their nature, the premise is that they provide both good estimates for class overlap and strong correlations with the classification performance. For that purpose, a novel family of metrics has been developed. Being based on ball coverage by classes, they are named Overlap Number of Balls. Finally, some prospects for the adaptation of this family of metrics to singular (more complex) problems are discussed. Comment: 23 pages, 9 figures, preprint

    Policy resolution of shared data in online social networks

    Online social networks have practically become a go-to source for divulging information, social exchanges and finding new friends. The popularity of such sites is so profound that they are widely used by people belonging to different age groups and various regions. Widespread use of such sites has given rise to privacy and security issues. This paper proposes a set of rules to be incorporated to safeguard the privacy policies of related users while sharing information and other forms of media online. The proposed access control network takes into account the content sensitivity and the confidence level of the accessor to resolve the conflicting privacy policies of the co-owners.

    An evolutionary fuzzy system to support the replacement policy in water supply networks: The ranking of pipes according to their failure risk

    Article number 107731. In this study, an evolutionary fuzzy system is proposed to predict unexpected pipe failures in water supply networks. The system seeks to underpin the decisions of management companies regarding the maintenance and replacement plans of pipes. On the one hand, fuzzy logic provides a higher degree of interpretability than black box models, which is required in engineering applications where decisions have social consequences. On the other hand, the genetic algorithm helps to optimize the parameters that govern the model, specifically for two purposes: (i) the selection of variables; and (ii) the optimization of membership functions. Data from a real water supply network are used to evaluate the accuracy of the developed system. Several graphs that depict the ranking of pipes according to their risk of failure against the network length to be replaced support the choice of the most successful model. In fact, results demonstrate that the annual replacement of 6.75% of the network length makes it possible to prevent 41.14% of unexpected pipe failures.
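The evaluation described in this abstract, ranking pipes by predicted failure risk and reading off the share of failures prevented against the share of network length replaced, can be illustrated with a minimal sketch. It is not the authors' method: the function name `failures_prevented_curve`, the tuple layout `(length, risk_score, failed)`, and the toy data are all assumptions made here for illustration, and the risk scores stand in for whatever a fitted model would output.

```python
def failures_prevented_curve(pipes):
    """Build a (fraction of length replaced, fraction of failures prevented) curve.

    `pipes` is a hypothetical list of (length, risk_score, failed) tuples,
    where `failed` marks pipes that actually failed in the evaluation period.
    """
    # Replace the riskiest pipes first.
    ranked = sorted(pipes, key=lambda p: p[1], reverse=True)
    total_length = sum(p[0] for p in ranked)
    total_failures = sum(1 for p in ranked if p[2])

    curve = []
    replaced_length = 0.0
    prevented = 0
    for length, _risk, failed in ranked:
        replaced_length += length
        prevented += int(failed)
        share_prevented = prevented / total_failures if total_failures else 0.0
        curve.append((replaced_length / total_length, share_prevented))
    return curve


# Toy data (assumed, not from the paper): four pipes, two of which failed.
toy_pipes = [(100, 0.9, True), (300, 0.1, False), (100, 0.5, True), (500, 0.2, False)]
for length_share, failure_share in failures_prevented_curve(toy_pipes):
    print(f"{length_share:.0%} of length replaced, {failure_share:.0%} of failures prevented")
```

A figure such as "6.75% of the network length prevents 41.14% of failures" corresponds to a single point on a curve of this kind; a better ranking pushes the curve further above the diagonal.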