26,706 research outputs found

    On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems

    Full text link
    We present a new distributed fuzzy partitioning method to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems. The proposed algorithm builds a fixed number of fuzzy sets for all variables and adjusts their shape and position to the real distribution of training data. A two-step process is applied : 1) transformation of the original distribution into a standard uniform distribution by means of the probability integral transform. Since the original distribution is generally unknown, the cumulative distribution function is approximated by computing the q-quantiles of the training set; 2) construction of a Ruspini strong fuzzy partition in the transformed attribute space using a fixed number of equally distributed triangular membership functions. Despite the aforementioned transformation, the definition of every fuzzy set in the original space can be recovered by applying the inverse cumulative distribution function (also known as quantile function). The experimental results reveal that the proposed methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT) induction algorithm to maintain classification accuracy with up to 6 million fewer leaves.Comment: Appeared in 2018 IEEE International Congress on Big Data (BigData Congress). arXiv admin note: text overlap with arXiv:1902.0935

    On the role of pre and post-processing in environmental data mining

    Get PDF
    The quality of discovered knowledge is highly depending on data quality. Unfortunately real data use to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex is the reality to be analyzed, the higher the risk of getting low quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results, depend not only on the quality of the results themselves, but on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex and environmental users particularly require clarity in their results. In this paper some details about how this can be achieved are provided. The role of the pre and post processing in the whole process of Knowledge Discovery in environmental systems is discussed

    A new and efficient intelligent collaboration scheme for fashion design

    Get PDF
    Technology-mediated collaboration process has been extensively studied for over a decade. Most applications with collaboration concepts reported in the literature focus on enhancing efficiency and effectiveness of the decision-making processes in objective and well-structured workflows. However, relatively few previous studies have investigated the applications of collaboration schemes to problems with subjective and unstructured nature. In this paper, we explore a new intelligent collaboration scheme for fashion design which, by nature, relies heavily on human judgment and creativity. Techniques such as multicriteria decision making, fuzzy logic, and artificial neural network (ANN) models are employed. Industrial data sets are used for the analysis. Our experimental results suggest that the proposed scheme exhibits significant improvement over the traditional method in terms of the time–cost effectiveness, and a company interview with design professionals has confirmed its effectiveness and significance

    Collaborative decision making by ensemble rule based classification systems

    Get PDF
    corecore