15,928 research outputs found

    A Survey on Soft Subspace Clustering

    Full text link
    Subspace clustering (SC) is a promising clustering technology to identify clusters based on their associations with subspaces in high dimensional spaces. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and well accepted by the scientific community, SSC algorithms are relatively new but gaining more attention in recent years due to better adaptability. In the paper, a comprehensive survey on existing SSC algorithms and the recent development are presented. The SSC algorithms are classified systematically into three main categories, namely, conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201

    On the role of pre and post-processing in environmental data mining

    Get PDF
    The quality of discovered knowledge is highly depending on data quality. Unfortunately real data use to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex is the reality to be analyzed, the higher the risk of getting low quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results, depend not only on the quality of the results themselves, but on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex and environmental users particularly require clarity in their results. In this paper some details about how this can be achieved are provided. The role of the pre and post processing in the whole process of Knowledge Discovery in environmental systems is discussed

    New methods for the estimation of Takagi-Sugeno model based extended Kalman filter and its applications to optimal control for nonlinear systems

    Get PDF
    This paper describes new approaches to improve the local and global approximation (matching) and modeling capability of Takagi–Sugeno (T-S) fuzzy model. The main aim is obtaining high function approximation accuracy and fast convergence. The main problem encountered is that T-S identification method cannot be applied when the membership functions are overlapped by pairs. This restricts the application of the T-S method because this type of membership function has been widely used during the last 2 decades in the stability, controller design of fuzzy systems and is popular in industrial control applications. The approach developed here can be considered as a generalized version of T-S identification method with optimized performance in approximating nonlinear functions. We propose a noniterative method through weighting of parameters approach and an iterative algorithm by applying the extended Kalman filter, based on the same idea of parameters’ weighting. We show that the Kalman filter is an effective tool in the identification of T-S fuzzy model. A fuzzy controller based linear quadratic regulator is proposed in order to show the effectiveness of the estimation method developed here in control applications. An illustrative example of an inverted pendulum is chosen to evaluate the robustness and remarkable performance of the proposed method locally and globally in comparison with the original T-S model. Simulation results indicate the potential, simplicity, and generality of the algorithm. An illustrative example is chosen to evaluate the robustness. In this paper, we prove that these algorithms converge very fast, thereby making them very practical to use

    Taming Wild High Dimensional Text Data with a Fuzzy Lash

    Full text link
    The bag of words (BOW) represents a corpus in a matrix whose elements are the frequency of words. However, each row in the matrix is a very high-dimensional sparse vector. Dimension reduction (DR) is a popular method to address sparsity and high-dimensionality issues. Among different strategies to develop DR method, Unsupervised Feature Transformation (UFT) is a popular strategy to map all words on a new basis to represent BOW. The recent increase of text data and its challenges imply that DR area still needs new perspectives. Although a wide range of methods based on the UFT strategy has been developed, the fuzzy approach has not been considered for DR based on this strategy. This research investigates the application of fuzzy clustering as a DR method based on the UFT strategy to collapse BOW matrix to provide a lower-dimensional representation of documents instead of the words in a corpus. The quantitative evaluation shows that fuzzy clustering produces superior performance and features to Principal Components Analysis (PCA) and Singular Value Decomposition (SVD), two popular DR methods based on the UFT strategy

    Latent class analysis for segmenting preferences of investment bonds

    Get PDF
    Market segmentation is a key component of conjoint analysis which addresses consumer preference heterogeneity. Members in a segment are assumed to be homogenous in their views and preferences when worthing an item but distinctly heterogenous to members of other segments. Latent class methodology is one of the several conjoint segmentation procedures that overcome the limitations of aggregate analysis and a-priori segmentation. The main benefit of Latent class models is that market segment membership and regression parameters of each derived segment are estimated simultaneously. The Latent class model presented in this paper uses mixtures of multivariate conditional normal distributions to analyze rating data, where the likelihood is maximized using the EM algorithm. The application focuses on customer preferences for investment bonds described by four attributes; currency, coupon rate, redemption term and price. A number of demographic variables are used to generate segments that are accessible and actionable.peer-reviewe
    corecore