2,853 research outputs found

    Optimal Kullback-Leibler Aggregation via Information Bottleneck

    Full text link
    In this paper, we present a method for reducing a regular, discrete-time Markov chain (DTMC) to another DTMC with a given, typically much smaller number of states. The cost of reduction is defined as the Kullback-Leibler divergence rate between a projection of the original process through a partition function and a DTMC on the correspondingly partitioned state space. Finding the reduced model with minimal cost is computationally expensive, as it requires an exhaustive search among all state space partitions, and an exact evaluation of the reduction cost for each candidate partition. Our approach deals with the latter problem by minimizing an upper bound on the reduction cost instead of minimizing the exact cost; The proposed upper bound is easy to compute and it is tight if the original chain is lumpable with respect to the partition. Then, we express the problem in the form of information bottleneck optimization, and propose using the agglomerative information bottleneck algorithm for searching a sub-optimal partition greedily, rather than exhaustively. The theory is illustrated with examples and one application scenario in the context of modeling bio-molecular interactions.Comment: 13 pages, 4 figure

    Adaptive Evolutionary Clustering

    Full text link
    In many practical applications of clustering, the objects to be clustered evolve over time, and a clustering result is desired at each time step. In such applications, evolutionary clustering typically outperforms traditional static clustering by producing clustering results that reflect long-term trends while being robust to short-term variations. Several evolutionary clustering algorithms have recently been proposed, often by adding a temporal smoothness penalty to the cost function of a static clustering method. In this paper, we introduce a different approach to evolutionary clustering by accurately tracking the time-varying proximities between objects followed by static clustering. We present an evolutionary clustering framework that adaptively estimates the optimal smoothing parameter using shrinkage estimation, a statistical approach that improves a naive estimate using additional information. The proposed framework can be used to extend a variety of static clustering algorithms, including hierarchical, k-means, and spectral clustering, into evolutionary clustering algorithms. Experiments on synthetic and real data sets indicate that the proposed framework outperforms static clustering and existing evolutionary clustering algorithms in many scenarios.Comment: To appear in Data Mining and Knowledge Discovery, MATLAB toolbox available at http://tbayes.eecs.umich.edu/xukevin/affec

    Learning Hybrid Neuro-Fuzzy Classifier Models From Data: To Combine or Not to Combine?

    Get PDF
    To combine or not to combine? Though not a question of the same gravity as the Shakespeare’s to be or not to be, it is examined in this paper in the context of a hybrid neuro-fuzzy pattern classifier design process. A general fuzzy min-max neural network with its basic learning procedure is used within six different algorithm independent learning schemes. Various versions of cross-validation, resampling techniques and data editing approaches, leading to a generation of a single classifier or a multiple classifier system, are scrutinised and compared. The classification performance on unseen data, commonly used as a criterion for comparing different competing designs, is augmented by further four criteria attempting to capture various additional characteristics of classifier generation schemes. These include: the ability to estimate the true classification error rate, the classifier transparency, the computational complexity of the learning scheme and the potential for adaptation to changing environments and new classes of data. One of the main questions examined is whether and when to use a single classifier or a combination of a number of component classifiers within a multiple classifier system

    Splitting hybrid Make-To-Order and Make-To-Stock demand profiles

    Get PDF
    In this paper a demand time series is analysed to support Make-To-Stock (MTS) and Make-To-Order (MTO) production decisions. Using a purely MTS production strategy based on the given demand can lead to unnecessarily high inventory levels thus it is necessary to identify likely MTO episodes. This research proposes a novel outlier detection algorithm based on special density measures. We divide the time series' histogram into three clusters. One with frequent-low volume covers MTS items whilst a second accounts for high volumes which is dedicated to MTO items. The third cluster resides between the previous two with its elements being assigned to either the MTO or MTS class. The algorithm can be applied to a variety of time series such as stationary and non-stationary ones. We use empirical data from manufacturing to study the extent of inventory savings. The percentage of MTO items is reflected in the inventory savings which were shown to be an average of 18.1%.Comment: demand analysis; time series; outlier detection; production strategy; Make-To-Order(MTO); Make-To-Stock(MTS); 15 pages, 9 figure
    • …
    corecore