1,175 research outputs found

    Partitioning Relational Matrices of Similarities or Dissimilarities using the Value of Information

    In this paper, we provide an approach to clustering relational matrices whose entries correspond to either similarities or dissimilarities between objects. Our approach is based on the value of information, a parameterized, information-theoretic criterion that measures the change in costs associated with changes in information. Optimizing the value of information yields a deterministic annealing style of clustering with many benefits. For instance, investigators need not specify the number of clusters a priori, as the partitions naturally undergo phase changes during the annealing process, whereby the number of clusters changes in a data-driven fashion. The global-best partition can also often be identified. Comment: Submitted to the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP

    Comparison and Weighted Summation Type of Fuzzy Cluster Validity Indices

    Finding the optimal cluster number and validating the partition results of a data set are difficult tasks since clustering is an unsupervised learning process. A cluster validity index (CVI) is a kind of criterion function for evaluating the clustering results and determining the optimal number of clusters. In this paper, we present an extensive comparison of ten well-known CVIs for fuzzy clustering. Then we extend traditional single CVIs by introducing the weighted method and propose a weighted summation type of CVI (WSCVI). Experiments on nine synthetic data sets and four real-world UCI data sets demonstrate that no one CVI performs better on all data sets than others. Nevertheless, the proposed WSCVI is more effective by properly setting the weights
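    The weighted-summation idea in this abstract can be illustrated with a minimal sketch. The abstract does not state how the individual indices are normalized or which indices are combined, so the min-max normalization, the two index names (`PC`, `XB`), and all numeric values below are assumptions made only for illustration, not the authors' method.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_summation_cvi(scores, weights, larger_is_better):
    """Combine several validity indices into one score per candidate cluster number.

    scores: dict mapping index name -> list of values, one per candidate.
    weights: dict mapping index name -> non-negative weight.
    larger_is_better: dict mapping index name -> bool (orientation of the index).
    """
    total_weight = sum(weights.values())
    n_candidates = len(next(iter(scores.values())))
    combined = [0.0] * n_candidates
    for name, values in scores.items():
        # Negate indices that should be minimized so that larger is always better.
        oriented = values if larger_is_better[name] else [-v for v in values]
        normalized = min_max(oriented)
        w = weights[name] / total_weight
        for j, v in enumerate(normalized):
            combined[j] += w * v
    return combined


# Hypothetical index values for candidate cluster numbers 2, 3, and 4.
candidates = [2, 3, 4]
scores = {"PC": [0.80, 0.90, 0.70], "XB": [0.30, 0.10, 0.40]}
weights = {"PC": 1.0, "XB": 1.0}
larger_is_better = {"PC": True, "XB": False}

combined = weighted_summation_cvi(scores, weights, larger_is_better)
best = candidates[combined.index(max(combined))]
```

    Here `best` is the candidate cluster number with the highest combined score; changing `weights` shifts how much each index influences that choice.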

    A User Recommendation Model for Answering Questions on Brainly Platform

    Brainly is a Community Question Answer (CQA) application that allows students or parents to ask questions related to their homework. Under the current mechanism, users ask questions, and other users who share the same subject interest can see and answer them. As a reward for answering questions, Brainly gives points; the number of points varies by question. Users with the highest point totals are automatically displayed on the smartest-user leaderboard on the site's front page. However, some users are not very active in answering questions, so an urgent question may go unanswered. This study implements the Fuzzy C-Means clustering method to improve Brainly's feature with regard to the speed and accuracy of answers. The idea is to create student clusters using the smartest students' leaderboard, subject interests, and answering activities. The research proceeded in four stages: Data Extraction, Preprocessing, Cluster Process, and User Recommender. The optimal number of clusters for answerer recommendation on the Brainly platform is two. The fuzzy partition coefficient for two clusters reached 0.97 for Mathematics and 0.93 for Indonesian. The recommendation results were influenced by answer ratings: many answers receive no rating, either because the answers may not be appropriate or because users neglect to give ratings
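    For readers unfamiliar with the fuzzy partition coefficient reported in this abstract, the following is a minimal sketch of its standard definition: the mean over samples of the sum of squared membership degrees. The membership matrices below are made-up illustrations, not data from the study.

```python
def fuzzy_partition_coefficient(memberships):
    """Return the fuzzy partition coefficient of a membership matrix.

    memberships[i][k] is the degree to which sample i belongs to cluster k;
    each row is assumed to sum to 1. The result lies between 1/c (maximally
    fuzzy, with c clusters) and 1 (completely crisp).
    """
    n_samples = len(memberships)
    if n_samples == 0:
        raise ValueError("membership matrix must contain at least one sample")
    return sum(u * u for row in memberships for u in row) / n_samples


# Illustrative (hypothetical) membership matrices for two clusters.
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
maximally_fuzzy = [[0.5, 0.5], [0.5, 0.5]]

fpc_crisp = fuzzy_partition_coefficient(crisp)
fpc_fuzzy = fuzzy_partition_coefficient(maximally_fuzzy)
```

    Values close to 1, such as the 0.97 and 0.93 quoted above, indicate that most samples belong almost entirely to a single cluster.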

    Enhanced Dark Block Extraction Method Performed Automatically to Determine the Number of Clusters in Unlabeled Data Sets

    One of the major issues in data cluster analysis is deciding the number of clusters or groups in a set of unlabeled data. In addition, the presentation of clusters should be analyzed to assess the accuracy of the clustered objects. This paper proposes a new method called Enhanced-Dark Block Extraction (E-DBE), which automatically identifies the number of object groups in unlabeled datasets. The proposed algorithm builds on an existing algorithm for visual assessment of the cluster tendency of a dataset and uses several common signal and image processing techniques. The method includes the following steps: 1. Generating an Enhanced Visual Assessment Tendency (E-VAT) image from a dissimilarity matrix, which is the input to the E-DBE algorithm. 2. Applying image segmentation to the E-VAT image to obtain a binary image, then applying filtering techniques. 3. Applying a distance transform to the filtered binary image and projecting the pixels onto the main diagonal of the image to form a projection signal. 4. Smoothing the projection signal, computing its first-order derivative, and then detecting major peaks and valleys in the resulting signal to obtain the number of clusters. E-DBE is a parameter-free algorithm for cluster analysis. Experiments with the method are presented on several UCI, synthetic, and real-world datasets
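    Step 4 of the method described above (smooth, differentiate, detect peaks) can be sketched roughly as follows. This is only an illustration of the general idea, not the E-DBE algorithm itself: the moving-average smoother, the `window` and `min_height` parameters, and the sample signal are all assumptions, whereas the abstract describes E-DBE as parameter-free.

```python
def moving_average(signal, window=3):
    """Smooth a 1-D signal with a centered moving average (shrunk at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def count_major_peaks(signal, window=3, min_height=0.0):
    """Count local maxima of the smoothed signal via sign changes of its derivative."""
    smooth = moving_average(signal, window)
    # First-order difference approximates the derivative.
    deriv = [b - a for a, b in zip(smooth, smooth[1:])]
    peaks = 0
    for i in range(1, len(deriv)):
        # A peak sits where the slope turns from positive to non-positive.
        if deriv[i - 1] > 0 and deriv[i] <= 0 and smooth[i] >= min_height:
            peaks += 1
    return peaks


# Hypothetical projection signal with two pronounced bumps.
projection = [0, 1, 4, 1, 0, 0, 1, 5, 1, 0]
n_peaks = count_major_peaks(projection, window=3, min_height=1.0)
```

    In this sketch the peak count stands in for the estimated number of clusters; a real implementation would also need a principled rule for deciding which peaks count as major.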

    Cluster validity in clustering methods

    The Giant Inflaton

    We investigate a new mechanism for realizing slow roll inflation in string theory, based on the dynamics of p anti-D3 branes in a class of mildly warped flux compactifications. Attracted to the bottom of a warped conifold throat, the anti-branes then cluster due to a novel mechanism wherein the background flux polarizes in an attempt to screen them. Once they are sufficiently close, the M units of flux cause the anti-branes to expand into a fuzzy NS5-brane, which for rather generic choices of p/M will unwrap around the geometry, decaying into D3-branes via a classical process. We find that the effective potential governing this evolution possesses several epochs that can potentially support slow-roll inflation, provided the process can be arranged to take place at a high enough energy scale, of about one or two orders of magnitude below the Planck energy; this scale, however, lies just outside the bounds of our approximations. Comment: 31 pages, 4 figures, LaTeX. v2: references added, typos fixed

    Methods for fast and reliable clustering

    Typicality, graded membership, and vagueness

    This paper addresses theoretical problems arising from the vagueness of language terms, and intuitions of the vagueness of the concepts to which they refer. It is argued that the central intuitions of prototype theory are sufficient to account for both typicality phenomena and psychological intuitions about degrees of membership in vaguely defined classes. The first section explains the importance of the relation between degrees of membership and typicality (or goodness of example) in conceptual categorization. The second and third sections address arguments advanced by Osherson and Smith (1997), and Kamp and Partee (1995), that the two notions of degree of membership and typicality must relate to fundamentally different aspects of conceptual representations. A version of prototype theory, the Threshold Model, is proposed to counter these arguments, and three possible solutions to the problems of logical self-contradiction and tautology for vague categorizations are outlined. In the final section, graded membership is related to the social construction of conceptual boundaries maintained through language use