23,694 research outputs found

    Searching for network modules

    When analyzing complex networks, a key goal is to uncover their modular structure, that is, to search for a family of modules: node subsets, each spanning a subnetwork more densely connected than the average. This work proposes a novel type of objective function for graph clustering, in the form of a multilinear polynomial whose coefficients are determined by network topology. It may be thought of as a potential function, to be maximized, taking its values on fuzzy clusterings, i.e. families of fuzzy subsets of nodes over which every node distributes a unit membership. When suitably parametrized, this potential is shown to attain its maximum when every node concentrates all of its unit membership on some module. The output is thus a partition, while the original discrete optimization problem is turned into a continuous version that allows alternative search strategies to be conceived. The instance of the problem being a pseudo-Boolean function assigning real-valued cluster scores to node subsets, modularity maximization is employed to exemplify a so-called quadratic form, in that the scores of singletons and pairs also fully determine the scores of larger clusters, while the resulting multilinear polynomial potential function has degree 2. After considering further quadratic instances, different from modularity and obtained by interpreting network topology in alternative manners, a greedy local-search strategy for the continuous framework is analytically compared with an existing greedy agglomerative procedure for the discrete case. Overlapping is finally discussed in terms of multiple runs, i.e. several local searches with different initializations. Comment: 10 page
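    The relation between modularity and a degree-2 multilinear potential can be illustrated with a minimal sketch. It is one reading of the abstract, not the paper's own formulation: the function names, the toy graph, and the choice of singleton and pair coefficients (taken from the standard modularity matrix B_ij = A_ij - k_i k_j / 2m, assuming a simple undirected graph without self-loops or repeated edges) are assumptions made here for illustration.

```python
from itertools import combinations


def modularity_matrix(edges, n):
    """Return (B, m) with B[i][j] = A_ij - k_i*k_j/(2m) for a simple undirected graph."""
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = A[j][i] = 1.0
    k = [sum(row) for row in A]          # node degrees
    m = len(edges)                       # number of edges
    B = [[A[i][j] - k[i] * k[j] / (2 * m) for j in range(n)] for i in range(n)]
    return B, m


def modularity(edges, n, labels):
    """Standard modularity of a hard partition given as one label per node."""
    B, m = modularity_matrix(edges, n)
    same = sum(B[i][j] for i in range(n) for j in range(n) if labels[i] == labels[j])
    return same / (2 * m)


def potential(edges, n, U):
    """Degree-2 multilinear potential over fuzzy memberships (illustrative assumption).

    U[i][c] is the membership of node i in cluster c; each row sums to 1.
    Singleton terms are linear in U[i][c]; pair terms use U[i][c] * U[j][c], i < j,
    so no variable is ever squared.
    """
    B, m = modularity_matrix(edges, n)
    total = 0.0
    for c in range(len(U[0])):
        total += sum(B[i][i] * U[i][c] for i in range(n))
        total += sum(2 * B[i][j] * U[i][c] * U[j][c]
                     for i, j in combinations(range(n), 2))
    return total / (2 * m)


# Toy graph (hypothetical): two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
labels = [0, 0, 0, 1, 1, 1]
hard_U = [[1.0, 0.0] if l == 0 else [0.0, 1.0] for l in labels]
fuzzy_U = [[0.5, 0.5] for _ in labels]

print(modularity(edges, 6, labels))   # modularity of the two-triangle partition
print(potential(edges, 6, hard_U))    # same value: the potential extends modularity
print(potential(edges, 6, fuzzy_U))   # value at the uniform fuzzy clustering
```

    On hard (0/1) memberships the potential coincides with modularity, which is the sense in which the continuous problem extends the discrete one. The greedy local search mentioned in the abstract is not reproduced here.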

    BigFCM: Fast, Precise and Scalable FCM on Hadoop

    Clustering plays an important role in mining big data, both as a modeling technique and as a preprocessing step in many data mining process implementations. Fuzzy clustering provides more flexibility than non-fuzzy methods by allowing each data record to belong to more than one cluster to some degree. However, a serious challenge in fuzzy clustering is the lack of scalability. Massive datasets in emerging fields such as geosciences, biology and networking require high-performance parallel and distributed computation to solve real-world problems. Although some clustering methods have already been adapted to run on big data platforms, their execution time increases sharply for large datasets. In this paper, a scalable Fuzzy C-Means (FCM) clustering method named BigFCM is proposed and designed for the Hadoop distributed data platform. Based on the map-reduce programming model, it exploits several mechanisms, including an efficient caching design, to achieve a reduction in execution time of several orders of magnitude. Extensive evaluation over multi-gigabyte datasets shows that BigFCM is scalable while preserving the quality of clustering.
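    For readers unfamiliar with the underlying algorithm, the following is a minimal single-machine sketch of textbook Fuzzy C-Means. It is not BigFCM and contains no map-reduce or caching logic. The function names, the fuzzifier default m = 2, the fixed iteration count, and the toy data are assumptions made for illustration.

```python
def memberships(points, centers, m=2.0):
    """FCM membership update: u_ij is proportional to d_ij^(-2/(m-1)); each row sums to 1."""
    U = []
    for x in points:
        d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
        if min(d2) == 0.0:
            # The point coincides with a center: give it all the membership
            # (split evenly if several centers coincide with the point).
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            U.append([h / sum(hits) for h in hits])
            continue
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]   # d2 is the squared distance
        s = sum(inv)
        U.append([v / s for v in inv])
    return U


def update_centers(points, U, m=2.0):
    """FCM center update: mean of the points weighted by u_ij^m.

    Assumes every cluster keeps some nonzero membership mass.
    """
    dim = len(points[0])
    centers = []
    for j in range(len(U[0])):
        w = [row[j] ** m for row in U]
        total = sum(w)
        centers.append([sum(wi * x[d] for wi, x in zip(w, points)) / total
                        for d in range(dim)])
    return centers


def fcm(points, centers, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations."""
    for _ in range(iters):
        U = memberships(points, centers, m)
        centers = update_centers(points, U, m)
    return centers, memberships(points, centers, m)


# Toy data (hypothetical): two well-separated groups in the plane.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, U = fcm(points, centers=[(0, 0), (10, 10)])
print(centers)
print(U[0], U[3])
```

    Both updates are sums over data records, which is what makes the algorithm a natural fit for a map-reduce formulation, although how BigFCM actually partitions and caches the work is not described in the abstract.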

    Mapping Diversity of Publication Patterns in the Social Sciences and Humanities: An Approach Making Use of Fuzzy Cluster Analysis

    Purpose: To present a method for systematically mapping the diversity of publication patterns at the author level in the social sciences and humanities in terms of publication type, publication language and co-authorship.
    Design/methodology/approach: In a follow-up to the hard partitioning clustering by Verleysen and Weeren in 2016, we now propose the complementary use of fuzzy cluster analysis, making use of a membership coefficient to study gradual differences between publication styles among authors within a scholarly discipline. Analyzing the probability density function of the membership coefficient makes it possible to assess the distribution of publication styles within and between disciplines.
    Findings: As an illustration we analyze 1,828 productive authors affiliated in Flanders, Belgium. Whereas a hard partitioning previously identified two broad publication styles, an international one vs. a domestic one, fuzzy analysis now shows gradual differences among authors. Internal diversity also varies across disciplines and can be explained by researchers' specialization and dissemination strategies.
    Research limitations: The dataset used is limited to one country for the years 2000-2011; a cognitive classification of authors may yield a different result from the affiliation-based classification used here.
    Practical implications: Our method is applicable to other bibliometric and research evaluation contexts, especially for the social sciences and humanities in non-Anglophone countries.
    Originality/value: The method proposed is a novel application of cluster analysis to the field of bibliometrics. Applied to publication patterns at the author level in the social sciences and humanities, for the first time it systematically documents intra-disciplinary diversity.

    Theoretical Interpretations and Applications of Radial Basis Function Networks

    Medical applications have usually used Radial Basis Function Networks (RBFNs) simply as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several ways: as Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, or Instance-Based Learners. A survey of these interpretations and of their corresponding learning algorithms is provided, as well as a brief survey of dynamic learning algorithms. The interpretations of RBFNs can suggest applications that are particularly interesting in medical domains.
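    Two of the listed interpretations can be made concrete with a small sketch: the same Gaussian hidden units give a neural-network-style weighted sum, and, once normalized, a kernel estimator of the Nadaraya-Watson type. The function names, the shared width sigma, and the example values are assumptions made for illustration and are not taken from the survey.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian radial basis function of the distance between x and center."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def rbf_output(x, centers, weights, sigma=1.0):
    """Neural-network view: weighted sum of the hidden-unit activations."""
    return sum(w * gaussian(x, c, sigma) for c, w in zip(centers, weights))


def kernel_estimate(x, centers, targets, sigma=1.0):
    """Kernel-estimator view: activations normalized to sum to 1.

    The centers play the role of stored samples and the targets of their
    observed outputs. Assumes x is close enough to some center that the
    activations do not all underflow to zero.
    """
    acts = [gaussian(x, c, sigma) for c in centers]
    return sum(a * t for a, t in zip(acts, targets)) / sum(acts)


# Hypothetical one-dimensional example with two units.
centers = [[0.0], [10.0]]
print(rbf_output([0.0], centers, weights=[1.0, -1.0]))
print(kernel_estimate([5.0], centers, targets=[1.0, 3.0]))
```

    In the normalized form the output is a convex combination of the stored targets, which is also why it reads naturally as an instance-based learner.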

    Dynamic distributed clustering in wireless sensor networks via Voronoi tessellation control

    This paper presents two dynamic and distributed clustering algorithms for Wireless Sensor Networks (WSNs). Clustering approaches are used in WSNs to improve the network lifetime and scalability by balancing the workload among the clusters. Each cluster is managed by a cluster head (CH) node. The first algorithm requires the CH nodes to be mobile: by dynamically varying the CH node positions, the algorithm is proved to converge to a specific partition of the mission area, the generalised Voronoi tessellation, in which the loads of the CH nodes are balanced. Conversely, if the CH nodes are fixed, a weighted Voronoi clustering approach is proposed with the same load-balancing objective: a reinforcement learning approach is used to dynamically vary the mission space partition by controlling the weights of the Voronoi regions. Numerical simulations are provided to validate the approaches.
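    How region weights can shift load between fixed cluster heads can be sketched as follows. The abstract does not say which weighted distance is used, so the additively weighted rule below (distance minus weight) is an assumption, as are the function names and the toy coordinates. The reinforcement learning controller that tunes the weights is not reproduced.

```python
import math


def assign(nodes, heads, weights):
    """Assign each node to the cluster head minimizing (distance - weight).

    With all weights equal this is the ordinary Voronoi partition; raising a
    head's weight enlarges its region. Ties go to the lowest head index.
    """
    labels = []
    for p in nodes:
        scores = [math.dist(p, h) - w for h, w in zip(heads, weights)]
        labels.append(scores.index(min(scores)))
    return labels


def loads(labels, n_heads):
    """Number of nodes served by each cluster head."""
    counts = [0] * n_heads
    for label in labels:
        counts[label] += 1
    return counts


# Hypothetical layout: two fixed heads and five sensor nodes.
heads = [(0, 0), (10, 0)]
nodes = [(1, 0), (2, 1), (4, 0), (6, 0), (9, 0)]

print(loads(assign(nodes, heads, [0.0, 0.0]), 2))  # plain Voronoi partition
print(loads(assign(nodes, heads, [0.0, 3.0]), 2))  # second head's region enlarged
```

    A controller with the load-balancing objective described above would adjust the weights until the per-head counts (or another load measure) even out.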