    Considerations about multistep community detection

    The problem of community detection in networks and its implications have attracted considerable attention because of important applications in both the natural and social sciences. A number of algorithms have been developed to solve this problem, addressing either speed optimization or the quality of the partitions calculated. In this paper we propose a multi-step procedure bridging the fastest but less accurate algorithms (coarse clustering) with the slowest, most effective ones (refinement). By adopting a heuristic ranking of the nodes and classifying a fraction of them as 'critical', a refinement step can be restricted to this subset of the network, thus saving computational time. Preliminary numerical results are discussed, showing improvement of the final partition. Comment: 12 pages

    Semi-Supervised Overlapping Community Finding based on Label Propagation with Pairwise Constraints

    Algorithms for detecting communities in complex networks are generally unsupervised, relying solely on the structure of the network. However, these methods can often fail to uncover meaningful groupings that reflect the underlying communities in the data, particularly when those structures are highly overlapping. One way to improve the usefulness of these algorithms is by incorporating additional background information, which can be used as a source of constraints to direct the community detection process. In this work, we explore the potential of semi-supervised strategies to improve algorithms for finding overlapping communities in networks. Specifically, we propose a new method, based on label propagation, for finding communities using a limited number of pairwise constraints. Evaluations on synthetic and real-world datasets demonstrate the potential of this approach for uncovering meaningful community structures in cases where each node can potentially belong to more than one community. Comment: Fix table

    On the Hardness of SAT with Community Structure

    Recent attempts to explain the effectiveness of Boolean satisfiability (SAT) solvers based on conflict-driven clause learning (CDCL) on large industrial benchmarks have focused on the concept of community structure. Specifically, industrial benchmarks have been empirically found to have good community structure, and experiments seem to show a correlation between such structure and the efficiency of CDCL. However, in this paper we establish hardness results suggesting that community structure is not sufficient to explain the success of CDCL in practice. First, we formally characterize a property shared by a wide class of metrics capturing community structure, including "modularity". Next, we show that SAT instances with good community structure according to any metric with this property are still NP-hard. Finally, we study a class of random instances generated from the "pseudo-industrial" community attachment model of Giráldez-Cru and Levy. We prove that, with high probability, instances from this model that have relatively few communities but are still highly modular require exponentially long resolution proofs and so are hard for CDCL. We also present experimental evidence that our result continues to hold for instances with many more communities. This indicates that actual industrial instances easily solved by CDCL may have some other relevant structure not captured by the community attachment model. Comment: 23 pages. Full version of a SAT 2016 paper

    Outlier Edge Detection Using Random Graph Generation Models and Applications

    Outliers are samples that are generated by different mechanisms from other normal data samples. Graphs, in particular social network graphs, may contain nodes and edges that are created by scammers, by malicious programs, or mistakenly by normal users. Detecting outlier nodes and edges is important for data mining and graph analytics. However, previous research in the field has focused only on detecting outlier nodes. In this article, we study the properties of edges and propose outlier edge detection algorithms using two random graph generation models. We found that the edge-ego-network, which can be defined as the induced graph that contains the two end nodes of an edge, their neighboring nodes and the edges that link these nodes, contains critical information for detecting outlier edges. We evaluated the proposed algorithms by injecting outlier edges into some real-world graph data. Experimental results show that the proposed algorithms can effectively detect outlier edges. In particular, the algorithm based on the Preferential Attachment Random Graph Generation model consistently gives good performance regardless of the test graph data. Furthermore, the proposed algorithms are not limited to the area of outlier edge detection. We demonstrate three different applications that benefit from the proposed algorithms: 1) a preprocessing tool that improves the performance of graph clustering algorithms; 2) an outlier node detection algorithm; and 3) a novel noisy data clustering algorithm. These applications show the great potential of the proposed outlier edge detection techniques. Comment: 14 pages, 5 figures, journal paper
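The edge-ego-network defined in the abstract above can be sketched in a few lines. This is only an illustration of the definition, not the paper's detection algorithm. It assumes an undirected simple graph stored as a dictionary of neighbor sets; the function name `edge_ego_network` and that representation are hypothetical choices made for this example.

```python
from typing import Dict, FrozenSet, Hashable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def edge_ego_network(
    adj: Graph, u: Hashable, v: Hashable
) -> Tuple[Set[Hashable], Set[FrozenSet[Hashable]]]:
    """Return (nodes, edges) of the subgraph induced by u, v and their neighbors.

    Assumes an undirected simple graph (no self-loops) given as node -> neighbor set.
    """
    # Node set: both end nodes of the edge plus all of their neighbors.
    nodes = {u, v} | adj[u] | adj[v]
    # Induced edges: every edge of the graph whose two ends lie in the node set.
    edges = {frozenset((a, b)) for a in nodes for b in adj[a] if b in nodes}
    return nodes, edges


# Small illustrative graph with edges 1-2, 1-3, 2-4, 3-4, 4-5, 5-6.
example: Graph = {
    1: {2, 3},
    2: {1, 4},
    3: {1, 4},
    4: {2, 3, 5},
    5: {4, 6},
    6: {5},
}
ego_nodes, ego_edges = edge_ego_network(example, 1, 2)
```

For the edge (1, 2) in this toy graph, the node set is {1, 2, 3, 4}. Node 5 is left out because it neighbors neither end node, so the edge 4-5 is not part of the induced graph.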

    Female economic dependence and the morality of promiscuity

    This article is made available through the Brunel Open Access Publishing Fund. Copyright @ The Author(s) 2014. In environments in which female economic dependence on a male mate is higher, male parental investment is more essential. In such environments, therefore, both sexes should value paternity certainty more and thus object more to promiscuity (because promiscuity undermines paternity certainty). We tested this theory of anti-promiscuity morality in two studies (N = 656 and N = 4,626) using U.S. samples. In both, we examined whether opposition to promiscuity was higher among people who perceived greater female economic dependence in their social network. In Study 2, we also tested whether economic indicators of female economic dependence (e.g., female income, welfare availability) predicted anti-promiscuity morality at the state level. Results from both studies supported the proposed theory. At the individual level, perceived female economic dependence explained significant variance in anti-promiscuity morality, even after controlling for variance explained by age, sex, religiosity, political conservatism, and the anti-promiscuity views of geographical neighbors. At the state level, median female income was strongly negatively related to anti-promiscuity morality and this relationship was fully mediated by perceived female economic dependence. These results were consistent with the view that anti-promiscuity beliefs may function to promote paternity certainty in circumstances where male parental investment is particularly important.

    Language comparison via network topology

    Modeling relations between languages can offer understanding of language characteristics and uncover similarities and differences between languages. Automated methods applied to large textual corpora can be seen as opportunities for novel statistical studies of language development over time, as well as for improving cross-lingual natural language processing techniques. In this work, we first propose how to represent textual data as a directed, weighted network using the text2net algorithm. We next explore how various fast, network-topological metrics, such as network community structure, can be used for cross-lingual comparisons. In our experiments, we employ eight different network topology metrics and empirically showcase, on a parallel corpus, how the methods can be used for modeling the relations between nine selected languages. We demonstrate that the proposed method scales to large corpora consisting of hundreds of thousands of aligned sentences on an off-the-shelf laptop. We observe that some properties, such as communities, capture some of the known differences between the languages, while others can be seen as novel opportunities for linguistic studies.

    Dynamic Community Detection into Analyzing of Wildfires Events

    The study and comprehension of complex systems are crucial intellectual and scientific challenges of the 21st century. In this scenario, network science has emerged as a mathematical tool to support the study of such systems. Examples include environmental processes such as wildfires, which are known for their considerable impact on human life. However, there is a considerable lack of studies of wildfires from a network science perspective. Here, employing the chronological network concept (a temporal network where nodes are linked if two consecutive events occur between them), we investigate the information that dynamic community structures reveal about wildfire dynamics. In particular, we explore a two-phase dynamic community detection approach: we applied the Louvain algorithm to a series of snapshots, and then used the Jaccard similarity coefficient to match communities across adjacent snapshots. Experiments with the MODIS dataset of fire events in the Amazon basin were conducted. Our results show that the dynamic communities can reveal wildfire patterns observed throughout the year. Comment: 16 pages, 8 figures
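The second phase described in the abstract above, matching communities across adjacent snapshots by Jaccard similarity, can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the per-snapshot communities (for example, from a Louvain run) are already available as lists of node sets. The function names and the 0.3 matching threshold are hypothetical; the abstract does not state a threshold.

```python
from typing import Dict, Hashable, List, Optional, Set

Community = Set[Hashable]


def jaccard(a: Community, b: Community) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_communities(
    prev: List[Community], curr: List[Community], threshold: float = 0.3
) -> Dict[int, Optional[int]]:
    """Map each community index in `curr` to its best match in `prev`.

    A community whose best Jaccard score is below `threshold` (an assumed
    cutoff) maps to None, meaning it is treated as newly appeared.
    """
    matches: Dict[int, Optional[int]] = {}
    for j, c in enumerate(curr):
        best_i: Optional[int] = None
        best_score = 0.0
        for i, p in enumerate(prev):
            score = jaccard(p, c)
            if score > best_score:
                best_i, best_score = i, score
        matches[j] = best_i if best_score >= threshold else None
    return matches


# Two adjacent snapshots with made-up node ids.
snapshot_t = [{1, 2, 3}, {4, 5, 6}]
snapshot_t1 = [{4, 5, 6, 7}, {1, 2}, {8, 9}]
mapping = match_communities(snapshot_t, snapshot_t1)
```

In this toy case, the first community of the later snapshot matches the second community of the earlier one (Jaccard 3/4), the second matches the first (Jaccard 2/3), and `{8, 9}` has no counterpart. Greedy best-match is one simple choice: it allows several current communities to map to the same predecessor, which may or may not be desirable depending on how merges and splits should be handled.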

    Community landscapes: an integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics

    Background: Network communities help the functional organization and evolution of complex networks. However, the development of a method that is both fast and accurate, and that provides modular overlaps and partitions of a heterogeneous network, has proven to be rather difficult. Methodology/Principal Findings: Here we introduce the novel concept of ModuLand, an integrative method family determining overlapping network modules as hills of an influence function-based, centrality-type community landscape, and including several widely used modularization methods as special cases. As various adaptations of the method family, we developed several algorithms, which provide an efficient analysis of weighted and directed networks, and (1) determine pervasively overlapping modules with high resolution; (2) uncover a detailed hierarchical network structure allowing an efficient, zoom-in analysis of large networks; (3) allow the determination of key network nodes and (4) help to predict network dynamics. Conclusions/Significance: The concept opens a wide range of possibilities to develop new approaches and applications including network routing, classification, comparison and prediction. Comment: 25 pages with 6 figures and a Glossary + Supporting Information containing pseudo-codes of all algorithms used, 14 Figures, 5 Tables (with 18 module definitions, 129 different modularization methods, 13 module comparison methods) and 396 references. All algorithms can be downloaded from this web-site: http://www.linkgroup.hu/modules.ph

    Emerging landscape of oncogenic signatures across human cancers.

    Cancer therapy is challenged by the diversity of molecular implementations of oncogenic processes and by the resulting variation in therapeutic responses. Projects such as The Cancer Genome Atlas (TCGA) provide molecular tumor maps in unprecedented detail. The interpretation of these maps remains a major challenge. Here we distilled thousands of genetic and epigenetic features altered in cancers to ∼500 selected functional events (SFEs). Using this simplified description, we derived a hierarchical classification of 3,299 TCGA tumors from 12 cancer types. The top classes are dominated by either mutations (M class) or copy number changes (C class). This distinction is clearest at the extremes of genomic instability, indicating the presence of different oncogenic processes. The full hierarchy shows functional event patterns characteristic of multiple cross-tissue groups of tumors, termed oncogenic signature classes. Targetable functional events in a tumor class are suggestive of class-specific combination therapy. These results may assist in the definition of clinical trials to match actionable oncogenic signatures with personalized therapies.