
    Seeding for pervasively overlapping communities

    In some social and biological networks, the majority of nodes belong to multiple communities. It has recently been shown that a number of the algorithms designed to detect overlapping communities do not perform well in such highly overlapping settings. Here, we consider one class of these algorithms: those which optimize a local fitness measure, typically by using a greedy heuristic to expand a seed into a community. We perform synthetic benchmarks which indicate that an appropriate seeding strategy becomes increasingly important as the extent of community overlap increases. We find that distinct cliques provide the best seeds. We find further support for this seeding strategy with benchmarks on a Facebook network and the yeast interactome. Comment: 8 pages
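The greedy seed expansion described in this abstract can be illustrated with a minimal sketch. The abstract does not name a specific fitness measure, so the form used here, k_in / (k_in + k_out)^alpha, is an assumption (it is one commonly used local fitness). The function names `fitness` and `expand_seed` and the toy graph `adj` are hypothetical, chosen only for illustration.

```python
def fitness(adj, community, alpha=1.0):
    """Assumed local fitness: k_in / (k_in + k_out) ** alpha."""
    k_in = 0   # internal edge endpoints (each internal edge is counted twice)
    k_out = 0  # edges that leave the community
    for u in community:
        for v in adj[u]:
            if v in community:
                k_in += 1
            else:
                k_out += 1
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def expand_seed(adj, seed, alpha=1.0):
    """Greedily add the neighbouring node that most increases the fitness."""
    community = set(seed)
    while True:
        # Candidate nodes: neighbours of the community that are not yet members.
        frontier = {v for u in community for v in adj[u]} - community
        current = fitness(adj, community, alpha)
        best_node, best_gain = None, 0.0
        for v in sorted(frontier):
            gain = fitness(adj, community | {v}, alpha) - current
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:
            # No neighbour improves the fitness, so the expansion stops.
            return community
        community.add(best_node)


# Toy graph (illustrative only): two triangles joined by the edge 2-3.
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}

# A clique seed, in the spirit of the seeding strategy the abstract favours.
community = expand_seed(adj, {0, 1, 2})
```

On this toy graph the triangle seed is already a local optimum, since adding node 3 would lower the assumed fitness. Real benchmarks would use far larger graphs with heavily overlapping communities.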

    Network depth: identifying median and contours in complex networks

    Centrality descriptors are widely used to rank nodes according to specific concept(s) of importance. Despite the large number of centrality measures now available, it is still poorly understood how to identify the node which can be considered the 'centre' of a complex network. In fact, this problem corresponds to finding the median of a complex network. The median is a non-parametric and robust estimator of the location parameter of a probability distribution. In this work, we present the most natural generalisation of the concept of median to the realm of complex networks, discussing its advantages for defining the centre of the system and percentiles around that centre. To this aim, we introduce a new statistical data depth and we apply it to networks embedded in a geometric space induced by different metrics. The application of our framework to empirical networks allows us to identify median nodes which are socially or biologically relevant.

    Toward Entity-Aware Search

    As the Web has evolved into a data-rich repository, current search engines, with their standard "page view," are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In this Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results obtained so far show the clear promise of entity-aware search in its usefulness, effectiveness, efficiency, and scalability.