
    Behavioral Communities and the Atomic Structure of Networks

    We develop a theory of 'behavioral communities' and the 'atomic structure' of networks. We define atoms to be groups of agents whose behaviors always match each other in a set of coordination games played on the network. This provides a microfoundation for a method of detecting communities in social and economic networks. We provide theoretical results characterizing such behavior-based communities and atomic structures, and we discuss their properties in large random networks. We also provide an algorithm for identifying behavioral communities. We discuss applications, including a method of estimating underlying preferences by observing behavioral conventions in data, and optimal seeding of diffusion processes when there are peer interactions and homophily. We illustrate the techniques with applications to high school friendship networks and rural village networks.
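
    The atom definition above suggests a simple simulation-based approximation. The sketch below is not the paper's algorithm: it samples binary threshold coordination games (the threshold q, the number of sampled games, and the seeding fraction are all assumptions) and groups nodes whose equilibrium actions coincide in every sampled game.

    ```python
    import random
    import networkx as nx

    def best_response_fixed_point(G, seeds, q=0.5, max_iter=100):
        """Iterate myopic best responses of a binary coordination game:
        a node plays 1 iff at least a fraction q of its neighbors play 1."""
        action = {v: (1 if v in seeds else 0) for v in G}
        for _ in range(max_iter):
            new = {v: 1 if sum(action[u] for u in G[v]) >= q * max(G.degree(v), 1)
                   else 0 for v in G}
            if new == action:
                break
            action = new
        return action

    def atoms(G, q=0.5, n_games=200, seed_frac=0.3, rng=random.Random(0)):
        """Group agents whose equilibrium behavior matches in every sampled game
        (a Monte Carlo stand-in for the paper's 'set of coordination games')."""
        profiles = {v: [] for v in G}
        for _ in range(n_games):
            seeds = set(rng.sample(list(G), max(1, int(seed_frac * len(G)))))
            eq = best_response_fixed_point(G, seeds, q)
            for v in G:
                profiles[v].append(eq[v])
        groups = {}
        for v, prof in profiles.items():
            groups.setdefault(tuple(prof), []).append(v)
        return list(groups.values())

    G = nx.karate_club_graph()
    print(atoms(G)[:3])  # a few approximate atoms of the karate club network
    ```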

    Benchmarking seeding strategies for spreading processes in social networks: an interplay between influencers, topologies and sizes

    The explosion of network science has permitted an understanding of how the structure of social networks affects the dynamics of social contagion. In community-based interventions with spill-over effects, identifying influential spreaders can be harnessed to increase the spreading efficiency of social contagion, measured as the time needed to spread through the entire largest connected component of the network. Several strategies have been shown to be efficient using data- and simulation-based models on specific network topologies, but no consensus on an overall result has emerged. Hence, the purpose of this paper is to benchmark the spreading efficiency of seeding strategies in relation to network structural properties and sizes. We simulate spreading processes on empirical and simulated social networks within a wide range of densities, clustering coefficients, and sizes. We also propose three new decentralized seeding strategies that are structurally different from well-known strategies: community hubs, ambassadors, and random hubs. We observe that the efficiency ranking of strategies varies with the network structure. In general, for sparse networks with community structure, decentralized influencers are suitable for increasing the spreading efficiency. By contrast, when the networks are denser, centralized influencers outperform them. These results provide a framework for selecting efficient strategies according to the different contexts in which social networks emerge.
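
    As an illustration of this kind of benchmark, the sketch below implements one plausible reading of the 'random hubs' strategy (uniform picks among top-decile-degree nodes; the decile cutoff, the discrete-time SI spreading model, and the transmission probability are all assumptions, not the paper's protocol) and measures the time a cascade needs to cover the largest connected component.

    ```python
    import random
    import networkx as nx

    def random_hubs(G, k, rng):
        """Hypothetical 'random hubs': sample k seeds uniformly among nodes
        whose degree is in the top decile."""
        deg = dict(G.degree())
        cutoff = sorted(deg.values())[int(0.9 * len(deg))]
        hubs = [v for v, d in deg.items() if d >= cutoff]
        return rng.sample(hubs, min(k, len(hubs)))

    def time_to_cover_lcc(G, seeds, p=0.2, rng=random.Random(0), max_steps=1000):
        """Steps until an SI cascade (transmission probability p per edge per
        step) reaches every node of the largest connected component."""
        lcc = max(nx.connected_components(G), key=len)
        infected = set(seeds) & lcc
        for t in range(1, max_steps + 1):
            new = {u for v in infected for u in G[v]
                   if u in lcc and u not in infected and rng.random() < p}
            infected |= new
            if infected == lcc:
                return t
        return max_steps  # did not cover the LCC within the step budget

    rng = random.Random(1)
    G = nx.watts_strogatz_graph(500, 6, 0.1)
    seeds = random_hubs(G, k=5, rng=rng)
    print(time_to_cover_lcc(G, seeds, rng=rng))
    ```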

    A new overlapping community detection algorithm based on similarity of neighbors in complex networks

    Community detection algorithms help us improve the management of complex networks and provide a clear view of them. We encounter complex networks in various fields such as social media, bioinformatics, recommendation systems, and search engines. As the definition of a community changes with the problem considered, no algorithm works universally for all kinds of data and network structures. Communities can be disjoint, such that each member is in at most one community, or overlapping, such that a member may belong to more than one community. In this study, we examine the problem of finding overlapping communities in complex networks and propose a new algorithm based on the similarity of neighbors. The algorithm runs in O(m log m) time on a network containing m relationships. To compare our algorithm with existing ones, we select the four most successful algorithms from the Community Detection library (CDlib), eliminating algorithms that require prior knowledge, are unstable, or are time-consuming. We evaluate the proposed algorithm and the selected algorithms using known metrics such as modularity, F-score, and Normalized Mutual Information. In addition, we adapt the coverage metric defined for disjoint communities to overlapping communities and make comparisons with this metric as well. We also test all of the algorithms on small graphs of real communities. The experimental results show that the proposed algorithm is successful in finding overlapping communities.
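
    To make the flavor of neighbor-similarity methods concrete, here is a deliberately simplified sketch (not the paper's algorithm): it scores each edge by the Jaccard similarity of its endpoints' neighborhoods, sorts the edges (the O(m log m) step), and grows communities from the most similar edges, letting a node join several communities. The similarity threshold is an assumption.

    ```python
    import networkx as nx

    def jaccard(G, u, v):
        """Jaccard similarity of the closed neighborhoods of u and v."""
        Nu, Nv = set(G[u]) | {u}, set(G[v]) | {v}
        return len(Nu & Nv) / len(Nu | Nv)

    def overlapping_communities(G, threshold=0.3):
        # Sorting the m edges by similarity is the O(m log m) step.
        edges = sorted(G.edges(), key=lambda e: jaccard(G, *e), reverse=True)
        communities = []
        for u, v in edges:
            if jaccard(G, u, v) < threshold:
                break
            placed = False
            for com in communities:
                if u in com or v in com:
                    com.update((u, v))  # a node may end up in several communities
                    placed = True
            if not placed:
                communities.append({u, v})
        return communities

    G = nx.karate_club_graph()
    for com in overlapping_communities(G):
        print(sorted(com))
    ```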

    Deep Learning and Hybrid Approach for Particle Detection in Defocusing Particle Tracking Velocimetry

    The present work aims at improving particle detection in defocusing particle tracking velocimetry (DPTV) by means of a novel hybrid approach. Two deep learning approaches, namely Faster R-CNN and RetinaNet, are compared to the performance of two benchmark conventional image processing algorithms for DPTV. For the development of a hybrid approach with improved performance, the different detection approaches are evaluated on synthetic images and on images from an actual DPTV experiment. First, the performance under the influence of noise, overlaps, seeding density and optical aberrations is discussed, and from this the advantages of neural networks over conventional image processing algorithms for DPTV are derived. Furthermore, current limitations of the application of neural networks to DPTV are pointed out and their origin is elaborated. It is shown that neural networks have a better detection capability but suffer from low positional accuracy when locating particles. Finally, a novel hybrid approach is proposed, which uses a neural network for particle detection and passes the predictions on to a conventional refinement algorithm for better position accuracy. A third step additionally eliminates false predictions by the network based on a subsequent rejection criterion. The novel approach retains the powerful detection performance of neural networks while maintaining the high position accuracy of conventional algorithms, combining the advantages of both.
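
    The division of labor described above can be sketched in a few lines. Everything below is an assumption-laden illustration, not the paper's pipeline: the detector is a placeholder that supplies bounding boxes, the refinement is an intensity-weighted centroid, and the rejection criterion is a simple signal-to-background test.

    ```python
    import numpy as np

    def refine_and_filter(image, boxes, snr_min=2.0):
        """boxes: (x0, y0, x1, y1) pixel windows proposed by a detector
        (stand-in for the network's predictions)."""
        keep = []
        for x0, y0, x1, y1 in boxes:
            win = image[y0:y1, x0:x1].astype(float)
            bg = np.median(win)                 # local background estimate
            sig = np.clip(win - bg, 0.0, None)
            # Hypothetical rejection criterion, not the paper's:
            if sig.sum() == 0 or sig.max() < snr_min * (win.std() + 1e-9):
                continue
            ys, xs = np.mgrid[y0:y1, x0:x1]
            cx = (sig * xs).sum() / sig.sum()   # intensity-weighted centroid,
            cy = (sig * ys).sum() / sig.sum()   # subpixel position
            keep.append((cx, cy))
        return keep

    img = np.zeros((64, 64))
    img[30:34, 40:44] = 10.0  # synthetic particle image
    print(refine_and_filter(img, [(36, 26, 48, 38)]))  # ~(41.5, 31.5)
    ```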

    Benchmarking network propagation methods for disease gene identification

    In-silico identification of potential target genes for disease is an essential aspect of drug target discovery. Recent studies suggest that successful targets can be found by leveraging genetic, genomic and protein interaction information. Here, we systematically tested the ability of 12 varied algorithms, based on network propagation, to identify genes that have been targeted by any drug, on gene-disease data from 22 common non-cancerous diseases in OpenTargets. We considered two biological networks and six performance metrics, and compared two types of input gene-disease association scores. The impact of the design factors on performance was quantified through additive explanatory models. Standard cross-validation led to over-optimistic performance estimates due to the presence of protein complexes. In order to obtain realistic estimates, we introduced two novel protein complex-aware cross-validation schemes. When seeding biological networks with known drug targets, machine learning and diffusion-based methods found around 2-4 true targets within the top 20 suggestions. Seeding the networks with genes associated to disease by genetics decreased performance to below 1 true hit on average. The use of a larger network, although noisier, improved overall performance. We conclude that diffusion-based prioritisers and machine learning applied to diffusion-based features are suited for drug discovery in practice and improve over simpler neighbour-voting methods. We also demonstrate the large impact of choosing an adequate validation strategy and of the definition of seed disease genes.
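
    A minimal sketch of a diffusion-based prioritiser in this spirit, assuming personalized PageRank as the propagation method and a toy interaction graph with placeholder gene names (none of this is the paper's benchmark data):

    ```python
    import networkx as nx

    def prioritise(G, seed_genes, top_k=20):
        """Diffuse from known drug targets and rank the remaining genes."""
        personalization = {g: (1.0 if g in seed_genes else 0.0) for g in G}
        scores = nx.pagerank(G, alpha=0.85, personalization=personalization)
        candidates = [(g, s) for g, s in scores.items() if g not in seed_genes]
        return sorted(candidates, key=lambda x: -x[1])[:top_k]

    # Toy protein-interaction graph; gene symbols are illustrative only.
    G = nx.Graph([("TP53", "MDM2"), ("MDM2", "CDKN1A"), ("TP53", "EP300"),
                  ("EP300", "CREBBP"), ("CDKN1A", "CCND1")])
    print(prioritise(G, seed_genes={"TP53"}, top_k=3))
    ```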

    GASP: Geometric Association with Surface Patches

    A fundamental challenge for sensory processing tasks in perception and robotics is the problem of obtaining data associations across views. We present a robust solution for ascertaining potentially dense surface patch (superpixel) associations that requires just range information. Our approach involves decomposition of a view into regularized surface patches. We represent them as sequences that express geometry invariantly over their superpixel neighborhoods, as uniquely consistent partial orderings. We match these representations through an optimal sequence comparison metric based on the Damerau-Levenshtein distance, enabling robust association with quadratic complexity (in contrast to the hitherto employed joint matching formulations, which are NP-complete). The approach is able to perform under wide baselines, heavy rotations, partial overlaps, significant occlusions and sensor noise. The technique does not require any priors, motion or otherwise, and does not make restrictive assumptions about scene structure or sensor movement. It does not require appearance and is hence more widely applicable than appearance-reliant methods, as well as invulnerable to related ambiguities such as textureless or aliased content. We present promising qualitative and quantitative results under diverse settings, along with comparisons to popular approaches based on range as well as RGB-D data.
    Comment: International Conference on 3D Vision, 201
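
    For reference, the sequence metric named above builds on the Damerau-Levenshtein distance; the sketch below is the standard optimal-string-alignment variant applied to two toy descriptor sequences (the string labels stand in for the paper's geometric descriptors).

    ```python
    def damerau_levenshtein(a, b):
        """Edit distance allowing insertions, deletions, substitutions,
        and adjacent transpositions (optimal string alignment variant)."""
        d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
        for i in range(len(a) + 1):
            d[i][0] = i
        for j in range(len(b) + 1):
            d[0][j] = j
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                cost = 0 if a[i - 1] == b[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
                if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                    d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
        return d[len(a)][len(b)]

    # Placeholder patch-descriptor sequences from two hypothetical views:
    seq_view1 = ["planar", "convex", "ridge", "planar"]
    seq_view2 = ["convex", "planar", "ridge", "planar"]
    print(damerau_levenshtein(seq_view1, seq_view2))  # 1: one transposition
    ```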

    Nucleus segmentation: towards automated solutions

    Single nucleus segmentation is a frequent challenge in microscopy image processing, since it is the first step of many quantitative data analysis pipelines. The quality of tracking single cells, extracting features or classifying cellular phenotypes strongly depends on segmentation accuracy. Worldwide competitions have been held with the aim of improving segmentation, and recent years have brought significant improvements: large annotated datasets are now freely available, several 2D segmentation strategies have been extended to 3D, and deep learning approaches have increased accuracy. However, even today, no generally accepted solution or benchmarking platform exists. We review the most recent single-cell segmentation tools, and provide an interactive method browser to select the most appropriate solution.
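
    For orientation, below is the kind of classical baseline these newer tools are typically compared against, a hedged sketch rather than any specific reviewed method: Otsu thresholding, a distance transform, and marker-based watershed to split touching nuclei.

    ```python
    import numpy as np
    from scipy import ndimage as ndi
    from skimage.filters import threshold_otsu
    from skimage.feature import peak_local_max
    from skimage.segmentation import watershed

    def segment_nuclei(image):
        """Classical pipeline: threshold, distance transform, watershed."""
        mask = image > threshold_otsu(image)
        distance = ndi.distance_transform_edt(mask)
        coords = peak_local_max(distance, min_distance=5,
                                labels=ndi.label(mask)[0])
        markers = np.zeros(image.shape, dtype=int)
        markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
        return watershed(-distance, markers, mask=mask)

    # Two touching synthetic "nuclei" as a smoke test; expect 2 labels.
    yy, xx = np.mgrid[:60, :60]
    img = np.exp(-((yy - 25) ** 2 + (xx - 25) ** 2) / 50.0)
    img += np.exp(-((yy - 35) ** 2 + (xx - 38) ** 2) / 50.0)
    print(segment_nuclei(img).max(), "nuclei found")
    ```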

    Data Mining Algorithms for Internet Data: from Transport to Application Layer

    Nowadays we live in a data-driven world. Advances in data generation, collection and storage technology have enabled organizations to gather data sets of massive size. Data mining is a discipline that blends traditional data analysis methods with sophisticated algorithms to handle the challenges posed by these new types of data sets. The Internet is a complex and dynamic system in which new protocols and applications arise at a constant pace. All these characteristics make the Internet a valuable and challenging data source and application domain for research activity, both at the Transport layer, analyzing network traffic flows, and up at the Application layer, focusing on the ever-growing next generation of web services: blogs, micro-blogs, on-line social networks, photo sharing services and many other applications (e.g., Twitter, Facebook, Flickr, etc.). In this thesis we focus on the study, design and development of novel algorithms and frameworks to support large-scale data mining activities over huge and heterogeneous data volumes, with a particular focus on Internet data as the data source, targeting network traffic classification, on-line social network analysis, recommendation systems, cloud services and Big Data.