40 research outputs found

    Comparing Community Structure to Characteristics in Online Collegiate Social Networks

    We study the structure of social networks of students by examining the graphs of Facebook "friendships" at five American universities at a single point in time. We investigate each single-institution network's community structure and employ graphical and quantitative tools, including standardized pair-counting methods, to measure the correlations between the network communities and a set of self-identified user characteristics (residence, class year, major, and high school). We review the basic properties and statistics of the pair-counting indices employed and recall, in simplified notation, a useful analytical formula for the z-score of the Rand coefficient. Our study illustrates how to examine different instances of social networks constructed in similar environments, emphasizes the array of social forces that combine to form "communities," and leads to comparative observations about online social lives that can be used to infer comparisons about offline social structures. In our illustration of this methodology, we calculate the relative contributions of different characteristics to the community structure of individual universities and subsequently compare these relative contributions at different universities, measuring for example the importance of common high school affiliation to large state universities and the varying degrees of influence common major can have on the social structure at different universities. The heterogeneity of communities that we observe indicates that these networks typically have multiple organizing factors rather than a single dominant one. Comment: Version 3 (17 pages, 5 multi-part figures), accepted in SIAM Review.
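The pair-counting idea mentioned in this abstract can be illustrated with a minimal sketch. It computes only the raw Rand coefficient between two labelings of the same nodes, not the z-score formula the paper recalls. The toy labels (`communities`, `dorms`) and the function names are assumptions made for illustration and do not come from the paper.

```python
from itertools import combinations


def pair_counts(labels_a, labels_b):
    """Classify every node pair by whether each labeling groups the pair together."""
    both = only_a = only_b = neither = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


def rand_coefficient(labels_a, labels_b):
    """Fraction of node pairs on which the two labelings agree."""
    both, only_a, only_b, neither = pair_counts(labels_a, labels_b)
    return (both + neither) / (both + only_a + only_b + neither)


# Hypothetical data: detected communities vs. a self-reported characteristic.
communities = [0, 0, 0, 1, 1, 1]
dorms = ["north", "north", "south", "south", "south", "south"]
print(rand_coefficient(communities, dorms))
```

A raw value like this is hard to interpret on its own, which is presumably why the abstract emphasizes standardized (z-score) versions of such indices.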

    Community Structure in Congressional Cosponsorship Networks

    We study the United States Congress by constructing networks between Members of Congress based on the legislation that they cosponsor. Using the concept of modularity, we identify the community structure of Congressmen, as connected via sponsorship/cosponsorship of the same legislation, to investigate the collaborative communities of legislators in both chambers of Congress. This analysis yields an explicit and conceptually clear measure of political polarization, demonstrating a sharp increase in partisan polarization which preceded and then culminated in the 104th Congress (1995-1996), when Republicans took control of both chambers. Although polarization has since waned in the U.S. Senate, it remains at historically high levels in the House of Representatives. Comment: 8 pages, 4 figures (some with multiple parts), to appear in Physica A; additional background info and explanations added from last version.
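As a rough illustration of the modularity concept named in this abstract, the sketch below evaluates the standard Newman-Girvan modularity of a given partition on an unweighted, undirected toy graph. The graph, the node-to-group assignment, and the function name are hypothetical; the paper's cosponsorship networks and its procedure for finding communities are not reproduced here.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q = sum over groups of (internal_edges / m) - (degree_sum / 2m)^2."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the group
    degree_sum = defaultdict(int)  # total degree of the group's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Hypothetical graph: two triangles joined by a single bridging edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_blocs = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_bloc = {node: "A" for node in range(6)}

print(modularity(edges, two_blocs))
print(modularity(edges, one_bloc))
```

The two-group split scores higher than lumping all nodes together, which matches the intuition that a high modularity reflects dense ties within groups and sparse ties between them.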

    A network-specific approach to percolation in networks with bidirectional links

    Methods for determining the percolation threshold usually study the behavior of network ensembles and are often restricted to a particular type of probabilistic node/link removal strategy. We propose a network-specific method to determine the connectivity of nodes below the percolation threshold and offer an estimate of the percolation threshold in networks with bidirectional links. Our analysis does not require the assumption that a network belongs to a specific ensemble and can at the same time easily handle arbitrary removal strategies (previously an open problem for undirected networks). In validating our analysis, we find that it predicts the effects of many known complex structures (e.g., degree correlations) and may be used to study both probabilistic and deterministic attacks. Comment: 6 pages, 8 figures.

    Influence of wiring cost on the large-scale architecture of human cortical connectivity

    In the past two decades some fundamental properties of cortical connectivity have been discovered: small-world structure, pronounced hierarchical and modular organisation, and strong core and rich-club structures. A common assumption when interpreting results of this kind is that the observed structural properties are present to enable the brain's function. However, the brain is also embedded into the limited space of the skull and its wiring has associated developmental and metabolic costs. These basic physical and economic aspects place separate, often conflicting, constraints on the brain's connectivity, which must be characterized in order to understand the true relationship between brain structure and function. To address this challenge, here we ask which, and to what extent, aspects of the structural organisation of the brain are conserved if we preserve specific spatial and topological properties of the brain but otherwise randomise its connectivity. We perform a comparative analysis of a connectivity map of the cortical connectome both on high- and low-resolutions utilising three different types of surrogate networks: spatially unconstrained (‘random’), connection length preserving (‘spatial’), and connection length optimised (‘reduced’) surrogates. We find that unconstrained randomisation markedly diminishes all investigated architectural properties of cortical connectivity. By contrast, spatial and reduced surrogates largely preserve most properties and, interestingly, often more so in the reduced surrogates. Specifically, our results suggest that the cortical network is less tightly integrated than its spatial constraints would allow, but more strongly segregated than its spatial constraints would necessitate. We additionally find that hierarchical organisation and rich-club structure of the cortical connectivity are largely preserved in spatial and reduced surrogates and hence may be partially attributable to cortical wiring constraints. In contrast, the high modularity and strong s-core of the high-resolution cortical network are significantly stronger than in the surrogates, underlining their potential functional relevance in the brain.

    Dynamics and Control of Diseases in Networks with Community Structure

    The dynamics of infectious diseases spread via direct person-to-person transmission (such as influenza, smallpox, and HIV/AIDS) depends on the underlying host contact network. Human contact networks exhibit strong community structure. Understanding how such community structure affects epidemics may provide insights for preventing the spread of disease between communities by changing the structure of the contact network through pharmaceutical or non-pharmaceutical interventions. We use empirical and simulated networks to investigate the spread of disease in networks with community structure. We find that community structure has a major impact on disease dynamics, and we show that in networks with strong community structure, immunization interventions targeted at individuals bridging communities are more effective than those simply targeting highly connected individuals. Because the structure of relevant contact networks is generally not known, and vaccine supply is often limited, there is great need for efficient vaccination algorithms that do not require full knowledge of the network. We developed an algorithm that acts only on locally available network information and is able to quickly identify targets for successful immunization intervention. The algorithm generally outperforms existing algorithms when vaccine supply is limited, particularly in networks with strong community structure. Understanding the spread of infectious diseases and designing optimal control strategies is a major goal of public health. Social networks show marked patterns of community structure, and our results, based on empirical and simulated data, demonstrate that community structure strongly affects disease dynamics. These results have implications for the design of control strategies.

    Unsupervised record matching with noisy and incomplete data

    We consider the problem of duplicate detection in noisy and incomplete data: given a large data set in which each record has multiple entries (attributes), detect which distinct records refer to the same real-world entity. This task is complicated by noise (such as misspellings) and missing data, which can lead to records being different, despite referring to the same entity. Our method consists of three main steps: creating a similarity score between records, grouping records together into "unique entities", and refining the groups. We compare various methods for creating similarity scores between noisy records, considering different combinations of string matching, term frequency-inverse document frequency methods, and n-gram techniques. In particular, we introduce a vectorized soft term frequency-inverse document frequency method, with an optional refinement step. We also discuss two methods to deal with missing data in computing similarity scores. We test our method on the Los Angeles Police Department Field Interview Card data set, the Cora Citation Matching data set, and two sets of restaurant review data. The results show that the methods that use words as the basic units are preferable to those that use 3-grams. Moreover, in some (but certainly not all) parameter ranges soft term frequency-inverse document frequency methods can outperform the standard term frequency-inverse document frequency method. The results also confirm that our method for automatically determining the number of groups works well in many cases and allows for accurate results in the absence of a priori knowledge of the number of unique entities in the data set.
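The similarity-score step described in this abstract can be sketched with standard word-level TF-IDF and cosine similarity. This is the baseline method, not the vectorized soft TF-IDF variant the paper introduces. The restaurant names are made up for illustration, and the function names and the plain `log(n / df)` weighting are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(records):
    """Build one sparse word-level TF-IDF vector (a dict) per record."""
    tokenized = [record.lower().split() for record in records]
    n = len(tokenized)
    # Document frequency: number of records that contain each word.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    return [
        {tok: count * math.log(n / doc_freq[tok]) for tok, count in Counter(toks).items()}
        for toks in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(tok, 0.0) for tok, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical records; the second is a misspelled duplicate of the first.
records = ["golden dragon restaurant", "golden dragon resturant", "blue moon cafe"]
vecs = tfidf_vectors(records)
print(cosine(vecs[0], vecs[1]))  # likely duplicates
print(cosine(vecs[0], vecs[2]))  # unrelated records
```

Because exact word matching treats the misspelled word as entirely different, the duplicate pair scores well below 1. This is the kind of noise that soft matching of similar strings is meant to tolerate.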

    Understanding the Behavior of Filipino Twitter Users during Disaster
