
    Regional surname affinity: a spatial network approach

    OBJECTIVE: We investigate surname affinities among areas of modern‐day China by constructing a spatial network and performing community detection. The study reports a geographical genealogy of the Chinese population that results from population origins, historical migrations, and societal evolution. MATERIALS AND METHODS: We acquire data from the census records supplied by China's National Citizen Identity Information System, including the surname and regional information of 1.28 billion registered Chinese citizens. We propose a multilayer minimum spanning tree (MMST) to construct a spatial network based on the matrix of isonymic distances, which is often used to characterize the dissimilarity of surname structure among areas. We use the fast unfolding algorithm to detect network communities. RESULTS: We obtain a 10‐layer MMST network of 362 prefecture nodes and 3,610 edges derived from the matrix of the Euclidean distances among these areas. Community detection divides these prefectures into eight groups in the spatial network. We evaluate the partition by comparing the inter‐community and intra‐community distances and obtain a meaningful regional ethnicity classification. DISCUSSION: Visualizing the resulting communities on the map indicates that prefectures in the same community are usually geographically adjacent. The formation of this partition is influenced by geographical factors, historic migrations, trade and economic factors, as well as isolation of culture and language. The MMST algorithm proves to be effective in geo‐genealogy and ethnicity classification because it retains essential information about surname affinity and highlights the geographical consanguinity of the population.
    Funding: National Natural Science Foundation of China, Grant/Award Numbers: 61773069, 71731002; National Social Science Foundation of China, Grant/Award Number: 14BSH024; Foundation of China of China Scholarships Council, Grant/Award Numbers: 201606045048, 201706040188, 201706040015; DOE, Grant/Award Number: DE-AC07-05Id14517; DTRA, Grant/Award Number: HDTRA1-14-1-0017; NSF, Grant/Award Numbers: CHE-1213217, CMMI-1125290, PHY-1505000.
    Published version.
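The abstract above does not spell out how the multilayer minimum spanning tree is built. One plausible reading, sketched here as an assumption rather than as the authors' procedure, is to compute a minimum spanning tree, remove its edges from the candidate set, and repeat for each layer, taking the union of all layers as the network. This reading is at least consistent with the reported size: a spanning tree on 362 nodes has 361 edges, and 10 layers give 3,610. The function names and the toy coordinates below are hypothetical.

```python
from itertools import combinations
from math import dist


def spanning_tree_layer(n, distance, excluded):
    """Kruskal's algorithm over all node pairs that are not in `excluded`.

    Returns a minimum spanning tree (or a spanning forest, if the
    remaining pairs no longer connect every node) as a list of (u, v).
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    candidates = sorted(
        (pair for pair in combinations(range(n), 2) if pair not in excluded),
        key=lambda pair: distance[pair[0]][pair[1]],
    )
    tree = []
    for u, v in candidates:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


def multilayer_mst(distance, layers):
    """Union of `layers` successive edge-disjoint spanning trees (assumed reading)."""
    n = len(distance)
    used = set()
    for _ in range(layers):
        used.update(spanning_tree_layer(n, distance, used))
    return used


# Toy example: six hypothetical "areas" placed on a line.
points = [(0, 0), (1, 0), (3, 0), (7, 0), (15, 0), (31, 0)]
distance = [[dist(p, q) for q in points] for p in points]
network = multilayer_mst(distance, layers=2)
print(len(network), "edges in the 2-layer network")
```

In the study the distance matrix would hold isonymic distances between prefectures, and the resulting edge set would then be passed to a community detection routine such as the fast unfolding (Louvain) method.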

    Communities in Networks

    We survey some of the concepts, methods, and applications of community detection, which has become an increasingly important area of network science. To help ease newcomers into the field, we provide a guide to available methodology and open problems, and discuss why scientists from diverse backgrounds are interested in these problems. As a running theme, we emphasize the connections of community detection to problems in statistical physics and computational optimization. Comment: survey/review article on community structure in networks; published version is available at http://people.maths.ox.ac.uk/~porterm/papers/comnotices.pd

    Connecting Dream Networks Across Cultures

    Many species dream, yet many research questions in the study of dreams remain open. The symbolism of dreams and its interpretation are present in cultures throughout history. Analyzing online data sources for dream interpretation with network science helps in understanding the symbols in dreams and their associated meanings. In this study, we introduce dream interpretation networks for English, Chinese and Arabic that represent different cultures from various parts of the world. We analyze communities in these networks, finding that symbols within a community are semantically related. The central nodes in communities give insight into cultures and symbols in dreams. The community structure of different networks highlights cultural similarities and differences. Interconnections between different networks are also identified by translating symbols from different languages into English. Structural correlations across networks point to relationships between cultures. Similarities between network communities are also investigated by analyzing the sentiment of symbol interpretations. We find that interpretations within a community tend to have similar sentiment. Furthermore, we cluster communities based on their sentiment, yielding three main categories of positive, negative, and neutral dream symbols. Comment: 6 pages, 3 figures

    Eigenvector localization as a tool to study small communities in online social networks

    We present and discuss a mathematical procedure for identifying small "communities" or segments within large bipartite networks. The procedure is based on spectral analysis of the matrix encoding the network structure. The principal tool is the localization of eigenvectors of this matrix, by means of which the relevant network segments become visible. We illustrate our approach by analyzing data on product reviewing on Amazon.com. We found several segments, hybrid communities of densely interlinked reviewers and products, which we were able to interpret meaningfully in terms of the type and thematic categorization of the reviewed items. The method complements other approaches to community detection, which typically aim at identifying large network modules.
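The abstract above does not say how localization is quantified. A common measure for this purpose, used here purely as an assumption and not as the authors' method, is the inverse participation ratio (IPR) of a vector: it equals 1 when all weight sits on a single component and 1/n when the weight is spread evenly over n components. The sketch takes an eigenvector as given and does not compute the spectrum itself.

```python
def inverse_participation_ratio(vector):
    """IPR = sum(x^4) / (sum(x^2))^2; larger values mean stronger localization."""
    norm_sq = sum(x * x for x in vector)
    if norm_sq == 0:
        raise ValueError("vector must be nonzero")
    return sum(x ** 4 for x in vector) / norm_sq ** 2


# A vector concentrated on one node versus one spread evenly over four nodes.
print(inverse_participation_ratio([1, 0, 0, 0]))  # 1.0
print(inverse_participation_ratio([1, 1, 1, 1]))  # 0.25
```

Eigenvectors with a high IPR would be the candidates whose large components mark a small segment of reviewers and products.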

    Community Detection on Evolving Graphs

    Clustering is a fundamental step in many information-retrieval and data-mining applications. Detecting clusters in graphs is also a key tool for finding the community structure in social and behavioral networks. In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph. Furthermore, there are often limitations on the frequency of such probes, either imposed explicitly by the online platform (e.g., in the case of crawling proprietary social networks like Twitter) or implicitly because of resource limitations (e.g., in the case of crawling the web). In this paper, we study a model of clustering on evolving graphs that captures this aspect of the problem. Our model is based on the classical stochastic block model, which has been used to assess rigorously the quality of various static clustering methods. In our model, the algorithm is supposed to reconstruct the planted clustering, given the ability to query for small pieces of local information about the graph at a limited rate. We design and analyze clustering algorithms that work in this model, and show asymptotically tight upper and lower bounds on their accuracy. Finally, we perform simulations, which demonstrate that our main asymptotic results also hold in practice.
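For readers unfamiliar with the classical stochastic block model mentioned above, the following minimal sketch samples a static graph with a planted clustering: each pair of nodes is linked with probability p_in if the two nodes share a cluster and p_out otherwise. It covers only the static model, not the evolving or rate-limited query setting studied in the paper, and the function name and parameter values are hypothetical.

```python
import random
from itertools import combinations


def sample_sbm(labels, p_in, p_out, rng):
    """Sample an undirected graph from a stochastic block model.

    labels[i] is the planted cluster of node i. Each pair (u, v) with u < v
    becomes an edge with probability p_in (same cluster) or p_out (different).
    """
    edges = set()
    for u, v in combinations(range(len(labels)), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.add((u, v))
    return edges


# Two planted clusters of ten nodes each; a seeded generator for repeatability.
planted = [0] * 10 + [1] * 10
graph = sample_sbm(planted, p_in=0.8, p_out=0.05, rng=random.Random(0))
within = sum(planted[u] == planted[v] for u, v in graph)
print(f"{len(graph)} edges, {within} of them inside a planted cluster")
```

A clustering algorithm is then judged by how well it recovers `planted` from the sampled edges alone.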