Regional surname affinity: a spatial network approach
OBJECTIVE
We investigate surname affinities among areas of modern‐day China by constructing a spatial network and detecting its communities. The result is a geographical genealogy of the Chinese population that reflects population origins, historical migrations, and societal evolution.
MATERIALS AND METHODS
We acquire data from the census records supplied by China's National Citizen Identity Information System, including the surname and regional information of 1.28 billion registered Chinese citizens. We propose a multilayer minimum spanning tree (MMST) to construct a spatial network based on the matrix of isonymic distances, which is often used to characterize the dissimilarity of surname structure among areas. We use the fast unfolding algorithm to detect network communities.
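The abstract does not spell out how the MMST is built. One reading, offered here only as an assumption, is to extract a minimum spanning tree from the complete distance graph, remove its edges, and repeat on what is left, keeping the union of all layers. Under that reading, 10 layers over 362 nodes would give 10 × 361 = 3,610 edges, which matches the edge count reported in the results. The function name `multilayer_mst` and its parameters are hypothetical.

```python
from itertools import combinations


def multilayer_mst(dist, layers):
    """Return a list of edge lists, one per layer.

    dist   -- symmetric matrix (list of lists) of pairwise distances
    layers -- number of successive spanning trees to extract

    Assumed construction: each layer is a minimum spanning tree (or
    forest, if the leftover graph is disconnected) of the edges not
    used by any earlier layer.
    """
    n = len(dist)
    # All candidate edges, sorted once by distance (Kruskal order).
    remaining = sorted((dist[i][j], i, j) for i, j in combinations(range(n), 2))
    result = []
    for _ in range(layers):
        parent = list(range(n))  # fresh union-find for this layer

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        layer, leftover = [], []
        for w, i, j in remaining:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                layer.append((i, j, w))
            else:
                # Edge would close a cycle; keep it for later layers.
                leftover.append((w, i, j))
        result.append(layer)
        remaining = leftover  # still sorted, since order is preserved
    return result


# Toy input: eight points on a line, distance = absolute difference.
points = list(range(8))
dist = [[abs(a - b) for b in points] for a in points]
mmst = multilayer_mst(dist, 2)
network_edges = [edge for layer in mmst for edge in layer]
```

Community detection (the fast unfolding method mentioned above) would then run on `network_edges`; that step is left out of this sketch.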
RESULTS
We obtain a 10‐layer MMST network of 362 prefecture nodes and 3,610 edges derived from the matrix of the Euclidean distances among these areas. These prefectures are divided into eight groups in the spatial network via community detection. We evaluate the partition by comparing the inter‐distances and intra‐distances of the communities and obtain a meaningful regional ethnicity classification.
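A partition can be checked along the lines described above by comparing average distances within and between communities. The sketch below is one simple way to do that, assuming plain means over node pairs; the abstract does not state the exact statistic used, and the name `mean_intra_inter` is hypothetical.

```python
from itertools import combinations


def mean_intra_inter(dist, labels):
    """Return (mean intra-community distance, mean inter-community distance).

    dist   -- symmetric matrix (list of lists) of pairwise distances
    labels -- labels[i] is the community of node i
    """
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        (intra if labels[i] == labels[j] else inter).append(dist[i][j])
    mean = lambda xs: sum(xs) / len(xs) if xs else float("nan")
    return mean(intra), mean(inter)


# Toy input: two tight groups far apart on a line.
points = [0, 1, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
intra, inter = mean_intra_inter(dist, [0, 0, 1, 1])
```

A well-separated partition should give an intra-community mean clearly below the inter-community mean, as in this toy case.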
DISCUSSION
The visualization of the resulting communities on the map indicates that the prefectures in the same community are usually geographically adjacent. The formation of this partition is influenced by geographical factors, historic migrations, trade and economic factors, as well as isolation of culture and language. The MMST algorithm proves to be effective in geo‐genealogy and ethnicity classification because it retains essential information about surname affinity and highlights the geographical consanguinity of the population.
National Natural Science Foundation of China, Grant/Award Numbers: 61773069, 71731002; National Social Science Foundation of China, Grant/Award Number: 14BSH024; Foundation of China of China Scholarships Council, Grant/Award Numbers: 201606045048, 201706040188, 201706040015; DOE, Grant/Award Number: DE-AC07-05Id14517; DTRA, Grant/Award Number: HDTRA1-14-1-0017; NSF, Grant/Award Numbers: CHE-1213217, CMMI-1125290, PHY-1505000.
Published version
Communities in Networks
We survey some of the concepts, methods, and applications of community
detection, which has become an increasingly important area of network science.
To help ease newcomers into the field, we provide a guide to available
methodology and open problems, and discuss why scientists from diverse
backgrounds are interested in these problems. As a running theme, we emphasize
the connections of community detection to problems in statistical physics and
computational optimization.
Comment: survey/review article on community structure in networks; published
version is available at
http://people.maths.ox.ac.uk/~porterm/papers/comnotices.pd
Connecting Dream Networks Across Cultures
Many species dream, yet there remain many open research questions in the
study of dreams. The symbolism of dreams and their interpretation is present in
cultures throughout history. Analysis of online data sources for dream
interpretation using network science leads to understanding symbolism in dreams
and their associated meaning. In this study, we introduce dream interpretation
networks for English, Chinese and Arabic that represent different cultures from
various parts of the world. We analyze communities in these networks, finding
that symbols within a community are semantically related. The central nodes in
communities give insight about cultures and symbols in dreams. The community
structure of different networks highlights cultural similarities and
differences. Interconnections between different networks are also identified by
translating symbols from different languages into English. Structural
correlations across networks point out relationships between cultures.
Similarities between network communities are also investigated by analysis of
sentiment in symbol interpretations. We find that interpretations within a
community tend to have similar sentiment. Furthermore, we cluster communities
based on their sentiment, yielding three main categories of positive, negative,
and neutral dream symbols.
Comment: 6 pages, 3 figures
Eigenvector localization as a tool to study small communities in online social networks
We present and discuss a mathematical procedure for identification of small
"communities" or segments within large bipartite networks. The procedure is
based on spectral analysis of the matrix encoding network structure. The
principal tool here is localization of eigenvectors of the matrix, by means of
which the relevant network segments become visible. We exemplify our approach
by analyzing data related to product reviewing on Amazon.com. We found
several segments, a kind of hybrid community of densely interlinked reviewers
and products, which we were able to meaningfully interpret in terms of the type
and thematic categorization of reviewed items. The method provides a
complementary approach to other ways of community detection, typically aiming
at identification of large network modules.
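The abstract does not say how localization is measured. A common choice, used here purely as an assumption, is the inverse participation ratio (IPR) of an eigenvector: it is close to 1/n when the weight is spread evenly over n nodes and close to 1 when the weight sits on a single node. The function name is hypothetical, and computing the eigenvectors themselves is outside this sketch.

```python
def inverse_participation_ratio(vec):
    """IPR = sum(v_i^4) / (sum(v_i^2))^2 for a real-valued vector.

    The normalization makes the value independent of the vector's scale.
    """
    sq = [v * v for v in vec]
    norm = sum(sq)
    if norm == 0:
        raise ValueError("zero vector has no defined IPR")
    return sum(s * s for s in sq) / (norm * norm)


# Evenly spread weight over four nodes versus weight on one node.
spread = inverse_participation_ratio([0.5, 0.5, 0.5, 0.5])
localized = inverse_participation_ratio([1.0, 0.0, 0.0, 0.0])
```

Eigenvectors with a high IPR would be the candidates for small, localized segments; the nodes carrying most of the weight indicate which reviewers and products are involved.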
Community Detection on Evolving Graphs
Clustering is a fundamental step in many information-retrieval and data-mining applications. Detecting clusters in graphs is also a key tool for finding the community structure in social and behavioral networks. In many of these applications, the input graph evolves over time in a continual and decentralized manner, and, to maintain a good clustering, the clustering algorithm needs to repeatedly probe the graph. Furthermore, there are often limitations on the frequency of such probes, either imposed explicitly by the online platform (e.g., in the case of crawling proprietary social networks like Twitter) or implicitly because of resource limitations (e.g., in the case of crawling the web). In this paper, we study a model of clustering on evolving graphs that captures this aspect of the problem. Our model is based on the classical stochastic block model, which has been used to assess rigorously the quality of various static clustering methods. In our model, the algorithm is supposed to reconstruct the planted clustering, given the ability to query for small pieces of local information about the graph, at a limited rate. We design and analyze clustering algorithms that work in this model, and show asymptotically tight upper and lower bounds on their accuracy. Finally, we perform simulations, which demonstrate that our main asymptotic results also hold in practice.
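For readers unfamiliar with the classical (static) stochastic block model that the paper builds on, the sketch below samples one such graph: nodes are assigned to planted blocks, and each pair is linked with probability `p_in` inside a block and `p_out` across blocks. It covers only the static model, not the evolving variant or the query-limited algorithms studied in the paper; the function and parameter names are hypothetical.

```python
import random


def sample_sbm(block_sizes, p_in, p_out, seed=None):
    """Sample an undirected graph from a simple stochastic block model.

    block_sizes -- number of nodes in each planted block
    p_in        -- edge probability for two nodes in the same block
    p_out       -- edge probability for two nodes in different blocks
    Returns (labels, edges), where labels[i] is the planted block of node i.
    """
    rng = random.Random(seed)
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return labels, edges


# Two planted blocks; dense inside, sparse across.
labels, edges = sample_sbm([20, 20], p_in=0.6, p_out=0.05, seed=1)
```

A clustering algorithm is then judged by how well it recovers `labels` from `edges` alone.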