Time evolution of Wikipedia network ranking
We study the time evolution of the ranking and spectral properties of the Google
matrix of the English Wikipedia hyperlink network during the years 2003 - 2011. The
statistical properties of ranking of Wikipedia articles via PageRank and
CheiRank probabilities, as well as the matrix spectrum, are shown to be
stabilized for 2007 - 2011. Special emphasis is placed on the ranking of Wikipedia
personalities and universities. We show that PageRank selection is dominated by
politicians while 2DRank, which combines PageRank and CheiRank, gives more
weight to personalities in the arts. The Wikipedia PageRank of universities
recovers 80 percent of the top universities of the Shanghai ranking during the
considered time period.
Comment: 10 pages, 11 figures. Accepted for publication in EPJ
Google matrix analysis of directed networks
In the past ten years, modern societies have developed enormous communication and
social networks. Their classification and information retrieval processing
have become a formidable task for society. Due to the rapid growth of the World Wide
Web, social and communication networks, new mathematical methods have been
invented to characterize the properties of these networks on a more detailed
and precise level. Various search engines are essentially using such methods.
It is highly important to develop new tools to classify and rank the enormous
amount of network information in a way adapted to internal network structures
and characteristics. This review describes the Google matrix analysis of
directed complex networks demonstrating its efficiency on various examples
including World Wide Web, Wikipedia, software architecture, world trade, social
and citation networks, brain neural networks, DNA sequences and Ulam networks.
The analytical and numerical matrix methods used in this analysis originate
from the fields of Markov chains, quantum chaos and Random Matrix theory.
Comment: 56 pages, 58 figures. Missing link added in network example of Fig. 3
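The construction named in this abstract can be illustrated with a minimal sketch. It assumes the standard Google matrix form G = alpha*S + (1 - alpha)/N with damping factor alpha = 0.85, where dangling nodes (no outgoing links) are replaced by uniform columns, and it computes CheiRank as the PageRank of the network with all link directions inverted. The function names and the three-node toy network are hypothetical and chosen only for illustration; this is not the authors' code.

```python
import numpy as np

def google_matrix(adj, alpha=0.85):
    """Build G = alpha*S + (1 - alpha)/N.

    Convention assumed here: adj[i, j] = 1 if node j links to node i,
    so column j describes the outgoing links of node j.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    col_sums = adj.sum(axis=0)
    dangling = col_sums == 0
    S = np.empty_like(adj)
    # Normalize each non-dangling column to unit sum (Markov transitions).
    S[:, ~dangling] = adj[:, ~dangling] / col_sums[~dangling]
    # Dangling nodes jump uniformly to every node.
    S[:, dangling] = 1.0 / n
    return alpha * S + (1.0 - alpha) / n

def stationary_vector(G, tol=1e-12, max_iter=1000):
    """Power iteration for the eigenvector of G with eigenvalue 1."""
    n = G.shape[0]
    p = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        p_next = G @ p
        p_next /= p_next.sum()
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

def pagerank_and_cheirank(adj, alpha=0.85):
    """PageRank uses the network as given; CheiRank uses inverted links."""
    adj = np.asarray(adj, dtype=float)
    pagerank = stationary_vector(google_matrix(adj, alpha))
    cheirank = stationary_vector(google_matrix(adj.T, alpha))
    return pagerank, cheirank

# Toy network (hypothetical): links 0->1, 0->2, 1->2, 2->0.
toy_adj = np.array([[0, 0, 1],
                    [1, 0, 0],
                    [1, 1, 0]])
toy_pagerank, toy_cheirank = pagerank_and_cheirank(toy_adj)
```

In this toy network, node 2 receives the most incoming weight and leads in PageRank, while node 0 has the most outgoing links and leads in CheiRank.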
Google matrix of the world trade network
Using the United Nations Commodity Trade Statistics Database
[http://comtrade.un.org/db/] we construct the Google matrix of the world trade
network and analyze its properties for various trade commodities for all
countries and all available years from 1962 to 2009. The trade flows on this
network are classified with the help of PageRank and CheiRank algorithms
developed for the World Wide Web and other large scale directed networks. For
world trade, this ranking treats all countries on equal democratic grounds
independent of country richness. Still, this method puts at the top a group of
industrially developed countries for trade in all commodities. Our study
establishes the existence of two solid-state-like domains of rich and poor
countries which remain stable in time, while the majority of countries are
shown to be in a gas-like phase with strong rank fluctuations. A simple random
matrix model provides a good description of the statistical distribution of
countries in the two-dimensional rank plane. The comparison with the usual ranking by
export and import highlights new features and possibilities of our approach.
Comment: 14 pages, 13 figures. More detailed data and high definition figures
are available on the website:
http://www.quantware.ups-tlse.fr/QWLIB/tradecheirank/index.htm
Highlighting Entanglement of Cultures via Ranking of Multilingual Wikipedia Articles
How do different cultures evaluate a person? Is an important person in one
culture also important in another culture? We address these questions via
ranking of multilingual Wikipedia articles. With three ranking algorithms based
on the network structure of Wikipedia, we assign a ranking to all articles in 9
multilingual editions of Wikipedia and investigate the general ranking structure of
PageRank, CheiRank and 2DRank. In particular, we focus on articles related to
persons, identify the top 30 persons for each rank among different editions and
analyze distinctions of their distributions over activity fields such as
politics, art, science, religion, sport for each edition. We find that local
heroes are dominant but also global heroes exist and create an effective
network representing entanglement of cultures. The Google matrix analysis of
the network of cultures shows signs of the Zipf law distribution. This approach
allows one to examine diversity and shared characteristics of knowledge
organization between cultures. The developed computational, data-driven
approach highlights cultural interconnections in a new perspective.
Comment: Published in PLoS ONE
(http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0074554).
Supporting information is available on the same webpage
Google matrix analysis of the multiproduct world trade network
Using the United Nations COMTRADE database, we construct the
Google matrix of multiproduct world trade between the UN countries and
analyze the properties of trade flows on this network for years 1962 - 2010.
This construction, based on Markov chains, treats all countries on equal
democratic grounds independently of their richness and at the same time it
considers the contributions of trade products proportionally to their trade
volume. We consider the trade in 61 products for up to 227 countries. The
obtained results show that the trade contribution of products is asymmetric:
some of them are export oriented while others are import oriented, even if the
ranking by their trade volume is symmetric with respect to export and import
after averaging over all world countries. The construction of the Google matrix
allows one to investigate the sensitivity of the trade balance with respect to price
variations of products, e.g. petroleum and gas, taking into account the world
connectivity of trade links. The trade balance based on PageRank and CheiRank
probabilities highlights the leading role of China and other BRICS countries in
world trade in recent years. We also show that the eigenstates of the Google matrix with
large eigenvalues select specific trade communities.
Comment: 19 pages, 25 figures
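The abstract does not state how the trade balance is built from the two probabilities. One plausible form, assumed here, is the normalized difference B_c = (P*_c - P_c) / (P*_c + P_c) for each country c, where P_c is the PageRank probability (taken as import-like, since it follows incoming links) and P*_c is the CheiRank probability (taken as export-like). The function name and the numbers below are hypothetical and invented purely for illustration.

```python
def rank_trade_balance(pagerank, cheirank):
    """Per-country balance in [-1, 1]; assumed form (P* - P) / (P* + P)."""
    return {
        country: (cheirank[country] - pagerank[country])
        / (cheirank[country] + pagerank[country])
        for country in pagerank
    }

# Hypothetical probabilities for three placeholder countries (not real data).
pagerank = {"A": 0.5, "B": 0.3, "C": 0.2}
cheirank = {"A": 0.25, "B": 0.3, "C": 0.45}
balance = rank_trade_balance(pagerank, cheirank)
```

Under this assumed definition, a positive value marks a country whose export-like rank probability exceeds its import-like one, and the value is bounded regardless of the country's total trade volume.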
Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia
Hyperlinks are an essential feature of the World Wide Web. They are
especially important for online encyclopedias such as Wikipedia: an article can
often only be understood in the context of related articles, and hyperlinks
make it easy to explore this context. But important links are often missing,
and several methods have been proposed to alleviate this problem by learning a
linking model based on the structure of the existing links. Here we propose a
novel approach to identifying missing links in Wikipedia. We build on the fact
that the ultimate purpose of Wikipedia links is to aid navigation. Rather than
merely suggesting new links that are in tune with the structure of existing
links, our method finds missing links that would immediately enhance
Wikipedia's navigability. We leverage data sets of navigation paths collected
through a Wikipedia-based human-computation game in which users must find a
short path from a start to a target article by only clicking links encountered
along the way. We harness human navigational traces to identify a set of
candidates for missing links and then rank these candidates. Experiments show
that our procedure identifies missing links of high quality.
Interactions of cultures and top people of Wikipedia from ranking of 24 language editions
Wikipedia is a huge global repository of human knowledge that can be
leveraged to investigate intertwinements between cultures. With this aim, we
apply methods of Markov chains and the Google matrix to the analysis of the
hyperlink networks of 24 Wikipedia language editions, and rank all their
articles by PageRank, 2DRank and CheiRank algorithms. Using automatic
extraction of people's names, we obtain the top 100 historical figures for each
edition and for each algorithm. We investigate their spatial, temporal, and
gender distributions in dependence on their cultural origins. Our study
demonstrates not only the existence of skewness with local figures, mainly
recognized only in their own cultures, but also the existence of global
historical figures appearing in a large number of editions. By determining the
birth time and place of these persons, we perform an analysis of the evolution
of such figures through 35 centuries of human history for each language, thus
recovering interactions and entanglement of cultures over time. We also obtain
the distributions of historical figures over world countries, highlighting
geographical aspects of cross-cultural links. Considering historical figures
who appear in multiple editions as interactions between cultures, we construct
a network of cultures and identify the most influential cultures according to
this network.
Comment: 32 pages. 10 figures. Submitted for publication. Supporting
information is available on
http://www.quantware.ups-tlse.fr/QWLIB/topwikipeople