Experience versus Talent Shapes the Structure of the Web
We use sequential large-scale crawl data to empirically investigate and
validate the dynamics that underlie the evolution of the structure of the web.
We find that the overall structure of the web is defined by an intricate
interplay between experience or entitlement of the pages (as measured by the
number of inbound hyperlinks a page already has), inherent talent or fitness of
the pages (as measured by the likelihood that someone visiting the page would
give a hyperlink to it), and the continual high rates of birth and death of
pages on the web. We find that the web is conservative in judging talent and
the overall fitness distribution is exponential, showing low variability. The
small variance in talent, however, is enough to lead to experience
distributions with high variance: The preferential attachment mechanism
amplifies these small biases and leads to heavy-tailed power-law (PL) inbound
degree distributions over all pages, as well as over pages that are of the same
age. The balancing act between experience and talent on the web allows newly
introduced pages with novel and interesting content to grow quickly and surpass
older pages. In this regard, it is much like what we observe in high-mobility
and meritocratic societies: People with entitlement continue to have access to
the best resources, but there is just enough screening for fitness that allows
for talented winners to emerge and join the ranks of the leaders. Finally, we
show that the fitness estimates have potential practical applications in
ranking query results.
Challenges in cross-cultural/multilingual music information seeking
Understanding and meeting the needs of a broad range of music users across different cultures and languages is central to designing a global music digital library. This exploratory study examines cross-cultural/multilingual music information seeking behaviors and reveals some important characteristics of these behaviors by analyzing 107 authentic music information queries from the Korean knowledge search portal Naver (knowledge) iN and 150 queries from the Google Answers website. We conclude that new sets of access points must be developed to accommodate music queries that cross cultural or language boundaries.
An Analytical Study of Large SPARQL Query Logs
With the adoption of RDF as the data model for Linked Data and the Semantic
Web, query specification from end-users has become more and more common in
SPARQL endpoints. In this paper, we conduct an in-depth analytical study of
the queries formulated by end-users and harvested from large and up-to-date
query logs from a wide variety of RDF data sources. As opposed to previous
studies, ours is the first assessment on a voluminous query corpus, spanning
over several years and covering many representative SPARQL endpoints. Apart
from the syntactical structure of the queries, which already exhibits
interesting results on this generalized corpus, we drill deeper into the
structural characteristics related to the graph- and hypergraph representation
of queries. We outline the most common shapes of queries when visually
displayed as pseudographs, and characterize their (hyper-)tree width.
Moreover, we analyze the evolution of queries over time, by introducing the
novel concept of a streak, i.e., a sequence of queries that appear as
subsequent modifications of a seed query. Our study offers several fresh
insights on the already rich query features of real SPARQL queries formulated
by real users, and leads us to draw a number of conclusions and pinpoint
future directions for SPARQL query evaluation, query optimization, tuning,
and benchmarking.