35,233 research outputs found
Connecting every bit of knowledge: The Structure of Wikipedia’s first link network
Apples, porcupines, and the most obscure Bob Dylan song\u27is every topic a few clicks from Philosophy? Within Wikipedia, the surprising answer is yes: nearly all paths lead to Philosophy. Wikipedia is the largest, most meticulously indexed collection of human knowledge ever amassed. More than information about a topic, Wikipedia is a web of naturally emerging relationships. By following the first link in each article, we algorithmically construct a directed network of all 4.7 million articles: Wikipedia\u27s First Link Network.
Here we study the English edition of Wikipedia\u27s First Link Network for insight into how the many inventions, places, people, objects, and events are related and organized. We traverse every path, measuring the accumulation of first links, path lengths, basins, cycles, and the influence each article exerts in shaping the network. We discover scale-free distributions describe path length, accumulation, and influence. Far from dispersed, first links disproportionately accumulate at a few articles\u27flowing from specific to general and culminating around fundamental notions such as Community, State, and Science. Philosophy shapes more paths than any other article by two orders of magnitude. Curiously, we also observe a gravitation towards topical articles such as Health Care and Fossil Fuel. These findings enrich our view of the connections and structure of Wikipedia\u27s ever growing store of knowledge
Global disease monitoring and forecasting with Wikipedia
Infectious disease is a leading threat to public health, economic stability,
and other key social structures. Efforts to mitigate these impacts depend on
accurate and timely monitoring to measure the risk and progress of disease.
Traditional, biologically-focused monitoring techniques are accurate but costly
and slow; in response, new techniques based on social internet data such as
social media and search queries are emerging. These efforts are promising, but
important challenges in the areas of scientific peer review, breadth of
diseases and countries, and forecasting hamper their operational usefulness.
We examine a freely available, open data source for this use: access logs
from the online encyclopedia Wikipedia. Using linear models, language as a
proxy for location, and a systematic yet simple article selection procedure, we
tested 14 location-disease combinations and demonstrate that these data
feasibly support an approach that overcomes these challenges. Specifically, our
proof-of-concept yields models with up to 0.92, forecasting value up to
the 28 days tested, and several pairs of models similar enough to suggest that
transferring models from one location to another without re-training is
feasible.
Based on these preliminary results, we close with a research agenda designed
to overcome these challenges and produce a disease monitoring and forecasting
system that is significantly more effective, robust, and globally comprehensive
than the current state of the art.Comment: 27 pages; 4 figures; 4 tables. Version 2: Cite McIver & Brownstein
and adjust novelty claims accordingly; revise title; various revisions for
clarit
Formulas for Continued Fractions. An Automated Guess and Prove Approach
We describe a simple method that produces automatically closed forms for the
coefficients of continued fractions expansions of a large number of special
functions. The function is specified by a non-linear differential equation and
initial conditions. This is used to generate the first few coefficients and
from there a conjectured formula. This formula is then proved automatically
thanks to a linear recurrence satisfied by some remainder terms. Extensive
experiments show that this simple approach and its straightforward
generalization to difference and -difference equations capture a large part
of the formulas in the literature on continued fractions.Comment: Maple worksheet attache
- …