Transforming Wikipedia into an Ontology-based Information Retrieval Search Engine for Local Experts using a Third-Party Taxonomy
Wikipedia is widely used for finding general information about a wide variety
of topics. Its vocation is not to provide local information. For example, it
provides plot, cast, and production information about a given movie, but not
showing times in your local movie theatre. Here we describe how we can connect
local information to Wikipedia, without altering its content. The case study we
present involves finding local scientific experts. Using a third-party
taxonomy, independent from Wikipedia's category hierarchy, we index information
connected to our local experts, present in their activity reports, and we
re-index Wikipedia content using the same taxonomy. The connections between
Wikipedia pages and local expert reports are stored in a relational database,
accessible through a public SPARQL endpoint. A Wikipedia gadget (or plugin),
activated by the interested user, queries the endpoint as each Wikipedia page
is accessed. An additional tab on the Wikipedia page allows the user to open up
a list of teams of local experts associated with the subject matter in the
Wikipedia page. The technique, though presented here as a way to identify local
experts, is generic: any third-party taxonomy can be used in this way to
connect Wikipedia to any non-Wikipedia data source.
Comment: Joint Second Workshop on Language and Ontology & Terminology and
Knowledge Structures (LangOnto2 + TermiKS) LO2TKS, May 2016, Portoroz,
Slovenia. 201
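The linking idea in the abstract above (index both sources against one shared taxonomy, then connect pages and reports that share a concept) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy taxonomy, the concept ids, the substring matching, the team names, and the in-memory dictionaries that stand in for the database and SPARQL endpoint the abstract mentions. It is not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical third-party taxonomy: term -> concept id (illustrative only).
TAXONOMY = {
    "machine learning": "T1",
    "ontology": "T2",
    "protein": "T3",
}


def index_text(text, taxonomy=TAXONOMY):
    """Return the concept ids whose term occurs in the text.

    Plain case-insensitive substring matching is an assumption; a real
    indexer would likely use proper term extraction.
    """
    lowered = text.lower()
    return {cid for term, cid in taxonomy.items() if term in lowered}


def link_pages_to_teams(pages, reports):
    """Connect each page title to the teams sharing at least one concept."""
    teams_by_concept = defaultdict(set)
    for team, text in reports.items():
        for cid in index_text(text):
            teams_by_concept[cid].add(team)

    links = {}
    for title, text in pages.items():
        teams = set()
        for cid in index_text(text):
            teams |= teams_by_concept[cid]
        links[title] = sorted(teams)
    return links


# Toy stand-ins for Wikipedia pages and local activity reports.
pages = {
    "Machine learning": "Machine learning is a field of study in artificial intelligence.",
    "Protein folding": "Protein folding is the process by which a protein chain acquires its structure.",
    "Sailing": "Sailing uses wind to propel a craft.",
}
reports = {
    "Team A": "We apply machine learning to ontology alignment.",
    "Team B": "Our group studies protein structure prediction with machine learning.",
}

links = link_pages_to_teams(pages, reports)
```

Here `links["Machine learning"]` lists both teams, while a page with no taxonomy term in common with any report maps to an empty list. Because neither source is modified, the same lookup would work for any other taxonomy or non-Wikipedia data source.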
Between news and history: Identifying networked topics of collective attention on Wikipedia
The digital information landscape has introduced a new dimension to understanding how we collectively react to new information and preserve it at the societal level. This, together with the emergence of platforms such as Wikipedia, has challenged traditional views on the relationship between current events and historical accounts of events, with an ever-shrinking divide between "news" and "history". Wikipedia's place as the Internet's primary reference work thus poses the question of how it represents both traditional encyclopaedic knowledge and evolving important news stories. In other words, how is information on and attention towards current events integrated into the existing topical structures of Wikipedia? To address this we develop a temporal community detection approach to topic detection that takes into account both short-term dynamics of attention and long-term article network structures. We apply this method to a dataset of one year of current events on Wikipedia to identify clusters distinct from those that would be found solely from page view time series correlations or static network structure. We are able to resolve the topics that more strongly reflect unfolding current events vs more established knowledge by the relative importance of collective attention dynamics vs link structures. We also offer important developments by identifying and describing the emergent topics on Wikipedia. This work provides a means of distinguishing how these information and attention clusters are related to Wikipedia's twin faces of encyclopaedic knowledge and current events, which is crucial to understanding the production and consumption of knowledge in the digital age.
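As a rough illustration of combining attention dynamics with link structure, the sketch below weights each hyperlink by a mix of a constant structural term and the page view correlation of its two endpoints, then groups articles whose edge weight passes a threshold. The mixing parameter `alpha`, the threshold, the toy article names and view counts, and the use of connected components as the grouping step are all assumptions. The grouping step in particular is a simple stand-in, not the temporal community detection method the abstract refers to.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def weighted_edges(links, views, alpha=0.5):
    """Mix link structure (weight 1.0) with non-negative attention correlation."""
    edges = {}
    for a, b in links:
        corr = max(0.0, pearson(views[a], views[b]))
        edges[(a, b)] = (1 - alpha) * 1.0 + alpha * corr
    return edges


def clusters(edges, threshold):
    """Connected components over edges with weight >= threshold (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), w in edges.items():
        ra, rb = find(a), find(b)  # registers both nodes even if not merged
        if w >= threshold:
            parent[ra] = rb

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(g) for g in groups.values())


# Toy daily page views: two articles spike together, one stays flat.
views = {
    "Election": [10, 12, 90, 80, 15],
    "Candidate": [5, 6, 50, 45, 8],
    "Parliament": [30, 31, 29, 30, 31],
}
links = [("Election", "Candidate"), ("Election", "Parliament")]

edges = weighted_edges(links, views, alpha=0.5)
topics = clusters(edges, threshold=0.8)
```

With `alpha=0` the grouping depends only on link structure, and with `alpha=1` only on correlated attention. Varying it gives a feel for how the relative importance of the two signals changes which articles end up together.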
A Data-Driven Sketch of Wikipedia Editors
Who edits Wikipedia? We attempt to shed light on this question by using aggregated log data from Yahoo!’s browser toolbar in order to analyze Wikipedians’ editing behavior in the context of their online lives beyond Wikipedia. We broadly characterize editors by investigating how their online behavior differs from that of other users; e.g., we find that Wikipedia editors search more, read more news, play more games, and, perhaps surprisingly, are more immersed in pop culture. Then we inspect how editors’ general interests relate to the articles to which they contribute; e.g., we confirm the intuition that editors show more expertise in their active domains than average users. Our results are relevant as they illuminate novel aspects of what has become many Web users’ prevalent source of information and can help in recruiting new editors.