Tagging, Folksonomy & Co - Renaissance of Manual Indexing?
This paper gives an overview of current trends in manual indexing on the Web.
Along with a general rise of user generated content there are more and more
tagging systems that allow users to annotate digital resources with tags
(keywords) and share their annotations with other users. Tagging is frequently
seen in contrast to traditional knowledge organization systems or as something
completely new. This paper shows that tagging is better seen as a
popular form of manual indexing on the Web. The difference between controlled and
free indexing blurs given sufficient feedback mechanisms. A revised typology of
tagging systems is presented that includes different user roles and knowledge
organization systems with hierarchical relationships and vocabulary control. A
detailed bibliography of current research in collaborative tagging is included.
Comment: Preprint. 12 pages, 1 figure, 54 references
A Survey of Volunteered Open Geo-Knowledge Bases in the Semantic Web
Over the past decade, rapid advances in web technologies, coupled with
innovative models of spatial data collection and consumption, have generated a
robust growth in geo-referenced information, resulting in spatial information
overload. Increasing 'geographic intelligence' in traditional text-based
information retrieval has become a prominent approach to respond to this issue
and to fulfill users' spatial information needs. Numerous efforts in the
Semantic Geospatial Web, Volunteered Geographic Information (VGI), and the
Linking Open Data initiative have converged in a constellation of open
knowledge bases, freely available online. In this article, we survey these open
knowledge bases, focusing on their geospatial dimension. Particular attention
is devoted to the crucial issue of the quality of geo-knowledge bases, as well
as of crowdsourced data. A new knowledge base, the OpenStreetMap Semantic
Network, is outlined as our contribution to this area. Research directions in
information integration and Geographic Information Retrieval (GIR) are then
reviewed, with a critical discussion of their current limitations and future
prospects.
Distilling Information Reliability and Source Trustworthiness from Digital Traces
Online knowledge repositories typically rely on their users or dedicated
editors to evaluate the reliability of their content. These evaluations can be
viewed as noisy measurements of both information reliability and information
source trustworthiness. Can we leverage these noisy evaluations, often biased,
to distill a robust, unbiased and interpretable measure of both notions?
In this paper, we argue that the temporal traces left by these noisy
evaluations give cues on the reliability of the information and the
trustworthiness of the sources. Then, we propose a temporal point process
modeling framework that links these temporal traces to robust, unbiased and
interpretable notions of information reliability and source trustworthiness.
Furthermore, we develop an efficient convex optimization procedure to learn the
parameters of the model from historical traces. Experiments on real-world data
gathered from Wikipedia and Stack Overflow show that our modeling framework
accurately predicts evaluation events, provides an interpretable measure of
information reliability and source trustworthiness, and yields interesting
insights about real-world events.
Comment: Accepted at the 26th World Wide Web Conference (WWW-17)
Enriching videos with light semantics
This paper describes a prototypical framework, still under development, to annotate and retrieve web videos with light semantics. The proposed framework reuses many existing vocabularies along with a video model. The knowledge is captured from three different information spaces (media content, context, document). We also describe ways to extract semantic content descriptions from existing user-generated content using multiple approaches to linguistic processing and Named Entity Recognition; the extracted entities are then identified with DBpedia resources to establish meanings for the tags. Finally, the implemented prototype is described with multiple search interfaces and retrieval processes. Evaluation of the semantic enrichment shows a considerable improvement in content description (for 50% of videos).
Entity Type Prediction in Knowledge Graphs using Embeddings
Open Knowledge Graphs (such as DBpedia, Wikidata, YAGO) have been recognized
as the backbone of diverse applications in the field of data mining and
information retrieval. Hence, the completeness and correctness of the Knowledge
Graphs (KGs) are vital. Most of these KGs are created either via automated
information extraction from Wikipedia snapshots, via accumulation of information
provided by users, or using heuristics. However, it has been
observed that the type information of these KGs is often noisy, incomplete, and
incorrect. To deal with this problem a multi-label classification approach is
proposed in this work for entity typing using KG embeddings. We compare our
approach with the current state-of-the-art type prediction method and report on
experiments with the KGs.
Exploring Features for Predicting Policy Citations
In this study we performed an initial investigation and evaluation of
altmetrics and their relationship with public policy citation of research
papers. We examined methods for using altmetrics and other data to predict
whether a research paper is cited in public policy, and applied receiver
operating characteristic (ROC) curve analysis to various feature groups in order
to evaluate their potential usefulness. Of the methods we tested, classifying
based on tweet count provided the best results, achieving an area under the ROC
curve of 0.91.
Comment: 2 pages, accepted to JCDL '1
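As a rough illustration of the evaluation described in the abstract above, the area under the ROC curve for a single feature can be computed directly from the feature values and the binary labels. The sketch below is not the authors' code: the function name `auc_from_scores` and the toy data are hypothetical, and it uses the standard rank-sum identity (AUC equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half).

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    scores: numeric feature values used as classifier scores (e.g. tweet counts)
    labels: 1 if the paper is cited in policy, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, invented for illustration only.
tweet_counts = [0, 2, 5, 1, 40, 12, 3, 7]
cited_in_policy = [0, 0, 1, 0, 1, 1, 0, 0]

auc = auc_from_scores(tweet_counts, cited_in_policy)
print(f"AUC using tweet count as the score: {auc:.3f}")
```

The pairwise loop is quadratic in the number of papers, which is fine for a sketch; for larger datasets a sort-based computation or an established library routine would be the more practical choice.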