120 research outputs found
On crowdsourcing relevance magnitudes for information retrieval evaluation
Magnitude estimation is a psychophysical scaling technique for the measurement of sensation, where observers assign numbers to stimuli in response to their perceived intensity. We investigate the use of magnitude estimation for judging the relevance of documents for information retrieval evaluation, carrying out a large-scale user study across 18 TREC topics and collecting over 50,000 magnitude estimation judgments using crowdsourcing. Our analysis shows that magnitude estimation judgments can be reliably collected using crowdsourcing, are competitive in terms of assessor cost, and are, on average, rank-aligned with ordinal judgments made by expert relevance assessors. We explore the application of magnitude estimation for IR evaluation, calibrating two gain-based effectiveness metrics, nDCG and ERR, directly from user-reported perceptions of relevance. A comparison of TREC system effectiveness rankings based on binary, ordinal, and magnitude estimation relevance shows substantial variation; in particular, the top systems ranked using magnitude estimation and ordinal judgments differ substantially. Analysis of the magnitude estimation scores shows that this effect is due in part to varying perceptions of relevance: different users have different perceptions of the impact of relative differences in document relevance. These results have direct implications for IR evaluation, suggesting that current assumptions about a single view of relevance being sufficient to represent a population of users are unlikely to hold.
Maddalena, Eddy; Mizzaro, Stefano; Scholer, Falk; Turpin, Andrew
Multidimensional news quality: A comparison of crowdsourcing and nichesourcing
In the age of fake news and of filter bubbles, assessing the quality of information is a compelling issue: it is important for users to understand the quality of the information they consume online. We report on our experiment aimed at understanding if workers from the crowd can be a suitable alternative to
Towards building a standard dataset for Arabic keyphrase extraction evaluation
Keyphrases are short phrases that best
represent a document's content. They can be useful
in a variety of applications, including document
summarization and retrieval models. In this paper,
we introduce the first dataset of keyphrases for an
Arabic document collection, obtained by means of
crowdsourcing. We experimentally evaluate different
crowdsourced answer aggregation strategies and
validate their performance against expert annotations
to evaluate the quality of our dataset. We
report on our experimental results, the dataset
features
Visual exploration and retrieval of XML document collections with the generic system X2
This article reports on the XML retrieval system X2 which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2 which distinguishes it from other visual query systems for XML is that it supports various degrees of detailedness in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed
Using ontological contexts to assess the relevance of statements in ontology evolution
Ontology evolution tools often propose new ontological changes in the form of statements. While different methods exist to check the quality of such statements to be added to the ontology (e.g., in terms of consistency and impact), their relevance is usually left to the user to assess. Relevance in this context is a notion of how well the statement fits in the target ontology. We present an approach to automatically assess such relevance. It is acknowledged in cognitive science and other research areas that a piece of information flowing between two entities is relevant if there is an agreement on the context used between the entities. In our approach, we derive the context of a statement from online ontologies in which it is used, and study how this context matches with the target ontology. We identify relevance patterns that give an indication of relevance when the statement context and the target ontology fulfill specific conditions. We validate our approach through an experiment in three different domains, and show how our pattern-based technique outperforms a naive overlap-based approach
Efficiency Theory: a Unifying Theory for Information, Computation and Intelligence
The paper serves as the first contribution towards the development of the
theory of efficiency: a unifying framework for the currently disjoint theories
of information, complexity, communication and computation. Realizing the
defining nature of the brute force approach in the fundamental concepts in all
of the above-mentioned fields, the paper suggests using efficiency or
improvement over the brute force algorithm as a common unifying factor
necessary for the creation of a unified theory of information manipulation. By
defining such diverse terms as randomness, knowledge, intelligence and
computability in terms of a common denominator we are able to bring together
contributions from Shannon, Levin, Kolmogorov, Solomonoff, Chaitin, Yao and
many others under a common umbrella of the efficiency theory
Mapping recent information behavior research: an analysis of co-authorship and cocitation networks
There has been an increase in research published on information behavior in recent years, and this has been accompanied by an increase in its diversity and interaction with other fields, particularly information retrieval (IR). The aims of this study are to determine which researchers have contributed to producing the current body of knowledge on this subject, and to describe its intellectual basis. A bibliometric and network analysis was applied to authorship and co-authorship as well as citation and co-citation. According to these analyses, there is a small number of authors who can be considered to be the most productive and who publish regularly, and a large number of transient ones. Other findings reveal a marked predominance of theoretical works, some examples of qualitative methodology that originate in other areas of social science, and a high incidence of research focused on user interaction with information retrieval systems and the information behavior of doctors