
    Crowdsourcing Peer Review in the Digital Humanities?

    We propose Readersourcing, originally proposed by Mizzaro [1], as an alternative to the standard peer review activity: an approach that aims to exploit the otherwise lost opinions of the readers of publications. Such an approach can be formalized by means of different models that share the same general principles. These models should define a way to measure the overall quality of a publication as well as the reputation of a reader as an assessor; moreover, from these measures it should be possible to derive the reputation of a scholar as an author. We describe an ecosystem called Readersourcing 2.0, which provides an implementation for two Readersourcing models [2, 3], by outlining its goals and requirements. Readersourcing 2.0 will be used in the future to gather fresh data for analysis and validation.
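
    As a rough illustration of the kind of model involved (not the two models of [2, 3] themselves), the following Python sketch assumes a simple mutual-reinforcement rule: a publication's quality is the reputation-weighted mean of its readers' ratings, and a reader's reputation grows with how closely their ratings track the current quality estimates. The function name, the data layout, and the update rule are illustrative assumptions.

    # Illustrative sketch of a Readersourcing-style mutual-reinforcement loop.
    # The update rule is an assumption for illustration, not the exact models
    # implemented in Readersourcing 2.0.
    def readersourcing(ratings, n_iter=50):
        """ratings: dict mapping (reader, paper) -> score in [0, 1]."""
        readers = {r for r, _ in ratings}
        papers = {p for _, p in ratings}
        reputation = {r: 1.0 for r in readers}  # start with all readers equal
        quality = {p: 0.5 for p in papers}
        for _ in range(n_iter):
            # Paper quality = reputation-weighted mean of the ratings it received.
            for p in papers:
                num = sum(reputation[r] * s for (r, q), s in ratings.items() if q == p)
                den = sum(reputation[r] for (r, q), _ in ratings.items() if q == p)
                quality[p] = num / den if den else 0.5
            # Reader reputation = closeness of their ratings to current quality.
            for r in readers:
                errs = [abs(s - quality[q]) for (rr, q), s in ratings.items() if rr == r]
                reputation[r] = 1.0 - sum(errs) / len(errs) if errs else 1.0
        return quality, reputation

    quality, reputation = readersourcing({
        ("alice", "p1"): 0.9, ("bob", "p1"): 0.2, ("carol", "p1"): 0.8,
        ("alice", "p2"): 0.4, ("bob", "p2"): 0.9,
    })
    print(quality, reputation)
    # An author's reputation could then be derived, e.g., as the mean quality
    # of that author's publications.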

    Readersourcing: scholarly publishing, peer review, and barefoot cobbler's children

    In this talk, I will start from an introduction to the field of scholarly publishing, the main knowledge dissemination mechanism adopted by science, and I will pay particular attention to one of its most important aspects, peer review. I will present the aims and motivations of scholarly publishing and peer review, and discuss some of their limits: Nobel Prize winners having their papers rejected, fraudulent behavior, sometimes long publishing times, etc. I will then briefly mention Science 2.0, namely the use of Web 2.0 tools to do science in a hopefully more effective way. I will then move to the main aspect of the talk. My thesis is composed of three parts. 1. Peer review is a scarce resource, i.e., there are not enough good referees today. I will try to support this statement with something more solid than the usual anecdotal experience of being rejected because of bad review(er)s -- an experience that I'm sure almost every researcher has had. 2. An alternative mechanism to peer review is available right out there; it is already widely used in the Web 2.0, it is quite a hot topic, and it is probably much studied and discussed by researchers: crowdsourcing. According to Web 2.0 enthusiasts, crowdsourcing makes it possible to outsource to a large crowd tasks that are usually performed by a small group of experts. I think that peer review might be replaced -- or complemented -- by what we can call Readersourcing: a large crowd of readers who judge the papers that they read. Since most scholarly papers have many more readers than reviewers, this would make it possible to harness a large evaluation workforce. Today, readers' opinions are usually discussed very informally, have an impact on bibliographic citations and bibliometric indexes, or stay inside their own minds. In my opinion, it is quite curious that such an important resource, which is free, already available, and used and studied by the research community in the Web 2.0 field, is not used at all in today's scholarly publishing, where the very same researchers publish their results. 3. Of course, to get the wisdom of the crowd, some readers have to be more equal than others: expert readers should be more influential than naive readers. There are probably several possible choices to this end; I suggest using a mechanism that I proposed some years ago, which makes it possible to evaluate papers, authors, and readers in an objective way. I will close the talk by showing some preliminary experimental results that support this Readersourcing proposal.

    On crowdsourcing relevance magnitudes for information retrieval evaluation

    Magnitude estimation is a psychophysical scaling technique for the measurement of sensation, where observers assign numbers to stimuli in response to their perceived intensity. We investigate the use of magnitude estimation for judging the relevance of documents for information retrieval evaluation, carrying out a large-scale user study across 18 TREC topics and collecting over 50,000 magnitude estimation judgments using crowdsourcing. Our analysis shows that magnitude estimation judgments can be reliably collected using crowdsourcing, are competitive in terms of assessor cost, and are, on average, rank-aligned with ordinal judgments made by expert relevance assessors. We explore the application of magnitude estimation for IR evaluation, calibrating two gain-based effectiveness metrics, nDCG and ERR, directly from user-reported perceptions of relevance. A comparison of TREC system effectiveness rankings based on binary, ordinal, and magnitude estimation relevance shows substantial variation; in particular, the top systems ranked using magnitude estimation and ordinal judgments differ substantially. Analysis of the magnitude estimation scores shows that this effect is due in part to varying perceptions of relevance: different users have different perceptions of the impact of relative differences in document relevance. These results have direct implications for IR evaluation, suggesting that current assumptions about a single view of relevance being sufficient to represent a population of users are unlikely to hold.
    Maddalena, Eddy; Mizzaro, Stefano; Scholer, Falk; Turpin, Andrew
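
    As a rough sketch of how magnitude estimation scores might feed a gain-based metric, the following Python snippet normalizes each worker's scores by their geometric mean (a common practice in magnitude estimation studies), averages the normalized scores per document, and uses the result as gains in nDCG. The normalization and the gain mapping are assumptions for illustration, not necessarily the calibration used in the paper.

    import math
    from collections import defaultdict

    def normalize_me(raw):
        """raw: dict worker -> {doc: positive ME score}. Returns doc -> gain."""
        per_doc = defaultdict(list)
        for worker, scores in raw.items():
            # Subtracting the mean log score = dividing by the geometric mean.
            mean_log = sum(math.log(s) for s in scores.values()) / len(scores)
            for doc, s in scores.items():
                per_doc[doc].append(math.log(s) - mean_log)
        # Back to a non-negative gain scale by exponentiating the mean.
        return {doc: math.exp(sum(v) / len(v)) for doc, v in per_doc.items()}

    def dcg(gains):
        return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

    def ndcg(ranked_docs, gain, k=10):
        ideal = dcg(sorted(gain.values(), reverse=True)[:k])
        return dcg([gain.get(d, 0.0) for d in ranked_docs[:k]]) / ideal if ideal else 0.0

    gain = normalize_me({"w1": {"d1": 80, "d2": 10, "d3": 25},
                         "w2": {"d1": 40, "d2": 5, "d3": 30}})
    print(ndcg(["d1", "d3", "d2"], gain, k=3))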

    Multidimensional news quality: A comparison of crowdsourcing and nichesourcing

    In the age of fake news and filter bubbles, assessing the quality of information is a compelling issue: it is important for users to understand the quality of the information they consume online. We report on our experiment aimed at understanding whether workers from the crowd can be a suitable alternative to

    Teaching of web information retrieval: web first or IR first?

    When teaching Web Information Retrieval (IR), a teacher has two alternatives: (i) to teach the classical pre-Web IR issues first and present the Web-specific issues later; or (ii) to teach the Web IR discipline directly, per se. The first approach has the advantages of building on prerequisite knowledge and of presenting the historical development of the discipline, and it probably appears more natural to most lecturers, who have followed the historical development of the field. Conversely, the second approach has the advantage of concentrating on a more modern view of the field, and it probably leads to higher motivation in the students, since the more appealing Web issues are dealt with at the start of the course. I will discuss these issues, mention the approaches followed in the (rather few) Web IR books available, make some comparisons with the teaching of related disciplines, and summarize my experience and some feedback from my students (I have been teaching a Web IR course for two Master's degrees in Computer Science and Information Technology at Udine University for the last two years; I had about twenty students each year; and I followed the first approach).

    Exploiting news to categorize tweets: Quantifying the impact of different news collections

    Short texts are not easy to classify, due to their nature, which makes them full of abbreviations and newly coined acronyms. Text enrichment is emerging in the literature as a potentially useful tool. This paper is part of longer-term research that aims at understanding the effectiveness of enriching tweets by means of news, instead of the whole Web, as a knowledge source. Since the choice of a news collection may produce very different outcomes in the enrichment process, we compare the impact of three features of such collections: volume, variety, and freshness. We show that all three features have a significant impact on categorization accuracy. Copyright © 2016 for the individual papers by the papers' authors.
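
    One plausible form of such enrichment (an illustrative assumption, not the paper's exact pipeline) is to retrieve the news articles most similar to a tweet and append their text to the tweet before classification, as in the following Python sketch based on TF-IDF cosine similarity.

    # Illustrative sketch: enrich a tweet with its most similar news articles
    # before handing it to any standard text classifier.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    news = [
        "Stock markets rally as central bank cuts interest rates",
        "New smartphone unveiled with foldable display",
        "Local team wins championship after dramatic final",
    ]
    vectorizer = TfidfVectorizer(stop_words="english")
    news_matrix = vectorizer.fit_transform(news)

    def enrich(tweet, k=2):
        """Append the text of the k most similar news articles to the tweet."""
        sims = cosine_similarity(vectorizer.transform([tweet]), news_matrix)[0]
        top = sims.argsort()[::-1][:k]
        return tweet + " " + " ".join(news[i] for i in top)

    print(enrich("rates cut again, markets going wild"))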

    Towards building a standard dataset for Arabic keyphrase extraction evaluation

    Keyphrases are short phrases that best represent a document's content. They can be useful in a variety of applications, including document summarization and retrieval models. In this paper, we introduce the first dataset of keyphrases for an Arabic document collection, obtained by means of crowdsourcing. We experimentally evaluate different crowdsourced answer aggregation strategies and validate their performance against expert annotations to assess the quality of our dataset. We report on our experimental results and the dataset features.
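
    As an illustration of what an aggregation strategy and its validation might look like (the vote threshold and exact-match scoring below are assumptions, not necessarily the strategies compared in the paper), the following Python sketch keeps keyphrases proposed by a minimum number of workers and scores the aggregated set against expert annotations with precision, recall, and F1.

    from collections import Counter

    def aggregate(worker_answers, min_votes=2):
        """worker_answers: list of sets of keyphrases, one set per worker."""
        votes = Counter(kp for answer in worker_answers for kp in answer)
        return {kp for kp, n in votes.items() if n >= min_votes}

    def precision_recall_f1(predicted, gold):
        tp = len(predicted & gold)
        p = tp / len(predicted) if predicted else 0.0
        r = tp / len(gold) if gold else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return p, r, f1

    workers = [{"keyphrase extraction", "arabic nlp", "crowdsourcing"},
               {"arabic nlp", "crowdsourcing", "datasets"},
               {"crowdsourcing", "arabic nlp"}]
    experts = {"arabic nlp", "crowdsourcing", "keyphrase extraction"}
    print(precision_recall_f1(aggregate(workers), experts))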

    Visual exploration and retrieval of XML document collections with the generic system X2

    This article reports on the XML retrieval system X2, which has been developed at the University of Munich over the last five years. In a typical session with X2, the user first browses a structural summary of the XML database in order to select interesting elements and keywords occurring in documents. Using this intermediate result, queries combining structure and textual references are composed semiautomatically. After query evaluation, the full set of answers is presented in a visual and structured way. X2 largely exploits the structure found in documents, queries and answers to enable new interactive visualization and exploration techniques that support mixed IR and database-oriented querying, thus bridging the gap between these three views on the data to be retrieved. Another salient characteristic of X2 which distinguishes it from other visual query systems for XML is that it supports various degrees of detailedness in the presentation of answers, as well as techniques for dynamically reordering and grouping retrieved elements once the complete answer set has been computed.
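
    The following Python sketch is not the X2 system; it only illustrates, under simplified assumptions, the kind of query X2 composes semiautomatically: a structural constraint (an element path) combined with a textual constraint (a keyword that must occur in the element's text).

    # Minimal illustration of a combined structure-and-keyword query over XML;
    # the document and element names are invented for the example.
    import xml.etree.ElementTree as ET

    xml_data = """<collection>
      <article><title>Visual exploration of XML</title>
        <section>Interactive visualization of answer sets</section></article>
      <article><title>Database-oriented querying</title>
        <section>Grouping and reordering retrieved elements</section></article>
    </collection>"""

    def query(root, path, keyword):
        """Return elements matching the path whose text contains the keyword."""
        return [el for el in root.findall(path)
                if el.text and keyword.lower() in el.text.lower()]

    root = ET.fromstring(xml_data)
    for el in query(root, ".//section", "visualization"):
        print(el.text)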