4,728 research outputs found

Tools to integrate organoleptic quality criteria into breeding programs

Author: Coulombel A.
REY Frederic
Sinoir N.
Taupier-Létage Bruno
VINDRAS Camille
Publication date: 01/01/2018

This technical booklet provides methodologies and guidance for implementing sensory evaluations to assess organoleptic quality in multi-actor projects for organic agriculture. It presents five detailed tests that can be used in sensory evaluation, methodologies for preparing the samples, and a glossary. The booklet was developed under the Solibam project and updated during the Diversifood project.

Organic Eprints

Full Issue Spring 2010 Volume 5, Issue 2

Publication venue: SFA ScholarWorks
Publication date: 02/11/2018

SFA ScholarWorks

Entity linking: test collections revisited

Author: Deleu Johannes
Demeester Thomas
Develder Chris
Feys Matthias
Mertens Laurent
Publication date: 01/01/2014

Ghent University Academic Bibliography

Archivsystem Ask23

Evaluation of the NAS-ILAB Matrix for Monitoring International Labor Standards: Project Report

Author: Root Lawrence S.
Vernon Ada A.
Publication venue: DigitalCommons@ILR
Publication date: 23/02/2009

The Bureau of International Labor Affairs (ILAB) engaged the National Research Council of the National Academy of Sciences (NAS) to recommend a method to monitor and evaluate labor conditions in a given country. The method focuses on five labor standards: freedom of association and collective bargaining, forced or compulsory labor, child labor, discrimination, and acceptable conditions of work.

DigitalCommons@ILR

eCommons@Cornell

A collaborative approach to IR evaluation

Author: Sheshadri Aashish
Publication date: 16/09/2014

In this thesis we investigate two main problems: 1) inferring consensus from disparate inputs to improve quality of crowd contributed data; and 2) developing a reliable crowd-aided IR evaluation framework. With regard to the first contribution, while many statistical label aggregation methods have been proposed, little comparative benchmarking has occurred in the community making it difficult to determine the state-of-the-art in consensus or to quantify novelty and progress, leaving modern systems to adopt simple control strategies. To aid the progress of statistical consensus and make state-of-the-art methods accessible, we develop a benchmarking framework in SQUARE, an open source shared task framework including benchmark datasets, defined tasks, standard metrics, and reference implementations with empirical results for several popular methods. Through the development of SQUARE we propose a crowd simulation model that emulates real crowd environments to enable rapid and reliable experimentation of collaborative methods with different crowd contributions. We apply the findings of the benchmark to develop reliable crowd contributed test collections for IR evaluation. As our second contribution, we describe a collaborative model for distributing relevance judging tasks between trusted assessors and crowd judges. Based on prior work's hypothesis of judging disagreements on borderline documents, we train a logistic regression model to predict assessor disagreement, prioritizing judging tasks by expected disagreement. Judgments are generated from different crowd models and intelligently aggregated. Given a priority queue, a judging budget, and a ratio for expert vs. crowd judging costs, critical judging tasks are assigned to trusted assessors with the crowd supplying remaining judgments. Results on two TREC datasets show significant judging burden can be confidently shifted to the crowd, achieving high rank correlation and often at lower cost vs. exclusive use of trusted assessors.

Computer Science
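
The budgeted assignment step described in the abstract above (a priority queue ordered by expected disagreement, a judging budget, and an expert vs. crowd cost ratio) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the `JudgingTask` class, the `assign_judges` function, the cost values, and the rule of reserving crowd cost for every task before upgrading the most contentious ones to experts are all assumptions made for illustration. The disagreement scores are taken as given, for example from a trained classifier.

```python
from dataclasses import dataclass


@dataclass
class JudgingTask:
    doc_id: str
    # Hypothetical score in [0, 1], e.g. the output of a disagreement classifier.
    expected_disagreement: float


def assign_judges(tasks, budget, expert_cost=1.0, crowd_cost=0.25):
    """Send the most contentious tasks to experts while the budget allows it.

    Assumption: every task must be judged by someone, so the crowd cost of
    all tasks is reserved first; upgrading one task to an expert then costs
    the difference between the expert and crowd price.
    """
    reserve = crowd_cost * len(tasks)
    if budget < reserve:
        raise ValueError("budget cannot cover crowd judging of all tasks")
    spare = budget - reserve
    upgrade = expert_cost - crowd_cost
    # Highest expected disagreement first (a priority queue via sorting).
    ranked = sorted(tasks, key=lambda t: t.expected_disagreement, reverse=True)
    if upgrade <= 0:
        n_expert = len(ranked)  # experts are no dearer than the crowd
    else:
        n_expert = min(len(ranked), int(spare // upgrade))
    return {
        task.doc_id: ("expert" if i < n_expert else "crowd")
        for i, task in enumerate(ranked)
    }


tasks = [
    JudgingTask("d1", 0.9),
    JudgingTask("d2", 0.1),
    JudgingTask("d3", 0.7),
    JudgingTask("d4", 0.4),
]
assignment = assign_judges(tasks, budget=2.5)
```

With these illustrative numbers the two highest-disagreement documents go to experts and the rest to the crowd; changing `budget` or the cost ratio shifts that boundary.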

Texas ScholarWorks

Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols

Author: AM Cohen
D Demner-Fushman
E Amitay
EM Voorhees
Fabien Campagne
I Soboroff
JA Aslam
K Sparck Jones
KC Dorff
M Fuller
P Boldi
P Dong
R Nuray
S Buttcher
SE Robertson
SF Kim
Y Yue
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications. Results: We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79–0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86–0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently. Conclusion: Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute for human evaluations.
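
The comparison reported in the abstract above rests on Spearman's rank correlation between the scores that two evaluation protocols assign to the same set of retrieval methods. A minimal Python sketch of that statistic follows. The function names and the bpref values are hypothetical and are not taken from the paper; the sketch computes Spearman's rho as the Pearson correlation of average ranks, which also handles ties.

```python
def average_ranks(scores):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("rho is undefined when one ranking is constant")
    return cov / (vx * vy) ** 0.5


# Hypothetical bpref scores for four retrieval methods under two protocols.
bpref_protocol_a = [0.30, 0.25, 0.40, 0.10]
bpref_protocol_b = [0.28, 0.31, 0.45, 0.12]
rho = spearman(bpref_protocol_a, bpref_protocol_b)
```

A value of rho near 1 means the two protocols order the methods almost identically, which is the sense in which the abstract compares NT Evaluation with the TREC judgments.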

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

What you think and what I think: Studying intersubjectivity in knowledge artifacts evaluation

Author: NC DOCKS at The University of North Carolina at Greensboro
Singh Rahul
Zhao Xia
Publication date: 01/01/2015

Miscalibration, the failure to accurately evaluate one's own work relative to others' evaluation, is a common concern in social systems of knowledge creation where participants act as both creators and evaluators. Theories of social norming hold that an individual's self-evaluation miscalibration diminishes over multiple iterations of creator-evaluator interactions and that shared understanding emerges. This paper explores intersubjectivity and the longitudinal dynamics of miscalibration between creators' and evaluators' assessments in IT-enabled social knowledge creation and refinement systems. Using Latent Growth Modeling, we investigated the dynamics of creators' assessments of their own knowledge artifacts compared to peer evaluators' assessments to determine whether miscalibration attenuates over multiple interactions. Contrary to theory, we found that creators' self-assessment miscalibration does not attenuate over repeated interactions. Moreover, depending on the degree of difference, we found self-assessment miscalibration to amplify over time, with knowledge artifact creators diverging farther from their peers' collective opinion. Deeper analysis found no significant evidence of the influence of bias and controversy on miscalibration. Therefore, relying on social norming to correct miscalibration in knowledge creation environments (e.g., social media interactions) may not function as expected.

The University of North Carolina at Greensboro

s-AWARE: Measure-based Supervised Merging Algorithms for Crowd Assessors in Information Retrieval

In this thesis we develop a new approach to exploit crowd assessors' relevance judgements for IR evaluation. We compute evaluation measures based on each assessor's ground truth. These measures are then merged, weighting each assessor on the basis of their expertise level, estimated as the closeness between the assessor's measures and the gold standard measures on a training set. The results highlight the greater performance of the s-AWARE approach with respect to the majority of tested approaches.
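
The merging idea in the abstract above, where each assessor's evaluation measures are weighted by how close they come to gold standard measures on a training set, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the s-AWARE algorithm itself: the abstract does not define the closeness function, so the sketch assumes closeness is one minus the mean absolute difference of measures in [0, 1], and the function names and sample scores are hypothetical.

```python
def assessor_weights(train_scores, gold_scores):
    """Weight each assessor by closeness to the gold measures on training topics.

    train_scores: {assessor: [measure value per training topic]}
    gold_scores:  [gold measure value per training topic]
    Assumption: closeness = 1 - mean absolute difference, clipped at 0.
    """
    closeness = {}
    for assessor, scores in train_scores.items():
        if len(scores) != len(gold_scores):
            raise ValueError(f"score count mismatch for assessor {assessor!r}")
        mad = sum(abs(s - g) for s, g in zip(scores, gold_scores)) / len(gold_scores)
        closeness[assessor] = max(0.0, 1.0 - mad)
    total = sum(closeness.values())
    if total == 0:
        # No assessor is close to gold: fall back to uniform weights.
        return {a: 1.0 / len(closeness) for a in closeness}
    return {a: c / total for a, c in closeness.items()}


def merge_measures(test_scores, weights):
    """Weighted average of the assessors' measure values for one system and topic."""
    return sum(weights[a] * score for a, score in test_scores.items())


gold = [0.5, 0.5]
train = {"assessor_a": [0.5, 0.5], "assessor_b": [1.0, 0.0]}
weights = assessor_weights(train, gold)
merged = merge_measures({"assessor_a": 0.6, "assessor_b": 0.3}, weights)
```

In this toy case `assessor_a` matches the gold measures exactly and so receives twice the weight of `assessor_b`, pulling the merged score toward `assessor_a`'s value.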

Padua Thesis and Dissertation Archive