492 research outputs found
The scholarly impact of TRECVid (2003-2009)
This paper reports on an investigation into the scholarly impact of the TRECVid (TREC Video Retrieval Evaluation) benchmarking conferences between 2003 and 2009. The contribution of TRECVid to research in video retrieval is assessed by analyzing publication content to show the development of techniques and approaches over time and by analyzing publication impact through publication numbers and citation analysis. Popular conference and journal venues for TRECVid publications are identified in terms of number of citations received. For a selection of participants at different career stages, the relative importance of TRECVid publications in terms of citations vis-à-vis their other publications is investigated. TRECVid, as an evaluation conference, provides data on which research teams “scored” highly against the evaluation criteria and the relationship between “top scoring” teams at TRECVid and the “top scoring” papers in terms of citations is analysed. A strong relationship was found between “success” at TRECVid and “success” at citations both for high scoring and low scoring teams. The implications of the study in terms of the value of TRECVid as a research activity, and the value of bibliometric analysis as a research evaluation tool, are discussed.
RHECITAS: citation analysis of French humanities articles
The RHECITAS project aims to analyse citations in French Humanities and Social Sciences articles using natural language processing techniques. It is based on a corpus of online articles. The project is funded by TGE-ADONIS (CNRS, French National Research Centre). Although very little research, either theoretical or technical, has been done on such data (most approaches focus on science publications written in English), we developed two tools that can automatically a) identify the more important items in a list of references, based on a number of linguistic cues, and b) extract relevant terms associated with a reference. These results offer a new angle on citation analysis, both from a linguistic point of view and for practical applications.
A novel gluten knowledge base of potential biomedical and health-related interactions extracted from the literature: using machine learning and graph analysis methodologies to reconstruct the bibliome
Background
Despite their nutritional properties and broad availability, cereal crops have been associated with different alimentary disorders and symptoms, with most of the responsibility attributed to gluten. Consequently, gluten-related literature continues to be produced at ever-growing rates, driven in part by recent exploratory studies that link gluten to non-traditional diseases and by the popularity of gluten-free diets, making it increasingly difficult to access and analyse practical and structured information. The accelerated pace of novel advances in diagnosis and treatment, together with exploratory studies, creates a favourable scenario for disinformation and misinformation.
Objectives
Aligned with the European Union strategy “Delivering on EU Food Safety and Nutrition in 2050”, which emphasizes the inextricable links between imbalanced diets, the increased exposure to unreliable sources of information and misleading information, and the increased dependency on reliable sources of information, this paper presents GlutKNOIS, a public and interactive literature-based database that reconstructs and represents the experimental biomedical knowledge extracted from the gluten-related literature. The developed platform integrates knowledge from external databases, bibliometric statistics and social media discussion to propose a novel and enhanced way to search, visualise and analyse potential biomedical and health-related interactions in relation to the gluten domain.
Methods
For this purpose, the presented study applies a semi-supervised curation workflow that combines natural language processing techniques, machine learning algorithms, ontology-based normalization and integration approaches, named entity recognition methods, and graph knowledge reconstruction methodologies to process, classify, represent and analyse the experimental findings contained in the literature, which is also complemented by data from the social discussion.
Results and conclusions
In this sense, 5814 documents were manually annotated and 7424 were fully automatically processed to reconstruct the first online gluten-related knowledge database of evidenced health-related interactions that produce health or metabolic changes based on the literature. In addition, the automatic processing of the literature combined with the knowledge representation methodologies proposed has the potential to assist in the revision and analysis of years of gluten research. The reconstructed knowledge base is public and accessible at https://sing-group.org/glutknois/
Fundação para a Ciência e a Tecnologia | Ref. UIDB/50006/2020; Xunta de Galicia | Ref. ED481B-2019-032; Xunta de Galicia | Ref. ED431G2019/06; Xunta de Galicia | Ref. ED431C 2022/03; Universidade de Vigo/CISU
Benchmarking some Portuguese S&T system research units: 2nd Edition
The increasing use of productivity and impact metrics for evaluation and
comparison, not only of individual researchers but also of institutions,
universities and even countries, has prompted the development of bibliometrics.
Currently, metrics are becoming widely accepted as an easy and balanced way to
assist the peer review and evaluation of scientists and/or research units,
provided they have adequate precision and recall.
This paper presents a benchmarking study of a selected list of representative
Portuguese research units, based on a fairly complete set of parameters:
bibliometric parameters, number of competitive projects and number of PhDs
produced. The study aimed at collecting productivity and impact data from the
selected research units under comparable conditions, i.e., using objective
metrics based on public information, retrievable on-line and/or from official
sources, and thus verifiable and repeatable. The study has thus focused on the
activity of the 2003-06 period, for which such data was available from the
latest official evaluation.
The main advantage of our study was the application of automatic tools,
achieving relevant results at a reduced cost. Moreover, the results over the
selected units suggest that this kind of analysis will be very useful to
benchmark scientific productivity and impact, and assist peer review.
Comment: 26 pages, 20 figures. F. Couto, D. Faria, B. Tavares, P. Gonçalves,
and P. Verissimo, Benchmarking some portuguese S&T system research units: 2nd
edition, DI/FCUL TR 13-03, Department of Informatics, University of Lisbon,
February 201
Incidental or influential? – A decade of using text-mining for citation function classification.
This work looks in depth at several studies that have attempted to automate the process of citation importance classification based on the publications' full text. We offer a comparison of their individual similarities, strengths and weaknesses. We analyse a range of features that have been previously used in this task. Our experimental results confirm that the number of in-text references is highly predictive of influence. Contrary to the work of Valenzuela et al. (2015), we find abstract similarity to be one of the most predictive features. Overall, we show that many of the features previously described in the literature either have been reported as not particularly predictive, cannot be reproduced from their existing descriptions, or should not be used due to their reliance on external, changing evidence. Additionally, we find significant variance in the results provided by the PDF extraction tools used in the pre-processing stages of citation extraction. This has a direct and significant impact on the classification features that rely on this extraction process. Consequently, we discuss challenges and potential improvements in the classification pipeline, provide a critical review of the performance of individual features and address the importance of constructing a large-scale gold-standard reference dataset.
Usage Bibliometrics
Scholarly usage data provides unique opportunities to address the known
shortcomings of citation analysis. However, the collection, processing and
analysis of usage data remain an area of active research. This article
provides a review of the state of the art in usage-based informetrics, i.e. the
use of usage data to study the scholarly process.
Comment: Publisher's PDF (by permission). Publisher web site:
books.infotoday.com/asist/arist44.shtm