
28 research outputs found

PaperMaker: validation of biomedical scientific publications

Author: D. Rebholz-Schuhmann
Leitner
P. Pezik
Rebholz-Schuhmann
S. Kavaliauskas
Publication venue: Oxford University Press

Motivation: The automatic analysis of scientific literature can support authors in writing their manuscripts.

Crossref

PubMed Central

MedEvi: Retrieving textual evidence of relations between biomedical concepts from Medline

Author: D. Rebholz-Schuhmann
Hoffmann
J.-j. Kim
P. Pezik
Rebholz-Schuhmann
Publication venue: Oxford University Press

Summary: Search engines running on MEDLINE abstracts have been widely used by biologists to find publications that are related to their research. Existing search engines such as PubMed, however, have limitations when applied to the task of seeking textual evidence of relations between given concepts. The limitations arise mainly because these search engines do not deal effectively with multi-term queries, which may imply semantic relations between the terms. To address this problem, we present MedEvi, a novel search engine that imposes a positional restriction on occurrences matching multi-term queries, based on the observation that terms whose semantic relations are explicitly stated in text are not found too far from each other. MedEvi further identifies additional keywords of biological and statistical significance from the local context of matching occurrences in order to help users reformulate their queries for better results.
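
The positional restriction described in this summary can be illustrated with a minimal sketch. This is not MedEvi's implementation: the function name `proximity_matches`, the `max_span` parameter, the regex tokenizer, and the restriction to single-word terms are all assumptions made for illustration.

```python
import re

def proximity_matches(text, terms, max_span=10):
    """Return (start, end) token positions where all query terms co-occur.

    Hypothetical helper: a window qualifies only if every term appears
    within fewer than max_span tokens of the first matched term.
    Terms are assumed to be single words.
    """
    tokens = re.findall(r"\w+", text.lower())
    wanted = {t.lower() for t in terms}
    # Positions of every token that is one of the query terms.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in wanted]
    matches = []
    for idx, (start, _) in enumerate(positions):
        seen = set()
        for pos, tok in positions[idx:]:
            if pos - start >= max_span:
                break  # the window is too wide, give up on this start
            seen.add(tok)
            if seen == wanted:
                matches.append((start, pos))
                break
    return matches

sentence = "p53 directly activates transcription of the mdm2 gene"
print(proximity_matches(sentence, ["p53", "mdm2"], max_span=10))
print(proximity_matches(sentence, ["p53", "mdm2"], max_span=5))
```

With a wide window the two terms are reported as a match; with a narrow one the same sentence yields none, which is the behaviour a positional restriction is meant to give.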

Crossref

PubMed Central

Response to comment on "MeSH-up: effective MeSH text classification for improved document retrieval"

Author: de Jong Franciska M.G.
Kraaij Wessel
Lee Vivian
Pezik Piotr
Rebholz-Schuhmann Dietrich
Trieschnigg Rudolf Berend
Publication date: 15/10/2009

University of Twente Research Information

Agricultural Academy

Author: G Jurgiel-Malecka
M Gibczynska
M Nawrocka-Pezik
Publication date: 01/01/2015

The six analysed onion cultivars (Allium cepa L.) cultivated in Poland were characterised by different colours of onion scale leaf: Albion and Alibaba (white cultivars), Grabowska and Majka (yellow cultivars), Scarlet and Wenta (red cultivars). The onion cultivars were obtained from the Experimental Station of Cultivars Testing in Węgrzce near Kraków. The following were determined for each cultivar: the content of macro- and micronutrients, reducing and total sugar, and vitamin C. Significant differences in chemical composition between the analysed cultivars were found. Cultivars of the same colour exhibited similar tendencies in accumulating most of the analysed elements. The greatest differences in chemical content were found between yellow and red cultivars. Yellow cultivars accumulated significantly greater amounts of nitrogen, phosphorus, potassium, magnesium, iron, manganese, zinc, copper and reducing sugar than red onion cultivars. Red onion cultivars contained significantly greater amounts of total sugar and vitamin C than yellow onion cultivars.

CiteSeerX

Overview of the Authorship Verification Task at PAN 2022

Author: Bevendorff Janek
Heini Annina
Kestemont Mike
Kredens Krzysztof
Pezik Piotr
Potthast Martin
Stamatatos Efstathios
Stein Benno
Publication date: 05/09/2022

The authorship verification task at PAN 2022 follows the experimental setup of similar shared tasks in the recent past. However, it focuses on a different and very challenging scenario: given two texts belonging to different discourse types, the task is to determine whether they are written by the same author. Based on a new corpus in English, we provide pairs of texts using four discourse types: essays, emails, text messages, and business memos. The differences in communicative purpose, intended audience, and the level of formality render the cross-discourse-type authorship verification task very hard. We received 7 submissions and evaluated them using the TIRA integrated research architecture, along with two baseline approaches. This paper reviews the submissions and presents a detailed discussion of the evaluation results.

Aston Publications Explorer

MeSH Up: effective MeSH text classification for improved document retrieval

Author: Aronson
Aronson
Camous
Dietrich Rebholz-Schuhmann
Dolf Trieschnigg
Franciska de Jong
Gaudan
Hersh
Hersh
Hiemstra
Kim
Lam
Lam
Lavrenko
Lewis
Lin
Lu
Nenadic
Parkinson
Piotr Pezik
Rak
Robertson
Ruch
Ruiz
Schuemie
Smucker
Sohn
Srinivasan
Vivian Lee
Wessel Kraaij
Yu
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Controlled vocabularies such as the Medical Subject Headings (MeSH) thesaurus and the Gene Ontology (GO) provide an efficient way of accessing and organizing biomedical information by reducing the ambiguity inherent to free-text data. Different methods of automating the assignment of MeSH concepts have been proposed to replace manual annotation, but they are either limited to a small subset of MeSH or have only been compared with a limited number of other systems.

CiteSeerX

Crossref

PubMed Central

Leiden University Scholary Publications

Radboud Repository

University of Twente Research Information

BioLexicon: Towards a reference terminological resource in the biomedical domain

Author: Ananiadou Sophia
Calzolari Nicoletta
Del Gratta Riccardo
Kim Jung-Jae
Lee Vivian
McNaught John
Monachini Monica
Montemagni Simonetta
Pezik Piotr
Rebholz-Schuhmann Dietrich
Sasaki Yutaka
Publication venue: Oxford University Press
Publication date: 01/01/2008

The BioLexicon is a publicly available large-scale terminological resource which brings together potential terms from several resources representing selected semantic types (genes, proteins, chemicals, species, enzymes, selected ontological terms). The schema of the BioLexicon enables improved resolution of term ambiguity and follows lexical standards for terminological resources.

The University of Manchester - Institutional Repository

PUblication MAnagement

Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

Author: A Jimeno
A Jimeno-Yepes
A Schwartz
A Yeh
Alan R Aronson
Antonio J Jimeno-Yepes
B McInnes
B McInnes
Bridget T McInnes
C Leacock
C Manning
G Leroy
H Liu
H Liu
H Liu
J Fan
L Hirschman
M Stevenson
M Weeber
P Pezik
R Leaman
S Gaudan
S Humphrey
T Pedersen
WA Gale
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Evaluation of Word Sense Disambiguation (WSD) methods in the biomedical domain is difficult because the available resources are either too small or too focused on specific types of entities (e.g. diseases or genes). We present a method that can be used to automatically develop a WSD test collection using the Unified Medical Language System (UMLS) Metathesaurus and the manual MeSH indexing of MEDLINE. We demonstrate the use of this method by developing such a data set, called MSH WSD. Methods: In our method, the Metathesaurus is first screened to identify ambiguous terms whose possible senses consist of two or more MeSH headings. We then use each ambiguous term and its corresponding MeSH heading to extract MEDLINE citations where the term and only one of the MeSH headings co-occur. The term found in the MEDLINE citation is automatically assigned the UMLS CUI linked to the MeSH heading. Each instance has been assigned a UMLS Concept Unique Identifier (CUI). We compare the characteristics of the MSH WSD data set to the previously existing NLM WSD data set. Results: The resulting MSH WSD data set consists of 106 ambiguous abbreviations, 88 ambiguous terms and 9 which are a combination of both, for a total of 203 ambiguous entities. For each ambiguous term/abbreviation, the data set contains a maximum of 100 instances per sense obtained from MEDLINE. We evaluated the reliability of the MSH WSD data set using existing knowledge-based methods and compared their performance to that of the results previously obtained by these algorithms on the pre-existing data set, NLM WSD. We show that the knowledge-based methods achieve different results but keep their relative performance except for the Journal Descriptor Indexing (JDI) method, whose performance is below the other methods. Conclusions: The MSH WSD data set allows the evaluation of WSD algorithms in the biomedical domain. Compared to previously existing data sets, MSH WSD contains a larger number of biomedical terms/abbreviations and covers the largest set of UMLS Semantic Types. Furthermore, the MSH WSD data set has been generated automatically reusing already existing annotations and, therefore, can be regenerated from subsequent UMLS versions.
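
The selection rule in the Methods part of this abstract (keep a citation only when the ambiguous term co-occurs with exactly one of its candidate MeSH headings, then label it with the CUI linked to that heading, capped at 100 instances per sense) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: the function name, the data shapes, the substring-based term match, and the placeholder CUI strings are all assumptions.

```python
def build_wsd_instances(term, sense_headings, citations, max_per_sense=100):
    """Collect labelled instances for one ambiguous term.

    term           -- the ambiguous term or abbreviation
    sense_headings -- dict mapping a MeSH heading to its linked CUI
                      (placeholder identifiers in the example below)
    citations      -- iterable of (text, set_of_mesh_headings) pairs
    """
    instances = {cui: [] for cui in sense_headings.values()}
    for text, mesh in citations:
        # Assumed matching rule: plain case-insensitive substring search.
        if term.lower() not in text.lower():
            continue
        present = [h for h in sense_headings if h in mesh]
        # Keep the citation only if exactly one candidate sense is indexed.
        if len(present) != 1:
            continue
        cui = sense_headings[present[0]]
        if len(instances[cui]) < max_per_sense:
            instances[cui].append(text)
    return instances

senses = {"Common Cold": "CUI_COMMON_COLD", "Cold Temperature": "CUI_COLD_TEMP"}
toy_citations = [
    ("Patients with a cold were enrolled.", {"Common Cold"}),
    ("Cold storage of samples.", {"Cold Temperature"}),
    ("Cold exposure and the common cold.", {"Common Cold", "Cold Temperature"}),
    ("An unrelated sentence.", {"Common Cold"}),
]
print(build_wsd_instances("cold", senses, toy_citations))
```

In this toy run the third citation is discarded because both headings are indexed, and the fourth because the term does not occur in the text.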

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Annotation of protein residues based on a literature analysis: cross-validation against UniProtKb

Author: A Stark
Antonio Jimeno-Yepes
BJ Polacco
BJ Stapley
C Blaschke
C Blaschke
C Friedman
CH Wu
CJO Baker
CJO Baker
D Bourigault
D Rebholz-Schuhmann
D Rebholz-Schuhmann
Dietrich Rebholz-Schuhmann
DL Wheeler
DM Kristensen
EM Marcotte
F Cerbah
F Guenthner
F Horn
G Leroy
JA Barker
JC Nebel
Kevin Nagel
LC Lee
M Ikeda
MM Babu
P Pezik
R Kanagasabai
R Witte
S Gaudan
S Yoon
TJ Oldfield
Y Miyao
Y Tateisi
Y Tsuruoka
YL Yip
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: A protein annotation database, such as the Universal Protein Resource knowledge base (UniProtKb), is a valuable resource for the validation and interpretation of predicted 3D structure patterns in proteins. Existing studies have focussed on point mutation extraction methods from biomedical literature which can be used to support the time consuming work of manual database curation. However, these methods were limited to point mutation extraction and do not extract features for the annotation of proteins at the residue level. Results: This work introduces a system that identifies protein residues in MEDLINE abstracts and annotates them with features extracted from the context written in the surrounding text. MEDLINE abstract texts have been processed to identify protein mentions in combination with taxonomic species and protein residues (F1-measure 0.52). The identified protein-species-residue triplets have been validated and benchmarked against reference data resources (UniProtKb, average F1-measure of 0.54). Then, contextual features were extracted through shallow and deep parsing and the features have been classified into predefined categories (F1-measure ranges from 0.15 to 0.67). Furthermore, the feature sets have been aligned with annotation types in UniProtKb to assess the relevance of the annotations for ongoing curation projects. Altogether, the annotations have been assessed automatically and manually against reference data resources. Conclusion: This work proposes a solution for the automatic extraction of functional annotation for protein residues from biomedical articles. The presented approach is an extension to other existing systems in that a wider range of residue entities are considered and that features of residues are extracted as annotations.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The Strategic Impact of META-NET on the Regional, National and International Level

Author: Georg Rehm, Hans Uszkoreit, Sophia Ananiadou, Núria Bel, Audrone Bieleviciene, Lars Borin, António Branco, Gerhard Budin, Nicoletta Calzolari, Walter Daelemans, Radovan Garabík, Marko Grobelnik, Carmen Garcia-Mateo, Josef Van Genabith, Jan Hajic, Inma Hernaez, John Judge, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Joseph Mariani, John Mcnaught, Maite Melero, Monica Monachini, Asuncion Moreno, Jan Odijk, Maciej Ogrodniczuk, Piotr Pezik, Stelios Piperidis, Adam Przepiórkowski, Eiríkur Rögnvaldsson, Michael Rosner, Bolette Sandford Pedersen, Inguna Skadina, Koenraad De Smedt, Marko Tadić, Paul Thompson, Dan Tufiș, Tamás Váradi, Andrejs Vasiljevs, Kadri Vider, Jolanta Zabarskaite
Publication venue: European Language Resources Association (ELRA)
Publication date: 26/05/2014

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the situation of funding for LT topics. This paper documents the initiative’s work throughout Europe in order to boost progress and innovation in our field.

Helsingin yliopiston digitaalinen arkisto