17 research outputs found

BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine.

Author: Baker Simon
Björne Jari
Brown Susan Windisch
Collins Charlotte
Korhonen Anna
Majewska Olga
Palmer Martha
Publication venue: Journal of biomedical semantics
Publication date: 01/07/2021

Background: Recent advances in representation learning have enabled large strides in natural language understanding; however, verbal reasoning remains a challenge for state-of-the-art systems. External sources of structured, expert-curated verb-related knowledge have been shown to boost model performance in different Natural Language Processing (NLP) tasks where accurate handling of verb meaning and behaviour is critical. The cost and time required for manual lexicon construction have been a major obstacle to porting the benefits of such resources to NLP in specialised domains, such as biomedicine. To address this issue, we combine a neural classification method with expert annotation to create BioVerbNet. This new resource comprises 693 verbs assigned to 22 top-level and 117 fine-grained semantic-syntactic verb classes. We make this resource available complete with semantic roles and VerbNet-style syntactic frames.
Results: We demonstrate the utility of the new resource in boosting model performance in document- and sentence-level classification in biomedicine. We apply an established retrofitting method to harness the verb class membership knowledge from BioVerbNet and transform a pretrained word embedding space by pulling together verbs belonging to the same semantic-syntactic class. The BioVerbNet knowledge-aware embeddings surpass the non-specialised baseline by a significant margin on both tasks.
Conclusion: This work introduces the first large, annotated semantic-syntactic classification of biomedical verbs, providing a detailed account of the annotation process, the key differences in verb behaviour between the general and biomedical domains, and the design choices made to accurately capture the meaning and properties of verbs used in biomedical texts. The demonstrated benefits of leveraging BioVerbNet in text classification suggest the resource could help systems better tackle challenging NLP tasks in biomedicine.

Directory of Open Access Journals

Apollo (Cambridge)

BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine

Author: Baker Simon
Björne Jari
Brown Susan Windisch
Collins Charlotte
Korhonen Anna
Majewska Olga
Palmer Martha
Publication venue: Springer Science and Business Media LLC
Publication date: 28/10/2022

UTUPub

Semantic annotation of metaphorical verbs with VerbNet

Author: Brown Susan Windisch
Publication venue: Harry Bunt
Publication date: 01/01/2012

Florence Research

VerbNet class assignment as a WSD task

Author: Brown Susan Windisch
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2012

Florence Research

IMAGACT4ALL: Mapping Spanish Varieties onto a Corpus-Based Ontology of Action.

Author: Gloria Gagliardi
Massimo Moneglia
Susan Windisch Brown
Publication venue
Publication date: 01/01/2014

IMAGACT is a corpus-based ontology of action concepts, derived from English and Italian spontaneous speech resources, which makes use of the universal language of images to identify action types. IMAGACT4ALL is an Internet infrastructure for mapping languages onto the ontology. Because the action concepts are represented with videos, extension into new languages is done using competence-based judgments by mother-tongue informants without intense lexicographic work involving underdetermined semantic description. It has already been proven on Spanish and Chinese, and it is now in the process of being extended to Hindi, Bengali, Sanskrit, Urdu, Oriya, Polish, and European and Brazilian Portuguese. IMAGACT4ALL has also been successfully used to implement language varieties, such as European and American (Argentinian) Spanish. The first part of this paper presents the infrastructure and the methodology for mapping languages onto the ontology. In the second part we present the results of a comparative analysis of European and American Spanish data derived from the database, which show relevant distinctions in the referential properties of the Spanish verbal lexicon in the two language varieties.

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"

Florence Research

Università degli Studi di Napoli L'Orientale: CINECA IRIS

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

IMAGACT4ALL: Mapping Spanish Varieties onto a Corpus-Based Ontology of Action

Author: Gloria Gagliardi
Massimo Moneglia
Susan Windisch Brown
Publication venue
Publication date: 01/01/2014

ARCHIVIO ISTITUZIONALE DELLA RICERCA-UNIVERSITA' DEGLI STUDI DI NAPOLI "L'ORIENTALE"