38 research outputs found

Unsupervised Acquisition of Verb Subcategorization Frames from Shallow-Parsed Corpora

Author: Lenci Alessandro
McGillivray Barbara
Montemagni Simonetta
Pirrelli Vito
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2008

In this paper, we report experiments on the unsupervised automatic acquisition of Italian and English verb subcategorization frames (SCFs) from general and domain corpora. The proposed technique operates on shallow-parsed corpora using a limited number of search heuristics that do not rely on any previous lexico-syntactic knowledge about SCFs. Although preliminary, the reported results are in line with state-of-the-art lexical acquisition systems. We also explore whether verbs with similar SCF distributions share similar semantic properties, by clustering verbs that share frames with the same distribution using the Minimum Description Length (MDL) principle. First experiments in this direction were carried out on Italian verbs, with encouraging results.

Archivio della Ricerca - Università di Pisa

PUblication MAnagement

Disambiguating Nouns, Verbs, and Adjectives Using Automatically Acquired Selectional Preferences

Author: Briscoe Ted
Diana McCarthy
John Carroll
Li Hang
Publication venue: MIT Press - Journals
Publication date: 01/01/2003

Selectional preferences have been used by word sense disambiguation (WSD) systems as one source of disambiguating information. We evaluate WSD using selectional preferences acquired for English adjective-noun, subject, and direct object grammatical relationships with respect to a standard test corpus. The selectional preferences are specific to verb or adjective classes, rather than individual word forms, so they can be used to disambiguate the co-occurring adjectives and verbs, rather than just the nominal argument heads. We also investigate use of the one-sense-per-discourse heuristic to propagate a sense tag for a word to other occurrences of the same word within the current document in order to increase coverage. Although the preferences perform well in comparison with other unsupervised WSD systems on the same corpus, the results show that for many applications, further knowledge sources would be required to achieve an adequate level of accuracy and coverage. In addition to quantifying performance, we analyze the results to investigate the situations in which the selectional preferences achieve the best precision and in which the one-sense-per-discourse heuristic increases performance.
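The one-sense-per-discourse heuristic mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the input layout (parallel lists of lemmas and optional sense tags for one document), and the majority-vote tie handling are all assumptions made for illustration.

```python
from collections import Counter

def propagate_one_sense_per_discourse(lemmas, tags):
    """Fill in missing sense tags within a single document.

    lemmas: list of lemmas, one per token (hypothetical input layout).
    tags:   list of sense tags, with None where the tagger abstained.
    Assumption: an untagged occurrence receives the most frequent tag
    already assigned to the same lemma in this document, and nothing
    is propagated when the top two tags are tied.
    """
    votes = {}
    for lemma, tag in zip(lemmas, tags):
        if tag is not None:
            votes.setdefault(lemma, Counter())[tag] += 1

    result = []
    for lemma, tag in zip(lemmas, tags):
        if tag is None and lemma in votes:
            ranked = votes[lemma].most_common(2)
            # Propagate only when one sense clearly dominates.
            if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
                tag = ranked[0][0]
        result.append(tag)
    return result
```

Applied to a document in which one occurrence of a lemma is tagged and the others are not, the sketch copies that tag to the untagged occurrences, which is how the heuristic increases coverage.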

CiteSeerX

Crossref

Sussex Research Online

Learning Fine-Grained Selectional Restrictions

Author: Hovy Eduard
Tan Yongmei
Publication venue: Department of English, National Chengchi University
Publication date: 01/01/2013

Waseda University Repository

Argumentness and Probabilistic Case Structures

Author: Lee Ik-Hwan
Yang Dan-Hee
Publication venue: The Korean Society for Language and Information
Publication date: 01/01/2002

Waseda University Repository

The Minimum Description Length Principle for Pattern Mining: A Survey

Author: Galbrun Esther
Publication date: 28/07/2021

This is about the Minimum Description Length (MDL) principle applied to pattern mining. The length of this description is kept to the minimum. Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The MDL principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim of obtaining compact, high-quality sets of patterns. After giving an outline of relevant concepts from information theory and coding, as well as of work on the theory behind the MDL and similar principles, we review MDL-based methods for mining various types of data and patterns. Finally, we open a discussion on some issues regarding these methods, and highlight currently active related data analysis problems.
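As a rough illustration of MDL as a model selection method, the Python sketch below compares two-part code lengths, L(M) + L(D|M), for a binary sequence. It is not taken from the survey: the two candidate models (a parameter-free fair coin and a fitted Bernoulli), the function names, and the common approximation of charging (1/2) log2 n bits for one fitted parameter are assumptions chosen to keep the example small.

```python
import math

def data_bits(seq, p):
    """Bits needed to encode a 0/1 sequence under Bernoulli(p)."""
    ones = sum(seq)
    zeros = len(seq) - ones
    bits = 0.0
    if ones:
        bits -= ones * math.log2(p)
    if zeros:
        bits -= zeros * math.log2(1 - p)
    return bits

def two_part_lengths(seq):
    """Total description length L(M) + L(D|M) for each candidate model."""
    n = len(seq)
    # Fair coin: no parameters to transmit, so L(M) = 0.
    fair = data_bits(seq, 0.5)
    # Fitted Bernoulli: one parameter, charged 0.5 * log2(n) bits (assumed cost).
    p_hat = sum(seq) / n
    fitted = 0.5 * math.log2(n) + data_bits(seq, p_hat)
    return {"fair": fair, "fitted": fitted}

def select_model(seq):
    """Pick the model with the shortest total description."""
    lengths = two_part_lengths(seq)
    return min(lengths, key=lengths.get)
```

A heavily skewed sequence is cheaper to describe with the fitted model despite the parameter cost, while a balanced one is cheaper under the fair coin. MDL-based pattern mining makes the same kind of trade-off between the cost of a pattern set and the cost of the data given those patterns.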

arXiv.org e-Print Archive