3 research outputs found
Measuring Thematic Fit with Distributional Feature Overlap
In this paper, we introduce a new distributional method for modeling
predicate-argument thematic fit judgments. We use a syntax-based DSM to build a
prototypical representation of verb-specific roles: for every verb, we extract
the most salient second order contexts for each of its roles (i.e. the most
salient dimensions of typical role fillers), and then we compute thematic fit
as a weighted overlap between the top features of candidate fillers and role
prototypes. Our experiments show that our method consistently outperforms a
baseline re-implementing a state-of-the-art system, and achieves better or
comparable results to those reported in the literature for the other
unsupervised systems. Moreover, it provides an explicit representation of the
features characterizing verb-specific semantic roles.Comment: 9 pages, 2 figures, 5 tables, EMNLP, 2017, thematic fit, selectional
preference, semantic role, DSMs, Distributional Semantic Models, Vector Space
Models, VSMs, cosine, APSyn, similarity, prototyp
An Exploration of Semantic Features in an Unsupervised Thematic Fit Evaluation Framework
Thematic fit is the extent to which an entity fits a thematic role in the semantic frame of an event, e.g., how well humans would rate “knife” as an instrument of an event of cutting. We explore the use of the SENNA semantic role-labeller in defining a distributional space in order to build an unsupervised model of event-entity thematic fit judgements. We test a number of ways of extracting features from SENNA-labelled versions of the ukWaC and BNC corpora and identify tradeoffs. Some of our Distributional Memory models outperform an existing syntax-based model (TypeDM) that uses hand-crafted rules for role inference on a previously tested data set. We combine the results of a selected SENNA-based model with TypeDM’s results and find that there is some amount of complementarity in what a syntactic and a semantic model will cover. In the process, we create a broad-coverage semantically-labelled corpus