1,715 research outputs found

    Deriving Verb Predicates By Clustering Verbs with Arguments

    Full text link
    Hand-built verb clusters such as the widely used Levin classes (Levin, 1993) have proved useful, but have limited coverage. Verb classes automatically induced from corpus data such as those from VerbKB (Wijaya, 2016), on the other hand, can give clusters with much larger coverage, and can be adapted to specific corpora such as Twitter. We present a method for clustering the outputs of VerbKB: verbs with their multiple argument types, e.g. "marry(person, person)", "feel(person, emotion)." We make use of a novel low-dimensional embedding of verbs and their arguments to produce high quality clusters in which the same verb can be in different clusters depending on its argument type. The resulting verb clusters do a better job than hand-built clusters of predicting sarcasm, sentiment, and locus of control in tweets

    Assessing the contribution of shallow and deep knowledge sources for word sense disambiguation

    No full text
    Corpus-based techniques have proved to be very beneficial in the development of efficient and accurate approaches to word sense disambiguation (WSD) despite the fact that they generally represent relatively shallow knowledge. It has always been thought, however, that WSD could also benefit from deeper knowledge sources. We describe a novel approach to WSD using inductive logic programming to learn theories from first-order logic representations that allows corpus-based evidence to be combined with any kind of background knowledge. This approach has been shown to be effective over several disambiguation tasks using a combination of deep and shallow knowledge sources. Is it important to understand the contribution of the various knowledge sources used in such a system. This paper investigates the contribution of nine knowledge sources to the performance of the disambiguation models produced for the SemEval-2007 English lexical sample task. The outcome of this analysis will assist future work on WSD in concentrating on the most useful knowledge sources

    D6.1: Technologies and Tools for Lexical Acquisition

    Get PDF
    This report describes the technologies and tools to be used for Lexical Acquisition in PANACEA. It includes descriptions of existing technologies and tools which can be built on and improved within PANACEA, as well as of new technologies and tools to be developed and integrated in PANACEA platform. The report also specifies the Lexical Resources to be produced. Four main areas of lexical acquisition are included: Subcategorization frames (SCFs), Selectional Preferences (SPs), Lexical-semantic Classes (LCs), for both nouns and verbs, and Multi-Word Expressions (MWEs)

    The Acquisition Of Lexical Knowledge From The Web For Aspects Of Semantic Interpretation

    Get PDF
    This work investigates the effective acquisition of lexical knowledge from the Web to perform semantic interpretation. The Web provides an unprecedented amount of natural language from which to gain knowledge useful for semantic interpretation. The knowledge acquired is described as common sense knowledge, information one uses in his or her daily life to understand language and perception. Novel approaches are presented for both the acquisition of this knowledge and use of the knowledge in semantic interpretation algorithms. The goal is to increase accuracy over other automatic semantic interpretation systems, and in turn enable stronger real world applications such as machine translation, advanced Web search, sentiment analysis, and question answering. The major contributions of this dissertation consist of two methods of acquiring lexical knowledge from the Web, namely a database of common sense knowledge and Web selectors. The first method is a framework for acquiring a database of concept relationships. To acquire this knowledge, relationships between nouns are found on the Web and analyzed over WordNet using information-theory, producing information about concepts rather than ambiguous words. For the second contribution, words called Web selectors are retrieved which take the place of an instance of a target word in its local context. The selectors serve for the system to learn the types of concepts that the sense of a target word should be similar. Web selectors are acquired dynamically as part of a semantic interpretation algorithm, while the relationships in the database are useful to iii stand-alone programs. A final contribution of this dissertation concerns a novel semantic similarity measure and an evaluation of similarity and relatedness measures on tasks of concept similarity. Such tasks are useful when applying acquired knowledge to semantic interpretation. Applications to word sense disambiguation, an aspect of semantic interpretation, are used to evaluate the contributions. Disambiguation systems which utilize semantically annotated training data are considered supervised. The algorithms of this dissertation are considered minimallysupervised; they do not require training data created by humans, though they may use humancreated data sources. In the case of evaluating a database of common sense knowledge, integrating the knowledge into an existing minimally-supervised disambiguation system significantly improved results – a 20.5% error reduction. Similarly, the Web selectors disambiguation system, which acquires knowledge directly as part of the algorithm, achieved results comparable with top minimally-supervised systems, an F-score of 80.2% on a standard noun disambiguation task. This work enables the study of many subsequent related tasks for improving semantic interpretation and its application to real-world technologies. Other aspects of semantic interpretation, such as semantic role labeling could utilize the same methods presented here for word sense disambiguation. As the Web continues to grow, the capabilities of the systems in this dissertation are expected to increase. Although the Web selectors system achieves great results, a study in this dissertation shows likely improvements from acquiring more data. Furthermore, the methods for acquiring a database of common sense knowledge could be applied in a more exhaustive fashion for other types of common sense knowledge. Finally, perhaps the greatest benefits from this work will come from the enabling of real world technologies that utilize semantic interpretation

    Verb similarity: comparing corpus and psycholinguistic data

    Get PDF
    Similarity, which plays a key role in fields like cognitive science, psycholinguistics and natural language processing, is a broad and multifaceted concept. In this work we analyse how two approaches that belong to different perspectives, the corpus view and the psycholinguistic view, articulate similarity between verb senses in Spanish. Specifically, we compare the similarity between verb senses based on their argument structure, which is captured through semantic roles, with their similarity defined by word associations. We address the question of whether verb argument structure, which reflects the expression of the events, and word associations, which are related to the speakers' organization of the mental lexicon, shape similarity between verbs in a congruent manner, a topic which has not been explored previously. While we find significant correlations between verb sense similarities obtained from these two approaches, our findings also highlight some discrepancies between them and the importance of the degree of abstraction of the corpus annotation and psycholinguistic representations.La similitud, que desempeña un papel clave en campos como la ciencia cognitiva, la psicolingüística y el procesamiento del lenguaje natural, es un concepto amplio y multifacético. En este trabajo analizamos cómo dos enfoques que pertenecen a diferentes perspectivas, la visión del corpus y la visión psicolingüística, articulan la semejanza entre los sentidos verbales en español. Específicamente, comparamos la similitud entre los sentidos verbales basados en su estructura argumental, que se capta a través de roles semánticos, con su similitud definida por las asociaciones de palabras. Abordamos la cuestión de si la estructura del argumento verbal, que refleja la expresión de los acontecimientos, y las asociaciones de palabras, que están relacionadas con la organización de los hablantes del léxico mental, forman similitud entre los verbos de una manera congruente, un tema que no ha sido explorado previamente. Mientras que encontramos correlaciones significativas entre las similitudes de los sentidos verbales obtenidas de estos dos enfoques, nuestros hallazgos también resaltan algunas discrepancias entre ellos y la importancia del grado de abstracción de la anotación del corpus y las representaciones psicolingüísticas.La similitud, que exerceix un paper clau en camps com la ciència cognitiva, la psicolingüística i el processament del llenguatge natural, és un concepte ampli i multifacètic. En aquest treball analitzem com dos enfocaments que pertanyen a diferents perspectives, la visió del corpus i la visió psicolingüística, articulen la semblança entre els sentits verbals en espanyol. Específicament, comparem la similitud entre els sentits verbals basats en la seva estructura argumental, que es capta a través de rols semàntics, amb la seva similitud definida per les associacions de paraules. Abordem la qüestió de si l'estructura de l'argument verbal, que reflecteix l'expressió dels esdeveniments, i les associacions de paraules, que estan relacionades amb l'organització dels parlants del lèxic mental, formen similitud entre els verbs d'una manera congruent, un tema que no ha estat explorat prèviament. Mentre que trobem correlacions significatives entre les similituds dels sentits verbals obtingudes d'aquests dos enfocaments, les nostres troballes també ressalten algunes discrepàncies entre ells i la importància del grau d'abstracció de l'anotació del corpus i les representacions psicolingüístiques

    Inférences réflexives dans la publicité

    Get PDF
    Advertisements are so ubiquitous nowadays that capturing the addressee’s attention and maintaining it long enough for them to be fully processed have become fundamental objectives for advertisers. Employing specific strategies in the design of the advertisement contributes efficiently to achieving these goals, getting the audience not only to attend the stimulus but also to process it in certain ways favourable for the advertiser. We argue that Relevance theory, an approach to communication built on a massively modular view of cognition, offers the right tools to explain the nature of the interpretative processes in verbal comprehension. Knowledge of the relevance-based reflexive inferential procedures involved in utterance interpretation allows advertisers to foresee the addressee’s processing behaviour, giving them the possibility to control it in a such a way that the intended interpretative effects are achieved in the desired way

    Can Subcategorisation Probabilities Help a Statistical Parser?

    Full text link
    Research into the automatic acquisition of lexical information from corpora is starting to produce large-scale computational lexicons containing data on the relative frequencies of subcategorisation alternatives for individual verbal predicates. However, the empirical question of whether this type of frequency information can in practice improve the accuracy of a statistical parser has not yet been answered. In this paper we describe an experiment with a wide-coverage statistical grammar and parser for English and subcategorisation frequencies acquired from ten million words of text which shows that this information can significantly improve parse accuracy.Comment: 9 pages, uses colacl.st

    The Meaning of Syntactic Dependencies

    Get PDF
    This paper discusses the semantic content of syntactic dependencies. We assume that syntactic dependencies play a central role in the process of semantic interpretation. They are defined as selective functions on word denotations. Among their properties, special attention will be paid to their ability to make interpretation co-compositional and incremental. To describe the semantic properties of dependencies, the paper will be focused on two particular linguistic tasks: word sense disambiguation and attachment resolution. The second task will be performed using a strategy based on automatic acquisition from corpora
    corecore