Transfer and Multi-Task Learning for Noun-Noun Compound Interpretation
In this paper, we empirically evaluate the utility of transfer and multi-task
learning on a challenging semantic classification task: semantic interpretation
of noun--noun compounds. Through a comprehensive series of experiments and
in-depth error analysis, we show that transfer learning via parameter
initialization and multi-task learning via parameter sharing can help a neural
classification model generalize over a highly skewed distribution of relations.
Further, we demonstrate how dual annotation with two distinct sets of relations
over the same set of compounds can be exploited to improve the overall accuracy
of a neural classifier and its F1 scores on the less frequent, but more
difficult relations.
Comment: EMNLP 2018: Conference on Empirical Methods in Natural Language
Processing (EMNLP)
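The two mechanisms named in this abstract, parameter sharing and parameter initialization, can be illustrated with a minimal sketch. It is not the authors' model: the class name, the layer sizes, the label counts and the helper init_from_pretrained are all hypothetical, and only the forward pass is shown (no training loop).

```python
import random


def make_matrix(rows, cols, rng):
    """Create a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


class SharedEncoderClassifier:
    """Hypothetical two-task classifier: one shared layer, one head per relation set."""

    def __init__(self, input_dim, hidden_dim, n_labels_a, n_labels_b, seed=0):
        rng = random.Random(seed)
        self.shared = make_matrix(hidden_dim, input_dim, rng)   # used by both tasks
        self.head_a = make_matrix(n_labels_a, hidden_dim, rng)  # task-specific
        self.head_b = make_matrix(n_labels_b, hidden_dim, rng)  # task-specific

    def scores(self, features, task):
        """Multi-task learning via parameter sharing: both tasks reuse self.shared."""
        hidden = relu(matvec(self.shared, features))
        head = self.head_a if task == "a" else self.head_b
        return matvec(head, hidden)


def init_from_pretrained(pretrained, target):
    """Transfer learning via parameter initialization: copy the shared weights."""
    target.shared = [row[:] for row in pretrained.shared]


model = SharedEncoderClassifier(input_dim=4, hidden_dim=3, n_labels_a=5, n_labels_b=2)
print(len(model.scores([1.0, 0.0, 0.0, 1.0], "a")))  # one score per label of task a
```

In a real system the shared layer would receive gradient updates from both tasks, which is where the benefit for infrequent relations would be expected to come from.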
A Computational Model of Conceptual Combination
We describe the Interactional-Constraint (ICON) model of
conceptual combination. This model is based on the idea that
combinations are interpreted by incrementally constraining
the range of interpretation according to the interacting
influence of both constituent nouns. ICON consists of a series
of discrete stages, combining data from the British National
Corpus, the WordNet lexicon and the Web to predict the
dominant interpretation of a combination and a range of
factors relating to ease of interpretation. One of the major
advantages of the model is that it does not require a tailored
knowledge base, thus broadening its scope and utility. We
evaluate ICON’s reliability and find that it is accurate in
predicting word senses and relations for a wide variety of
combinations. However, its ability to predict ease of
interpretation is poor. The implications for models of
conceptual combination are discussed
Opinion Holder and Target Extraction on Opinion Compounds – A Linguistic Approach
We present an approach to the new task of opinion holder and target extraction on opinion compounds. Opinion compounds (e.g. user rating or victim support) are noun compounds whose head is an opinion noun. We do not only examine features known to be effective for noun compound analysis, such as paraphrases and semantic classes of heads and modifiers, but also propose novel features tailored to this new task. Among them, we examine paraphrases that jointly consider holders and targets, a verb detour in which noun heads are replaced by related verbs, a global head constraint allowing inferencing between different compounds, and the categorization of the sentiment view that the head conveys
A Dataset of 108 Novel Noun-Noun Compound Words with Active and Passive Interpretation
We created a dataset of 205 English novel noun-noun compounds (NNCs, e.g., “doctor charity”) by combining nouns with higher and lower agentivity (i.e., the probability of being an agent in a sentence). We collected active and passive interpretations of NNCs from a group of 58 English native speakers. We then measured interpretation time differences between NNCs with active and passive interpretations (i.e., 108 NNCs), using data obtained from a group of 68 English native speakers. Data were collected online using crowdsourcing platforms (SONA and Prolific). The datasets are available at osf.io/gvc2w/ and can be used to address questions about semantic and syntactic composition
Resolving pronominal anaphora using commonsense knowledge
Coreference resolution is the task of resolving all expressions in a text that refer to the same entity. Such expressions are often used in writing and speech as shortcuts to avoid repetition. The most frequent form of coreference is the anaphor. Resolving anaphora requires not only grammatical and syntactic strategies but also semantic approaches. This dissertation presents a framework for automatically resolving pronominal anaphora by integrating recent findings from the field of linguistics with new semantic features. Commonsense knowledge is the routine knowledge people have of the everyday world. Because such knowledge is widely shared, it is frequently omitted from social communications such as texts, and without it computers have difficulty making sense of textual information. In this dissertation a new set of computational and linguistic features is used in a supervised learning approach to resolve pronominal anaphora in documents. Commonsense knowledge sources such as ConceptNet and WordNet are used, and similarity measures are extracted to uncover the elaborative information embedded in the words that can help in the process of anaphora resolution. The anaphora resolution system is tested on 350 Wall Street Journal articles from the BBN corpus. When compared with other available systems such as BART (Versley et al. 2008) and Charniak and Elsner (2009), our system performed better and also resolved a much wider range of anaphora. We were able to achieve a 92% F-measure on the BBN corpus and an average of 85% F-measure when tested on other genres of documents such as children's stories and short stories selected from the web
AXEL: A framework to deal with ambiguity in three-noun compounds
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 6/12/2010. Cognitive Linguistics has been widely used to deal with the ambiguity generated by words in combination. Although this domain offers many solutions to address this challenge, not all of them can be implemented in a computational environment. The Dynamic Construal of Meaning framework is argued to have this ability because it describes an intrinsic degree of association of meanings, which, in turn, can be translated into computational programs. A limitation towards a computational approach, however, has been the lack of syntactic parameters. This research argues that this limitation could be overcome with the aid of the Generative Lexicon Theory (GLT). Specifically, this dissertation formulated possible means to marry the GLT and Cognitive Linguistics in a novel rapprochement between the two.
This bond between opposing theories provided the means to design a computational template (the AXEL System) by realising syntax and semantics at software levels. An instance of the AXEL system was created using a Design Research approach. Planned iterations were involved in the development to improve artefact performance. These iterations improved how well the system accounted for the degree of association of meanings in three-noun compounds.
This dissertation delivered three major contributions on the brink of a so-called turning point in Computational Linguistics (CL). First, the AXEL system was used to disclose hidden lexical patterns on ambiguity. These patterns are difficult, if not impossible, to identify without automatic techniques. This research claimed that these patterns can help linguists review lexical knowledge from a software-based viewpoint.
Following linguistic awareness, the second result advocated for the adoption of improved resources by decreasing electronic space of Sense Enumerative Lexicons (SELs). The AXEL system deployed the generation of “at the moment of use” interpretations, optimising the way the space is needed for lexical storage.
Finally, this research introduced a subsystem of metrics to characterise an ambiguous degree of association of three-noun compounds enabling ranking methods. Weighing methods delivered mechanisms of classification of meanings towards Word Sense Disambiguation (WSD). Overall these results attempted to tackle difficulties in understanding studies of Lexical Semantics via software tools
Automatic interpretation of noun compounds using WordNet similarity
The paper introduces a method for interpreting novel noun compounds with semantic relations. The method is built around word similarity with pretagged noun compounds, based on WordNet::Similarity. Over 1,088 training instances and 1,081 test instances from the Wall Street Journal in the Penn Treebank, the proposed method was able to correctly classify 53.3% of the test noun compounds. We also investigated the relative contribution of the modifier and the head noun in noun compounds of different semantic types.
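The idea of classifying a new compound by its similarity to pretagged compounds can be sketched as a nearest-neighbour lookup. This is an illustration under stated assumptions, not the paper's scoring formula: the equal weighting of modifier and head, the toy similarity table (standing in for WordNet::Similarity), the training pairs and the relation labels are all hypothetical.

```python
def classify_compound(modifier, head, training, word_sim):
    """Return the relation of the most similar pretagged compound.

    training: list of ((modifier, head), relation) pairs.
    word_sim: function giving a similarity in [0, 1] for two words.
    Assumption: modifier and head similarity are weighted equally.
    """
    best_label, best_score = None, float("-inf")
    for (train_mod, train_head), label in training:
        score = 0.5 * word_sim(modifier, train_mod) + 0.5 * word_sim(head, train_head)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical similarity values standing in for a WordNet-based measure.
TOY_SIMILARITY = {
    frozenset(("apple", "fruit")): 0.9,
    frozenset(("juice", "coffee")): 0.6,
    frozenset(("evening", "morning")): 0.8,
    frozenset(("tea", "coffee")): 0.7,
    frozenset(("tea", "juice")): 0.5,
}


def toy_sim(a, b):
    if a == b:
        return 1.0
    return TOY_SIMILARITY.get(frozenset((a, b)), 0.0)


# Hypothetical pretagged compounds and relation labels.
TRAINING = [
    (("fruit", "juice"), "MATERIAL"),
    (("morning", "coffee"), "TIME"),
]

print(classify_compound("apple", "juice", TRAINING, toy_sim))   # MATERIAL
print(classify_compound("evening", "tea", TRAINING, toy_sim))   # TIME
```

Changing the two 0.5 weights is one simple way to probe the relative contribution of the modifier and the head that the abstract mentions.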
The effect of compound structure on interpretation: An empirical study of colour compounds in Modern Greek
This thesis attempts an original empirical investigation of the semantic impact of the compounding mechanism in Modern Greek on the interpretation of compounds formed from basic colour terms. The study adopted the framework put forward by Berlin and Kay (1969), which claims that every language contains basic colour categories and that every colour term in a language has one focal value. The purpose of the study was: a) to test empirically whether speakers' perception of a colour compound lies between the two constituents or leans towards the first or the second constituent, based either on the colour spectrum or on paraphrase, and to study the semantic processes at work in compounding; b) to explore whether changing the order of the constituents leads speakers to different judgements.
Although the literature treats compounds of basic colour terms as coordinate, and although native speakers might be expected to regard the second constituent as the more basic one, in line with the Right-hand Head Rule that applies to Greek, the results of our experiments contradict both assumptions. Specifically, speakers' perception of a colour compound leans towards the first constituent: for yellow-green (κιτρινοπράσινο) judgements lean towards yellow rather than green, whereas for green-yellow (πρασινοκίτρινο) the emphasis falls on green rather than yellow. This may be linked to the incremental left-to-right processing of speech, as observed at sentence level. Moreover, alternating the order of the constituents does not lead to different judgements.
The contribution of this thesis lies in its attempt to experimentally study the effect of the compounding mechanism toward the interpretation of compounds. A new categorization of coordinate compounds of Modern Greek based on lexical semantics is put forward in the form of a comprehensive theoretical framework, while it is the first time that an attempt is made to systematically represent Modern Greek compounds by means of Venn diagrams
Semantic information systems engineering: A query-based approach for semi-automatic annotation of web services
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. There has been an increasing interest in Semantic Web services (SWS) as a proposed solution to facilitate automatic discovery, composition and deployment of existing syntactic Web services. Successful implementation and wider adoption of SWS by research and industry are, however, profoundly based on the existence of effective and easy to use methods for service semantic description. Unfortunately, Web service semantic annotation is currently performed by manual means. Manual annotation is a difficult, error-prone and time-consuming task and few approaches exist aiming to semi-automate that task. Existing approaches are difficult to use since they require ontology building. Moreover, these approaches employ ineffective matching methods and suffer from the Low Percentage Problem. The latter problem happens when a small number of service elements - in comparison to the total number of elements - are annotated in a given service.
This research addresses the Web services annotation problem by developing a semi-automatic annotation approach that allows SWS developers to effectively and easily annotate their syntactic services. The proposed approach does not require application ontologies to model service semantics. Instead, a standard query template is used: This template is filled with data and semantics extracted from WSDL files in order to produce query instances. The input of the annotation approach is the WSDL file of a candidate service and a set of ontologies. The output is an annotated WSDL file. The proposed approach is composed of five phases: (1) Concept extraction; (2) concept filtering and query filling; (3) query execution; (4) results assessment; and (5) SAWSDL annotation. The query execution engine makes use of name-based and structural matching techniques. The name-based matching is carried out by CN-Match which is a novel matching method and tool that is developed and evaluated in this research.
The proposed annotation approach is evaluated using a set of existing Web services and ontologies. Precision (P), Recall (R), F-Measure (F) and Percentage of annotated elements are used as evaluation metrics. The evaluation reveals that the proposed approach is effective since - in relation to manual results - accurate and almost complete annotation results are obtained. In addition, a high percentage of annotated elements is achieved using the proposed approach because it makes use of effective ontology extension mechanisms.
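The four evaluation metrics named above can be computed as in the following sketch. It rests on assumptions not stated in the abstract: annotations are modelled as a mapping from service element to ontology concept, the manual annotation serves as the reference, and F is taken to be the balanced F1. The function name and the example element and concept names are hypothetical.

```python
def annotation_metrics(proposed, reference, total_elements):
    """Compare automatic annotations with a manual reference.

    proposed, reference: dicts mapping service element -> ontology concept.
    total_elements: number of annotatable elements in the service.
    Returns (precision, recall, f_measure, percent_annotated).
    """
    proposed_pairs = set(proposed.items())
    reference_pairs = set(reference.items())
    correct = len(proposed_pairs & reference_pairs)

    precision = correct / len(proposed_pairs) if proposed_pairs else 0.0
    recall = correct / len(reference_pairs) if reference_pairs else 0.0
    if precision + recall > 0:
        f_measure = 2 * precision * recall / (precision + recall)  # balanced F1 (assumed)
    else:
        f_measure = 0.0
    percent_annotated = 100.0 * len(proposed) / total_elements if total_elements else 0.0
    return precision, recall, f_measure, percent_annotated


# Hypothetical example: 4 proposed annotations, 3 of them matching the manual reference.
proposed = {"getPrice": "Price", "city": "City", "zip": "PostalCode", "name": "Label"}
reference = {"getPrice": "Price", "city": "City", "zip": "PostalCode",
             "name": "Name", "country": "Country"}

p, r, f, pct = annotation_metrics(proposed, reference, total_elements=8)
print(f"P={p:.2f} R={r:.2f} F={f:.2f} annotated={pct:.0f}%")
```

A low value of the last figure corresponds to what the abstract calls the Low Percentage Problem: few elements annotated relative to the total.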
Semantic information systems engineering : a query-based approach for semi-automatic annotation of web services
There has been an increasing interest in Semantic Web services (SWS) as a proposed solution to facilitate automatic discovery, composition and deployment of existing syntactic Web services. Successful implementation and wider adoption of SWS by research and industry are, however, profoundly based on the existence of effective and easy to use methods for service semantic description. Unfortunately, Web service semantic annotation is currently performed by manual means. Manual annotation is a difficult, error-prone and time-consuming task and few approaches exist aiming to semi-automate that task. Existing approaches are difficult to use since they require ontology building. Moreover, these approaches employ ineffective matching methods and suffer from the Low Percentage Problem. The latter problem happens when a small number of service elements - in comparison to the total number of elements – are annotated in a given service. This research addresses the Web services annotation problem by developing a semi-automatic annotation approach that allows SWS developers to effectively and easily annotate their syntactic services. The proposed approach does not require application ontologies to model service semantics. Instead, a standard query template is used: This template is filled with data and semantics extracted from WSDL files in order to produce query instances. The input of the annotation approach is the WSDL file of a candidate service and a set of ontologies. The output is an annotated WSDL file. The proposed approach is composed of five phases: (1) Concept extraction; (2) concept filtering and query filling; (3) query execution; (4) results assessment; and (5) SAWSDL annotation. The query execution engine makes use of name-based and structural matching techniques. The name-based matching is carried out by CN-Match which is a novel matching method and tool that is developed and evaluated in this research. 
The proposed annotation approach is evaluated using a set of existing Web services and ontologies. Precision (P), Recall (R), F-Measure (F) and Percentage of annotated elements are used as evaluation metrics. The evaluation reveals that the proposed approach is effective since - in relation to manual results - accurate and almost complete annotation results are obtained. In addition, a high percentage of annotated elements is achieved using the proposed approach because it makes use of effective ontology extension mechanisms.
EThOS - Electronic Theses Online Service, GB, United Kingdom