
715 research outputs found

SimLex-999: Evaluating Semantic Models with (Genuine) Similarity Estimation

Author: Hill Felix
Korhonen Anna
Reichart Roi
Publication date: 14/08/2014

We present SimLex-999, a gold standard resource for evaluating distributional semantic models that improves on existing resources in several important ways. First, in contrast to gold standards such as WordSim-353 and MEN, it explicitly quantifies similarity rather than association or relatedness, so that pairs of entities that are associated but not actually similar [Freud, psychology] have a low rating. We show that, via this focus on similarity, SimLex-999 incentivizes the development of models with a different, and arguably wider, range of applications than those which reflect conceptual association. Second, SimLex-999 contains a range of concrete and abstract adjective, noun and verb pairs, together with an independent rating of concreteness and (free) association strength for each pair. This diversity enables fine-grained analyses of the performance of models on concepts of different types, and consequently greater insight into how architectures can be improved. Further, unlike existing gold standard evaluations, for which automatic approaches have reached or surpassed the inter-annotator agreement ceiling, state-of-the-art models perform well below this ceiling on SimLex-999. There is therefore plenty of scope for SimLex-999 to quantify future improvements to distributional semantic models, guiding the development of the next generation of representation-learning architectures.
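As a rough illustration of how a similarity gold standard of this kind is typically used, the sketch below rank-correlates a model's cosine similarities with human ratings using Spearman's rho. The abstract does not spell out the evaluation procedure, so the choice of cosine similarity, the function names, and the toy vectors and ratings are all assumptions made for illustration, not data from SimLex-999.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def evaluate(vectors, gold_pairs):
    """Correlate model similarities with human ratings, skipping unknown words."""
    model_scores, human_scores = [], []
    for word1, word2, rating in gold_pairs:
        if word1 in vectors and word2 in vectors:
            model_scores.append(cosine(vectors[word1], vectors[word2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)

# Hypothetical toy embeddings and ratings, invented for this sketch.
toy_vectors = {
    "coast": [0.9, 0.1], "shore": [0.8, 0.2],
    "freud": [0.1, 0.9], "psychology": [0.6, 0.6],
    "cup": [0.5, 0.1], "sky": [0.0, 1.0],
}
toy_gold = [("coast", "shore", 9.0), ("freud", "psychology", 2.5), ("cup", "sky", 0.5)]
print(evaluate(toy_vectors, toy_gold))
```

A model that scores associated but dissimilar pairs highly would be penalised under this kind of evaluation, which is the distinction the abstract emphasises.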

arXiv.org e-Print Archive

CiteSeerX

Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

Author: Korhonen Anna
Mrkšić Nikola
Vulić Ivan
Publication date: 01/01/2017

Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines. In this work, we propose a novel cross-lingual transfer method for inducing VerbNets for multiple languages. To the best of our knowledge, this is the first study which demonstrates how the architectures for learning word embeddings can be applied to this challenging syntactic-semantic task. Our method uses cross-lingual translation pairs to tie each of the six target languages into a bilingual vector space with English, jointly specialising the representations to encode the relational information from English VerbNet. A standard clustering algorithm is then run on top of the VerbNet-specialised representations, using vector dimensions as features for learning verb classes. Our results show that the proposed cross-lingual transfer approach sets new state-of-the-art verb classification performance across all six target languages explored in this work. Comment: EMNLP 2017 (long paper)

arXiv.org e-Print Archive

Crossref

Marketing performance measurement in B2B service companies: a multiple case study

Author: Korhonen Anna
Publication date: 13/05/2019

Prior research has recognised the positive relationship between an organisation’s ability to control marketing activities through marketing performance measurement (MPM), and the attainment of that organisation’s marketing and business performance goals. Although the positive relationship between marketing activities and financial outcomes is currently widely accepted, marketing practitioners have found it difficult to measure and communicate the value of marketing to top management and others in the organisation. Still, empirical research has provided surprisingly little evidence of the key contingencies that can affect the successful application of MPM. Moreover, the majority of the marketing performance measurement research is derived only from business-to-consumer (B2C) markets. The purpose of this research is thus to understand the factors that affect the successful application of marketing performance measurement in B2B service companies. This issue is examined empirically through a qualitative case study approach. The data was collected through semi-structured theme interviews with 12 marketing and sales decision-makers having at least moderate MPM experience. The interviewed individuals work in 10 case companies operating in different industries in the B2B service field. Further, 6 exploratory expert interviews were carried out to improve the preliminary understanding of the phenomenon under study and to facilitate the choice of the case companies and the key informants within these selected companies. This study finds that there is no typical MPM process. Instead, the MPM process should always be adapted to fit the company-specific context. In the context of B2B service companies, 5 industry-level factors and 9 corporate-level factors were found to influence the successful application of MPM.

UTUPub

Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization

Author: Glavaš Goran
Korhonen Anna
Mrkšić Nikola
Ponti Edoardo Maria
Vulić Ivan
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2018

Semantic specialization is the process of fine-tuning pre-trained distributional word vectors using external lexical knowledge (e.g., WordNet) to accentuate a particular semantic relation in the specialized vector space. While post-processing specialization methods are applicable to arbitrary distributional vectors, they are limited to updating only the vectors of words occurring in external lexicons (i.e., seen words), leaving the vectors of all other words unchanged. We propose a novel approach to specializing the full distributional vocabulary. Our adversarial post-specialization method propagates the external lexical knowledge to the full distributional space. We exploit words seen in the resources as training examples for learning a global specialization function. This function is learned by combining a standard L2-distance loss with an adversarial loss: the adversarial component produces more realistic output vectors. We show the effectiveness and robustness of the proposed method across three languages and on three tasks: word similarity, dialog state tracking, and lexical simplification. We report consistent improvements over distributional word vectors and vectors specialized by other state-of-the-art specialization frameworks. Finally, we also propose a cross-lingual transfer method for zero-shot specialization which successfully specializes a full target distributional space without any lexical knowledge in the target language and without any bilingual data. Comment: Accepted at EMNLP 201
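The idea of learning a global specialization function from seen words and then applying it to unseen words can be sketched in a much simplified form. The example below fits a plain linear map with only a squared (L2) loss by gradient descent; it omits the adversarial component entirely and is not the authors' architecture. The function names, hyperparameters, and toy vectors are hypothetical.

```python
def apply_map(weights, vector):
    """Multiply a square weight matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

def fit_linear_map(pairs, dim, learning_rate=0.1, epochs=500):
    """Fit W so that W @ source is close to target for every (source, target) pair.

    Uses per-example gradient steps on the squared L2 distance.
    Assumption: a linear map stands in for the learned specialization function.
    """
    # Start from the identity, i.e. "leave every vector unchanged".
    weights = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(epochs):
        for source, target in pairs:
            predicted = apply_map(weights, source)
            for i in range(dim):
                error = predicted[i] - target[i]
                for j in range(dim):
                    # Gradient of (predicted_i - target_i)^2 w.r.t. weights[i][j]
                    weights[i][j] -= learning_rate * 2.0 * error * source[j]
    return weights

# Hypothetical "seen" words: (distributional vector, specialized vector).
seen_pairs = [
    ([1.0, 0.0], [2.0, 0.0]),
    ([0.0, 1.0], [0.0, 2.0]),
    ([1.0, 1.0], [2.0, 2.0]),
]
specialization = fit_linear_map(seen_pairs, dim=2)

# An "unseen" word is specialized by applying the learned map to its vector.
unseen_vector = [2.0, 3.0]
print(apply_map(specialization, unseen_vector))
```

The point of the sketch is only the train-on-seen, apply-to-all pattern; the abstract attributes the realism of the output vectors to the adversarial loss, which this toy version does not model.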

arXiv.org e-Print Archive

Crossref

MAnnheim DOCument Server (Univ. Mannheim)

Edinburgh Research Explorer

Gendered pathways from academic performance, motivational beliefs, and school burnout to adolescents’ educational and occupational aspirations

Author: Korhonen Johan
Tapola Anna
Tuominen Heta
Widlund Anna
Publication date: 01/04/2020

This study examined Finnish 9th-graders’ (N = 966) pathways to educational and occupational aspirations considering two academic domains: mathematics and reading. Multi-group structural equation models were conducted to investigate how domain-specific performance and motivational beliefs (self-concept and interest), and more general school burnout (exhaustion, cynicism, and inadequacy) relate to boys' and girls' aspirations. Performance in both domains was related to girls' educational aspirations, but only mathematics was linked to boys' aspirations. Positive within-domain relations from girls' motivational beliefs were also found, but their reading self-concept was negatively linked to their math-related occupational aspirations. For boys, only math-related motivational beliefs were associated with their aspirations. Lastly, school burnout was both directly and indirectly linked to students' aspirations. Overall, the study demonstrated the importance of including several factors when investigating students’ aspired educational degrees and occupational plans and, also, the added value of examining educational and occupational aspirations across academic domains. Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Link prediction in drug-target interactions network using similarity indices.

Author: Guo Yufan
Korhonen Anna
Lu Yiding
Publication venue: BMC Bioinformatics
Publication date: 17/01/2017

BACKGROUND: In silico drug-target interaction (DTI) prediction plays an integral role in drug repositioning: the discovery of new uses for existing drugs. One popular method of drug repositioning is network-based DTI prediction, which uses complex network theory to predict DTIs from a drug-target network. Currently, most network-based DTI prediction is based on machine learning, using methods such as Restricted Boltzmann Machines (RBM) or Support Vector Machines (SVM). These methods require additional information about the characteristics of drugs, targets and DTIs, such as chemical structure, genome sequence, binding types, causes of interactions, etc., and do not perform satisfactorily when such information is unavailable. To address this problem, we propose a new, alternative method for DTI prediction that makes use of only network topology information. RESULTS: We compare our method for DTI prediction against the well-known RBM approach. We show that when applied to the MATADOR database, our approach based on node neighborhoods yields higher precision for high-ranking predictions than RBM when no information regarding DTI types is available. CONCLUSION: This demonstrates that approaches purely based on network topology provide a more suitable approach to DTI prediction in the many real-life situations where little or no prior knowledge is available about the characteristics of drugs, targets, or their interactions.
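To make the notion of a topology-only, neighbourhood-based score more concrete, the sketch below ranks unobserved drug-target pairs in a small bipartite network. The abstract does not name the specific similarity indices used, so the Jaccard-based score here is an assumed stand-in: a candidate target scores highly for a drug if it shares many drugs with the targets that drug is already known to hit. All drug and target identifiers are made up.

```python
from collections import defaultdict

def build_neighbourhoods(interactions):
    """Index a bipartite edge list as drug -> targets and target -> drugs."""
    drug_targets = defaultdict(set)
    target_drugs = defaultdict(set)
    for drug, target in interactions:
        drug_targets[drug].add(target)
        target_drugs[target].add(drug)
    return dict(drug_targets), dict(target_drugs)

def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def score(drug, target, drug_targets, target_drugs):
    """Sum the neighbourhood overlap between `target` and each known target of `drug`.

    Assumption: Jaccard overlap of target neighbourhoods is used as the index.
    """
    candidate_drugs = target_drugs.get(target, set())
    return sum(
        jaccard(target_drugs.get(known, set()), candidate_drugs)
        for known in drug_targets.get(drug, set())
    )

def rank_candidates(interactions):
    """Score every unobserved drug-target pair, highest score first."""
    drug_targets, target_drugs = build_neighbourhoods(interactions)
    scored = []
    for drug in drug_targets:
        for target in target_drugs:
            if target not in drug_targets[drug]:
                scored.append((drug, target, score(drug, target, drug_targets, target_drugs)))
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(scored, key=lambda item: (-item[2], item[0], item[1]))

# Hypothetical toy network, invented for this sketch.
toy_interactions = [
    ("d1", "t1"), ("d1", "t2"),
    ("d2", "t1"), ("d2", "t2"),
    ("d3", "t1"),
    ("d4", "t3"),
]
for drug, target, value in rank_candidates(toy_interactions):
    print(drug, target, round(value, 3))
```

No chemical structure, sequence, or interaction-type information enters the score, which is the setting the abstract describes as the one where topology-only approaches are most useful.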

Crossref

Springer - Publisher Connector

PubMed Central

Apollo (Cambridge)

Springer OAI

Unsupervised Declarative Knowledge Induction for Constraint-Based Learning of Information Structure in Scientific Documents

Author: Guo Yufan
Korhonen Anna
Reichart Roi
Publication venue: Transactions of the Association for Computational Linguistics
Publication date: 01/03/2015

Inferring the information structure of scientific documents is useful for many NLP applications. Existing approaches to this task require substantial human effort. We propose a framework for constraint learning that reduces human involvement considerably. Our model uses topic models to identify latent topics and their key linguistic features in input documents, induces constraints from this information and maps sentences to their dominant information structure categories through a constrained unsupervised model. When the induced constraints are combined with a fully unsupervised model, the resulting model challenges existing lightly supervised feature-based models as well as unsupervised models that use manually constructed declarative knowledge. Our results demonstrate that useful declarative knowledge can be learned from data with very limited human involvement. This is the final published version. It first appeared at https://tacl2013.cs.columbia.edu/ojs/index.php/tacl/article/view/472

CiteSeerX

Apollo (Cambridge)

LexSchem: A Large Subcategorization Lexicon for French Verbs

Author: Korhonen Anna
Messiant Cédric
Poibeau Thierry
Publication venue: HAL CCSD
Publication date: 01/01/2008

International audience. This article addresses the automatic acquisition of subcategorization frames from corpora for French.

HAL-Paris 13