Human-Level Performance on Word Analogy Questions by Latent Relational Analysis

By Peter D. Turney

Abstract

This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason/stone is analogous to the pair carpenter/wood; the relations between mason and stone are highly similar to the relations between carpenter and wood. Past work on semantic similarity measures has mainly been concerned with attributional similarity. For instance, Latent Semantic Analysis (LSA) can measure the degree of similarity between two words, but not between two relations. Recently, the Vector Space Model (VSM) of information retrieval has been adapted to the task of measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus (they are not predefined), (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data (it is also used this way in LSA), and (3) automatically generated synonyms are used to explore reformulations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying noun-modifier relations, LRA achieves similar gains over the VSM, while using a smaller corpus.
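The core pipeline the abstract describes can be illustrated with a small sketch: represent each word pair as a vector of pattern frequencies, smooth the matrix with a truncated SVD (as in LSA), and compare pairs by cosine similarity. All counts, pairs, and patterns below are made-up toy values for illustration, not the actual corpus statistics or patterns used by LRA.

```python
import numpy as np

# Hypothetical toy data: rows are word pairs, columns are frequencies of
# joining patterns (e.g. "X works with Y", "X carves Y", ...) in a corpus.
# Every number here is invented purely for illustration.
pairs = ["mason/stone", "carpenter/wood", "traffic/street"]
pattern_counts = np.array([
    [12.0, 30.0,  2.0],   # mason/stone
    [10.0, 28.0,  3.0],   # carpenter/wood
    [ 1.0,  2.0, 25.0],   # traffic/street
])

# SVD smoothing: keep the top-k singular values and project each
# pair vector into the reduced latent space.
U, s, Vt = np.linalg.svd(pattern_counts, full_matrices=False)
k = 2
smoothed = U[:, :k] * s[:k]  # pair vectors in the latent space

def cosine(a, b):
    """Cosine of the angle between two pair vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Analogous pairs should score higher than unrelated pairs.
sim_analogous = cosine(smoothed[0], smoothed[1])  # mason/stone : carpenter/wood
sim_unrelated = cosine(smoothed[0], smoothed[2])  # mason/stone : traffic/street
print(sim_analogous > sim_unrelated)
```

LRA goes beyond this sketch in the three ways listed in the abstract: the patterns themselves are induced from the corpus rather than fixed in advance, and synonym-based reformulations of each word pair contribute additional rows before the SVD is taken.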

Topics: Language, Computational Linguistics, Semantics, Machine Learning
Year: 2004
OAI identifier: oai:cogprints.org:3981

