26,774 research outputs found

    Human-Level Performance on Word Analogy Questions by Latent Relational Analysis

    Get PDF
    This paper introduces Latent Relational Analysis (LRA), a method for measuring relational similarity. LRA has potential applications in many areas, including information extraction, word sense disambiguation, machine translation, and information retrieval. Relational similarity is correspondence between relations, in contrast with attributional similarity, which is correspondence between attributes. When two words have a high degree of attributional similarity, we call them synonyms. When two pairs of words have a high degree of relational similarity, we say that their relations are analogous. For example, the word pair mason/stone is analogous to the pair carpenter/wood; the relations between mason and stone are highly similar to the relations between carpenter and wood. Past work on semantic similarity measures has mainly been concerned with attributional similarity. For instance, Latent Semantic Analysis (LSA) can measure the degree of similarity between two words, but not between two relations. Recently the Vector Space Model (VSM) of information retrieval has been adapted to the task of measuring relational similarity, achieving a score of 47% on a collection of 374 college-level multiple-choice word analogy questions. In the VSM approach, the relation between a pair of words is characterized by a vector of frequencies of predefined patterns in a large corpus. LRA extends the VSM approach in three ways: (1) the patterns are derived automatically from the corpus (they are not predefined), (2) the Singular Value Decomposition (SVD) is used to smooth the frequency data (it is also used this way in LSA), and (3) automatically generated synonyms are used to explore reformulations of the word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the average human score of 57%. On the related problem of classifying noun-modifier relations, LRA achieves similar gains over the VSM, while using a smaller corpus

    Comprehensive Review of Opinion Summarization

    Get PDF
    The abundance of opinions on the web has kindled the study of opinion summarization over the last few years. People have introduced various techniques and paradigms to solving this special task. This survey attempts to systematically investigate the different techniques and approaches used in opinion summarization. We provide a multi-perspective classification of the approaches used and highlight some of the key weaknesses of these approaches. This survey also covers evaluation techniques and data sets used in studying the opinion summarization problem. Finally, we provide insights into some of the challenges that are left to be addressed as this will help set the trend for future research in this area.unpublishednot peer reviewe

    Digital Image Access & Retrieval

    Get PDF
    The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

    Complete Semantics to empower Touristic Service Providers

    Full text link
    The tourism industry has a significant impact on the world's economy, contributes 10.2% of the world's gross domestic product in 2016. It becomes a very competitive industry, where having a strong online presence is an essential aspect for business success. To achieve this goal, the proper usage of latest Web technologies, particularly schema.org annotations is crucial. In this paper, we present our effort to improve the online visibility of touristic service providers in the region of Tyrol, Austria, by creating and deploying a substantial amount of semantic annotations according to schema.org, a widely used vocabulary for structured data on the Web. We started our work from Tourismusverband (TVB) Mayrhofen-Hippach and all touristic service providers in the Mayrhofen-Hippach region and applied the same approach to other TVBs and regions, as well as other use cases. The rationale for doing this is straightforward. Having schema.org annotations enables search engines to understand the content better, and provide better results for end users, as well as enables various intelligent applications to utilize them. As a direct consequence, the region of Tyrol and its touristic service increase their online visibility and decrease the dependency on intermediaries, i.e. Online Travel Agency (OTA).Comment: 18 pages, 6 figure
    • ā€¦
    corecore