Search CORE

8,835 research outputs found

Improving Hypernymy Extraction with Distributional Semantic Classes

Author: Biemann Chris
Faralli Stefano
Panchenko Alexander
Ponzetto Simone P.
Ustalov Dmitry
Publication venue
Publication date: 01/01/2018
Field of study

In this paper, we show how distributionally-induced semantic classes can be helpful for extracting hypernyms. We present methods for inducing sense-aware semantic classes using distributional semantics and using these induced semantic classes for filtering noisy hypernymy relations. Denoising of hypernyms is performed by labeling each semantic class with its hypernyms. On the one hand, this allows us to filter out wrong extractions using the global structure of distributionally similar senses. On the other hand, we infer missing hypernyms via label propagation to cluster terms. We conduct a large-scale crowdsourcing study showing that processing of automatically extracted hypernyms using our approach improves the quality of the hypernymy extraction in terms of both precision and recall. Furthermore, we show the utility of our method in the domain taxonomy induction task, achieving the state-of-the-art results on a SemEval'16 task on taxonomy induction.Comment: In Proceedings of the 11th Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japa

arXiv.org e-Print Archive

MAnnheim DOCument Server

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Simplifying Deep-Learning-Based Model for Code Search

Author: Hassan Ahmed E.
Li Shanping
Liu Chao
Liu Zhiwei
Lo David
Xia Xin
Publication venue
Publication date: 28/05/2020
Field of study

To accelerate software development, developers frequently search and reuse existing code snippets from a large-scale codebase, e.g., GitHub. Over the years, researchers proposed many information retrieval (IR) based models for code search, which match keywords in query with code text. But they fail to connect the semantic gap between query and code. To conquer this challenge, Gu et al. proposed a deep-learning-based model named DeepCS. It jointly embeds method code and natural language description into a shared vector space, where methods related to a natural language query are retrieved according to their vector similarities. However, DeepCS' working process is complicated and time-consuming. To overcome this issue, we proposed a simplified model CodeMatcher that leverages the IR technique but maintains many features in DeepCS. Generally, CodeMatcher combines query keywords with the original order, performs a fuzzy search on name and body strings of methods, and returned the best-matched methods with the longer sequence of used keywords. We verified its effectiveness on a large-scale codebase with about 41k repositories. Experimental results showed the simplified model CodeMatcher outperforms DeepCS by 97% in terms of MRR (a widely used accuracy measure for code search), and it is over 66 times faster than DeepCS. Besides, comparing with the state-of-the-art IR-based model CodeHow, CodeMatcher also improves the MRR by 73%. We also observed that: fusing the advantages of IR-based and deep-learning-based models is promising because they compensate with each other by nature; improving the quality of method naming helps code search, since method name plays an important role in connecting query and code

arXiv.org e-Print Archive

Institutional Knowledge at Singapore Management University

Unsupervised Sense-Aware Hypernymy Extraction

Author: Biemann Chris
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication venue
Publication date: 17/09/2018
Field of study

In this paper, we show how unsupervised sense representations can be used to improve hypernymy extraction. We present a method for extracting disambiguated hypernymy relationships that propagates hypernyms to sets of synonyms (synsets), constructs embeddings for these sets, and establishes sense-aware relationships between matching synsets. Evaluation on two gold standard datasets for English and Russian shows that the method successfully recognizes hypernymy relationships that cannot be found with standard Hearst patterns and Wiktionary datasets for the respective languages.Comment: In Proceedings of the 14th Conference on Natural Language Processing (KONVENS 2018). Vienna, Austri

arXiv.org e-Print Archive

Spoken content retrieval: A survey of techniques and technologies

Author: Ani Nenkova
C A. Nenkova
K. Mckeown
Kathleen Mckeown
Publication venue: 'Now Publishers'
Publication date: 01/01/2012
Field of study

Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted focus to more informal spoken content produced spontaneously, outside of the studio and in conversational settings. This survey provides an overview of the field of SCR encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who are seeking deeper insight on how these fields are integrated to support research and development, thus addressing the core challenges of SCR

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Recommended from our members

Ideation as an intellectual information acquisition and use context: Investigating game designers’ information-based ideation behavior

Author: Hsueh T-L.
Jones S.
Makri S.
Publication venue: 'Wiley'
Publication date: 01/08/2019
Field of study

Human Information Behavior (HIB) research commonly examines behavior in the context of why information is acquired and how it will be used, but usually at the level of the work or everyday-life tasks the information will support. HIB has not been examined in detail at the broader contextual level of intellectual purpose (i.e. the higher-order conceptual tasks the information was acquired to support). Examination at this level can enhance holistic understanding of HIB as a ‘means to an intellectual end’ and inform the design of digital information environments that support information interaction for specific intellectual purposes. We investigate information-based ideation (IBI) as a specific intellectual information acquisition and use context by conducting Critical Incident-style interviews with ten game designers, focusing on how they interact with information to generate and develop creative design ideas. Our findings give rise to a framework of their ideation-focused HIB, which systems designers can leverage to reason about how best to support certain behaviors to drive design ideation. These findings emphasize the importance of intellectual purpose as a driver for acquisition and desired outcome of use

City Research Online

Exploiting a Multilingual Web-based Encyclopedia for Bilingual Terminology Extraction

Author: Sadat Fatiha
Publication venue: Institute of Digital Enhancement of Cognitive Processing, Waseda University
Publication date: 01/01/2011
Field of study

Waseda University Repository