Distributional Measures of Semantic Distance: A Survey
The ability to mimic human notions of semantic distance has widespread
applications. Some measures rely only on raw text (distributional measures) and
some rely on knowledge sources such as WordNet. Although extensive studies have
been performed to compare WordNet-based measures with human judgment, the use
of distributional measures as proxies to estimate semantic distance has
received little attention. Even though they have traditionally performed poorly
when compared to WordNet-based measures, they lay claim to certain uniquely
attractive features, such as their applicability in resource-poor languages and
their ability to mimic both semantic similarity and semantic relatedness.
Therefore, this paper presents a detailed study of distributional measures.
Particular attention is paid to fleshing out the strengths and limitations of both
WordNet-based and distributional measures, and to how distributional measures of
distance can be brought more in line with human notions of semantic distance.
We conclude with a brief discussion of recent work on hybrid measures.
Building a wordnet for Turkish
This paper summarizes the development process of a wordnet for Turkish as part of the Balkanet project. After discussing the basic methodological issues that had to be resolved during the course of the project, the paper presents the basic steps of the construction process in chronological order. Two applications using the Turkish wordnet are summarized, and links to resources for wordnet builders are provided at the end of the paper.
Affect Analysis of Radical Contents on Web Forums Using SentiWordNet
The internet has become a major tool for communication, training, fundraising, media operations, and recruitment, and these processes often use web forums. This paper presents a model that was built using SentiWordNet, WordNet and NLTK to analyze selected web forums that included radical content. SentiWordNet is a lexical resource that supports opinion mining by assigning a positivity score and a negativity score to each WordNet synset. The model's approaches measure and identify the sentiment polarity and affect intensity of the content that appears in the web forums. The results show that SentiWordNet can be used for analyzing sentences that appear in web forums.
Extending, trimming and fusing WordNet for technical documents
This paper describes a tool for the automatic extension and trimming of a multilingual WordNet database for cross-lingual retrieval and multilingual ontology building in intranets and domain-specific document collections. Hierarchies, built from automatically extracted terms and combined with the WordNet relations, are trimmed with a disambiguation method based on the document salience of the words in the glosses. The disambiguation is tested in a cross-lingual retrieval task, showing considerable improvement (7%-11%). The condensed hierarchies can be used as browse interfaces to the documents, complementary to retrieval.
A proposal for a shallow ontologization of WordNet
This paper presents the work carried out towards the so-called shallow ontologization of WordNet, which is argued to be a way to overcome many of the structural problems of the widely used lexical knowledge base. The expected result is a multilingual resource more suitable than existing ones for large-scale semantic processing.
