CRL at Ntcir2
We have developed systems of two types for NTCIR2. One is an enhanced version
of the system we developed for NTCIR1 and IREX. It submitted retrieval results
for JJ and CC tasks. A variety of parameters were tried with the system. It
used such characteristics of newspapers as locational information in the CC
tasks. The system got good results for both of the tasks. The other system is a
portable system which avoids free parameters as much as possible. The system
submitted retrieval results for JJ, JE, EE, EJ, and CC tasks. The system
automatically determined the number of top documents and the weight of the
original query used in automatic-feedback retrieval. It also determined
relevant terms quite robustly. For EJ and JE tasks, it used document expansion
to augment the initial queries. It achieved good results, except on the CC
tasks. Comment: 11 pages. Computation and Language. This paper describes our results
of information retrieval in the NTCIR2 contes
Automatic domain ontology extraction for context-sensitive opinion mining
Automated analysis of the sentiments expressed in online consumer feedback can facilitate both organizations’ business strategy development and individual consumers’ comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context-sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology-based context-sensitive opinion mining system. Our novel ontology extraction mechanism, underpinned by a variant of Kullback-Leibler divergence, can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows remarkable performance improvement over the context-free baseline
Distributional Measures of Semantic Distance: A Survey
The ability to mimic human notions of semantic distance has widespread
applications. Some measures rely only on raw text (distributional measures) and
some rely on knowledge sources such as WordNet. Although extensive studies have
been performed to compare WordNet-based measures with human judgment, the use
of distributional measures as proxies to estimate semantic distance has
received little attention. Even though they have traditionally performed poorly
when compared to WordNet-based measures, they lay claim to certain uniquely
attractive features, such as their applicability in resource-poor languages and
their ability to mimic both semantic similarity and semantic relatedness.
Therefore, this paper presents a detailed study of distributional measures.
Particular attention is paid to fleshing out the strengths and limitations of both
WordNet-based and distributional measures, and how distributional measures of
distance can be brought more in line with human notions of semantic distance.
We conclude with a brief discussion of recent work on hybrid measures