76 research outputs found
Extracting Synonyms from Bilingual Dictionaries
We present our progress in developing a novel algorithm to extract synonyms
from bilingual dictionaries. Identification and usage of synonyms play a
significant role in improving the performance of information access
applications. The idea is to construct a translation graph from translation
pairs, then to extract and consolidate cyclic paths to form bilingual sets of
synonyms. The initial evaluation of this algorithm illustrates promising
results in extracting Arabic-English bilingual synonyms. In the evaluation, we
first converted the synsets in the Arabic WordNet into translation pairs (i.e.,
losing word-sense memberships). Next, we applied our algorithm to rebuild these
synsets. We compared the original and extracted synsets, obtaining an F-Measure
of 82.3% and 82.1% for Arabic and English synset extraction, respectively.
Comment: In Proceedings - 11th International Global Wordnet Conference
(GWC2021). Global Wordnet Association (2021)
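The graph-and-cycle idea in this abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes an undirected bipartite translation graph and detects only the shortest possible cycles (length 4, where two source words share at least two translations), then consolidates overlapping cycles with a union-find. The function name `extract_synonym_sets` and the toy word tokens are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def extract_synonym_sets(pairs):
    """Group words that lie on a common length-4 cycle of the translation graph.

    pairs: iterable of (source_word, target_word) translation pairs.
    Returns a sorted list of (source_words, target_words) bilingual sets.
    Assumption: only 4-cycles are considered; longer cycles are ignored.
    """
    # Translation graph stored as source word -> set of its translations.
    translations = defaultdict(set)
    for src, tgt in pairs:
        translations[src].add(tgt)

    # Union-find used to consolidate overlapping cycles into one set.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for s1, s2 in combinations(sorted(translations), 2):
        shared = translations[s1] & translations[s2]
        if len(shared) >= 2:
            # s1 - t - s2 - t' - s1 is a cycle, so merge all of its nodes.
            nodes = [("src", s1), ("src", s2)] + [("tgt", t) for t in shared]
            for node in nodes[1:]:
                union(nodes[0], node)

    # Collect the merged nodes into bilingual (source, target) word sets.
    groups = defaultdict(lambda: (set(), set()))
    for node in list(parent):
        side, word = node
        groups[find(node)][0 if side == "src" else 1].add(word)
    return sorted((sorted(s), sorted(t)) for s, t in groups.values())


# Toy, made-up translation pairs (not taken from any real dictionary).
toy_pairs = [
    ("w1", "big"), ("w1", "large"),
    ("w2", "big"), ("w2", "large"),
    ("w3", "house"),
]
synonym_sets = extract_synonym_sets(toy_pairs)
```

In this toy input, `w1` and `w2` share two translations and therefore close a cycle, while `w3` lies on no cycle and is left out of the result.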
A Comprehensive Review of Sentiment Analysis on Indian Regional Languages: Techniques, Challenges, and Trends
Sentiment analysis (SA) is the process of understanding emotion within a text. It helps identify the opinion, attitude, and tone of a text, categorizing it as positive, negative, or neutral. SA is frequently used today as more and more people get a chance to share their thoughts thanks to the advent of social media. Sentiment analysis benefits industries around the globe, such as finance, advertising, marketing, travel, and hospitality. Although the majority of work done in this field is on global languages like English, in recent years the importance of SA in local languages has also been widely recognized. This has led to considerable research in the analysis of Indian regional languages. This paper comprehensively reviews SA in the following major Indian regional languages: Marathi, Hindi, Tamil, Telugu, Malayalam, Bengali, Gujarati, and Urdu. Furthermore, this paper presents techniques, challenges, findings, recent research trends, and future scope for improving the accuracy of results.
Categories and classifications in EuroWordNet
In EuroWordNet we develop wordnets in 8 European languages, which are structured along the same lines as the Princeton WordNet. The wordnets are inter-linked in a multilingual database, where they can be compared. This comparison reveals many different lexicalizations of classes across the languages that also lead to important differences in the hierarchical structure of the wordnets. It is not feasible to include all these classes (the superset) in each language-specific wordnet and to reach consensus on the implicational effects across all the languages. Each wordnet is therefore limited to the lexicalized words and expressions of a language. The wordnets are thus autonomous language-specific structures that capture valuable information about the lexicalization of each language, which is important for information retrieval, machine translation and language generation. By connecting the wordnets to a separate ontology, semantic inferencing can still be guaranteed. Still, different types of classification schemes can be distinguished among the lexicalized classes. In this paper we will further describe the properties of these different classes and discuss the advantages and effects of distinguishing them in wordnet-like structures.