Automatic Construction of Clean Broad-Coverage Translation Lexicons
Word-level translational equivalences can be extracted from parallel texts by
surprisingly simple statistical techniques. However, these techniques are
easily fooled by indirect associations -- pairs of unrelated words whose
statistical properties resemble those of mutual translations. Indirect
associations pollute the resulting translation lexicons, drastically reducing
their precision. This paper presents an iterative lexicon cleaning method. On
each iteration, most of the remaining incorrect lexicon entries are filtered
out, without significant degradation in recall. This lexicon cleaning technique
can produce translation lexicons with recall and precision both exceeding 90%,
as well as dictionary-sized translation lexicons that are over 99% correct.
Comment: PostScript file, 10 pages. To appear in Proceedings of AMTA-9
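The kind of pipeline this abstract describes can be sketched in a few lines: count word co-occurrences over aligned sentence pairs, score candidate pairs with a simple association measure (Dice coefficient here), then run a greedy one-to-one linking pass in which the best-scoring pair claims both of its words, blocking lower-scoring indirect associations. This is a simplified stand-in for the paper's iterative cleaning, with a toy corpus; all data and names are illustrative.

```python
from collections import Counter

# Toy parallel corpus: (source sentence, target sentence) pairs.
bitext = [
    (["the", "house"], ["la", "maison"]),
    (["the", "car"], ["la", "voiture"]),
    (["a", "house"], ["une", "maison"]),
]

cooc = Counter()       # joint sentence-level counts
src_count = Counter()  # marginal counts
tgt_count = Counter()
for src, tgt in bitext:
    for s in set(src):
        src_count[s] += 1
        for t in set(tgt):
            cooc[(s, t)] += 1
    for t in set(tgt):
        tgt_count[t] += 1

def dice(s, t):
    # Simple association score. Indirect associations such as
    # ("the", "maison") still score above zero, which is exactly
    # what the cleaning step must filter out.
    return 2 * cooc[(s, t)] / (src_count[s] + tgt_count[t])

# Greedy one-to-one linking: accept the best-scoring pair, then
# remove both words from further competition. True translations
# tend to "win" first, so many indirect associations are blocked.
pairs = sorted(cooc, key=lambda p: dice(*p), reverse=True)
linked, used_s, used_t = [], set(), set()
for s, t in pairs:
    if s not in used_s and t not in used_t:
        linked.append((s, t))
        used_s.add(s)
        used_t.add(t)
```

On this toy corpus, `("the", "maison")` co-occurs twice but is excluded from `linked` because both words are already claimed by higher-scoring pairs.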
Dublin City University at CLEF 2007: Cross-Language Speech Retrieval Experiments
The Dublin City University participation in the CLEF 2007 CL-SR English task concentrated primarily on issues of topic translation. Our retrieval system used the BM25F model and pseudo relevance feedback. Topics were translated into English using the Yahoo! BabelFish free online service combined with domain-specific translation lexicons gathered automatically from Wikipedia. We explored alternative topic translation methods using these resources. Our results indicate that extending machine translation tools using automatically generated domain-specific translation lexicons can provide improved CLIR effectiveness for this task.
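The lexicon-extended translation step described here can be sketched as a simple fallback chain: look a topic term up in the domain-specific lexicon first, and fall back to the generic MT system otherwise. This is a hypothetical sketch, not DCU's actual implementation; the lexicon entries and MT table below are invented for illustration.

```python
def translate_topic(terms, domain_lexicon, mt_translate):
    """Prefer a domain-specific lexicon entry when available,
    falling back to the generic MT system otherwise."""
    return [domain_lexicon.get(term) or mt_translate(term) for term in terms]

# Toy fallback MT system and a lexicon entry mined (hypothetically)
# from cross-language Wikipedia article links.
mt = {"zeuge": "witness", "lager": "camp", "shoah": "shoah"}.get
lexicon = {"shoah": "Holocaust"}

translated = translate_topic(["zeuge", "shoah"], lexicon, mt)
# translated == ["witness", "Holocaust"]
```

The design point is that the domain lexicon overrides the MT system only for terms it covers, so generic translation quality is preserved elsewhere.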
Representing the bilingual's two lexicons
A review of empirical work suggests that the lexical representations of a bilingual's two languages are independent (Smith, 1991), but may also be sensitive to between-language similarity patterns (e.g. Cristoffanini, Kirsner, and Milech, 1986). Some researchers hold that infant bilinguals do not initially differentiate between their two languages (e.g. Redlinger & Park, 1980). Yet by the age of two they appear to have acquired separate linguistic systems for each language (Lanza, 1992). This paper explores the hypothesis that the separation of lexical representations in bilinguals is a functional rather than an architectural one. It suggests that the separation may be driven by differences in the structure of the input to a common architectural system. Connectionist simulations are presented modelling the representation of two sets of lexical information. These simulations explore the conditions required to create functionally independent lexical representations in a single neural network. It is shown that a single network may acquire a second language after learning a first (avoiding the traditional problem of catastrophic interference in these networks). Further, it is shown that in a single network, the functional independence of representations is dependent on inter-language similarity patterns. The latter finding is difficult to account for in a model that postulates architecturally separate lexical representations.
Word Affect Intensities
Words often convey affect -- emotions, feelings, and attitudes. Lexicons of
word-affect association have applications in automatic emotion analysis and
natural language generation. However, existing lexicons indicate only coarse
categories of affect association. Here, for the first time, we create an affect
intensity lexicon with real-valued scores of association. We use a technique
called best-worst scaling that improves annotation consistency and obtains
reliable fine-grained scores. The lexicon includes terms common from both
general English and terms specific to social media communications. It has close
to 6,000 entries for four basic emotions. We will be adding entries for other
affect dimensions shortly.
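Best-worst scaling, as used here, converts tuple judgements into real-valued intensity scores with a simple counting procedure: the fraction of times a term was chosen as most intense minus the fraction of times it was chosen as least intense. A minimal sketch with made-up annotations (the 4-term tuples, choices, and words below are illustrative, not from the lexicon):

```python
from collections import Counter

# Each annotation: (4-term tuple shown, choice of most intense,
# choice of least intense).
annotations = [
    ({"furious", "annoyed", "calm", "upset"}, "furious", "calm"),
    ({"furious", "upset", "calm", "irked"}, "furious", "calm"),
    ({"annoyed", "upset", "irked", "calm"}, "upset", "calm"),
]

best, worst, shown = Counter(), Counter(), Counter()
for items, b, w in annotations:
    best[b] += 1
    worst[w] += 1
    for item in items:
        shown[item] += 1

def bws_score(term):
    # Standard best-worst counting: %best - %worst, in [-1, 1].
    return (best[term] - worst[term]) / shown[term]
```

With these toy annotations, `bws_score("furious")` is 1.0 (always chosen best) and `bws_score("calm")` is -1.0 (always chosen worst), with intermediate terms falling in between.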
Lexicons in Nelayan Dance Movements
Abstract: Nelayan dance is one of the classical dances in Bali and has high cultural value. Nelayan dance has unique and varied lexicons to study. Usually, the lexicons in Nelayan dance are used during dance practice. However, with the development of the times, use of the lexicon in the Nelayan dance is decreasing. This study aims to collect the lexicon that exists in the Nelayan dance. The research was designed in a descriptive qualitative form using an ecolinguistic approach. The object of the research is the lexicon of movements in the Nelayan dance, including the cultural meaning of each lexicon. The subjects of the study were three informants with extensive knowledge of the arts. The data were obtained by conducting observations and interviews at the Sanggar Seni Manik Uttara. The results indicate that there are 55 movement lexicons in Nelayan dance, divided into six types: head movements (4 lexicons), eye movements (5 lexicons), neck movements (2 lexicons), hand movements (22 lexicons), body movements (14 lexicons), and leg movements (8 lexicons). Of these, cultural meaning was found for 28 lexicons.
Keywords: Lexicons, nelayan dance, ecolinguistics, movements, language death
A Linked Open Data Approach for Sentiment Lexicon Adaptation
Social media platforms have recently become a gold mine for organisations to monitor their reputation by extracting and analysing the sentiment of the posts generated about them, their markets, and competitors. Among the approaches to analyse sentiment from social media, approaches based on sentiment lexicons (sets of words with associated sentiment scores) have gained popularity since they do not rely on training data, as opposed to Machine Learning approaches. However, sentiment lexicons consider a static sentiment score for each word without taking into consideration the different contexts in which the word is used (e.g., "great problem" vs. "great smile"). Additionally, new words constantly emerge from dynamic and rapidly changing social media environments that may not be covered by the lexicons. In this paper we propose a lexicon adaptation approach that makes use of semantic relations extracted from DBpedia to better understand the various contextual scenarios in which words are used. We evaluate our approach on three different Twitter datasets and show that using semantic information to adapt the lexicon improves sentiment computation by 3.7% in average accuracy and by 2.6% in average F1 measure.
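The "great problem" vs. "great smile" issue can be made concrete with a minimal context-sensitive lexicon scorer. This sketch replaces the paper's DBpedia-derived semantics with a hand-written table of contextual overrides, so it only illustrates the general idea of adapting a word's prior score to its context; all scores and word pairs are assumptions.

```python
# Prior (context-free) lexicon scores, as in a static sentiment lexicon.
prior = {"great": +2, "problem": -2, "smile": +2}

# Contextual overrides: (word, following word) -> adapted score.
# Illustrative only; the paper derives such context from DBpedia relations.
adapted = {("great", "problem"): -1}

def score(tokens):
    total = 0
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        # Use the context-adapted score when one exists, else the prior.
        total += adapted.get((tok, nxt), prior.get(tok, 0))
    return total

neg = score("what a great problem".split())  # "great" adapted to -1, total -3
pos = score("what a great smile".split())    # prior kept, total +4
```

The same sentence frame flips overall polarity depending on the noun, which a static lexicon alone cannot capture.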