Automatic Construction of Clean Broad-Coverage Translation Lexicons
Word-level translational equivalences can be extracted from parallel texts by
surprisingly simple statistical techniques. However, these techniques are
easily fooled by indirect associations -- pairs of unrelated words whose
statistical properties resemble those of mutual translations. Indirect
associations pollute the resulting translation lexicons, drastically reducing
their precision. This paper presents an iterative lexicon cleaning method. On
each iteration, most of the remaining incorrect lexicon entries are filtered
out, without significant degradation in recall. This lexicon cleaning technique
can produce translation lexicons with recall and precision both exceeding 90%,
as well as dictionary-sized translation lexicons that are over 99% correct.
Comment: PostScript file, 10 pages. To appear in Proceedings of AMTA-9
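The kind of pipeline this abstract describes can be sketched in a few lines: count word co-occurrences over aligned sentence pairs, score candidate pairs with a simple association measure (Dice coefficient here), then run a greedy one-to-one linking pass in which the best-scoring pair claims both of its words, blocking lower-scoring indirect associations. This is a simplified stand-in for the paper's iterative cleaning, with a toy corpus; all data and names are illustrative.

```python
from collections import Counter

# Toy parallel corpus: (source sentence, target sentence) pairs.
bitext = [
    (["the", "house"], ["la", "maison"]),
    (["the", "car"], ["la", "voiture"]),
    (["a", "house"], ["une", "maison"]),
]

cooc = Counter()       # joint sentence-level counts
src_count = Counter()  # marginal counts
tgt_count = Counter()
for src, tgt in bitext:
    for s in set(src):
        src_count[s] += 1
        for t in set(tgt):
            cooc[(s, t)] += 1
    for t in set(tgt):
        tgt_count[t] += 1

def dice(s, t):
    # Simple association score. Indirect associations such as
    # ("the", "maison") still score above zero, which is exactly
    # what the cleaning step must filter out.
    return 2 * cooc[(s, t)] / (src_count[s] + tgt_count[t])

# Greedy one-to-one linking: accept the best-scoring pair, then
# remove both words from further competition. True translations
# tend to "win" first, so many indirect associations are blocked.
pairs = sorted(cooc, key=lambda p: dice(*p), reverse=True)
linked, used_s, used_t = [], set(), set()
for s, t in pairs:
    if s not in used_s and t not in used_t:
        linked.append((s, t))
        used_s.add(s)
        used_t.add(t)
```

On this toy corpus, `("the", "maison")` co-occurs twice but is excluded from `linked` because both words are already claimed by higher-scoring pairs.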
Dublin City University at CLEF 2007: Cross-Language Speech Retrieval Experiments
The Dublin City University participation in the CLEF 2007 CL-SR English task concentrated primarily on issues of topic translation. Our retrieval system used the BM25F model and pseudo relevance feedback. Topics were translated into English using the Yahoo! BabelFish free online service combined with domain-specific translation lexicons gathered automatically from Wikipedia. We explored alternative topic translation methods using these resources. Our results indicate that extending machine translation tools using automatically generated domain-specific translation lexicons can provide improved CLIR effectiveness for this task.
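The lexicon-extended translation step described here can be sketched as a simple fallback chain: look a topic term up in the domain-specific lexicon first, and fall back to the generic MT system otherwise. This is a hypothetical sketch, not DCU's actual implementation; the lexicon entries and MT table below are invented for illustration.

```python
def translate_topic(terms, domain_lexicon, mt_translate):
    """Prefer a domain-specific lexicon entry when available,
    falling back to the generic MT system otherwise."""
    return [domain_lexicon.get(term) or mt_translate(term) for term in terms]

# Toy fallback MT system and a lexicon entry mined (hypothetically)
# from cross-language Wikipedia article links.
mt = {"zeuge": "witness", "lager": "camp", "shoah": "shoah"}.get
lexicon = {"shoah": "Holocaust"}

translated = translate_topic(["zeuge", "shoah"], lexicon, mt)
# translated == ["witness", "Holocaust"]
```

The design point is that the domain lexicon overrides the MT system only for terms it covers, so generic translation quality is preserved elsewhere.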
Representing the bilingual's two lexicons
A review of empirical work suggests that the lexical representations of a bilingual's two languages are independent (Smith, 1991), but may also be sensitive to between-language similarity patterns (e.g. Cristoffanini, Kirsner, and Milech, 1986). Some researchers hold that infant bilinguals do not initially differentiate between their two languages (e.g. Redlinger & Park, 1980). Yet by the age of two they appear to have acquired separate linguistic systems for each language (Lanza, 1992). This paper explores the hypothesis that the separation of lexical representations in bilinguals is a functional rather than an architectural one. It suggests that the separation may be driven by differences in the structure of the input to a common architectural system. Connectionist simulations are presented modelling the representation of two sets of lexical information. These simulations explore the conditions required to create functionally independent lexical representations in a single neural network. It is shown that a single network may acquire a second language after learning a first (avoiding the traditional problem of catastrophic interference in these networks). Further, it is shown that in a single network, the functional independence of representations is dependent on inter-language similarity patterns. The latter finding is difficult to account for in a model that postulates architecturally separate lexical representations.
Word Affect Intensities
Words often convey affect -- emotions, feelings, and attitudes. Lexicons of
word-affect association have applications in automatic emotion analysis and
natural language generation. However, existing lexicons indicate only coarse
categories of affect association. Here, for the first time, we create an affect
intensity lexicon with real-valued scores of association. We use a technique
called best-worst scaling that improves annotation consistency and obtains
reliable fine-grained scores. The lexicon includes terms common from both
general English and terms specific to social media communications. It has close
to 6,000 entries for four basic emotions. We will be adding entries for other
affect dimensions shortly.
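Best-worst scaling, as used here, converts tuple judgements into real-valued intensity scores with a simple counting procedure: the fraction of times a term was chosen as most intense minus the fraction of times it was chosen as least intense. A minimal sketch with made-up annotations (the 4-term tuples, choices, and words below are illustrative, not from the lexicon):

```python
from collections import Counter

# Each annotation: (4-term tuple shown, choice of most intense,
# choice of least intense).
annotations = [
    ({"furious", "annoyed", "calm", "upset"}, "furious", "calm"),
    ({"furious", "upset", "calm", "irked"}, "furious", "calm"),
    ({"annoyed", "upset", "irked", "calm"}, "upset", "calm"),
]

best, worst, shown = Counter(), Counter(), Counter()
for items, b, w in annotations:
    best[b] += 1
    worst[w] += 1
    for item in items:
        shown[item] += 1

def bws_score(term):
    # Standard best-worst counting: %best - %worst, in [-1, 1].
    return (best[term] - worst[term]) / shown[term]
```

With these toy annotations, `bws_score("furious")` is 1.0 (always chosen best) and `bws_score("calm")` is -1.0 (always chosen worst), with intermediate terms falling in between.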
Lexicons in Nelayan Dance Movements
Abstract: Nelayan dance is one of the classical dances in Bali and has high cultural value. Nelayan dance has unique and varied lexicons to study. Usually, the lexicons in Nelayan dance are used during dance practice. However, with the development of the times, use of the lexicon in the Nelayan dance is decreasing. This study aims to collect the lexicon that exists in the Nelayan dance. The research was designed in a descriptive qualitative form using an ecolinguistic approach. The object of the research is the lexicon of movements in the Nelayan dance, including the cultural meaning of each lexicon. The subjects of the study were three informants with extensive knowledge of the arts. The data were obtained by conducting observations and interviews at the Sanggar Seni Manik Uttara. The results indicate that there are 55 movement lexicons in Nelayan dance, divided into six types: head movements (4 lexicons), eye movements (5 lexicons), neck movements (2 lexicons), hand movements (22 lexicons), body movements (14 lexicons), and leg movements (8 lexicons). Of these, cultural meaning was found for 28 lexicons.
Keywords: Lexicons, nelayan dance, ecolinguistics, movements, language death
A Linked Open Data Approach for Sentiment Lexicon Adaptation
Social media platforms have recently become a gold mine for organisations to monitor their reputation by extracting and analysing the sentiment of the posts generated about them, their markets, and competitors. Among the approaches to analyse sentiment from social media, approaches based on sentiment lexicons (sets of words with associated sentiment scores) have gained popularity since they do not rely on training data, as opposed to Machine Learning approaches. However, sentiment lexicons consider a static sentiment score for each word without taking into consideration the different contexts in which the word is used (e.g., "great problem" vs. "great smile"). Additionally, new words constantly emerge from dynamic and rapidly changing social media environments that may not be covered by the lexicons. In this paper we propose a lexicon adaptation approach that makes use of semantic relations extracted from DBpedia to better understand the various contextual scenarios in which words are used. We evaluate our approach on three different Twitter datasets and show that using semantic information to adapt the lexicon improves sentiment computation by 3.7% in average accuracy and by 2.6% in average F1 measure.
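The "great problem" vs. "great smile" issue can be made concrete with a minimal context-sensitive lexicon scorer. This sketch replaces the paper's DBpedia-derived semantics with a hand-written table of contextual overrides, so it only illustrates the general idea of adapting a word's prior score to its context; all scores and word pairs are assumptions.

```python
# Prior (context-free) lexicon scores, as in a static sentiment lexicon.
prior = {"great": +2, "problem": -2, "smile": +2}

# Contextual overrides: (word, following word) -> adapted score.
# Illustrative only; the paper derives such context from DBpedia relations.
adapted = {("great", "problem"): -1}

def score(tokens):
    total = 0
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        # Use the context-adapted score when one exists, else the prior.
        total += adapted.get((tok, nxt), prior.get(tok, 0))
    return total

neg = score("what a great problem".split())  # "great" adapted to -1, total -3
pos = score("what a great smile".split())    # prior kept, total +4
```

The same sentence frame flips overall polarity depending on the noun, which a static lexicon alone cannot capture.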