    Automatic Construction of Clean Broad-Coverage Translation Lexicons

    Word-level translational equivalences can be extracted from parallel texts by surprisingly simple statistical techniques. However, these techniques are easily fooled by indirect associations: pairs of unrelated words whose statistical properties resemble those of mutual translations. Indirect associations pollute the resulting translation lexicons, drastically reducing their precision. This paper presents an iterative lexicon cleaning method. On each iteration, most of the remaining incorrect lexicon entries are filtered out, without significant degradation in recall. This lexicon cleaning technique can produce translation lexicons with recall and precision both exceeding 90%, as well as dictionary-sized translation lexicons that are over 99% correct. Comment: PostScript file, 10 pages. To appear in Proceedings of AMTA-9

    Dublin City University at CLEF 2007: Cross-Language Speech Retrieval Experiments

    The Dublin City University participation in the CLEF 2007 CL-SR English task concentrated primarily on issues of topic translation. Our retrieval system used the BM25F model and pseudo-relevance feedback. Topics were translated into English using the Yahoo! BabelFish free online service combined with domain-specific translation lexicons gathered automatically from Wikipedia. We explored alternative topic translation methods using these resources. Our results indicate that extending machine translation tools with automatically generated domain-specific translation lexicons can improve CLIR effectiveness for this task.

    Representing the bilingual's two lexicons

    A review of empirical work suggests that the lexical representations of a bilingual’s two languages are independent (Smith, 1991), but may also be sensitive to between-language similarity patterns (e.g. Cristoffanini, Kirsner, and Milech, 1986). Some researchers hold that infant bilinguals do not initially differentiate between their two languages (e.g. Redlinger & Park, 1980). Yet by the age of two they appear to have acquired separate linguistic systems for each language (Lanza, 1992). This paper explores the hypothesis that the separation of lexical representations in bilinguals is functional rather than architectural. It suggests that the separation may be driven by differences in the structure of the input to a common architectural system. Connectionist simulations are presented that model the representation of two sets of lexical information. These simulations explore the conditions required to create functionally independent lexical representations in a single neural network. It is shown that a single network may acquire a second language after learning a first (avoiding the traditional problem of catastrophic interference in these networks). Further, it is shown that in a single network, the functional independence of representations depends on inter-language similarity patterns. The latter finding is difficult to account for in a model that postulates architecturally separate lexical representations.

    Word Affect Intensities

    Words often convey affect: emotions, feelings, and attitudes. Lexicons of word-affect association have applications in automatic emotion analysis and natural language generation. However, existing lexicons indicate only coarse categories of affect association. Here, for the first time, we create an affect intensity lexicon with real-valued scores of association. We use a technique called best-worst scaling that improves annotation consistency and obtains reliable fine-grained scores. The lexicon includes both common terms from general English and terms specific to social media communications. It has close to 6,000 entries for four basic emotions. We will be adding entries for other affect dimensions shortly.

    Lexicons in Nelayan Dance Movements

    Nelayan dance is one of the classical dances in Bali that have high cultural value. It has a unique and varied lexicon worth studying. The lexicons are usually used during dance practice. However, as times have changed, their use has been declining. This study aims to collect the lexicons that exist in the Nelayan dance. The research used a descriptive qualitative design with an ecolinguistic approach. The object of this research is the lexicon of movements in the Nelayan dance, including the cultural meaning of each lexicon. The subjects were three informants with extensive knowledge of the arts. The data were obtained through observations and interviews at the Sanggar Seni Manik Uttara. The results indicate that the Nelayan dance has 55 movement lexicons, divided into six types: head movements (4 lexicons), eye movements (5 lexicons), neck movements (2 lexicons), hand movements (22 lexicons), body movements (14 lexicons), and leg movements (8 lexicons). Of these lexicons, 28 were found to carry cultural meaning. Keywords: lexicons, Nelayan dance, ecolinguistics, movements, language death