    MIRACLE Retrieval Experiments with East Asian Languages

    This paper describes the participation of MIRACLE in the NTCIR 2005 CLIR task. Although our group has a strong background and long experience in Computational Linguistics and Information Retrieval applied to European languages written in the Latin and Cyrillic alphabets, this was our first attempt at East Asian languages. Our main goal was to study the particularities and distinctive characteristics of Japanese, Chinese and Korean, focusing especially on their similarities to and differences from European languages, and to carry out research on CLIR tasks that include those languages. The basic idea behind our participation in NTCIR is to test whether the same familiar linguistic-based techniques may also be applicable to East Asian languages, and to study the necessary adaptations.

    LLTI Highlights

    On the ethnic classification of Pakistani face using deep learning

    Translation into any natural language of the error messages generated by any computer program

    Since the introduction of the Fortran programming language some 60 years ago, there has been little progress in making error messages more user-friendly. A first step in this direction is to translate them into the natural language of the students. In this paper we propose a simple script for Linux systems which gives word-by-word translations of error messages. It works for most programming languages and for all natural languages. Understanding the error messages generated by compilers is a major hurdle for students who are learning programming, particularly for non-native English speakers. Not only may they never become "fluent" in programming, but many give up programming altogether. Although programming is a tool which can be useful in many human activities, e.g. history, genealogy, astronomy, entomology, in many countries the skill of programming remains confined to a narrow fringe of professional programmers. In all societies, besides professional violinists there are also amateurs. It should be the same for programming. It is our hope that, once translated and explained, the error messages will be seen by the students as an aid rather than as an obstacle, and that in this way more students will enjoy learning and practising programming. They should see it as a funny game. Comment: 14 pages, 1 figure
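
The word-by-word idea in this abstract can be illustrated with a minimal sketch. This is not the authors' script: the abstract describes a script for Linux systems but does not show it, so Python is swapped in here, and the glossary entries and the function name `translate_message` are hypothetical. The sketch only replaces words found in a small English-to-Spanish glossary and leaves identifiers, numbers, and punctuation untouched.

```python
import re

# Hypothetical English-to-Spanish glossary (an assumption for illustration;
# the dictionary used in the paper is not given in the abstract).
GLOSSARY = {
    "error": "error",
    "undefined": "indefinida",
    "variable": "variable",
    "expected": "esperado",
    "missing": "falta",
}


def translate_message(message, glossary):
    """Replace each known word with its glossary entry, word by word.

    Words that are not in the glossary (file names, identifiers, etc.)
    are left exactly as they appear in the original message.
    """

    def swap(match):
        word = match.group(0)
        # Look the word up case-insensitively; fall back to the original.
        return glossary.get(word.lower(), word)

    # Only alphabetic runs are candidates; digits and symbols pass through.
    return re.sub(r"[A-Za-z]+", swap, message)


if __name__ == "__main__":
    print(translate_message("main.c:3: error: expected ';'", GLOSSARY))
```

Because the lookup is purely lexical, such a sketch ignores word order and grammar, which matches the "word by word" scope stated in the abstract and nothing more.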

    The "handedness" of language: Directional symmetry breaking of sign usage in words

    Language, which allows complex ideas to be communicated through symbolic sequences, is a characteristic feature of our species and is manifested in a multitude of forms. Using large written corpora for many different languages and scripts, we show that the occurrence probability distributions of signs at the left and right ends of words have a distinct heterogeneous nature. Characterizing this asymmetry using quantitative inequality measures, viz. information entropy and the Gini index, we show that the beginning of a word is less restrictive in sign usage than the end. This property is not simply attributable to the use of common affixes, as it is seen even when only word roots are considered. We use the existence of this asymmetry to infer the direction of writing in undeciphered inscriptions, and the inferred direction agrees with the archaeological evidence. Unlike traditional investigations of phonotactic constraints, which focus on language-specific patterns, our study reveals a property valid across languages and writing systems. As both language and writing are unique aspects of our species, this universal signature may reflect an innate feature of the human cognitive phenomenon. Comment: 10 pages, 4 figures + Supplementary Information (15 pages, 8 figures), final corrected version
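
The two inequality measures named in this abstract can be sketched as follows. This is an illustrative sketch under assumptions, not the authors' procedure: the helper names (`edge_counts`, `entropy`, `gini`) are hypothetical, each character is treated as one sign, and the Gini index uses the standard sorted-values formula, which the abstract does not specify. Signs that never occur at a given word edge are counted as zero so that both edges are compared over the same sign inventory.

```python
import math
from collections import Counter


def edge_counts(words):
    """Count how often each sign occurs word-initially and word-finally.

    Returns two lists of counts over the same sorted sign inventory,
    so that signs absent from one edge contribute a zero there.
    """
    words = [w for w in words if w]
    first = Counter(w[0] for w in words)
    last = Counter(w[-1] for w in words)
    inventory = sorted({ch for w in words for ch in w})
    return [first[s] for s in inventory], [last[s] for s in inventory]


def entropy(counts):
    """Shannon entropy (in bits) of the distribution given by the counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def gini(counts):
    """Gini index of the counts: 0 for a uniform spread, near 1 when
    a single sign accounts for almost all occurrences."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    sample = ["language", "writing", "signs", "words", "scripts"]
    start, end = edge_counts(sample)
    print("start:", entropy(start), gini(start))
    print("end:  ", entropy(end), gini(end))
```

Under these definitions, a less restrictive word edge would show higher entropy and a lower Gini index than a more restrictive one; the toy word list above is far too small to say anything about real corpora.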