12 research outputs found

    Creating multimedia dictionaries of endangered languages using LEXUS

    This paper reports on the development of LEXUS, a flexible web-based lexicon tool. LEXUS is targeted at linguists involved in the documentation of endangered languages. It allows the creation of lexica within the structure of the proposed ISO LMF standard and uses the proposed concept naming conventions from the ISO data categories, thus enabling interoperability, search and merging. LEXUS also makes it possible to visualize language, since it can include audio, video and still images in the lexicon. With LEXUS it is possible to create semantic network knowledge bases using typed relations. The LEXUS tool is free to use. Index Terms: lexicon, web-based application, endangered languages, language documentation

    Zapotec Language Activism And Talking Dictionaries

    Online dictionaries have become a key tool for some indigenous communities to promote and preserve their languages, often in collaboration with linguists. They can provide a pathway for crossing the digital divide and for establishing a first-ever presence on the internet. Many questions around digital lexicography have been explored, although primarily in relation to large and well-resourced languages. Lexical projects on small and under-resourced languages can provide an opportunity to examine these questions from a different perspective and to raise new questions (Mosel, 2011). In this paper, linguists, technical experts, and Zapotec language activists, who have worked together in Mexico and the United States to create a multimedia platform to showcase and preserve lexical, cultural, and environmental knowledge, share their experience and insight in creating trilingual online Talking Dictionaries in several Zapotec languages. These dictionaries stand in contrast to big data mining and illustrate the value of dictionary projects based on small corpora, including the flexibility to make design decisions that maximize community impact and elevate the status of marginalized languages.

    Archiving and accessing language resources

    Languages are among the most complex systems that evolution has created. Many of these unique results of evolution are currently disappearing at an unforeseen speed: every two weeks one of the 6500 languages still spoken dies, and many are subject to extreme changes due to globalization. Experts have understood the need to document the languages and preserve the cultural and linguistic treasures embedded in them for future generations. Linguistic theory will also need to consider the variation of the linguistic systems encoded in languages to improve our understanding of how human minds process language material, so access to all types of resources is increasingly crucial. Deeper insights into human language processing and a higher degree of integration and interoperability between resources will also improve our language processing technology. The DOBES programme is focussing on the documentation and preservation of language material. The Max Planck Institute developed the Language Archiving Technology to help researchers create, archive and access language resources. The main goals of the recently started CLARIN research infrastructure are to achieve broad visibility and easy accessibility of language resources.

    Research Report 2007 | 2008


    Annual Report of the University, 1994-1995, Volumes 1-4

    DEMONSTRATING THE STRENGTH OF DIVERSITY A walk around the UNM campus as students change classes demonstrates UNM's commitment to diversity. Students and professors from a variety of ethnic backgrounds crowd the sidewalks and fill classrooms. Over the past year UNM moved forward with existing and new programs to interest more minority students, faculty and staff in the University and to aid in their success while here. Hispanic Outlook in Higher Education recently recognized the University's endeavors, ranking UNM as one of the best colleges in the nation at graduating Hispanic students. Provost Mary Sue Coleman says diversity contributes to a stimulating environment where faculty and students have different points of view and experiences. The campus becomes a more intellectually alive place, she says. The efforts to build a diverse campus go hand in hand with the University's goals of achieving academic excellence and attracting the best and brightest. MINORITY ENROLLMENT In the fall of 1994 a total of 32 percent of the student body came from underrepresented groups. The UNM School of Law had the largest number of Native Americans enrolled in any law school in the country.

    Tune your brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyper-parameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, chosen number of classes, and quality of the resulting clusters, which has an impact on any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.