1,277 research outputs found

    Highly Interactive and Natural User Interfaces: Enabling Visual Analysis in Historical Lexicography

    Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage. Information technology, through the advances provided by computational linguistics and related disciplines, has opened the door to previously unthinkable possibilities of study in linguistics. The wealth and diversity of sources now available are fundamental to the understanding of language evolution and dictionary-making. However, these advancements are paired with a paradigm shift: both user needs and the ways in which users interact with technology have changed so much and so rapidly that modern lexicography would need to resort to a new generation of tools to support its tasks. We present our work developed for the Nuevo Diccionario Histórico del Español (NDHE), in which the challenges of enabling deeper insight and supporting new user tasks in diachronic linguistics have been approached from a human-computer interaction perspective. In contrast to other disciplines, on which visual analytics has focused its efforts for longer, the analysis tools now in the hands of experts usually provide a volume of "raw" data so vast that the data themselves can greatly hinder the experts' work. The linguistics community has already recognized the key importance of user-friendly interfaces. However, neither more powerful tools (in terms of automatic processing) nor user-friendliness alone is sufficient to support typical analytical tasks that make the most of the multidimensional and ever-growing data stored in corpora and dictionaries. This paper discusses the benefits of producing corpus and dictionary analysis tools that go beyond user-friendliness and presents interactive visual analysis tools produced for the NDHE and its sources.

    06491 Abstracts Collection -- Digital Historical Corpora - Architecture, Annotation, and Retrieval

    From 03.12.06 to 08.12.06, the Dagstuhl Seminar 06491 "Digital Historical Corpora - Architecture, Annotation, and Retrieval" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Lexical typology through similarity semantics: Toward a semantic map of motion verbs

    This paper discusses a multidimensional probabilistic semantic map of lexical motion verb stems based on data collected from parallel texts (viz. translations of the Gospel according to Mark) for 100 languages from all continents. The crosslinguistic diversity of lexical semantics in motion verbs is illustrated in detail for the domain of 'go', 'come', and 'arrive' type contexts. It is argued that the theoretical bases underlying probabilistic semantic maps from exemplar data are the isomorphism hypothesis (given any two meanings and their corresponding forms in any particular language, more similar meanings are more likely to be expressed by the same form in any language), similarity semantics (similarity is more basic than identity), and exemplar semantics (exemplar meaning is more fundamental than abstract concepts).

    Computational approaches to semantic change (Volume 6)

    Semantic change (how the meanings of words change over time) has preoccupied scholars since well before modern linguistics emerged in the late 19th and early 20th centuries, ushering in a new methodological turn in the study of language change. Compared to changes in sound and grammar, semantic change is the least understood. Ever since, the study of semantic change has progressed steadily, accumulating a vast store of knowledge for over a century, encompassing many languages and language families. Historical linguists also realized early on the potential of computers as research tools, with papers at the very first international conferences in computational linguistics in the 1960s. Such computational studies still tended to be small-scale, method-oriented, and qualitative. However, recent years have witnessed a sea-change in this regard. Big-data empirical quantitative investigations are now coming to the forefront, enabled by enormous advances in storage capability and processing power. Diachronic corpora have grown beyond imagination, defying exploration by traditional manual qualitative methods, and language technology has become increasingly data-driven and semantics-oriented. These developments present a golden opportunity for the empirical study of semantic change over both long and short time spans.

    LAF-Fabric: a data analysis tool for Linguistic Annotation Framework with an application to the Hebrew Bible

    The Linguistic Annotation Framework (LAF) provides a general, extensible stand-off markup system for corpora. This paper discusses LAF-Fabric, a new tool to analyse LAF resources in general with an extension to process the Hebrew Bible in particular. We first walk through the history of the Hebrew Bible as a text database in decennium-wide steps. Then we describe how LAF-Fabric may serve as an analysis tool for this corpus. Finally, we describe three analytic projects/workflows that benefit from the new LAF representation: 1) the study of linguistic variation: extract co-occurrence data of common nouns between the books of the Bible (Martijn Naaijer); 2) the study of the grammar of Hebrew poetry in the Psalms: extract clause typology (Gino Kalkman); 3) construction of a parser of classical Hebrew by Data Oriented Parsing: generate tree structures from the database (Andreas van Cranenburgh).