
    E-magyar -- A Digital Language Processing System


    Hungarian Gyerekestül versus Gyerekkel (‘with [the] kid’)

    The paper analyzes the various uses of the Hungarian sociative (associative) suffix -stUl (‘together with’, ‘along with’), referred to later in the paper simply as the “sociative”, as in the example gyerekestül. Unlike the comitative-instrumental suffix -vAl (‘with’), the -stUl suffix cannot express instrumentality. The paper aims to demonstrate the difference in use between the comitative-instrumental -vAl and the -stUl suffix in contemporary Hungarian, and to illuminate the historical emergence of the suffix as well as its grammatical status. It is argued on the basis of Antal (1960) and Kiefer (2003) that -stUl cannot be analyzed as an inflectional case suffix (such as the -vAl suffix, or -ed, -ing, or the plural in English), but should rather be categorized as a derivational suffix (such as English dis-, re-, in-, -ance, -able, -ish, -like, etc.). The paper also tries to shed light on the hypothetical cognitive psychological distinction between the comitative and the sociative. It is suggested that the sociative is based on the amalgam image schema, which is derived from the LINK schema of the comitative. The ironic reading of the sociative is an implicature in the sense of Grice (1989) and Sperber and Wilson (1987). Psycholinguistic experimentation is proposed to follow up on the mental representation of the sociative.

    BEA – A multifunctional Hungarian spoken language database

    In diverse areas of linguistics, the demand for studying actual language use is increasing. The aim of developing a phonetically based multi-purpose database of Hungarian spontaneous speech, dubbed BEA, is to accumulate a large amount of spontaneous speech of various types together with sentence repetition and reading. At present, the recorded material of BEA amounts to 260 hours produced by 280 present-day Budapest speakers (ages between 20 and 90, 168 females and 112 males), and it also provides annotated materials for various types of research and practical applications.

    HuSpaCy : an industrial-strength Hungarian natural language processing toolkit

    Although there are a couple of open-source language processing pipelines available for Hungarian, none of them satisfies the requirements of today’s NLP applications. A language processing pipeline should consist of close to state-of-the-art lemmatization, morphosyntactic analysis, entity recognition and word embeddings. Industrial text processing applications have to satisfy non-functional software quality requirements; moreover, frameworks supporting multiple languages are increasingly favored. This paper introduces HuSpaCy, an industry-ready Hungarian language processing toolkit. The presented tool provides components for the most important basic linguistic analysis tasks. It is open-source and is available under a permissive license. Our system is built upon spaCy’s NLP components, resulting in an easily usable, fast yet accurate application. Experiments confirm that HuSpaCy has high accuracy while maintaining resource-efficient prediction capabilities.

    Infrastructure networks and the competitiveness of the economy

    This paper aims to examine how technical infrastructure networks may contribute to improving the competitiveness of the Hungarian economy. Consequently, our main question will be to establish how certain networks or sectors can promote competitiveness of the entire economy rather than how they could be more competitive in their own field. In the macroeconomic or regional sense, competitiveness is interpreted as the entirety of safeguards and preconditions that provide a long-term basis for success in a competitive market environment. The review of the economic, social, institutional and facility preconditions of competitiveness has highlighted that practically every component must be backed by a good system of relations: both strong, balanced internal relations promoting co-operation and external relations to assure outward linkages. Despite the above correlation, it would be a fallacy to assume that infrastructure networks as linking elements in general are factors per se improving competitiveness. Depending on the level of development of the economy, the key forms of activity and the realistically attainable objectives, different linkages and service needs become key for the development of the economy in different stages.

    XVIII. Magyar Számítógépes Nyelvészeti Konferencia


    Lightweight diacritics restoration for V4 languages

    Diacritics restoration has become a ubiquitous task in the Latin-alphabet-based, English-dominated Internet language environment. In this article, we describe a small-footprint 1D convolution-based approach, which works at the character level. The model even runs locally in a web browser, and surpasses the performance of similarly sized models. We evaluate our model on the languages of the Visegrád Group, with emphasis on Hungarian.

    Towards abstractive summarization in Hungarian

    We publish an abstractive summarizer for Hungarian, an encoder-decoder model initialized with huBERT and fine-tuned on the ELTE.DH corpus of former Hungarian news portals. The model produces fluent output on the correct topic, but it hallucinates frequently. Our quantitative evaluation on automatic and human transcripts of news (with automatic and human-made punctuation) shows that the model is robust with respect to errors in either automatic speech recognition or automatic punctuation restoration.

    Towards a new monolingual Hungarian explanatory dictionary: an overview of Hungarian explanatory dictionaries

    The Lexical Knowledge Representation Research Group at the Department of Lexicology is one of the youngest research groups of the Hungarian Research Centre for Linguistics, founded in February 2020. The group is currently working on a new version of a monolingual explanatory dictionary partly based on The Explanatory Dictionary of the Hungarian Language. The aim is to compile an up-to-date online dictionary of contemporary Hungarian (2001–2020) by corpus-driven methods. The present article describes The Explanatory Dictionary of the Hungarian Language and the Comprehensive Dictionary of Hungarian by presenting their history, the circumstances of their compilation, and the basic editorial guidelines. Then it outlines how the corpus for the planned dictionary is to be set up and how this corpus is to be analysed.