E-magyar -- A Digital Language Processing System

Author: Indig Balázs
Mittelholcz Iván
Novák Attila
Sass Bálint
Simon Eszter
Váradi Tamás
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2018

Hungarian Gyerekestül versus Gyerekkel (‘with [the] kid’)

Author: Fekete István
Publication venue: American Hungarian Educators' Association
Publication date: 01/01/2013

The paper analyzes the various uses of the Hungarian -stUl (‘together with’, ‘along with’) sociative (associative) suffix (later in the paper referred to simply as “sociative”), as in the example gyerekestül. Unlike the comitative-instrumental suffix -vAl (‘with’), the -stUl suffix cannot express instrumentality. The paper aims to demonstrate the difference in use between the comitative-instrumental -vAl and the -stUl suffix in contemporary Hungarian, and to illuminate the historical emergence of the suffix as well as its grammatical status. It is argued on the basis of Antal (1960) and Kiefer (2003) that -stUl cannot be analyzed as an inflectional case suffix (such as the -vAl suffix, or -ed, -ing, or the plural in English), but should rather be categorized as a derivational suffix (such as English dis-, re-, in-, -ance, -able, -ish, -like, etc.). The paper also tries to shed light on the hypothetical cognitive psychological distinction between the comitative and the sociative. It is suggested that the sociative is based on the amalgam image schema, which is derived from the LINK schema of the comitative. The ironic reading of the sociative is an implicature in the sense of Grice (1989) and Sperber and Wilson (1987). Psycholinguistic experimentation is proposed to follow up on the mental representation of the sociative.

Repository of the Academy's Library

BEA – A multifunctional Hungarian spoken language database

Author: Gósy Mária
Publication venue: International Society of Phonetic Sciences
Publication date: 01/01/2013

In diverse areas of linguistics, the demand for studying actual language use is on the increase. The aim of developing a phonetically based multi-purpose database of Hungarian spontaneous speech, dubbed BEA, is to accumulate a large amount of spontaneous speech of various types together with sentence repetition and reading. Presently, the recorded material of BEA amounts to 260 hours produced by 280 present-day Budapest speakers (ages between 20 and 90, 168 females and 112 males), and it also provides annotated materials for various types of research and practical applications.

Repository of the Academy's Library

Consolidated proposal for encoding the Old Hungarian script in the UCS

Author: Everson Michael
Szelp André Szabolcs
Publication venue: eScholarship, University of California
Publication date: 02/10/2012

This is a proposal to encode the Old Hungarian script in the international character encoding standard Unicode. The script was published in Unicode Standard version 8.0 in June 2015. The script was used to write the Hungarian language, and continues in limited use today. It is also known by the name Rovásírás, among other names.

eScholarship - University of California

HuSpaCy : an industrial-strength Hungarian natural language processing toolkit

Author: Berkecz Péter
Farkas Richárd
Orosz György
Szabó Gergő
Szántó Zsolt
Publication date: 01/01/2022

Although there are a couple of open-source language processing pipelines available for Hungarian, none of them satisfies the requirements of today’s NLP applications. A language processing pipeline should consist of close to state-of-the-art lemmatization, morphosyntactic analysis, entity recognition and word embeddings. Industrial text processing applications have to satisfy non-functional software quality requirements; moreover, frameworks supporting multiple languages are increasingly favored. This paper introduces HuSpaCy, an industry-ready Hungarian language processing toolkit. The presented tool provides components for the most important basic linguistic analysis tasks. It is open-source and is available under a permissive license. Our system is built upon spaCy’s NLP components, resulting in an easily usable, fast yet accurate application. Experiments confirm that HuSpaCy has high accuracy while maintaining resource-efficient prediction capabilities.

University of Szeged

Infrastructure networks and the competitiveness of the economy

Author: Fleischer Tamás
Publication venue: Ministry of Finance of Hungary
Publication date: 01/01/2003

This paper aims to examine how technical infrastructure networks may contribute to improving the competitiveness of the Hungarian economy. Consequently, our main question will be to establish how certain networks or sectors can promote the competitiveness of the entire economy rather than how they could be more competitive in their own field. In the macroeconomic or regional sense, competitiveness is interpreted as the entirety of safeguards and preconditions that provide a long-term basis for success in a competitive market environment. The review of the economic, social, institutional and facility preconditions of competitiveness has highlighted that practically every component must be backed by a good system of relations: both strong, balanced internal relations promoting co-operation and external relations to assure outward linkages. Despite the above correlation, it would be a fallacy to assume that infrastructure networks as linking elements in general are factors per se improving competitiveness. In accordance with the level of development of the economy, the key forms of activity and the realistically attainable objectives, different linkages and service needs become key for the development of the economy in different stages.

Munich RePEc Personal Archive

Repository of the Academy's Library

Policy Documentation Center

XVIII. Magyar Számítógépes Nyelvészeti Konferencia

Publication venue: Szegedi Tudományegyetem TTIK Informatikai Intézet
Publication date: 01/01/2022

University of Szeged

Lightweight diacritics restoration for V4 languages

Author: Csanády Bálint
Lukács András
Publication date: 01/01/2022

Diacritics restoration has become a ubiquitous task in the Latin-alphabet-based, English-dominated Internet language environment. In this article, we describe a small-footprint 1D convolution-based approach, which works at the character level. The model even runs locally in a web browser, and surpasses the performance of similarly sized models. We evaluate our model on the languages of the Visegrád Group, with emphasis on Hungarian.

University of Szeged

Towards abstractive summarization in Hungarian

Author: Indig Balázs
Makrai Márton
Szaszák György
Tündik Máté Ákos
Publication date: 01/01/2022

We publish an abstractive summarizer for Hungarian, an encoder-decoder model initialized with huBERT, and fine-tuned on the ELTE.DH corpus of former Hungarian news portals. The model produces fluent output on the correct topic, but it hallucinates frequently. Our quantitative evaluation on automatic and human transcripts of news (with automatic and human-made punctuation) shows that the model is robust with respect to errors in either automatic speech recognition or automatic punctuation restoration.

University of Szeged

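Towards a new monolingual Hungarian explanatory dictionary: an overview of Hungarian explanatory dictionaries (Prema novom jednojezičnom mađarskom objasnidbenom rječniku: pregled mađarskih objasnidbenih rječnika)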

Author: Lipp Veronika
László Simon
Publication venue: Leksikografski zavod Miroslav Krleža
Publication date: 01/01/2021

The Lexical Knowledge Representation Research Group at the Department of Lexicology is one of the youngest research groups of the Hungarian Research Centre for Linguistics, founded in February 2020. The group is currently working on a new version of a monolingual explanatory dictionary partly based on The Explanatory Dictionary of the Hungarian Language. The aim is to compile an up-to-date online dictionary of contemporary Hungarian (2001–2020) by corpus-driven methods. The present article describes The Explanatory Dictionary of the Hungarian Language and the Comprehensive Dictionary of Hungarian by presenting their history, the circumstances of their compilation, and the basic editorial guidelines. Then it outlines how the corpus for the planned dictionary is to be set up and how this corpus is to be analysed.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia