61 research outputs found
Towards a Frame Semantics Lexical Resource for Greek
Proceedings of the Sixth International Workshop on Treebanks and
Linguistic Theories.
Editors: Koenraad De Smedt, Jan Hajič and Sandra Kübler.
NEALT Proceedings Series, Vol. 1 (2007), 55-59.
© 2007 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/4476
Prema novom jednojezičnom mađarskom objasnidbenom rječniku: pregled mađarskih objasnidbenih rječnika
The Lexical Knowledge Representation Research Group at the Department of Lexicology is one of the youngest research groups of the Hungarian Research Centre for Linguistics, founded in February 2020. The group is currently working on a new version of a monolingual explanatory dictionary partly based on The Explanatory Dictionary of the Hungarian Language. The aim is to compile an up-to-date online dictionary of contemporary Hungarian (2001–2020) by corpus-driven methods.
The present article describes The Explanatory Dictionary of the Hungarian Language and the Comprehensive Dictionary of Hungarian by presenting their history, the circumstances of their compilation, and the basic editorial guidelines. Then it outlines how the corpus for the planned dictionary is to be set up and how this corpus is to be analysed.Istraživačka skupina za prikaz leksičkog znanja jedna je od najmlađih istraživačkih skupina Mađarskog istraživačkog centra za lingvistiku, osnovana u veljači 2020. Skupina trenutno radi na novoj inačici jednojezičnoga objasnidbenog rječnika proizišloga iz Objasnidbenoga rječnika mađarskog jezika. Cilj joj je kompilirati moderan i ažuriran mrežni rječnik mađarskog jezika (2001–2020) koristeći se korpusom vođenim metodama. Članak opisuje Objasnidbeni rječnik mađarskog jezika i Velikog rječnika mađarskog jezika predstavljanjem njihove povijesti, okolnosti u kojima su kompilirani, te osnovnih uredničkih načela. Potom skicira kako će se organizirati i analizirati korpus planiranoga rječnika
Syntaxe computationnelle du hongrois : de l'analyse en chunks à la sous-catégorisation verbale
We present the creation of two resources for Hungarian NLP applications: a rule-based shallow parser and a database of verbal subcategorization frames. Hungarian, as a non-configurational language with a rich morphology, presents specific challenges for NLP at the level of morphological and syntactic processing. While efficient and precise morphological analyzers are already available, Hungarian is under-resourced with respect to syntactic analysis. Our work aimed at overcoming this problem by providing resources for syntactic processing. Hungarian language is characterized by a rich morphology and a non-configurational encoding of grammatical functions. These features imply that the syntactic processing of Hungarian has to rely on morphological features rather than on constituent order. The broader interest of our undertaking is to propose representations and methods that are adapted to these specific characteristics, and at the same time are in line with state of the art research methodologies. More concretely, we attempt to adapt current results in argument realization and lexical semantics to the task of labeling sentence constituents according to their syntactic function and semantic role in Hungarian. Syntax and semantics are not completely independent modules in linguistic analysis and language processing: it has been known for decades that semantic properties of words affect their syntactic distribution. Within the syntax-semantics interface, the field of argument realization deals with the (partial or complete) prediction of verbal subcategorization from semantic properties. Research on verbal lexical semantics and semantically motivated mapping has been concentrating on predicting the syntactic realization of arguments, taking for granted (either explicitly or implicitly) that the distinction between arguments and adjuncts is known, and that adjuncts' syntactic realization is governed by productive syntactic rules, not lexical properties. However, besides the correlation between verbal aspect or actionsart and time adverbs (e.g. Vendler, 1967 or Kiefer, 1992 for Hungarian), the distribution of adjuncts among verbs or verb classes did not receive significant attention, especially within the lexical semantics framework. We claim that contrary to the widely shared presumption, adjuncts are often not fully productive. We therefore propose a gradual notion of productivity, defined in relation to Levin-type lexical semantic verb classes (Levin, 1993; Levin and Rappaport-Hovav, 2005). The definition we propose for the argument-adjunct dichotomy is based on evidence from Hungarian and exploits the idea that lexical semantics not only influences complement structure but is the key to the argument-adjunct distinction and the realization of adjunctsLa linguistique informatique est un domaine de recherche qui se concentre sur les méthodes et les perspectives de la modélisation formelle (statistique ou symbolique) de la langue naturelle. La linguistique informatique, tout comme la linguistique théorique, est une discipline fortement modulaire : les niveaux d'analyse linguistique comprennent la segmentation, l'analyse morphologique, la désambiguïsation, l'analyse syntaxique et sémantique. Tandis qu'un nombre d'outils existent déjà pour les traitements de bas niveau (analyse morphologique, étiquetage grammatical), le hongrois peut être considéré comme une langue peu doté pour l'analyse syntaxique et sémantique. Le travail décrit dans la présente thèse vise à combler ce manque en créant des ressources pour le traitement syntaxique du hongrois : notamment, un analyseur en chunks et une base de données lexicale de schémas de sous-catégorisation verbale. La première partie de la recherche présentée ici se concentre sur la création d'un analyseur syntaxique de surface (ou analyseur en chunks) pour le hongrois. La sortie de l'analyseur de surface est conçue pour servir d'entrée pour un traitement ultérieur visant à annoter les relations de dépendance entre le prédicat et ses compléments essentiels et circonstanciels. L'analyseur profond est mis en œuvre dans NooJ (Silberztein, 2004) en tant qu'une cascade de grammaires. Le deuxième objectif de recherche était de proposer une représentation lexicale pour la structure argumentale en hongrois. Cette représentation doit pouvoir gérer la vaste gamme de phénomènes qui échappent à la dichotomie traditionnelle entre un complément essentiel et un circonstanciel (p. ex. des structures partiellement productives, des écarts entre la prédictibilité syntaxique et sémantique). Nous avons eu recours à des résultats de la recherche récente sur la réalisation d'arguments et choisi un cadre qui répond à nos critères et qui est adaptable à une langue non-configurationnelle. Nous avons utilisé la classification sémantique de Levin (1993) comme modèle. Nous avons adapté les notions relatives à cette classification, à savoir celle de la composante sémantique et celle de l'alternance syntaxique, ainsi que la méthodologie d'explorer et de décrire le comportement des prédicats à l'aide de cette représentation, à la tâche de construire une représentation lexicale des verbes dans une langue non-configurationnelle. La première étape consistait à définir les règles de codage et de construire un vaste base de données lexicale pour les verbes et leurs compléments. Par la suite, nous avons entrepris deux expériences pour l'enrichissement de ce lexique avec des informations sémantiques lexicales afin de formaliser des généralisations syntaxiques et sémantiques pertinentes sur les classes de prédicats sous-jacentes. La première approche que nous avons testée consistait en une élaboration manuelle de classification de verbes en fonction de leur structure de compléments et de l'attribution de rôles sémantiques à ces compléments. Nous avons cherché la réponse aux questions suivantes: quelles sont les composants sémantiques pertinents pour définir une classification sémantique des prédicats hongrois? Quelles sont les implications syntaxiques spécifiques à ces classes? Et, plus généralement, quelle est la nature des alternances spécifiques aux classes verbales en hongrois ? Dans la phase finale de la recherche, nous avons étudié le potentiel de l'acquisition automatique pour extraire des classes de verbes à partir de corpus. Nous avons effectué une classification non supervisée, basée sur des données distributionnelles, pour obtenir une classification sémantique pertinente des verbes hongrois. Nous avons également testé la méthode de classification non supervisée sur des données françaises
Cappadocian kinship
Cappadocian kinship systems are very interesting from a sociolinguistic and anthropological perspective because of the mixture of inherited Greek and borrowed Turkish kinship terms. Precisely because the number of Turkish kinship terms differs from one variety to another, it is necessary to talk about Cappadocian kinship systems in the plural rather than about the Cappadocian kinship system in the singular. Although reference will be made to other Cappadocian varieties, this paper will focus on the kinship systems of Mišotika and Aksenitika, the two Central Cappadocian dialects still spoken today in several communities in Greece. Particular attention will be given to the use of borrowed Turkish kinship terms, which sometimes seem to co-exist together with their inherited Greek counterparts, e.g. mána vs. néne ‘mother’, ailfó/aelfó vs. γardáš ‘brother’ etc. In the final part of the paper some kinship terms with obscure or hitherto unknown etymology will be discussed, e.g. káka ‘grandmother’, ižá ‘aunt’, lúva ‘uncle (father’s brother)’ etc
New directions in corpus-based translation studies
Corpus-based translation studies has become a major paradigm and research
methodology and has investigated a wide variety of topics in the last two
decades. The contributions to this volume add to the range of corpus-based
studies by providing examples of some less explored applications of corpus
analysis methods to translation research. They show that the area keeps
evolving as it constantly opens up to different frameworks and approaches,
from appraisal theory to process-oriented analysis, and encompasses multiple
translation settings, including (indirect) literary translation,
machine(-assisted) translation and the practical work of professional legal
translators. The studies included in the volume also expand the range of
application of corpus applications in terms of the tools used to accomplish
the research tasks outlined
New directions in corpus-based translation studies
Corpus-based translation studies has become a major paradigm and research methodology and has investigated a wide variety of topics in the last two decades. The contributions to this volume add to the range of corpus-based studies by providing examples of some less explored applications of corpus analysis methods to translation research. They show that the area keeps evolving as it constantly opens up to different frameworks and approaches, from appraisal theory to process-oriented analysis, and encompasses multiple translation settings, including (indirect) literary translation, machine(-assisted) translation and the practical work of professional legal translators. The studies included in the volume also expand the range of application of corpus applications in terms of the tools used to accomplish the research tasks outlined
New directions in corpus-based translation studies
Corpus-based translation studies has become a major paradigm and research methodology and has investigated a wide variety of topics in the last two decades. The contributions to this volume add to the range of corpus-based studies by providing examples of some less explored applications of corpus analysis methods to translation research. They show that the area keeps evolving as it constantly opens up to different frameworks and approaches, from appraisal theory to process-oriented analysis, and encompasses multiple translation settings, including (indirect) literary translation, machine(-assisted) translation and the practical work of professional legal translators. The studies included in the volume also expand the range of application of corpus applications in terms of the tools used to accomplish the research tasks outlined
- …