    Simple_PLUS: a network of lexical semantic relations

    The present article deals with the Italian lexical-semantic database Simple_PLUS and focuses on its essential core, i.e. the network of lexical semantic relations. This lexical resource builds on Parole-Simple-Clips, a four-layered electronic lexicon of Italian founded on the SIMPLE model. Simple_PLUS consists of 30,000 semantic entries, partly imported from the source lexicon and partly newly created, all of which encode the wide-ranging set of information provided by the underlying model. In Simple_PLUS, this semantic representation has been enriched with significant relational information in a largely automated, inexpensive process. More than 5,000 relationships between events and their participants, and among co-participants in events, have been encoded with descriptive vocabulary borrowed from the EuroWordNet lexical model; these links could not previously be captured for lack of suitable representational means. Such conceptual links, which enhance the predicative representation in the lexicon, provide crucial lexical knowledge for NLP systems and for the Semantic Web.

    Modification of Qualia Structures in the Use of Relative Adjectives in Lithuanian, Russian, and German

    In Russian, Lithuanian, and German, relative adjectives are direct explicators of nominals (nouns), which play a key role in meaning structures. This article comparatively investigates the semantic representation of a noun in an attributive group with relative adjectives, using the Qualia structure and its modifications. The 150 most frequently used relative adjectives in an electronic corpus of written Russian were selected for analysis and compared with Lithuanian and German examples. Relative adjectives are classified in terms of Qualia structures and are considered to express objective constitutive properties (material and origin), formal properties (physical parameters, colour, time, and place), and telic properties. Other correlating linguistic units, namely genitive attributes and compounds, are also analysed in describing the expected realizations of Qualia structures in the attributive group.

    Selective Sampling for Example-based Word Sense Disambiguation

    This paper proposes an efficient example sampling method for example-based word sense disambiguation systems. Constructing a database of practical size requires considerable manual sense disambiguation (overhead for supervision). In addition, the time complexity of searching a large database poses a considerable problem (overhead for search). To counter these problems, our method selectively samples a smaller, effective subset from a given example set for use in word sense disambiguation. The method relies on the notion of training utility: the degree to which each example is informative for future example sampling when used to train the system. The system progressively collects examples by selecting those with the greatest utility. The paper reports the effectiveness of our method through experiments on about one thousand sentences. Compared with other example sampling methods, our method reduced both the overhead for supervision and the overhead for search without degrading system performance. Comment: 25 pages, 14 Postscript figures
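
The sampling loop described in this abstract can be illustrated with a minimal sketch. The abstract does not define how training utility is computed, so everything below is an assumption made for illustration: the names `jaccard`, `utility`, and `selective_sample` are hypothetical, examples are modelled as tuples of context words, and utility is replaced by a simple stand-in (how much a candidate would improve the similarity coverage of the remaining unlabelled examples). It is not the authors' measure.

```python
def jaccard(a, b):
    """Word-overlap similarity between two examples (illustrative stand-in)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def utility(candidate, unlabeled, labeled):
    """Hypothetical utility: total gain in similarity coverage that the
    candidate would give the other unlabelled examples if it were labelled."""
    total = 0.0
    for x in unlabeled:
        if x is candidate:
            continue
        # Best similarity x already has to any labelled example.
        covered = max((jaccard(x, s) for s in labeled), default=0.0)
        total += max(0.0, jaccard(x, candidate) - covered)
    return total


def selective_sample(pool, budget):
    """Greedily pick up to `budget` examples with the highest utility."""
    unlabeled = list(pool)
    labeled = []
    while unlabeled and len(labeled) < budget:
        best = max(unlabeled, key=lambda c: utility(c, unlabeled, labeled))
        unlabeled.remove(best)
        labeled.append(best)  # in a real system, a human would tag the sense here
    return labeled


if __name__ == "__main__":
    pool = [
        ("bank", "river", "water"),
        ("bank", "river", "shore"),
        ("bank", "money", "loan"),
        ("bank", "money", "account"),
        ("bank", "river", "water", "shore"),
    ]
    for example in selective_sample(pool, budget=2):
        print(example)
```

Because utility is recomputed after every selection, an example that resembles one already chosen scores low. This keeps the labelled set small, which is the point of reducing both supervision and search overhead.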

    Mapping Events and Abstract Entities from PAROLE-SIMPLE-CLIPS to ItalWordNet

    In the last few years, owing to the increasing importance of the web, both computational tools and resources have needed to become more visible and more easily accessible to a vast community of scholars, students and researchers. Furthermore, high-quality lexical resources are crucially required for a wide range of HLT-NLP applications, including word sense disambiguation. Vast and consistent electronic lexical resources do exist, and they can be further enhanced and enriched through linking and integration. An ILC project that links two large lexical semantic resources for the Italian language, namely ItalWordNet and PAROLE-SIMPLE-CLIPS, fits this trend. Concrete entities have already been linked, and this paper addresses the semi-automatic mapping of events and abstract entities. The lexical models of the two resources, the mapping strategy and the tool implemented for this purpose are briefly outlined. Special focus is put on the results of the linking process: figures are reported and examples are given that illustrate both the linking and harmonization of the resources and cases of discrepancy, mainly due to the different underlying semantic models.

    Simple_PLUS: a network of lexical semantic relations

    Linking and Integrating two Electronic Lexicons

    In lexicography, much attention is being paid, when building lexical resources, to their interoperability and their easy integration into HLT-NLP applications for enhanced performance. For existing computational lexicons, on the other hand, integration and interoperability are attainable provided their main features offer a basis for comparison. The two largest and most extensively encoded electronic lexicons of the Italian language fulfill this essential requirement. Although developed according to two different lexical models, ItalWordNet and PAROLE-SIMPLE-CLIPS in fact have many compatible aspects. Linking and eventually merging these lexical resources in a common representation framework therefore seems a wise move, offering the end user more exhaustive and in-depth lexical information that combines the potential and most outstanding features of the two lexical models. This paper reports on the ongoing linking of the two lexicons. The mapping of the ontologies on which the lexicons are structured is described, and an overview is provided of the adopted methodology, the linking process and the results of the first mapping phase, which concerns 1stOrder Entities. Reciprocal benefits and enhancements for the two resources are also illustrated, which justify the soundness of our linking initiative.

    Scientific Terminology in Technical Texts: Linguodidactic Aspect

    In modern didactics, there is a theory of the competence-based approach to learning and to teaching academic disciplines. The purpose of this article is to consider how linguistic and professional competence is formed in students when the terminological vocabulary of a non-native language is taught using texts from a technical specialty. The communicative method, proclaimed as the main method, does not solve the problems of teaching the scientific style of a non-native language. Therefore, several approaches should be combined in teaching professional Russian: the systemic-structural approach, aimed at assimilating the norms of the language; the linguodidactic approach, which supplies methodological techniques; the communicative approach, which involves understanding the context and situation; the competence-based approach, which connects professional and linguistic knowledge; and others. Such multidimensional work with terminology helps cover a large number of issues aimed at a detailed understanding of the scientific text and of the role of terms in it. The formal organization of the scientific text is considered a surface level that makes the deep semantic level explicit. To address the didactic, linguistic, and pragmatic objectives, a text and exercises are presented in which general scientific and technical terms are used, and students are asked to draw up logical diagrams to test their understanding of the professionally oriented text, its vocabulary, and its grammatical forms. Oral speech is developed through retelling the text and through conversation on professional topics. Terminological work draws on the native language and takes into account the terminological field, which is the text itself.