773 research outputs found

    Building diachronical reference corpora for the French language

    This paper addresses the problems that arise in creating diachronic reference corpora of the French language and the solutions proposed by several interconnected projects: the Representative Corpus of the First French Texts (CoRPTeF), the Big Historical Grammar of French (GGHF) and the Evolution of the French Prepositional System (PRESTO). Many of these problems are relevant to any reference corpus project that covers a broad diachronic span, whatever the language.

    A Tag For Punctuation

    In this paper I will argue that it may be useful to introduce a special tag (e.g. ) for punctuation marks, which could join the TEI Analysis module along with , , and other "segLike" elements. I will first discuss the reasons why punctuation marks may need tagging, and then consider the TEI tags that might be used for that purpose. None of them appears to be perfect for this job. After discussing linguistic properties of punctuation marks, I will propose a tentative formal definition for the element

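    Développement de la Base de français médiéval : qualité philologique, ouverture et outillage textométrique [Development of the Base de français médiéval: philological quality, openness and textometric tooling]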

    This paper presents three key aspects of the development of the Base de Français Médiéval (BFM) Old French text corpus: the quality of the data, an open source policy for texts and software, and the improvement of tools for reading, searching and analyzing the texts of the corpus.

    Magnetization patterns in ferromagnetic nano-elements as functions of complex variable

    Assuming a certain hierarchy of the energy terms of a soft ferromagnet, which is realized in sufficiently small flat nano-elements, makes it possible to obtain explicit expressions for their magnetization distributions. By minimizing the energy terms sequentially, from the most to the least important, the magnetization distributions are expressed as solutions of a Riemann-Hilbert boundary value problem for a function of a complex variable. A number of free parameters, corresponding to the positions of vortices and anti-vortices, still remain in the expression. These parameters can be found by computing and minimizing the total magnetic energy of the particle with no approximations. The presented approach is thus a factory of realistic Ritz functions for analytical micromagnetic calculations. These functions are so versatile that they may even find applications on their own (e.g. for fitting magnetic microscopy images). Examples are given for multi-vortex magnetization distributions in a circular cylinder and for two-dimensional domain walls in thin magnetic strips.

    From the Holy Grail to the Good Health: a Digital Edition of a 15th Century French Medical Treatise on the BFM Web Portal

    This paper presents a project for a digital edition of the medical treatise L'enseignement ou la manière de garder et conserver la santé (Treatise on the preservation of health), a Middle French translation of the Latin work Libellus de sanitate conservanda (1459) by Guido Parato. The edition is based on the manuscript St. Petersburg, Russian National Library, Fr.Q.v.VI.1. The methodological principles and technological solutions used in the edition were developed in the "Quest of the Holy Grail" digital edition project and have been adopted for the future digital library collection of the Base de Français Médiéval (BFM, http://txm.bfm-corpus.org).

    The TXM Portal Software giving access to Old French Manuscripts Online

    This paper presents the new TXM software platform, which gives online access to images of Old French text manuscripts and to tagged transcriptions for concordancing and text mining. The platform can import medieval sources encoded in XML according to the TEI Guidelines, link manuscript images to transcriptions, and encode several diplomatic levels of transcription, including abbreviations and word-level corrections. It includes a sophisticated tokenizer able to deal with TEI tags at different levels of the linguistic hierarchy. Words are tagged on the fly during the import process using the IMS TreeTagger tool with a specific language model. Synoptic editions displaying manuscript images and text transcriptions side by side are produced automatically during the import process. Texts are organized in a corpus with their own metadata (title, author, date, genre, etc.), and several word property indexes are produced for the CQP search engine to allow efficient searches for word patterns and to build different types of frequency lists or concordances. For syntactically annotated texts, special indexes are produced for the TIGERSearch engine to allow efficient building of syntactic concordances. The platform has also been tested on classical Latin, ancient Greek, Old Slavonic and Old Hieroglyphic Egyptian corpora (including various types of encoding and annotations). Full text online: http://www.lrec-conf.org/proceedings/lrec2012/workshops/13.ProceedingsCultHeritage.pdf

    Метод структурных схем компьютерного морфологического анализа словоформ естественного языка [The structural pattern method for computer morphological analysis of natural-language word forms]

    This paper proposes a computational model for the morphological analysis of languages whose word formation is based on affixation. The main idea is to define structural patterns of words and the corresponding lists of suffixes. First, a detailed description is given of a stemming algorithm, its modification, and the technique for determining the grammatical characteristics of word forms. The next part of the work focuses on applying the proposed algorithms to the French language. Finally, some results of running these algorithms are provided. http://mech.math.msu.su/~fpm/ps/k14/k143/k14303.pdf