773 research outputs found

Building diachronical reference corpora for the French language

Author: Lavrentiev Alexei
Publication venue: SPBGU
Publication date: 25/06/2013

This paper deals with the problems of creating diachronic reference corpora of the French language and with the solutions proposed by several interconnected projects: the Representative Corpus of the First French Texts (CoRPTeF), the Big Historical Grammar of French (GGHF), and the Evolution of the French Prepositional System (PRESTO). Many of these problems are relevant to a reference corpus project for any language that covers a broad diachronic span.

HAL-ENS-LYON

HAL

A Tag For Punctuation

Author: Lavrentiev Alexei
Publication venue: HAL CCSD
Publication date: 06/11/2008

In this paper I will argue that it may be useful to introduce a special tag (e.g. ) for punctuation marks, which could join the TEI Analysis module along with , , and other "segLike" elements. I will first discuss the reasons why punctuation marks may need tagging, and then consider the TEI tags that might be used for that purpose. None of them appears to be perfect for this job. After discussing linguistic properties of punctuation marks, I will propose a tentative formal definition for the element

HAL-ENS-LYON

HAL

Développement de la Base de français médiéval : qualité philologique, ouverture et outillage textométrique

Author: Lavrentiev Alexei
Publication venue: HAL CCSD
Publication date: 16/09/2015

This paper presents three key aspects of the development of the Base de Français Médiéval (BFM) corpus of Old French texts: the quality of the data for research, an open distribution policy for texts and software, and the improvement of tools for reading, searching, and analyzing the texts of the corpus.

HAL-ENS-LYON

HAL

Magnetization patterns in ferromagnetic nano-elements as functions of complex variable

Author: A. Aharoni
A. A. Belavin
Konstantin L. Metlov
M. A. Lavrentiev
Publication venue: American Physical Society (APS)
Publication date: 27/07/2010

Assuming a certain hierarchy of soft ferromagnet energy terms, which is realized in sufficiently small flat nano-elements, makes it possible to obtain explicit expressions for their magnetization distributions. By minimizing the energy terms sequentially, from the most to the least important, the magnetization distributions are expressed as solutions of a Riemann-Hilbert boundary value problem for a function of a complex variable. A number of free parameters, corresponding to the positions of vortices and anti-vortices, still remain in the expression. These parameters can be found by computing and minimizing the total magnetic energy of the particle with no approximations. The presented approach is thus a factory of realistic Ritz functions for analytical micromagnetic calculations. These functions are so versatile that they may even find applications on their own (e.g. for fitting magnetic microscopy images). Examples are given for multi-vortex magnetization distributions in a circular cylinder and for two-dimensional domain walls in thin magnetic strips.

arXiv.org e-Print Archive

Crossref

From the Holy Grail to the Good Health: a Digital Edition of a 15th Century French Medical Treatise on the BFM Web Portal

Author: Lavrentiev Alexei
Markova Elena
Publication venue: Cyrillo-Methodian Research Center and Izhevsk University
Publication date: 15/09/2014

This paper presents a project for a digital edition of the medical treatise L'enseignement ou la manière de garder et conserver la santé (Treatise on the preservation of health), a Middle French translation of the Latin work Libellus de sanitate conservanda (1459) by Guido Parato. The edition is based on the manuscript St. Petersburg, Russian National Library, Fr.Q.v.VI.1. The methodological principles and technological solutions used in the edition were developed in the "Quest of the Holy Grail" digital edition project, and the edition is intended to join the future digital library collection of the Base de Français Médiéval (BFM, http://txm.bfm-corpus.org).

HAL-ENS-LYON

HAL

The TXM Portal Software giving access to Old French Manuscripts Online

Author: Heiden Serge
Lavrentiev Alexei
Publication venue: HAL CCSD
Publication date: 21/05/2012

Full text online: http://www.lrec-conf.org/proceedings/lrec2012/workshops/13.ProceedingsCultHeritage.pdf

This paper presents the new TXM software platform, which gives online access to images of Old French manuscripts and to tagged transcriptions for concordancing and text mining. The platform can import medieval sources encoded in XML according to the TEI Guidelines, with manuscript images linked to transcriptions and several diplomatic levels of transcription encoded, including abbreviations and word-level corrections. It includes a sophisticated tokenizer that can handle TEI tags at different levels of the linguistic hierarchy. Words are tagged on the fly during import using the IMS TreeTagger tool with a specific language model. Synoptic editions displaying manuscript images and text transcriptions side by side are produced automatically during import. Texts are organized in a corpus with their own metadata (title, author, date, genre, etc.), and several word property indexes are built for the CQP search engine, so that word patterns can be searched efficiently to build different types of frequency lists or concordances. For syntactically annotated texts, special indexes are built for the Tiger Search engine to allow efficient building of syntactic concordances. The platform has also been tested on Classical Latin, Ancient Greek, Old Slavonic, and Old Hieroglyphic Egyptian corpora (including various types of encoding and annotation).

HAL-ENS-LYON

Метод структурных схем компьютерного морфологического анализа словоформ естественного языка

Author: Chepovskiy Andrey
Egorova Elena
Lavrentiev Alexei
Publication venue: Открытые системы
Publication date: 20/02/2015

In this paper, a computational model is proposed for the morphological analysis of languages in which word formation is based on affixation. The main idea consists in defining structural patterns of words and corresponding lists of suffixes. First, a detailed description is given of a stemming algorithm, its modification, and the technique for determining the grammatical characteristics of word forms. The next part of the work focuses on applying the proposed algorithms to the French language. Finally, some results of running these algorithms are provided. (http://mech.math.msu.su/~fpm/ps/k14/k143/k14303.pdf)

HAL-ENS-LYON