
1,329 research outputs found

Vorosweep: a fast generalized crystal growing Voronoi diagram generation algorithm

Author: Béchet Eric
Mouton Thibaud
Publication date: 01/01/2014

We propose a new algorithm for quickly generating approximate generalized Voronoi diagrams of point sites associated with an arbitrary convex distance metric in the Euclidean plane. To reduce the complexity of the diagram, the algorithm produces connected cells by emulating the growth of crystals starting at the point sites. The main practical contribution is the Vorosweep package, the reference implementation of the algorithm. Experimental results and benchmarks demonstrate the versatility of this approach. Funding: WIST 3 grant 1017074 DOMHEX (Dominant Hexahedral Mesh Generation)
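
To make the notion of a Voronoi diagram under a non-Euclidean convex distance concrete, here is a minimal brute-force sketch. It is not the Vorosweep algorithm and does not emulate crystal growth; it only labels each cell of a small grid with its nearest site. The function names, the grid discretization, and the choice of the Manhattan norm as the example convex distance are all assumptions made for illustration.

```python
import math


def manhattan(dx, dy):
    """L1 norm, used here as one example of a convex distance (assumption)."""
    return abs(dx) + abs(dy)


def euclidean(dx, dy):
    """Ordinary L2 norm, for comparison."""
    return math.hypot(dx, dy)


def label_grid(sites, width, height, dist):
    """Return grid[y][x] = index of the site nearest to cell (x, y) under dist.

    Brute force: every cell is compared against every site.
    Ties go to the site with the lowest index.
    """
    return [
        [
            min(range(len(sites)),
                key=lambda k: dist(x - sites[k][0], y - sites[k][1]))
            for x in range(width)
        ]
        for y in range(height)
    ]


sites = [(0, 0), (2, 3)]
grid_l1 = label_grid(sites, 5, 4, manhattan)
grid_l2 = label_grid(sites, 5, 4, euclidean)

# Cell (4, 0) belongs to a different site depending on the metric.
print(grid_l1[0][4], grid_l2[0][4])
```

The same sites yield different cell boundaries under the two distances, which is why the choice of metric matters for the resulting diagram.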

Open Repository and Bibliography - Liège

Benchmarking benchmarks: introducing new automatic indicators for benchmarking Spoken Language Understanding corpora

Author: Béchet Frédéric
Raymond Christian
Publication venue: HAL CCSD
Publication date: 14/09/2019

Empirical evaluation is nowadays the main paradigm in Natural Language Processing for assessing the relevance of a new machine-learning-based model. While large corpora are available for tasks such as Automatic Speech Recognition, this is not the case for other tasks such as Spoken Language Understanding (SLU), which consists in translating spoken transcriptions into a formal representation often based on semantic frames. Corpora such as ATIS or SNIPS are widely used to compare systems; however, differences in performance among systems are often very small, not statistically significant, and can be produced by biases in the data collection or the annotation scheme, as we showed on the ATIS corpus ("Is ATIS too shallow?", IS2018). We propose in this study a new methodology for assessing the relevance of an SLU corpus. We claim that taking into account only system performance does not provide enough insight into what is covered by current state-of-the-art models and what is left to be done. We apply our methodology on a set of 4 SLU systems and 5 benchmark corpora (ATIS, SNIPS, M2M, MEDIA) and automatically produce several indicators assessing the relevance (or not) of each corpus for benchmarking SLU models

Crossref

HAL AMU

Fouille de texte : une approche séquentielle pour découvrir des relations spatiales (Text mining: a sequential approach to discovering spatial relations)

Author: Alatrista Salas Hugo
Béchet Nicolas
Publication venue: HAL CCSD
Publication date: 01/01/2014

In this article, we present the first steps of a text data mining project. More precisely, we apply a sequential pattern mining algorithm under multiple constraints in order to identify relations between spatial entities. The first results show both the value of this approach and its limits. We lay out the initial groundwork for more ambitious work whose goal is to provide crucial information to complement the analysis of satellite images

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

HAL-CIRAD

HAL-Rennes 1

Learnability of Pregroup Grammars

Author: Béchet Denis
Foret Annie
Tellier Isabelle
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

This paper investigates the learnability of Pregroup Grammars from positive examples in the sense of Gold. In the first part, Pregroup Grammars are presented and a new parsing strategy is proposed. Then, theoretical learnability and non-learnability results for subclasses of Pregroup Grammars are proved. In the last two parts, we focus on learning Pregroup Grammars from a special kind of input called feature-tagged examples. A learning algorithm based on the parsing strategy presented in the first part is given. Its validity is proved and its properties are exemplified

HAL-CentraleSupelec

HAL - Lille 3

CiteSeerX

INRIA a CCSD electronic archive server

HAL-Rennes 1

The effect of Time Scales in Photosynthesis on microalgae Productivity

Author: Bernard Olivier
Béchet Quentin
Hartmann Philipp
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Microalgae are often seen as a potential biofuel producer. In order to predict achievable productivities in the so-called raceway culturing system, the dynamics of photosynthesis have to be taken into account. In particular, the dynamical effect of inhibition by an excess of light (photoinhibition) must be represented. We propose a model considering both photosynthesis and growth dynamics. This model involves three different time scales. We study the response of this model to fluctuating light with different frequencies using slow/fast approximations. We thereby identify three different regimes for which a simplified expression for the model can be derived. These expressions give a hint of the productivity improvement that can be expected from stimulating photosynthesis with faster hydrodynamics

INRIA a CCSD electronic archive server

HAL-INSU

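Coopération de méthodes statistiques et symboliques pour l'adaptation non-supervisée d'un système d'étiquetage en entités nommées (Cooperation of statistical and symbolic methods for the unsupervised adaptation of a named entity tagging system)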

Author: Béchet Frédéric
Sagot Benoît
Stern Rosa
Publication venue: HAL CCSD
Publication date: 27/06/2011

Named entity recognition and typing are achieved by both symbolic and probabilistic systems. We report on an experiment in which the rule-based system NP, a high-precision system developed on AFP news corpora that relies on the Aleda named entity database, interacts with LIANE, a high-recall probabilistic system trained on oral transcriptions from the ESTER corpus. We show that a probabilistic system such as LIANE can be adapted to a new type of corpus in an unsupervised way thanks to large-scale corpora automatically annotated by NP. This adaptation does not require any additional manual annotation and illustrates the complementarity between numeric and symbolic techniques for tackling linguistic tasks

HAL AMU

INRIA a CCSD electronic archive server

Hal-Diderot

Comparing Sanskrit Texts for Critical Editions: the sequences move problem

Author: Béchet Nicolas
Csernel Marc
Le Pouliquen Marc
Publication venue: HAL CCSD
Publication date: 01/01/2012

A critical edition takes into account various versions of the same text in order to show the differences between two distinct versions, in terms of words that are missing, changed, omitted or displaced. Traditionally, Sanskrit is written without spaces between words, and the word order can be changed without altering the meaning of a sentence. This paper describes the characteristics that make Sanskrit text comparison a specific problem. It presents two different methods for comparing Sanskrit texts, which can be used to develop a computer-assisted critical edition. The first method uses the longest common subsequence (L.C.S.), while the second uses the global alignment algorithm. Comparing them, we see that the second method provides better results, but that neither method can detect when a word or a sentence fragment has been moved. We then present a method based on N-grams that can detect such a movement when the fragment is not too far from its original location. We show how the method behaves on several examples and consider possible future developments
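
The limitation noted in this abstract, that an L.C.S.-based comparison cannot recognize a moved word, can be illustrated with a minimal sketch. This is a textbook longest-common-subsequence diff over word lists, not the authors' implementation; the function names and the placeholder tokens are assumptions made for illustration.

```python
def lcs_table(a, b):
    """table[i][j] = length of the longest common subsequence of a[i:] and b[j:]."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) - 1, -1, -1):
        for j in range(len(b) - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def diff_words(a, b):
    """Walk the table and emit keep/delete/insert operations turning a into b."""
    table = lcs_table(a, b)
    i = j = 0
    ops = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            ops.append(("keep", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            ops.append(("delete", a[i]))
            i += 1
        else:
            ops.append(("insert", b[j]))
            j += 1
    # Whatever remains on either side is deleted or inserted wholesale.
    ops.extend(("delete", w) for w in a[i:])
    ops.extend(("insert", w) for w in b[j:])
    return ops


# Placeholder tokens (hypothetical): the word "w1" moves from the start to the end.
version_a = ["w1", "w2", "w3", "w4"]
version_b = ["w2", "w3", "w4", "w1"]
for op in diff_words(version_a, version_b):
    print(op)
```

In the output, the moved word shows up as an unrelated deletion at the start and insertion at the end, with nothing linking the two; detecting that link is what a separate move-detection step has to add.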

HAL - Normandie Université

INRIA a CCSD electronic archive server

HAL-Université de Bretagne Occidentale

Label Pre-annotation for Building Non-projective Dependency Treebanks for French

Author: Boudin Florian
Béchet Denis
Lacroix Ophélie
Publication venue: HAL CCSD
Publication date: 06/04/2014

The current interest in accurate dependency parsing makes it necessary to build dependency treebanks for French containing both projective and non-projective dependencies. In order to lighten the annotator's work, we propose to automatically pre-annotate the sentences with the labels of the dependencies ending on the words. Selecting the dependency labels reduces the ambiguity of the parsing. We show that a maximum entropy Markov model method reaches the label accuracy score of a standard dependency parser (MaltParser). Moreover, this method makes it possible to find more than one label per word, i.e. the most probable ones, in order to improve the recall score. It improves the quality of the parsing step of the annotation process. Including the method in the annotation process therefore makes the work quicker and more natural for annotators

Adapting a FrameNet Semantic Parser for Spoken Language Understanding Using Adversarial Learning

Author: Béchet Frédéric
Damnati Geraldine
Marzinotto Gabriel
Publication venue: 'International Speech Communication Association'
Publication date: 15/09/2019

This paper presents a new semantic frame parsing model, based on Berkeley FrameNet, adapted to process spoken documents in order to perform information extraction from broadcast content. Building upon previous work that had shown the effectiveness of adversarial learning for domain generalization in the context of semantic parsing of encyclopedic written documents, we propose to extend this approach to elocutionary style generalization. The underlying question throughout this study is whether adversarial learning can be used to combine data from different sources and train models on a higher level of abstraction in order to increase their robustness to lexical and stylistic variations as well as automatic speech recognition errors. The proposed strategy is evaluated on a French corpus of encyclopedic written documents and a smaller corpus of radio podcast transcriptions, both annotated with a FrameNet paradigm. We show that adversarial learning increases the generalization capabilities of all models on both manual and automatic speech transcriptions as well as on encyclopedic data

arXiv.org e-Print Archive

HAL AMU