Curriculum Learning for Handwritten Text Line Recognition
Recurrent Neural Networks (RNNs) have recently achieved the best performance
in off-line handwritten text recognition. At the same time, training RNNs by
gradient descent leads to slow convergence, and training times are particularly
long when the training database consists of full lines of text. In this paper,
we propose an easy way to accelerate stochastic gradient descent in this
setup, and in the general context of learning to recognize sequences. The
principle is called Curriculum Learning, or shaping. The idea is to first learn
to recognize short sequences before training on all available training
sequences. Experiments on three different handwritten text databases (Rimes,
IAM, OpenHaRT) show that a simple implementation of this strategy can
significantly speed up the training of RNNs for text recognition, and even
significantly improve recognition performance in some cases.
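The length-based curriculum described above can be sketched in a few lines: sort the training samples by target-sequence length and release them to the optimizer in growing stages, so early epochs see only short sequences. This is an illustrative sketch, not the paper's implementation; the function name and the uniform stage schedule are assumptions.

```python
def curriculum_schedule(samples, num_stages):
    """Order (input, target) pairs by target length and build training stages.

    Stage k (1-based) contains the shortest k/num_stages fraction of the
    data, so the final stage covers the whole training set. A hypothetical
    training loop would run a few epochs on each stage in order.
    """
    # Shortest target sequences first: these are the "easy" examples.
    ordered = sorted(samples, key=lambda pair: len(pair[1]))
    stages = []
    for k in range(1, num_stages + 1):
        # Release a growing prefix of the length-sorted data at each stage.
        cutoff = max(1, round(k * len(ordered) / num_stages))
        stages.append(ordered[:cutoff])
    return stages
```

For example, with four line images whose transcripts have 1, 2, 3, and 4 characters and `num_stages=2`, the first stage holds the two shortest transcripts and the second stage holds all four.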
The Claire French Dialogue Dataset
We present the Claire French Dialogue Dataset (CFDD), a resource created by
members of LINAGORA Labs in the context of the OpenLLM France initiative. CFDD
is a corpus containing roughly 160 million words from transcripts and stage
plays in French that we have assembled and publicly released in an effort to
further the development of multilingual, open source language models. This
paper describes the 24 individual corpora of which CFDD is composed and
provides links and citations to their original sources. It also provides our
proposed breakdown of the full CFDD dataset into eight categories of subcorpora
and describes the process we followed to standardize the format of the final
dataset. We conclude with a discussion of similar work and future directions.
A Novel Strategy for Speaker Verification based on SVM Classification of Pairs of Speech Sequences
Curriculum Learning for Handwritten Text Line Recognition (preprint, arXiv:1312.1737)
Pair-of-Sequences SVM Speaker Verification
State-Of-The-Art Sequence Kernels For SVM Speaker Verification