
    Batch-adaptive rejection threshold estimation with application to OCR post-processing

    An OCR process is often followed by the application of a language model to find the best transformation of an OCR hypothesis into a string compatible with the constraints of the document, field or item under consideration. The cost of this transformation can be taken as a confidence value and compared to a threshold to decide whether a string is accepted as correct or rejected, in order to bound the error rate of the system. Widespread tools like ROC, precision-recall or error-reject curves are commonly used along with fixed thresholding to achieve that goal. However, those methodologies fail when a test sample has a confidence distribution that differs from that of the sample used to train the system, which is a very frequent case in post-processed OCR strings (e.g., string batches showing particularly careful handwriting styles in contrast to free styles). In this paper, we propose an adaptive method for the automatic estimation of the rejection threshold that overcomes this drawback, allowing the operator to define an expected error rate within the set of accepted (non-rejected) strings of a complete batch of documents (as opposed to trying to establish or control the probability of error of a single string), regardless of its confidence distribution. The operator (expert) is assumed to know the error rate that is acceptable to the user of the resulting data. The proposed system transforms that knowledge into a suitable rejection threshold. The approach is based on the estimation of an expected error vs. transformation cost distribution. First, a model predicting the probability that a cost arises from an erroneously transcribed string is computed from a sample of supervised OCR hypotheses. Then, given a test sample, a cumulative error vs. cost curve is computed and used to automatically set the threshold that meets the user-defined error rate on the overall sample. Experiments on batches from different writing styles show very accurate error-rate estimates where fixed thresholding clearly fails. An original procedure to generate distorted strings from a given language is also proposed and tested, which allows the presented method to be used in tasks where no real supervised OCR hypotheses are available to train the system.

    Navarro Cerdan, J.R.; Arlandis Navarro, J.F.; Llobet Azpitarte, R.; Perez-Cortes, J. (2015). Batch-adaptive rejection threshold estimation with application to OCR post-processing. Expert Systems with Applications, 42(21):8111-8122. doi:10.1016/j.eswa.2015.06.022
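    As a rough illustration of the batch-adaptive idea described in this abstract, the sketch below picks a rejection threshold from a cumulative error-vs-cost curve. It is a minimal sketch, not the authors' implementation: the function name and the `p_error` model (an estimate of the probability that a given transformation cost comes from an erroneously transcribed string, fitted beforehand on supervised OCR hypotheses) are hypothetical.

```python
import numpy as np

def batch_adaptive_threshold(costs, p_error, target_error_rate):
    """Pick the largest rejection threshold whose accepted strings keep the
    expected batch error rate at or below the user-defined target.

    costs             -- transformation cost of each OCR hypothesis in the batch
    p_error           -- vectorized callable: estimated P(erroneous | cost),
                         fitted beforehand on supervised OCR hypotheses
    target_error_rate -- acceptable fraction of errors among accepted strings
    """
    sorted_costs = np.sort(np.asarray(costs, dtype=float))  # accept cheapest first
    # Cumulative expected errors vs. cost: the error-vs-cost curve of the batch
    cum_errors = np.cumsum(p_error(sorted_costs))
    n_accepted = np.arange(1, sorted_costs.size + 1)
    batch_error_rate = cum_errors / n_accepted
    ok = np.nonzero(batch_error_rate <= target_error_rate)[0]
    if ok.size == 0:
        return -np.inf                # no threshold meets the target: reject all
    return sorted_costs[ok[-1]]       # accept every string with cost <= threshold
```

    With a vectorized probability model, `batch_adaptive_threshold(test_costs, model, 0.01)` would accept the largest low-cost subset of the batch whose expected error rate stays within 1%, which is the per-batch behaviour the abstract describes for a user-defined target.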

    Sensitivity of South American tropical forests to an extreme climate anomaly

    This is the final version, available on open access from Nature Research via the DOI in this record. Data availability: publicly available climate data used in this paper are available from ERA5 (ref. 64), CRU ts.4.03 (ref. 65), WorldClim v2 (ref. 66), TRMM product 3B43 V7 (ref. 67) and GPCC, Version 7 (ref. 68); the input data are available on ForestPlots (ref. 42). Code availability: R code for graphics and analyses is available on ForestPlots (ref. 42).

    The tropical forest carbon sink is known to be drought sensitive, but it is unclear which forests are the most vulnerable to extreme events. Forests with hotter and drier baseline conditions may be protected by prior adaptation, or more vulnerable because they operate closer to physiological limits. Here we report that forests in drier South American climates experienced the greatest impacts of the 2015–2016 El Niño, indicating greater vulnerability to extreme temperatures and drought. The long-term, ground-measured tree-by-tree responses of 123 forest plots across tropical South America show that the biomass carbon sink ceased during the event, with the carbon balance becoming indistinguishable from zero (−0.02 ± 0.37 Mg C ha⁻¹ per year). However, intact tropical South American forests overall were no more sensitive to the extreme 2015–2016 El Niño than to previous, less intense events, remaining a key defence against climate change as long as they are protected.

    Ultrasonic Doppler measurement using a pseudo-continuous mode

    A pseudo-continuous ultrasound Doppler method was studied theoretically and experimentally. The principle of the method is based on the pulsed-wave mode: short bursts are emitted with a high pulse repetition frequency, and demodulated echoes are integrated between emitted pulses. Gaps arise in the measurable range because echoes are lost during the emitting time, but a method using two alternating pulse repetition frequencies is proposed to partly overcome this problem. The proposed pseudo-continuous method is useful for measuring high blood velocities, up to 3.8 m/s, with an emitted frequency of 3.2 MHz, at any depth up to 17 cm. Using the pulsed-wave mode, the maximum measurable velocity under similar conditions would be only 0.6 m/s; the maximum measurable velocity is thus about six times higher in the pseudo-continuous mode. These results demonstrate the possibility of measuring high blood-flow velocities using a transducer and electronics compatible with the pulsed-wave mode.
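    The quoted numbers follow from the standard pulsed-Doppler Nyquist relations, v_max = c·PRF/(4·f₀) with PRF ≤ c/(2·d_max). The check below is a sketch assuming a textbook sound speed of c ≈ 1540 m/s in soft tissue, a value the abstract does not state.

```python
# Nyquist-limit check for the velocities quoted above, using the standard
# pulsed-Doppler relations.
C = 1540.0    # speed of sound in soft tissue, m/s (assumed textbook value)
F0 = 3.2e6    # emitted frequency, Hz (from the abstract)
D_MAX = 0.17  # maximum measurement depth, m (from the abstract)

# Pulsed-wave mode: the PRF is capped by the round-trip time to d_max,
# and the Nyquist criterion caps the measurable velocity at c*PRF/(4*f0).
prf_max = C / (2.0 * D_MAX)          # ~4.5 kHz
v_max_pw = C * prf_max / (4.0 * F0)  # ~0.55 m/s, matching the ~0.6 m/s quoted

# Pseudo-continuous mode: integrating echoes between short bursts decouples
# the PRF from the round-trip time, so a much higher PRF (hence v_max) is
# usable at the same depth.
v_target = 3.8                        # m/s, from the abstract
prf_needed = 4.0 * F0 * v_target / C  # ~32 kHz

print(f"PW mode: PRF <= {prf_max:.0f} Hz -> v_max ~ {v_max_pw:.2f} m/s")
print(f"Pseudo-continuous: PRF ~ {prf_needed:.0f} Hz needed for {v_target} m/s")
```

    The gap between the two PRFs (roughly 32 kHz versus 4.5 kHz) is what the pseudo-continuous scheme buys by not waiting for the deepest echo before the next burst, consistent with the roughly sixfold velocity gain quoted above.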

    Optimal Rate for Nonparametric Estimation in Deterministic Dynamical Systems

    AMS 1991 Subject Classification: 62G07, 58F03. Keywords: deterministic dynamical systems, nonparametric statistics, minimax principle, recurrence.