55,015 research outputs found

Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

Author: Li Hui
Shen Chunhua
Wang Peng
Zhang Guyu
Publication date: 16/03/2019

Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion. Most existing approaches rely heavily on sophisticated model designs and/or extra fine-grained annotations, which, to some extent, increase the difficulty in algorithm implementation and data collection. In this work, we propose an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations. It is composed of a 31-layer ResNet, an LSTM-based encoder-decoder framework and a 2-dimensional attention module. Despite its simplicity, the proposed method is robust and achieves state-of-the-art performance on both regular and irregular scene text recognition benchmarks. Code is available at: https://tinyurl.com/ShowAttendRead Comment: Accepted to Proc. AAAI Conference on Artificial Intelligence 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

AON: Towards Arbitrarily-Oriented Text Recognition

Author: Bai Fan
Cheng Zhanzhan
Niu Yi
Pu Shiliang
Xu Yangliu
Zhou Shuigeng
Publication date: 22/03/2018

Recognizing text from natural images is a hot research topic in computer vision due to its various applications. Despite several decades of research on optical character recognition (OCR), recognizing text from natural images is still a challenging task. This is because scene texts are often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted) arrangements, which have not yet been well addressed in the literature. Existing methods for text recognition mainly work with regular (horizontal and frontal) texts and cannot be trivially generalized to handle irregular texts. In this paper, we develop the arbitrary orientation network (AON) to directly capture the deep features of irregular texts, which are fed into an attention-based decoder to generate the character sequence. The whole network can be trained end-to-end using only images and word-level annotations. Extensive experiments on various benchmarks, including the CUTE80, SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed AON-based method achieves state-of-the-art performance on irregular datasets, and is comparable to major existing methods on regular datasets. Comment: Accepted by CVPR201

arXiv.org e-Print Archive

Crossref

Recommended from our members

Mad about the LAD

Author: Vainikka Anne Marjatta
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2003

ScholarWorks@UMass Amherst

DeepCare: A Deep Dynamic Memory Model for Predictive Medicine

Author: A Graves
AB Jensen
BB Granger
J Futoma
JM Corbin
JS Mathias
K Orphanou
PB Jensen
R Henriques
S Hochreiter
SJ Henly
T Tran
Y LeCun
Publication date: 01/01/2016

Personalized predictive medicine necessitates the modeling of patient illness and care processes, which inherently have long-term temporal dependencies. Healthcare observations, recorded in electronic medical records, are episodic and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural network that reads medical records, stores previous illness history, infers current illness states and predicts future medical outcomes. At the data level, DeepCare represents care episodes as vectors in space and models patient health state trajectories through explicit memory of historical records. Built on Long Short-Term Memory (LSTM), DeepCare introduces time parameterizations to handle irregularly timed events by moderating the forgetting and consolidation of memory cells. DeepCare also incorporates medical interventions that change the course of illness and shape future medical risk. Moving up to the health state level, historical and present health states are then aggregated through multiscale temporal pooling, before passing through a neural network that estimates future outcomes. We demonstrate the efficacy of DeepCare for disease progression modeling, intervention recommendation, and future risk prediction. On two important cohorts with heavy social and economic burden (diabetes and mental health), the results show improved modeling and risk prediction accuracy. Comment: Accepted at JBI under the new name: "Predicting healthcare trajectories from medical records: A deep learning approach"

arXiv.org e-Print Archive

Deakin Research Online

Crossref

The role of word frequency and morpho-orthography in agreement processing

Author: Brehm L.
Christianson K.
Hussey E.
Publication venue: Informa UK Limited
Publication date: 01/01/2020

Agreement attraction in comprehension (when an ungrammatical verb is read quickly if preceded by a feature-matching local noun) is well described by a cue-based retrieval framework. This suggests a role for lexical retrieval in attraction. To examine this, we manipulated two probabilistic factors known to affect lexical retrieval: local noun word frequency and morpho-orthography (agreement morphology realised with or without –s endings) in a self-paced reading study. Noun number and word frequency affected noun and verb region reading times, with higher-frequency words not eliciting attraction. Morpho-orthography impacted verb processing but not attraction: atypical plurals led to slower verb reading times regardless of verb number. Exploratory individual difference analyses further underscore the importance of lexical retrieval dynamics in sentence processing. This provides evidence that agreement operates via a cue-based retrieval mechanism over lexical representations that vary in their strength and association to number features.

MPG.PuRe

Can a connectionist model explain the processing of regularly and irregularly inflected words in German as L1 and L2?

Author: Birdsong D.
Francis W. N.
Maratsos M.
Meier H.
Pfeffer J.
Pinker S.
Rumelhart D.
Smolka E.
Tilo Strobach
Ute Schönpflug
Publication venue: SAGE Publications
Publication date: 01/01/2011

The connectionist model is a prevailing account of how the cognitive system processes morphology. According to this model, the morphology of regularly and irregularly inflected words (e.g., verb participles and noun plurals) is processed in the same cognitive network. A validation of the connectionist model of the processing of morphology in German as L2 has yet to be achieved. To investigate L2-specific aspects, we compared a group of L1 speakers of German with speakers of German as L2. L2 and L1 speakers of German were assigned to their respective group by their reaction times in picture naming prior to the central task. The reaction times in the lexical decision task of verb participles and noun plurals were largely consistent with the assumption of the connectionist model. Interestingly, speakers of German as L2 showed a specific advantage for irregular compared with regular verb participles.

Institutional Repository of the Freie Universität Berlin

Crossref

Open Access LMU