Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition
Recognizing irregular text in natural scene images is challenging due to the
large variance in text appearance, such as curvature, orientation and
distortion. Most existing approaches rely heavily on sophisticated model
designs and/or extra fine-grained annotations, which, to some extent, increase
the difficulty in algorithm implementation and data collection. In this work,
we propose an easy-to-implement strong baseline for irregular scene text
recognition, using off-the-shelf neural network components and only word-level
annotations. It is composed of a 31-layer ResNet, an LSTM-based
encoder-decoder framework and a 2-dimensional attention module. Despite its
simplicity, the proposed method is robust and achieves state-of-the-art
performance on both regular and irregular scene text recognition benchmarks.
Code is available at: https://tinyurl.com/ShowAttendRead
Comment: Accepted to Proc. AAAI Conference on Artificial Intelligence 201
AON: Towards Arbitrarily-Oriented Text Recognition
Recognizing text from natural images is a hot research topic in computer
vision due to its various applications. Despite the enduring research of
several decades on optical character recognition (OCR), recognizing texts from
natural images is still a challenging task. This is because scene texts are
often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted)
arrangements, which have not yet been well addressed in the literature.
Existing methods on text recognition mainly work with regular (horizontal and
frontal) texts and cannot be trivially generalized to handle irregular texts.
In this paper, we develop the arbitrary orientation network (AON) to directly
capture the deep features of irregular texts, which are combined into an
attention-based decoder to generate character sequence. The whole network can
be trained end-to-end by using only images and word-level annotations.
Extensive experiments on various benchmarks, including the CUTE80,
SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed
AON-based method achieves state-of-the-art performance on irregular
datasets, and is comparable to major existing methods on regular datasets.
Comment: Accepted by CVPR201
DeepCare: A Deep Dynamic Memory Model for Predictive Medicine
Personalized predictive medicine necessitates the modeling of patient illness
and care processes, which inherently have long-term temporal dependencies.
Healthcare observations, recorded in electronic medical records, are episodic
and irregular in time. We introduce DeepCare, an end-to-end deep dynamic neural
network that reads medical records, stores previous illness history, infers
current illness states and predicts future medical outcomes. At the data level,
DeepCare represents care episodes as vectors in space and models patient health
state trajectories through explicit memory of historical records. Built on Long
Short-Term Memory (LSTM), DeepCare introduces time parameterizations to handle
irregularly timed events by moderating the forgetting and consolidation of memory
cells. DeepCare also incorporates medical interventions that change the course
of illness and shape future medical risk. Moving up to the health state level,
historical and present health states are then aggregated through multiscale
temporal pooling, before passing through a neural network that estimates future
outcomes. We demonstrate the efficacy of DeepCare for disease progression
modeling, intervention recommendation, and future risk prediction. On two
important cohorts with heavy social and economic burden -- diabetes and mental
health -- the results show improved modeling and risk prediction accuracy.
Comment: Accepted at JBI under the new name: "Predicting healthcare
trajectories from medical records: A deep learning approach"
The role of word frequency and morpho-orthography in agreement processing
Agreement attraction in comprehension (when an ungrammatical verb is read quickly if preceded by a feature-matching local noun) is well described by a cue-based retrieval framework. This suggests a role for lexical retrieval in attraction. To examine this, we manipulated two probabilistic factors known to affect lexical retrieval: local noun word frequency and morpho-orthography (agreement morphology realised with or without –s endings) in a self-paced reading study. Noun number and word frequency affected noun and verb region reading times, with higher-frequency words not eliciting attraction. Morpho-orthography impacted verb processing but not attraction: atypical plurals led to slower verb reading times regardless of verb number. Exploratory individual difference analyses further underscore the importance of lexical retrieval dynamics in sentence processing. This provides evidence that agreement operates via a cue-based retrieval mechanism over lexical representations that vary in their strength and association to number features.
Can a connectionist model explain the processing of regularly and irregularly inflected words in German as L1 and L2?
The connectionist model is a prevailing model of the structure and functioning of the cognitive system of the processing of morphology. According to this model, the morphology of regularly and irregularly inflected words (e.g., verb participles and noun plurals) is processed in the same cognitive network. A validation of the connectionist model of the processing of morphology in German as L2 has yet to be achieved. To investigate L2-specific aspects, we compared a group of L1 speakers of German with speakers of German as L2. L2 and L1 speakers of German were assigned to their respective group by their reaction times in picture naming prior to the central task. The reaction times in the lexical decision task of verb participles and noun plurals were largely consistent with the assumption of the connectionist model. Interestingly, speakers of German as L2 showed a specific advantage for irregular compared with regular verb participles.