2,564 research outputs found
CloudScan - A configuration-free invoice analysis system using recurrent neural networks
We present CloudScan; an invoice analysis system that requires zero
configuration or upfront annotation. In contrast to previous work, CloudScan
does not rely on templates of invoice layout, instead it learns a single global
model of invoices that naturally generalizes to unseen invoice layouts. The
model is trained using data automatically extracted from end-user provided
feedback. This automatic training data extraction removes the requirement for
users to annotate the data precisely. We describe a recurrent neural network
model that can capture long range context and compare it to a baseline logistic
regression model corresponding to the current CloudScan production system. We
train and evaluate the system on 8 important fields using a dataset of 326,471
invoices. The recurrent neural network and baseline model achieve 0.891 and
0.887 average F1 scores respectively on seen invoice layouts. For the harder
task of unseen invoice layouts, the recurrent neural network model outperforms
the baseline with 0.840 average F1 compared to 0.788.Comment: Presented at ICDAR 201
STNet: Selective Tuning of Convolutional Networks for Object Localization
Visual attention modeling has recently gained momentum in developing visual
hierarchies provided by Convolutional Neural Networks. Despite recent successes
of feedforward processing on the abstraction of concepts form raw images, the
inherent nature of feedback processing has remained computationally
controversial. Inspired by the computational models of covert visual attention,
we propose the Selective Tuning of Convolutional Networks (STNet). It is
composed of both streams of Bottom-Up and Top-Down information processing to
selectively tune the visual representation of Convolutional networks. We
experimentally evaluate the performance of STNet for the weakly-supervised
localization task on the ImageNet benchmark dataset. We demonstrate that STNet
not only successfully surpasses the state-of-the-art results but also generates
attention-driven class hypothesis maps
Cortex, countercurrent context, and dimensional integration of lifetime memory
The correlation between relative neocortex size and longevity in mammals encourages a search for a cortical function specifically related to the life-span. A candidate in the domain of permanent and cumulative memory storage is proposed and explored in relation to basic aspects of cortical organization. The pattern of cortico-cortical connectivity between functionally specialized areas and the laminar organization of that connectivity converges on a globally coherent representational space in which contextual embedding of information emerges as an obligatory feature of cortical function. This brings a powerful mode of inductive knowledge within reach of mammalian adaptations, a mode which combines item specificity with classificatory generality. Its neural implementation is proposed to depend on an obligatory interaction between the oppositely directed feedforward and feedback currents of cortical activity, in countercurrent fashion. Direct interaction of the two streams along their cortex-wide local interface supports a scheme of "contextual capture" for information storage responsible for the lifelong cumulative growth of a uniquely cortical form of memory termed "personal history." This approach to cortical function helps elucidate key features of cortical organization as well as cognitive aspects of mammalian life history strategies
Helping AI to Play Hearthstone: AAIA'17 Data Mining Challenge
This paper summarizes the AAIA'17 Data Mining Challenge: Helping AI to Play
Hearthstone which was held between March 23, and May 15, 2017 at the Knowledge
Pit platform. We briefly describe the scope and background of this competition
in the context of a more general project related to the development of an AI
engine for video games, called Grail. We also discuss the outcomes of this
challenge and demonstrate how predictive models for the assessment of player's
winning chances can be utilized in a construction of an intelligent agent for
playing Hearthstone. Finally, we show a few selected machine learning
approaches for modeling state and action values in Hearthstone. We provide
evaluation for a few promising solutions that may be used to create more
advanced types of agents, especially in conjunction with Monte Carlo Tree
Search algorithms.Comment: Federated Conference on Computer Science and Information Systems,
Prague (FedCSIS-2017) (Prague, Czech Republic
Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels
Computer vision algorithms with pixel-wise labeling tasks, such as semantic
segmentation and salient object detection, have gone through a significant
accuracy increase with the incorporation of deep learning. Deep segmentation
methods slightly modify and fine-tune pre-trained networks that have hundreds
of millions of parameters. In this work, we question the need to have such
memory demanding networks for the specific task of salient object segmentation.
To this end, we propose a way to learn a memory-efficient network from scratch
by training it only on salient object detection datasets. Our method encodes
images to gridized superpixels that preserve both the object boundaries and the
connectivity rules of regular pixels. This representation allows us to use
convolutional neural networks that operate on regular grids. By using these
encoded images, we train a memory-efficient network using only 0.048\% of the
number of parameters that other deep salient object detection networks have.
Our method shows comparable accuracy with the state-of-the-art deep salient
object detection methods and provides a faster and a much more memory-efficient
alternative to them. Due to its easy deployment, such a network is preferable
for applications in memory limited devices such as mobile phones and IoT
devices.Comment: 6 pages, submitted to MMSP 201
- …