2,564 research outputs found

    CloudScan - A configuration-free invoice analysis system using recurrent neural networks

    Get PDF
    We present CloudScan; an invoice analysis system that requires zero configuration or upfront annotation. In contrast to previous work, CloudScan does not rely on templates of invoice layout, instead it learns a single global model of invoices that naturally generalizes to unseen invoice layouts. The model is trained using data automatically extracted from end-user provided feedback. This automatic training data extraction removes the requirement for users to annotate the data precisely. We describe a recurrent neural network model that can capture long range context and compare it to a baseline logistic regression model corresponding to the current CloudScan production system. We train and evaluate the system on 8 important fields using a dataset of 326,471 invoices. The recurrent neural network and baseline model achieve 0.891 and 0.887 average F1 scores respectively on seen invoice layouts. For the harder task of unseen invoice layouts, the recurrent neural network model outperforms the baseline with 0.840 average F1 compared to 0.788.Comment: Presented at ICDAR 201

    STNet: Selective Tuning of Convolutional Networks for Object Localization

    Full text link
    Visual attention modeling has recently gained momentum in developing visual hierarchies provided by Convolutional Neural Networks. Despite recent successes of feedforward processing on the abstraction of concepts form raw images, the inherent nature of feedback processing has remained computationally controversial. Inspired by the computational models of covert visual attention, we propose the Selective Tuning of Convolutional Networks (STNet). It is composed of both streams of Bottom-Up and Top-Down information processing to selectively tune the visual representation of Convolutional networks. We experimentally evaluate the performance of STNet for the weakly-supervised localization task on the ImageNet benchmark dataset. We demonstrate that STNet not only successfully surpasses the state-of-the-art results but also generates attention-driven class hypothesis maps

    Cortex, countercurrent context, and dimensional integration of lifetime memory

    Get PDF
    The correlation between relative neocortex size and longevity in mammals encourages a search for a cortical function specifically related to the life-span. A candidate in the domain of permanent and cumulative memory storage is proposed and explored in relation to basic aspects of cortical organization. The pattern of cortico-cortical connectivity between functionally specialized areas and the laminar organization of that connectivity converges on a globally coherent representational space in which contextual embedding of information emerges as an obligatory feature of cortical function. This brings a powerful mode of inductive knowledge within reach of mammalian adaptations, a mode which combines item specificity with classificatory generality. Its neural implementation is proposed to depend on an obligatory interaction between the oppositely directed feedforward and feedback currents of cortical activity, in countercurrent fashion. Direct interaction of the two streams along their cortex-wide local interface supports a scheme of "contextual capture" for information storage responsible for the lifelong cumulative growth of a uniquely cortical form of memory termed "personal history." This approach to cortical function helps elucidate key features of cortical organization as well as cognitive aspects of mammalian life history strategies

    Helping AI to Play Hearthstone: AAIA'17 Data Mining Challenge

    Full text link
    This paper summarizes the AAIA'17 Data Mining Challenge: Helping AI to Play Hearthstone which was held between March 23, and May 15, 2017 at the Knowledge Pit platform. We briefly describe the scope and background of this competition in the context of a more general project related to the development of an AI engine for video games, called Grail. We also discuss the outcomes of this challenge and demonstrate how predictive models for the assessment of player's winning chances can be utilized in a construction of an intelligent agent for playing Hearthstone. Finally, we show a few selected machine learning approaches for modeling state and action values in Hearthstone. We provide evaluation for a few promising solutions that may be used to create more advanced types of agents, especially in conjunction with Monte Carlo Tree Search algorithms.Comment: Federated Conference on Computer Science and Information Systems, Prague (FedCSIS-2017) (Prague, Czech Republic

    Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels

    Full text link
    Computer vision algorithms with pixel-wise labeling tasks, such as semantic segmentation and salient object detection, have gone through a significant accuracy increase with the incorporation of deep learning. Deep segmentation methods slightly modify and fine-tune pre-trained networks that have hundreds of millions of parameters. In this work, we question the need to have such memory demanding networks for the specific task of salient object segmentation. To this end, we propose a way to learn a memory-efficient network from scratch by training it only on salient object detection datasets. Our method encodes images to gridized superpixels that preserve both the object boundaries and the connectivity rules of regular pixels. This representation allows us to use convolutional neural networks that operate on regular grids. By using these encoded images, we train a memory-efficient network using only 0.048\% of the number of parameters that other deep salient object detection networks have. Our method shows comparable accuracy with the state-of-the-art deep salient object detection methods and provides a faster and a much more memory-efficient alternative to them. Due to its easy deployment, such a network is preferable for applications in memory limited devices such as mobile phones and IoT devices.Comment: 6 pages, submitted to MMSP 201
    corecore