6,258 research outputs found

    Recovering Homography from Camera Captured Documents using Convolutional Neural Networks

    Get PDF
    Removing perspective distortion from hand held camera captured document images is one of the primitive tasks in document analysis, but unfortunately, no such method exists that can reliably remove the perspective distortion from document images automatically. In this paper, we propose a convolutional neural network based method for recovering homography from hand-held camera captured documents. Our proposed method works independent of document's underlying content and is trained end-to-end in a fully automatic way. Specifically, this paper makes following three contributions: Firstly, we introduce a large scale synthetic dataset for recovering homography from documents images captured under different geometric and photometric transformations; secondly, we show that a generic convolutional neural network based architecture can be successfully used for regressing the corners positions of documents captured under wild settings; thirdly, we show that L1 loss can be reliably used for corners regression. Our proposed method gives state-of-the-art performance on the tested datasets, and has potential to become an integral part of document analysis pipeline.Comment: 10 pages, 8 figure

    Defense against Universal Adversarial Perturbations

    Full text link
    Recent advances in Deep Learning show the existence of image-agnostic quasi-imperceptible perturbations that when applied to `any' image can fool a state-of-the-art network classifier to change its prediction about the image label. These `Universal Adversarial Perturbations' pose a serious threat to the success of Deep Learning in practice. We present the first dedicated framework to effectively defend the networks against such perturbations. Our approach learns a Perturbation Rectifying Network (PRN) as `pre-input' layers to a targeted model, such that the targeted model needs no modification. The PRN is learned from real and synthetic image-agnostic perturbations, where an efficient method to compute the latter is also proposed. A perturbation detector is separately trained on the Discrete Cosine Transform of the input-output difference of the PRN. A query image is first passed through the PRN and verified by the detector. If a perturbation is detected, the output of the PRN is used for label prediction instead of the actual image. A rigorous evaluation shows that our framework can defend the network classifiers against unseen adversarial perturbations in the real-world scenarios with up to 97.5% success rate. The PRN also generalizes well in the sense that training for one targeted network defends another network with a comparable success rate.Comment: Accepted in IEEE CVPR 201

    AON: Towards Arbitrarily-Oriented Text Recognition

    Full text link
    Recognizing text from natural images is a hot research topic in computer vision due to its various applications. Despite the enduring research of several decades on optical character recognition (OCR), recognizing texts from natural images is still a challenging task. This is because scene texts are often in irregular (e.g. curved, arbitrarily-oriented or seriously distorted) arrangements, which have not yet been well addressed in the literature. Existing methods on text recognition mainly work with regular (horizontal and frontal) texts and cannot be trivially generalized to handle irregular texts. In this paper, we develop the arbitrary orientation network (AON) to directly capture the deep features of irregular texts, which are combined into an attention-based decoder to generate character sequence. The whole network can be trained end-to-end by using only images and word-level annotations. Extensive experiments on various benchmarks, including the CUTE80, SVT-Perspective, IIIT5k, SVT and ICDAR datasets, show that the proposed AON-based method achieves the-state-of-the-art performance in irregular datasets, and is comparable to major existing methods in regular datasets.Comment: Accepted by CVPR201
    corecore