32 research outputs found

    Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks

    Image orientation detection requires high-level scene understanding. Humans use object recognition and contextual scene information to correctly orient images. In the literature, the problem of image orientation detection is mostly addressed using low-level vision features, while some approaches incorporate a few easily detectable semantic cues to gain minor improvements. The vast amount of semantic content in images makes orientation detection challenging, and therefore there is a large semantic gap between existing methods and human behavior. Also, existing methods in the literature report highly discrepant detection rates, mainly because of large differences in datasets and the limited variety of test images used for evaluation. In this work, for the first time, we leverage the power of deep learning and adapt pre-trained convolutional neural networks using the largest training dataset to date for the image orientation detection task. An extensive evaluation of our model on different public datasets shows that it generalizes remarkably well, correctly orienting a large set of unconstrained images; it also significantly outperforms the state of the art and achieves accuracy very close to that of humans.

    Pre-classification for automatic image orientation

    In this paper, we propose a novel method for automatic orientation of digital images. The approach is based on exploiting the properties of local statistics of natural scenes. In this way, we address some of the difficulties encountered in previous works in this area. The main contribution of this paper is to introduce a pre-classification step into carefully defined categories in order to simplify subsequent orientation detection. The proposed algorithm was tested on 9068 images and compared with the existing state of the art in the area. Results show a significant improvement over previous work.

    Compensating for Large In-Plane Rotations in Natural Images

    Rotation invariance has been studied in the computer vision community primarily in the context of small in-plane rotations. This is usually achieved by building invariant image features. However, the problem of achieving invariance for large rotation angles remains largely unexplored. In this work, we tackle this problem by directly compensating for large rotations, as opposed to building invariant features. This is inspired by the neuro-scientific concept of mental rotation, which humans use to compare pairs of rotated objects. Our contributions here are three-fold. First, we train a Convolutional Neural Network (CNN) to detect image rotations. We find that generic CNN architectures are not suitable for this purpose. To this end, we introduce a convolutional template layer, which learns representations for canonical 'unrotated' images. Second, we use Bayesian Optimization to quickly sift through a large number of candidate images to find the canonical 'unrotated' image. Third, we use this method to achieve robustness to large angles in an image retrieval scenario. Our method is task-agnostic, and can be used as a pre-processing step in any computer vision system. Comment: Accepted at Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP) 201

    Automatic Photo Orientation Detection with Convolutional Neural Networks

    We apply convolutional neural networks (CNNs) to the problem of image orientation detection in the context of determining the correct orientation (from 0, 90, 180, and 270 degrees) of a consumer photo. The problem is especially important for digitizing analog photographs. We substantially improve on the published state of the art in terms of performance on one of the standard datasets, and test our system on a more difficult large dataset of consumer photos. We use Guided Backpropagation to obtain insights into how our CNN detects photo orientation, and to explain its mistakes.
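    The four-way formulation above (0, 90, 180, or 270 degrees) lends itself to a simple way of building labeled examples: take an image assumed to be upright and rotate it in 90-degree steps. The following is a minimal sketch of that idea in plain Python, not the authors' pipeline; the `Image` type and the function names `rotate90_ccw` and `make_orientation_samples` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumption: a grayscale image is represented as a list of pixel rows.
Image = List[List[int]]


def rotate90_ccw(img: Image) -> Image:
    """Rotate an image by 90 degrees counter-clockwise."""
    # Transposing and then reversing the row order gives a CCW rotation.
    return [list(row) for row in zip(*img)][::-1]


def make_orientation_samples(upright: Image) -> List[Tuple[Image, int]]:
    """Return (rotated image, rotation label in degrees) for all four classes."""
    samples = []
    current = upright
    for label in (0, 90, 180, 270):
        samples.append((current, label))
        current = rotate90_ccw(current)
    return samples


if __name__ == "__main__":
    for rotated, label in make_orientation_samples([[1, 2], [3, 4]]):
        print(label, rotated)
```

    A classifier trained on such pairs predicts the applied rotation, and the photo can then be corrected by rotating it the remaining way around to 360 degrees.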

    An Unsupervised Classification Technique for Detection of Flipped Orientations in Document Images

    Detection of text orientation in document images is a preliminary concern prior to processing of documents by an Optical Character Reader. The text direction in document images should exist generally in a specific orientation, i.e.,   text direction for any automated document reading system. A flipped text orientation leads to an ambiguous result in such fully automated systems. In this paper, we focus on the development of a text orientation detection module that can be incorporated as a prerequisite process in an automatic reading system. Text orientation is detected using directional gradient features of the document image, with an unsupervised learning approach for detecting the flipped orientation in which the document was originally fed into the scanning device. The unsupervised learning is built on the directional gradient features of the document text for four possible orientations. The algorithm is tested on document samples of printed plain English text as well as filled-in pre-printed forms in Telugu script. The results are consistent and adequate, with an average accuracy of around 94%.

    SR-POD : sample rotation based on principal-axis orientation distribution for data augmentation in deep object detection

    Convolutional neural networks (CNNs) have outperformed most state-of-the-art methods in object detection. However, CNNs have difficulty detecting rotated objects, because the dataset used to train the CNNs often does not contain sufficient samples at various orientation angles. In this paper, we propose a novel data-augmentation approach to handle rotated samples, which utilizes the distribution of the objects' orientations without the time-consuming process of rotating the sample images. First, we present an orientation descriptor, named "principal-axis orientation", to describe the orientation of an object's principal axis in an image, and estimate the distribution of objects' principal-axis orientations (PODs) over the whole dataset. Second, we define a similarity metric to calculate the POD similarity between the training set and an additional dataset, which is built by randomly selecting images from the benchmark ImageNet ILSVRC2012 dataset. Finally, we optimize a cost function to obtain an optimal rotation angle, which indicates the highest POD similarity between the two aforementioned datasets. Experiments on the benchmark PASCAL VOC2007 dataset show that, with the training set augmented using our method, the average precision (AP) of Faster RCNN on the TV-monitor category improves by 7.5%. In addition, our experimental results demonstrate that new samples generated by random rotation are more likely to result in poor object detection performance.
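    The abstract does not say how the principal-axis orientation is computed. One common way to estimate an object's principal axis, shown here purely as an illustrative assumption and not as the paper's descriptor, uses second-order central moments of a binary object mask. The names `principal_axis_angle` and `orientation_histogram` are hypothetical, and the angle is measured in image coordinates (x to the right, y downward).

```python
import math
from typing import List

# Assumption: an object is given as a binary mask (rows of 0/1 values).
Mask = List[List[int]]


def principal_axis_angle(mask: Mask) -> float:
    """Estimate the principal-axis orientation in degrees, in [0, 180)."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask contains no foreground pixels")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Second-order central moments of the foreground pixels.
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    return math.degrees(theta) % 180.0


def orientation_histogram(angles: List[float], bins: int = 18) -> List[float]:
    """Normalized histogram of orientations over [0, 180) degrees."""
    width = 180.0 / bins
    counts = [0] * bins
    for a in angles:
        counts[min(int((a % 180.0) / width), bins - 1)] += 1
    total = len(angles)
    return [c / total for c in counts] if total else [0.0] * bins
```

    A histogram like this is one simple stand-in for an orientation distribution over a dataset; two such histograms could then be compared with any similarity measure, although the metric and cost function used in the paper are not given in the abstract.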

    Image Orientation Correction Based on Human Presence Using Gradient and Haar-like Features

    The development and use of digital camera technology has grown substantially, as shown by the many types of cameras available and by cameras being embedded in devices such as laptops, mobile phones, tablets, watches, and other gadgets, which makes photography increasingly easy. However, most of these devices have no sensor for recording the orientation of a captured photo, whether portrait or landscape. As a result, most incorrectly rotated photos are noticed only when they are displayed on a computer screen, a television, or another display. This study proposes an image orientation correction method that uses Haar-like features and the image gradient magnitude to detect human objects in the image. The detected human objects serve as the reference for orientation correction. Classification is performed with a cascade AdaBoost classifier. With a slight modification to how the cascade AdaBoost classifier is applied, an accuracy of up to 79% is obtained, which is better than without the modification. The modification also increases processing speed by up to a factor of two.

    Image orientation detection using LBP-based features and logistic regression

    Gianluigi Ciocca; Claudio Cusano; Raimondo Schettini