
    Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network

    In this paper, we propose a new neural network architecture for vanishing point detection in images. Its key element is the use of the direct and transposed Fast Hough Transforms separated by blocks of convolutional layers with standard activation functions. This allows the network to produce its answer in the coordinates of the input image, so the vanishing point can be located by simply selecting the maximum of the output. Moreover, we prove that the transposed Fast Hough Transform can be computed using the direct one. The use of integral operators lets the network rely on global rectilinear features in the image, which makes it well suited to detecting vanishing points. To demonstrate the effectiveness of the proposed architecture, we use a set of images from a DVR and show its superiority over existing methods. Note, in addition, that the proposed architecture essentially repeats the process of direct and back projection used, for example, in computed tomography.
    Comment: 9 pages, 9 figures, submitted to "Computer Optics"; extra experiment added, new theorem proof added, references added; typos corrected
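    The core building block the abstract describes, the Fast Hough Transform, can be sketched as follows. This is a generic Brady-Yong-style dyadic recursion for mostly-vertical lines, not the paper's exact implementation; the cyclic column wraparound is a simplification chosen to keep the sketch short.

    ```python
    import numpy as np

    def fht_vertical(img):
        # Fast Hough Transform (dyadic / Brady-Yong scheme) for mostly-vertical lines.
        # img: (h, w) array with h a power of two.  Returns t where t[s, x] sums
        # pixels along the dyadic line from (row 0, col x) down to (row h-1, col x+s).
        # Columns wrap around (np.roll) as a simplification; a practical version
        # would zero-pad instead.  Cost: O(w * h * log h) additions vs O(w * h^2).
        h, w = img.shape
        assert h > 0 and h & (h - 1) == 0, "height must be a power of two"
        # start with h blocks of height 1; each supports only shift 0
        blocks = [img[i].astype(np.float64)[np.newaxis, :] for i in range(h)]
        while len(blocks) > 1:
            merged = []
            for top, bot in zip(blocks[0::2], blocks[1::2]):
                n = 2 * top.shape[0]      # new block height = number of shifts
                out = np.empty((n, w))
                for s in range(n):
                    half = s // 2         # shift used inside each half
                    off = s - half        # column offset of the bottom half
                    out[s] = top[half] + np.roll(bot[half], -off)
                merged.append(out)
            blocks = merged
        return blocks[0]
    ```

    A line in the image becomes a maximum in the transform, which is why the architecture can read the answer off with a simple argmax.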

    Advanced Hough-based method for on-device document localization

    The demand for on-device document recognition systems is growing in conjunction with the emergence of stricter privacy and security requirements. In such systems, no data is transferred from the end device to third-party information-processing servers. Response time is vital to the user experience of on-device document recognition. Combined with the unavailability of discrete GPUs, powerful CPUs, or large RAM capacity on consumer-grade end devices such as smartphones, the time limitations place significant constraints on the computational complexity of algorithms intended for on-device execution. In this work, we consider document localization in an image without prior knowledge of the document content or its internal structure. According to the published literature, at least 5 systems offer solutions for on-device document localization, and all of them use a location method that can be considered Hough-based. The precision of such systems appears to be lower than that of state-of-the-art solutions that were not designed to account for limited computational resources. We propose an advanced Hough-based method. In contrast with other approaches, it accounts for the geometric invariants of the central projection model and combines both edge and color features for document boundary detection. The proposed method achieved the second-best precision on the SmartDoc dataset, surpassed only by a U-net-like neural network. When evaluated on the more challenging MIDV-500 dataset, it achieved the best precision among published methods, while retaining its applicability to on-device computation.
    Comment: This is a preprint of the article submitted for publication in the journal "Computer Optics"
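    The Hough-based family of methods the abstract refers to rests on a simple voting scheme: each edge pixel votes for all (rho, theta) line parameterizations passing through it, and document boundaries emerge as accumulator peaks. The sketch below shows only this basic voting step; the paper's method additionally exploits central-projection invariants and color features, which are omitted here.

    ```python
    import numpy as np

    def hough_accumulate(edge_points, shape, n_theta=180):
        # Classic (rho, theta) Hough voting over a list of (y, x) edge pixels.
        # Returns (acc, thetas, rho_offset); a boundary line appears as a peak
        # at acc[rho + rho_offset, theta_index], with rho = x*cos t + y*sin t.
        h, w = shape
        diag = int(np.ceil(np.hypot(h, w)))       # bound on |rho|
        thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
        cos_t, sin_t = np.cos(thetas), np.sin(thetas)
        acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int64)
        cols = np.arange(n_theta)
        for y, x in edge_points:
            # one vote per theta bin for this pixel
            rhos = np.round(x * cos_t + y * sin_t).astype(int) + diag
            acc[rhos, cols] += 1
        return acc, thetas, diag
    ```

    Four strong peaks with consistent geometry then correspond to the four sides of the document quadrilateral.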

    Identifying music documents in a collection of images

    Digital libraries and search engines are now well-equipped to find images of documents based on queries. Many images of music scores are now available, often mixed up with textual documents and images. For example, using the Google “images” search feature, a search for “Beethoven” will return a number of scores and manuscripts as well as pictures of the composer. In this paper we report on an investigation into methods to mechanically determine whether a particular document is indeed a score, so that the user can specify that only musical scores should be returned. The goal is to find a minimal set of features that can be used as a quick test applicable to large numbers of documents. A variety of filters were considered, and two promising ones (run-length ratios and Hough transform) were evaluated. We found that a method based on run-lengths in vertical scans (RL) outperforms a comparable algorithm using the Hough transform (HT). On a test set of 1030 images, RL achieved recall and precision of 97.8% and 88.4% respectively, while HT achieved 97.8% and 73.5%. In terms of processor time, RL was more than five times as fast as HT.
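    The intuition behind a vertical run-length filter is that a music score is dominated by thin horizontal staff lines, so vertical scans through it produce many short black runs of near-identical length, whereas text pages produce a broader mix. The sketch below illustrates that idea; the short-run ratio used here is an illustrative assumption, not the paper's exact statistic.

    ```python
    import numpy as np

    def vertical_run_lengths(binary):
        # Collect the lengths of black (True) runs in each column of a binary image.
        runs = []
        for col in binary.T:
            # pad with False so every run has a detectable start and end
            padded = np.concatenate(([False], col, [False]))
            diff = np.diff(padded.astype(np.int8))
            starts = np.flatnonzero(diff == 1)
            ends = np.flatnonzero(diff == -1)
            runs.extend(ends - starts)
        return np.array(runs)

    def short_run_ratio(binary, max_len=3):
        # Fraction of vertical black runs that are "thin" (<= max_len pixels).
        # Score pages, full of 1-2 px staff lines, push this ratio toward 1.
        runs = vertical_run_lengths(binary)
        return 0.0 if runs.size == 0 else float(np.mean(runs <= max_len))
    ```

    Thresholding such a ratio gives the kind of cheap per-document test the paper is after: a handful of column scans, no transform domain needed.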

    Proceedings of the 4th field robot event 2006, Stuttgart/Hohenheim, Germany, 23-24th June 2006

    A very extensive report of the 4th Field Robot Event, held on 23-24 June 2006 in Stuttgart/Hohenheim.

    HoughNet: neural network architecture for vanishing points detection

    In this paper we introduce a novel neural network architecture based on a Fast Hough Transform layer. A layer of this type allows the network to accumulate features from linear regions spanning the entire image rather than from local areas. We demonstrate its potential by solving the problem of vanishing point detection in images of documents. Such a problem arises when dealing with camera shots of documents taken in uncontrolled conditions, where the document image can suffer several specific distortions, including projective transform. To train our model, we use the MIDV-500 dataset and provide testing results. The strong generalization ability of the proposed method is demonstrated by applying it to the completely different ICDAR 2011 dewarping contest. In previously published papers considering this dataset, the quality of vanishing point detection was measured by counting correctly recognized words with the open OCR engine Tesseract. To compare with them, we reproduce this experiment and show that our method outperforms the state-of-the-art result.
    Comment: 6 pages, 6 figures, 2 tables, 28 references, conference
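    Once the network's output map lives in input-image coordinates, the detection step described above reduces to locating the maximum response. A minimal sketch of that readout is below; the quadratic sub-pixel refinement is a common generic add-on, not something taken from the paper.

    ```python
    import numpy as np

    def vanishing_point_from_map(response):
        # Locate the vanishing point as the argmax of a 2D response map,
        # optionally refined to sub-pixel accuracy by fitting a parabola
        # through the peak and its two neighbors along each axis.
        y, x = np.unravel_index(np.argmax(response), response.shape)
        h, w = response.shape

        def refine(fm1, f0, fp1):
            # vertex offset of the parabola through three equally spaced samples
            denom = fm1 - 2.0 * f0 + fp1
            return 0.0 if denom == 0.0 else 0.5 * (fm1 - fp1) / denom

        dy = refine(response[y - 1, x], response[y, x], response[y + 1, x]) if 0 < y < h - 1 else 0.0
        dx = refine(response[y, x - 1], response[y, x], response[y, x + 1]) if 0 < x < w - 1 else 0.0
        return float(y) + dy, float(x) + dx
    ```

    This is why the architecture needs no learned decoding head: the final answer is a coordinate pair read directly off the map.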

    Form processing with the Hough transform

    A form document processing system based on the Hough transform (HT) is developed. It performs form identification and form registration. For form identification, the HT is applied off-line to master forms to calculate form features and build up the feature database; it is then applied on-line to input (scanned) forms to extract features and identify the form type via feature matching. The derived features are invariant to rotation, translation, and scale. The proposed form description is compact, thereby allowing fast identification. Registration is feature- and knowledge-based. Two methods for control-point detection are discussed: the first implements template matching to find frame corners; the second detects line crossings via analysis of the HT parameter space. The detected control points are used to calculate the parameters of a geometric transform and perform coordinate translation. Both linear conformal and projective transforms are tested. The system features fast, reliable type identification and moderate preprocessing time, attained by careful design of the Hough space.
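    The registration step described above (control points in, geometric transform out) can be sketched with a standard direct linear transform (DLT). This is a generic estimator for the projective case, not the paper's specific solver; four or more matched control points determine the 3x3 homography up to scale.

    ```python
    import numpy as np

    def projective_from_points(src, dst):
        # Estimate the 3x3 projective transform H mapping src -> dst (DLT).
        # src, dst: (N, 2) arrays of matched control points, N >= 4.
        # Each correspondence contributes two rows of the homogeneous system
        # A h = 0; the null-space vector (last right-singular vector) is H.
        A = []
        for (x, y), (u, v) in zip(src, dst):
            A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
            A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
        A = np.asarray(A, dtype=np.float64)
        _, _, vt = np.linalg.svd(A)
        H = vt[-1].reshape(3, 3)
        return H / H[2, 2]              # normalize so H[2, 2] == 1

    def apply_h(H, pts):
        # Apply a homography to (N, 2) points in homogeneous coordinates.
        pts = np.asarray(pts, dtype=np.float64)
        q = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
        return q[:, :2] / q[:, 2:3]
    ```

    With the transform in hand, the scanned form's field coordinates can be mapped back onto the master form's template.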

    All-sky search for periodic gravitational waves in LIGO S4 data

    We report on an all-sky search with the LIGO detectors for periodic gravitational waves in the frequency range 50-1000 Hz and with the frequency's time derivative in the range -1.0E-8 Hz/s to zero. Data from the fourth LIGO science run (S4) have been used in this search. Three different semi-coherent methods of transforming and summing strain power from Short Fourier Transforms (SFTs) of the calibrated data have been used. The first, known as "StackSlide", averages normalized power from each SFT. A "weighted Hough" scheme is also developed and used; it additionally allows for a multi-interferometer search. The third method, known as "PowerFlux", is a variant of the StackSlide method in which the power is weighted before summing. In both the weighted Hough and PowerFlux methods, the weights are chosen according to the noise and detector antenna pattern to maximize the signal-to-noise ratio. The respective advantages and disadvantages of these methods are discussed. Observing no evidence of periodic gravitational radiation, we report upper limits, which we interpret as limits on this radiation from isolated rotating neutron stars. The best population-based upper limit with 95% confidence on the gravitational-wave strain amplitude, found for simulated sources distributed isotropically across the sky and with isotropically distributed spin-axes, is 4.28E-24 (near 140 Hz). Strict upper limits are also obtained for small patches on the sky for best-case and worst-case inclinations of the spin axes.
    Comment: 39 pages, 41 figures. An error was found in the computation of the C parameter defined in equation 44 which led to its overestimate by 2^(1/4). The correct values for the multi-interferometer, H1 and L1 analyses are 9.2, 9.7, and 9.3, respectively. Figure 32 has been updated accordingly. None of the upper limits presented in the paper were affected.
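    The summing strategies the abstract contrasts can be illustrated in a few lines. The sketch below shows only the per-bin averaging of normalized SFT power (StackSlide-style) and a noise-weighted variant in the spirit of the weighted Hough / PowerFlux methods; the essential "sliding" of frequency bins to undo Doppler and spin-down effects, and the antenna-pattern factors in the real weights, are deliberately omitted.

    ```python
    import numpy as np

    def stackslide_average(sft_power, noise_psd):
        # StackSlide-style statistic: mean of noise-normalized power per
        # frequency bin across SFTs (bin sliding omitted in this sketch).
        # sft_power: (n_sft, n_bins); noise_psd: per-SFT noise estimate.
        norm = sft_power / noise_psd[:, None]
        return norm.mean(axis=0)

    def weighted_average(sft_power, weights):
        # Weighted variant: each SFT's power is scaled so quieter data
        # counts more, an illustrative simplification of the weighting
        # used by the weighted Hough and PowerFlux methods.
        w = weights / weights.sum()
        return (sft_power * w[:, None]).sum(axis=0)
    ```

    A persistent periodic signal then accumulates in a single frequency bin of the averaged spectrum, while noise averages down.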