4,006 research outputs found
Vanishing Point Detection with Direct and Transposed Fast Hough Transform inside the neural network
In this paper, we suggest a new neural network architecture for vanishing
point detection in images. The key element is the use of the direct and
transposed Fast Hough Transforms separated by convolutional layer blocks with
standard activation functions. This design yields the network's output directly in the coordinates of the input image, so the vanishing point can be computed by simply selecting the maximum response. Moreover, it is proved that the transposed Fast Hough Transform can be computed using the direct one. The use of integral operators
enables the neural network to rely on global rectilinear features in the image,
and so it is ideal for detecting vanishing points. To demonstrate the
effectiveness of the proposed architecture, we use a set of images from a DVR
and show its superiority over existing methods. We also note that the proposed architecture essentially mirrors the process of direct and back projection used, for example, in computed tomography.
Comment: 9 pages, 9 figures, submitted to "Computer Optics"; extra experiment added, new theorem proof added, references added; typos corrected
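The pipeline described above pairs a direct Hough transform with its transpose (back projection), so the vanishing point can be read off as an argmax in input-image coordinates. A minimal sketch of that idea, with a naive dense accumulator standing in for the paper's Fast Hough Transform and a simple squaring nonlinearity standing in for the convolutional blocks (all details here are illustrative, not the authors' implementation):

```python
import numpy as np

def hough_direct(img, n_theta=180):
    """Direct Hough transform: accumulate pixel mass into (theta, rho) bins."""
    h, w = img.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, 2 * diag))
    ys, xs = np.nonzero(img)
    for i, t in enumerate(thetas):
        rhos = np.round(xs * np.cos(t) + ys * np.sin(t)).astype(int) + diag
        np.add.at(acc[i], rhos, img[ys, xs])
    return acc, thetas, diag

def hough_transposed(acc, shape, thetas, diag):
    """Transposed Hough transform (back projection) into image coordinates."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    out = np.zeros((h, w))
    for i, t in enumerate(thetas):
        rhos = np.round(xs * np.cos(t) + ys * np.sin(t)).astype(int) + diag
        out += acc[i][rhos]
    return out

# Two lines crossing at (x=40, y=30); the crossing plays the vanishing point.
img = np.zeros((64, 64))
for x in range(64):
    for y in (x - 10, int(round(50 - 0.5 * x))):
        if 0 <= y < 64:
            img[y, x] = 1.0

acc, thetas, diag = hough_direct(img)
# squaring sharpens accumulator peaks before back projection
back = hough_transposed(acc ** 2, img.shape, thetas, diag)
y_vp, x_vp = np.unravel_index(np.argmax(back), back.shape)
```

Because the forward and transposed transforms are a matrix and its transpose applied to the flattened image, the output lives in the same coordinate frame as the input, which is exactly what makes the final argmax meaningful.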
Advanced Hough-based method for on-device document localization
The demand for on-device document recognition systems increases in
conjunction with the emergence of more strict privacy and security
requirements. In such systems, there is no data transfer from the end device to
third-party information processing servers. The response time is vital to the
user experience of on-device document recognition. Combined with the
unavailability of discrete GPUs, powerful CPUs, or a large RAM capacity on
consumer-grade end devices such as smartphones, the time limitations put
significant constraints on the computational complexity of the applied
algorithms for on-device execution.
In this work, we consider document location in an image without prior
knowledge of the document content or its internal structure. According to published works, at least five systems offer solutions for on-device document location. All these systems use a location method that can be considered
Hough-based. The precision of such systems seems to be lower than that of the
state-of-the-art solutions which were not designed to account for the limited
computational resources.
We propose an advanced Hough-based method. In contrast with other approaches,
it accounts for the geometric invariants of the central projection model and
combines both edge and color features for document boundary detection. The
proposed method achieved the second-best precision on the SmartDoc dataset, surpassed only by a U-net-like neural network. When evaluated on the more challenging MIDV-500 dataset, the proposed algorithm achieved the best precision among published methods, while remaining applicable to on-device computation.
Comment: This is a preprint of the article submitted for publication in the journal "Computer Optics"
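A Hough-based boundary search of the kind this abstract describes can be illustrated with a coarse accumulator plus greedy peak picking over edge pixels. The synthetic rectangle, bin sizes, and suppression window below are assumptions for the sketch, not the paper's method:

```python
import numpy as np

def boundary_lines(edge_img, n_theta=90):
    """Score candidate document boundary lines with a coarse Hough accumulator."""
    h, w = edge_img.shape
    diag = int(np.ceil(np.hypot(h, w)))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_theta, 2 * diag))
    ys, xs = np.nonzero(edge_img)
    for i, t in enumerate(thetas):
        rhos = np.round(xs * np.cos(t) + ys * np.sin(t)).astype(int) + diag
        np.add.at(acc[i], rhos, 1)
    return acc, thetas, diag

def top_lines(acc, thetas, diag, k=4, suppress=5):
    """Pick k strongest, mutually distinct lines by greedy non-max suppression."""
    acc = acc.copy()
    lines = []
    for _ in range(k):
        i, j = np.unravel_index(np.argmax(acc), acc.shape)
        lines.append((thetas[i], j - diag))
        # wipe a neighbourhood so near-duplicate lines are not re-selected
        acc[max(0, i - suppress):i + suppress + 1,
            max(0, j - suppress):j + suppress + 1] = 0
    return lines

# Synthetic document: a rectangle's four borders in a 120x120 edge map.
edge = np.zeros((120, 120), dtype=bool)
edge[20:80, 30] = edge[20:80, 89] = True   # left / right borders
edge[20, 30:90] = edge[79, 30:90] = True   # top / bottom borders

acc, thetas, diag = boundary_lines(edge)
lines = top_lines(acc, thetas, diag)
```

In a real system the four recovered lines would then be filtered by the geometric invariants of the central projection model; here they simply come back as the four strongest accumulator peaks.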
Identifying music documents in a collection of images
Digital libraries and search engines are now well-equipped to find images of documents based on queries. Many images of music scores are now available, often mixed up with textual documents and images. For example, using the Google “images” search feature, a search for “Beethoven” will return a number of scores and manuscripts as well as pictures of the composer. In this paper we report on an investigation into methods to mechanically determine if a particular document is indeed a score, so that the user can specify that only musical scores should be returned. The goal is to find a minimal set of features that can be used as a quick test that will be applied to large numbers of documents.
A variety of filters were considered, and two promising ones (run-length ratios and the Hough transform) were evaluated. We found that a method based on run-lengths in vertical scans (RL) out-performs a comparable algorithm using the Hough transform (HT). On a test set of 1030 images, RL achieved recall and precision of 97.8% and 88.4% respectively, while HT achieved 97.8% and 73.5%. In terms of processor time, RL was more than five times as fast as HT.
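A run-length filter in the spirit of RL can be sketched as follows. Staff lines make vertical scans of a score hit many black runs of nearly identical thickness; the specific statistic below (the share of vertical black runs at the modal length) is a plausible stand-in, not necessarily the authors' exact feature:

```python
import numpy as np

def vertical_black_runs(binary):
    """Lengths of consecutive black-pixel runs in each vertical scan (column)."""
    runs = []
    for col in binary.T:
        n = 0
        for px in col:
            if px:
                n += 1
            elif n:
                runs.append(n)
                n = 0
        if n:                      # run touching the bottom edge
            runs.append(n)
    return runs

def modal_run_fraction(runs):
    """Share of runs at the most common length; staff lines give a high value."""
    if not runs:
        return 0.0
    return np.bincount(runs).max() / len(runs)

# Synthetic "score": five 2-pixel-thick staff lines spanning the page width.
score = np.zeros((60, 40), dtype=bool)
for y0 in (10, 20, 30, 40, 50):
    score[y0:y0 + 2, :] = True
```

Thresholding such a statistic gives the kind of quick, cheap test the paper is after: one pass over the image, no parameter-space accumulation.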
Proceedings of the 4th field robot event 2006, Stuttgart/Hohenheim, Germany, 23-24th June 2006
A very extensive report on the 4th Field Robot Event, held on 23 and 24 June 2006 in Stuttgart/Hohenheim.
HoughNet: neural network architecture for vanishing points detection
In this paper we introduce a novel neural network architecture based on Fast
Hough Transform layer. The layer of this type allows our neural network to
accumulate features from linear areas across the entire image instead of local
areas. We demonstrate its potential by solving the problem of vanishing point detection in document images. Such a problem occurs when dealing with camera shots of documents in uncontrolled conditions, where the document image can suffer several specific distortions, including projective transform. To train our model, we use the MIDV-500 dataset and provide testing results. The strong generalization ability of the suggested method is demonstrated by applying it to the completely different ICDAR 2011 dewarping contest. In previously published papers considering this dataset, the authors measured the quality of vanishing point detection by counting the words correctly recognized by the open OCR engine Tesseract. To compare with them, we reproduce this experiment and show that our method outperforms the state-of-the-art result.
Comment: 6 pages, 6 figures, 2 tables, 28 references, conference
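The evaluation protocol mentioned above (counting correctly recognized words from an OCR engine) can be approximated by a multiset recall over word lists; this is a simplification, and the cited papers' exact counting rule may differ:

```python
from collections import Counter

def word_recall(gt_words, ocr_words):
    """Fraction of ground-truth words found in the OCR output (multiset match)."""
    gt, ocr = Counter(gt_words), Counter(ocr_words)
    hit = sum(min(c, ocr[w]) for w, c in gt.items())
    return hit / max(1, sum(gt.values()))

recall = word_recall("the cat sat on the mat".split(),
                     "the cat sat on the hat".split())
```

A metric like this rewards correct vanishing-point estimates indirectly: a well-rectified page yields cleaner OCR and hence a higher word count.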
Form processing with the Hough transform
A form document processing system based on the Hough transform (HT) is developed. It performs form identification and form registration. For form identification, the HT is applied off-line to master forms to calculate form features and build up the feature database; it is then applied on-line to the input (scanned) forms to extract features and identify the form type by feature matching. The derived features are rotation, translation, and scale invariant. The proposed form description is compact, thereby allowing fast identification. The registration is feature/knowledge based. Two methods for control-point detection are discussed: one implements template matching to find frame corners; the second is based on detecting line crossings via analysis of the HT parameter space. Detected control points are used to calculate the parameters of the geometric transform and to perform coordinate translation. Linear conformal and projective transforms are tested. The system features fast and reliable type identification and moderate preprocessing time, attained by proper design of the Hough space.
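Two of the steps above are easy to make concrete: intersecting two lines found as HT parameter-space peaks to obtain a control point, and fitting a linear conformal (scale, rotation, shift) transform to matched control points by least squares. A sketch, illustrative rather than the described system's code:

```python
import numpy as np

def line_intersection(t1, r1, t2, r2):
    """Intersect two Hough lines x*cos(t) + y*sin(t) = r; returns (x, y)."""
    A = np.array([[np.cos(t1), np.sin(t1)],
                  [np.cos(t2), np.sin(t2)]])
    return np.linalg.solve(A, np.array([r1, r2]))

def fit_conformal(src, dst):
    """Least-squares linear conformal transform from matched control points.
    Model: x' = a*x - b*y + tx ;  y' = b*x + a*y + ty."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, -y, 1, 0]); rhs.append(u)
        rows.append([y,  x, 0, 1]); rhs.append(v)
    sol, *_ = np.linalg.lstsq(np.array(rows, float), np.array(rhs, float),
                              rcond=None)
    return sol  # (a, b, tx, ty)

# Control point from a vertical line (x = 40) and a horizontal line (y = 30).
corner = line_intersection(0.0, 40.0, np.pi / 2, 30.0)

# Recover a known similarity transform from four matched corners.
s, phi, tx, ty = 1.2, 0.3, 5.0, -3.0
a0, b0 = s * np.cos(phi), s * np.sin(phi)
src = [(0, 0), (100, 0), (100, 50), (0, 50)]
dst = [(a0 * x - b0 * y + tx, b0 * x + a0 * y + ty) for x, y in src]
params = fit_conformal(src, dst)
```

With more than two point pairs the least-squares fit also averages out localization noise in the detected crossings, which is why registration systems prefer several control points over the minimum two.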
All-sky search for periodic gravitational waves in LIGO S4 data
We report on an all-sky search with the LIGO detectors for periodic
gravitational waves in the frequency range 50-1000 Hz and with the frequency's
time derivative in the range -1.0E-8 Hz/s to zero. Data from the fourth LIGO
science run (S4) have been used in this search. Three different semi-coherent
methods of transforming and summing strain power from Short Fourier Transforms
(SFTs) of the calibrated data have been used. The first, known as "StackSlide",
averages normalized power from each SFT. A "weighted Hough" scheme is also developed and used, which additionally allows for a multi-interferometer search.
The third method, known as "PowerFlux", is a variant of the StackSlide method
in which the power is weighted before summing. In both the weighted Hough and
PowerFlux methods, the weights are chosen according to the noise and detector
antenna-pattern to maximize the signal-to-noise ratio. The respective
advantages and disadvantages of these methods are discussed. Observing no
evidence of periodic gravitational radiation, we report upper limits; we
interpret these as limits on this radiation from isolated rotating neutron
stars. The best population-based upper limit with 95% confidence on the
gravitational-wave strain amplitude, found for simulated sources distributed
isotropically across the sky and with isotropically distributed spin-axes, is
4.28E-24 (near 140 Hz). Strict upper limits are also obtained for small patches
on the sky for best-case and worst-case inclinations of the spin axes.
Comment: 39 pages, 41 figures. An error was found in the computation of the C parameter defined in equation 44, which led to its overestimate by 2^(1/4). The correct values for the multi-interferometer, H1, and L1 analyses are 9.2, 9.7, and 9.3, respectively. Figure 32 has been updated accordingly. None of the upper limits presented in the paper were affected.
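The difference between StackSlide and the weighted schemes described above is just the weighting of normalized SFT power before summing. A toy sketch, assuming weights proportional to the squared antenna pattern divided by the noise level (a standard SNR-maximizing choice, stated here as an assumption rather than taken from the paper):

```python
import numpy as np

def stackslide(power):
    """StackSlide-style statistic: plain average of normalized power across SFTs."""
    return power.mean(axis=0)

def weighted_stat(power, noise_var, antenna_sq):
    """Weighted sum of normalized power; weights favour quiet, sensitive SFTs."""
    w = antenna_sq / noise_var
    w = w / w.sum()
    return w @ power

# 20 SFTs x 64 frequency bins; a weak signal at bin 7 whose contribution
# tracks the squared antenna pattern and is diluted in noisier SFTs.
n_sft, n_bins, k = 20, 64, 7
noise_var = np.linspace(1.0, 4.0, n_sft)
antenna_sq = np.linspace(0.2, 1.0, n_sft)
power = np.ones((n_sft, n_bins))
power[:, k] += antenna_sq / noise_var
stat = weighted_stat(power, noise_var, antenna_sq)
```

Down-weighting noisy or poorly oriented SFTs concentrates the statistic on the data segments that actually carry signal, which is the stated rationale for the weighted Hough and PowerFlux variants.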