516 research outputs found

    Automatic visual recognition using parallel machines

    Invariant features and fast matching algorithms are two major concerns in automatic visual recognition. The former reduce the size of an established model database, and the latter shorten the computation time. This dissertation discusses both line invariants under perspective projection and a parallel implementation of a dynamic programming technique for shape recognition. The feasibility of using parallel machines is demonstrated through the dramatically reduced time complexity. Our algorithms are implemented on AP1000 MIMD parallel machines. For processing an object with n features, the time complexity of the proposed parallel algorithm is O(n), while that of a uniprocessor is O(n^2). Two applications, one for shape matching and the other for chain-code extraction, demonstrate the usefulness of our methods. Invariants from four general lines under perspective projection are also discussed. In contrast to the approach that uses epipolar geometry, we investigate the invariants under isotropy subgroups. Theoretically, two independent invariants can be found for four general lines in 3D space. In practice, we show how to obtain these two invariants from the projective images of four general lines without camera calibration. Object recognition is achieved by matching the scene projective invariants to the model projective invariants, called transfer. A projective invariant recognition system based on a hypothesis-generation-testing scheme is then run on the hypercube parallel architecture.
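The drop from O(n^2) to O(n) quoted above is typical of wavefront parallelisation of a dynamic programming table. The sketch below is only an illustration of that general idea, not the dissertation's algorithm: it uses plain edit distance as a stand-in recurrence, and the names `wavefront_schedule` and `edit_distance_wavefront` are hypothetical. Cells on the same anti-diagonal do not depend on each other, so with about n processors each anti-diagonal could be filled in one parallel step.

```python
def wavefront_schedule(n, m):
    """Group the cells (i, j) of an n-by-m DP table by anti-diagonal i + j."""
    return [[(i, d - i) for i in range(max(0, d - m + 1), min(n, d + 1))]
            for d in range(n + m - 1)]


def edit_distance_wavefront(a, b):
    """Edit distance filled wave by wave (stand-in for a shape-matching DP).

    Returns (distance, number_of_waves). The waves are executed sequentially
    here; on a parallel machine each wave could be one parallel step.
    """
    n, m = len(a), len(b)
    # D[i][j] is the cost of aligning a[:i] with b[:j].
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i
    for j in range(m + 1):
        D[0][j] = j

    waves = 0
    for wave in wavefront_schedule(n, m):
        if not wave:
            continue
        # Every cell in this wave reads only cells from earlier waves
        # (or the initialised border), so the loop body is independent per cell.
        for i, j in wave:
            cost = 0 if a[i] == b[j] else 1
            D[i + 1][j + 1] = min(D[i][j + 1] + 1,   # deletion
                                  D[i + 1][j] + 1,   # insertion
                                  D[i][j] + cost)    # match or substitution
        waves += 1
    return D[n][m], waves


if __name__ == "__main__":
    dist, waves = edit_distance_wavefront("kitten", "sitting")
    cells = sum(len(w) for w in wavefront_schedule(6, 7))
    print(dist, waves, cells)
```

For inputs of lengths n and m, the table has n·m cells but only n + m − 1 waves, which is where the linear parallel step count comes from under the assumption of one processor per cell in a wave.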

    Geometric and photometric affine invariant image registration

    This thesis aims to present a solution to the correspondence problem for the registration of wide-baseline images taken from uncalibrated cameras. We propose an affine invariant descriptor that combines the geometry and photometry of the scene to find correspondences between both views. The geometric affine invariant component of the descriptor is based on the affine arc-length metric, whereas the photometry is analysed by invariant colour moments. A graph structure represents the spatial distribution of the primitive features; i.e. nodes correspond to detected high-curvature points, whereas arcs represent connectivities by extracted contours. After matching, we refine the search for correspondences by using a maximum likelihood robust algorithm. We have evaluated the system over synthetic and real data. The method is endemic to propagation of errors introduced by approximations in the system. BAE Systems; Selex Sensors and Airborne System

    Estimating the best reference homography for planar mosaics from videos

    This paper proposes a novel strategy to find the best reference homography in mosaics from video sequences. The reference homography globally minimizes the distortions induced on each image frame by the mosaic homography itself. This method is designed for planar mosaics on which a bad choice of the first reference image frame can lead to severe distortions after concatenating several successive homographies. This often happens in the case of underwater mosaics with non-flat seabed and no georeferential information available. Given a video sequence of an almost planar surface, sub-mosaics with low distortions of temporally close image frames are computed and successively merged according to a hierarchical clustering procedure. A robust and effective feature tracker using an approximated global position map between image frames allows us to build the mosaic also between locally close but not temporally consecutive frames. Sub-mosaics are successively merged by concatenating their relative homographies with another reference homography which minimizes the distortion on each frame of the fused image. Experimental results on challenging real underwater videos show the validity of the proposed method
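As a rough illustration of why the choice of reference frame matters when homographies are concatenated, the following sketch chains pairwise homographies and then re-expresses them relative to a different reference frame. It is a minimal example under stated assumptions, not the paper's method: the helper names are hypothetical, the test sequence is a synthetic constant zoom, and `worst_log_scale` is an ad hoc distortion proxy chosen for this example rather than the distortion measure the paper minimizes.

```python
import math


def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def inv3(M):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def chain(pairwise):
    """pairwise[k] maps frame k+1 into frame k; returns maps into frame 0."""
    acc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    to_first = [acc]
    for H in pairwise:
        acc = matmul3(acc, H)
        to_first.append(acc)
    return to_first


def rereference(to_first, r):
    """Express every frame relative to frame r instead of frame 0."""
    Hr_inv = inv3(to_first[r])
    return [matmul3(Hr_inv, H) for H in to_first]


def worst_log_scale(homographies):
    """Ad hoc proxy (an assumption): largest |log area-scale| over all frames."""
    def scale(H):
        return math.sqrt(abs(H[0][0] * H[1][1] - H[0][1] * H[1][0])) / abs(H[2][2])
    return max(abs(math.log(scale(H))) for H in homographies)


if __name__ == "__main__":
    zoom = [[1.1, 0.0, 0.0], [0.0, 1.1, 0.0], [0.0, 0.0, 1.0]]
    to_first = chain([zoom] * 4)            # five frames, each 10% larger
    print(worst_log_scale(to_first))        # reference = first frame
    print(worst_log_scale(rereference(to_first, 2)))  # reference = middle frame
```

In this synthetic zoom sequence, taking the middle frame as reference halves the worst-case scale change compared with taking the first frame, which is the kind of effect a well-chosen reference homography is meant to exploit.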

    Geometric data understanding : deriving case specific features

    There exists a tradition of precise geometric modeling, where uncertainties in data can be considered noise. Another tradition relies on the statistical nature of vast quantities of data, where geometric regularity is intrinsic to the data and statistical models usually grasp this level only indirectly. This work focuses on point cloud data of natural resources and silhouette recognition from video input as two real-world examples of problems whose geometric content is intangible in the raw data presentation. This content could be discovered and modeled to some degree by machine learning (ML) approaches such as deep learning, but either direct coverage of geometry in the samples or the addition of a special geometry-invariant layer is necessary. Geometric content is central when there is a need for direct observations of spatial variables, or when one needs a mapping to a geometrically consistent data representation in which, e.g., outliers or noise can be easily discerned. In this thesis we consider the transformation of original input data to a geometric feature space in two example problems. The first example is curvature of surfaces, which has met renewed interest since the introduction of ubiquitous point cloud data and the maturation of discrete differential geometry. Curvature spectra can characterize a spatial sample rather well, and provide useful features for ML purposes. The second example involves projective methods applied to stereo video signal analysis in swimming analytics. The aim is to find meaningful local geometric representations for feature generation, which also facilitate additional analysis based on geometric understanding of the model. The features are associated directly with some geometric quantity, which makes it easier to express the geometric constraints in a natural way, as shown in the thesis. Also, visualization and further feature generation are much easier.
Third, the approach provides sound baseline methods for more traditional ML approaches, e.g. neural network methods. Fourth, most ML methods can utilize the geometric features presented in this work as additional features.

    RCDN -- Robust X-Corner Detection Algorithm based on Advanced CNN Model

    Accurate detection and localization of X-corners on both planar and non-planar patterns is a core step in robotics and machine vision. However, previous works could not strike a good balance between accuracy and robustness, which are both crucial criteria for evaluating a detector's performance. To address this problem, in this paper we present a novel detection algorithm that maintains high sub-pixel precision on inputs under multiple kinds of interference, such as lens distortion, extreme poses and noise. The whole algorithm, adopting a coarse-to-fine strategy, contains an X-corner detection network and three post-processing techniques to distinguish the correct corner candidates, as well as a mixed sub-pixel refinement technique and an improved region growth strategy to automatically recover checkerboard patterns that are partially visible or occluded. Evaluations on real and synthetic images indicate that the presented algorithm has a higher detection rate, sub-pixel accuracy and robustness than other commonly used methods. Finally, experiments on camera calibration and pose estimation verify that it also achieves a smaller re-projection error in quantitative comparisons with the state of the art. Comment: 15 pages, 8 figures and 4 tables. Unpublished further research and experiments of Checkerboard corner detection network CCDN (arXiv:2302.05097) and application exploration for robust camera calibration (https://ieeexplore.ieee.org/abstract/document/9428389)

    Robot Vision in the Language of Geometric Algebra


    Feature-based Image Comparison and Its Application in Wireless Visual Sensor Networks

    This dissertation studies the feature-based image comparison method and its application in Wireless Visual Sensor Networks. Wireless Visual Sensor Networks (WVSNs), formed by a large number of low-cost, small-size visual sensor nodes, represent a new trend in surveillance and monitoring practices. Although each single sensor has very limited capability in sensing, processing and transmission, by working together they can achieve various high level tasks. Sensor collaboration is essential to WVSNs and normally performed among sensors having similar measurements, which are called neighbor sensors. The directional sensing characteristics of imagers and the presence of visual occlusion present unique challenges to neighborhood formation, as geographically-close neighbors might not monitor similar scenes. Besides, the energy resource on the WVSNs is also very tight, with wireless communication and complicated computation consuming most energy in WVSNs. Therefore the feature-based image comparison method has been proposed, which directly compares the captured image from each visual sensor in an economical way in terms of both the computational cost and the transmission overhead. The feature-based image comparison method compares different images and aims to find similar image pairs using a set of local features from each image. The image feature is a numerical representation of the raw image and can be more compact in terms of the data volume than the raw image. The feature-based image comparison contains three steps: feature detection, descriptor calculation and feature comparison. For the step of feature detection, the dissertation proposes two computationally efficient corner detectors. The first detector is based on the Discrete Wavelet Transform that provides multi-scale corner point detection and the scale selection is achieved efficiently through a Gaussian convolution approach. 
The second detector is based on a linear unmixing model, which treats a corner point as the intersection of two or three "line" bases in a 3 by 3 region. The line bases are extracted through a constrained Nonnegative Matrix Factorization (NMF) approach, and corner detection is accomplished by counting the number of contributing bases in the linear mixture. For the step of descriptor calculation, the dissertation proposes an effective dimensionality reduction algorithm for the high-dimensional Scale Invariant Feature Transform (SIFT) descriptors. A set of 40 SIFT descriptor bases is extracted through constrained NMF from a large training set, and all SIFT descriptors are then projected onto the space spanned by these bases, achieving dimensionality reduction. The efficiency of the proposed corner detectors has been proven through theoretical analysis. In addition, the effectiveness of the proposed corner detectors and the dimensionality reduction approach has been validated through extensive comparison with several state-of-the-art feature detector/descriptor combinations.