280 research outputs found

    Learning scale-variant and scale-invariant features for deep image classification

    Convolutional Neural Networks (CNNs) require large image corpora to be trained on classification tasks. Variation in image resolution, in the size of the depicted objects and patterns, and in image scale hampers CNN training and performance, because the task-relevant information varies over spatial scales. Previous work attempting to deal with such scale variation focused on encouraging scale-invariant CNN representations. However, scale-invariant representations are incomplete representations of images, because images contain scale-variant information as well. This paper addresses the combined development of scale-invariant and scale-variant representations. We propose a multi-scale CNN method to encourage the recognition of both types of features and evaluate it on a challenging image classification task involving task-relevant characteristics at multiple scales. The results show that our multi-scale CNN outperforms a single-scale CNN, leading to the conclusion that encouraging the combined development of scale-invariant and scale-variant representations in CNNs is beneficial to image recognition performance.
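
    As a rough illustration of the idea rather than the paper's architecture, the sketch below feeds the same image at two scales into separate convolutional branches and concatenates their features before classification; the framework (PyTorch) and all layer sizes are assumptions chosen for the example.

        # Minimal two-branch multi-scale CNN sketch; not the paper's exact model.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class MultiScaleCNN(nn.Module):
            def __init__(self, num_classes=10):
                super().__init__()
                def branch():  # one small branch per scale, weights not shared
                    return nn.Sequential(
                        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                        nn.MaxPool2d(2),
                        nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1),  # global pooling -> 64-d vector
                    )
                self.fine = branch()    # full-resolution input
                self.coarse = branch()  # half-resolution input
                self.fc = nn.Linear(128, num_classes)

            def forward(self, x):
                x_half = F.interpolate(x, scale_factor=0.5, mode='bilinear',
                                       align_corners=False)
                feats = torch.cat([self.fine(x).flatten(1),
                                   self.coarse(x_half).flatten(1)], dim=1)
                return self.fc(feats)   # fused scale-variant and scale-invariant cues

        logits = MultiScaleCNN(num_classes=10)(torch.randn(2, 3, 224, 224))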

    3D machine vision system for robotic weeding and plant phenotyping

    The need for chemical-free food is increasing, and so is the demand for a larger supply to feed the growing global population. An autonomous weeding system should be capable of differentiating crop plants and weeds to avoid contaminating crops with herbicide or damaging them with mechanical tools. For the plant genetics industry, automated high-throughput phenotyping technology is critical for profiling seedlings at a large scale to facilitate genomic research. This research applied 2D and 3D imaging techniques to develop an innovative crop plant recognition system and a 3D holographic plant phenotyping system. A 3D time-of-flight (ToF) camera was used to develop a crop plant recognition system for broccoli and soybean plants. The developed system overcame the previously unsolved problems caused by occluded canopies and illumination variation. Both 2D and 3D features were extracted and utilized for the plant recognition task. Broccoli and soybean recognition algorithms were developed based on the characteristics of the plants. In field experiments, detection rates of over 88.3% and 91.2% were achieved for broccoli and soybean plants, respectively. The detection algorithm also ran at over 30 frames per second (fps), making it applicable to robotic weeding operations. Apart from applying 3D vision to plant recognition, a 3D-reconstruction-based phenotyping system was also developed for holographic 3D reconstruction and physical trait parameter estimation of corn plants. In this application, precise alignment of multiple 3D views is critical to the 3D reconstruction of a plant. Previously published research highlighted the need for high-throughput, high-accuracy, and low-cost 3D phenotyping systems capable of holographic plant reconstruction and characterization of plant-morphology-related traits. This research contributed to the realization of such a system by integrating a low-cost 2D camera, a low-cost 3D ToF camera, and a chessboard-pattern beacon array to track the 3D camera's position and attitude, thus accomplishing precise 3D point cloud registration from multiple views. Specifically, algorithms for beacon target detection, camera pose tracking, and spatial relationship calibration between the 2D and 3D cameras were developed. The phenotypic data obtained by this novel 3D-reconstruction-based phenotyping system were validated against experimental data generated by instrument and manual measurements, showing that the system achieved a measurement accuracy of more than 90% in most cases, with an average processing time of less than five seconds per plant.
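
    The registration step can be pictured with the minimal numpy sketch below, which assumes the pose (R, t) of the 3D camera for each view has already been recovered, e.g. from the beacon array; the function names and synthetic data are illustrative only, not the system's implementation.

        # Merge per-view ToF point clouds into one world-frame cloud, given the
        # camera pose (R, t) of each view. Names and data are hypothetical.
        import numpy as np

        def to_world(points_cam, R, t):
            """Transform an (N, 3) point cloud from camera to world coordinates."""
            return points_cam @ R.T + t          # p_world = R @ p_cam + t, vectorised

        def merge_views(views):
            """views: list of (points_cam, R, t) tuples, one per camera position."""
            return np.vstack([to_world(p, R, t) for p, R, t in views])

        rng = np.random.default_rng(0)
        cloud = rng.normal(size=(100, 3))        # synthetic single-view cloud
        merged = merge_views([(cloud, np.eye(3), np.zeros(3)),
                              (cloud, np.eye(3), np.array([0.0, 0.0, 0.5]))])
        print(merged.shape)                      # (200, 3)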

    Humanoid Localization on Robocup Field using Corner Intersection and Geometric Distance Estimation

    In the humanoid competition field, identifying landmarks for localizing robots in a dynamic environment is of crucial importance. By convention, state-of-the-art humanoid vision systems rely on poles located outside the middle of the field as indicators for generating landmarks. However, under the recent RoboCup rules, the middle pole has been discarded to deliberately provide less prior information for the humanoid vision system when strategizing its winning tactics on the field. Previous localization methods used the middle poles as landmarks. Therefore, robot localization must now apply accurate corner and distance detection simultaneously to locate the positions of goalposts. State-of-the-art corner detection algorithms such as the Harris corner detector and mean projection transformation are excessively sensitive to image noise and suffer from high processing times. Moreover, despite their prevalence in robot motor logs and fish-eye lens calibration for humanoid localization, current distance estimation techniques remain highly dependent on multiple poles as vision landmarks, apart from being prone to large localization errors. Thus, we propose a novel localization method consisting of a proposed corner extraction algorithm, namely the contour intersection algorithm (CIA), and a distance estimation algorithm, namely analytic geometric estimation (AGE), for efficiently identifying salient goalposts. First, the proposed CIA, which is based on linear contour intersection using a projection matrix, is utilized to extract the corners of a goalpost after performing an adaptive binarization process. Then, these extracted corner features are fed into our proposed AGE algorithm to estimate the real-world distance using analytic geometry. As a result, the proposed localization vision system and the state-of-the-art method obtained approximately 3-4 and 7-23 centimeter estimation errors, respectively. This demonstrates the capability of the proposed localization algorithm to outperform other methods, which renders it more effective for indoor localization tasks supporting further actions such as attack or defense strategies.
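
    As a hedged sketch of the distance-estimation idea (a simplified stand-in for the AGE step, not its actual formulation), the snippet below ranges a goalpost from the pixel span between its extracted top and bottom corners using basic pinhole geometry; the focal length and goalpost height are assumed values.

        # Distance from two vertical goalpost corners via similar triangles.
        FOCAL_PX = 600.0          # focal length in pixels (assumed calibration)
        GOALPOST_HEIGHT_M = 0.8   # physical goalpost height in metres (assumed)

        def distance_from_corners(top_corner, bottom_corner):
            """top_corner, bottom_corner: (u, v) pixels from the corner extractor."""
            pixel_height = abs(bottom_corner[1] - top_corner[1])
            if pixel_height == 0:
                raise ValueError("degenerate corner pair")
            # real_height / distance = pixel_height / focal_length
            return FOCAL_PX * GOALPOST_HEIGHT_M / pixel_height

        print(distance_from_corners((310, 95), (312, 215)))   # 4.0 metres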

    Orchard mapping and mobile robot localisation using on-board camera and laser scanner data fusion

    Agricultural mobile robots have great potential to effectively implement different agricultural tasks. They can save human labour costs, avoid the need for people to perform risky operations and increase productivity. Automation and advanced sensing technologies can provide up-to-date information that helps farmers in orchard management. Data collected from on-board sensors on a mobile robot provide information that can help the farmer detect tree or fruit diseases or damage, measure tree canopy volume and monitor fruit development. In orchards, trees are natural landmarks providing suitable cues for mobile robot localisation and navigation, as trees are nominally planted in straight and parallel rows. This thesis presents a novel tree trunk detection algorithm that detects trees and discriminates between trees and non-tree objects in the orchard using a camera and 2D laser scanner data fusion. A local orchard map of the individual trees was developed, allowing the mobile robot to navigate to a specific tree in the orchard to perform a specific task such as tree inspection. Furthermore, this thesis presents a localisation algorithm that does not rely on GPS positions and depends only on the on-board sensors of the mobile robot, without adding any artificial landmarks, reflective tapes or tags to the trees. The novel tree trunk detection algorithm combined features extracted from a low-cost camera's images and 2D laser scanner data to increase the robustness of the detection. The developed algorithm used a new method to detect the edge points and determine the width of the tree trunks and non-tree objects from the laser scan data. A projection of the edge points from the laser scanner coordinates to the image plane was then implemented to construct a region of interest with the required features for tree trunk colour and edge detection. The camera images were used to verify the colour and the parallel edges of the tree trunks and non-tree objects. The algorithm automatically adjusted the colour detection parameters after each test, which was shown to increase the detection accuracy. The orchard map was constructed based on tree trunk detection and consisted of the 2D positions of the individual trees and non-tree objects. The map of the individual trees was used as an a priori map for mobile robot localisation. A data fusion algorithm based on an Extended Kalman filter was used for pose estimation of the mobile robot along different paths (midway between rows, close to the rows and moving around trees in the row) and different turns (semi-circle and right-angle turns) required for tree inspection tasks. The 2D positions of the individual trees were used in the correction step of the Extended Kalman filter to enhance localisation accuracy. Experimental tests were conducted in a simulated environment and in a real orchard to evaluate the performance of the developed algorithms. The tree trunk detection algorithm was evaluated under two broad illumination conditions (sunny and cloudy). The algorithm was able to detect the tree trunks (regular and thin tree trunks) and discriminate between trees and non-tree objects with a detection accuracy of 97%, showing that the fusion of vision and 2D laser scanner technologies produced robust tree trunk detection. The mapping method successfully localised all the trees and non-tree objects of the tested tree rows in the orchard environment. The mapping results indicated that the constructed map can be reliably used for mobile robot localisation and navigation. The localisation algorithm was evaluated against logged RTK-GPS positions for different paths and headland turns. The averages of the RMS position errors in the x and y coordinates and in Euclidean distance were 0.08 m, 0.07 m and 0.103 m respectively, whilst the average RMS heading error was 3.32°. These results were considered acceptable for driving along the rows and executing headland turns in the target application of autonomous mobile robot navigation and tree inspection tasks in orchards.
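
    The landmark correction described above can be pictured as a single Extended Kalman filter update, sketched below with numpy; the state layout, Jacobian and noise values are generic placeholders rather than the thesis' tuned parameters.

        # One EKF correction step using a mapped tree as a range/bearing landmark.
        import numpy as np

        def ekf_correct(x, P, z, tree_xy, R_meas):
            """x: robot state [x, y, theta]; P: 3x3 covariance;
            z: measured [range, bearing] to a tree at known position tree_xy."""
            dx, dy = tree_xy[0] - x[0], tree_xy[1] - x[1]
            q = dx**2 + dy**2
            r = np.sqrt(q)
            z_pred = np.array([r, np.arctan2(dy, dx) - x[2]])   # expected measurement
            H = np.array([[-dx / r, -dy / r, 0.0],              # Jacobian of h(x)
                          [dy / q, -dx / q, -1.0]])
            y = z - z_pred
            y[1] = (y[1] + np.pi) % (2 * np.pi) - np.pi         # wrap bearing residual
            S = H @ P @ H.T + R_meas
            K = P @ H.T @ np.linalg.inv(S)                      # Kalman gain
            return x + K @ y, (np.eye(3) - K @ H) @ P

        x, P = np.array([0.0, 0.0, 0.0]), np.eye(3) * 0.1
        z = np.array([5.1, 0.02])                               # range (m), bearing (rad)
        x, P = ekf_correct(x, P, z, tree_xy=(5.0, 0.0), R_meas=np.diag([0.05, 0.01]))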

    Advanced Calibration of Automotive Augmented Reality Head-Up Displays = Erweiterte Kalibrierung von Automotiven Augmented Reality-Head-Up-Displays

    This work presents advanced calibration methods for augmented reality head-up displays (AR-HUDs) in motor vehicles, based on parametric perspective projections and non-parametric distortion models. AR-HUD calibration is important for placing virtual objects correctly in relevant applications such as navigation systems or parking manoeuvres. Although the state of the art offers some useful approaches to this problem, this dissertation pursues the goal of developing more advanced yet less complicated approaches. As a prerequisite for the calibration, we define several relevant coordinate systems, including the three-dimensional (3D) world, the viewpoint space, the HUD field-of-view (HUD-FOV) space, and the two-dimensional (2D) virtual image space. We describe the projection of images from an AR-HUD projector towards the driver's eyes as a view-dependent pinhole camera model consisting of intrinsic and extrinsic matrices. Under this assumption, we first estimate the intrinsic matrix using the boundaries of the HUD viewing area. Next, we calibrate the extrinsic matrices at different viewpoints within a selected "eyebox", taking the driver's changing eye positions into account. The 3D positions of these viewpoints are tracked by a driver camera. For each individual viewpoint, we obtain a set of 2D-3D correspondences between a set of points in the virtual image space and their matching control points in front of the windshield. Once these correspondences are available, we compute the extrinsic matrix at the corresponding viewpoint. By comparing the re-projected and real pixel positions of these virtual points, we obtain a 2D distribution of bias vectors, from which we reconstruct warping maps that contain the information about the image distortion. For completeness, we repeat the above extrinsic calibration procedure at all selected viewpoints. With the calibrated extrinsic parameters, we recover the viewpoints in the world coordinate system. Since we simultaneously track these points in the driver camera space, we further calibrate the transformation from the driver camera into the world space using these 3D-3D correspondences. To handle non-participating viewpoints within the eyebox, we obtain their extrinsic parameters and warping maps by non-parametric interpolation. Our combination of parametric and non-parametric models outperforms the state of the art in terms of target complexity and time efficiency while maintaining comparable calibration accuracy. For all our calibration schemes, the projection errors in the evaluation phase are within a few millimetres at a distance of 7.5 metres, which corresponds to an angular accuracy of approximately 2 arcminutes, close to the resolving power of the eye.
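
    The per-viewpoint extrinsic estimation from 2D-3D correspondences can be illustrated with the sketch below, which uses OpenCV's solvePnP as a generic stand-in for that step; the intrinsic matrix and the synthetic control points are placeholders, not calibrated AR-HUD values.

        # Recover [R | t] at one viewpoint from 2D-3D correspondences (synthetic demo).
        import numpy as np
        import cv2

        K = np.array([[800.0, 0.0, 640.0],      # assumed intrinsic matrix (pixels)
                      [0.0, 800.0, 360.0],
                      [0.0, 0.0, 1.0]])

        # Synthetic 3D control points (metres) and a ground-truth pose used only
        # to generate matching 2D observations for the demo.
        obj_pts = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0],
                            [0.5, 0.5, 0.3], [0.2, 0.8, 0.6]], dtype=np.float64)
        rvec_gt = np.array([[0.05], [-0.1], [0.02]])
        tvec_gt = np.array([[0.1], [-0.2], [7.5]])
        img_pts, _ = cv2.projectPoints(obj_pts, rvec_gt, tvec_gt, K, None)

        ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, None)
        R, _ = cv2.Rodrigues(rvec)
        extrinsic = np.hstack([R, tvec])        # 3x4 [R | t] for this viewpoint
        print(ok, np.round(tvec.ravel(), 3))    # close to the ground-truth translation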

    Scan matching by cross-correlation and differential evolution

    Scan matching is an important task, solved in the context of many high-level problems including pose estimation, indoor localization, simultaneous localization and mapping, and others. Methods that are accurate and adaptive and at the same time computationally efficient are required to enable location-based services on autonomous mobile devices. Such devices usually have a wide range of high-resolution sensors but only limited processing power and a constrained energy supply. This work introduces a novel high-level scan matching strategy that uses a combination of two advanced algorithms recently used in this field: cross-correlation and differential evolution. The cross-correlation between two laser range scans is used as an efficient measure of scan alignment, and the differential evolution algorithm is used to search for the parameters of a transformation that aligns the scans. The proposed method was experimentally validated and showed a good ability to match laser range scans taken shortly after each other and an excellent ability to match laser range scans taken with longer time intervals between them.
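
    The overall strategy can be sketched as follows: scipy's differential evolution searches over a rigid transform (dx, dy, theta) while a grid-based overlap score stands in for the cross-correlation measure; the grid resolution, bounds and synthetic scans are assumptions for the example, not the authors' parameters.

        # Rigid scan alignment by differential evolution over (dx, dy, theta).
        import numpy as np
        from scipy.optimize import differential_evolution

        CELL = 0.1  # grid resolution in metres (assumed)

        def rasterize(points, size=200):
            grid = np.zeros((size, size))
            idx = np.floor(points / CELL).astype(int) + size // 2
            ok = (idx >= 0).all(axis=1) & (idx < size).all(axis=1)
            grid[idx[ok, 0], idx[ok, 1]] = 1.0
            return grid

        def objective(params, ref_grid, scan):
            dx, dy, th = params
            rot = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
            moved = scan @ rot.T + [dx, dy]
            idx = np.floor(moved / CELL).astype(int) + ref_grid.shape[0] // 2
            ok = (idx >= 0).all(axis=1) & (idx < ref_grid.shape[0]).all(axis=1)
            # More transformed points landing on occupied reference cells = better.
            return -ref_grid[idx[ok, 0], idx[ok, 1]].sum()

        rng = np.random.default_rng(1)
        ref = rng.uniform(-5, 5, size=(300, 2))                # reference scan points
        true = np.array([0.4, -0.3, 0.1])                      # ground-truth offset
        rot = np.array([[np.cos(true[2]), -np.sin(true[2])],
                        [np.sin(true[2]),  np.cos(true[2])]])
        scan = (ref - true[:2]) @ rot                          # displaced second scan

        result = differential_evolution(objective,
                                        bounds=[(-1, 1), (-1, 1), (-0.5, 0.5)],
                                        args=(rasterize(ref), scan), seed=0)
        print(np.round(result.x, 2))                           # roughly [0.4, -0.3, 0.1]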

    Cross-calibration of Time-of-flight and Colour Cameras

    Time-of-flight cameras provide depth information, which is complementary to the photometric appearance of the scene in ordinary images. It is desirable to merge the depth and colour information in order to obtain a coherent scene representation. However, the individual cameras will have different viewpoints, resolutions and fields of view, which means that they must be mutually calibrated. This paper presents a geometric framework for this multi-view and multi-modal calibration problem. It is shown that three-dimensional projective transformations can be used to align depth and parallax-based representations of the scene, with or without Euclidean reconstruction. A new evaluation procedure is also developed; this allows the reprojection error to be decomposed into calibration and sensor-dependent components. The complete approach is demonstrated on a network of three time-of-flight and six colour cameras. The applications of such a system to a range of automatic scene-interpretation problems are discussed.
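
    Once such a cross-calibration is available, fusing the modalities reduces to back-projecting a ToF depth pixel to 3D, applying the estimated rigid transform and projecting into the colour camera, as in the sketch below; the intrinsics and transform are placeholder values, not calibrated ones.

        # Map a ToF pixel with metric depth to colour-image coordinates.
        import numpy as np

        K_tof = np.array([[250.0, 0, 160], [0, 250.0, 120], [0, 0, 1]])  # ToF intrinsics (assumed)
        K_rgb = np.array([[900.0, 0, 640], [0, 900.0, 360], [0, 0, 1]])  # colour intrinsics (assumed)
        R = np.eye(3)                          # rotation ToF -> colour (assumed)
        t = np.array([0.05, 0.0, 0.0])         # 5 cm baseline (assumed)

        def tof_pixel_to_rgb(u, v, depth_m):
            p_tof = depth_m * np.linalg.inv(K_tof) @ np.array([u, v, 1.0])  # back-project
            p_rgb = R @ p_tof + t                                           # change frame
            uvw = K_rgb @ p_rgb                                             # project
            return uvw[:2] / uvw[2]

        print(np.round(tof_pixel_to_rgb(160, 120, 2.0), 1))    # lands near the colour centre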

    Object Duplicate Detection

    With the technological evolution of digital acquisition and storage technologies, millions of images and video sequences are captured every day and shared in online services. One way of exploring this huge volume of images and videos is by searching for a particular object depicted in them, making use of object duplicate detection. The need for research on object duplicate detection is therefore validated by several image and video retrieval applications, such as tag propagation, augmented reality, surveillance, mobile visual search, and television statistics measurement. Object duplicate detection means detecting an object that is visually the same as, or very similar to, a query object. The input is not restricted to a single image; it can be several images of an object or even a video. This dissertation describes the author's contributions to solving object duplicate detection problems in computer vision. A novel graph-based approach is introduced for 2D and 3D object duplicate detection in still images. A graph model is used to represent the 3D spatial information of the object based on the local features extracted from training images, so that explicit and complex 3D object modeling is avoided. Therefore, improved performance can be achieved in comparison to existing methods in terms of both robustness and computational complexity. Our method is shown to be robust in detecting the same objects even when the images containing them are taken from very different viewpoints or distances. Furthermore, we apply our object duplicate detection method to video, where training images are added iteratively to the video sequence in order to compensate for 3D view variations, illumination changes and partial occlusions. Finally, we show several mobile applications of object duplicate detection, such as an object-recognition-based museum guide, money recognition and flower recognition. General object duplicate detection may fail to detect chess figures; however, by considering context, such as the position on the chess board and the height of the figure, detection can be made more accurate. We show that user interaction, through a game called Epitome, further improves image retrieval compared to purely content-based methods.
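
    For a flavour of the local features involved, the hedged sketch below matches ORB descriptors between a training image and a query image with a ratio test; it is a simplified stand-in and does not include the graph-based spatial model described above. The file names and match threshold are hypothetical.

        # Count distinctive local-feature matches between a training and a query image.
        import cv2

        def count_duplicate_matches(train_path, query_path, ratio=0.75):
            orb = cv2.ORB_create(nfeatures=1000)
            img1 = cv2.imread(train_path, cv2.IMREAD_GRAYSCALE)
            img2 = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
            _, des1 = orb.detectAndCompute(img1, None)
            _, des2 = orb.detectAndCompute(img2, None)
            matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des1, des2, k=2)
            # Lowe's ratio test keeps only distinctive correspondences.
            good = [m for m in matches
                    if len(m) == 2 and m[0].distance < ratio * m[1].distance]
            return len(good)

        # A threshold on the match count can then flag a likely duplicate, e.g.:
        # count_duplicate_matches("object_train.jpg", "query.jpg") > 25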