
    Comparison of Semantic Segmentation Approaches for Horizon/Sky Line Detection

    Horizon or skyline detection plays a vital role in mountainous visual geo-localization; however, most recently proposed visual geo-localization approaches rely on user-in-the-loop skyline detection methods. Detecting such a segmenting boundary fully autonomously would be a clear step forward for these localization approaches. This paper provides a quantitative comparison of four such methods for autonomous horizon/skyline detection on an extensive data set. Specifically, we compare four recently proposed segmentation methods: one explicitly targeting the problem of horizon detection [Ahmad15], a second focused on visual geo-localization but relying on accurate skyline detection [Saurer16], and two proposed for general semantic segmentation, Fully Convolutional Networks (FCN) [Long15] and SegNet [Badrinarayanan15]. Each of the first two methods is trained on a common training set [Baatz12] comprising about 200 images, while the models for the third and fourth methods are fine-tuned for the sky segmentation problem through transfer learning on the same data set. Each method is tested on an extensive test set (about 3K images) covering challenging geographical, weather, illumination and seasonal conditions. We report average accuracy and average absolute pixel error for each of the presented formulations.
    Comment: Proceedings of the International Joint Conference on Neural Networks (IJCNN) (oral presentation), IEEE Computational Intelligence Society, 201
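
    As a rough illustration of the reported metrics, the sketch below (Python/NumPy) computes average pixel-wise accuracy from binary sky masks and average absolute pixel error from skylines represented as one row index per image column; the function names are illustrative and not taken from the paper.

        import numpy as np

        def average_absolute_pixel_error(pred_rows, gt_rows):
            # Mean absolute vertical offset (in pixels) between a predicted
            # skyline and the ground-truth skyline, one row index per column.
            pred = np.asarray(pred_rows, dtype=np.float64)
            gt = np.asarray(gt_rows, dtype=np.float64)
            return float(np.mean(np.abs(pred - gt)))

        def pixelwise_accuracy(pred_mask, gt_mask):
            # Fraction of pixels whose sky/non-sky label agrees with the ground truth.
            pred_mask = np.asarray(pred_mask, dtype=bool)
            gt_mask = np.asarray(gt_mask, dtype=bool)
            return float(np.mean(pred_mask == gt_mask))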

    Machine Learning based Mountainous Skyline Detection and Visual Geo-Localization

    With the ubiquitous availability of geo-tagged imagery and increased computational power, geo-localization has captured a lot of attention from researchers in the computer vision and image retrieval communities. Significant progress has been made in urban environments with stable man-made structures and geo-referenced street imagery of frequently visited tourist attractions. However, geo-localization of natural/mountain scenes is more challenging due to changes in vegetation, lighting and season, and the lack of geo-tagged imagery. Conventional approaches for mountain/natural geo-localization mostly rely on mountain peak and valley information, visible skylines, ridges, etc. The skyline (the boundary segmenting sky and non-sky regions) has been established as a robust natural feature for mountainous images, which can be matched with synthetic skylines generated from publicly available terrain maps such as Digital Elevation Models (DEMs). The skyline or visible horizon finds further applications in various other contexts, e.g. smooth navigation of Unmanned Aerial Vehicles (UAVs)/Micro Aerial Vehicles (MAVs), port security, ship detection and outdoor robot/vehicle localization.
    Prominent methods for skyline/horizon detection are based on unrealistic assumptions and rely on mere edge detection and/or linear line fitting using the Hough transform. We investigate the use of supervised machine learning for skyline detection. Specifically, we propose two novel machine learning based methods, one relying on edge detection and classification and the other based solely on classification. Given a query image, an edge or classification map is first built and converted into a multi-stage graph problem. Dynamic programming is then used to find a shortest path which conforms to the skyline in the given image. For the first method, we provide a detailed quantitative analysis of various texture features (Scale Invariant Feature Transform (SIFT), Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG) and their combinations) used to train a Support Vector Machine (SVM) classifier, and of different choices (binary edges, classified edge score, gradient score and their combinations) for the nodal costs of the Dynamic Programming (DP). For the second method, we investigate the use of dense classification maps for horizon line detection. We use Support Vector Machines (SVMs) and Convolutional Neural Networks (CNNs) as our classifier choices and normalized intensity patches as features. Both proposed formulations are compared with a prominent edge based method on two different data sets.
    We propose a fusion strategy which boosts the performance of the edge-less approach using edge information. The fusion approach, which has been tested on an additional challenging data set, outperforms each of the two methods alone. Further, we demonstrate the capability of our formulations to detect the absence of a horizon boundary and to detect partial horizon lines. This could be of great value in applications where a confidence measure of the detection is necessary, e.g. localization of planetary rovers/robots. In an extended work, we compare our edge-less skyline detection approach against deep learning networks recently proposed for semantic segmentation on an additional data set. Specifically, we compare our proposed fusion formulation with the Fully Convolutional Network (FCN), SegNet and another classical supervised learning based method.
    We further propose a visual geo-localization pipeline based on evolutionary computing, where Particle Swarm Optimization (PSO) is adopted to find/refine an orientation estimate by minimizing a cost function based on the horizon-ness probability of pixels. The dense classification score image resulting from our edge-less/fusion approach is used as a fitness measure to guide the particles toward the best solution, where the horizon rendered from the DEM aligns with the actual horizon in the image without even requiring its explicit detection. The effectiveness of the proposed geo-localization pipeline is evaluated on a reasonably sized data set.
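
    As a minimal sketch of the dynamic-programming step described above, the code below (Python/NumPy) extracts one skyline row per column from a per-pixel cost map (e.g. one minus the classifier's horizon-ness score), assuming the skyline crosses every column exactly once; the smoothness penalty and all names are illustrative rather than the thesis implementation.

        import numpy as np

        def skyline_by_dynamic_programming(cost, max_jump=3, smooth_weight=1.0):
            # Multi-stage graph shortest path solved column by column: minimize
            # per-pixel cost plus a penalty on large row jumps between columns.
            rows, cols = cost.shape
            total = cost[:, 0].astype(np.float64).copy()   # best cost to reach each row in column 0
            back = np.zeros((rows, cols), dtype=np.int32)  # back-pointers for path recovery

            for c in range(1, cols):
                new_total = np.full(rows, np.inf)
                for r in range(rows):
                    lo, hi = max(0, r - max_jump), min(rows, r + max_jump + 1)
                    cand = total[lo:hi] + smooth_weight * np.abs(np.arange(lo, hi) - r)
                    best = int(np.argmin(cand))
                    new_total[r] = cand[best] + cost[r, c]
                    back[r, c] = lo + best
                total = new_total

            # Recover the minimum-cost path from right to left.
            path = np.zeros(cols, dtype=np.int32)
            path[-1] = int(np.argmin(total))
            for c in range(cols - 1, 0, -1):
                path[c - 1] = back[path[c], c]
            return path  # skyline row index for every column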

    Matchability prediction for full-search template matching algorithms

    While recent approaches have shown that it is possible to do template matching by exhaustively scanning the parameter space, the resulting algorithms are still quite demanding. In this paper we alleviate the computational load of these algorithms by proposing an efficient approach for predicting the matchability of a template before the matching is actually performed. This avoids large amounts of unnecessary computation. We learn the matchability of templates by using dense convolutional neural network descriptors that do not require ad-hoc criteria to characterize a template. By using deep-learned descriptions of patches we are able to predict matchability over the whole image quite reliably. We also show that no scene-specific training data is required to solve problems such as panorama stitching, which usually require data from the scene in question. Due to the highly parallelizable nature of this task, we obtain an efficient technique with a negligible computational cost at test time.
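
    A rough sketch of the idea, assuming a fixed-length descriptor per candidate template (e.g. pooled from dense CNN features) and 0/1 labels recording whether earlier full-search matching succeeded; the classifier and function names are generic stand-ins, not the model used in the paper.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def train_matchability_predictor(template_descriptors, matched_ok):
            # Fit a classifier that predicts, from a template's descriptor,
            # whether full-search template matching is likely to succeed on it.
            clf = LogisticRegression(max_iter=1000)
            clf.fit(template_descriptors, matched_ok)
            return clf

        def filter_templates(clf, template_descriptors, threshold=0.5):
            # Keep only templates predicted to be matchable, so the expensive
            # full-search matcher runs on a smaller set.
            prob = clf.predict_proba(template_descriptors)[:, 1]
            return np.nonzero(prob >= threshold)[0]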

    The People Inside

    Our collection begins with an example of computer vision that cuts through time and bureaucratic opacity to help us meet real people from the past. Buried in thousands of files in the National Archives of Australia is evidence of the exclusionary “White Australia” policies of the nineteenth and twentieth centuries, which were intended to limit and discourage immigration by non-Europeans. Tim Sherratt and Kate Bagnall decided to see what would happen if they took a form of face-detection software made ubiquitous by modern surveillance systems and applied it to a security system of a century ago. What we get is a new way to see the government documents, not as a source of statistics but, Sherratt and Bagnall argue, as powerful evidence of the people affected by racism.
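
    The piece does not name the software used; purely as an illustrative stand-in, applying an off-the-shelf face detector (here OpenCV's Haar cascade) to a scanned archival page could look like the following.

        import cv2

        def detect_faces(image_path):
            # Run a stock Haar-cascade face detector on a scanned page and
            # return the bounding boxes (x, y, w, h) of detected faces.
            cascade = cv2.CascadeClassifier(
                cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
            gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)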

    A study of the role of an Unmanned Aerial Vehicle (UAV) in creating an Enhanced Virtual Field Guide (EVFG) in Geoscience Fieldwork

    This thesis investigated the role of an Unmanned Aerial Vehicle (UAV) in the creation of an Enhanced Virtual Field Guide (EVFG) for Geoscience fieldwork. The research used a pragmatic mixed-methods approach to investigate the research question “How can an Unmanned Aerial Vehicle’s data be used to create an Enhanced Virtual Field Guide for Geoscience fieldwork?” The thesis examines the question in four distinct sections: fieldwork, mobile technologies in fieldwork, UAVs in fieldwork and, finally, the creation and evaluation of the Enhanced Virtual Field Guide created with UAV technology. To achieve this, online questionnaires, interviews, focus groups and fieldwork observations with a selection of Geoscience staff and students at two UK universities were utilised. UAVs are a rapidly emerging commercial technology; however, their uptake, and critical discussion of their potential in fieldwork with students, has been limited. Guided by those interviewed in this research, this study created an innovative Enhanced Virtual Field Guide for students to use in their final-year fieldwork module and assignment. Findings with regard to fieldwork and mobile technologies confirm that both remain an integral part of a geoscience student’s course and that the majority of students still greatly enjoy the positive aspects of fieldwork. However, this research has uncovered many unexplored darker sides of fieldwork and of mobile technology use in fieldwork, such as disabilities, distractions, and lack of access for some students. In terms of the educational value of UAVs, this research showcases many potential benefits for the fieldwork experience. Yet the thesis also highlights the distinct and unique challenges attributed to UAV technologies that have hindered, and will continue to hinder, their uptake on fieldwork. The EVFG developed from UAV data has been shown to be a useful tool for educators and students on fieldwork, improving efficiency in the field, supporting deeper discussion and peer learning in the field, and serving as an effective learning tool for both educators and students, particularly post-fieldwork.

    Multi-Sensory Interaction for Blind and Visually Impaired People

    This book conveyed the visual elements of artwork to the visually impaired through various sensory elements, opening a new perspective for appreciating visual artwork. In addition, the technique of expressing a color code by integrating patterns, temperatures, scents, music and vibrations was explored, and future research topics were presented. A holistic experience using multi-sensory interaction was provided to people with visual impairment to convey the meaning and contents of the work through rich multi-sensory appreciation. A method that allows people with visual impairments to engage with artwork using a variety of senses, including touch, temperature, tactile pattern and sound, helps them to appreciate artwork at a deeper level than can be achieved with hearing or touch alone. The development of such art appreciation aids for the visually impaired will ultimately improve their cultural enjoyment and strengthen their access to culture and the arts. The development of these new-concept aids ultimately expands opportunities for the non-visually impaired as well as the visually impaired to enjoy works of art, and breaks down the boundaries between the disabled and the non-disabled in the field of culture and the arts through continuous efforts to enhance accessibility. In addition, the developed multi-sensory expression and delivery tool can be used as an educational tool to increase product and artwork accessibility and usability through multi-modal interaction. Training in the multi-sensory experiences introduced in this book may lead to more vivid visual imagery, or seeing with the mind’s eye.
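
    As a purely hypothetical illustration of the kind of color code the book describes, the snippet below maps a color name to a set of multi-sensory cues; every concrete pattern, temperature, scent, note and vibration value is invented for illustration and is not taken from the book.

        # Hypothetical color-code lookup table: one color, several sensory channels.
        COLOR_CODE = {
            "red":    {"pattern": "dense dots",  "temperature_c": 40, "scent": "cinnamon", "note": "C4", "vibration_hz": 250},
            "blue":   {"pattern": "long dashes", "temperature_c": 15, "scent": "mint",     "note": "G3", "vibration_hz": 120},
            "yellow": {"pattern": "sparse dots", "temperature_c": 30, "scent": "citrus",   "note": "E4", "vibration_hz": 180},
        }

        def describe_color(name):
            # Return the multi-sensory cues associated with a color name, if any.
            return COLOR_CODE.get(name.lower())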

    Object Duplicate Detection

    With the technological evolution of digital acquisition and storage technologies, millions of images and video sequences are captured every day and shared in online services. One way of exploring this huge volume of images and videos is by searching for a particular object depicted in them, making use of object duplicate detection. The need for research on object duplicate detection is therefore validated by several image and video retrieval applications, such as tag propagation, augmented reality, surveillance, mobile visual search, and television statistics measurement. Object duplicate detection means detecting objects that are visually identical or very similar to a query object. The input is not restricted to a single image; it can be several images of an object or even a video. This dissertation describes the author's contributions to object duplicate detection in computer vision. A novel graph-based approach is introduced for 2D and 3D object duplicate detection in still images. A graph model is used to represent the 3D spatial information of the object based on the local features extracted from training images, so that explicit and complex 3D object modeling is avoided. Improved performance can therefore be achieved in comparison to existing methods in terms of both robustness and computational complexity. Our method is shown to be robust in detecting the same objects even when the images containing them are taken from very different viewpoints or distances. Furthermore, we apply our object duplicate detection method to video, where training images are added iteratively to the video sequence in order to compensate for 3D view variations, illumination changes and partial occlusions. Finally, we show several mobile applications of object duplicate detection, such as an object recognition based museum guide, money recognition and flower recognition. General object duplicate detection may fail to detect chess figures; however, by considering context, such as the chessboard position and the height of the figure, detection can be made more accurate. We show that user interaction further improves image retrieval compared to pure content-based methods through a game called Epitome.
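
    The dissertation's graph model is not reproduced here; as a minimal baseline illustrating local-feature-based duplicate detection, the sketch below counts ratio-test-filtered ORB matches between a query image and a training image using OpenCV; the function name and thresholds are illustrative.

        import cv2

        def duplicate_score(query_path, train_path, ratio=0.75):
            # Crude duplicate-detection score: number of ORB matches that pass
            # Lowe's ratio test between the two images.
            orb = cv2.ORB_create(nfeatures=1000)
            img_q = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
            img_t = cv2.imread(train_path, cv2.IMREAD_GRAYSCALE)
            kq, dq = orb.detectAndCompute(img_q, None)
            kt, dt = orb.detectAndCompute(img_t, None)
            if dq is None or dt is None:
                return 0
            matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(dq, dt, k=2)
            good = [p[0] for p in matches if len(p) == 2 and p[0].distance < ratio * p[1].distance]
            return len(good)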

    Ubiquitous interactive displays: magical experiences beyond the screen

    Ubiquitous Interactive Displays are interfaces that extend interaction beyond traditional flat screens. This thesis presents a series of proof-of-concept systems exploring three interactive displays: the first part of the thesis explores interactive projective displays, where projected light transforms and enhances physical objects in our environment. The second part explores gestural displays, where traditional mobile devices such as smartphones are equipped with depth sensors to enable input and output around the device. Finally, I introduce a new tactile display that imbues our physical spaces with a sense of touch in mid-air without requiring the user to wear a physical device. These systems explore a future where interfaces are inherently everywhere, connecting our physical objects and spaces through visual, gestural and tactile displays. I aim to demonstrate new technical innovations as well as compelling interactions involving one or more users and their physical environment. These new interactive displays enable novel experiences beyond flat screens that blur the line between the physical and virtual worlds.