
    Adversarial Training for Adverse Conditions: Robust Metric Localisation using Appearance Transfer

    Full text link
    We present a method of improving visual place recognition and metric localisation under very strong appearance change. We learn an invertible generator that can transform the conditions of images, e.g. from day to night or summer to winter. This image transforming filter is explicitly designed to aid and abet feature matching using a new loss based on SURF detector and dense descriptor maps. A network is trained to output synthetic images optimised for feature matching given only an input RGB image, and these generated images are used to localise the robot against a previously built map using traditional sparse matching approaches. We benchmark our results using multiple traversals of the Oxford RobotCar Dataset over a year-long period, using one traversal as a map and the other to localise. We show that this method significantly improves place recognition and localisation under changing and adverse conditions, while reducing the number of mapping runs needed to successfully achieve reliable localisation. Comment: Accepted at ICRA201
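    As an illustration of the kind of objective described above, here is a minimal sketch (PyTorch; tensor names are illustrative, and the paper's actual loss over SURF detector and dense descriptor maps is richer than this) of a feature-matching loss that trains a generator so that the synthetic image's detector responses and dense descriptors line up with those of the target condition:

    import torch
    import torch.nn.functional as F

    def feature_matching_loss(desc_fake, desc_real, det_fake, det_real):
        # desc_*: (B, D, H, W) dense descriptor maps of the synthetic / target images
        # det_*:  (B, 1, H, W) detector response maps (e.g. SURF-like responses)
        desc_loss = F.l1_loss(desc_fake, desc_real)  # local descriptors should agree
        det_loss = F.l1_loss(det_fake, det_real)     # detectors should fire in the same places
        return desc_loss + det_loss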

    Automated taxiing for unmanned aircraft systems

    Get PDF
    Over the last few years, the concept of civil Unmanned Aircraft Systems (UAS) has been realised, with small UAS commonly used in industries such as law enforcement, agriculture and mapping. With increased development in other areas, such as logistics and advertisement, the size and range of civil UAS is likely to grow. Taken to the logical conclusion, it is likely that large-scale UAS will be operating in civil airspace within the next decade. Although the airborne operations of civil UAS have already gathered much research attention, work is also required to determine how UAS will function when on the ground. Motivated by the assumption that large UAS will share ground facilities with manned aircraft, this thesis describes the preliminary development of an Automated Taxiing System (ATS) for UAS operating at civil aerodromes. To allow the ATS to function on the majority of UAS without the need for additional hardware, a visual sensing approach has been chosen, with the majority of work focusing on monocular image processing techniques. The purpose of the computer vision system is to provide direct sensor data which can be used to validate the vehicle's position, in addition to detecting potential collision risks. As aerospace regulations require the most robust and reliable algorithms for control, any methods which are not fully definable or explainable are not suitable for real-world use. Therefore, non-deterministic methods and algorithms with hidden components (such as Artificial Neural Networks (ANNs)) have not been used. Instead, visual sensing is achieved through semantic segmentation, with separate segmentation and classification stages. Segmentation is performed using superpixels and reachability clustering to divide the image into single-content clusters. Each cluster is then classified using multiple types of image data, probabilistically fused within a Bayesian network. The dataset for testing has been provided by BAE Systems, allowing the system to be trained and tested on real-world aerodrome data. The system has demonstrated good performance on this limited dataset, accurately detecting both collision risks and terrain features for use in navigation.
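    As a hedged illustration of the segment-then-classify pipeline (not the thesis implementation: SLIC stands in for the superpixel and reachability-clustering stage, and likelihoods_per_cue is an assumed input), the sketch below segments an image into superpixel clusters and fuses per-cue class likelihoods under a naive conditional-independence assumption, in the spirit of the Bayesian fusion described:

    import numpy as np
    from skimage.segmentation import slic

    def classify_clusters(image, likelihoods_per_cue, priors):
        # image: (H, W, 3) RGB aerodrome frame
        # likelihoods_per_cue: list of (n_segments, n_classes) arrays, one per image cue,
        #   aligned with the superpixel labels produced below
        # priors: (n_classes,) prior class probabilities
        segments = slic(image, n_segments=200, compactness=10, start_label=0)
        n_segments = segments.max() + 1
        log_post = np.tile(np.log(priors), (n_segments, 1))  # start from the priors
        for lik in likelihoods_per_cue:                      # naive-Bayes fusion of cues
            log_post += np.log(lik + 1e-9)
        return segments, log_post.argmax(axis=1)             # MAP class per superpixel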

    Learning to See the Wood for the Trees: Deep Laser Localization in Urban and Natural Environments on a CPU

    Full text link
    Localization in challenging, natural environments such as forests or woodlands is an important capability for many applications, from guiding a robot navigating along a forest trail to monitoring vegetation growth with handheld sensors. In this work we explore laser-based localization in both urban and natural environments, which is suitable for online applications. We propose a deep learning approach capable of learning meaningful descriptors directly from 3D point clouds by comparing triplets (anchor, positive and negative examples). The approach learns a feature space representation for a set of segmented point clouds that are matched between current and previous observations. Our learning method is tailored towards loop closure detection, resulting in a small model which can be deployed using only a CPU. The proposed learning method would allow the full pipeline to run on robots with limited computational payload such as drones, quadrupeds or UGVs. Comment: Accepted for publication at RA-L/ICRA 2019. More info: https://ori.ox.ac.uk/esm-localizatio
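    The triplet comparison described above can be illustrated with a margin loss of the standard form (a minimal sketch; the paper's descriptor network and triplet mining strategy are not reproduced here):

    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.5):
        # anchor / positive / negative: (B, D) descriptors of segmented point clouds
        d_pos = F.pairwise_distance(anchor, positive)  # same-place pairs pulled together
        d_neg = F.pairwise_distance(anchor, negative)  # different-place pairs pushed apart
        return F.relu(d_pos - d_neg + margin).mean()

    PyTorch also ships this objective as torch.nn.TripletMarginLoss, which could be used in place of the hand-rolled version above.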

    Networks in Berlin’s Music Industry – A Spatial Analysis

    Get PDF
    In addition to a distinct regional concentration of the branch in a few large metropolitan areas in Germany, Berlin shows inner-city (inner-regional) concentrations of the music industry and of the players linked with its value chain, as well as of branch-relevant institutions. By means of a written survey of companies in the media and IT industries in Berlin and Brandenburg, plus expert interviews, an analysis of the Berlin music branch has been carried out regarding its spatial and organisational concentration and how this concentration is perceived by companies. A differentiation of the media branch in the analysis allows the results to be compared within the branch and with the Brandenburg region. The analysis found that creative milieus are of particular importance, as they act as the driving force in developing the field of music. This paper therefore examines spillovers into the industry, as a first step towards understanding spatial concentration in terms of networks of music companies, institutions, the specific and innovative milieu, and the geographical dimension of knowledge. Furthermore, evidence has been found through the use of economic and socio-cultural indicators. Urbanisation economies become especially apparent for the region in the examination of Berlin's music industry, with its intersectoral integration and cross-sectoral stimulus to the settlement and formation of companies.

    Localisation in 3D Images Using Cross-features Correlation Learning

    Get PDF
    Object detection and segmentation have evolved drastically over the past two decades thanks to the continuous advancement in the field of deep learning. Substantial research efforts have been dedicated towards integrating object detection techniques into a wide range of real-world problems. Most existing methods take advantage of the successful application and representational ability of convolutional neural networks (CNNs). Generally, these methods target mainstream applications that are typically based on 2D imaging scenarios. Additionally, driven by the strong correlation between the quality of the feature embedding and the performance in CNNs, most works focus on design characteristics of CNNs, e.g., depth and width, to enhance their modelling capacity and discriminative ability. Limited research has been directed towards exploiting feature-level dependencies, which can feasibly be used to enhance the performance of CNNs. Moreover, directly adopting such approaches in more complex imaging domains that target data of higher dimensions (e.g., 3D multi-modal and volumetric images) is not straightforward due to the different nature and complexity of the problem. In this thesis, we explore the possibility of incorporating feature-level correspondence and correlations into object detection and segmentation contexts that target the localisation of 3D objects from 3D multi-modal and volumetric image data. Accordingly, we first explore the detection problem of 3D solar active regions in multi-spectral solar imagery, where different imaging bands correspond to different 2D layers (altitudes) in the 3D solar atmosphere.

    We propose a joint analysis approach in which information from different imaging bands is first individually analysed using band-specific network branches to extract inter-band features that are then dynamically cross-integrated and jointly analysed to investigate spatial correspondence and co-dependencies between the different bands. The aggregated embeddings are further analysed using band-specific detection network branches to predict separate sets of results (one for each band). Throughout our study, we evaluate different types of feature fusion, using convolutional embeddings of different semantic levels, as well as the impact of using different numbers of image band inputs to perform the joint analysis. We test the proposed approach over different multi-modal datasets (multi-modal solar images and brain MRI) and applications. The proposed joint analysis based framework consistently improves the CNN's performance when detecting target regions in contrast to single band based baseline methods.

    We then generalise our cross-band joint analysis detection scheme into the 3D segmentation problem using multi-modal images. We adopt the joint analysis principles into a segmentation framework where cross-band information is dynamically analysed and cross-integrated at various semantic levels. The proposed segmentation network also takes advantage of band-specific skip connections to maximise the inter-band information and assist the network in capturing fine details using embeddings of different spatial scales. Furthermore, a recursive training strategy, based on weak labels (e.g., bounding boxes), is proposed to overcome the difficulty of producing dense labels to train the segmentation network. We evaluate the proposed segmentation approach using different feature fusion approaches, over different datasets (multi-modal solar images, brain MRI, and cloud satellite imagery), and using different levels of supervision. Promising results were achieved and demonstrate an improved performance in contrast to single band based analysis and state-of-the-art segmentation methods.

    Additionally, we investigate the possibility of explicitly modelling objective-driven feature-level correlations, in a localised manner, within 3D medical imaging scenarios (3D CT pulmonary imaging) to enhance the effectiveness of the feature extraction process in CNNs and subsequently the detection performance. In particular, we present a framework to perform the 3D detection of pulmonary nodules as an ensemble of two stages, candidate proposal and false positive reduction. We propose a 3D channel attention block in which cross-channel information is incorporated to infer channel-wise feature importance with respect to the target objective. Unlike common attention approaches that rely on heavy dimensionality reduction and computationally expensive multi-layer perceptron networks, the proposed approach utilises fully convolutional networks, allowing rich 3D descriptors to be exploited directly and the attention to be performed in an efficient manner. We also propose a fully convolutional 3D spatial attention approach that elevates cross-sectional information to infer spatial attention. We demonstrate the effectiveness of the proposed attention approaches against a number of popular channel and spatial attention mechanisms. Furthermore, for the false positive reduction stage, in addition to attention, we adopt a joint analysis based approach that takes into account the variable nodule morphology by aggregating spatial information from different contextual levels. We also propose a zoom-in convolutional path that incorporates semantic information of different spatial scales to assist the network in capturing fine details. The proposed detection approach demonstrates considerable gains in performance in contrast to state-of-the-art lung nodule detection methods.

    We further explore the possibility of incorporating long-range dependencies between arbitrary positions in the input features using Transformer networks to infer self-attention, in the context of 3D pulmonary nodule detection, in contrast to localised (convolution-based) attention. We present a hybrid 3D detection approach that takes advantage of both the Transformer's ability to model global context and correlations and the spatial representational characteristics of convolutional neural networks, providing complementary information and subsequently improving the discriminative ability of the detection model. We propose two hybrid Transformer-CNN variants, investigating the impact of a deeper Transformer design (with more Transformer layers and trainable parameters) used along with high-level convolutional feature inputs of a single spatial resolution, in contrast to a shallower Transformer design (with fewer Transformer layers and trainable parameters) that exploits convolutional embeddings of different semantic levels and relatively higher resolution. Extensive quantitative and qualitative analyses are presented for the proposed methods in this thesis and demonstrate the feasibility of exploiting feature-level relations, either implicitly or explicitly, in different detection and segmentation problems.
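    One possible reading of the fully convolutional 3D channel attention described above is sketched below (an assumption-laden illustration styled after ECA-type attention, not the thesis's exact block): channel importance is inferred with a 1D convolution across the channel axis, avoiding the dimensionality-reducing MLP of common attention designs.

    import torch
    import torch.nn as nn

    class ChannelAttention3D(nn.Module):
        def __init__(self, kernel_size=3):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool3d(1)  # (B, C, 1, 1, 1) global descriptor
            # a 1D convolution over the channel axis replaces the usual MLP,
            # so the channel dimension is never reduced
            self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2)

        def forward(self, x):                        # x: (B, C, D, H, W)
            w = self.pool(x).flatten(2)              # (B, C, 1)
            w = self.conv(w.transpose(1, 2))         # (B, 1, C)
            w = torch.sigmoid(w.transpose(1, 2))     # (B, C, 1) channel weights
            return x * w.unsqueeze(-1).unsqueeze(-1) # rescale each channel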

    Scene Signatures: Localised and Point-less Features for Localisation

    Get PDF
    This paper is about localising across extreme lighting and weather conditions. We depart from the traditional point-feature-based approach, as matching under dramatic appearance changes is brittle and hard. Point feature detectors are fixed and rigid procedures which pass over an image examining small, low-level structure such as corners or blobs. They apply the same criteria to all images of all places. This paper takes a contrary view and asks what is possible if instead we learn a bespoke detector for every place. Our localisation task then turns into curating a large bank of spatially indexed detectors, and we show that this yields vastly superior performance in terms of robustness, in exchange for a reduced but tolerable metric precision. We present an unsupervised system that produces broad-region detectors for distinctive visual elements, called scene signatures, which can be associated across almost all appearance changes. We show, using 21 km of data collected over a period of 3 months, that our system is capable of producing metric localisation estimates from night-to-day or summer-to-winter conditions.
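    A loose sketch of the curation idea follows (illustrative data structures only; the paper's detectors and spatial index are far richer): localisation against the bank amounts to retrieving the bespoke detectors indexed near a pose prior and firing only those on the live image.

    from dataclasses import dataclass, field

    @dataclass
    class DetectorBank:
        # (position_along_route_m, detector) pairs; a detector is any callable on an image
        entries: list = field(default_factory=list)

        def add(self, position, detector):
            self.entries.append((position, detector))

        def query(self, prior_position, radius=25.0):
            # retrieve the place-specific detectors curated near the pose prior
            return [det for pos, det in self.entries
                    if abs(pos - prior_position) <= radius]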

    COMBINED ARTIFICIAL INTELLIGENCE BEHAVIOUR SYSTEMS IN SERIOUS GAMING

    Get PDF
    This thesis proposes a novel methodology for creating Artificial Agents with semi-realistic behaviour, with such behaviour defined as overcoming common limitations of mainstream behaviour systems: rapidly switching between actions, ignoring “obvious” event priorities, etc. Behaviour in these Agents is not fully realistic as some limitations remain: Agents have “perfect” knowledge of the surrounding environment, and are unable to transfer knowledge to other Agents (no communication). The novel methodology is achieved by hybridising existing Artificial Intelligence (AI) behaviour systems. In most Artificial Agents (Agents), behaviour is created using a single behaviour system, whereas this work combines several systems in a novel way to overcome the limitations of each. A further proposal is the separation of behavioural concerns into the behaviour systems best suited to their needs, along with a biologically inspired memory system that further aids in the production of semi-realistic behaviour. Current behaviour systems are often inherently limited, and this work shows that by combining systems that complement each other, these limitations can be overcome without the need for workarounds. This work examines in detail Belief-Desire-Intention (BDI) systems as well as Finite State Machines (FSMs), and explores how these methodologies can complement each other when combined appropriately. By combining these systems, a hybrid system is proposed that is both fast to react and simple to maintain, separating behaviours into fast-reaction (instinctual) and slow-reaction (behavioural) behaviours and assigning each to the most appropriate system. Computational intelligence learning techniques such as Artificial Neural Networks have been intentionally avoided, as these techniques commonly present their data in a “black box” system, whereas this work aims to make knowledge explicitly available to the user. A biologically inspired memory system is further proposed in order to generate additional behaviours in Artificial Agents, such as behaviour related to forgetfulness. This work explores how humans can quickly recall information while still being able to store millions of pieces of information, and how this can be achieved in an artificial system.
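    A toy sketch of the hybrid split described above (names and rules are invented for illustration): instinctual reactions are handled by a finite-state check on every tick, and the slower BDI-style layer only deliberates when no instinct fires.

    class HybridAgent:
        def __init__(self):
            self.state = "idle"      # FSM state driving instinctual behaviour
            self.beliefs = {}        # BDI layer: beliefs about the world
            self.intentions = []     # currently committed plans

        def tick(self, percepts):
            self.beliefs.update(percepts)
            # 1) instinctual layer: immediate, high-priority reactions
            if self.beliefs.get("threat_nearby"):
                self.state = "flee"
                return "run_away"
            # 2) behavioural layer: deliberate only when no instinct fired
            if not self.intentions and self.beliefs.get("hungry"):
                self.intentions.append("find_food")
            return self.intentions[0] if self.intentions else "wander"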

    Advances in flexible manipulation through the application of AI-based techniques

    Get PDF
    Picking up and placing objects are two fundamental operations in almost any robotic application. Today, the industrial robots used in pick-and-place applications are characterised by their efficiency at performing simple, repetitive tasks. However, these systems are very rigid: they operate in fully controlled environments, and reprogramming them for other tasks is very costly. There are currently tasks in a variety of industrial settings (for example, order preparation in a logistics environment) that demand flexible object manipulation and that, by their nature, have not yet been automated. The main bottlenecks hindering automation are the diversity of the objects to be manipulated, the lack of dexterity of robots, and the uncertainty of uncontrolled dynamic environments. Artificial Intelligence (AI) plays an increasingly important role in robotics, providing robots with the intelligence required to carry out complex tasks. Moreover, AI makes it possible to learn complex behaviours from real experience, significantly reducing the cost of programming. Given the limitations of current robotic manipulation systems, the main goal of this work is to increase the flexibility of manipulation systems using AI-based algorithms, giving them the capabilities needed to adapt to dynamic environments without reprogramming.

    Human and Group Activity Recognition from Video Sequences

    Get PDF
    A good solution to human activity recognition enables the creation of a wide variety of useful applications, such as applications in visual surveillance, vision-based Human-Computer Interaction (HCI) and gesture recognition. In this thesis, a graph-based approach to human activity recognition is proposed which models spatio-temporal features as contextual space-time graphs. In this method, spatio-temporal gradient cuboids are extracted at significant regions of activity, and feature graphs (gradient, space-time, local neighbours, immediate neighbours) are constructed using the similarity matrix. The Laplacian representation of the graph is utilised to reduce the computational complexity and to allow the use of traditional statistical classifiers. A second methodology is proposed to detect and localise abnormal activities in crowded scenes. This approach has two stages: training and testing. During the training stage, specific human activities are identified and characterised by modelling medium-term movement flow through streaklines. Each streakline is formed by multiple optical flow vectors that represent and locally track the movement in the scene. A dictionary of activities is recorded for a given scene during the training stage. During the testing stage, the consistency of each observed activity with those from the dictionary is verified using the Kullback-Leibler (KL) divergence. The anomaly detection of the proposed methodology is compared to the state of the art, producing state-of-the-art results for localising anomalous activities. Finally, we propose an automatic group activity recognition approach by modelling the interdependencies of group activity features over time. We propose to model the group interdependencies in both motion and location spaces. These spaces are extended to time-space and time-movement spaces and modelled using Kernel Density Estimation (KDE). The recognition performance of the proposed methodology shows an improvement over state-of-the-art results on group activity datasets.
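    The dictionary test in the second methodology can be illustrated as follows (a sketch: observed_hist stands in for a streakline-based activity summary, and the threshold is an assumption). An observed activity is flagged as anomalous when its KL divergence to every dictionary entry exceeds the threshold.

    import numpy as np

    def kl_divergence(p, q, eps=1e-9):
        p, q = p + eps, q + eps            # smooth to avoid log(0)
        p, q = p / p.sum(), q / q.sum()    # renormalise to valid distributions
        return float(np.sum(p * np.log(p / q)))

    def is_anomalous(observed_hist, dictionary, threshold=0.5):
        # anomalous if no dictionary activity is consistent with the observation
        return min(kl_divergence(observed_hist, d) for d in dictionary) > threshold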