2,029 research outputs found

    A cooperative navigation system with distributed architecture for multiple unmanned aerial vehicles

    Unmanned aerial vehicles (UAVs) have been widely used in many applications due to, among other features, their versatility, reduced operating cost, and small size. These applications increasingly demand features related to autonomous navigation, such as mapping. However, limited resources such as battery and hardware (memory and processing units) can hinder the development of these applications on UAVs. The collaborative use of multiple UAVs for mapping, through a cooperative navigation system, is an alternative that addresses this problem. Such a system requires that individual local maps be transmitted and merged into a global map in a distributed manner. In this scenario, there are two main problems to be addressed: the transmission of maps among the UAVs and the merging of the local maps in each UAV. In this context, this work describes the design, development, and evaluation of a cooperative navigation system with distributed architecture to be used by multiple UAVs. This system uses proposed structures to store the 3D occupancy grid maps. Furthermore, maps are compressed and transmitted between UAVs using algorithms specially proposed for these purposes. The local 3D maps are then merged in each UAV. In this map merging system, maps are preprocessed and then merged in pairs using algorithms adapted to make them compatible with the 3D occupancy grid map data. In addition, keypoint orientation properties are obtained from potential field gradients. Some proposed filters are used to improve the parameters of the transformations among maps. To validate the proposed solution, simulations were performed in six different environments, outdoors and indoors, with different layout characteristics. The obtained results demonstrate the effectiveness of the system in the construction, sharing, and merging of maps. The results also highlight the extreme complexity of map merging systems.
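
The abstract does not say how overlapping cells are combined once two local maps are aligned. A minimal sketch of one standard option, additive log-odds fusion of occupancy grids, is given below. It is not the authors' algorithm: the sparse dictionary layout, the function names, and the assumption that the alignment is already known and is a pure integer translation are all hypothetical choices made for illustration.

```python
import math

# Sparse 3D occupancy grid (assumed layout): maps an integer voxel index
# (x, y, z) to a log-odds value. Missing voxels are unknown (log-odds 0, p = 0.5).

def prob_to_logodds(p):
    """Convert an occupancy probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))

def logodds_to_prob(l):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 / (1.0 + math.exp(-l))

def merge_grids(grid_a, grid_b, offset=(0, 0, 0), clamp=10.0):
    """Fuse grid_b into a copy of grid_a.

    offset is the (assumed known) integer translation that maps voxel
    indices of grid_b into the frame of grid_a. Log-odds of overlapping
    voxels are added and then clamped to [-clamp, clamp].
    """
    merged = dict(grid_a)
    dx, dy, dz = offset
    for (x, y, z), l_b in grid_b.items():
        key = (x + dx, y + dy, z + dz)
        fused = merged.get(key, 0.0) + l_b
        merged[key] = max(-clamp, min(clamp, fused))
    return merged

# Two small local maps whose frames differ by one voxel along x.
local_a = {(0, 0, 0): prob_to_logodds(0.7), (1, 0, 0): prob_to_logodds(0.2)}
local_b = {(0, 0, 0): prob_to_logodds(0.7), (1, 0, 0): prob_to_logodds(0.9)}
global_map = merge_grids(local_a, local_b, offset=(1, 0, 0))
```

In this sketch, voxels seen by only one UAV keep their original estimate, while the overlapping voxel combines both observations. Estimating the transformation between maps, which the abstract identifies as the hard part, is deliberately left out.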

    Differential Recurrent Neural Networks for Human Activity Recognition

    Human activity recognition has been an active research area in recent years. The difficulty of this problem lies in the complex dynamical motion patterns embedded through the sequential frames. The Long Short-Term Memory (LSTM) recurrent neural network is capable of processing complex sequential information since it utilizes special gating schemes for learning representations from long input sequences. It has the potential to model various time-series data, where the current hidden state has to be considered in the context of the past hidden states. Unfortunately, the conventional LSTMs do not consider the impact of spatio-temporal dynamics corresponding to the given salient motion patterns, when they gate the information that ought to be memorized through time. To address this problem, we propose a differential gating scheme for the LSTM neural network, which emphasizes the change in information gain caused by the salient motions between the successive video frames. This change in information gain is quantified by Derivative of States (DoS), and thus the proposed LSTM model is termed differential Recurrent Neural Network (dRNN). Based on the energy profiling of DoS, we further propose to employ the State Energy Profile (SEP) to search for salient dRNN states and construct more informative representations. To better understand the scene and human appearance information, the dRNN model is extended by connecting Convolutional Neural Networks (CNN) and stacked dRNNs into an end-to-end model. Lastly, the dissertation continues to discuss and compare the combined and the individual orders of DoS used within the dRNN. We propose to control the LSTM gates via individual order of DoS and stack multiple levels of LSTM cells in increasing orders of state derivatives. To this end, we have introduced a new family of LSTMs, expanding the applications of LSTMs and advancing the performances of the state-of-the-art methods
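
To make the idea of a Derivative of States concrete, the sketch below approximates first- and higher-order derivatives of a state sequence with backward finite differences and ranks time steps by the squared norm of the first-order difference. Both the finite-difference approximation and the squared-norm "energy" are assumptions of this sketch; the abstract does not define DoS or the State Energy Profile at this level of detail, and no gating is implemented here.

```python
def derivative_of_states(states, order=1):
    """Return the order-th backward finite difference of a state sequence.

    states: list of equal-length lists of floats, one per time step.
    Each differencing pass shortens the sequence by one step.
    """
    diffs = [list(s) for s in states]
    for _ in range(order):
        diffs = [
            [cur - prev for cur, prev in zip(diffs[t], diffs[t - 1])]
            for t in range(1, len(diffs))
        ]
    return diffs

def energy(vec):
    """Squared Euclidean norm, used here as a stand-in saliency score."""
    return sum(v * v for v in vec)

# Toy state sequence with one abrupt change between steps 1 and 2.
states = [[0.0, 0.0], [0.1, 0.0], [0.9, 0.5], [1.0, 0.5]]
dos1 = derivative_of_states(states, order=1)
dos2 = derivative_of_states(states, order=2)

# Index (into dos1) of the transition with the largest change in state.
most_salient = max(range(len(dos1)), key=lambda t: energy(dos1[t]))
```

In the toy sequence the largest first-order difference falls on the transition between steps 1 and 2, which is the kind of salient change the differential gating scheme is described as emphasizing.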

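    Co-design of architecture and algorithms for mobile robot localization and model-based obstacle detection (original title: Kodizajn arhitekture i algoritama za lokalizaciju mobilnih robota i detekciju prepreka baziranih na modelu)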

    This thesis proposes SoPC (System on a Programmable Chip) architectures for efficient embedding of vision-based localization and obstacle detection tasks in a navigational pipeline on autonomous mobile robots. The obtained results are equivalent to or better than the state of the art. For localization, an efficient hardware architecture is developed that supports EKF-SLAM's local map management, storing and processing seven-dimensional landmarks in real time. For obstacle detection, a novel object recognition method is proposed: a detection-by-identification framework based on a single, fixed-size detection window. This framework provides adequate algorithmic precision and higher execution speed on embedded hardware platforms.

    Action recognition from RGB-D data

    In recent years, action recognition based on RGB-D data has attracted increasing attention. Unlike traditional 2D action recognition, RGB-D data contains extra depth and skeleton modalities, and each modality has its own characteristics. This thesis presents seven novel methods that take advantage of the three modalities for action recognition. First, effective handcrafted features are designed and a frequent pattern mining method is employed to mine the most discriminative, representative and non-redundant features for skeleton-based action recognition. Second, to take advantage of powerful Convolutional Neural Networks (ConvNets), it is proposed to represent the spatio-temporal information carried in 3D skeleton sequences in three 2D images by encoding the joint trajectories and their dynamics into the color distribution of the images, and ConvNets are adopted to learn the discriminative features for human action recognition. Third, for depth-based action recognition, three data augmentation strategies are proposed so that ConvNets can be applied to small training datasets. Fourth, to take full advantage of the 3D structural information offered by the depth modality and its insensitivity to illumination variations, three simple, compact yet effective image-based representations are proposed and ConvNets are adopted for feature extraction and classification. However, both of the previous two methods are sensitive to noise and cannot differentiate fine-grained actions well. Fifth, to deal with this issue, it is proposed to represent a depth map sequence as three pairs of structured dynamic images at the body, part and joint levels respectively through bidirectional rank pooling. The structured dynamic image preserves the spatial-temporal information, enhances the structure information across both body parts/joints and different temporal scales, and takes advantage of ConvNets for action recognition. Sixth, it is proposed to extract and use scene flow for action recognition from RGB and depth data. Last, to exploit the joint information in multi-modal features arising from heterogeneous sources (RGB, depth), it is proposed to cooperatively train a single ConvNet (referred to as c-ConvNet) on both RGB features and depth features, and deeply aggregate the two modalities to achieve robust action recognition

    LiDAR-Based Place Recognition For Autonomous Driving: A Survey

    LiDAR-based place recognition (LPR) plays a pivotal role in autonomous driving, which assists Simultaneous Localization and Mapping (SLAM) systems in reducing accumulated errors and achieving reliable localization. However, existing reviews predominantly concentrate on visual place recognition (VPR) methods. Despite the recent remarkable progress in LPR, to the best of our knowledge, there is no dedicated systematic review in this area. This paper bridges the gap by providing a comprehensive review of place recognition methods employing LiDAR sensors, thus facilitating and encouraging further research. We commence by delving into the problem formulation of place recognition, exploring existing challenges, and describing relations to previous surveys. Subsequently, we conduct an in-depth review of related research, which offers detailed classifications, strengths and weaknesses, and architectures. Finally, we summarize existing datasets, commonly used evaluation metrics, and comprehensive evaluation results from various methods on public datasets. This paper can serve as a valuable tutorial for newcomers entering the field of place recognition and for researchers interested in long-term robot localization. We pledge to maintain an up-to-date project on our website https://github.com/ShiPC-AI/LPR-Survey. Comment: 26 pages, 13 figures, 5 tables
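
The abstract does not list the evaluation metrics it covers. As an illustration only, the sketch below computes Recall@N, a metric widely used for retrieval-style place recognition: a query counts as correct if any of its N nearest database descriptors belongs to the same place. The descriptor format, the place labels, and the function name are hypothetical, and real benchmarks typically decide correctness with a distance threshold on ground-truth poses rather than discrete labels.

```python
import math

def recall_at_n(queries, database, n=1):
    """Fraction of queries whose true place appears in the top-n retrievals.

    queries:  list of (descriptor, true_place_id)
    database: list of (descriptor, place_id)
    Descriptors are equal-length sequences of floats compared by
    Euclidean distance.
    """
    hits = 0
    for q_desc, q_place in queries:
        ranked = sorted(database, key=lambda item: math.dist(q_desc, item[0]))
        top_places = [place for _, place in ranked[:n]]
        if q_place in top_places:
            hits += 1
    return hits / len(queries)

# Toy global descriptors for three mapped places and three query scans.
database = [([0.0, 0.0], "A"), ([1.0, 0.0], "B"), ([5.0, 5.0], "C")]
queries = [([0.1, 0.0], "A"), ([0.9, 0.1], "B"), ([0.6, 0.0], "A")]

r1 = recall_at_n(queries, database, n=1)
r2 = recall_at_n(queries, database, n=2)
```

The third query lies closer to place B than to its true place A, so it is missed at N = 1 but recovered at N = 2.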

    Electronic Systems with High Energy Efficiency for Embedded Computer Vision

    Electronic systems are now widely adopted in everyday use, and embedded wearable and portable devices are used extensively in applications ranging from industrial to consumer. The growing demand for embedded devices and applications has opened several new research fields, driven by the need for low power consumption and real-time responsiveness. For this class of devices, computer vision algorithms are a challenging application target. In embedded computer vision, hardware and software design have to interact to meet application-specific requirements. The focus of this thesis is the study of computer vision algorithms for embedded systems. The work begins by presenting a novel algorithm for a stationary IoT use case targeting a high-end embedded device class, where power can be supplied to the platform through wires. Further contributions focus on algorithmic design and optimization for low and ultra-low power devices. Solutions are presented for gesture recognition and context change detection on wearable devices, focusing on first-person wearable devices (Ego-Centric Vision), with the aim of exploiting systems that are more constrained in terms of available power budget and computational resources. A novel gesture recognition algorithm is presented that improves on state-of-the-art approaches. We then demonstrate the effectiveness of exploiting low-resolution images in context change detection with real-world ultra-low power imagers. The last part of the thesis deals with more flexible software models to support multiple applications linked at runtime and executed on the Cortex-M device class, supporting critical isolation features typical of virtualization-ready CPUs on low-cost, low-power microcontrollers and addressing some shortcomings in the security and deployment capabilities of current firmware