
    Multi-task Learning for Radar Signal Characterisation

    Radio signal recognition is a crucial task in both civilian and military applications, as accurate and timely identification of unknown signals is an essential part of spectrum management and electronic warfare. The majority of research in this field has focused on applying deep learning to modulation classification, leaving the task of signal characterisation as an understudied area. This paper addresses this gap by presenting an approach that tackles radar signal classification and characterisation as a multi-task learning (MTL) problem. We propose the IQ Signal Transformer (IQST), among several reference architectures, which allows for simultaneous optimisation of multiple regression and classification tasks. We demonstrate the performance of our proposed MTL model on a synthetic radar dataset, while also providing a first-of-its-kind benchmark for radar signal characterisation. Comment: 5 pages, 3 figures.
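
    As a rough illustration of the MTL setup described above, the sketch below pairs a transformer encoder over patched IQ samples with one classification head and several regression heads trained jointly. It is written in PyTorch; the patching scheme, head layout and the example task names (pulse width, PRI) are illustrative assumptions, not the IQST specification from the paper.

        import torch
        import torch.nn as nn

        class MultiTaskIQTransformer(nn.Module):
            """Sketch of a multi-task transformer over raw IQ samples."""

            def __init__(self, patch_len=32, d_model=128, n_heads=4,
                         n_layers=4, n_classes=10, n_reg_tasks=2):
                super().__init__()
                self.patch_len = patch_len
                # Each patch carries patch_len complex samples as 2*patch_len reals.
                self.embed = nn.Linear(2 * patch_len, d_model)
                self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
                layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
                self.cls_head = nn.Linear(d_model, n_classes)       # signal class
                self.reg_heads = nn.ModuleList(                     # e.g. PW, PRI
                    [nn.Linear(d_model, 1) for _ in range(n_reg_tasks)])

            def forward(self, iq):                   # iq: (batch, 2, n_samples)
                b, _, n = iq.shape
                p = self.patch_len
                patches = (iq[..., :n - n % p]       # drop the ragged tail
                           .reshape(b, 2, -1, p).permute(0, 2, 1, 3)
                           .reshape(b, -1, 2 * p))
                tokens = torch.cat(
                    [self.cls_token.expand(b, -1, -1), self.embed(patches)], dim=1)
                summary = self.encoder(tokens)[:, 0]                # class token
                return self.cls_head(summary), [h(summary) for h in self.reg_heads]

    A joint objective would then sum a cross-entropy term over the class logits with mean-squared-error terms over the regression outputs, one per characterisation parameter.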

    Multimodal Image Correspondence

    Multimodal images are used across many application areas, including medicine and surveillance. Due to the differing characteristics of imaging modalities, developing image processing algorithms for multimodal images is challenging. This thesis proposes effective solutions to the challenging problem of multimodal semantic correspondence, in which connections between similar components are established across images from different modalities. The proposed methods, which are based on deep learning techniques, have been applied to several applications, including epilepsy type classification and 3D reconstruction of the human hand from visible and X-ray images. The proposed algorithms can also be adapted to many other imaging modalities.

    Multi-modal semantic image segmentation

    We propose a modality-invariant method to obtain high-quality semantic object segmentation of human body parts for four imaging modalities: visible images, X-ray images, thermal images (heatmaps) and infrared radiation (IR) images. We first consider two modalities (i.e. visible and X-ray images) to develop an architecture suitable for multi-modal semantic segmentation. Due to the intrinsic difference between images from the two modalities, state-of-the-art approaches such as Mask R-CNN do not perform satisfactorily. Insights from analysing how the intermediate layers within Mask R-CNN work on both visible and X-ray modalities led us to propose a new and efficient network architecture which yields highly accurate semantic segmentation results across both visible and X-ray domains. We design multi-task losses to train the network across the different modalities. Through multiple experiments on visible and X-ray images of the human upper extremity, we validate the proposed approach, which outperforms the traditional Mask R-CNN method by better exploiting the output features of CNNs. Based on the insights gained from the visible and X-ray domains, we extend the proposed multi-modal semantic segmentation method to two additional modalities (viz. heatmap and IR images). Experiments conducted on these two modalities further confirm our architecture's capacity to improve segmentation by exploiting the complementary information in the different image modalities. Our method can also be applied to other modalities and can be effectively utilized for several tasks, including medical image analysis tasks such as image registration and 3D reconstruction across modalities.
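
    The multi-task training across modalities mentioned above can be sketched as a single shared segmentation network whose per-modality losses are weighted and summed in each optimisation step. This is a minimal PyTorch sketch under assumed names; the paper's actual architecture and loss weighting are not reproduced here.

        import torch.nn.functional as F

        def multimodal_step(model, batches, weights, optimizer):
            """batches maps modality name -> (images, masks); model is shared."""
            optimizer.zero_grad()
            total = 0.0
            for modality, (images, masks) in batches.items():
                logits = model(images)                 # same weights for every modality
                loss = F.cross_entropy(logits, masks)  # per-pixel class labels
                total = total + weights[modality] * loss
            total.backward()
            optimizer.step()
            return float(total)

    Tuning the per-modality weights controls how strongly the shared features are pulled toward each domain.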

    Unified 2D and 3D hand pose estimation from a single visible or X-ray image

    Robust detection of the keypoints of the human hand from a single 2D image is a crucial step in many applications, including medical image processing, where X-ray images play a vital role. In this paper, we address the challenging problem of 2D and 3D hand pose estimation from a single hand image, where the image can be either in the visible spectrum or an X-ray. In contrast to state-of-the-art methods, which target hand pose estimation on visible images, we do not incorporate depth images into the training model, thereby making pose estimation more appealing in situations where access to depth images is not viable. Moreover, by training a unified model for both X-ray and visible images, where each modality captures different, complementary information, we elevate the accuracy of the overall model. We present a cascaded network architecture which utilizes a template mesh to estimate the deformations in the 2D images, with the estimates propagated through the cascade levels to increase accuracy.
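
    A minimal sketch of the cascade idea, assuming PyTorch: a learnable template of 2D keypoints stands in for the template mesh, and each cascade stage predicts a residual correction conditioned on image features and the current estimate. The backbone, stage count and 21-keypoint layout are illustrative assumptions, not the paper's design.

        import torch
        import torch.nn as nn

        class CascadedHandPose(nn.Module):
            def __init__(self, n_keypoints=21, n_stages=3, feat_dim=256):
                super().__init__()
                self.n_keypoints = n_keypoints
                self.backbone = nn.Sequential(         # stand-in image encoder
                    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten())
                # Learnable mean hand acting as the deformable template.
                self.template = nn.Parameter(torch.zeros(n_keypoints * 2))
                self.stages = nn.ModuleList(
                    nn.Linear(feat_dim + n_keypoints * 2, n_keypoints * 2)
                    for _ in range(n_stages))

            def forward(self, img):                    # img: (batch, 1, H, W)
                feat = self.backbone(img)
                pose = self.template.expand(img.shape[0], -1)
                for stage in self.stages:              # cascaded residual updates
                    pose = pose + stage(torch.cat([feat, pose], dim=1))
                return pose.view(-1, self.n_keypoints, 2)

    Supervising every stage with the keypoint loss, rather than only the last, is a common way to stabilise such cascades.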

    An online lighting model estimation using neural networks for augmented reality in handheld devices

    The level of realism in augmented reality applications is heavily dependent on the consistency of illumination between real and virtual objects. This paper presents a comprehensive methodology for modelling real-world lighting and applying it to the virtual objects being rendered. While there is a substantial body of knowledge on this aspect, the novel methodology suggested in this paper has the advantage of requiring neither prior knowledge of the environment nor any special hardware, which increases the usability of the system and makes it suitable for online use.
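
    As a sketch of the general idea, the snippet below regresses low-order spherical-harmonic lighting coefficients from a single camera frame with a small CNN; SH coefficients are a common lighting parameterisation, assumed here rather than taken from the paper, so treat every choice below as illustrative.

        import torch
        import torch.nn as nn

        lighting_net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 27))                 # 9 SH coefficients per RGB channel

        frame = torch.rand(1, 3, 240, 320)     # one camera frame
        sh = lighting_net(frame).view(3, 9)    # handed to the renderer per frame

    Because inference is a single forward pass per frame, such a model can plausibly run online on handheld hardware, matching the usability goal stated above.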

    Feature point tracking algorithm evaluation for augmented reality in handheld devices

    In augmented reality applications for handheld devices, the accuracy and speed of the tracking algorithm are two of the most critical parameters for achieving realism. This paper presents a comprehensive framework for evaluating feature tracking algorithms on these two parameters. While there is a substantial body of knowledge on these aspects, a novel feature introduced in this paper is the use of the error associated with the estimated directional movement as a performance measurement, which improves the evaluation framework. The work described in this paper is a comparative evaluation of nine widely used feature point tracking algorithms using the developed measurement framework, with the results interpreted based on the characteristics of the algorithms as well as those of the test image sequences.
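
    A minimal sketch of one such measurement, using OpenCV's pyramidal Lucas-Kanade tracker as a stand-in candidate: it times the tracker and scores the angular error between estimated and ground-truth movement directions. Ground-truth displacements are assumed available for the test sequence; the paper's nine algorithms and its sequences are not reproduced here.

        import time
        import numpy as np
        import cv2

        def evaluate(prev_gray, next_gray, pts, gt_disp):
            """pts: (N, 1, 2) float32 features; gt_disp: (N, 2) true motion."""
            t0 = time.perf_counter()
            new_pts, status, _ = cv2.calcOpticalFlowPyrLK(
                prev_gray, next_gray, pts, None)
            dt = time.perf_counter() - t0          # speed measurement
            ok = status.ravel() == 1
            est = (new_pts - pts).reshape(-1, 2)[ok]
            gt = gt_disp[ok]
            # Directional error: angle between estimated and true movement.
            cos = np.sum(est * gt, axis=1) / (
                np.linalg.norm(est, axis=1) * np.linalg.norm(gt, axis=1) + 1e-9)
            ang_err = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean()
            return dt, ang_err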

    Sparse over-complete patch matching

    Image patch matching, the process of identifying corresponding patches across images, is used as a subroutine in many computer vision and image processing tasks. State-of-the-art patch matching techniques take image patches as input to a convolutional neural network, which extracts patch features and evaluates their similarity. Our aim in this paper is to improve on these techniques by exploiting the fact that a sparse over-complete representation of an image possesses statistical properties of natural visual scenes that can be leveraged for patch matching. We propose a new paradigm in which a patch is first sparse-coded and this sparse representation is then used as input to a neural network that compares the patches. As sparse coding is based on a generative model of natural image patches, it can represent a patch in terms of the fundamental visual components from which it is composed, leading to similar sparse codes for patches built from similar components. Once the sparse-coded features are extracted, we employ a fully-connected neural network, which captures the non-linear relationships between features, for the comparison. We have evaluated our approach using the Liberty and Notredame subsets of the popular UBC patch dataset and set a new benchmark, outperforming all state-of-the-art patch matching techniques on these datasets.
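
    A minimal sketch of this pipeline, assuming scikit-learn for dictionary learning and PyTorch for the comparator: patches are sparse-coded against a learned over-complete dictionary, and pairs of codes are scored by a small fully-connected network. The dictionary size, sparsity level and network shape are illustrative guesses, not the paper's configuration.

        import numpy as np
        import torch
        import torch.nn as nn
        from sklearn.decomposition import MiniBatchDictionaryLearning

        patches = np.random.rand(1000, 64)        # 1000 flattened 8x8 patches
        coder = MiniBatchDictionaryLearning(
            n_components=256,                     # 256 > 64: over-complete
            transform_algorithm='omp',            # sparse inference per patch
            transform_n_nonzero_coefs=8)
        codes = coder.fit(patches).transform(patches)

        # Fully-connected comparator over a pair of sparse codes.
        matcher = nn.Sequential(
            nn.Linear(2 * 256, 512), nn.ReLU(),
            nn.Linear(512, 1), nn.Sigmoid())      # match probability
        pair = torch.from_numpy(np.hstack([codes[0], codes[1]])).float()
        score = matcher(pair)

    In practice the comparator would be trained on labelled matching and non-matching pairs, with the dictionary either fixed or refined jointly.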