    Artificial Intelligence for Multimedia Signal Processing

    Artificial intelligence technologies are actively applied to broadcasting and multimedia processing. Extensive research has been conducted in a wide variety of fields, such as content creation, transmission, and security, and over the past two to three years much of this work has aimed to improve the compression efficiency of image, video, speech, and other data in areas related to MPEG media processing technology. Additionally, technologies for media creation, processing, editing, and scenario creation are very important areas of research in multimedia processing and engineering. This book collects topics spanning advanced computational intelligence algorithms and technologies for emerging multimedia signal processing, including computer vision, speech/sound/text processing, and content analysis/information mining.

    Deep Learning in Cardiology

    The medical field is creating a large amount of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient at solving complicated medical tasks or at creating insights from big data. Deep learning has emerged as a more accurate and effective technology for a wide range of medical problems such as diagnosis, prediction, and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal, and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply to medicine in general, while proposing certain directions as the most viable for clinical use. Comment: 27 pages, 2 figures, 10 tables
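
The idea of stacked non-linear layers mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the review: the layer sizes, the ReLU non-linearity, and the random weights are assumptions chosen only for illustration.

```python
import numpy as np

def relu(x):
    """Element-wise non-linearity (assumed here; other choices are possible)."""
    return np.maximum(x, 0.0)

def forward(x, layers):
    """Apply each (weights, bias) pair followed by ReLU.

    Returns the input plus the output of every layer, so the successive
    representations can be inspected.
    """
    activations = [x]
    for weights, bias in layers:
        x = relu(x @ weights + bias)
        activations.append(x)
    return activations

# Hypothetical sizes: 4 input features -> 8 hidden units -> 3 output units.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(4, 8)), np.zeros(8)),
    (rng.normal(size=(8, 3)), np.zeros(3)),
]
features = forward(rng.normal(size=(2, 4)), layers)
```

Each entry of `features` is a progressively more transformed representation of the same two input samples; in a trained network the weights would be learned rather than drawn at random.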

    Deep learning based RGB-D vision tasks

    Depth is an important source of information in computer vision. However, depth is discarded in most vision tasks. In this thesis, we study the tasks of estimating depth from single monocular images and of incorporating depth into object detection and semantic segmentation. Recently, deep convolutional neural networks (CNNs) have brought a significant number of breakthroughs to the vision community. All of the algorithms in this thesis are built upon deep CNNs. The first part of this thesis addresses the task of incorporating depth into object detection and semantic segmentation. The aim is to improve the performance of vision tasks that are based only on RGB data. Two approaches for object detection and two approaches for semantic segmentation are presented. These approaches are based on existing depth estimation, object detection, and semantic segmentation algorithms. The second part of this thesis addresses the task of depth estimation. Depth estimation is often formulated as a regression task because depth is continuous. Deep CNNs for depth estimation are trained by iteratively minimizing regression errors between predicted and ground-truth depths. A drawback of regression is that it predicts depths without confidence. In this thesis, we propose to formulate depth estimation as a classification task, which naturally predicts depths with confidence. The confidence can be used during training and post-processing. We also propose to exploit ordinal depth relationships from stereo videos to improve the performance of metric depth estimation. To this end, we propose a Relative Depth in Stereo (RDIS) dataset that is densely annotated with relative depths. Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 201
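
The classification formulation described in this abstract can be sketched as follows. This is an illustrative reading, not the thesis's implementation: the depth range, the number of bins, the log-spaced binning, and the use of the maximum softmax probability as the confidence are all assumptions.

```python
import numpy as np

def make_depth_bins(d_min=0.5, d_max=80.0, num_bins=8):
    """Return edges and centers of log-spaced depth bins (assumed values)."""
    edges = np.logspace(np.log10(d_min), np.log10(d_max), num_bins + 1)
    centers = np.sqrt(edges[:-1] * edges[1:])  # geometric mean of each bin
    return edges, centers

def depth_to_class(depth, edges):
    """Map a metric depth to its bin index (the classification target)."""
    idx = np.searchsorted(edges, depth, side="right") - 1
    return np.clip(idx, 0, len(edges) - 2)

def predict_with_confidence(logits, centers):
    """Turn per-pixel class scores into a depth estimate and a confidence."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    best = probs.argmax(axis=-1)
    return centers[best], probs.max(axis=-1)

edges, centers = make_depth_bins()
# Hypothetical scores for one pixel: the network strongly prefers bin 2.
logits = np.array([[0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
depth, confidence = predict_with_confidence(logits, centers)
```

Unlike a single regressed value, the softmax output is a distribution over depth bins, so a flat distribution signals an uncertain prediction that could be down-weighted in training or post-processing.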

    3D Convolutional Neural Networks for Solving Complex Digital Agriculture and Medical Imaging Problems

    3D signals have become widely popular because they extend 2D signals with a third spatial or temporal dimension. 3D signals contain details absent from their 2D counterparts, such as the depth of an image, which is inherent to point clouds (PCs), or the temporal evolution of an image, which is inherent to time-series data such as videos. Despite this advantage, 3D signals are still underexploited in machine learning (ML) compared to 2D signals, mainly due to data scarcity. In this thesis, we determine the efficiency and influence of using both multispectral PCs and time-series data with 3D convolutional neural networks (CNNs). We evaluate the performance and utility of these networks and data in the context of two applications from the areas of digital agriculture and medical imaging. In particular, multispectral PCs are investigated for the problems of fusarium-head-blight (FHB) detection and estimation of the total number of spikelets, while time-series echocardiography is investigated for the problem of myocardial infarction (MI) detection. In the context of the digital agriculture application, two state-of-the-art datasets were created, namely the UW-MRDC WHEAT-PLANT PC dataset, consisting of 216 multispectral PCs of wheat plants, and the UW-MRDC WHEAT-HEAD PC dataset, consisting of 80 multispectral PCs of wheat heads. The samples of both datasets were acquired using a multispectral 3D scanner. Moreover, a real-time, parallel, GPU-enabled preprocessing method that runs 1065 times faster than its CPU counterpart was proposed to convert multispectral PCs into multispectral 3D images compatible with CNNs. Also, the UW-MRDC WHEAT-PLANT PC dataset was used to develop novel and efficient 3D CNNs for disease detection that automatically identify wheat infected with FHB from multispectral 3D images of wheat plants. 
In addition, the influence of the multispectral information on the detection performance was evaluated, and our results showed the dominance of the red, green, and blue (RGB) colour channels over both the near-infrared (NIR) channel alone and the RGB and NIR channels combined. Our best model for FHB detection in wheat plants achieved 100% accuracy. Furthermore, the UW-MRDC WHEAT-HEAD PC dataset was used to develop unique and efficient 3D CNNs for estimating the total number of spikelets in multispectral 3D images of wheat heads, in addition to adapting three benchmark 2D CNN architectures to 3D images for the same purpose. Our best model for estimating the total number of spikelets in wheat heads achieved a mean absolute error of 1.13, meaning that, on average, the estimated number of spikelets differs from the actual value by 1.13. Our 3D CNN for FHB detection in wheat achieved the highest accuracy amongst existing FHB detection models, and our 3D CNN for estimating the total number of spikelets in wheat is a unique and pioneering application. These results suggest that replacing arduous tasks that require the input of field experts and significant time with automated ML models in the context of digital agriculture is feasible and promising. In the context of the medical imaging application, an innovative, real-time, and fully automated pipeline based on 2D and 3D CNNs was proposed for early detection of MI, which is a deadly cardiac disorder, from a patient’s echocardiogram. The developed pipeline consists of a 2D CNN that performs data preprocessing by segmenting the left ventricle (LV) chamber from the apical 4-chamber (A4C) view of an echocardiogram, followed by a 3D CNN that performs MI detection in real time. The pipeline was trained and tested on the HMC-QU dataset, which consists of 162 echocardiograms. 
The 2D CNN achieved 97.18% accuracy on data segmentation, and the 3D CNN achieved 90.9% accuracy, 100% precision, 95% recall, and 97.2% F1 score. Our detection results outperformed existing state-of-the-art models that were tested on the HMC-QU dataset for MI detection. Moreover, our results demonstrate that developing a fully automated system for LV segmentation and MI detection is efficient and promising and could enable the creation of a tool that reliably suggests the presence of MI in a given echocardiogram on the fly. All the empirical results achieved in our thesis indicate the efficiency and reliability of 3D signals, namely multispectral PCs and videos, in developing detection and regression 3D CNN models that can achieve accurate and reliable results. Mitacs, EMILI, NSERC, Western Diversification Canada, The Faculty of Graduate Studies. Master of Science in Applied Computer Science
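
The evaluation metrics quoted in this abstract (mean absolute error for the spikelet counts; precision, recall, and F1 score for MI detection) follow standard definitions, which a short sketch can make concrete. The function names and the sample values below are hypothetical and are not the thesis's data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and estimated values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall, and F1, with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical spikelet counts: actual versus estimated.
mae = mean_absolute_error([10, 12, 14], [11, 12, 12])

# Hypothetical MI labels: 1 = infarction present, 0 = absent.
precision, recall, f1 = precision_recall_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Precision of 100% means no healthy case was flagged as MI, while recall below 100% means some MI cases were missed; F1 is the harmonic mean of the two.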

    Monocular Object Instance Segmentation and Depth Ordering with CNNs

    In this paper we tackle the problem of instance-level segmentation and depth ordering from a single monocular image. Towards this goal, we take advantage of convolutional neural nets and train them to directly predict instance-level segmentations where the instance ID encodes the depth ordering within image patches. To provide a coherent single explanation of an image, we develop a Markov random field which takes as input the predictions of convolutional neural nets applied at overlapping patches of different resolutions, as well as the output of a connected component algorithm. It aims to predict accurate instance-level segmentation and depth ordering. We demonstrate the effectiveness of our approach on the challenging KITTI benchmark and show good performance on both tasks. Comment: International Conference on Computer Vision (ICCV), 201

    Applications of Deep Learning Techniques for Automated Multiple Sclerosis Detection Using Magnetic Resonance Imaging: A Review

    Multiple Sclerosis (MS) is a type of brain disease that causes visual, sensory, and motor problems, with a detrimental effect on the functioning of the nervous system. Multiple screening methods have been proposed so far to diagnose MS; among them, magnetic resonance imaging (MRI) has received considerable attention from physicians. MRI modalities provide physicians with fundamental information about the structure and function of the brain, which is crucial for the rapid diagnosis of MS lesions. Diagnosing MS using MRI is time-consuming, tedious, and prone to manual errors. Research on implementing computer-aided diagnosis systems (CADS) based on artificial intelligence (AI) to diagnose MS involves conventional machine learning and deep learning (DL) methods. In conventional machine learning, the feature extraction, feature selection, and classification steps are carried out by trial and error; in contrast, in DL these steps are based on deep layers whose values are learned automatically. In this paper, a complete review of automated MS diagnosis methods performed using DL techniques with MRI neuroimaging modalities is provided. Initially, the steps involved in various CADS proposed using MRI modalities and DL techniques for MS diagnosis are investigated. The important preprocessing techniques employed in various works are analyzed. Most of the published papers on MS diagnosis using MRI modalities and DL are presented. The most significant challenges and future directions for automated diagnosis of MS using MRI modalities and DL techniques are also provided.