135 research outputs found

    Learning Sensory Representations with Minimal Supervision

    Get PDF

    Data-driven Channel Learning for Next-generation Communication Systems

    Get PDF
    University of Minnesota Ph.D. dissertation. October 2019. Major: Electrical/Computer Engineering. Advisor: Georgios Giannakis. 1 computer file (PDF); x, 116 pages.The turn of the decade has trademarked the `global society' as an information society, where the creation, distribution, integration, and manipulation of information have significant political, economic, technological, academic, and cultural implications. Its main drivers are digital information and communication technologies, which have resulted in a "data deluge", as the number of smart and Internet-capable devices increases rapidly. Unfortunately, establishing information infrastructure to collect data becomes more challenging particularly as communication networks for those devices become larger, denser, and more heterogeneous to meet the quality-of-service (QoS) for the users. Furthermore, scarcity in spectral resources due to an increased demand for mobile devices urges the development of a new methodology for wireless communications possibly facing unprecedented constraints both on hardware and software. At the same time, recent advances in machine learning tools enable statistical inference with efficiency as well as scalability in par with the volume and dimensionality of the data. These considerations justify the pressing need for machine learning tools that are amenable to new hardware and software constraints, and can scale with the size of networks, to facilitate the advanced operation of next-generation communication systems. The present thesis is centered on analytical and algorithmic foundations enabling statistical inference of critical information under practical hardware/software constraints to design and operate wireless communication networks. The vision is to establish a unified and comprehensive framework based on state-of-the-art data-driven learning and Bayesian inference tools to learn the channel-state information that is accurate yet efficient and non-demanding in terms of resources. The central goal is to theoretically, algorithmically, and experimentally demonstrate how valuable insights from data-driven learning can lead to solutions that markedly advance the state-of-the-art performance on inference of channel-state information. To this end, the present thesis investigates two main research thrusts: i) channel-gain cartography leveraging low-rank and sparsity; and ii) Bayesian approaches to channel-gain cartography for spatially heterogeneous environment. The aforementioned research thrusts introduce novel algorithms that aim to tackle the issues of next-generation communication networks. Potential of the proposed algorithms is showcased by rigorous theoretical results and extensive numerical tests

    Towards Vehicle-to-everything Autonomous Driving: A Survey on Collaborative Perception

    Full text link
    Vehicle-to-everything (V2X) autonomous driving opens up a promising direction for developing a new generation of intelligent transportation systems. Collaborative perception (CP) as an essential component to achieve V2X can overcome the inherent limitations of individual perception, including occlusion and long-range perception. In this survey, we provide a comprehensive review of CP methods for V2X scenarios, bringing a profound and in-depth understanding to the community. Specifically, we first introduce the architecture and workflow of typical V2X systems, which affords a broader perspective to understand the entire V2X system and the role of CP within it. Then, we thoroughly summarize and analyze existing V2X perception datasets and CP methods. Particularly, we introduce numerous CP methods from various crucial perspectives, including collaboration stages, roadside sensors placement, latency compensation, performance-bandwidth trade-off, attack/defense, pose alignment, etc. Moreover, we conduct extensive experimental analyses to compare and examine current CP methods, revealing some essential and unexplored insights. Specifically, we analyze the performance changes of different methods under different bandwidths, providing a deep insight into the performance-bandwidth trade-off issue. Also, we examine methods under different LiDAR ranges. To study the model robustness, we further investigate the effects of various simulated real-world noises on the performance of different CP methods, covering communication latency, lossy communication, localization errors, and mixed noises. In addition, we look into the sim-to-real generalization ability of existing CP methods. At last, we thoroughly discuss issues and challenges, highlighting promising directions for future efforts. Our codes for experimental analysis will be public at https://github.com/memberRE/Collaborative-Perception.Comment: 19 page

    Single View 3D Reconstruction using Deep Learning

    Get PDF
    One of the major challenges in the field of Computer Vision has been the reconstruction of a 3D object or scene from a single 2D image. While there are many notable examples, traditional methods for single view reconstruction often fail to generalise due to the presence of many brittle hand-crafted engineering solutions, limiting their applicability to real world problems. Recently, deep learning has taken over the field of Computer Vision and ”learning to reconstruct” has become the dominant technique for addressing the limitations of traditional methods when performing single view 3D reconstruction. Deep learning allows our reconstruction methods to learn generalisable image features and monocular cues that would otherwise be difficult to engineer through ad-hoc hand-crafted approaches. However, it can often be difficult to efficiently integrate the various 3D shape representations within the deep learning framework. In particular, 3D volumetric representations can be adapted to work with Convolutional Neural Networks, but they are computationally expensive and memory inefficient when using local convolutional layers. Also, the successful learning of generalisable feature representations for 3D reconstruction requires large amounts of diverse training data. In practice, this is challenging for 3D training data, as it entails a costly and time consuming manual data collection and annotation process. Researchers have attempted to address these issues by utilising self-supervised learning and generative modelling techniques, however these approaches often produce suboptimal results when compared with models trained on larger datasets. This thesis addresses several key challenges incurred when using deep learning for ”learning to reconstruct” 3D shapes from single view images. We observe that it is possible to learn a compressed representation for multiple categories of the 3D ShapeNet dataset, improving the computational and memory efficiency when working with 3D volumetric representations. To address the challenge of data acquisition, we leverage deep generative models to ”hallucinate” hidden or latent novel viewpoints for a given input image. Combining these images with depths estimated by a self-supervised depth estimator and the known camera properties, allowed us to reconstruct textured 3D point clouds without any ground truth 3D training data. Furthermore, we show that is is possible to improve upon the previous self-supervised monocular depth estimator by adding a self-attention and a discrete volumetric representation, significantly improving accuracy on the KITTI 2015 dataset and enabling the estimation of uncertainty depth predictions.Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 202

    Fifteenth Biennial Status Report: March 2019 - February 2021

    Get PDF

    Generic Object Detection and Segmentation for Real-World Environments

    Get PDF

    Applications

    Get PDF
    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases in electronics, steel production and milling for quality control during manufacturing processes in traffic, logistics for smart cities and for mobile communications

    Automated Diagnosis of Cardiovascular Diseases from Cardiac Magnetic Resonance Imaging Using Deep Learning Models: A Review

    Full text link
    In recent years, cardiovascular diseases (CVDs) have become one of the leading causes of mortality globally. CVDs appear with minor symptoms and progressively get worse. The majority of people experience symptoms such as exhaustion, shortness of breath, ankle swelling, fluid retention, and other symptoms when starting CVD. Coronary artery disease (CAD), arrhythmia, cardiomyopathy, congenital heart defect (CHD), mitral regurgitation, and angina are the most common CVDs. Clinical methods such as blood tests, electrocardiography (ECG) signals, and medical imaging are the most effective methods used for the detection of CVDs. Among the diagnostic methods, cardiac magnetic resonance imaging (CMR) is increasingly used to diagnose, monitor the disease, plan treatment and predict CVDs. Coupled with all the advantages of CMR data, CVDs diagnosis is challenging for physicians due to many slices of data, low contrast, etc. To address these issues, deep learning (DL) techniques have been employed to the diagnosis of CVDs using CMR data, and much research is currently being conducted in this field. This review provides an overview of the studies performed in CVDs detection using CMR images and DL techniques. The introduction section examined CVDs types, diagnostic methods, and the most important medical imaging techniques. In the following, investigations to detect CVDs using CMR images and the most significant DL methods are presented. Another section discussed the challenges in diagnosing CVDs from CMR data. Next, the discussion section discusses the results of this review, and future work in CVDs diagnosis from CMR images and DL techniques are outlined. The most important findings of this study are presented in the conclusion section
    • …
    corecore