324 research outputs found

    Neural Image Compression via Non-Local Attention Optimization and Improved Context Modeling

    This paper proposes a novel Non-Local Attention optimization and Improved Context modeling-based image compression (NLAIC) algorithm, which is built on top of the deep neural network (DNN)-based variational auto-encoder (VAE) structure. Our NLAIC 1) embeds non-local network operations as non-linear transforms in the encoders and decoders for both the image and the latent representation probability information (known as hyperprior) to capture both local and global correlations, 2) applies an attention mechanism to generate masks that are used to weigh the features, which implicitly adapts bit allocation for feature elements based on their importance, and 3) implements improved conditional entropy modeling of latent features using joint 3D convolutional neural network (CNN)-based autoregressive contexts and hyperpriors. For practical application, additional enhancements are introduced to speed up processing (e.g., parallel 3D CNN-based context prediction), reduce memory consumption (e.g., sparse non-local processing), and lower implementation complexity (e.g., a unified model for variable rates without re-training). The proposed model outperforms existing methods, both learned and conventional (e.g., BPG, JPEG2000, JPEG), on the Kodak and CLIC datasets, achieving state-of-the-art compression efficiency for both PSNR and MS-SSIM distortion metrics. Comment: arXiv admin note: substantial text overlap with arXiv:1904.0975

    Efficient Methods for Computational Light Transport

    In this thesis we present contributions to different challenges of computational light transport. Light transport algorithms are present in many modern applications, from image generation for visual effects to real-time object detection. Light is a rich source of information that allows us to understand and represent our surroundings, but obtaining and processing this information presents many challenges due to its complex interactions with matter. This thesis provides advances in this subject from two different perspectives: steady-state algorithms, where the speed of light is assumed infinite, and transient-state algorithms, which deal with light as it travels not only through space but also through time. Our steady-state contributions address problems in both offline and real-time rendering. We target variance reduction in offline rendering by proposing a new efficient method for participating media rendering. In real-time rendering, we target the energy constraints of mobile devices by proposing a power-efficient rendering framework for real-time graphics applications. In the transient state, we first formalize light transport simulation in this domain, and present new efficient sampling methods and algorithms for transient rendering. Finally, we demonstrate the potential of simulated data to correct multipath interference in Time-of-Flight cameras, one of the pathological problems in transient imaging.

    Review : Deep learning in electron microscopy

    Deep learning is transforming most areas of science and technology, including electron microscopy. This review paper offers a practical perspective aimed at developers with limited familiarity. For context, we review popular applications of deep learning in electron microscopy. Next, we discuss the hardware and software needed to get started with deep learning and to interface with electron microscopes. We then review neural network components, popular architectures, and their optimization. Finally, we discuss future directions of deep learning in electron microscopy.

    Generic Object Detection and Segmentation for Real-World Environments


    Deep learning for internet of underwater things and ocean data analytics

    The Internet of Underwater Things (IoUT) is an emerging technological ecosystem developed for connecting objects in maritime and underwater environments. IoUT technologies are empowered by an extreme number of deployed sensors and actuators. In this thesis, multiple IoUT sensory data are augmented with machine intelligence for forecasting purposes.

    Architectures for ubiquitous 3D on heterogeneous computing platforms

    Today, a wide scope for 3D graphics applications exists, including domains such as scientific visualization, 3D-enabled web pages, and entertainment. At the same time, the devices and platforms that run and display the applications are more heterogeneous than ever. Display environments range from mobile devices to desktop systems and ultimately to distributed displays that facilitate collaborative interaction. While the capability of the client devices may vary considerably, the visualization experiences running on them should be consistent. The field of application should dictate how and on what devices users access the application, not the technical requirements to realize the 3D output. The goal of this thesis is to examine the diverse challenges involved in providing consistent and scalable visualization experiences to heterogeneous computing platforms and display setups. While we could not address the myriad of possible use cases, we developed a comprehensive set of rendering architectures in the major domains of scientific and medical visualization, web-based 3D applications, and movie virtual production. To provide the required service quality, performance, and scalability for different client devices and displays, our architectures focus on the efficient utilization and combination of the available client, server, and network resources. We present innovative solutions that incorporate methods for hybrid and distributed rendering as well as means to manage data sets and stream rendering results. We establish the browser as a promising platform for accessible and portable visualization services. We collaborated with experts from the medical field and the movie industry to evaluate the usability of our technology in real-world scenarios. The presented architectures achieve a wide coverage of display and rendering setups and at the same time share major components and concepts. 
Thus, they build a strong foundation for a unified system that supports a variety of use cases.

    Internet of Underwater Things and Big Marine Data Analytics -- A Comprehensive Survey

    The Internet of Underwater Things (IoUT) is an emerging communication ecosystem developed for connecting underwater objects in maritime and underwater environments. The IoUT technology is intricately linked with intelligent boats and ships, smart shores and oceans, automatic marine transportation, positioning and navigation, underwater exploration, disaster prediction and prevention, as well as with intelligent monitoring and security. The IoUT has an influence at various scales, ranging from a small scientific observatory, to a midsized harbor, to global oceanic trade. The network architecture of IoUT is intrinsically heterogeneous and should be sufficiently resilient to operate in harsh environments. This creates major challenges in terms of underwater communications, whilst relying on limited energy resources. Additionally, the volume, velocity, and variety of data produced by sensors, hydrophones, and cameras in IoUT is enormous, giving rise to the concept of Big Marine Data (BMD), which has its own processing challenges. Hence, conventional data processing techniques will falter, and bespoke Machine Learning (ML) solutions have to be employed for automatically learning the specific BMD behavior and features, facilitating knowledge extraction and decision support. The motivation of this paper is to comprehensively survey the IoUT, BMD, and their synthesis. It also aims to explore the nexus of BMD with ML. We set out from underwater data collection and then discuss the family of IoUT data communication techniques with an emphasis on the state-of-the-art research challenges. We then review the suite of ML solutions suitable for BMD handling and analytics. We treat the subject deductively from an educational perspective, critically appraising the material surveyed. Comment: 54 pages, 11 figures, 19 tables, IEEE Communications Surveys & Tutorials, peer-reviewed academic journal

    Machine Learning Algorithms for Robotic Navigation and Perception and Embedded Implementation Techniques

    The abstract is in the attachment.