129 research outputs found

    Learning-based Wavelet-like Transforms For Fully Scalable and Accessible Image Compression

    The goal of this thesis is to improve the existing wavelet transform with the aid of machine learning techniques, to enhance the coding efficiency of wavelet-based image compression frameworks such as JPEG 2000. In this thesis, we first propose to augment the conventional base wavelet transform with two additional learned lifting steps -- a high-to-low step followed by a low-to-high step. The high-to-low step suppresses aliasing in the low-pass band by using the detail bands at the same resolution, while the low-to-high step aims to further remove redundancy from the detail bands by using the corresponding low-pass band. These two additional steps reduce redundancy (notably aliasing information) amongst the wavelet subbands, and also improve the visual quality of reconstructed images at reduced resolutions. To train these two networks in an end-to-end fashion, we develop a backward annealing approach to overcome the non-differentiability of the quantization and cost functions during back-propagation. Importantly, the two additional networks share a common architecture, named a proposal-opacity topology, which is inspired and guided by a specific theoretical argument related to geometric flow. This particular network topology is compact, with limited non-linearities, allowing a fully scalable system; a single pair of trained networks is applied to all levels of decomposition and all bit-rates of interest. By employing the additional lifting networks within the JPEG 2000 image coding standard, we can achieve up to 17.4% average BD bit-rate saving over a wide range of bit-rates, while retaining the quality and resolution scalability features of JPEG 2000. Building upon the success of the high-to-low and low-to-high steps, we then study more broadly the extension of neural networks to all lifting steps that correspond to the base wavelet transform. 
The purpose of this comprehensive study is to understand the most effective way to develop learned wavelet-like transforms for highly scalable and accessible image compression. Specifically, we examine the impact of the number of learned lifting steps, the number of layers and the number of channels in each learned lifting network, and the kernel support in each layer. To facilitate the study, we develop a generic training methodology that is appropriate for all lifting structures considered. Experimental results ultimately suggest that to improve the existing wavelet transform, it is more profitable to augment a larger wavelet transform with more diverse high-to-low and low-to-high steps, rather than developing deep fully learned lifting structures.
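
The lifting structure described in this abstract can be illustrated with a minimal one-dimensional sketch. It assumes the reversible LeGall 5/3 transform (the lossless wavelet of JPEG 2000) as the base transform and an even-length integer input. The operators `h2l` and `l2h` are hypothetical fixed integer functions standing in for the trained high-to-low and low-to-high networks, which the thesis applies to 2D subbands. The sketch only shows that extra lifting steps stay exactly invertible whatever operators are plugged in; it is not the thesis's implementation.

```python
def forward_53(x):
    """One level of the reversible LeGall 5/3 lifting transform (even-length input)."""
    even, odd = x[0::2], x[1::2]
    n = len(odd)
    # Predict: each odd sample minus the floor-average of its even neighbours
    # (symmetric extension at the right edge).
    high = [odd[i] - (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    # Update: each even sample plus a rounded quarter of the neighbouring details
    # (symmetric extension at the left edge).
    low = [even[i] + (high[max(i - 1, 0)] + high[i] + 2) // 4 for i in range(n)]
    return low, high


def inverse_53(low, high):
    """Undo forward_53 by running the lifting steps in reverse order."""
    n = len(low)
    even = [low[i] - (high[max(i - 1, 0)] + high[i] + 2) // 4 for i in range(n)]
    odd = [high[i] + (even[i] + even[min(i + 1, n - 1)]) // 2 for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x


def h2l(high, i):
    # Hypothetical stand-in for the learned high-to-low network.
    return high[i] // 8


def l2h(low, i):
    # Hypothetical stand-in for the learned low-to-high network.
    return (low[i] - low[max(i - 1, 0)]) // 16


def extra_forward(low, high):
    """Additional lifting steps: high-to-low first, then low-to-high."""
    low2 = [l - h2l(high, i) for i, l in enumerate(low)]
    high2 = [h - l2h(low2, i) for i, h in enumerate(high)]
    return low2, high2


def extra_inverse(low2, high2):
    """Invert the additional steps in the opposite order."""
    high = [h + l2h(low2, i) for i, h in enumerate(high2)]
    low = [l + h2l(high, i) for i, l in enumerate(low2)]
    return low, high


signal = [12, 15, 14, 10, 9, 13, 20, 22]
low, high = extra_forward(*forward_53(signal))
restored = inverse_53(*extra_inverse(low, high))
assert restored == signal
```

Each inverse step recomputes the same operator output from a band that has already been restored, so reconstruction is exact even when the operator is non-linear.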

    JPEG2000-Based Semantic Image Compression using CNN

    Convolutional neural networks (CNNs) and related AI techniques have been highly successful in computer vision applications such as image understanding, recognition, and processing. They are used far less often in low-level vision applications such as image compression. Improving the visual quality of lossy image and video compression has long been a major challenge. Large training datasets and recent advances in computing power now make it possible to address image processing and recognition tasks with deep CNNs. This paper presents a novel CNN-based compression framework comprising a Compact CNN (ComCNN) and a Reconstruction CNN (RecCNN), which are trained concurrently and integrated into a single framework, together with MS-ROI (Multi Structure-Region of Interest) mapping, which highlights the semantically notable portions of the image. On the 6 main test images, the framework attains a mean PSNR of 32.9 dB, a gain of 3.52 dB, and a mean SSIM of 0.9262, a gain of 0.0723, over the other methods compared. Experimental results show that the architecture substantially outperforms image compression frameworks that use deblocking or denoising post-processing, as measured by Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM): on the 50 test images it reaches a mean PSNR, SSIM, and compression ratio of 38.45, 0.9602, and 1.75x respectively, obtaining state-of-the-art performance at Quality Factor (QF) = 5.
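
For reference, the PSNR figures quoted in this abstract follow the standard definition, PSNR = 10 log10(peak^2 / MSE). The sketch below assumes 8-bit images flattened to lists of pixel values; the function name and the `peak` parameter are illustrative and not taken from the paper.

```python
import math


def psnr(reference, distorted, peak=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(distorted):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic, a gain between two methods is the difference of their dB values. SSIM is a unitless index, so an SSIM gain is a plain difference.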

    Handbook of Digital Face Manipulation and Detection

    This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation, such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics, including contributions from academia and industry. Appealing to a broad readership, the introductory chapters provide a comprehensive overview of the topic for readers wishing to gain a brief overview of the state of the art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing to further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area.

    Intelligent Biosignal Processing in Wearable and Implantable Sensors

    This reprint provides a collection of papers illustrating the state-of-the-art of smart processing of data coming from wearable, implantable or portable sensors. Each paper presents the design, databases used, methodological background, obtained results, and their interpretation for biomedical applications. Revealing examples are brain–machine interfaces for medical rehabilitation, the evaluation of sympathetic nerve activity, a novel automated diagnostic tool based on ECG data to diagnose COVID-19, machine learning-based hypertension risk assessment by means of photoplethysmography and electrocardiography signals, Parkinsonian gait assessment using machine learning tools, thorough analysis of compressive sensing of ECG signals, development of a nanotechnology application for decoding vagus-nerve activity, detection of liver dysfunction using a wearable electronic nose system, prosthetic hand control using surface electromyography, epileptic seizure detection using a CNN, and premature ventricular contraction detection using deep metric learning. Thus, this reprint presents significant clinical applications as well as valuable new research issues, providing current illustrations of this new field of research by addressing the promises, challenges, and hurdles associated with the synergy of biosignal processing and AI through 16 different pertinent studies. Covering a wide range of research and application areas, this book is an excellent resource for researchers, physicians, academics, and PhD or master students working on (bio)signal and image processing, AI, biomaterials, biomechanics, and biotechnology with applications in medicine

    3D Medical Image Lossless Compressor Using Deep Learning Approaches

    Accelerated information processing, communication, and storage are major requirements of the big-data era. With the extensive rise in data availability, easy information acquisition, and growing data rates, efficient data handling has become a critical challenge. Even with advanced hardware and the availability of multiple Graphics Processing Units (GPUs), there is still strong demand to utilise these technologies effectively. Healthcare systems are one of the domains yielding explosive data growth, especially given that modern scanners produce higher-resolution and more densely sampled medical images every year, with increasing requirements for massive storage capacity. The bottleneck in data transmission and storage can be addressed with an effective compression method. Since medical information is critical and plays an influential role in diagnostic accuracy, exact reconstruction with no loss in quality should be guaranteed, which is the main objective of any lossless compression algorithm. The revolutionary impact of Deep Learning (DL) methods, which achieve state-of-the-art results in many tasks including data compression, opens tremendous opportunities for contributions. While considerable efforts have been made to address lossy compression using learning-based approaches, less attention has been paid to lossless compression. This PhD thesis investigates and proposes novel learning-based approaches for compressing 3D medical images losslessly. Firstly, we formulate the lossless compression task as a supervised sequential prediction problem, whereby a model learns a projection function to predict a target voxel given a sequence of samples from its spatially surrounding voxels. 
This prediction paradigm uses 3D local sampling information to exploit spatial similarities and redundancies in volumetric medical data efficiently. The proposed NN-based data predictor is trained to minimise the differences with the original data values, while the residual errors are encoded using arithmetic coding to allow lossless reconstruction. Following this, we explore the effectiveness of Recurrent Neural Networks (RNNs) as a 3D predictor for learning the mapping function from the spatial medical domain (16-bit depth). We analyse the generalisability and robustness of Long Short-Term Memory (LSTM) models in capturing the 3D spatial dependencies of a voxel’s neighbourhood while utilising samples taken from various scanning settings. We evaluate our proposed MedZip models in compressing unseen Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities losslessly, compared to other state-of-the-art lossless compression standards. This work investigates input configurations and sampling schemes for a many-to-one sequence prediction model, specifically for compressing 3D medical images (16-bit depth) losslessly. The main objective is to determine the optimal practice for enabling the proposed LSTM model to achieve a high compression ratio and fast encoding-decoding performance. A solution to the problem of non-deterministic environments is also proposed, allowing models to run in parallel without much drop in compression performance. Experimental evaluations against well-known lossless codecs were carried out on datasets acquired by different hospitals, representing different body segments and distinct scanning modalities (i.e. CT and MRI). To conclude, we present a novel data-driven sampling scheme utilising weighted gradient scores for training LSTM prediction-based models. 
The objective is to determine whether some training samples are significantly more informative than others, specifically in medical domains where samples are available on a scale of billions. The effectiveness of models trained with the presented importance sampling scheme was evaluated against alternative strategies such as uniform, Gaussian, and sliced-based sampling.
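
The prediction-plus-residual idea in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: the voxels are taken as an already flattened integer sequence, `last_value_predictor` is a hypothetical stand-in for the trained LSTM predictor, the context length `k` is arbitrary, and the arithmetic coder is replaced by an empirical entropy estimate of the residuals.

```python
import math
from collections import Counter


def last_value_predictor(context):
    # Hypothetical stand-in for the learned predictor: repeat the previous voxel.
    return context[-1] if context else 0


def encode(voxels, predict, k=4):
    """Replace each voxel by its prediction residual, using up to k causal samples."""
    residuals = []
    for i, v in enumerate(voxels):
        context = voxels[max(0, i - k):i]
        residuals.append(v - predict(context))
    return residuals


def decode(residuals, predict, k=4):
    """Rebuild the voxels exactly; the decoder sees the same causal context."""
    voxels = []
    for r in residuals:
        context = voxels[max(0, len(voxels) - k):]
        voxels.append(r + predict(context))
    return voxels


def entropy_bits(symbols):
    """Empirical zeroth-order entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


scan = [1000 + 3 * i for i in range(50)]  # smooth synthetic intensity ramp
residuals = encode(scan, last_value_predictor)
assert decode(residuals, last_value_predictor) == scan
```

Reconstruction is lossless because the decoder runs the same predictor on voxels it has already recovered. A better predictor concentrates the residuals around a few values, which is what lets an entropy coder spend fewer bits on them.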

    Technology, Science and Culture

    Following the success of the first and second volumes of this series, we are enthusiastic to continue our discussions on research topics related to the fields of Food Science, Intelligent Systems, Molecular Biomedicine, Water Science, and Creation and Theories of Culture. Our aims are to discuss the newest topics, theories, and research methods in each of these fields, to promote debate among top researchers and graduate students, and to generate collaborative work among them.

    Cultural Heritage on line

    The 2nd International Conference "Cultural Heritage online – Empowering users: an active role for user communities" was held in Florence on 15-16 December 2009. It was organised by the Fondazione Rinascimento Digitale, the Italian Ministry for Cultural Heritage and Activities and the Library of Congress, through the National Digital Information Infrastructure and Preservation Program (NDIIPP) partners. The conference topics were related to digital libraries, digital preservation and the changing paradigms, focusing on user needs and expectations and analysing how to involve users and the cultural heritage community in creating and sharing digital resources. The sessions also investigated new organisational issues and roles, and cultural and economic limits, from an international perspective.

    XLIII Jornadas de Automática: libro de actas: 7, 8 y 9 de septiembre de 2022, Logroño (La Rioja)

    [Abstract] The Jornadas de Automática (JA) are the most important event of the Comité Español de Automática (CEA), a scientific and technical body with more than fifty years of history devoted to disseminating and establishing automatic control in society. This year marks the forty-third edition of the JA, which are the meeting point for the automatic control community in our country. This edition will give visibility to the new challenges and results in the field and to their use in a large number of applications, including renewable energy, bioengineering, and assistive robotics. Beyond the scientific component, which is reflected in this book of proceedings, the JA are a meeting point for different generations of lecturers, researchers, and professionals, including a social component that is of vital importance. The 2022 edition of the JA is held in Logroño, capital of La Rioja, a region known worldwide for the quality of its Denominación de Origen wines and one that has taken on the challenge of becoming more competitive through the green and digital transformation. It is also known as the cradle of the Castilian language and for promoting the Valle de la Lengua with the help of new technologies, among them intelligent automation. The organisers of these JA, who belong to the Área de Ingeniería de Sistemas y Automática of the Departamento de Ingeniería Eléctrica at the Universidad de La Rioja (UR), are a fundamental pillar in supporting the region in the study, implementation, and dissemination of these challenges. This edition, the first held entirely in person since the covid-19 pandemic, has more than 200 attendees and is split between the Edificio Politécnico of the Escuela Técnica Superior de Ingeniería Industrial and the Monasterio de Yuso in San Millán de la Cogolla, two exceptional settings for the JA. 
As part of the scientific programme, two plenary sessions will focus, respectively, on control solutions for addressing the new energy challenges and on data quality for unbiased and trustworthy artificial intelligence (AI). In addition, two round tables will debate applications of AI and the adoption of digital technology in professional practice. We also highlight two master classes on state-of-the-art technology, to be given by industry professionals. The JA will also host two competitions: CEABOT, with humanoid robots, and the Concurso de Ingeniería de Control, focused on UAVs. To all these activities must be added the meetings of the CEA thematic groups, the poster exhibitions of the papers submitted to the JA, and the company exhibition stands. Finally, the “Premio Nacional de Automática” (2022 edition) and the “Premio CEA al Talento Femenino en Automática”, sponsored by the Government of La Rioja (in its first edition), will be presented during the event, along with various awards from the activities of the CEA thematic groups. The proceedings of the XLIII Jornadas de Automática comprise a total of 143 papers, organised around the nine Thematic Groups and the two Strategic Lines of CEA. The selected papers have undergone a peer-review process.

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest in which to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of ∼7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).
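
The two ingredients named in this abstract can be sketched in a few lines. This is a minimal illustration and not the paper's pipeline: the 3x3 homography `H` is assumed to be already known (in practice it would be derived from the camera motion and a scene plane), boxes are axis-aligned `(xmin, ymin, xmax, ymax)` tuples, and per-frame detector scores are treated as class likelihoods. All function names are hypothetical.

```python
import math


def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def propagate_box(H, box):
    """Warp the four box corners and return their axis-aligned bounding box."""
    xmin, ymin, xmax, ymax = box
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    xs, ys = zip(*(apply_homography(H, c) for c in corners))
    return (min(xs), min(ys), max(xs), max(ys))


def bayes_update(prior, likelihood):
    """One recursive Bayesian step: posterior is proportional to prior * likelihood."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def entropy(dist):
    """Shannon entropy in bits of a discrete class distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


# A pure image-plane translation as the simplest homography.
H = [[1, 0, 5], [0, 1, -2], [0, 0, 1]]
region = propagate_box(H, (10, 20, 30, 40))

# Fuse two consistent observations over three object classes.
belief = [1 / 3, 1 / 3, 1 / 3]
for scores in ([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]):
    belief = bayes_update(belief, scores)
```

The propagated box serves as a region of interest in the next frame, and repeated updates concentrate the class belief, which is the effect a lower categorization entropy measures.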

    Recent Advances in Signal Processing

    The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity