259 research outputs found

    Mining complex data in highly streaming environments

    Get PDF
    Data is growing at a rapid rate because of advanced hardware and software technologies and platforms such as e-health systems, sensor networks, and social media. One of the challenging problems is storing, processing and transferring this big data in an efficient and effective way. One solution to tackle these challenges is to construct synopsis by means of data summarization techniques. Motivated by the fact that without summarization, processing, analyzing and communicating this vast amount of data is inefficient, this thesis introduces new summarization frameworks with the main goals of reducing communication costs and accelerating data mining processes in different application scenarios. Specifically, we study the following big data summarizaion techniques:(i) dimensionality reduction;(ii)clustering,and(iii)histogram, considering their importance and wide use in various areas and domains. In our work, we propose three different frameworks using these summarization techniques to cover three different aspects of big data:"Volume","Velocity"and"Variety" in centralized and decentralized platforms. We use dimensionality reduction techniques for summarizing large 2D-arrays, clustering and histograms for processing multiple data streams. With respect to the importance and rapid growth of emerging e-health applications such as tele-radiology and tele-medicine that require fast, low cost, and often lossless access to massive amounts of medical images and data over band limited channels,our first framework attempts to summarize streams of large volume medical images (e.g. X-rays) for the purpose of compression. Significant amounts of correlation and redundancy exist across different medical images. These can be extracted and used as a data summary to achieve better compression, and consequently less storage and less communication overheads on the network. We propose a novel memory-assisted compression framework as a learning-based universal coding, which can be used to complement any existing algorithm to further eliminate redundancies/similarities across images. This approach is motivated by the fact that, often in medical applications, massive amounts of correlated images from the same family are available as training data for learning the dependencies and deriving appropriate reference or synopses models. The models can then be used for compression of any new image from the same family. In particular, dimensionality reduction techniques such as Principal Component Analysis (PCA) and Non-negative Matrix Factorization (NMF) are applied on a set of images from training data to form the required reference models. The proposed memory-assisted compression allows each image to be processed independently of other images, and hence allows individual image access and transmission. In the second part of our work,we investigate the problem of summarizing distributed multidimensional data streams using clustering. We devise a distributed clustering framework, DistClusTree, that extends the centralized ClusTree approach. The main difficulty in distributed clustering is balancing communication costs and clustering quality. We tackle this in DistClusTree through combining spatial index summaries and online tracking for efficient local and global incremental clustering. We demonstrate through extensive experiments the efficacy of the framework in terms of communication costs and approximate clustering quality. In the last part, we use a multidimensional index structure to merge distributed summaries in the form of a centralized histogram as another widely used summarization technique with the application in approximate range query answering. In this thesis, we propose the index-based Distributed Mergeable Summaries (iDMS) framework based on kd-trees that addresses these challenges with data generative models of Gaussian mixture models (GMMs) and a Generative Adversarial Network (GAN). iDMS maintains a global approximate kd-tree at a central site via GMMs or GANs upon new arrivals of streaming data at local sites. Experimental results validate the effectiveness and efficiency of iDMS against baseline distributed settings in terms of approximation error and communication costs

    Discrete Wavelet Transforms

    Get PDF
    The discrete wavelet transform (DWT) algorithms have a firm position in processing of signals in several areas of research and industry. As DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve and treat more and more advanced problems. The present book: Discrete Wavelet Transforms: Algorithms and Applications reviews the recent progress in discrete wavelet transform algorithms and applications. The book covers a wide range of methods (e.g. lifting, shift invariance, multi-scale analysis) for constructing DWTs. The book chapters are organized into four major parts. Part I describes the progress in hardware implementations of the DWT algorithms. Applications include multitone modulation for ADSL and equalization techniques, a scalable architecture for FPGA-implementation, lifting based algorithm for VLSI implementation, comparison between DWT and FFT based OFDM and modified SPIHT codec. Part II addresses image processing algorithms such as multiresolution approach for edge detection, low bit rate image compression, low complexity implementation of CQF wavelets and compression of multi-component images. Part III focuses watermaking DWT algorithms. Finally, Part IV describes shift invariant DWTs, DC lossless property, DWT based analysis and estimation of colored noise and an application of the wavelet Galerkin method. The chapters of the present book consist of both tutorial and highly advanced material. Therefore, the book is intended to be a reference text for graduate students and researchers to obtain state-of-the-art knowledge on specific applications

    3D Medical Image Lossless Compressor Using Deep Learning Approaches

    Get PDF
    The ever-increasing importance of accelerated information processing, communica-tion, and storing are major requirements within the big-data era revolution. With the extensive rise in data availability, handy information acquisition, and growing data rate, a critical challenge emerges in efficient handling. Even with advanced technical hardware developments and multiple Graphics Processing Units (GPUs) availability, this demand is still highly promoted to utilise these technologies effectively. Health-care systems are one of the domains yielding explosive data growth. Especially when considering their modern scanners abilities, which annually produce higher-resolution and more densely sampled medical images, with increasing requirements for massive storage capacity. The bottleneck in data transmission and storage would essentially be handled with an effective compression method. Since medical information is critical and imposes an influential role in diagnosis accuracy, it is strongly encouraged to guarantee exact reconstruction with no loss in quality, which is the main objective of any lossless compression algorithm. Given the revolutionary impact of Deep Learning (DL) methods in solving many tasks while achieving the state of the art results, includ-ing data compression, this opens tremendous opportunities for contributions. While considerable efforts have been made to address lossy performance using learning-based approaches, less attention was paid to address lossless compression. This PhD thesis investigates and proposes novel learning-based approaches for compressing 3D medical images losslessly.Firstly, we formulate the lossless compression task as a supervised sequential prediction problem, whereby a model learns a projection function to predict a target voxel given sequence of samples from its spatially surrounding voxels. Using such 3D local sampling information efficiently exploits spatial similarities and redundancies in a volumetric medical context by utilising such a prediction paradigm. The proposed NN-based data predictor is trained to minimise the differences with the original data values while the residual errors are encoded using arithmetic coding to allow lossless reconstruction.Following this, we explore the effectiveness of Recurrent Neural Networks (RNNs) as a 3D predictor for learning the mapping function from the spatial medical domain (16 bit-depths). We analyse Long Short-Term Memory (LSTM) models’ generalisabil-ity and robustness in capturing the 3D spatial dependencies of a voxel’s neighbourhood while utilising samples taken from various scanning settings. We evaluate our proposed MedZip models in compressing unseen Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities losslessly, compared to other state-of-the-art lossless compression standards.This work investigates input configurations and sampling schemes for a many-to-one sequence prediction model, specifically for compressing 3D medical images (16 bit-depths) losslessly. The main objective is to determine the optimal practice for enabling the proposed LSTM model to achieve a high compression ratio and fast encoding-decoding performance. A solution for a non-deterministic environments problem was also proposed, allowing models to run in parallel form without much compression performance drop. Compared to well-known lossless codecs, experimental evaluations were carried out on datasets acquired by different hospitals, representing different body segments, and have distinct scanning modalities (i.e. CT and MRI).To conclude, we present a novel data-driven sampling scheme utilising weighted gradient scores for training LSTM prediction-based models. The objective is to determine whether some training samples are significantly more informative than others, specifically in medical domains where samples are available on a scale of billions. The effectiveness of models trained on the presented importance sampling scheme was evaluated compared to alternative strategies such as uniform, Gaussian, and sliced-based sampling

    The 5th Conference of PhD Students in Computer Science

    Get PDF

    A Comprehensive Review on Medical Image Steganography Based on LSB Technique and Potential Challenges

    Get PDF
    The rapid development of telemedicine services and the requirements for exchanging medical information between physicians, consultants, and health institutions have made the protection of patients’ information an important priority for any future e-health system. The protection of medical information, including the cover (i.e. medical image), has a specificity that slightly differs from the requirements for protecting other information. It is necessary to preserve the cover greatly due to its importance on the reception side as medical staff use this information to provide a diagnosis to save a patient's life. If the cover is tampered with, this leads to failure in achieving the goal of telemedicine. Therefore, this work provides an investigation of information security techniques in medical imaging, focusing on security goals. Encrypting a message before hiding them gives an extra layer of security, and thus, will provide an excellent solution to protect the sensitive information of patients during the sharing of medical information. Medical image steganography is a special case of image steganography, while Digital Imaging and Communications in Medicine (DICOM) is the backbone of all medical imaging divisions, whereby it is most broadly used to store and transmit medical images. The main objective of this study is to provide a general idea of what Least Significant Bit-based (LSB) steganography techniques have achieved in medical images

    Effects of discrete wavelet compression on automated mammographic shape recognition

    Full text link
    At present early detection is critical for the cure of breast cancer. Mammography is a breast screening technique which can detect breast cancer at the earliest possible stage. Mammographic lesions are typically classified into three shape classes, namely round, nodular and stellate. Presently this classification is done by experienced radiologists. In order to increase the speed and decrease the cost of diagnosis, automated recognition systems are being developed. This study analyses an automated classification procedure and its sensitivity to wavelet based image compression; In this study, the mammographic shape images are compressed using discrete wavelet compression and then classified using statistical classification methods. First, one dimensional compression is done on the radial distance measure and the shape features are extracted. Second, linear discriminant analysis is used to compute the weightings of the features. Third, a minimum distance Euclidean classifier and the leave-one-out test method is used for classification. Lastly, a two dimensional compression is performed on the images, and the above process of feature extraction and classification is repeated. The results are compared with those obtained with uncompressed mammographic images

    Design for energy-efficient and reliable fog-assisted healthcare IoT systems

    Get PDF
    Cardiovascular disease and diabetes are two of the most dangerous diseases as they are the leading causes of death in all ages. Unfortunately, they cannot be completely cured with the current knowledge and existing technologies. However, they can be effectively managed by applying methods of continuous health monitoring. Nonetheless, it is difficult to achieve a high quality of healthcare with the current health monitoring systems which often have several limitations such as non-mobility support, energy inefficiency, and an insufficiency of advanced services. Therefore, this thesis presents a Fog computing approach focusing on four main tracks, and proposes it as a solution to the existing limitations. In the first track, the main goal is to introduce Fog computing and Fog services into remote health monitoring systems in order to enhance the quality of healthcare. In the second track, a Fog approach providing mobility support in a real-time health monitoring IoT system is proposed. The handover mechanism run by Fog-assisted smart gateways helps to maintain the connection between sensor nodes and the gateways with a minimized latency. Results show that the handover latency of the proposed Fog approach is 10%-50% less than other state-of-the-art mobility support approaches. In the third track, the designs of four energy-efficient health monitoring IoT systems are discussed and developed. Each energy-efficient system and its sensor nodes are designed to serve a specific purpose such as glucose monitoring, ECG monitoring, or fall detection; with the exception of the fourth system which is an advanced and combined system for simultaneously monitoring many diseases such as diabetes and cardiovascular disease. Results show that these sensor nodes can continuously work, depending on the application, up to 70-155 hours when using a 1000 mAh lithium battery. The fourth track mentioned above, provides a Fog-assisted remote health monitoring IoT system for diabetic patients with cardiovascular disease. Via several proposed algorithms such as QT interval extraction, activity status categorization, and fall detection algorithms, the system can process data and detect abnormalities in real-time. Results show that the proposed system using Fog services is a promising approach for improving the treatment of diabetic patients with cardiovascular disease

    Entropy in Image Analysis II

    Get PDF
    Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for those applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This fact is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In the process of reading the present volume, the reader will appreciate the richness of their methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas

    Workshop proceedings: Information Systems for Space Astrophysics in the 21st Century, volume 1

    Get PDF
    The Astrophysical Information Systems Workshop was one of the three Integrated Technology Planning workshops. Its objectives were to develop an understanding of future mission requirements for information systems, the potential role of technology in meeting these requirements, and the areas in which NASA investment might have the greatest impact. Workshop participants were briefed on the astrophysical mission set with an emphasis on those missions that drive information systems technology, the existing NASA space-science operations infrastructure, and the ongoing and planned NASA information systems technology programs. Program plans and recommendations were prepared in five technical areas: Mission Planning and Operations; Space-Borne Data Processing; Space-to-Earth Communications; Science Data Systems; and Data Analysis, Integration, and Visualization
    corecore