3,263 research outputs found

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
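
    A minimal sketch of one common way to handle three of the five challenges listed above (missing data, curse of dimensionality, class imbalance) when fitting a classifier to a concatenated multi-omics feature matrix. It assumes Python with scikit-learn; the data shapes and pipeline choices are illustrative, not methods prescribed by the review.

    # Illustrative pipeline: imputation for missing data, PCA against the curse
    # of dimensionality, and class weighting for label imbalance.
    import numpy as np
    from sklearn.impute import SimpleImputer
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5000))          # 200 samples, 5000 concatenated omics features (synthetic)
    X[rng.random(X.shape) < 0.05] = np.nan    # ~5% missing values
    y = (rng.random(200) < 0.1).astype(int)   # heavily imbalanced labels

    model = Pipeline([
        ("impute", SimpleImputer(strategy="mean")),                            # missing data
        ("scale", StandardScaler()),
        ("reduce", PCA(n_components=50)),                                      # curse of dimensionality
        ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),   # class imbalance
    ])
    model.fit(X, y)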

    Characterization of Information Channels for Asymptotic Mean Stationarity and Stochastic Stability of Non-stationary/Unstable Linear Systems

    Stabilization of non-stationary linear systems over noisy communication channels is considered. Stochastically stable sources, and unstable but noise-free or bounded-noise systems, have been extensively studied in the information theory and control theory literature since the 1970s, with renewed interest in the past decade. There have also been studies on non-causal and causal coding of unstable/non-stationary linear Gaussian sources. In this paper, tight necessary and sufficient conditions for stochastic stabilizability of unstable (non-stationary), possibly multi-dimensional, linear systems driven by Gaussian noise over discrete channels (possibly with memory and feedback) are presented. Stochastic stability notions include recurrence, asymptotic mean stationarity and sample path ergodicity, and the existence of finite second moments. Our constructive proof uses random-time state-dependent stochastic drift criteria for stabilization of Markov chains. For asymptotic mean stationarity (and thus sample path ergodicity), it is sufficient that the capacity of a channel is (strictly) greater than the sum of the logarithms of the unstable pole magnitudes for memoryless channels and a class of channels with memory. This condition is also necessary under a mild technical condition. Sufficient conditions for the existence of finite average second moments for such systems driven by unbounded noise are provided. Comment: To appear in IEEE Transactions on Information Theory.
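
    The rate condition stated above can be written compactly. A sketch in LaTeX, assuming a system of the form $x_{t+1} = A x_t + B u_t + w_t$ with Gaussian noise $w_t$, channel capacity $C$ measured in bits, and eigenvalues $\lambda_i(A)$; this notation is assumed here for illustration, not taken from the paper.

    % Sufficient rate condition for asymptotic mean stationarity
    % (and hence sample path ergodicity); per the abstract it is
    % also necessary under a mild technical condition.
    \[
      C \;>\; \sum_{i \,:\, |\lambda_i(A)| \ge 1} \log_2 |\lambda_i(A)|
    \]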

    On the Chirp Function, the Chirplet Transform and the Optimal Communication of Information

    The purpose of this extended paper is to provide a review of the chirp function and the chirplet transform and to investigate the application of chirplet modulation for digital communications, in particular, the transmission of binary strings. The significance of the chirp function in the solution to a range of fundamental problems in physics is revisited to provide a background to the case and to present the context in which the chirp function plays a central role, the material presented being designed to show a variety of problems with solutions and applications that are characterized by a chirp function in a fundamental way. A study is then provided whose aim is to investigate the uniqueness of the chirp function in regard to its use for convolutional coding and decoding, the latter case (i.e. decoding) being related to the autocorrelation of the chirp function, which provides a unique solution to the deconvolution problem. Complementary material in regard to the uniqueness of a chirp is addressed through an investigation into the self-characterization of the chirp function upon Fourier transformation. This includes a short study on the eigenfunctions of the Fourier transform, leading to a uniqueness conjecture based on an application of the Bluestein decomposition of a Fourier transform. The conjecture states that the chirp function is the only phase-only function to have a self-characteristic Fourier transform, and, for a specific scaling constant, a conjugate eigenfunction. In the context of this conjecture, we consider the transmission of information through a channel characterized by additive noise and the detection of signals with very low Signal-to-Noise Ratios. It is shown that application of chirplet modulation can provide a simple and optimal solution to the problem of transmitting binary strings through noisy communication channels, a result which suggests that all digital communication systems should ideally be predicated on the application of chirplet modulation. In the latter part of the paper, a method is proposed for securing the communication of information (in the form of a binary string) through chirplet modulation that is based on prime number factorization of the chirplet (angular) bandwidth. Coupled with a quantum computer for factorizing very large numbers into primes using Shor's algorithm, the method has the potential for designing a communications protocol specifically for users with access to quantum computing when the factorization of very large numbers is required. In this respect, and in the final part of the paper, we investigate the application of chirplet modulation for communicating through the 'Water-Hole'. This includes the introduction of a method for distinguishing genuine 'intelligible' binary strings through the Kullback-Leibler divergence, which is shown to be statistically significant for a number of natural languages.
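
    A minimal sketch (Python with NumPy) of the idea behind chirplet modulation of a binary string and its matched-filter (correlation) detection at low Signal-to-Noise Ratio. The chirp parameters, symbol duration, and SNR below are illustrative assumptions, not the configuration used in the paper.

    import numpy as np

    rng = np.random.default_rng(1)
    fs, T = 8000, 0.05                        # sample rate (Hz) and symbol duration (s), illustrative
    t = np.arange(int(fs * T)) / fs
    f0, f1 = 100.0, 2000.0                    # linear chirp sweeps f0 -> f1 over one symbol
    k = (f1 - f0) / T
    chirp = np.cos(2 * np.pi * (f0 * t + 0.5 * k * t**2))    # reference chirp

    bits = rng.integers(0, 2, size=32)
    tx = np.concatenate([(2 * b - 1) * chirp for b in bits])  # bit 1 -> +chirp, bit 0 -> -chirp

    snr_db = -10.0                                            # well below 0 dB
    noise_power = np.mean(tx**2) / 10**(snr_db / 10)
    rx = tx + rng.normal(scale=np.sqrt(noise_power), size=tx.size)

    # Matched-filter detection: correlate each symbol interval with the reference
    # chirp and take the sign of the correlation.
    n = chirp.size
    decisions = np.array([int(rx[i*n:(i+1)*n] @ chirp > 0) for i in range(bits.size)])
    print("bit errors:", int(np.sum(decisions != bits)))

    The correlation spreads the symbol energy over the full chirp duration, which is why detection remains reliable even when the per-sample SNR is low.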

    Lossless image coding using hierarchical decomposition and recursive partitioning

    State-of-the-art lossless image compression schemes, such as JPEG-LS and CALIC, have been proposed in the context-adaptive predictive coding framework. These schemes involve a prediction step followed by context-adaptive entropy coding of the residuals. However, the models for context determination proposed in the literature have been designed using ad-hoc techniques. In this paper, we take an alternative approach where we fix a simpler context model and then rely on a systematic technique to exploit spatial correlation and achieve efficient compression. The essential idea is to decompose the image into binary bitmaps such that the spatial correlation that exists among non-binary symbols is captured as the correlation among a few bit positions. The proposed scheme then encodes the bitmaps in a particular order based on the simple context model. However, instead of encoding a bitmap as a whole, we partition it into rectangular blocks, induced by a binary tree, and then independently encode the blocks. The motivation for partitioning is to explicitly identify the blocks within which the statistical correlation remains the same. On a set of standard test images, the proposed scheme, using the same predictor as JPEG-LS, achieved an overall bit-rate saving of 1.56% against JPEG-LS. © 2016 The Authors
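
    A minimal sketch (Python with NumPy) of recursive binary-tree partitioning of a binary bitmap into rectangular blocks. The split rule used here (stop when a block is constant or small, otherwise halve the longer side) is an illustrative stand-in, not the paper's actual partitioning criterion.

    import numpy as np

    def partition(bitmap, top=0, left=0, min_size=4):
        # Return a list of (top, left, height, width) rectangles induced by a
        # binary tree over the bitmap; each leaf would be encoded independently.
        h, w = bitmap.shape
        if bitmap.min() == bitmap.max() or max(h, w) <= min_size:
            return [(top, left, h, w)]
        blocks = []
        if h >= w:                              # split the longer dimension in half
            mid = h // 2
            blocks += partition(bitmap[:mid, :], top, left, min_size)
            blocks += partition(bitmap[mid:, :], top + mid, left, min_size)
        else:
            mid = w // 2
            blocks += partition(bitmap[:, :mid], top, left, min_size)
            blocks += partition(bitmap[:, mid:], top, left + mid, min_size)
        return blocks

    # Usage: a sparse synthetic bitmap standing in for one bit position of an image.
    rng = np.random.default_rng(2)
    bitplane = (rng.random((64, 64)) < 0.1).astype(np.uint8)
    print(len(partition(bitplane)), "rectangular blocks")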

    Algebraic Methods in Computational Complexity

    Computational Complexity is concerned with the resources that are required for algorithms to detect properties of combinatorial objects and structures. It has often proven true that the best way to argue about these combinatorial objects is by establishing a connection (perhaps approximate) to a more well-behaved algebraic setting. Indeed, many of the deepest and most powerful results in Computational Complexity rely on algebraic proof techniques. The Razborov-Smolensky polynomial-approximation method for proving constant-depth circuit lower bounds, the PCP characterization of NP, and the Agrawal-Kayal-Saxena polynomial-time primality test are some of the most prominent examples. In some of the most exciting recent progress in Computational Complexity, the algebraic theme still plays a central role. There have been significant recent advances in algebraic circuit lower bounds, and the so-called chasm at depth 4 suggests that the restricted models now being considered are not so far from ones that would lead to a general result. There have been similar successes concerning the related problems of polynomial identity testing and circuit reconstruction in the algebraic model (and these are tied to central questions regarding the power of randomness in computation). The areas of derandomization and coding theory have also seen important advances. The seminar aimed to capitalize on recent progress and bring together researchers who are using a diverse array of algebraic methods in a variety of settings. Researchers in these areas are relying on ever more sophisticated and specialized mathematics, and the goal of the seminar was to play an important role in educating a diverse community about the latest new techniques.
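
    A minimal sketch of randomized polynomial identity testing, one of the problems named above, in the spirit of the Schwartz-Zippel lemma. It assumes Python and that the polynomials are available as black-box callables; the field size and trial count are illustrative choices.

    # If two polynomials of total degree at most d differ, a uniformly random
    # point over a field of size p exposes the difference with probability at
    # least 1 - d/p, so a few random evaluations suffice.
    import random

    P = (1 << 61) - 1   # a large Mersenne prime used as the field size

    def probably_identical(f, g, num_vars, trials=20):
        for _ in range(trials):
            point = [random.randrange(P) for _ in range(num_vars)]
            if f(*point) % P != g(*point) % P:
                return False        # witness point found: the polynomials differ
        return True                 # no witness found: identical with high probability

    # Usage: (x + y)^2 versus x^2 + 2xy + y^2 (identical) and x^2 + y^2 (not).
    same = probably_identical(lambda x, y: (x + y) ** 2,
                              lambda x, y: x * x + 2 * x * y + y * y, num_vars=2)
    diff = probably_identical(lambda x, y: (x + y) ** 2,
                              lambda x, y: x * x + y * y, num_vars=2)
    print(same, diff)   # expected: True False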
    • ā€¦