    DROP: Dimensionality Reduction Optimization for Time Series

    Dimensionality reduction (DR) is a critical step in scaling machine learning pipelines. Principal component analysis (PCA) is a standard tool for dimensionality reduction, but performing PCA over a full dataset can be prohibitively expensive. As a result, theoretical work has studied the effectiveness of iterative, stochastic PCA methods that operate over data samples. However, stochastic PCA methods terminate either after a predetermined number of iterations or upon convergence of the solution, and so frequently sample too many or too few data points to improve end-to-end runtime. We show how accounting for downstream analytics operations during DR via PCA allows stochastic methods to terminate efficiently after operating over small (e.g., 1%) subsamples of input data, reducing whole-workload runtime. Leveraging this, we propose DROP, a DR optimizer that enables speedups of up to 5x over Singular-Value-Decomposition-based PCA techniques and outperforms conventional approaches like FFT and PAA by up to 16x in end-to-end workloads.
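The idea of stopping sample-based PCA once a downstream requirement is met can be illustrated with a minimal sketch. This is not the DROP implementation: the function names, the doubling sample schedule, and the use of pairwise-distance preservation as the downstream quality measure are all assumptions made for illustration.

```python
import numpy as np

def pca_basis(sample, k):
    """Top-k principal directions of `sample` via SVD of the centered data."""
    centered = sample - sample.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T  # shape (d, k), orthonormal columns

def distance_preservation(data, basis, n_pairs, rng):
    """Mean ratio of projected to original distance over random point pairs.

    Hypothetical stand-in for a downstream quality measure; it lies in (0, 1]
    because projecting onto an orthonormal basis cannot lengthen a vector.
    """
    i = rng.integers(0, len(data), n_pairs)
    j = rng.integers(0, len(data), n_pairs)
    diff = data[i] - data[j]
    orig = np.linalg.norm(diff, axis=1)
    proj = np.linalg.norm(diff @ basis, axis=1)
    ok = orig > 0  # skip pairs that picked the same (or identical) points
    return float(np.mean(proj[ok] / orig[ok]))

def sampled_pca(data, k, target=0.95, start_frac=0.01, growth=2.0, seed=0):
    """Fit PCA on growing random subsamples until the target is reached.

    Returns (basis, sample_size_used, achieved_score). Falls back to the
    full dataset if no smaller sample satisfies the target.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    size = max(k + 1, int(start_frac * n))
    while True:
        size = min(size, n)
        idx = rng.choice(n, size=size, replace=False)
        basis = pca_basis(data[idx], k)
        score = distance_preservation(data, basis, 500, rng)
        if score >= target or size == n:
            return basis, size, score
        size = int(size * growth)
```

On data with strong low-rank structure, a sketch like this typically stops at the first, smallest sample, which is the behavior the abstract describes; the threshold and growth factor here are arbitrary choices, not values from the paper.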

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.

    Key Challenges and Opportunities in Hull Form Design Optimisation for Marine and Offshore Applications

    New environmental regulations and volatile fuel prices have resulted in an ever-increasing need to reduce carbon emissions and fuel consumption. Designs of marine and offshore vessels are becoming more demanding, with complex operating requirements and oil and gas exploration venturing into deeper waters and harsher environments. Together, these factors have led to the need to optimise hull design for the marine and offshore industry. The contribution of this paper is threefold. Firstly, the paper provides a comprehensive review of the state-of-the-art techniques in hull form design. Specifically, it analyses geometry modelling, shape transformation, optimisation and performance evaluation. Strengths and weaknesses of existing solutions are also discussed. Secondly, key challenges of hull form optimisation specific to the design of marine and offshore vessels are identified and analysed. Thirdly, future trends in performing hull form design optimisation are investigated and possible solutions proposed. A case study on the design optimisation of a bulbous bow for a passenger ferry vessel to reduce wavemaking resistance is presented using NAPA software. Lastly, the main issues and challenges are discussed to stimulate further ideas on future developments in this area, including the use of parallel computing and machine intelligence.

    DeepBrain: Functional Representation of Neural In-Situ Hybridization Images for Gene Ontology Classification Using Deep Convolutional Autoencoders

    This paper presents a novel deep learning-based method for learning a functional representation of mammalian neural images. The method uses a deep convolutional denoising autoencoder (CDAE) to generate an invariant, compact representation of in situ hybridization (ISH) images. While most existing methods for bio-imaging analysis were not developed to handle images with highly complex anatomical structures, the results presented in this paper show that the functional representation extracted by the CDAE can help learn features of functional gene ontology categories and classify them with high accuracy. Using this CDAE representation, our method outperforms the previous state-of-the-art classification rate, improving the average AUC from 0.92 to 0.98, i.e., a 75% reduction in error. To keep the method computationally feasible, it operates on input images that were significantly downsampled relative to the originals.
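The 75% figure can be checked with a short calculation, assuming "error" here means one minus the average AUC (the abstract does not define it explicitly):

```latex
\[
\frac{(1 - 0.92) - (1 - 0.98)}{1 - 0.92}
  = \frac{0.08 - 0.02}{0.08}
  = \frac{0.06}{0.08}
  = 0.75
\]
```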