1,704 research outputs found

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Get PDF
    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues

    Common Information and Decentralized Inference with Dependent Observations

    Get PDF
    Wyner\u27s common information was originally defined for a pair of dependent discrete random variables. This thesis generalizes its definition in two directions: the number of dependent variables can be arbitrary, so are the alphabets of those random variables. New properties are determined for the generalized Wyner\u27s common information of multiple dependent variables. More importantly, a lossy source coding interpretation of Wyner\u27s common information is developed using the Gray-Wyner network. It is established that the common information equals to the smallest common message rate when the total rate is arbitrarily close to the rate distortion function with joint decoding if the distortions are within some distortion region. The application of Wyner\u27s common information to inference problems is also explored in the thesis. A central question is under what conditions does Wyner\u27s common information capture the entire information about the inference object. Under a simple Bayesian model, it is established that for infinitely exchangeable random variables that the common information is asymptotically equal to the information of the inference object. For finite exchangeable random variables, connection between common information and inference performance metrics are also established. The problem of decentralized inference is generally intractable with conditional dependent observations. A promising approach for this problem is to utilize a hierarchical conditional independence model. Utilizing the hierarchical conditional independence model, we identify a more general condition under which the distributed detection problem becomes tractable, thereby broadening the classes of distributed detection problems with dependent observations that can be readily solved. We then develop the sufficiency principle for data reduction for decentralized inference. For parallel networks, the hierarchical conditional independence model is used to obtain conditions such that local sufficiency implies global sufficiency. For tandem networks, the notion of conditional sufficiency is introduced and the related theory and tools are developed. Connections between the sufficiency principle and distributed source coding problems are also explored. Furthermore, we examine the impact of quantization on decentralized data reduction. The conditions under which sufficiency based data reduction with quantization constraints is optimal are identified. They include the case when the data at decentralized nodes are conditionally independent as well as a class of problems with conditionally dependent observations that admit conditional independence structure through the hierarchical conditional independence model

    Security and Privacy Dimensions in Next Generation DDDAS/Infosymbiotic Systems: A Position Paper

    Get PDF
    AbstractThe omnipresent pervasiveness of personal devices will expand the applicability of the Dynamic Data Driven Application Systems (DDDAS) paradigm in innumerable ways. While every single smartphone or wearable device is potentially a sensor with powerful computing and data capabilities, privacy and security in the context of human participants must be addressed to leverage the infinite possibilities of dynamic data driven application systems. We propose a security and privacy preserving framework for next generation systems that harness the full power of the DDDAS paradigm while (1) ensuring provable privacy guarantees for sensitive data; (2) enabling field-level, intermediate, and central hierarchical feedback-driven analysis for both data volume mitigation and security; and (3) intrinsically addressing uncertainty caused either by measurement error or security-driven data perturbation. These thrusts will form the foundation for secure and private deployments of large scale hybrid participant-sensor DDDAS systems of the future

    Arbitrarily Strong Utility-Privacy Tradeoff in Multi-Agent Systems

    Full text link
    Each agent in a network makes a local observation that is linearly related to a set of public and private parameters. The agents send their observations to a fusion center to allow it to estimate the public parameters. To prevent leakage of the private parameters, each agent first sanitizes its local observation using a local privacy mechanism before transmitting it to the fusion center. We investigate the utility-privacy tradeoff in terms of the Cram\'er-Rao lower bounds for estimating the public and private parameters. We study the class of privacy mechanisms given by linear compression and noise perturbation, and derive necessary and sufficient conditions for achieving arbitrarily strong utility-privacy tradeoff in a multi-agent system for both the cases where prior information is available and unavailable, respectively. We also provide a method to find the maximum estimation privacy achievable without compromising the utility and propose an alternating algorithm to optimize the utility-privacy tradeoff in the case where arbitrarily strong utility-privacy tradeoff is not achievable

    Role of artificial intelligence in cloud computing, IoT and SDN: Reliability and scalability issues

    Get PDF
    Information technology fields are now more dominated by artificial intelligence, as it is playing a key role in terms of providing better services. The inherent strengths of artificial intelligence are driving the companies into a modern, decisive, secure, and insight-driven arena to address the current and future challenges. The key technologies like cloud, internet of things (IoT), and software-defined networking (SDN) are emerging as future applications and rendering benefits to the society. Integrating artificial intelligence with these innovations with scalability brings beneficiaries to the next level of efficiency. Data generated from the heterogeneous devices are received, exchanged, stored, managed, and analyzed to automate and improve the performance of the overall system and be more reliable. Although these new technologies are not free of their limitations, nevertheless, the synthesis of technologies has been challenged and has put forth many challenges in terms of scalability and reliability. Therefore, this paper discusses the role of artificial intelligence (AI) along with issues and opportunities confronting all communities for incorporating the integration of these technologies in terms of reliability and scalability. This paper puts forward the future directions related to scalability and reliability concerns during the integration of the above-mentioned technologies and enable the researchers to address the current research gaps

    Privacy in the Genomic Era

    Get PDF
    Genome sequencing technology has advanced at a rapid pace and it is now possible to generate highly-detailed genotypes inexpensively. The collection and analysis of such data has the potential to support various applications, including personalized medical services. While the benefits of the genomics revolution are trumpeted by the biomedical community, the increased availability of such data has major implications for personal privacy; notably because the genome has certain essential features, which include (but are not limited to) (i) an association with traits and certain diseases, (ii) identification capability (e.g., forensics), and (iii) revelation of family relationships. Moreover, direct-to-consumer DNA testing increases the likelihood that genome data will be made available in less regulated environments, such as the Internet and for-profit companies. The problem of genome data privacy thus resides at the crossroads of computer science, medicine, and public policy. While the computer scientists have addressed data privacy for various data types, there has been less attention dedicated to genomic data. Thus, the goal of this paper is to provide a systematization of knowledge for the computer science community. In doing so, we address some of the (sometimes erroneous) beliefs of this field and we report on a survey we conducted about genome data privacy with biomedical specialists. Then, after characterizing the genome privacy problem, we review the state-of-the-art regarding privacy attacks on genomic data and strategies for mitigating such attacks, as well as contextualizing these attacks from the perspective of medicine and public policy. This paper concludes with an enumeration of the challenges for genome data privacy and presents a framework to systematize the analysis of threats and the design of countermeasures as the field moves forward

    Curious Negotiator

    Full text link
    n negotiation the exchange of information is as important as the exchange of offers. The curious negotiator is a multiagent system with three types of agents. Two negotiation agents, each representing an individual, develop consecutive offers, supported by information, whilst requesting information from its opponent. A mediator agent, with experience of prior negotiations, suggests how the negotiation may develop. A failed negotiation is a missed opportunity. An observer agent analyses failures looking for new opportunities. The integration of negotiation theory and data mining enables the curious negotiator to discover and exploit negotiation opportunities. Trials will be conducted in electronic business
    corecore