3,101 research outputs found

    Taking Computation to Data: Integrating Privacy-preserving AI techniques and Blockchain Allowing Secure Analysis of Sensitive Data on Premise

    PhD thesis in Information technology. With the advancement of artificial intelligence (AI), digital pathology has seen significant progress in recent years. However, the use of medical AI raises concerns about patient data privacy. The CLARIFY project is a research project funded under the European Union’s Marie Sklodowska-Curie Actions (MSCA) program. The primary objective of CLARIFY is to create a reliable, automated digital diagnostic platform that utilizes cloud-based data algorithms and artificial intelligence to enable interpretation and diagnosis of whole-slide images (WSIs) from any location, maximizing the advantages of AI-based digital pathology. My research as an early stage researcher for the CLARIFY project centers on securing information systems using machine learning and access control techniques. To achieve this goal, I extensively researched privacy protection technologies such as federated learning, differential privacy, dataset distillation, and blockchain. These technologies have different priorities in terms of privacy, computational efficiency, and usability. Therefore, we designed a computing system that supports different levels of privacy and security, based on the concept of taking computation to data. Our approach is based on two design principles. First, when external users need to access internal data, a robust access control mechanism must be established to prevent unauthorized access. Second, raw data should be processed before use to ensure privacy and security. Specifically, we use smart contract-based access control and decentralized identity technology at the system security boundary to ensure the flexibility and immutability of verification. If the user’s raw data still cannot be directly accessed, we propose to use dataset distillation to filter out private information, or to use a locally trained model as a data agent.
Our research focuses on improving the usability of these methods, and this thesis serves as a demonstration of current privacy-preserving and secure computing technologies.

    Towards Secure and Intelligent Diagnosis: Deep Learning and Blockchain Technology for Computer-Aided Diagnosis Systems

    Cancer is the second leading cause of death across the world after cardiovascular disease. The survival rate of patients with cancerous tissue can significantly decrease due to late-stage diagnosis. Advances in whole slide imaging scanners have resulted in a dramatic increase of patient data in the domain of digital pathology. Large-scale histopathology images need to be analyzed promptly for early cancer detection, which is critical for improving patients' survival rates and treatment planning. Advances in medical image processing and deep learning methods have facilitated the extraction and analysis of high-level features from histopathological data that could assist in life-critical diagnosis and reduce the considerable healthcare cost associated with cancer. In clinical trials, due to the complexity and large variance of collected image data, developing computer-aided diagnosis systems to support quantitative medical image analysis is an area of active research. The first goal of this research is to automate the classification and segmentation of cancerous regions in histopathology images of different cancer tissues by developing models using deep learning-based architectures. In this research, a framework with different modules is proposed, including (1) data pre-processing, (2) data augmentation, (3) feature extraction, and (4) deep learning architectures. Four validation studies were designed to conduct this research: (1) differentiating benign and malignant lesions in breast cancer, (2) differentiating between immature leukemic blasts and normal cells in leukemia, (3) differentiating benign and malignant regions in lung cancer, and (4) differentiating benign and malignant regions in colorectal cancer. Training machine learning models, disease diagnosis, and treatment often require collecting patients' medical data. Privacy and trusted authenticity concerns make data owners reluctant to share their personal and medical data.
Motivated by the advantages of Blockchain technology in healthcare data sharing frameworks, the focus of the second part of this research is to integrate Blockchain technology in computer-aided diagnosis systems to address the problems of managing access control, authentication, provenance, and confidentiality of sensitive medical data. To do so, a hierarchical identity and attribute-based access control mechanism using smart contracts and the Ethereum Blockchain is proposed to securely process healthcare data without revealing sensitive information to an unauthorized party, leveraging the trustworthiness of transactions in a collaborative healthcare environment. The proposed access control mechanism provides a solution to the challenges associated with centralized access control systems and ensures data transparency, traceability, and ownership for secure data sharing.

    Collaborative Training of Medical Artificial Intelligence Models with non-uniform Labels

    Artificial intelligence (AI) methods are revolutionizing medical image analysis. However, robust AI models require large multi-site datasets for training. While multiple stakeholders have provided publicly available datasets, the ways in which these data are labeled differ widely. For example, one dataset of chest radiographs might contain labels denoting the presence of metastases in the lung, while another dataset of chest radiographs might focus on the presence of pneumonia. With conventional approaches, these data cannot be used together to train a single AI model. We propose a new framework that we call flexible federated learning (FFL) for collaborative training on such data. Using publicly available data of 695,000 chest radiographs from five institutions, each with differing labels, we demonstrate that large and heterogeneously labeled datasets can be used to train one big AI model with this framework. We find that models trained with FFL are superior to models that are trained on matching annotations only. This may pave the way for training of truly large-scale AI models that make efficient use of all existing data. Comment: 2 figures, 3 tables, 5 supplementary tables

    Building standardized and secure mobile health services based on social media

    Mobile devices and social media have been used to create empowering healthcare services. However, privacy and security concerns remain. Furthermore, the integration of biomedical interoperability standards is a strategic feature. Thus, the objective of this paper is to build enhanced healthcare services by merging all these components. Methodologically, the current mobile health telemonitoring architectures and their limitations are described, leading to the identification of new potentialities for a novel architecture. As a result, a standardized, secure/private, social-media-based mobile health architecture has been proposed and discussed. Additionally, a technical proof-of-concept (two Android applications) has been developed by selecting a social media platform (Twitter), a security envelope (open Pretty Good Privacy (openPGP)), a standard (Health Level 7 (HL7)) and an information-embedding algorithm (modifying the transparency channel, with two versions). The tests performed included a small-scale and a boundary scenario. For the former, two sizes of images were tested; for the latter, the two versions of the embedding algorithm were tested. The results show that the system is fast enough (less than 1 s) for most mHealth telemonitoring services. The architecture provides users with friendly (images shared via social media), straightforward (fast and inexpensive), secure/private and interoperable mHealth services.

    AI Technical Considerations: Data Storage, Cloud usage and AI Pipeline

    Artificial intelligence (AI), especially deep learning, requires vast amounts of data for training, testing, and validation. Collecting these data and the corresponding annotations requires the implementation of imaging biobanks that provide access to these data in a standardized way. This requires careful design and implementation based on the current standards and guidelines and complying with the current legal restrictions. However, the realization of proper imaging data collections is not sufficient to train, validate and deploy AI, as resource demands are high and require a careful hybrid implementation of AI pipelines both on-premise and in the cloud. This chapter aims to help the reader when technical considerations have to be made about the AI environment by providing a technical background of different concepts and implementation aspects involved in data storage, cloud usage, and AI pipelines.