109 research outputs found

    Heterogeneous Federated Learning: State-of-the-art and Research Challenges

    Full text link
    Federated learning (FL) has drawn increasing attention owing to its potential use in large-scale industrial applications. Existing federated learning works mainly focus on model homogeneous settings. However, practical federated learning typically faces the heterogeneity of data distributions, model architectures, network environments, and hardware devices among participant clients. Heterogeneous Federated Learning (HFL) is much more challenging, and corresponding solutions are diverse and complex. Therefore, a systematic survey on this topic about the research challenges and state-of-the-art is essential. In this survey, we firstly summarize the various research challenges in HFL from five aspects: statistical heterogeneity, model heterogeneity, communication heterogeneity, device heterogeneity, and additional challenges. In addition, recent advances in HFL are reviewed and a new taxonomy of existing HFL methods is proposed with an in-depth analysis of their pros and cons. We classify existing methods from three different levels according to the HFL procedure: data-level, model-level, and server-level. Finally, several critical and promising future research directions in HFL are discussed, which may facilitate further developments in this field. A periodically updated collection on HFL is available at https://github.com/marswhu/HFL_Survey.Comment: 42 pages, 11 figures, and 4 table

    Survey: Leakage and Privacy at Inference Time

    Get PDF
    Leakage of data from publicly available Machine Learning (ML) models is an area of growing significance as commercial and government applications of ML can draw on multiple sources of data, potentially including users' and clients' sensitive data. We provide a comprehensive survey of contemporary advances on several fronts, covering involuntary data leakage which is natural to ML models, potential malevolent leakage which is caused by privacy attacks, and currently available defence mechanisms. We focus on inference-time leakage, as the most likely scenario for publicly available models. We first discuss what leakage is in the context of different data, tasks, and model architectures. We then propose a taxonomy across involuntary and malevolent leakage, available defences, followed by the currently available assessment metrics and applications. We conclude with outstanding challenges and open questions, outlining some promising directions for future research

    Fortifying robustness: unveiling the intricacies of training and inference vulnerabilities in centralized and federated neural networks

    Get PDF
    Neural network (NN) classifiers have gained significant traction in diverse domains such as natural language processing, computer vision, and cybersecurity, owing to their remarkable ability to approximate complex latent distributions from data. Nevertheless, the conventional assumption of an attack-free operating environment has been challenged by the emergence of adversarial examples. These perturbed samples, which are typically imperceptible to human observers, can lead to misclassifications by the NN classifiers. Moreover, recent studies have uncovered the ability of poisoned training data to generate Trojan backdoored classifiers that exhibit misclassification behavior triggered by predefined patterns. In recent years, significant research efforts have been dedicated to uncovering the vulnerabilities of NN classifiers and developing defenses or mitigations against them. However, the existing approaches still fall short of providing mature solutions to address this ever-evolving problem. The widely adopted defense mechanisms against adversarial examples are computationally expensive and impractical for certain real-world applications. Likewise, the practical black-box defense against Trojan backdoors has failed to achieve state-of-the-art performance. More concerning is the limited exploration of these vulnerabilities within the context of cooperative attack or Federated learning, leaving NN classifiers exposed to unknown risks. This dissertation aims to address these critical gaps and refine our understanding of these vulnerabilities. The research conducted within this dissertation encompasses both the attack and defense perspectives, aiming to shed light on future research directions for vulnerabilities in NN classifiers

    Towards Data-centric Graph Machine Learning: Review and Outlook

    Full text link
    Data-centric AI, with its primary focus on the collection, management, and utilization of data to drive AI models and applications, has attracted increasing attention in recent years. In this article, we conduct an in-depth and comprehensive review, offering a forward-looking outlook on the current efforts in data-centric AI pertaining to graph data-the fundamental data structure for representing and capturing intricate dependencies among massive and diverse real-life entities. We introduce a systematic framework, Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of the graph data lifecycle, including graph data collection, exploration, improvement, exploitation, and maintenance. A thorough taxonomy of each stage is presented to answer three critical graph-centric questions: (1) how to enhance graph data availability and quality; (2) how to learn from graph data with limited-availability and low-quality; (3) how to build graph MLOps systems from the graph data-centric view. Lastly, we pinpoint the future prospects of the DC-GML domain, providing insights to navigate its advancements and applications.Comment: 42 pages, 9 figure

    A Comprehensive Survey on Trustworthy Graph Neural Networks: Privacy, Robustness, Fairness, and Explainability

    Full text link
    Graph Neural Networks (GNNs) have made rapid developments in the recent years. Due to their great ability in modeling graph-structured data, GNNs are vastly used in various applications, including high-stakes scenarios such as financial analysis, traffic predictions, and drug discovery. Despite their great potential in benefiting humans in the real world, recent study shows that GNNs can leak private information, are vulnerable to adversarial attacks, can inherit and magnify societal bias from training data and lack interpretability, which have risk of causing unintentional harm to the users and society. For example, existing works demonstrate that attackers can fool the GNNs to give the outcome they desire with unnoticeable perturbation on training graph. GNNs trained on social networks may embed the discrimination in their decision process, strengthening the undesirable societal bias. Consequently, trustworthy GNNs in various aspects are emerging to prevent the harm from GNN models and increase the users' trust in GNNs. In this paper, we give a comprehensive survey of GNNs in the computational aspects of privacy, robustness, fairness, and explainability. For each aspect, we give the taxonomy of the related methods and formulate the general frameworks for the multiple categories of trustworthy GNNs. We also discuss the future research directions of each aspect and connections between these aspects to help achieve trustworthiness

    ์ธ๊ณต์ง€๋Šฅ ๋ณด์•ˆ

    Get PDF
    ํ•™์œ„๋…ผ๋ฌธ (๋ฐ•์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ž์—ฐ๊ณผํ•™๋Œ€ํ•™ ํ˜‘๋™๊ณผ์ • ์ƒ๋ฌผ์ •๋ณดํ•™์ „๊ณต, 2021. 2. ์œค์„ฑ๋กœ.With the development of machine learning (ML), expectations for artificial intelligence (AI) technologies have increased daily. In particular, deep neural networks have demonstrated outstanding performance in many fields. However, if a deep-learning (DL) model causes mispredictions or misclassifications, it can cause difficulty, owing to malicious external influences. This dissertation discusses DL security and privacy issues and proposes methodologies for security and privacy attacks. First, we reviewed security attacks and defenses from two aspects. Evasion attacks use adversarial examples to disrupt the classification process, and poisoning attacks compromise training by compromising the training data. Next, we reviewed attacks on privacy that can exploit exposed training data and defenses, including differential privacy and encryption. For adversarial DL, we study the problem of finding adversarial examples against ML-based portable document format (PDF) malware classifiers. We believe that our problem is more challenging than those against ML models for image processing, owing to the highly complex data structure of PDFs, compared with traditional image datasets, and the requirement that the infected PDF should exhibit malicious behavior without being detected. We propose an attack using generative adversarial networks that effectively generates evasive PDFs using a variational autoencoder robust against adversarial examples. For privacy in DL, we study the problem of avoiding sensitive data being misused and propose a privacy-preserving framework for deep neural networks. Our methods are based on generative models that preserve the privacy of sensitive data while maintaining a high prediction performance. Finally, we study the security aspect in biological domains to detect maliciousness in deoxyribonucleic acid sequences and watermarks to protect intellectual properties. In summary, the proposed DL models for security and privacy embrace a diversity of research by attempting actual attacks and defenses in various fields.์ธ๊ณต์ง€๋Šฅ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๊ฐœ์ธ๋ณ„ ๋ฐ์ดํ„ฐ ์ˆ˜์ง‘์ด ํ•„์ˆ˜์ ์ด๋‹ค. ๋ฐ˜๋ฉด ๊ฐœ์ธ์˜ ๋ฏผ๊ฐํ•œ ๋ฐ์ดํ„ฐ๊ฐ€ ์œ ์ถœ๋˜๋Š” ๊ฒฝ์šฐ์—๋Š” ํ”„๋ผ์ด๋ฒ„์‹œ ์นจํ•ด์˜ ์†Œ์ง€๊ฐ€ ์žˆ๋‹ค. ์ธ๊ณต์ง€๋Šฅ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๋Š”๋ฐ ์ˆ˜์ง‘๋œ ๋ฐ์ดํ„ฐ๊ฐ€ ์™ธ๋ถ€์— ์œ ์ถœ๋˜์ง€ ์•Š๋„๋ก ํ•˜๊ฑฐ๋‚˜, ์ต๋ช…ํ™”, ๋ถ€ํ˜ธํ™” ๋“ฑ์˜ ๋ณด์•ˆ ๊ธฐ๋ฒ•์„ ์ธ๊ณต์ง€๋Šฅ ๋ชจ๋ธ์— ์ ์šฉํ•˜๋Š” ๋ถ„์•ผ๋ฅผ Private AI๋กœ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ๋‹ค. ๋˜ํ•œ ์ธ๊ณต์ง€๋Šฅ ๋ชจ๋ธ์ด ๋…ธ์ถœ๋  ๊ฒฝ์šฐ ์ง€์  ์†Œ์œ ๊ถŒ์ด ๋ฌด๋ ฅํ™”๋  ์ˆ˜ ์žˆ๋Š” ๋ฌธ์ œ์ ๊ณผ, ์•…์˜์ ์ธ ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ์ด์šฉํ•˜์—ฌ ์ธ๊ณต์ง€๋Šฅ ์‹œ์Šคํ…œ์„ ์˜ค์ž‘๋™ํ•  ์ˆ˜ ์žˆ๊ณ  ์ด๋Ÿฌํ•œ ์ธ๊ณต์ง€๋Šฅ ๋ชจ๋ธ ์ž์ฒด์— ๋Œ€ํ•œ ์œ„ํ˜‘์€ Secure AI๋กœ ๋ถ„๋ฅ˜ํ•  ์ˆ˜ ์žˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ํ•™์Šต ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ๊ณต๊ฒฉ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์‹ ๊ฒฝ๋ง์˜ ๊ฒฐ์† ์‚ฌ๋ก€๋ฅผ ๋ณด์—ฌ์ค€๋‹ค. ๊ธฐ์กด์˜ AEs ์—ฐ๊ตฌ๋“ค์€ ์ด๋ฏธ์ง€๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ๋งŽ์€ ์—ฐ๊ตฌ๊ฐ€ ์ง„ํ–‰๋˜์—ˆ๋‹ค. ๋ณด๋‹ค ๋ณต์žกํ•œ heterogenousํ•œ PDF ๋ฐ์ดํ„ฐ๋กœ ์—ฐ๊ตฌ๋ฅผ ํ™•์žฅํ•˜์—ฌ generative ๊ธฐ๋ฐ˜์˜ ๋ชจ๋ธ์„ ์ œ์•ˆํ•˜์—ฌ ๊ณต๊ฒฉ ์ƒ˜ํ”Œ์„ ์ƒ์„ฑํ•˜์˜€๋‹ค. ๋‹ค์Œ์œผ๋กœ ์ด์ƒ ํŒจํ„ด์„ ๋ณด์ด๋Š” ์ƒ˜ํ”Œ์„ ๊ฒ€์ถœํ•  ์ˆ˜ ์žˆ๋Š” DNA steganalysis ๋ฐฉ์–ด ๋ชจ๋ธ์„ ์ œ์•ˆํ•œ๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ ๊ฐœ์ธ ์ •๋ณด ๋ณดํ˜ธ๋ฅผ ์œ„ํ•ด generative ๋ชจ๋ธ ๊ธฐ๋ฐ˜์˜ ์ต๋ช…ํ™” ๊ธฐ๋ฒ•๋“ค์„ ์ œ์•ˆํ•œ๋‹ค. ์š”์•ฝํ•˜๋ฉด ๋ณธ ๋…ผ๋ฌธ์€ ์ธ๊ณต์ง€๋Šฅ ๋ชจ๋ธ์„ ํ™œ์šฉํ•œ ๊ณต๊ฒฉ ๋ฐ ๋ฐฉ์–ด ์•Œ๊ณ ๋ฆฌ์ฆ˜๊ณผ ์‹ ๊ฒฝ๋ง์„ ํ™œ์šฉํ•˜๋Š”๋ฐ ๋ฐœ์ƒ๋˜๋Š” ํ”„๋ผ์ด๋ฒ„์‹œ ์ด์Šˆ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ๋Š” ๊ธฐ๊ณ„ํ•™์Šต ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๊ธฐ๋ฐ˜ํ•œ ์ผ๋ จ์˜ ๋ฐฉ๋ฒ•๋ก ์„ ์ œ์•ˆํ•œ๋‹ค.Abstract i List of Figures vi List of Tables xiii 1 Introduction 1 2 Background 6 2.1 Deep Learning: a brief overview . . . . . . . . . . . . . . . . . . . 6 2.2 Security Attacks on Deep Learning Models . . . . . . . . . . . . . 10 2.2.1 Evasion Attacks . . . . . . . . . . . . . . . . . . . . . . . 12 2.2.2 Poisoning Attack . . . . . . . . . . . . . . . . . . . . . . . 20 2.3 Defense Techniques Against Deep Learning Models . . . . . . . . . 26 2.3.1 Defense Techniques against Evasion Attacks . . . . . . . . 27 2.3.2 Defense against Poisoning Attacks . . . . . . . . . . . . . . 36 2.4 Privacy issues on Deep Learning Models . . . . . . . . . . . . . . . 38 2.4.1 Attacks on Privacy . . . . . . . . . . . . . . . . . . . . . . 39 2.4.2 Defenses Against Attacks on Privacy . . . . . . . . . . . . 40 3 Attacks on Deep Learning Models 47 3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.1.1 Threat Model . . . . . . . . . . . . . . . . . . . . . . . . . 53 3.1.2 Portable Document Format (PDF) . . . . . . . . . . . . . . 55 3.1.3 PDF Malware Classifiers . . . . . . . . . . . . . . . . . . . 57 3.1.4 Evasion Attacks . . . . . . . . . . . . . . . . . . . . . . . 58 3.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 3.2.1 Feature Extraction . . . . . . . . . . . . . . . . . . . . . . 60 3.2.2 Feature Selection Process . . . . . . . . . . . . . . . . . . 61 3.2.3 Seed Selection for Mutation . . . . . . . . . . . . . . . . . 62 3.2.4 Evading Model . . . . . . . . . . . . . . . . . . . . . . . . 63 3.2.5 Model architecture . . . . . . . . . . . . . . . . . . . . . . 67 3.2.6 PDF Repacking and Verification . . . . . . . . . . . . . . . 67 3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 3.3.1 Datasets and Model Training . . . . . . . . . . . . . . . . . 68 3.3.2 Target Classifiers . . . . . . . . . . . . . . . . . . . . . . . 71 3.3.3 CVEs for Various Types of PDF Malware . . . . . . . . . . 72 3.3.4 Malicious Signature . . . . . . . . . . . . . . . . . . . . . 72 3.3.5 AntiVirus Engines (VirusTotal) . . . . . . . . . . . . . . . 76 3.3.6 Feature Mutation Result for Contagio . . . . . . . . . . . . 76 3.3.7 Feature Mutation Result for CVEs . . . . . . . . . . . . . . 78 3.3.8 Malicious Signature Verification . . . . . . . . . . . . . . . 78 3.3.9 Evasion Speed . . . . . . . . . . . . . . . . . . . . . . . . 80 3.3.10 AntiVirus Engines (VirusTotal) Result . . . . . . . . . . . . 82 3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 4 Defense on Deep Learning Models 88 4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 4.1.1 Message-Hiding Regions . . . . . . . . . . . . . . . . . . . 91 4.1.2 DNA Steganography . . . . . . . . . . . . . . . . . . . . . 92 4.1.3 Example of Message Hiding . . . . . . . . . . . . . . . . . 94 4.1.4 DNA Steganalysis . . . . . . . . . . . . . . . . . . . . . . 95 4.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 4.2.1 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . 98 4.2.2 Proposed Model Architecture . . . . . . . . . . . . . . . . 103 4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105 4.3.1 Experiment Setup . . . . . . . . . . . . . . . . . . . . . . . 105 4.3.2 Environment . . . . . . . . . . . . . . . . . . . . . . . . . 106 4.3.3 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 4.3.4 Model Training . . . . . . . . . . . . . . . . . . . . . . . . 107 4.3.5 Message Hiding Procedure . . . . . . . . . . . . . . . . . . 108 4.3.6 Evaluation Procedure . . . . . . . . . . . . . . . . . . . . . 109 4.3.7 Performance Comparison . . . . . . . . . . . . . . . . . . . 109 4.3.8 Analyzing Malicious Code in DNA Sequences . . . . . . . 112 4.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 5 Privacy: Generative Models for Anonymizing Private Data 115 5.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 5.1.1 Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 5.1.2 Anonymization using GANs . . . . . . . . . . . . . . . . . 119 5.1.3 Security Principle of Anonymized GANs . . . . . . . . . . 123 5.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 5.2.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 5.2.2 Target Classifiers . . . . . . . . . . . . . . . . . . . . . . . 126 5.2.3 Model Training . . . . . . . . . . . . . . . . . . . . . . . . 126 5.2.4 Evaluation Process . . . . . . . . . . . . . . . . . . . . . . 126 5.2.5 Comparison to Differential Privacy . . . . . . . . . . . . . 128 5.2.6 Performance Comparison . . . . . . . . . . . . . . . . . . . 128 5.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 6 Privacy: Privacy-preserving Inference for Deep Learning Models 132 6.1 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 6.1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 135 6.1.2 Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137 6.1.3 Deep Private Generation Framework . . . . . . . . . . . . . 137 6.1.4 Security Principle . . . . . . . . . . . . . . . . . . . . . . . 141 6.1.5 Threat to the Classifier . . . . . . . . . . . . . . . . . . . . 143 6.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 6.2.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143 6.2.2 Experimental Process . . . . . . . . . . . . . . . . . . . . . 146 6.2.3 Target Classifiers . . . . . . . . . . . . . . . . . . . . . . . 147 6.2.4 Model Training . . . . . . . . . . . . . . . . . . . . . . . . 147 6.2.5 Model Evaluation . . . . . . . . . . . . . . . . . . . . . . . 149 6.2.6 Performance Comparison . . . . . . . . . . . . . . . . . . . 150 6.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 7 Conclusion 153 7.0.1 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . 154 7.0.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . 155 Bibliography 157 Abstract in Korean 195Docto

    A Distributed and Real-time Machine Learning Framework for Smart Meter Big Data

    Get PDF
    The advanced metering infrastructure allows smart meters to collect high-resolution consumption data, thereby enabling consumers and utilities to understand their energy usage at different levels, which has led to numerous smart grid applications. Smart meter data, however, poses different challenges to developing machine learning frameworks than classic theoretical frameworks due to their big data features and privacy limitations. Therefore, in this work, we aim to address the challenges of building machine learning frameworks for smart meter big data. Specifically, our work includes three parts: 1) We first analyze and compare different learning algorithms for multi-level smart meter big data. A daily activity pattern recognition model has been developed based on non-intrusive load monitoring for appliance-level smart meter data. Then, a consensus-based load profiling and forecasting system has been proposed for individual building level and higher aggregated level smart meter data analysis; 2) Following discussion of multi-level smart meter data analysis from an offline perspective, a universal online functional analysis model has been proposed for multi-level real-time smart meter big data analysis. The proposed model consists of a multi-scale load dynamic profiling unit based on functional clustering and a multi-scale online load forecasting unit based on functional deep neural networks. The two units enable online tracking of the dynamic cluster trajectories and online forecasting of daily multi-scale demand; 3) To enable smart meter data analysis in the distributed environment, FederatedNILM was proposed, which is then combined with differential privacy to provide privacy guarantees for the appliance-level distributed machine learning framework. Based on federated deep learning enhanced with two schemes, namely the utility optimization scheme and the privacy-preserving scheme, the proposed distributed and privacy-preserving machine learning framework enables electric utilities and service providers to offer smart meter services on a large scale

    A cooperative framework for molecular biology database integration using image object selection

    Get PDF
    The theme and the concept of 'Molecular Biology Database Integration' and the problems associated with this concept initiated the idea for this Ph.D research. The available technologies facilitate to analyse the data independently and discretely but it fails to integrate the data resources for more meaningful information. This along with the integration issues created the scope for this Ph.D research. The research has reviewed the 'database interoperability' problems and it has suggested a framework for integrating the molecular biology databases. The framework has proposed to develop a cooperative environment to share information on the basis of common purpose for the molecular biology databases. The research has also reviewed other implementation and interoperability issues for laboratory based, dedicated and target specific database. The research has addressed the following issues: diversity of molecular biology databases schemas, schema constructs and schema implementation multi-database query using image object keying, database integration technologies using context graph, automated navigation among these databases. This thesis has introduced a new approach for database implementation. It has introduced an interoperable component database concept to initiate multidatabase query on gene mutation data. A number of data models have been proposed for gene mutation data which is the basis for integrating the target specific component database to be integrated with the federated information system. The proposed data models are: data models for genetic trait analysis, classification of gene mutation data, pathological lesion data and laboratory data. The main feature of this component database is non-overlapping attributes and it will follow non-redundant integration approach as explained in the thesis. This will be achieved by storing attributes which will not have the union or intersection of any attributes that exist in public domain molecular biology databases. Unlike data warehousing technique, this feature is quite unique and novel. The component database will be integrated with other biological data sources for sharing information in a cooperative environment. This involves developing new tools. The thesis explains the role of these new tools which are: meta data extractor, mapping linker, query generator and result interpreter. These tools are used for a transparent integration without creating any global schema of the participating databases. The thesis has also established the concept of image object keying for multidatabase query and it has proposed a relevant algorithm for matching protein spot in gel electrophoresis image. An object spot in gel electrophoresis image will initiate the query when it is selected by the user. It matches the selected spot with other similar spots in other resource databases. This image object keying method is an alternative to conventional multidatabase query which requires writing complex SQL scripts. This method also resolve the semantic conflicts that exist among molecular biology databases. The research has proposed a new framework based on the context of the web data for interactions with different biological data resources. A formal description of the resource context is described in the thesis. The implementation of the context into Resource Document Framework (RDF) will be able to increase the interoperability by providing the description of the resources and the navigation plan for accessing the web based databases. A higher level construct is developed (has, provide and access) to implement the context into RDF for web interactions. The interactions within the resources are achieved by utilising an integration domain to extract the required information with a single instance and without writing any query scripts. The integration domain allows to navigate and to execute the query plan within the resource databases. An extractor module collects elements from different target webs and unify them as a whole object in a single page. The proposed framework is tested to find specific information e.g., information on Alzheimer's disease, from public domain biology resources, such as, Protein Data Bank, Genome Data Bank, Online Mendalian Inheritance in Man and local database. Finally, the thesis proposes further propositions and plans for future work
    • โ€ฆ
    corecore