Content-Based Access Control
In conventional databases, the most popular access control model requires policies to be specified manually and explicitly for each role of every user against each data object. In today's large-scale, content-centric data sharing, such approaches can be impractical because of the explosive growth of data and the sensitivity of data objects. Moreover, conventional database access control policies cannot function when the semantic content of data is expected to play a role in access decisions. Users are often over-privileged, and ex post facto auditing is used to detect misuse of privileges. Unfortunately, the damage is usually difficult to reverse, because a (large amount of) data has already been disclosed. In this dissertation, we first introduce Content-Based Access Control (CBAC), an innovative access control model for content-centric information sharing. As a complement to conventional access control models, CBAC makes access control decisions automatically, based on the content similarity between user credentials and data content. In CBAC, a metarule allows each user to access "a subset" of the designated data objects of a content-centric database, and the boundary of that subset is determined dynamically by the textual content of the data objects. We then present an enforcement mechanism for CBAC that uses Oracle's Virtual Private Database (VPD) to implement row-wise access control and to prevent data objects from being abused through unnecessary access admission. To improve the performance of the proposed approach, we introduce a content-based blocking mechanism that makes CBAC enforcement more efficient and reveals a more relevant part of the data objects than using the user credentials and data content alone. We also use several tagging mechanisms for more accurate content matching on short text snippets (e.g., short VarChar attributes); these extract topics, rather than pure word occurrences, to represent the content of data. With tagging, content similarity is calculated not purely from word occurrences but from the semantic topics underlying the text. Experimental results show that CBAC makes accurate access control decisions with a small overhead.
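The similarity-based decision described in this abstract can be illustrated with a minimal sketch. It assumes a plain bag-of-words cosine similarity and an in-memory list of rows; the dissertation itself enforces CBAC through Oracle VPD and uses topic tagging, neither of which is reproduced here. The function names, the sample rows, and the 0.2 threshold are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def accessible_rows(credential_text, rows, threshold=0.2):
    """Return ids of rows whose text is similar enough to the credential.

    The threshold is an illustrative assumption, not a value from the source.
    """
    cred = Counter(tokenize(credential_text))
    return [row_id for row_id, text in rows
            if cosine_similarity(cred, Counter(tokenize(text))) >= threshold]


# Hypothetical content-centric table: (row id, textual content).
rows = [
    (1, "cardiac surgery patient record with heart valve replacement"),
    (2, "quarterly budget report for the finance department"),
    (3, "heart failure treatment and cardiac rehabilitation notes"),
]
credential = "cardiologist specializing in cardiac and heart surgery"
print(accessible_rows(credential, rows))
```

The boundary of the accessible subset moves with the row content rather than with an explicit per-object policy, which is the property the abstract attributes to the metarule.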
A Human-Centric Approach to Data Fusion in Post-Disaster Management: The Development of a Fuzzy Set Theory Based Model
It is critical to provide an efficient and accurate information system in the post-disaster phase so that individuals can access and obtain the necessary resources in a timely manner. Current map-based post-disaster management systems, however, provide complete lists of emergency resources without filtering them, which usually leads to a high level of energy consumed in computation. An effective post-disaster management system (PDMS) would also distribute emergency resources such as hospitals, storage, and transportation much more reasonably and would be more beneficial to individuals in the post-disaster period. In this dissertation, semi-supervised learning (SSL) based graph systems were first constructed for the PDMS. The graph-based PDMS resource map was converted to a directed graph represented by an adjacency matrix, and decision information was then derived from the PDMS in two ways: a clustering operation and a graph-based semi-supervised optimization process. In this study, the PDMS was applied to emergency resource distribution in the post-disaster (response) phase, and a path optimization algorithm based on ant colony optimization (ACO) was used to minimize the post-disaster cost. Simulation results show the effectiveness of the proposed methodology. The analysis compared it with clustering-based algorithms that use improved ACO variants, namely the tour improvement algorithm (TIA) and the Max-Min Ant System (MMAS), and the results also show that the SSL-based graph is more effective for calculating the optimal path in the PDMS. This research also improved the map by combining the disaster map with the initial GIS-based map, which located the target area while considering the influence of the disaster. First, all initial maps and disaster maps underwent a Gaussian transformation while the histograms of all map images were acquired. Next, a discrete wavelet transform (DWT) was applied to all images, and a Gaussian fusion algorithm was applied to the DWT images. Then, the inverse DWT (iDWT) was applied to generate a new map for the post-disaster management system. Finally, simulations were carried out, and the results showed the effectiveness of the proposed method in comparison with other fusion algorithms, such as mean-mean fusion and max-UD fusion, using evaluation indices including entropy, spatial frequency (SF), and image quality index (IQI). A fuzzy set model was proposed to improve the representation capacity of nodes in this GIS-based PDMS.
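Two of the evaluation indices named in this abstract, entropy and spatial frequency, can be sketched as follows. This is a minimal illustration that assumes their common textbook definitions (Shannon entropy of the gray-level histogram, and the root of the summed squared row and column frequencies normalized by the pixel count); the abstract does not state which exact formulas were used, and the tiny test images are hypothetical.

```python
import math
from collections import Counter


def entropy(image):
    """Shannon entropy (in bits) of the gray-level histogram of a 2-D image."""
    pixels = [p for row in image for p in row]
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def spatial_frequency(image):
    """Spatial frequency sqrt(RF^2 + CF^2) from row and column differences.

    Normalizing by rows * cols is one common convention (an assumption here).
    """
    rows, cols = len(image), len(image[0])
    rf_sq = sum((image[i][j] - image[i][j - 1]) ** 2
                for i in range(rows) for j in range(1, cols)) / (rows * cols)
    cf_sq = sum((image[i][j] - image[i - 1][j]) ** 2
                for i in range(1, rows) for j in range(cols)) / (rows * cols)
    return math.sqrt(rf_sq + cf_sq)


# Hypothetical 2x2 images: a constant patch and a checkerboard.
flat = [[5, 5], [5, 5]]
checker = [[0, 1], [1, 0]]
print(entropy(checker), spatial_frequency(checker))
```

Under these definitions a fused map with more gray-level variety and sharper local transitions scores higher on both indices, which is why they are often used to compare fusion algorithms.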
A Study on Deep-Learning-Based Fault Diagnosis of Main Transformers Using Unlabeled Fault Data and Dissolved Gas Analysis Data
Doctoral dissertation -- Graduate School of Seoul National University: College of Engineering, Department of Mechanical and Aerospace Engineering, August 2021.
Due to the rapid development and advancement of today's industry, the demand for safe and reliable power distribution and transmission lines is becoming more critical; thus, prognostics and health management (hereafter, PHM) is becoming more important in the power transformer industry. Among the various methods developed for power transformer diagnosis, the artificial intelligence (AI) based approach has received considerable interest from academics. Specifically, deep learning technology, which offers excellent performance when used with vast amounts of data, is rapidly gaining the spotlight in the academic field of transformer fault diagnosis. Interest in deep learning has been especially notable in the field of fault diagnosis because deep learning algorithms can be applied to complex systems that have large amounts of data, without the need for a deep understanding of the domain knowledge of the system.
However, the outstanding performance of these diagnosis methods has not yet gained much attention in the power transformer PHM industry. The reason is that industrial datasets contain a large amount of unlabeled data and only a small amount of fault data, which makes it difficult to develop deep-learning-based diagnosis methods for the power transformer PHM industry.
Therefore, in this dissertation research, deep-learning-based fault diagnosis methods are developed to overcome three issues that currently prevent this type of diagnosis in industrial power transformers: 1) the visualization of health feature space issue, 2) the insufficient data issue, and 3) the severity issue. To cope with these challenges, this thesis is composed of three research thrusts. The first research thrust develops a health feature space via a semi-supervised autoencoder with an auxiliary detection task. The proposed method can visualize a monotonic health trendability of the transformer's degradation properties. Further, thanks to the use of a semi-supervised approach, the method is applicable to situations with a large amount of unlabeled data and a small amount of labeled data (a situation common in industrial datasets). Next, the second research thrust proposes a new framework that bridges the rule-based Duval method with an AI-based deep neural network (BDD). In this method, the rule-based Duval method is used to pseudo-label a large amount of unlabeled data. Furthermore, the AI-based DNN applies regularization techniques and parameter transfer learning to learn from the noisy pseudo-labeled data. Finally, the third thrust not only identifies fault types but also indicates a severity level. However, in real-world data the labeled fault types and severity levels are imbalanced. Therefore, in the proposed method, diagnosis of fault types, together with severity levels, under imbalanced conditions is addressed by using a generative adversarial network with an auxiliary classifier. The validity of the proposed methods is demonstrated by studying massive unlabeled dissolved gas analysis (DGA) data, provided by the Korea Electric Power Corporation (KEPCO), and sparse labeled data, provided by the IEC TC 10 database.
Each developed method could be used in industrial fields that use power transformers to monitor the health feature space, consider the severity level, and diagnose transformer faults when labeled fault data are extremely scarce.
Chapter 1 Introduction
1.1 Motivation
1.2 Research Scope and Overview
1.3 Dissertation Layout
Chapter 2 Literature Review
2.1 A Brief Overview of Rule-Based Fault Diagnosis
2.2 A Brief Overview of Conventional AI-Based Fault Diagnosis
Chapter 3 Extracting Health Feature Space via Semi-Supervised Autoencoder with an Auxiliary Task (SAAT)
3.1 Backgrounds of Semi-supervised autoencoder (SSAE)
3.1.1 Autoencoder: Unsupervised Feature Extraction
3.1.2 Softmax Classifier: Supervised Classification
3.1.3 Semi-supervised Autoencoder
3.2 Input DGA Data Preprocessing
3.3 SAAT-Based Fault Diagnosis Method
3.3.1 Roles of the Auxiliary Detection Task
3.3.2 Architecture of the Proposed SAAT
3.3.3 Health Feature Space Visualization
3.3.4 Overall Procedure of the Proposed SAAT-based Fault Diagnosis
3.4 Performance Evaluation of SAAT
3.4.1 Data Description and Implementation
3.4.2 An Outline of Four Comparative Studies and Quantitative Evaluation Metrics
3.4.3 Experimental Results and Discussion
3.5 Summary and Discussion
Chapter 4 Learning from Even a Weak Teacher: Bridging Rule-based Duval Weak Supervision and a Deep Neural Network (BDD) for Diagnosing Transformer
4.1 Backgrounds of BDD
4.1.1 Rule-based method: Duval Method
4.1.2 Deep learning Based Method: Deep Neural Network
4.1.3 Parameter Transfer
4.2 BDD Based Fault Diagnosis
4.2.1 Problem Statement
4.2.2 Framework of the Proposed BDD
4.2.3 Overall Procedure of BDD-based Fault Diagnosis
4.3 Performance Evaluation of the BDD
4.3.1 Description of Data and the DNN Architecture
4.3.2 Experimental Results and Discussion
4.4 Summary and Discussion
Chapter 5 Generative Adversarial Network with Embedding Severity DGA Level
5.1 Backgrounds of Generative Adversarial Network
5.2 GANES based Fault Diagnosis
5.2.1 Training Strategy of GANES
5.2.2 Overall procedure of GANES
5.3 Performance Evaluation of GANES
5.3.1 Description of Data
5.3.2 Outlines of Experiments
5.3.3 Preliminary Experimental Results of Various GANs
5.3.4 Experiments for the Effectiveness of Embedding Severity DGA Level
5.4 Summary and Discussion
Chapter 6 Conclusion
6.1 Contributions and Significance
6.2 Suggestions for Future Research
References
Abstract in Korean
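The pseudo-labeling step that the abstract above attributes to the rule-based Duval method can be sketched as follows. The sketch assumes that "Duval method" refers to the Duval triangle, which works on the relative percentages of CH4, C2H4, and C2H2. The labeling rule shown is a deliberately simple placeholder and does not reproduce the published Duval zone boundaries; all function names and sample values are hypothetical.

```python
def duval_coordinates(ch4, c2h4, c2h2):
    """Relative percentages of CH4, C2H4 and C2H2 (Duval triangle inputs)."""
    total = ch4 + c2h4 + c2h2
    if total <= 0:
        raise ValueError("gas concentrations must sum to a positive value")
    return tuple(100.0 * g / total for g in (ch4, c2h4, c2h2))


def toy_rule(pct_ch4, pct_c2h4, pct_c2h2):
    """PLACEHOLDER rule: label by dominant gas. NOT the real Duval zones."""
    dominant = max((pct_ch4, "ch4_dominant"),
                   (pct_c2h4, "c2h4_dominant"),
                   (pct_c2h2, "c2h2_dominant"))
    return dominant[1]


def pseudo_label(samples, rule=toy_rule):
    """Attach a rule-derived (and therefore noisy) label to each DGA sample."""
    return [(s, rule(*duval_coordinates(*s))) for s in samples]


# Hypothetical unlabeled DGA samples: (CH4, C2H4, C2H2) concentrations in ppm.
unlabeled = [(80.0, 15.0, 5.0), (10.0, 70.0, 20.0), (5.0, 15.0, 30.0)]
labeled = pseudo_label(unlabeled)
for sample, label in labeled:
    print(sample, label)
```

In the framework described by the abstract, labels produced this way would then be treated as noisy training targets for the DNN, with regularization and parameter transfer used to limit the effect of rule errors.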
The Effect of Data Curation on the Accuracy of Quantitative Structure-Activity Relationship Models
In the 33 years since the first public release of GenBank, and the 15 years since the publication of the first pilot assembly of the human genome, drug discovery has been awash in a tsunami of data. But it has only been within the past decade that medicinal chemists and chemical biologists have had access to the same sorts of large-scale, public-access databases as bioinformaticians and molecular biologists have had for so long. The release of this data has sparked a renewed interest in computational methods for rational drug design, but questions have arisen recently about the accuracy and quality of this data. The same question has arisen in other scientific disciplines, but it has a particular urgency for practitioners of Quantitative Structure-Activity Relationship (QSAR) modeling. By its nature, QSAR modeling depends on both activity data and chemical structures. While activities are usually expressed as numerical scalar values, a form ubiquitous throughout the sciences, chemical structures (especially those that must be interpretable as such by computer software) are stored in a variety of specialized formats which are much less common and mostly ignored outside of cheminformatics and related fields. While previous research has determined that a 5% error rate in data being used for modeling can cause a QSAR model to be non-predictive and useless for its intended purpose, and workflows have been proposed which reduce the effect of inconsistent chemical structure representations on model accuracy, a fundamental question remains: "how accurate are the structure and activity data freely available to researchers?" To this end, we have undertaken two surveys of data quality, one focusing on chemical structure information in Internet resources and a second examining the uncertainty associated with compounds reported in the medicinal chemistry literature as abstracted in ChEMBL. The results of these studies have informed the creation of an improved workflow for the curation of structure-activity data, which is intended to identify problematic data points in raw data extracted from databases so that an expert human curator can examine the underlying literature and resolve discrepancies between reported values. This workflow was in turn applied to the creation of two QSAR models that were used to implement a virtual screen seeking molecules capable of binding to both the serotonergic reuptake transporter and the alpha2a adrenergic receptor. While no suitable compounds were identified in the initial screening process, regions of chemical space that may yield truly novel alpha2a receptor ligands have been identified. These regions can be targeted in future efforts. Basing data curation workflows on manual processes by human curators is not particularly viable, as humans have a tendency to introduce errors by inattention even as they identify and repair other problems. Computers cannot effectively curate data either. While they are highly accurate when programmed properly, they lack the human creativity and insight that would allow them to determine which data points represent truly inaccurate information. In order to effectively curate data, humans and computers must both be incorporated into a workflow that harnesses their strengths and limits their liabilities.
Doctor of Philosophy
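The division of labor described in this abstract, where software flags problematic data points and a human curator resolves them, can be illustrated with a minimal sketch. It assumes that activities are reported on a logarithmic scale and that a spread of more than one log unit between reports for the same compound is worth a manual check; the compound identifiers, the values, and the threshold are all hypothetical and are not taken from the dissertation.

```python
from collections import defaultdict


def flag_discrepancies(records, max_spread=1.0):
    """Return compound ids whose reported activities differ by more than max_spread.

    The flagged compounds are meant to be handed to a human curator; nothing
    is corrected or discarded automatically.
    """
    by_compound = defaultdict(list)
    for compound_id, activity in records:
        by_compound[compound_id].append(activity)
    return sorted(cid for cid, values in by_compound.items()
                  if max(values) - min(values) > max_spread)


# Hypothetical (compound id, log-scale activity) pairs from several sources.
records = [
    ("cmpd-A", 7.1), ("cmpd-A", 7.3),
    ("cmpd-B", 5.0), ("cmpd-B", 8.2),
    ("cmpd-C", 6.4),
]
print(flag_discrepancies(records))
```

Keeping the automated step limited to flagging reflects the abstract's point that computers are consistent but cannot judge which of two conflicting literature values is actually wrong.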