277 research outputs found

    Generating Nested XML Documents with DTD from Relational Views

    Converting relational databases into XML is increasingly common for publishing and exchanging data on the web. Most current approaches and tools for generating XML documents from relational databases produce flat XML documents that contain data redundancy, which leads to massive data volumes on the web. Other approaches assume that the relational database used to generate nested XML documents is normalized. In addition, these approaches struggle to distinguish parent elements from child elements in the nested XML document. Moreover, most current approaches and tools do not generate nested XML documents automatically; they require the user to specify the constraints and the schema of the target document. This research proposes an approach to automatically generate nested XML documents from flat, unnormalized relational database views. The research aims to reduce data redundancy and storage size for the generated XML documents. The proposed approach consists of three steps: the first converts the flat relational view into a nested relational view, the second generates a DTD from the nested relational view, and the third generates the nested XML document from the nested relational view. The proposed approach is evaluated against other approaches such as NeT, CoT, and Cost-Based, and tools such as Allora, Altova, and DbToXml, with respect to two measurements: data redundancy and storage size of the document. The first measurement includes several parameters: the number of data values, elements, attributes, and tags. Based on the results of these comparisons, the proposed approach is more efficient at reducing the data redundancy and storage size of XML documents: it reduces data redundancy and storage size by approximately 50% and 55%, respectively.
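    The core idea of the first and third steps — grouping a flat view by its parent attributes so each parent value is emitted once — can be sketched as follows. This is a minimal illustration, not the paper's algorithm; the table, column names, and element names are invented for the example.

```python
import xml.etree.ElementTree as ET

# A flat (unnormalized) relational view: the parent ("dept") values are
# repeated for every child ("employee") row — the redundancy the paper targets.
flat_view = [
    {"dept_id": "D1", "dept_name": "Sales", "emp_id": "E1", "emp_name": "Ann"},
    {"dept_id": "D1", "dept_name": "Sales", "emp_id": "E2", "emp_name": "Bob"},
    {"dept_id": "D2", "dept_name": "R&D",   "emp_id": "E3", "emp_name": "Eve"},
]

def nest(rows, parent_keys, child_keys):
    """Group flat rows by the parent keys, keeping each parent once."""
    groups = {}
    for row in rows:
        key = tuple(row[k] for k in parent_keys)
        groups.setdefault(key, []).append({k: row[k] for k in child_keys})
    return groups

# Emit nested XML: one <department> element per group, children inside it.
root = ET.Element("departments")
nested = nest(flat_view, ["dept_id", "dept_name"], ["emp_id", "emp_name"])
for (dept_id, dept_name), emps in nested.items():
    dept = ET.SubElement(root, "department", id=dept_id)
    ET.SubElement(dept, "name").text = dept_name
    for e in emps:
        emp = ET.SubElement(dept, "employee", id=e["emp_id"])
        ET.SubElement(emp, "name").text = e["emp_name"]

xml_text = ET.tostring(root, encoding="unicode")
```

    In the flat view "Sales" is stored twice; in the nested document it appears once, which is exactly where the reported redundancy and storage savings come from.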

    Towards an Efficient Evaluation of General Queries

    Database applications often need to evaluate queries containing quantifiers or disjunctions, e.g., for handling general integrity constraints. Existing efficient methods for processing quantifiers depart from the relational model, as they rely on non-algebraic procedures. Looking at quantified query evaluation from a new angle, we propose an approach to processing quantifiers that uses relational algebra operators only. Our approach performs in two phases. The first phase normalizes the queries, producing a canonical form. This form permits an improved translation into relational algebra during the second phase. The improved translation relies on a new operator - the complement-join - that generalizes the set difference, on algebraic expressions of universal quantifiers that avoid the expensive division operator in many cases, and on special processing of disjunctions by means of constrained outer-joins. Our method achieves efficiency at least comparable to that of previous proposals, and better in most cases. Furthermore, it is considerably simpler to implement, as it relies entirely on relational data structures and operators.
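    To make the universal-quantifier problem concrete: the textbook way to answer "which suppliers supply ALL parts" is relational division, and the set-difference formulation below is the standard algebraic rewriting that the paper's complement-join generalizes. The relations and values are invented for illustration; the paper's own operator is not shown here.

```python
# "Suppliers who supply ALL parts" — a universally quantified query.
# Relational division via double set difference: a supplier qualifies
# exactly when there is NO (supplier, part) pair it is missing.
supplies = {("s1", "p1"), ("s1", "p2"), ("s2", "p1")}
parts = {"p1", "p2"}
suppliers = {s for s, _ in supplies}

# Pairs that would have to exist for a supplier to cover every part,
# minus the pairs that actually exist:
missing = {(s, p) for s in suppliers for p in parts} - supplies

# Any supplier appearing in `missing` fails the universal condition.
result = suppliers - {s for s, _ in missing}
# result == {"s1"}: only s1 supplies both p1 and p2
```

    The expensive part is materializing the cross product `suppliers x parts`; avoiding that cost in common cases is precisely what the paper's algebraic expressions of universal quantifiers aim at.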

    The normalization of frames as a superclass of relations

    M.Sc. (Computer Science). Knowledge representation suffers from certain problems, which are not a result of the inadequacies of knowledge representation schemes, but of the way in which they are used and implemented. In the first part of this dissertation we examine the relational model (as used in relational database management systems) and we examine frames (a knowledge representation scheme used in expert systems), as proposed by M. Minsky [MIN75]. We then provide our own definition of frames. In the second part, we examine similarities between the two models (the relational model and our frame model), establishing frames as a superclass of relations. We then define normalization for frames and examine how normalization might solve some of the problems we have identified. We then examine the integration of knowledge-based systems and database management systems and classify our normalization of frames as such an attempt. We conclude by examining the place of normalization within the expert system development life cycle.
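    The "frames as a superclass of relations" view can be illustrated with a minimal sketch: a Minsky-style frame has named slots that may carry defaults and attached procedures, and a relation tuple is the degenerate case where every slot holds an atomic value. This is an illustrative reading, not the dissertation's formal definition; all names and values below are invented.

```python
# A minimal frame: named slots with optional defaults and attached
# "if-needed" procedures (daemons). A relation tuple is the degenerate
# case: all slots filled with atomic values, no defaults, no daemons.
class Frame:
    def __init__(self, name, slots=None, defaults=None, daemons=None):
        self.name = name
        self.slots = dict(slots or {})
        self.defaults = dict(defaults or {})
        self.daemons = dict(daemons or {})  # slot -> callable computing a value

    def get(self, slot):
        if slot in self.slots:
            return self.slots[slot]
        if slot in self.daemons:            # if-needed procedure fires
            return self.daemons[slot](self)
        return self.defaults.get(slot)      # fall back to the default

# A relation tuple viewed as a frame with only filled slots:
emp = Frame("employee", slots={"id": 7, "name": "Ann", "dept": "Sales"})

# A richer frame adds a default and a computed slot:
person = Frame("person",
               slots={"birth_year": 1990},
               defaults={"legs": 2},
               daemons={"age": lambda f: 2024 - f.get("birth_year")})
```

    Because every tuple is expressible as a frame but not vice versa, dependency-based normalization defined on relations has to be extended, not merely reused, when applied to frames.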

    Comparative Analysis of Data Redundancy and Execution Time between Relational and Object-Oriented Schema Table

    Database design is one of the important phases in designing software, because the database is where the system's data is stored. One of the most popular techniques used in database design is the relational technique, which focuses on the entity relationship diagram and normalization. The relational technique is useful for eliminating data redundancy because normalization produces normal forms on the schema tables. The second technique is the object-oriented technique, which focuses on the class diagram and generating schema tables. An advantage of the object-oriented technique is its close correspondence to programming languages like C++ or Java. This paper compares the performance of the relational and object-oriented techniques in terms of resolving data redundancy during the database design phase, as well as measuring query execution time. The experimental results, based on a course database case study, traced 186 redundant records using the relational technique and 204 redundant records using the object-oriented technique. The query execution times measured were 46.75 ms and 31.75 ms for the relational and object-oriented techniques, respectively.
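    One simple way to operationalize "redundant records", in the spirit of the comparison above, is to project a flat table onto a subset of its attributes and count repeats beyond the first occurrence; normalization moves that projection into its own table, where the repeats vanish. The table, column layout, and counting rule here are illustrative assumptions, not the paper's exact methodology.

```python
# A flat course table: (course, title, student, student_name).
# The (course, title) pair repeats once per enrolled student.
flat = [
    ("CS101", "Databases", "S1", "Ann"),
    ("CS101", "Databases", "S2", "Bob"),
    ("CS101", "Databases", "S3", "Cal"),
    ("CS102", "Networks",  "S1", "Ann"),
    ("CS102", "Networks",  "S2", "Bob"),
]

def redundant_records(rows):
    """Count occurrences beyond the first of each distinct record."""
    seen, dup = set(), 0
    for r in rows:
        if r in seen:
            dup += 1
        seen.add(r)
    return dup

# Project onto the course attributes: the redundancy shows up at once.
course_projection = [(c, title) for c, title, _, _ in flat]
flat_redundancy = redundant_records(course_projection)        # 3 repeats

# Normalizing puts (course, title) in its own table (a set of tuples),
# so the same projection carries no repeats.
courses = set(course_projection)
normalized_redundancy = redundant_records(list(courses))      # 0 repeats
```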

    Clarifying Normalization

    Confusion exists among database textbooks as to the goal of normalization, as well as to which normal form a designer should aspire. This article discusses such discrepancies with the intention of simplifying normalization for both teacher and student. This author's industry and classroom experience indicates that such simplification yields quicker learning and more complete understanding by students.

    Clustering Algorithms for Microarray Data Mining

    This thesis presents a systems engineering model of modern drug discovery processes and related systems integration requirements. Challenging problems include the integration of public information content with proprietary corporate content, support for different types of scientific analyses, and automated analysis tools motivated by diverse forms of biological data. To capture the requirements of the discovery system, we identify the processes, users, and scenarios to form a UML use case model. We then define the object-oriented system structure and attach behavioral elements. We also look at how object-relational database extensions can be applied to such analysis. The next portion of the thesis studies the performance of clustering algorithms based on LVQ, SVMs, and other machine learning algorithms, applied to two types of analyses: functional and phenotypic classification. We found that LVQ initialized with the LBG codebook yields performance comparable to the optimal separating surfaces generated by related SVM kernels. We also describe a novel similarity measure, called the unnormalized symmetric Kullback-Leibler measure, based on unnormalized expression values. Since the Mercer criterion cannot be applied to this measure, we compared its performance with that of the log-Euclidean distance in the LVQ algorithm. The two distance measures perform similarly on cDNA arrays, while the unnormalized symmetric Kullback-Leibler measure outperforms the log-Euclidean distance on certain phenotypic classification problems. Pre-filtering algorithms to find discriminating instances based on PCA, the Find Similar function, and IB3 were also investigated. The Find Similar method gives the best performance in terms of multiple criteria.
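    The symmetrized Kullback-Leibler form applied to raw positive vectors, together with the log-Euclidean baseline it is compared against, can be sketched as below. This is the standard symmetrized-KL expression computed without normalizing the vectors to sum to one; the thesis's exact definition may differ, and the example vectors are invented.

```python
import math

def symmetric_kl(x, y):
    """Symmetrized KL-style measure on unnormalized positive vectors:
    sum over i of (x_i - y_i) * log(x_i / y_i).  Always >= 0 and
    symmetric in its arguments, but not a metric (no triangle inequality)."""
    return sum((a - b) * math.log(a / b) for a, b in zip(x, y))

def log_euclidean(x, y):
    """Euclidean distance between the element-wise logs of the vectors,
    the baseline the measure is compared against."""
    return math.sqrt(sum((math.log(a) - math.log(b)) ** 2
                         for a, b in zip(x, y)))

# Two toy "expression" vectors (raw positive values, not normalized):
x = [2.0, 4.0, 1.0]
y = [1.0, 4.0, 2.0]

d_kl = symmetric_kl(x, y)       # == 2 * ln 2, same as symmetric_kl(y, x)
d_le = log_euclidean(x, y)
```

    Note that because this measure is not positive-definite in the kernel sense, the Mercer criterion cannot certify it for use in an SVM kernel, which is why the thesis evaluates it inside LVQ instead.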