
    Generating Nested XML Documents with Dtd from Relational Views

    Converting relational databases into XML is increasingly common for publishing and exchanging data on the web. Most current approaches and tools for generating XML documents from relational databases produce flat XML documents that contain redundant data, which leads to massive amounts of data on the web. Other approaches assume that the relational database used to generate nested XML documents is normalized. These approaches also have difficulty specifying which elements are parents and which are children in the nested XML document. Moreover, most current approaches and tools do not generate nested XML documents automatically; they require the user to specify the constraints and the schema of the target document. This research proposes an approach that automatically generates nested XML documents from flat, unnormalized relational database views. The research aims to reduce data redundancy and storage size in the generated XML documents. The proposed approach consists of three steps. The first converts the flat relational view into a nested relational view. The second generates a DTD from the nested relational view. The third generates a nested XML document from the nested relational view. The proposed approach is evaluated and compared with other approaches (NeT, CoT, and Cost-Based) and tools (Allora, Altova, and DbToXml) on two measurements: data redundancy and document storage size. The first measurement covers several parameters: the number of data values, elements, attributes, and tags. The results show that the proposed approach is more efficient than these approaches and tools at reducing the data redundancy and storage size of XML documents, cutting them by approximately 50% and 55%, respectively.
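
    The three steps named in this abstract (flat view to nested view, DTD generation, nested XML generation) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the sample rows, the two-column parent/child layout, and the helper names `nest_rows`, `to_dtd`, `to_flat_xml`, and `to_nested_xml` are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat view: the department is repeated on every employee row.
FLAT_VIEW = [
    {"dept": "Sales", "employee": "Ann"},
    {"dept": "Sales", "employee": "Bob"},
    {"dept": "R&D", "employee": "Cy"},
]


def nest_rows(rows, parent, child):
    """Step 1 (assumed form): group rows so each parent value appears once."""
    nested = {}
    for row in rows:
        nested.setdefault(row[parent], []).append(row[child])
    return nested


def to_dtd(parent, child, root_tag="view"):
    """Step 2 (assumed form): derive a DTD for the two-level nesting."""
    return "\n".join([
        f"<!ELEMENT {root_tag} ({parent}*)>",
        f"<!ELEMENT {parent} ({child}+)>",
        f"<!ATTLIST {parent} name CDATA #REQUIRED>",
        f"<!ELEMENT {child} (#PCDATA)>",
    ])


def to_flat_xml(rows, root_tag="view"):
    """Baseline for comparison: one <row> element per relational row."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, "row")
        for column, value in row.items():
            ET.SubElement(record, column).text = value
    return root


def to_nested_xml(nested, parent, child, root_tag="view"):
    """Step 3 (assumed form): emit each parent once with its children inside."""
    root = ET.Element(root_tag)
    for parent_value, children in nested.items():
        parent_el = ET.SubElement(root, parent, {"name": parent_value})
        for child_value in children:
            ET.SubElement(parent_el, child).text = child_value
    return root


nested = nest_rows(FLAT_VIEW, "dept", "employee")
flat_xml = ET.tostring(to_flat_xml(FLAT_VIEW), encoding="unicode")
nested_xml = ET.tostring(to_nested_xml(nested, "dept", "employee"), encoding="unicode")
```

    In this toy input the value `Sales` is serialized twice in the flat document and once in the nested one, which is the kind of redundancy the abstract's data-value and tag counts measure.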

    Generating nested XML documents from unnormalized relational views using a statistically approach

    Converting relational databases into XML is increasingly common for publishing and exchanging business application data on the Web. Most current approaches and products convert unnormalized relational views to XML documents using a flat structure, which causes data redundancy and produces massive amounts of data. This paper proposes an approach that generates nested XML documents fully automatically from flat relational views. The proposed approach has two steps. The first extracts the nested view from the flat relational view using a statistical approach to count and remove redundant data. The statistical approach is based on functional dependency analysis that takes into account the frequency of the data values in the columns of the relational view. The second step converts the nested view into nested XML. The columns of the nested view are also divided into groups to achieve the nesting in the XML document. Nested XML documents generated by the proposed approach have minimal redundancy and a smaller storage size.
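
    One way to picture the statistical step is the following Python sketch, which counts repeated values per column and checks functional dependencies on sample rows. The abstract does not give the actual procedure, so the scoring rule (total values minus distinct values), the sample data, and the names `repetition`, `holds_fd`, and `nesting_order` are assumptions.

```python
from collections import Counter

# Hypothetical unnormalized view; each tuple is (dept, city, employee).
COLUMNS = ("dept", "city", "employee")
ROWS = [
    ("Sales", "London", "Ann"),
    ("Sales", "London", "Bob"),
    ("R&D", "Paris", "Cy"),
    ("R&D", "Paris", "Di"),
]


def repetition(rows, idx):
    """Redundant occurrences in a column: total values minus distinct values."""
    counts = Counter(row[idx] for row in rows)
    return sum(counts.values()) - len(counts)


def holds_fd(rows, lhs, rhs):
    """True if column lhs functionally determines column rhs in these rows."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def nesting_order(rows, columns):
    """Order columns from most repeated (outermost) to least repeated."""
    return sorted(columns, key=lambda c: -repetition(rows, columns.index(c)))


order = nesting_order(ROWS, COLUMNS)
dept_determines_city = holds_fd(ROWS, 0, 1)
dept_determines_employee = holds_fd(ROWS, 0, 2)
```

    Under these assumptions, a column whose values repeat often and that determines another column (here `dept` determines `city`) is a natural candidate for an outer element, while a column with no repeats (`employee`) ends up innermost.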

    Generating free redundancy XML documents from non normalized relational views using a statistically approach

    Converting relational databases into XML is increasingly common for publishing and exchanging business application data on the Web. Most current approaches and products convert non-normalized relational views to XML documents using a flat structure, which causes data redundancy and produces massive amounts of data. This paper proposes an approach to reduce the redundancy of XML documents generated from non-normalized relational views. The proposed approach has two steps. The first extracts the nested view from the flat relational view using a statistical approach to count and remove redundant data. The statistical approach is based on functional dependency analysis that takes into account the frequency of the data values in the columns of the relational view. The document is built from blocks created according to the frequency of the data values in the first column, which is the column with the most frequent data values.

    Database independent Migration of Objects into an Object-Relational Database

    This paper reports on the CERN-based WISDOM project, which is studying the serialisation and deserialisation of data to and from an object database (Objectivity) and Oracle 9i. Comment: 26 pages, 18 figures; CMS CERN Conference Report cr02_01

    Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective

    This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized, and in-house NLP tools. The system facilitates portability across institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize the necessary data elements. It uses UMLS to perform domain adaptation when integrating generic-domain NLP tools. It also features stand-off annotations, which are specified by positional reference to the original document. We built an interval-tree-based search engine to efficiently query and retrieve stand-off annotations by specifying positional requirements. We also developed a utility that converts an inline annotation format to stand-off annotations, enabling the reuse of clinical text datasets with inline annotations. We experimented with our system on several NLP-facilitated tasks, including computational phenotyping for lymphoma patients and semantic relation extraction for clinical notes. These experiments showcased the broader applicability and utility of LAPNLP. Comment: 6 pages, accepted by IEEE BIBM 2018 as regular paper
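
    The idea of stand-off annotation and of converting inline markup to it can be shown with a short sketch. LAPNLP itself is written in Lisp and its formats are not given in this abstract, so the example below uses Python, a hypothetical non-nested `<label>text</label>` inline format, and a plain linear scan in place of the interval tree the paper describes.

```python
import re
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int  # inclusive character offset into the plain text
    end: int    # exclusive character offset into the plain text
    label: str


# Assumed inline format: non-nested tags such as <problem>chest pain</problem>.
TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def inline_to_standoff(marked_up):
    """Strip inline tags and return (plain_text, stand-off annotations)."""
    parts, annotations = [], []
    cursor = 0  # read position in the marked-up input
    offset = 0  # length of plain text emitted so far
    for match in TAG.finditer(marked_up):
        before = marked_up[cursor:match.start()]
        parts.append(before)
        offset += len(before)
        span = match.group(2)
        annotations.append(Annotation(offset, offset + len(span), match.group(1)))
        parts.append(span)
        offset += len(span)
        cursor = match.end()
    parts.append(marked_up[cursor:])
    return "".join(parts), annotations


def overlapping(annotations, start, end):
    """Linear-scan stand-in for an interval-tree query over [start, end)."""
    return [a for a in annotations if a.start < end and start < a.end]


text, anns = inline_to_standoff(
    "Patient reports <problem>chest pain</problem> after <drug>aspirin</drug>."
)
```

    Because the annotations only store offsets, the original text stays untouched and several tools can annotate the same document independently; an interval tree would answer the same overlap query without scanning every annotation.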

    Information Integration - the process of integration, evolution and versioning

    At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist the user in this task. Integrating the information sources provides a global information source that contains all the information needed. All of these information sources also change over time, and with each change the schema of the source can change as well. The data contained in the information source, however, cannot be converted every time, because of the huge amount of data that would have to be converted to conform to the most recent schema. In this report we describe the current methods for information integration, evolution, and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues that arise when integrating XML data sources.
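
    The point about not converting stored data on every schema change can be made concrete with a small sketch of versioned records that are upgraded only when read. This is one common technique and not necessarily one the report covers; the record layout, the two schema changes, and the names `MIGRATIONS` and `read_record` are hypothetical.

```python
# Hypothetical migrations: each one upgrades a record by one schema version.
def v1_to_v2(record):
    # Assumed schema change: "name" is split into "first" and "last".
    first, _, last = record["name"].partition(" ")
    return {"version": 2, "first": first, "last": last}


def v2_to_v3(record):
    # Assumed schema change: a new optional "email" field with a default.
    return {**record, "version": 3, "email": None}


MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}
LATEST = 3


def read_record(stored):
    """Upgrade a copy of the stored record to the latest schema on read."""
    record = dict(stored)  # the stored record itself is left unconverted
    while record["version"] < LATEST:
        record = MIGRATIONS[record["version"]](record)
    return record


old = {"version": 1, "name": "Ada Lovelace"}
current = read_record(old)
```

    The stored record keeps its original version tag, so no bulk conversion is needed when the schema evolves; the cost is paid per read instead.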

    XML document design via GN-DTD

    Designing a well-structured XML document is important for readability and maintainability. More importantly, it avoids data redundancies and update anomalies when maintaining a large quantity of XML-based documents. In this paper, we propose a method to improve XML structural design by adopting graphical notations for Document Type Definitions (GN-DTD), which describe the structure of an XML document at the schema level. Multiple levels of normal forms for GN-DTD are proposed on the basis of conceptual model approaches and normalization theory. The normalization rules are applied to transform a poorly designed XML document into a well-designed one based on normalized GN-DTD, as illustrated through examples.