2,129 research outputs found

    Desirable properties for XML update mechanisms

    The adoption of XML as the default data interchange format and the standardisation of the XPath and XQuery languages have resulted in significant research in the development and implementation of XML databases capable of processing queries efficiently. The ever-increasing deployment of XML in industry and the real-world requirement to support efficient updates to XML documents have more recently prompted research in dynamic XML labelling schemes. In this paper, we provide an overview of the recent research in dynamic XML labelling schemes. Our motivation is to define a set of properties that represent a more holistic dynamic labelling scheme and present our findings through an evaluation matrix for most of the existing schemes that provide update functionality.

    Investigation into Indexing XML Data Techniques

    The rapid development of XML technology has improved the WWW, since XML data has many advantages and has become a common technology for transferring data across the internet. The objective of this research is therefore to investigate XML indexing techniques in terms of their structures. The main goal of this investigation is to identify the main limitations of these techniques and any other open issues. Furthermore, this research considers the most common XML indexing techniques and compares them. Subsequently, this work presents an argument to establish these limitations. To conclude, the main problem of all the XML indexing techniques is the trade-off between the size and the efficiency of the indexes: all the indexes become large in order to perform well, and none of them is suitable for all users’ requirements. However, each of these techniques has some advantages.

    Accelerating data retrieval steps in XML documents


    UDante: First Steps Towards the Universal Dependencies Treebank of Dante’s Latin Works

    This paper presents the early stages of the development of a new treebank containing all of Dante Alighieri’s Latin works. In particular, it describes the conversion of the original TEI-XML files to CoNLL-U, the creation of a gold standard, the process of training four annotators and the evaluation of the syntactic annotation in terms of inter-annotator agreement and LA, UAS and LAS. The aim is to release a new resource, in view of the celebrations for the 700th anniversary of Dante’s death, which can support the development of the Vocabolario Dantesco.

    Clustering-based Labelling Scheme - A Hybrid Approach for Efficient Querying and Updating XML Documents

    Extensible Markup Language (XML) has become a dominant technology for transferring data through the worldwide web. XML labelling schemes play a key role in handling XML data efficiently and robustly, and many labelling schemes have therefore been proposed. However, these labelling schemes have limitations and shortcomings. The aim of this research was thus to investigate the existing XML labelling schemes and their limitations in order to address the issue of efficiency of XML query performance. This thesis investigated the existing labelling schemes and classified them into three categories based on certain criteria, in order to identify the limitations and challenges of these labelling schemes. Based on the outcomes of this investigation, this thesis proposed a state-of-the-art labelling scheme, called the clustering-based labelling scheme, to resolve or improve the key limitations such as the efficiency of XML query processing, the labelling of XML nodes, and the cost of XML updates. This thesis argued that using certain existing labelling schemes to label nodes, together with clustering-based techniques, can improve the efficiency of querying and of labelling nodes. Theoretically, the proposed scheme is based on dividing the nodes of an XML document into clusters. Two existing labelling schemes, the Dewey and LLS labelling schemes, were selected for labelling these clusters and their nodes. Subsequently, the proposed scheme was designed and implemented. In addition, the Dewey and LLS labelling schemes were implemented for the purpose of evaluating the proposed scheme. Four experiments were then designed in order to test the proposed scheme against the Dewey and LLS labelling schemes. The results of these experiments suggest that the proposed scheme achieved better results than the Dewey and LLS schemes. Consequently, the research hypothesis was accepted overall with a few exceptions, and the proposed scheme showed an improvement in performance and in all the targeted features and aspects.

    05061 Abstracts Collection -- Foundations of Semistructured Data

    From 06.02.05 to 11.02.05, the Dagstuhl Seminar 05061 “Foundations of Semistructured Data” was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    MIT's CWSpace project: packaging metadata for archiving educational content in DSpace

    This paper describes work in progress on the research project CWSpace, sponsored by the MIT and Microsoft Research iCampus program, to investigate the metadata standards and protocols required to archive the course materials found in MIT’s OpenCourseWare (OCW) into MIT’s institutional repository DSpace. The project goal is “to harvest and digitally archive OCW learning objects, and make them available to learning management systems by using Web Services interfaces on top of DSpace.” The larger vision is one of complex digital objects (CDOs) successfully interoperating amongst MIT’s various learning management systems and learning object repositories, providing archival preservation and persistent identifiers for educational materials, as well as providing the means to richer shared discovery and dissemination mechanisms for those materials. The paper describes work to date on the analysis of the content packaging metadata standards METS (Metadata Encoding and Transmission Standard) and especially IMS-CP (IMS Global Learning Consortium, Content Packaging), and issues faced in the development and use of profiles, extensions, and external schema for these standards. Also addressed are the anticipated issues in the preparation of transformations from one standard to another, noting the importance of well-defined profiles to making that feasible. The paper also briefly touches on the DSpace development work that will be undertaken to provide new import and export functionalities, as the technical specifications for these will largely be determined by the packaging metadata profiles that are developed. Note that the degree of interoperability considered herein might be referred to as “first level,” as this paper addresses the packaging metadata only, which in turn is the carrier or envelope for the descriptive (and other kinds of) metadata. 
It will no doubt be an even more challenging task to ensure interoperability at what might be referred to as the “second level,” that of semantic metadata.

    da|ra Metadata Schema: Version 3.0
