28,257 research outputs found

    XMCCTree: an algorithm for producing Mcctree in incompact structure

    Get PDF
    Different notions and techniques have been proposed to support query in XML. This led to the generation of algorithms to provide more efficient and accurate query processing. This research proposes an efficient algorithm to support keyword query in XML document. The propose algorithm, namely XMCCTree, employs MCLCA notion with some modifications in order to provide more efficient algorithm

    Building XML data warehouse based on frequent patterns in user queries

    Get PDF
    [Abstract]: With the proliferation of XML-based data sources available across the Internet, it is increasingly important to provide users with a data warehouse of XML data sources to facilitate decision-making processes. Due to the extremely large amount of XML data available on web, unguided warehousing of XML data turns out to be highly costly and usually cannot well accommodate the users’ needs in XML data acquirement. In this paper, we propose an approach to materialize XML data warehouses based on frequent query patterns discovered from historical queries issued by users. The schemas of integrated XML documents in the warehouse are built using these frequent query patterns represented as Frequent Query Pattern Trees (FreqQPTs). Using hierarchical clustering technique, the integration approach in the data warehouse is flexible with respect to obtaining and maintaining XML documents. Experiments show that the overall processing of the same queries issued against the global schema become much efficient by using the XML data warehouse built than by directly searching the multiple data sources

    A Method of XML Document Fragmentation for Reducing Time of XML Fragment Stream Query Processing

    Get PDF
    As XML has been established as the standard for data exchange not just on the Web but among heterogeneous devices, systems, and applications, effective processing of XML queries is one of core components of ubiquitous computing. Most of the mobile/hand-held devices deployed in ubiquitous computing environment are still limited in memory and processing power. An effective query processing is required when the source XML document is of large volume. The framework of fragmenting an XML document and streaming the XML fragments for query processing at the mobile devices has received much attention. However, the main focus was on the memory efficiency to cope with the memory constraint in the mobile devices. Query processing time might be compromised in those techniques. Since the processing power is also limited in the mobile devices, the time optimization deserves attention. We have found out that the query processing time is significantly affected by how the source XML document is fragmented. In this paper, we propose a method of XML document fragmentation whereby query processing gets efficient in time while the size constraint for each resulting fragment is satisfied. Through implementation and a set of detailed experiments, we show that our proposed method considerably outperforms other methods

    Strategies and Approaches for Generating Identical Extensive XML Tree Instances

    Get PDF
    In recent years, XML has become the de facto internet wire language. Data may be organized and given context with the use of XML. A well-organized document facilitates the transformation of raw data into actionable intelligence. In B2B1 applications, the XML data is sent and created. This implies the need for fast query processing on XML data. The processing of XML tree sample queries (XTPQ) that provide an efficient response (also known as sample matching) is a topic of active study in the XML database field.DOM (Parser) may be used to transform an XML document into a tree representation. Extensible Markup Language (XML) query languages like XPath and XQuery use tree samples (twigs) to express query results.XML query processing focuses mostly on effectively locating all instances of twig 1 samples inside an XML database. Numerous techniques for matching such tree samples have been presented in recent years. In this study, we survey recent developments in XTPQ processing. This summary will begin by introducing several algorithms for twig sample matching and then go on to provide some background on holistic techniques to process XTPQ

    EXPedite: A System for Encoded XML Processing

    Get PDF
    As XML becomes an increasingly popular format for information exchange, the efficient processing of broadcast XML data on a constrained device (for example, a cell phone or a PDA) becomes a critical task. In this paper we present the EXPedite system: a new model of data processing in an information exchange environment, which migrates the power of the data-sending server to receivers for efficient processing. It consists of a simple and general encoding scheme for servers, and streaming query processing algorithms on encoded XML stream for data receivers with constrained computing abilities. Experiments show the impressive performance of EXPedite

    FACHBEITRAG Unleashing XQuery for Data-Independent Programming

    Get PDF
    an SQL equivalent for XML data, but its roots in functional programming make it also a perfect choice for processing almost any kind of structured and semi-structured data. Apart from standard XML processing, however, advanced language features make it hard to efficiently implement the complete language for large data volumes. This work proposes a novel compilation strategy that provides both flexibility and efficiency to unleash XQuery’s potential as data programming language. It combines the simplicity and versatility of a storage-independent data abstraction with the scalability advantages of set-oriented processing. Expensive iterative sections in a query are unrolled to a pipeline of relational-style operators, which is open for optimized join processing, index use, and parallelization. The remaining aspects of the language are processed in a standard fashion, yet can be compiled anytime to more efficient native operations of the actual runtime environment. This hybrid compilation mechanism yields an efficient and highly flexible query engine that is able to drive any computation from simple XML transformation to complex data analysis, even on non-XML data. Experiments with our prototype and stateof-the-art competitors in classic XML query processing and business analytics over relational data attest the generality and efficiency of the design

    VAMANA : A High Performance, Scalable and Cost Driven XPath Engine

    Get PDF
    Many applications are migrating or beginning to make use native XML data. We anticipate that queries will emerge that emphasize the structural semantics of XML query languages like XPath and XQuery. This brings a need for an efficient query engine and database management system tailored for XML data similar to traditional relational engines. While mapping large XML documents into relational database systems while possible, poses difficulty in mapping XML queries to the less powerful relational query language SQL and creates a data model mismatch between relational tables and semi-structured XML data. Hence native solutions to efficiently store and query XML data are being developed recently. However, most of these systems thus far fail to demonstrate scalability with large document sizes, to provide robust support for the XPath query language nor to adequately address costing with respect to query optimization. In this thesis, we propose a novel cost-driven XPath engine to support the scalable evaluation of ad-hoc XPath expressions called VAMANA. VAMANA makes use of an efficient XML repository for storing and indexing large XML documents called the Multi-Axis Storage Structure (MASS) developed at WPI. VAMANA extensively uses indexes for query evaluation by considering index-only plans. To the best of our knowledge, it is the only XML query engine that supports an index plan approach for large XML documents. Our index-oriented query plans allow queries to be evaluated while reading only a fraction of the data, as all tuples for a particular context node are clustered together. The pipelined query framework minimizes the cost of handing intermediate data during query processing. Unlike other native solutions, VAMANA provides support for all 13 XPath axes. Our schema independent cost model provides dynamically calculated statistics that are then used for intelligent cost-based transformations, further improving performance. Our optimization strategy for increasing execution time performance is affirmed through our experimental studies on XMark benchmark data. VAMANA query execution is significantly faster than leading available XML query engines

    Hybrid approach for XML access control (HyXAC)

    Get PDF
    While XML has been widely adopted for sharing and managing information over the Internet, the need for efficient XML access control naturally arise. Various access control models and mechanisms have been proposed in the research community, such as view-based approaches and preprocessing approaches. All categories of solutions have their inherent advantages and disadvantages. For instance, view based approach provides high performance in query evaluation, but suffers from the view maintenance issues. To remedy the problems, we propose a hybrid approach, namely HyXAC: Hybrid XML Access Control. HyXAC provides efficient access control and query processing by maximizing the utilization of available (but constrained) resources. HyXAC uses pre-processing approach as a baseline to process queries and define sub-views. It dynamically allocates the available resources (memory and secondary storage) to materialize sub-views to improve query performance. Dynamic and fine-grained view management is introduced to utilize cost-effectiveness analysis for optimal query performance. Fine-grained view management also allows sub-views to be shared across multiple roles to eliminate the redundancies in storage

    Assessing XML Data Management with XMark

    Get PDF
    We discuss some of the experiences we gathered during the development and deployment of XMark, a tool to assess the infrastructure and performance of XML Data Management Systems. Since the appearance of the first XML database prototypes in research institutions and development labs, topics like validation, performance evaluation and optimization of XML query processors have received significant interest. The XMark benchmark follows a tradition in database research and provides a framework to assess the abilities and performance of XML processing system: it helps users to see how a query component integrates into an application and how it copes with a variety of query types that are typically encountered in real-world scenarios. To this end, XMark offers an application scenario and a set of queries; each query is intended to challenge a particular aspect of the query processor like the performance of full-text search combined with structural information or joins. Furthermore, we have designed and made available a benchmark document generator that allows for efficient generation of databases of different sizes ranging from small to very large. In short, XMark attempts to cover the major aspects of XML query processing ranging from small to large document and from textual queries to data analysis and ad hoc queries

    A model for querying semistructured data through the exploitation of regular sub-structures

    Get PDF
    Much research has been undertaken in order to speed up the processing of semistructured data in general and XML in particular. Many approaches for storage, compression, indexing and querying exist, e.g. [1, 2]. We do not present yet another such algorithm but a unifying model in which these algorithm can be understood. The key idea behind this research is the assumption, that most practical queries are based on a particular pattern of data that can be deduced from the query and which can then be captured using a regular structure amendable to efficient processing techniques
    corecore