611 research outputs found

    Semantics and efficient evaluation of partial tree-pattern queries on XML

    Current applications export and exchange XML data on the web. Usually, XML data are queried using keyword queries or using the standard structured query language XQuery, the core of which consists of the navigational query language XPath. In this context, one major challenge is the querying of the data when the structure of the data sources is complex or not fully known to the user. Another challenge is the integration of multiple data sources that export data with structural differences and irregularities. In this dissertation, a query language for XML called the Partial Tree-Pattern Query (PTPQ) language is considered. PTPQs generalize and strictly contain Tree-Pattern Queries (TPQs) and can express a broad structural fragment of XPath. Because of their expressive power and flexibility, they are useful for querying XML documents whose structure is complex or not fully known to the user, and for integrating XML data sources with different structures. The dissertation focuses on three issues. The first one is the design of efficient non-main-memory evaluation methods for PTPQs. The second one is the assignment of semantics to PTPQs so that they return meaningful answers. The third one is the development of techniques for answering TPQs using materialized views. Non-main-memory XML query evaluation can be done in two modes (which also define two evaluation models). In the first mode, the data are preprocessed and indexes, called inverted lists, are built for them. In the second mode, the data are unindexed and arrive continuously in the form of a stream. Existing algorithms cannot be used directly or indirectly to efficiently compute PTPQs in either mode. Initially, the problem of efficiently evaluating partial path queries in the inverted lists model has been addressed. Partial path queries form a subclass of PTPQs which is not contained in the class of TPQs. Three novel algorithms for evaluating partial path queries, including a holistic one, have been designed.
The analytical and experimental results show that the holistic algorithm outperforms the other two. These results have been extended into holistic and non-holistic approaches for PTPQs in the inverted lists model. The experiments again show the superiority of the holistic approach. The dissertation has also addressed the problem of evaluating PTPQs in the streaming model, and two original efficient streaming algorithms for PTPQs have been designed. Compared to the only known streaming algorithm that supports an extension of TPQs, the experimental results show that the proposed algorithms perform better by orders of magnitude while consuming a much smaller fraction of memory space. An original approach for assigning semantics to PTPQs has also been devised. The novel semantics seamlessly applies to keyword queries and to queries with structural restrictions. In contrast to previous approaches that operate locally on data, the proposed approach operates globally on structural summaries of data to extract tree patterns. An experimental evaluation shows that, compared to previous approaches, our approach achieves perfect recall for XML documents with both complete and incomplete data. It also shows better precision compared to approaches with similar recall. Finally, the dissertation has addressed the problem of answering XML queries using exclusively materialized views. An original approach for materializing views in the context of the inverted lists model has been suggested. Necessary and sufficient conditions have been provided for tree-pattern query answerability in terms of view-to-query homomorphisms. A time- and space-efficient algorithm was designed for deciding query answerability, and a technique for computing queries over view materializations using stack-based holistic algorithms was developed.
Further, optimizations were developed which (a) minimize the storage space and avoid redundancy by materializing views as bitmaps, and (b) optimize the evaluation of the queries over the views by applying bitwise operations on view materializations. The experimental results show that the proposed approach obtains much higher hit rates than previous approaches, significantly speeds up query evaluation compared with evaluation without views, and scales very smoothly in terms of storage space and computational overhead.
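The bitmap optimization mentioned in the abstract above can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it only assumes that each view is stored as a bitmap over the positions of one inverted list, so that combining views reduces to a bitwise AND. The helper names `to_bitmap`, `from_bitmap`, and `intersect_views` are hypothetical.

```python
# Hypothetical sketch: a view materialization is a bitmap over the positions
# of an inverted list; bit i is set when the node at position i is in the view.

def to_bitmap(member_positions):
    """Encode a collection of list positions as an integer bitmap."""
    bitmap = 0
    for pos in member_positions:
        bitmap |= 1 << pos
    return bitmap

def from_bitmap(bitmap):
    """Decode a bitmap back into a sorted list of positions."""
    positions = []
    pos = 0
    while bitmap:
        if bitmap & 1:
            positions.append(pos)
        bitmap >>= 1
        pos += 1
    return positions

def intersect_views(bitmaps):
    """Combine several view bitmaps with a bitwise AND."""
    result = bitmaps[0]
    for b in bitmaps[1:]:
        result &= b
    return result

view_a = to_bitmap([0, 2, 3, 5])
view_b = to_bitmap([2, 3, 4])
common = from_bitmap(intersect_views([view_a, view_b]))
```

Storing one bit per list position avoids copying node entries into every view, and the intersection touches machine words rather than individual nodes.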

    What Makes Data Possible? A Sociotechnical View on Structured Data Innovations

    Drawing from the theory of digital objects, this paper examines the distinction between structured and unstructured data as carriers of facts. We argue that data do not ‘have’ a structure but are made by a structure that confers on data their capacity to represent contextual facts. We employ a case vignette involving XBRL (eXtensible Business Reporting Language) and its use in statutory financial reporting to illustrate and explore the sociotechnical nature of data and to describe what we call data innovations: new valuable ways to render phenomena as data. We find that data structure is best viewed as a matter that is relative to a purpose in a context. Theorizing data from a sociotechnical perspective could evolve to provide, in effect, the material science of the digital economy.

    Design and implementation of XML-based Linux file system runner

    This thesis presents the design and implementation of the XML-based Linux File System Runner (XML_LFS), a file system simulator that integrates the representation ability of Extensible Markup Language (XML) with the beauty of the Linux file system architecture. XML_LFS uses a layered approach to design a generic file system runner from scratch using the Java programming language and JDOM. The hierarchical directory structure of the file system is kept in an XML file for easy manipulation as well as on disk for crash recovery. UNIX-like file systems such as the Second Extended File System (Ext2), a native mini file system (mini3fs) and Linux kernel code for file system operations are explored for the real implementation work. A traditional file system consists of a hierarchical tree composed of directories and files. Each directory can contain both files and subdirectories. This concept is equivalent to semi-structured elements in XML. Embedding an XML log file layer into the Linux file system architecture can speed up directory lookup by combining the power of XML and XQuery, and can eliminate the limitations of the existing fixed-attribute file system model by treating files as elements of a customizable XML document. Thus, the whole development environment is more useful for future file system research. The future of XML file systems is discussed in detail. The complete system architecture and functionality are built, and the process is described in the thesis. Initial Bonnie-like and Andrew-like benchmarks of the prototype implementation show that XML_LFS achieves the expected performance results.

    Virtual Worlds and Conservational Channel Evolution and Pollutant Transport Systems (Concepts)

    Many models exist that predict channel morphology. Channel morphology is defined as the change in the geometric parameters of a river. Channel morphology is affected by many factors, some caused by man and others by nature. To combat the adverse effects that man and nature may have on a water system, scientists and engineers develop stream rehabilitation plans. Shields et al. define stream rehabilitation as the return of a degraded ecosystem to a close approximation of its remaining natural potential [Shields et al., 2003]. Engineers construct plans that will restore streams to their natural state by using techniques such as field investigation, analytical models, or numerical models. Each of these techniques is applied to projects based on specified criteria, objectives, and the expertise of the individuals devising the plan. The use of analytical and numerical models can be difficult for many reasons, one of which is how unintuitive the modeling process can be. Many numerical models exist in the fields of hydraulic engineering, fluvial geomorphology, landscape architecture, and stream ecology that evaluate and formulate stream rehabilitation plans. This dissertation will explore, in the field of Hydroscience, the creation of models that are not only accurate but also span the different disciplines. The goal of this dissertation is to transform a discrete numerical model (CONCEPTS) into a realistic 3D environment using open source game engines, while at the same time conveying at least the equivalent information that was presented in the 1D numerical model.

    Binary RDF for Scalable Publishing, Exchanging and Consumption in the Web of Data

    The current data deluge is flooding the web with large volumes of data represented in RDF, giving rise to the so-called 'Web of Data'. In this thesis we propose, first, an in-depth study of those texts that allow us to gain a global understanding of the real structure of RDF datasets, and HDT, which addresses the efficient representation of large volumes of RDF data through structures optimized for storage and network transmission. HDT efficiently represents an RDF dataset by dividing it into three components: the header (Header), the dictionary (Dictionary) and the structure of RDF statements (Triples). Next, we focus on providing efficient structures for these components, occupying compressed space while allowing direct access to any data
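The three-way split named in the abstract above (Header, Dictionary, Triples) can be illustrated with a toy sketch. This is not the actual HDT binary format, only an assumption-laden illustration of the idea: a dictionary maps each RDF term to an integer ID, the triples are stored as ID tuples, and a small header records metadata. The functions `encode` and `decode` and the `ex:` terms are hypothetical.

```python
# Toy illustration of a header / dictionary / triples split.
# NOT the real HDT serialization; all names here are hypothetical.

def encode(triples):
    """Split a list of (subject, predicate, object) strings into three parts."""
    terms = sorted({term for triple in triples for term in triple})
    # Dictionary: each distinct term gets a compact integer ID (starting at 1).
    dictionary = {term: i + 1 for i, term in enumerate(terms)}
    # Triples: the graph structure, expressed purely with integer IDs.
    id_triples = sorted(
        tuple(dictionary[term] for term in triple) for triple in triples
    )
    # Header: descriptive metadata about the dataset.
    header = {"num_triples": len(id_triples), "num_terms": len(terms)}
    return header, dictionary, id_triples

def decode(dictionary, id_triples):
    """Rebuild the string triples from the dictionary and the ID triples."""
    reverse = {i: term for term, i in dictionary.items()}
    return [tuple(reverse[i] for i in triple) for triple in id_triples]

data = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:alice"),
]
header, dictionary, id_triples = encode(data)
restored = decode(dictionary, id_triples)
```

Long, repeated terms are stored once in the dictionary, so the triples component only carries small integers.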

    Study and Development of Cross-Platform Cloogy Mobile Application for VPS – Virtual Power Solutions.

    Renewable energy and energy conservation have become important topics in recent years. Companies have made efforts to reduce energy consumption by optimizing devices and raising consumers' awareness of their usage. To contribute to this effort, Virtual Power Solutions (VPS) provides a solution in which building owners and users gain real-time visibility and control of the electrical appliances installed in their homes. VPS has successfully brought together demand management and building automation technology in a single mobile application called Cloogy. This application gives energy consumers and their partners the ability to check and control energy consumption in real time, allowing them to reduce consumption to a minimum without compromising day-to-day operations. Currently, Cloogy has mobile applications available for Android, iOS and Windows Phone with similar functionality. However, each application requires a different programming language for its platform, which entails a cost to maintain these different platforms. For this reason, for the present thesis, VPS set the goal of developing a hybrid mobile application that is built on a single code base and has access to all the platform APIs. Different types of development tools are available for building a hybrid application. After defining the functional and non-functional requirements, a hybrid application prototype was built using the Ionic Framework, one of the open-source frameworks available for building hybrid mobile applications. With the help of this framework, a mobile application can be created using a set of web technologies, such as JavaScript, HTML and CSS, and deployed on all major platforms, such as Android and iOS.
The prototype allows us to access consumption data through our smartphone or tablet from a remote location with the help of VPS's iEnergy3 API. The main features offered by the prototype are the monitoring of energy consumption through logs and real-time data, and the checking of consumption indicators such as performance, daily average, forecasts, etc. The prototype also provides ecological footprints, together with consumption indicators, and can control and schedule periods of electricity consumption from a remote location.

    Integrating TrustZone Protection with Communication Paths for Mobile Operating System

    Nowadays, users perform various essential activities through their smartphones, including mobile payments and financial transactions. Therefore, users’ sensitive data processed by smartphones will be at risk if the underlying mobile OSes are compromised. A technology called Trusted Execution Environment (TEE) has been introduced to protect sensitive data in the event of a compromised OS and hypervisor. This dissertation points out the limitations of the current design model of mobile TEE, which has a low adoption rate among application developers and a large Trusted Computing Base (TCB). It proposes a new design model for mobile TEE to increase the TEE adoption rate and to decrease the size of the TCB. This dissertation applies the new model to protect mobile communication paths in the Android platform. Evaluations are performed to demonstrate the effectiveness of the proposed design model.

    KP-LAB Knowledge Practices Laboratory -- Specification of end-user applications

    The present deliverable provides a high-level view of the new specifications of end-user applications defined in the WPII during the M37-M46 period of the KP-Lab project. This is the last in the series of four deliverables that cover all the tools developed in the project, the previous ones being D6.1, D6.4 and D6.6. This deliverable presents specifications for the new functionalities for supporting the dedicated research studies defined in the latest revision of the KP-Lab research strategy. The tools addressed are: the analytic tools (Data export, Time-line-based analyser, Visual analyser), Clipboard, Search, Versioning of uploadable content items, Visual Model Editor (VME) and Visual Modeling Language Editor (VMLE). The main part of the deliverable provides the summary of tool specifications and the description of the Knowledge Practices Environment architecture, as well as an overview of the revised technical design process, of the tools’ relationship with the research studies, and of the driving objectives and the high-level requirements relevant for the present specifications. The full specifications of the tools are provided in Annexes 1-9.