    Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses

    A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data is ignored: whether, for example, a total sales amount of 1,000 items indicates good or bad sales performance remains unclear. From the decision makers' point of view, the semantics conveying the meaning of the data matter more than the raw numbers. In this paper, we explore the use of fuzzy technology to provide such semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses.
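    The qualitative layer of such a model can be pictured as fuzzy membership functions over a numeric aggregate. The sketch below is a minimal illustration of that idea, not the paper's model: the term names ("poor", "average", "good") and the trapezoid breakpoints are invented for the sales example above.

```python
# A minimal sketch (not the paper's implementation) of qualitative
# summarization: a numeric aggregate such as a total sales amount is
# mapped to linguistic labels via fuzzy membership functions. The
# labels and breakpoints below are hypothetical.

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 below a, ramps up to 1 on [b, c], 0 above d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical linguistic terms for monthly sales (in items sold).
SALES_TERMS = {
    "poor":    lambda x: trapezoid(x, -1, 0, 400, 800),
    "average": lambda x: trapezoid(x, 400, 800, 1200, 1600),
    "good":    lambda x: trapezoid(x, 1200, 1600, 10**9, 10**9 + 1),
}

def qualitative_summary(value):
    """Return each label with its membership degree, strongest first."""
    degrees = {term: f(value) for term, f in SALES_TERMS.items()}
    return sorted(degrees.items(), key=lambda kv: -kv[1])

# Under these made-up breakpoints, 1,000 items maps most strongly to "average".
print(qualitative_summary(1000))
```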

    Content warehouses

    Nowadays, content management systems are an established technology. Based on experiences from several application scenarios, we discuss the points of contact between content management systems and other disciplines of information systems engineering, such as data warehouses, data mining, and data integration. We derive a system architecture called a "content warehouse" that integrates these technologies and defines a more general and more sophisticated view of content management. As an example, a system for the collection, maintenance, and evaluation of biological content, such as survey data and multimedia resources, is presented as a case study.

    ViewDF: a Flexible Framework for Incremental View Maintenance in Stream Data Warehouses

    Increasing data sizes and demands for low latency in modern data analysis push traditional data warehousing technologies far beyond their limits. Several stream data warehouse (SDW) systems, i.e., warehouses that ingest append-only data feeds and support frequent refresh cycles, have been proposed, each with different methods for improving responsiveness. Materialized views are critical in large-scale data warehouses because of their ability to speed up queries, so an SDW maintains layers of materialized views. View maintenance in SDW systems introduces new challenges, yet some existing SDW systems do not address it at all, while others employ inefficient techniques. This thesis presents ViewDF, a flexible framework for incremental maintenance of materialized views in SDW systems that generalizes existing techniques and enables new optimizations for views defined with operators common in stream analytics. We give a special view definition (ViewDF) that enhances the traditional way of creating views in SQL by being able to reference any partition of any table. We describe a prototype system based on this idea, which allows users to write ViewDFs directly and can automatically translate a broad class of queries into ViewDFs. Several optimizations are proposed, and experiments show that the proposed system can improve view maintenance time by a factor of two or more in practical settings.
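    The core idea, a view that can reference individual partitions so that an append-only batch triggers only a partition-local refresh, can be sketched in a few lines. The sketch below illustrates that principle only; it reproduces neither the ViewDF syntax nor the prototype.

```python
# A minimal sketch of partition-scoped incremental view maintenance in
# the spirit of ViewDF. The base table is partitioned (e.g., by time
# window); an append-only batch touches one partition, so only that
# partition's slice of the materialized view is refreshed.
from collections import defaultdict

class PartitionedSumView:
    """Materializes SUM(value) GROUP BY key, maintained per partition."""

    def __init__(self):
        self.base = defaultdict(list)                         # partition -> rows
        self.view = defaultdict(lambda: defaultdict(float))   # partition -> key -> sum

    def append(self, partition, rows):
        """Ingest an append-only batch and refresh only its partition."""
        self.base[partition].extend(rows)
        slice_ = self.view[partition]
        for key, value in rows:          # delta maintenance: fold in new rows only
            slice_[key] += value

    def query(self, partitions):
        """Merge the per-partition slices the query actually references."""
        out = defaultdict(float)
        for p in partitions:
            for key, s in self.view[p].items():
                out[key] += s
        return dict(out)

v = PartitionedSumView()
v.append("2024-06-01", [("sensor_a", 3.0), ("sensor_b", 1.5)])
v.append("2024-06-02", [("sensor_a", 2.0)])
print(v.query(["2024-06-01", "2024-06-02"]))  # {'sensor_a': 5.0, 'sensor_b': 1.5}
```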

    Incremental Processing and Optimization of Update Streams

    Over recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring that monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. In general, to support genuine continuous query optimization and processing over data streams, we need to systematically understand how to address incremental optimization and processing of update streams for a rich class of queries commonly used in these applications. Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques, if we cast the problems as incremental view maintenance problems over data streams. We focus on two challenges in the incremental processing of update streams that are not addressed in existing work on stream query processing: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system, Aspen, which serves as an end-to-end stream processing system and has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of Aspen, over a variety of benchmark workloads such as the TPC-H and Linear Road benchmarks.
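    For the transitive closure case, the view-maintenance framing has a classic delta rule: when an edge (u, v) arrives, the new reachability pairs combine every source that reaches u with every target reachable from v. The sketch below shows this textbook rule in isolation; it is not the Aspen implementation.

```python
# A minimal sketch of incremental transitive closure over an append-only
# edge stream, cast as incremental view maintenance: each insert yields
# a delta that downstream views could consume.

def insert_edge(closure, u, v):
    """Maintain the reachability relation `closure` under insert of (u, v).

    New pairs are exactly {(x, y) : x reaches u or x == u,
    and v reaches y or y == v} minus what is already present.
    """
    if (u, v) in closure:
        return set()
    sources = {x for (x, t) in closure if t == u} | {u}
    targets = {y for (s, y) in closure if s == v} | {v}
    delta = {(x, y) for x in sources for y in targets} - closure
    closure |= delta
    return delta                     # the increment, usable by downstream views

closure = set()
for edge in [("a", "b"), ("b", "c"), ("c", "d")]:   # the update stream
    print(edge, "->", sorted(insert_edge(closure, *edge)))
# ('c', 'd') yields the delta {('a','d'), ('b','d'), ('c','d')}
```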

    Acta Cybernetica: Volume 16, Number 1.


    A survey of temporal knowledge discovery paradigms and methods

    With the increase in the size of data sets, data mining has recently become an important research topic and is receiving substantial interest from both academia and industry. At the same time, interest in temporal databases has been increasing, and a growing number of both prototype and implemented systems use an enhanced temporal understanding to explain aspects of behavior associated with the implicit time-varying nature of the universe. This paper investigates the confluence of these two areas, surveys the work to date, and explores the issues involved and the outstanding problems in temporal data mining.

    Towards Prescriptive Analytics in Cyber-Physical Systems

    More and more of our physical world today is being monitored and controlled by so-called cyber-physical systems (CPSs). These are compositions of networked autonomous cyber and physical agents such as sensors, actuators, computational elements, and humans in the loop. Today, CPSs are still relatively small-scale and very limited compared to the CPSs to be witnessed in the future. Future CPSs are expected to be far more complex, large-scale, widespread, and mission-critical, and to be found in a variety of domains such as transportation, medicine, manufacturing, and energy, where they will bring many advantages such as increased efficiency, sustainability, reliability, and security. To unleash their full potential, CPSs need to be equipped with, among other features, support for automated planning and control, where computing agents collaboratively and continuously plan and control their actions in an intelligent and well-coordinated manner to secure and optimize a physical process, e.g., electricity flow in the power grid. In today's CPSs, the control is typically automated, but the planning is performed solely by humans. Unfortunately, it is intractable and infeasible for humans to plan every action in a future CPS, owing to the complexity, scale, and volatility of the physical process. Due to these properties, control and planning have to be continuous and automated in future CPSs. Humans may only analyse and tweak the system's operation using tools for prescriptive analytics, which allow them (1) to make predictions, (2) to obtain suggestions for the most promising set of actions (decisions) to be taken, and (3) to analyse the implications as if such actions were taken. This thesis considers planning and control in the context of a large-scale multi-agent CPS. Based on a smart-grid use case, it presents the so-called PrescriptiveCPS, (the conceptual model of) a multi-agent, multi-role, and multi-level CPS that automatically and continuously takes and realizes decisions in near real-time and provides (human) users with prescriptive analytics tools to analyse and manage the performance of the underlying physical system (or process). Acknowledging the complexity of CPSs, this thesis provides contributions at the following three levels of scale: (1) the level of a (full) PrescriptiveCPS, (2) the level of a single PrescriptiveCPS agent, and (3) the level of a component of a CPS agent software system. At the CPS level, the contributions include the definition of PrescriptiveCPS as a system of interacting physical and cyber (sub-)systems. Here, the cyber system consists of hierarchically organized, interconnected agents that collectively manage instances of so-called flexibility, decision, and prescription models, which are short-lived, focus on the future, and represent, respectively, a capability, a (user's) intention, and actions to change the behaviour (state) of a physical system. At the agent level, the contributions include the three-layer architecture of an agent software system, integrating a number of components specially designed or enhanced to support the functionality of PrescriptiveCPS. Most of the thesis contributions are provided at the component level.
The contributions include the description, design, and experimental evaluation of (1) a unified multi-dimensional schema for storing flexibility and prescription models (and related data), (2) techniques to incrementally aggregate flexibility model instances and disaggregate prescription model instances, (3) a database management system (DBMS) with built-in optimization problem solving capability, allowing optimization problems to be formulated as SQL-like queries and solved "inside the database", (4) a real-time data management architecture for processing instances of flexibility and prescription models under (soft or hard) timing constraints, and (5) a graphical user interface (GUI) to visually analyse flexibility and prescription model instances. Additionally, the thesis discusses and exemplifies (but provides no evaluations of) (1) domain-specific and in-DBMS generic forecasting techniques for forecasting instances of flexibility models based on historical data, and (2) powerful ways to analyse the past, the present, and the future based on so-called hypothetical what-if scenarios and the flexibility and prescription model instances stored in a database. Most of the contributions at this level are based on the smart-grid use case. In summary, the thesis provides (1) the model of a CPS with planning capabilities, (2) the design and experimental evaluation of prescriptive analytics techniques for effectively forecasting, aggregating, disaggregating, visualizing, and analysing complex models of the physical world, and (3) a use case from the energy domain showing how the introduced concepts are applicable in the real world. We believe that these contributions constitute a significant step towards developing planning-capable CPSs in the future.
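    Contribution (2), the incremental aggregation of flexibility model instances and the disaggregation of prescription model instances, can be illustrated with a heavily simplified representation. In the sketch below, a flexibility instance is reduced to a pair of min/max energy bounds; this is an invented simplification for illustration, not the thesis's actual model schema.

```python
# A heavily simplified sketch: flexibility instances are aggregated
# bottom-up, and a prescription made against the aggregate is
# disaggregated back to the individual instances so that every
# per-device bound is respected.

def aggregate(flex_instances):
    """Sum per-device [min, max] energy bounds into one aggregate instance."""
    lo = sum(f["min_kwh"] for f in flex_instances)
    hi = sum(f["max_kwh"] for f in flex_instances)
    return {"min_kwh": lo, "max_kwh": hi}

def disaggregate(prescription_kwh, flex_instances):
    """Split an aggregate prescription across devices, proportionally to
    each device's slack above its minimum."""
    agg = aggregate(flex_instances)
    assert agg["min_kwh"] <= prescription_kwh <= agg["max_kwh"]
    surplus = prescription_kwh - agg["min_kwh"]
    slack = sum(f["max_kwh"] - f["min_kwh"] for f in flex_instances)
    share = surplus / slack if slack else 0.0
    return [f["min_kwh"] + share * (f["max_kwh"] - f["min_kwh"])
            for f in flex_instances]

fleet = [{"min_kwh": 1.0, "max_kwh": 5.0},   # e.g., an EV charger
         {"min_kwh": 0.0, "max_kwh": 2.0}]   # e.g., a heat pump
print(aggregate(fleet))          # {'min_kwh': 1.0, 'max_kwh': 7.0}
print(disaggregate(4.0, fleet))  # [3.0, 1.0] -- sums back to 4.0
```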

    Formal design of data warehouse and OLAP systems : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University, Palmerston North, New Zealand

    A data warehouse is a single data store where data from multiple data sources is integrated for online analytical processing (OLAP) across an entire organisation. The rationale for being single and integrated is to ensure a consistent view of organisational business performance, independent of the angle of the business perspective. Due to its wide coverage of subjects, data warehouse design is a highly complex, lengthy, and error-prone process. Furthermore, business analytical tasks change over time, which results in changes to the requirements for the OLAP systems. Data warehouse and OLAP systems are therefore rather dynamic, and the design process is continuous. In this thesis, we propose a method that is integrated, formal, and application-tailored, to overcome the complexity problem, deal with the system dynamics, and improve the quality of the system and its chance of success. Our method comprises three important parts: the general ASM method with types, the application-tailored design framework for data warehouses and OLAP, and the schema integration method with a set of provably correct refinement rules. By using the ASM method, we are able to model both data and operations in a uniform conceptual framework, which enables us to design an integrated approach to data warehouse and OLAP design. The freedom given by the ASM method allows us to model the system at an abstract level that is easy to understand for both users and designers. More specifically, the language allows us to use terms from the user domain, not biased by the terms used in computer systems. The pseudo-code-like transition rules, which give the simplest form of operational semantics in ASMs, are close enough to programming languages for designers to understand, and they are rooted in mathematics, which assists in improving the quality of the system design. By extending the ASMs with types, the modelling language is tailored for data warehousing with terms that are well developed for data-intensive applications, which makes it easy to model schema evolution as refinements in dynamic data warehouse design. By providing the application-tailored design framework, we break down the design complexity by business processes (also called subjects in data warehousing) and by design concerns. By designing the data warehouse by subjects, our method resembles Kimball's "bottom-up" approach; with the schema integration method, however, our method resolves the stovepipe issue of that approach. By building up a data warehouse iteratively in an integrated framework, our method not only results in an integrated data warehouse but also resolves the issues of complexity and delayed ROI (return on investment) in Inmon's "top-down" approach. By dealing with user change requests in the same way as new subjects, and by modelling data and operations explicitly in a three-tier architecture, namely the data sources, the data warehouse, and the OLAP (online analytical processing) tier, our method facilitates dynamic design with system integrity. By introducing a notion of refinement specific to schema evolution, namely schema refinement, which captures the notion of schema dominance in schema integration, we are able to build a set of correctness-proven refinement rules. By providing this set of refinement rules, we simplify the designers' work of verifying the correctness of a design.
    Nevertheless, we do not aim for a complete set of rules, since there are many different ways to integrate schemata, nor do we prescribe a single way of integration, so as to allow designer-favoured designs. Furthermore, given its flexibility, our method can easily be extended to new and emerging design issues.
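    To give a flavour of why pseudo-code-like transition rules read naturally to designers, the sketch below renders the ASM idea of guarded, parallel updates over an abstract state in plain Python. It is a toy illustration only: the thesis's typed ASM notation and its refinement rules are not reproduced, and the schema-evolution example is invented.

```python
# A toy rendering of an ASM-style transition rule: the state is a set of
# locations, each rule contributes an update when its guard holds, and
# all updates fire in parallel against the *old* state.

def step(state, rules):
    """One ASM step: collect every update whose guard holds on the current
    state, then apply all of them simultaneously."""
    updates = {}
    for guard, location, value in rules:
        if guard(state):
            updates[location] = value(state)   # reads the old state only
    next_state = dict(state)
    next_state.update(updates)
    return next_state

# A made-up schema-evolution flavoured example: once a change request is
# approved, the warehouse schema gains the requested attribute and the
# request is closed -- both in the same step.
rules = [
    (lambda s: s["request"] == "approved",
     "schema",
     lambda s: s["schema"] | {s["new_attribute"]}),
    (lambda s: s["request"] == "approved",
     "request",
     lambda s: "done"),
]

state = {"schema": {"store", "month", "sales"},
         "new_attribute": "region",
         "request": "approved"}
print(step(state, rules))
# schema now includes 'region', and request is 'done'
```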