189 research outputs found
Effectively Maintaining Single View Consistency in Web Warehouses
A web warehouse provides high availability and efficiency by using materialized webviews, which must be refreshed promptly to stay fresh. During refreshing, the consistency between a webview and its base data, formally named single view consistency (SVC), must be guaranteed. Because base data changes in a web warehousing environment do not propagate from the data sources to the information consumers, unlike in traditional data warehouses, new maintenance methods are needed. In this paper we first define SVC, and then present an algorithm, RCA, to maintain SVC, as well as an effective base data change detection method, SAA. We illustrate that RCA and SAA can guarantee SVC and that they are effective in the web environment. © 2005 IEEE.
An overview of data warehouse design approaches and techniques
A Data Warehouse (DW) is a database that stores information intended to support decision-making requests. It is a database with particular features concerning the data it contains and how it is used. These features make the DW design process and strategies different from those for OLTP systems. This work presents a brief description of different approaches and techniques that address the DW design problem.
Incremental maintenance of materialized views in data warehousing environments
Advisor: Silvia Regina Vergilio. Dissertation (master's) - Universidade Federal do Paraná. Abstract: A data warehouse is a repository of data collected from distributed, autonomous, and heterogeneous data sources. Data warehousing technology has been used in Decision Support Systems (DSS) to assist decision-making processes and identify market trends. The data warehouse stores one or more materialized views of the source data. The quality of decision making in a DSS depends on correctly propagating the updates that occur at the data sources to the materialized views in the data warehouse. Maintaining data consistency depends on this propagation, which is generally a complex process. In recent years, algorithms for the incremental maintenance of materialized views in data warehouses have stood out as an important approach to the problem. A comparative study of these algorithms was carried out and, as a result, a new algorithm, named SVM (Algorithm for Scheduling Warehouse View Maintenance), is proposed here. This algorithm combines the positive aspects of the algorithms studied. Its main advantage is that it defines time intervals for propagating source updates to the data warehouse. The main implementation aspects of SVM are discussed, and a case study made up of different situations that show how it works is presented.
An intelligent inspection and survey robot. Volume 1
ARIES #1 (Autonomous Robotic Inspection Experimental System) has been developed for the Department of Energy to survey and inspect drums containing low-level radioactive waste stored in warehouses at DOE facilities. The drums are typically stacked four high and arranged in rows with three-foot aisle widths. The robot will navigate through the aisles and perform an inspection operation, typically performed by a human operator, making decisions about the condition of the drums and maintaining a database of pertinent information about each drum. A new version of the Cybermotion series of mobile robots is the base mobile vehicle for ARIES. The new Model K3A consists of an improved and enhanced mobile platform and a new turret that will permit turning around in a three-foot aisle. Advanced sonar and lidar systems were added to improve navigation in the narrow drum aisles. Onboard computer enhancements include a VMEbus computer system running the VxWorks real-time operating system. A graphical offboard supervisory UNIX workstation is used for high-level planning, control, monitoring, and reporting. A camera positioning system (CPS) includes primitive instructions for the robot to use in referencing and positioning the payload. The CPS retracts to a more compact position when traveling in the open warehouse. During inspection, the CPS extends up to deploy inspection packages at different heights on the four-drum stacks of 55-, 85-, and 110-gallon drums. The vision inspection module performs a visual inspection of the waste drums. This system will locate and identify each drum, locate any unique visual features, characterize relevant surface features of interest, and update a database containing the inspection data.
Design, implementation and realization of an integrated platform dedicated to e-public health, for analysing health data and supporting the management control in healthcare companies.
In healthcare, information is fundamental and the human body is the major source of every kind of data: the challenge is to benefit from this huge amount of unstructured data by applying technological solutions, called Big Data Analysis, that allow data to be managed and information to be extracted through computer systems. This thesis aims to introduce a technological solution made up of two open source platforms: Power BI and Knime Analytics Platform. First, the importance, role, and processes of business intelligence and machine learning in healthcare will be discussed; second, the platforms will be described, with particular emphasis on their feasibility and capabilities. Then the clinical specialties in which they have been applied will be shown, highlighting the international literature that has been produced: neurology, cardiology, oncology, fetal monitoring, and others. An application to the current SARS-CoV-2 pandemic will be described using more than 50000 records: a cascade of 3 platforms helping health facilities deal with the worldwide pandemic. Finally, the advantages, disadvantages, limitations, and future developments in this framework will be discussed, and the architectural solution, containing a data warehouse, a platform to collect data, two platforms to analyse health and management data, and the possible applications, will be shown.
Efficient Incremental View Maintenance for Data Warehousing
Data warehousing and on-line analytical processing (OLAP) are essential elements for decision support applications. Since most OLAP queries are complex and are often executed over huge volumes of data, the solution in practice is to employ materialized views to improve query performance. One important issue for utilizing materialized views is to maintain the view consistency upon source changes. However, most prior work focused on simple SQL views with distributive aggregate functions, such as SUM and COUNT. This dissertation proposes to consider broader types of views than previous work. First, we study views with complex aggregate functions such as variance and regression. Such statistical functions are of great importance in practice. We propose a workarea function model and design a generic framework to tackle incremental view maintenance and answering queries using views for such functions. We have implemented this approach in a prototype system of IBM DB2. An extensive performance study shows significant performance gains by our techniques. Second, we consider materialized views with PIVOT and UNPIVOT operators. Such operators are widely used for OLAP applications and for querying sparse datasets. We demonstrate that the efficient maintenance of views with PIVOT and UNPIVOT operators requires more generalized operators, called GPIVOT and GUNPIVOT. We formally define and prove the query rewriting rules and propagation rules for such operators. We also design a novel view maintenance framework for applying these rules to obtain an efficient maintenance plan. Extensive performance evaluations reveal the effectiveness of our techniques. Third, materialized views are often integrated from multiple data sources. Due to source autonomicity and dynamicity, concurrency may occur during view maintenance. We propose a generic concurrency control framework to solve such maintenance anomalies. 
This solution extends previous work in that it solves the anomalies under both source data and schema changes and thus achieves full source autonomicity. We have implemented this technique in a data warehouse prototype developed at WPI. The extensive performance study shows that our techniques put little extra overhead on existing concurrent data update processing techniques while allowing for this new functionality.
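To make the idea of incrementally maintaining a non-distributive aggregate concrete, here is a minimal sketch, not the dissertation's workarea function model or its DB2 implementation. It assumes the view keeps three distributive components per group (count, sum, sum of squares) and recomputes the population variance from them after each source change. The class name `VarianceState` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VarianceState:
    """Hypothetical auxiliary state stored alongside a materialized view row."""
    n: int = 0        # COUNT(x)
    s: float = 0.0    # SUM(x)
    ss: float = 0.0   # SUM(x * x)

    def insert(self, x: float) -> None:
        """Fold one inserted source value into the state."""
        self.n += 1
        self.s += x
        self.ss += x * x

    def delete(self, x: float) -> None:
        """Remove one previously inserted source value from the state."""
        self.n -= 1
        self.s -= x
        self.ss -= x * x

    def variance(self) -> float:
        """Population variance derived from the three stored components.

        Note: the sum-of-squares form can lose precision on data with a
        large mean and a small spread; it is used here only for brevity.
        """
        if self.n == 0:
            raise ValueError("variance of an empty group is undefined")
        mean = self.s / self.n
        return self.ss / self.n - mean * mean


# Build the state from an initial load of source values.
state = VarianceState()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    state.insert(x)
```

Later inserts and deletes at the source touch only the three stored numbers, so the view can be refreshed without rescanning the base data.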
A system for data warehouse maintenance
A data warehouse is a database designed to support the decision-making process in an organization. A data warehouse system integrates, in a single repository, historical information from various operational data sources inside or outside the organization. For the data warehouse to remain a faithful reflection of the organization it serves, it must be updated periodically. This process can consume many resources and, in some cases, make the data warehouse unavailable to users. In organizations where the system must be available to analysts at all times, warehouse maintenance becomes a critical point of the system. For this reason, efficient data warehouse maintenance strategies have received researchers' attention since the technology first appeared. Data warehouse maintenance is carried out in three phases: extraction of data from the sources, transformation of the data, and updating of the warehouse. This thesis addresses the transformation phase and, principally, the update phase. For the transformation phase, a system was developed that performs moderate data cleaning, format integration, and semantic integration. The main work, however, focused on the update phase, for which two algorithms were defined and implemented that update the data warehouse incrementally and online, that is, without making the data warehouse unavailable during maintenance. The algorithms are based on a multiversion strategy that can keep an unlimited number of versions of the updated data, so that users can keep accessing the same version of the warehouse while it is being updated.
García Gerardo, C. (2008). Un sistema para el mantenimiento de almacenes de datos [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/2505 Palanci
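The multiversion strategy described in this abstract can be illustrated with a small sketch. This is not one of the thesis's two algorithms, only an assumed minimal design: each incremental update publishes a new snapshot, and a reader keeps using the version number it obtained when its session began. The class `MultiVersionStore`, its methods, and the sample keys are all hypothetical.

```python
import threading


class MultiVersionStore:
    """Hypothetical store in which every refresh publishes a new snapshot."""

    def __init__(self, initial: dict):
        self._lock = threading.Lock()
        self._versions = {0: dict(initial)}  # version number -> snapshot
        self._latest = 0

    def begin_read(self) -> int:
        """Return the version a reader session should keep using."""
        with self._lock:
            return self._latest

    def read(self, version: int, key):
        """Read a key as of the given version; None if it did not exist yet."""
        with self._lock:
            return self._versions[version].get(key)

    def apply_delta(self, delta: dict) -> int:
        """Create the next version from the latest snapshot plus a delta.

        Copying the whole snapshot keeps the sketch short; a real system
        would store only the changed rows per version.
        """
        with self._lock:
            snapshot = dict(self._versions[self._latest])
            snapshot.update(delta)
            self._latest += 1
            self._versions[self._latest] = snapshot
            return self._latest


store = MultiVersionStore({"sales_2007": 100})
v = store.begin_read()                                   # analyst session starts
store.apply_delta({"sales_2007": 120, "sales_2008": 30})  # online refresh
old_value = store.read(v, "sales_2007")                  # still the pinned version
new_value = store.read(store.begin_read(), "sales_2007")  # a new session sees the refresh
```

Because old versions are never discarded in this sketch, a session that started before a refresh keeps seeing consistent data for as long as it runs.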