
    Adding semantic modules to improve goal-oriented analysis of data warehouses using I-star

    The success rate of data warehouse (DW) development is improved by performing a requirements elicitation stage in which the users’ needs are modeled. Currently, among the different proposals for modeling requirements, there is a special focus on goal-oriented models, and in particular on the i* framework. In order to adapt this framework for DW development, we previously developed a UML profile for DWs. However, like the general i* framework, this proposal lacks modularity. This has an especially negative impact on DW development, since DW requirement models tend to include a huge number of elements with crossed relationships between them. In turn, the readability of the models decreases, harming their utility and increasing the error rate and development time. In this paper, we propose an extension of our i* profile for DWs that considers the modularization of goals. We provide a set of guidelines for correctly applying our proposal. Furthermore, we have performed an experiment in order to assess the validity of our proposal. The benefits of our proposal are an increase in the modularity and scalability of the models which, in turn, increases the error correction capability and makes complex models easier to understand for DW developers and non-expert users. This work has been partially supported by the ProS-Req (TIN2010-19130-C02-01), MESOLAP (TIN2010-14860) and SERENIDAD (PEII-11-0327-7035) projects from the Spanish Ministry of Education and the Junta de Comunidades de Castilla-La Mancha, respectively. Alejandro Maté is funded by the Generalitat Valenciana under an ACIF grant (ACIF/2010/298).
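
    The modularity argument can be illustrated with a small sketch (the goal and module names below are hypothetical, and this is not the authors' UML profile): grouping goals into modules and counting only the dependency links that cross module boundaries gives a rough proxy for the "crossed relationships" that harm readability.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Goal:
    name: str

@dataclass
class Module:
    name: str
    goals: set = field(default_factory=set)

def crossing_links(modules, links):
    """Count dependency links that cross module boundaries."""
    owner = {g: m.name for m in modules for g in m.goals}
    return sum(1 for a, b in links if owner[a] != owner[b])

# Hypothetical DW requirement goals, grouped into two modules
inc_sales = Goal("increase sales")
seg_cust = Goal("segment customers")
cut_cost = Goal("reduce costs")
opt_stock = Goal("optimise stock")

sales = Module("sales", {inc_sales, seg_cust})
logistics = Module("logistics", {cut_cost, opt_stock})

links = [(inc_sales, seg_cust), (cut_cost, opt_stock), (seg_cust, opt_stock)]
print(crossing_links([sales, logistics], links))  # only 1 link crosses modules
```

    The fewer links cross module boundaries, the more each module can be read in isolation, which is the readability benefit the abstract describes.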

    Public universities employees perception of electronic information sharing between universities and the Ministry of Higher Education and Scientific Research

    Electronic information sharing benefits organizations and institutions in various aspects, including increasing the level of information accuracy and timeliness, improving accountability and decision making, and minimizing the cost of information management. There is a high degree of information sharing between Iraqi public universities and the Ministry of Higher Education and Scientific Research (MOHESR); however, only limited electronic information sharing exists between them, which causes difficulties and delays in making decisions. This limitation also creates challenges and barriers to supporting the decentralization principle adopted by the public universities in university governance. Thus, there is a need for a study that identifies possible steps and strategies to increase electronic information sharing between the ministry and the universities. The main objective of this study is to propose a model of electronic information sharing between Iraqi public universities and MOHESR. Social Exchange Theory, Critical Mass Theory and Transactive Memory System Theory have been used to address the problem and achieve the objectives. Purposive sampling was used, and multiple linear regression analyses were applied for data analysis. A total of 660 questionnaires were distributed in five universities in Iraq, of which 274 (42%) were returned. Of the 16 factors proposed, ten were found to be significant: IT capability, information quality, compatibility, complexity, data warehouse, top management, policy/legal framework, interagency trust, upper-level leadership and social network. Based on the results obtained, the study presents a model of electronic information sharing between public universities in Iraq and MOHESR.
A comprehensive understanding of this model will contribute to improving the planning and implementation of three dimensions of the public universities (technological, organizational and environmental) as they work to improve electronic information sharing in the future. According to the findings, it can be concluded that these three dimensions and ten factors can essentially increase electronic information sharing between public universities and MOHESR.
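
    As a rough illustration of the analysis method (multiple linear regression over survey factors), the following sketch fits a regression on synthetic data. The factor names echo three of the study's significant factors, but the scores and coefficients are entirely invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 274  # sample size matching the number of returned questionnaires

# Synthetic Likert-style scores for three factors (illustrative only)
it_capability = rng.normal(3.5, 0.8, n)
info_quality = rng.normal(3.8, 0.7, n)
trust = rng.normal(3.2, 0.9, n)

# Hypothetical "true" model for an information-sharing score, plus noise
sharing = (0.5 + 0.4 * it_capability + 0.3 * info_quality
           + 0.2 * trust + rng.normal(0, 0.3, n))

# Ordinary least squares via the design matrix [1, x1, x2, x3]
X = np.column_stack([np.ones(n), it_capability, info_quality, trust])
coef, *_ = np.linalg.lstsq(X, sharing, rcond=None)
print(np.round(coef, 2))  # estimates close to [0.5, 0.4, 0.3, 0.2]
```

    In the study itself, significance tests on such coefficients are what separate the ten retained factors from the six rejected ones.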

    The Family of MapReduce and Large Scale Data Processing Systems

    In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architecture and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as issues of data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several follow-up works after its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both the research and industrial communities. We also cover a set of systems that have been introduced to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.
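
    The programming model itself is easy to sketch. Below is a minimal, single-machine rendition of the map, shuffle and reduce phases; it deliberately omits distribution, scheduling and fault tolerance, which are exactly the parts the real framework hides from the application:

```python
from collections import defaultdict
from itertools import chain

def map_phase(documents, mapper):
    # Apply the user-defined mapper to each input, emitting (key, value) pairs
    return chain.from_iterable(mapper(doc) for doc in documents)

def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    # Apply the user-defined reducer to each key's list of values
    return {key: reducer(key, values) for key, values in groups.items()}

# The classic word-count example expressed in the model
docs = ["map reduce map", "reduce shuffle"]
counts = reduce_phase(
    shuffle(map_phase(docs, lambda d: ((w, 1) for w in d.split()))),
    lambda key, values: sum(values),
)
print(counts)  # {'map': 2, 'reduce': 2, 'shuffle': 1}
```

    The application supplies only the two lambdas; everything else is the framework's responsibility, which is what makes the model attractive on large clusters.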

    Automatic physical database design : recommending materialized views

    This work discusses physical database design, focusing on the problem of selecting materialized views to improve the performance of a database system. We first address the satisfiability and implication problems for mixed arithmetic constraints. The results are used to support the construction of a search space for view selection problems. We propose an approach for constructing a search space based on identifying maximum commonalities among queries and on rewriting queries using views. These commonalities are used to define candidate views for materialization, from which an optimal or near-optimal set can be chosen as a solution to the view selection problem. Using a search space constructed this way, we address a specific instance of the view selection problem that aims at minimizing the view maintenance cost of multiple materialized views using multi-query optimization techniques. Further, we study the same problem in the context of a commercial database management system in the presence of memory and time restrictions. We also suggest a heuristic approach for maintaining the views while guaranteeing that the restrictions are satisfied. Finally, we consider a dynamic version of the view selection problem where the workload is a sequence of query and update statements. In this case, views can be created (materialized) and dropped during the execution of the workload. We have implemented our approaches to the dynamic view selection problem and performed extensive experimental testing. Our experiments show that our approaches perform better than previous ones in most cases, in terms of both effectiveness and efficiency.
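
    The idea of deriving candidate views from commonalities among queries can be sketched as follows. Queries are reduced here to flat sets of (table, predicate) atoms, a far coarser model than the mixed arithmetic constraints the work actually analyses, so this is only an assumed illustration of the principle:

```python
from itertools import combinations

def candidate_views(queries):
    """Propose candidate views as pairwise commonalities between queries.

    Each query is modelled as a frozenset of (table, predicate) atoms; any
    non-empty pairwise intersection is a candidate for materialization,
    since both queries could be rewritten to use that shared sub-result.
    """
    candidates = set()
    for qa, qb in combinations(queries, 2):
        common = qa & qb
        if common:
            candidates.add(frozenset(common))
    return candidates

# Three hypothetical workload queries over a sales schema
q1 = frozenset({("sales", "year=2023"), ("stores", "region='EU'")})
q2 = frozenset({("sales", "year=2023"), ("products", "cat='food'")})
q3 = frozenset({("sales", "year=2023"), ("stores", "region='EU'")})

views = candidate_views([q1, q2, q3])
print(len(views))  # 2 distinct candidate views
```

    A cost model would then pick an optimal or near-optimal subset of such candidates, trading query speed-up against storage and maintenance cost.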

    Cloud BI: A Multi-party Authentication Framework for Securing Business Intelligence on the Cloud

    Business intelligence (BI) has emerged as a key technology to be hosted on Cloud computing. BI offers a method to analyse data, thereby enabling informed decision making to improve business performance and profitability. However, within the shared domains of Cloud computing, BI is exposed to increased security and privacy threats because an unauthorised user may be able to gain access to highly sensitive, consolidated business information. The business process contains collaborating services and users from multiple Cloud systems in different security realms which need to be engaged dynamically at runtime. If the heterogeneous Cloud systems located in different security realms do not have direct authentication relationships, then it is technically difficult to enable secure collaboration. In order to address these security challenges, a new authentication framework is required to establish trust relationships among these BI service instances and users by distributing a common session secret to all participants of a session. The author addresses this challenge by designing and implementing a multi-party authentication framework for dynamic secure interactions when members of different security realms want to access services. The framework takes advantage of the trust relationship between session members in different security realms to enable a user to obtain security credentials to access Cloud resources in a remote realm. This mechanism helps Cloud session users authenticate their session membership, improving the authentication processes within multi-party sessions. The correctness of the proposed framework has been verified using BAN logic. The performance and the overhead have been evaluated via simulation in a dynamic environment. A prototype authentication system has been designed, implemented and tested based on the proposed framework.
The research concludes that the proposed framework and its supporting protocols are an effective functional basis for practical implementation testing, as the framework achieves good scalability and imposes only a minimal performance overhead, comparable with other state-of-the-art methods.
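
    The core idea of distributing a common session secret can be sketched in a toy form. This is an assumed illustration, not the thesis's actual protocol (which spans multiple security realms and is verified with BAN logic): a trusted session authority hands the secret to enrolled members, who then prove membership with an HMAC tag.

```python
import hashlib
import hmac
import secrets

class SessionAuthority:
    """Toy stand-in for a trusted party: it generates a common session
    secret and hands it only to enrolled session members."""
    def __init__(self):
        self._secret = secrets.token_bytes(32)
        self._members = set()

    def enrol(self, member_id):
        # In practice the secret would travel over a protected channel
        self._members.add(member_id)
        return self._secret

    def verify(self, member_id, message, tag):
        # Valid only if the claimant is enrolled AND knows the secret
        expected = hmac.new(self._secret, member_id.encode() + message,
                            hashlib.sha256).digest()
        return member_id in self._members and hmac.compare_digest(expected, tag)

authority = SessionAuthority()
key = authority.enrol("bi-service@realmA")  # hypothetical member name
msg = b"query: quarterly revenue cube"
tag = hmac.new(key, b"bi-service@realmA" + msg, hashlib.sha256).digest()
print(authority.verify("bi-service@realmA", msg, tag))  # True
print(authority.verify("intruder@realmC", msg, tag))    # False
```

    Possession of a valid tag demonstrates session membership without each pair of realms needing a direct authentication relationship, which is the gap the framework targets.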

    Research and Development of a General Purpose Instrument DAQ-Monitoring Platform applied to the CLOUD/CERN experiment

    The current scientific environment has experimentalists and system administrators allocating large amounts of time to data access, parsing and gathering, as well as instrument management. This is a growing challenge, since there is an increasing number of large collaborations with significant instrument resources, remote instrumentation sites, and continuously improved and upgraded scientific instruments. DAQBroker is a new software system designed to monitor networks of scientific instruments while also providing simple data access methods for any user. Data can be stored in one or several local or remote databases running on any of the most popular relational database systems (MySQL, PostgreSQL, Oracle). It also provides the necessary tools for creating and editing the metadata associated with different instruments, performing data manipulation, and generating events based on instrument measurements, regardless of the user’s familiarity with the individual instruments. Time series stored in a DAQBroker database also benefit from several statistical methods for time series classification, comparison and event detection, as well as multivariate time series analysis methods to determine the most statistically relevant time series, rank the most influential time series, and determine the periods of most activity during specific experimental periods. This thesis presents the architecture behind the framework, assesses its performance under controlled conditions, and presents a use case from the CLOUD experiment at CERN, Switzerland. The univariate and multivariate time series statistical methods applied in this framework are also studied.
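
    As a minimal sketch of the kind of univariate event detection such a platform might apply, the following uses a rolling z-score test; the abstract does not specify DAQBroker's actual statistical methods, so this is only an assumed example:

```python
from statistics import mean, stdev

def detect_events(series, window=5, threshold=3.0):
    """Flag indices whose value deviates by more than `threshold`
    standard deviations from the preceding `window` samples."""
    events = []
    for i in range(window, len(series)):
        past = series[i - window:i]
        m, s = mean(past), stdev(past)
        if s > 0 and abs(series[i] - m) / s > threshold:
            events.append(i)
    return events

# Synthetic instrument readings with one spike at index 8
readings = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.02, 0.98, 5.0, 1.0]
print(detect_events(readings))  # [8]
```

    Running the same window over many stored series, and ranking them by the number or magnitude of such deviations, is one simple way to surface the "periods of most activity" the abstract mentions.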