
    Designing Traceability into Big Data Systems

    Providing an appropriate level of accessibility and traceability to data or process elements (so-called Items) in large volumes of data, often Cloud-resident, is an essential requirement in the Big Data era. Enterprise-wide data systems need to be designed from the outset to support usage of such Items across the spectrum of business use rather than from any specific application view. The design philosophy advocated in this paper is to drive the design process using a so-called description-driven approach, which enriches models with meta-data and descriptions and focuses the design process on Item re-use, thereby promoting traceability. Details are given of the description-driven design of big data systems at CERN, in health informatics and in business process management. Evidence is presented that the approach leads to design simplicity and consequent ease of management, thanks to loose typing and the adoption of a unified approach to Item management and usage.
    Comment: 10 pages; 6 figures; in Proceedings of the 5th Annual International Conference on ICT: Big Data, Cloud and Security (ICT-BDCS 2015), Singapore, July 2015. arXiv admin note: text overlap with arXiv:1402.5764, arXiv:1402.575
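    By way of illustration only (the class and field names below are invented for this sketch, not taken from the paper), a description-driven Item might pair a loosely typed payload with a separate description object carrying meta-data, and record a provenance trail each time it is used:

        from dataclasses import dataclass, field
        from datetime import datetime, timezone

        @dataclass
        class ItemDescription:
            # Meta-data describing an Item, held separately from the Item
            # itself so the model can be enriched without changing
            # application code.
            name: str
            version: int
            schema: dict  # loosely typed: field name -> type hint

        @dataclass
        class Item:
            # A data or process element whose payload is interpreted via its
            # description, enabling re-use and traceability across
            # applications.
            description: ItemDescription
            payload: dict
            provenance: list = field(default_factory=list)

            def record_use(self, actor: str) -> None:
                # Append a time-stamped trace entry each time the Item is
                # used, so usage can be traced across applications.
                stamp = datetime.now(timezone.utc).isoformat()
                self.provenance.append(f"{stamp} used by {actor}")

        # Hypothetical usage: the same Item serves any application that can
        # read its description, rather than one hard-coded schema per app.
        desc = ItemDescription(name="detector-run", version=1,
                               schema={"run_id": "int", "events": "int"})
        item = Item(description=desc, payload={"run_id": 42, "events": 10000})
        item.record_use("calibration-service")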

    Supporting emerging researchers in data management and curation

    While scholarly publishing remains the key means for determining researchers’ impact, international funding body requirements and government recommendations relating to research data management (RDM), sharing and preservation mean that the underlying research data are becoming increasingly valuable in their own right. This is true not only for researchers in the sciences but also in the humanities and creative arts. The ability to exploit their own - and others’ - data is emerging as a crucial skill for researchers across all disciplines. However, despite Generation Y researchers being ‘highly competent and ubiquitous users of information technologies generally’, there appears to be a widespread lack of understanding and uncertainty about open access and self-archived resources (Jisc study, 2012). This chapter considers the potential support that academic librarians might provide to Generation Y researchers in this shifting research data landscape and examines the role of the library as part of institutional infrastructure. The changing landscape will affect research libraries most keenly over the next few years as they work to develop infrastructure and support systems to identify and maintain access to a diverse array of research data outputs. However, the data being produced through research are no different from those being produced by artists, politicians and the general public. In this respect, all libraries - whether academic, national or local - will need to gear up to ensure they are able to accept and provide access to an ever-increasing range of complex digital objects.

    A unified view of data-intensive flows in business intelligence systems: a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet the complex requirements of next-generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time, operational data flows that integrate source data at runtime. Both academia and industry thus need a clear understanding of the foundations of data-intensive flows and of the challenges of moving towards next-generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next-generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognising challenges that remain to be addressed and how current solutions can be applied to address them.
    Peer reviewed. Postprint (author's final draft).
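    To make the batch-versus-runtime distinction in the abstract above concrete, here is a minimal Python sketch (illustrative only; all function, field and record names are invented) of a traditional batched ETL step alongside an operational flow that integrates a single record as it arrives:

        from typing import Iterable, Iterator

        def extract(source: Iterable) -> Iterator:
            # Pull raw records from a source; any iterable stands in here
            # for a database cursor or message stream.
            yield from source

        def transform(row: dict) -> dict:
            # Normalise a raw record into an analysis-ready format.
            return {"customer": row["cust"].strip().lower(),
                    "amount_eur": round(float(row["amt"]), 2)}

        def load_batch(source: Iterable, warehouse: list) -> None:
            # Traditional ETL: periodically transform and load a whole
            # batch into the data warehouse.
            warehouse.extend(transform(r) for r in extract(source))

        def load_at_runtime(row: dict, warehouse: list) -> None:
            # Operational flow: integrate one record at arrival time.
            warehouse.append(transform(row))

        warehouse: list = []
        load_batch([{"cust": " ACME ", "amt": "10.5"}], warehouse)  # nightly
        load_at_runtime({"cust": "Globex", "amt": "3.2"}, warehouse)  # live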

    A systematic literature review of the use of social media for business process management

    As new technologies continue to expand, organizations find innovation necessary to keep up to date with the latest management trends. Although organizations are increasingly using new technologies, opportunities still exist to achieve the now-essential omnichannel management strategy. More precisely, social media are opening a path towards benefiting more from an organization’s process orientation. However, social media strategies remain an under-investigated field, especially when it comes to research on the use of social media for the management and improvement of business processes, or the internal way of working in organizations. By classifying a variety of articles, this study explores the evolution of social media implementation within the business process management (BPM) discipline. We also provide avenues for future research and strategic implications for practitioners seeking to use social media more comprehensively.

    Digital curation and the cloud

    Digital curation involves a wide range of activities, many of which could benefit from cloud deployment to a greater or lesser extent. These range from infrequent, resource-intensive tasks, which benefit from the ability to rapidly provision resources, to day-to-day collaborative activities, which can be facilitated by networked cloud services. The associated benefits are offset by risks such as loss of data or service level, legal and governance incompatibilities, and transfer bottlenecks. Both risks and benefits vary considerably according to the service and deployment models being adopted and the context in which activities are performed. Some risks, such as legal liabilities, are mitigated by the use of alternative deployment models, e.g. private clouds, but this is typically at the expense of benefits such as resource elasticity and economies of scale. The Infrastructure as a Service (IaaS) model may provide a basis on which more specialised software services can be offered. Considerable work remains to be done in helping institutions understand the cloud and its associated costs, risks and benefits, and how these compare to their current working methods, so that the most beneficial uses of cloud technologies may be identified. Specific proposals, echoing recent work coordinated by EPSRC and JISC, are the development of advisory, costing and brokering services to facilitate appropriate cloud deployments; the exploration of opportunities for certifying or accrediting cloud preservation providers; and the targeted publicity of outputs from pilot studies to the full range of stakeholders within the curation lifecycle, including data creators and owners, repositories, institutional IT support professionals and senior managers.
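    As a rough, purely illustrative sketch of the kind of costing comparison such advisory services might support (the functions and all figures below are invented for the example), pay-per-use provisioning of an infrequent, resource-intensive curation task can be weighed against keeping equivalent capacity in-house:

        def cloud_cost(runs_per_year: int, hours_per_run: float,
                       rate_per_hour: float) -> float:
            # Pay-per-use: charged only while resources are provisioned.
            return runs_per_year * hours_per_run * rate_per_hour

        def onsite_cost(hardware_per_year: float,
                        support_per_year: float) -> float:
            # Fixed cost of owning equivalent capacity, however rarely used.
            return hardware_per_year + support_per_year

        # Hypothetical figures: four large format-migration runs per year,
        # 48 hours each, against amortised in-house hardware and support.
        elastic = cloud_cost(runs_per_year=4, hours_per_run=48,
                             rate_per_hour=12.0)
        fixed = onsite_cost(hardware_per_year=8000.0, support_per_year=5000.0)
        print(f"cloud: {elastic:.0f}/yr vs on-site: {fixed:.0f}/yr")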