
    Next Generation Cloud Computing: New Trends and Research Directions

    The landscape of cloud computing has changed significantly over the last decade. Not only have more providers and service offerings crowded the space, but cloud infrastructure that was traditionally limited to single-provider data centers is now evolving. In this paper, we first discuss the changing cloud infrastructure and consider the use of infrastructure from multiple providers and the benefit of decentralising computing away from data centers. These trends have resulted in the need for a variety of new computing architectures that will be offered by future cloud infrastructure. These architectures are anticipated to impact areas such as connecting people and devices, data-intensive computing, the service space, and self-learning systems. Finally, we lay out a roadmap of challenges that will need to be addressed for realising the potential of next generation cloud systems.
    Comment: Accepted to Future Generation Computer Systems, 07 September 201

    Tenant Level Checkpointing of Meta-data for Multi-tenancy SaaS

    Traditional checkpointing techniques face a grave challenge when applied to multi-tenancy software-as-a-service (SaaS) systems due to the huge scale of the system state and the diversity of users' requirements on the quality of services. This paper proposes the notion of tenant-level checkpointing and an algorithm that exploits Big Data techniques to checkpoint tenants' meta-data, which are widely used in configuring SaaS for tenant-specific features. The paper presents a prototype implementation of the proposed technique using the NoSQL database Couchbase and reports experiments that compare it with a traditional implementation of checkpointing using file systems. The experiments show that the Big Data approach has significantly lower latency than the traditional approach.
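    The paper's implementation is not reproduced in the abstract; the sketch below is only a minimal illustration of the tenant-level idea, assuming a generic document-store interface (the KVStore protocol, key scheme, and helper names are invented for the example, not taken from the paper or the Couchbase SDK).

        import time
        from typing import Protocol

        class KVStore(Protocol):
            """Minimal document-store interface (stand-in for, e.g., a Couchbase bucket)."""
            def upsert(self, key: str, doc: dict) -> None: ...
            def get(self, key: str) -> dict: ...

        def checkpoint_tenant(store: KVStore, tenant_id: str, meta_data: dict) -> str:
            """Checkpoint one tenant's meta-data as a single versioned document.

            Per-tenant checkpoints keep each document small, so one tenant can be
            saved or restored without touching the global system state.
            """
            version = int(time.time() * 1000)  # illustrative versioning scheme
            key = f"ckpt::{tenant_id}::{version}"
            store.upsert(key, {"tenant": tenant_id, "version": version, "meta": meta_data})
            return key

        def restore_tenant(store: KVStore, checkpoint_key: str) -> dict:
            """Fetch a previously written checkpoint document."""
            return store.get(checkpoint_key)["meta"]

    Because each checkpoint is an ordinary key-value write, latency is bounded by a single store operation rather than by serializing the whole system state to a file, which is consistent with the latency gap the experiments report.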

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL seems to be the right technology to fulfil its requirements. However, every big data application has variable data characteristics, and thus the corresponding data fits a different data model. This paper presents a feature and use-case analysis and comparison of the four main data models, namely document-oriented, key-value, graph, and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider while making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of the available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.
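    As a tiny, hedged illustration of the storage trade-offs the paper surveys (not an example from the paper itself), the snippet below writes the same records as newline-delimited JSON (row-oriented, easy to append and inspect) and as Parquet (columnar, compressed) using the pyarrow library; file and field names are invented.

        import json
        import pyarrow as pa
        import pyarrow.parquet as pq

        records = [
            {"user_id": 1, "country": "DE", "clicks": 12},
            {"user_id": 2, "country": "US", "clicks": 7},
        ]

        # Row-oriented: one self-describing JSON object per line.
        with open("events.jsonl", "w") as f:
            for rec in records:
                f.write(json.dumps(rec) + "\n")

        # Column-oriented: Parquet stores each column contiguously, which
        # compresses well and lets engines read only the columns a query needs.
        pq.write_table(pa.Table.from_pylist(records), "events.parquet")

        # Column projection: read one column without scanning the rest.
        clicks = pq.read_table("events.parquet", columns=["clicks"])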

    Intelligent Management and Efficient Operation of Big Data

    This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous, and very often unstructured, large data sources; the enhancement of the performance of processing and networking (cloud) infrastructures, which are the most important foundational pillars of Big Data applications or services; and novel ways to efficiently manage network infrastructures with high-level composed policies for supporting the transmission of large amounts of data with distinct requirements (video vs. non-video). A case study involving an intelligent management solution to route data traffic with diverse requirements in a wide-area Internet Exchange Point is presented, discussed in the context of Big Data, and evaluated.
    Comment: In the book Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, IGI Global, 201
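    The chapter's actual solution is not detailed in this abstract; as a loose sketch of policy-based traffic management under stated assumptions (the traffic classes, path labels, and thresholds below are all invented), a high-level policy can be mapped to a concrete forwarding choice roughly as follows.

        from dataclasses import dataclass

        @dataclass
        class Policy:
            """High-level intent: how a traffic class should be treated."""
            traffic_class: str
            min_bandwidth_mbps: int
            preferred_path: str

        # Composed policies: video is bandwidth/latency sensitive, bulk data is not.
        POLICIES = {
            "video": Policy("video", min_bandwidth_mbps=25, preferred_path="low-latency"),
            "bulk": Policy("bulk", min_bandwidth_mbps=1, preferred_path="best-effort"),
        }

        def route(flow_class, free_mbps):
            """Pick a path whose free capacity satisfies the class's policy.

            free_mbps maps a path label to its spare capacity in Mbps.
            """
            policy = POLICIES.get(flow_class, POLICIES["bulk"])
            if free_mbps.get(policy.preferred_path, 0) >= policy.min_bandwidth_mbps:
                return policy.preferred_path
            # Otherwise fall back to the path with the most headroom that still fits.
            for label, free in sorted(free_mbps.items(), key=lambda kv: -kv[1]):
                if free >= policy.min_bandwidth_mbps:
                    return label
            return policy.preferred_path  # degrade gracefully if nothing fits

        # route("video", {"low-latency": 40, "best-effort": 900}) -> "low-latency"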

    Normalization of Unstructured Log Data into Streams of Structured Event Objects

    Monitoring plays a crucial role in the operation of any sizeable distributed IT infrastructure. Whether it is a university network or a cloud datacenter, monitoring information is continuously used in a wide spectrum of ways ranging from mission-critical jobs, e.g. accounting or incident handling, to equally important development-related tasks, e.g. debugging or fault detection. Whilst pursuing a novel vision of new-generation event-driven monitoring systems, we have identified that a particularly rich source of monitoring information, computer logs, is also one of the most problematic in terms of automated processing. Log data are predominantly generated in an ad-hoc manner in a variety of incompatible formats, with the most important pieces of information, i.e. log messages, in the form of unstructured strings. This clashes with our long-term goal of designing a system enabling its users to transparently define real-time continuous queries over homogeneous streams of properly defined monitoring event objects with explicitly described structure. Our goal is to bridge this gap by normalizing the poorly structured log data into streams of structured event objects. The combined challenge of this goal is structuring the log data whilst considering the high velocity with which they are generated in modern IT infrastructures. This paper summarizes the contributions of the doctoral thesis "Normalization of Unstructured Log Data into Streams of Structured Event Objects", which deals with the matter at hand in detail.
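    The thesis defines its own event model; purely as an illustration of the normalization step it describes, the sketch below parses one ad-hoc syslog-style line into a structured event object with explicit fields (the regular expression and field names are assumptions for the example, not the thesis's schema).

        import re
        from datetime import datetime

        # One ad-hoc format among many; a real system needs a pattern per log source.
        SYSLOG_RE = re.compile(
            r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
            r"(?P<proc>[\w\-/]+)(?:\[(?P<pid>\d+)\])?:\s(?P<msg>.*)$"
        )

        def normalize(line, year=2024):
            """Turn one unstructured log line into a structured event object."""
            m = SYSLOG_RE.match(line)
            if m is None:
                return None  # unparseable lines can go to a dead-letter stream
            ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
            return {
                "timestamp": ts.isoformat(),
                "host": m["host"],
                "process": m["proc"],
                "pid": int(m["pid"]) if m["pid"] else None,
                "message": m["msg"],  # still free text, but now isolated and typed
            }

        event = normalize("Feb  3 10:15:01 web01 sshd[4122]: Accepted publickey for alice")

    Once every source is mapped through such a pattern, downstream consumers can run continuous queries over a homogeneous stream of event objects instead of grepping heterogeneous strings.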

    ENHANCEMENT FOR DATA SECURITY IN CLOUD COMPUTING ENVIRONMENT

    Cloud computing, a rapidly developing information technology, has attracted worldwide attention. Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and devices on demand, like the electricity grid. Cloud computing is the product of the fusion of traditional computing technology and network technologies such as grid computing, distributed computing, and parallel computing. It aims to construct a system with powerful computing capability from a large number of relatively low-cost computing entities, and to use advanced business models like SaaS (Software as a Service) to deliver this computing capacity to end users. This paper addresses the longstanding security limitations of such environments by building a multi-tenant system. Our system provides an environment for users to perform their tasks with very high security, and the additional facilities it provides let users feel secure about their data and their accounts.
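    The abstract does not spell out the paper's mechanism, so the following is only one common building block for tenant data confidentiality: a per-tenant symmetric key used with the cryptography package's Fernet recipe. The helper names and the in-memory key table are invented, and a real deployment would keep keys in a KMS, never next to the data they protect.

        from cryptography.fernet import Fernet

        # One symmetric key per tenant (illustrative; real keys live in a KMS/HSM).
        tenant_keys = {"tenant-42": Fernet.generate_key()}

        def encrypt_for_tenant(tenant_id, plaintext):
            """Encrypt data so only the owning tenant's key can recover it."""
            return Fernet(tenant_keys[tenant_id]).encrypt(plaintext)

        def decrypt_for_tenant(tenant_id, token):
            """Authenticated decryption: a tampered token raises InvalidToken."""
            return Fernet(tenant_keys[tenant_id]).decrypt(token)

        token = encrypt_for_tenant("tenant-42", b"customer record")
        assert decrypt_for_tenant("tenant-42", token) == b"customer record"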

    A unified view of data-intensive flows in business intelligence systems : a survey

    Data-intensive flows are central processes in today's business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet the complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today's research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still need to be addressed and how current solutions can be applied to address them.
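    The survey treats ETL conceptually rather than as code; as a toy, hedged illustration of the batched extract-transform-load pattern it discusses (source file, table, and column names are invented), a minimal flow in plain Python might look like this.

        import csv
        import sqlite3

        def extract(path):
            """Extract: stream raw rows from an operational CSV export."""
            with open(path, newline="") as f:
                yield from csv.DictReader(f)

        def transform(rows):
            """Transform: cleanse and conform rows to the warehouse schema."""
            for row in rows:
                yield (row["order_id"].strip(),
                       row["country"].upper(),          # conform country codes
                       round(float(row["amount"]), 2))  # normalise precision

        def load(rows, db_path="dw.sqlite"):
            """Load: append conformed rows into a warehouse fact table."""
            con = sqlite3.connect(db_path)
            con.execute("CREATE TABLE IF NOT EXISTS fact_orders"
                        "(order_id TEXT, country TEXT, amount REAL)")
            con.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
            con.commit()
            con.close()

        load(transform(extract("orders.csv")))

    An operational, near-real-time flow would replace the batched extract with a continuous source while keeping the same transform logic, mirroring the combination of batched and runtime flows described above.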

    A Survey on Big Data, Hadoop and its Ecosystem

    The 21st century is marked by rapid and enormous change in the field of information technology, which is an inseparable part of our daily life and of many industries such as education, genetics, entertainment, science and technology, and business. In this information age, a vast amount of data is generated; this vast amount of data is referred to as Big Data. Big Data presents a number of challenges, such as capturing, analysing, searching, sharing, and filtering data. Today Big Data is applied in various fields, including shopping websites such as Amazon and Flipkart and social networking sites such as Twitter and Facebook. The literature shows that Big Data is examined with different analysis methods, such as predictive analysis and user analysis. This paper argues that Big Data requires open source technology for storing and operating on huge amounts of data, and it focuses on Apache Hadoop, which has become dominant due to its applicability to big data processing and which supports thousands of terabytes of data. The Hadoop framework facilitates the analysis of big data, and the paper covers its processing methodologies as well as the structure of its ecosystem.
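    As a concrete taste of the processing model Hadoop popularized (this is the textbook word-count example, not code from the paper), the mapper and reducer below follow the MapReduce contract; Hadoop Streaming would run them as separate scripts over stdin/stdout, and the last lines simulate the shuffle locally.

        from itertools import groupby

        def mapper(lines):
            """Map: emit a (word, 1) pair for every token."""
            for line in lines:
                for word in line.split():
                    yield word, 1

        def reducer(pairs):
            """Reduce: input arrives sorted by key, so counts per word are adjacent."""
            for word, group in groupby(pairs, key=lambda kv: kv[0]):
                yield word, sum(count for _, count in group)

        # Local simulation of the shuffle/sort step Hadoop performs between phases.
        lines = ["big data big", "data lake"]
        shuffled = sorted(mapper(lines))
        print(dict(reducer(shuffled)))  # {'big': 2, 'data': 2, 'lake': 1}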

    Spatial Big Data Analytics: The New Boundaries of Retail Location Decision-Making

    This dissertation examines the current state and evolution of retail location decision-making (RLDM) in Canada. The major objectives are: (i) to explore the type and scale of location decisions that retail firms are currently undertaking; (ii) to identify the availability and use of technology and Spatial Big Data (SBD) within the decision-making process; (iii) to identify the awareness, availability, use, adoption, and development of SBD; and (iv) to assess the implications of SBD in RLDM. These objectives were investigated using a three-stage, multi-method research process. First, an online survey of retail location decision makers across a range of firm sizes and sub-sectors was administered. Secondly, structured interviews were conducted with 24 retail location decision makers, and lastly, three in-depth case studies were undertaken in order to highlight the changes to RLDM over the last decade and to develop a deeper understanding of RLDM. This dissertation found that within the last decade RLDM changed in three main ways: (i) there has been an increase in the availability and use of technology and SBD within the decision-making process; (ii) the type and scale of location decisions that a firm undertakes remain relatively unchanged even with the growth of new data; and (iii) the range of location research methods employed within retail firms is only just beginning to change given the presence of new data sources and data analytics technology. Traditional practices still dominate the RLDM process. While SBD applications are starting to appear within retail planning, they are not widespread. Traditional data sources, such as those highlighted in past studies by Hernandez and Emmons (2012) and Byrom et al. (2001), are still the most commonly used. It was evident that at the heart of SBD adoption is a data environment that promotes transparency and a clear corporate strategy. While most retailers are aware of the new SBD techniques that exist, these techniques are not often adopted and routinized.