17,155 research outputs found

    Quantifying volume, velocity, and variety to support (Big) data-intensive application development

    © 2017 IEEE. In the era of digital economies, data can be considered the new commodity, fueling next-generation software services and applications. The data generated daily in domains such as social networks, stock exchanges, the Internet of Things, and cyber-physical systems are soon expected to exceed the yottabyte frontier. To process this overwhelming amount, Big Data solutions are being developed to enable a new generation of data-centric/data-intensive applications (DIAs) and services. However, many such applications currently fail to meet increasingly demanding data management requirements. In particular, proper techniques and tools are required to support architects and developers in DIA design so that they can cope with these pressing Big Data challenges. This paper takes an initial step in this direction, aiming to reduce the gap between architects and the DIAs they have to develop. The proposed approach extends the conventional Big Data process workflow with a way of capturing and modeling the 'three Vs' of Big Data (i.e., volume, velocity, and variety), providing useful insights into the overall process from the known behavior of its individual components. Starting from the V-attributes of the Big Data process components, the proposed framework estimates the V-metrics of the process by evaluating a performance model generated from it. To demonstrate the feasibility and effectiveness of the approach, a case study on a computer vision DIA is reported.

    ShenZhen transportation system (SZTS): a novel big data benchmark suite

    Data analytics is at the core of the supply chain for both products and services in modern economies and societies. Big data workloads, however, are placing unprecedented demands on computing technologies, calling for a deep understanding and characterization of these emerging workloads. In this paper, we propose ShenZhen Transportation System (SZTS), a novel big data Hadoop benchmark suite comprised of real-life transportation analysis applications with real-life input data sets from Shenzhen in China. SZTS uniquely focuses on a specific and real-life application domain whereas other existing Hadoop benchmark suites, such as HiBench and CloudRank-D, consist of generic algorithms with synthetic inputs. We perform a cross-layer workload characterization at the microarchitecture level, the operating system (OS) level, and the job level, revealing unique characteristics of SZTS compared to existing Hadoop benchmarks as well as general-purpose multi-core PARSEC benchmarks. We also study the sensitivity of workload behavior with respect to input data size, and we propose a methodology for identifying representative input data sets

    Big Data Research in Information Systems: Toward an Inclusive Research Agenda

    Big data has received considerable attention from the information systems (IS) discipline over the past few years, with several recent commentaries, editorials, and special issue introductions on the topic appearing in leading IS outlets. These papers present varying perspectives on promising big data research topics and highlight some of the challenges that big data poses. In this editorial, we synthesize and contribute further to this discourse. We offer a first step toward an inclusive big data research agenda for IS by focusing on the interplay between big data’s characteristics, the information value chain encompassing people-process-technology, and the three dominant IS research traditions (behavioral, design, and economics of IS). We view big data as a disruption to the value chain that has widespread impacts, which include but are not limited to changing the way academics conduct scholarly work. Importantly, we critically discuss the opportunities and challenges for behavioral, design science, and economics of IS research and the emerging implications for theory and methodology arising due to big data’s disruptive effects

    Forecasting the Number of Students in Multiple Linear Regressions

    Students are the most important element of higher education, so every university must keep improving its services, and one way to do so is through decision support built on the university's big data. The number of prospective students was predicted using data mining and multiple linear regression, with two independent variables, administration costs (X1) and accreditation score (X2), and the number of students registered each year as the dependent variable (Y). The test data came from a database covering the last 13 years. Multiple linear regression was used to obtain the intercept, the coefficient of determination, and the regression coefficients, giving the equation Y = 45.28 - 0.02·X1 + 121.58·X2. First, if X2 is held constant, a one-unit increase in X1 changes Y by -0.02 units. Second, if X1 is held constant, a one-unit increase in X2 increases Y by 121.58 units. Third, if X1 and X2 are both zero, Y is 45.28 units. The proposed approach therefore provides acceptable predictive results.
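
    As a minimal sketch of how the regression reported in this abstract can be read, the snippet below evaluates the equation with intercept 45.28 and coefficients -0.02 (X1) and 121.58 (X2). The function name and the sample inputs are hypothetical and chosen only for illustration; the abstract does not state the units of X1 or X2, and nothing here reproduces the authors' data or fitting procedure.

```python
# Coefficients as reported in the abstract. The function name and the
# sample inputs below are illustrative assumptions, not from the paper.
INTERCEPT = 45.28
COEF_X1 = -0.02    # administration costs (X1)
COEF_X2 = 121.58   # accreditation score (X2)


def predict_students(admin_cost: float, accreditation: float) -> float:
    """Evaluate Y = 45.28 - 0.02*X1 + 121.58*X2 for one (X1, X2) pair."""
    return INTERCEPT + COEF_X1 * admin_cost + COEF_X2 * accreditation


if __name__ == "__main__":
    # Hypothetical inputs, used only to isolate the effect of each term.
    base = predict_students(0.0, 0.0)
    print("Y at X1 = X2 = 0:", base)
    print("Change in Y per unit of X1:", predict_students(1.0, 0.0) - base)
    print("Change in Y per unit of X2:", predict_students(0.0, 1.0) - base)
```

    Holding one variable fixed and stepping the other by one unit isolates a single coefficient, which is the same reading of the equation that the abstract gives in words.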

    Big Data Major Security Issues: Challenges and Defense Strategies

    Big data has unlocked significant advances across a wide range of scientific fields and has consequently become a highly attractive subject in both academia and business. It has also contributed to innovation, productivity gains, and competitiveness. However, many difficulties associated with data collection, storage, usage, analysis, privacy, and trust must still be addressed. In addition, inaccurate or misleading big data may lead to incorrect or invalid interpretations of findings, which can negatively affect consumers' experiences. This article examines the challenges of implementing big data security and some important solutions to them. A total of 12 papers were extracted and analyzed to add to the literature, concentrating on several critical issues in the big data analytics sector and shedding light on how these challenges influence domains such as healthcare, education, and business intelligence. While the studies show that big data poses issues, their approaches to overcoming them vary. The most frequently mentioned challenges were data, process, privacy, and management. To address these issues, the paper compiles previously proposed solutions.

    Analytics for Autonomous C4ISR within e-Government: a Research Agenda

    e-Government enables big data analytics to support decision processes in governing. C4ISR (Command, Control, Communications, Computers, Intelligence, Surveillance and Reconnaissance) is essentially e-Government scoped to military decision processes. The value of big data and its challenges are common to both. High variety and the demand for veracity compel data analysis that relies on domain-specific expertise, while increasing volume and velocity hinder data analytics at scale. These conditions challenge even highly automated methods for comprehensive cross-domain analytics and motivate cognitive approaches such as those underlying Autonomous Systems (AS) aimed at C4ISR. A C4ISR framework is examined by parts, linking each C to ISR capability, and a taxonomy of analytics is extended to include cognitive autonomy enablers. Coupling these frameworks, the authors propose extending cognitive approaches for autonomy in C4ISR to e-Government in general and outline a research agenda for attaining it.