31,842 research outputs found

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

    A Taxonomy of Workflow Management Systems for Grid Computing

    Full text link
    With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing complex workflows. Therefore, many efforts have been made towards the development of workflow management systems for Grid computing. In this paper, we propose a taxonomy that characterizes and classifies various approaches for building and executing workflows on Grids. We also survey several representative Grid workflow systems developed by various projects world-wide to demonstrate the comprehensiveness of the taxonomy. The taxonomy not only highlights the design and engineering similarities and differences of state-of-the-art in Grid workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure

    The role of learning on industrial simulation design and analysis

    Full text link
    The capability of modeling real-world system operations has turned simulation into an indispensable problemsolving methodology for business system design and analysis. Today, simulation supports decisions ranging from sourcing to operations to finance, starting at the strategic level and proceeding towards tactical and operational levels of decision-making. In such a dynamic setting, the practice of simulation goes beyond being a static problem-solving exercise and requires integration with learning. This article discusses the role of learning in simulation design and analysis motivated by the needs of industrial problems and describes how selected tools of statistical learning can be utilized for this purpose

    Functional Regression

    Full text link
    Functional data analysis (FDA) involves the analysis of data whose ideal units of observation are functions defined on some continuous domain, and the observed data consist of a sample of functions taken from some population, sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the development of this field, which has accelerated in the past 10 years to become one of the fastest growing areas of statistics, fueled by the growing number of applications yielding this type of data. One unique characteristic of FDA is the need to combine information both across and within functions, which Ramsay and Silverman called replication and regularization, respectively. This article will focus on functional regression, the area of FDA that has received the most attention in applications and methodological development. First will be an introduction to basis functions, key building blocks for regularization in functional regression methods, followed by an overview of functional regression methods, split into three types: [1] functional predictor regression (scalar-on-function), [2] functional response regression (function-on-scalar) and [3] function-on-function regression. For each, the role of replication and regularization will be discussed and the methodological development described in a roughly chronological manner, at times deviating from the historical timeline to group together similar methods. The primary focus is on modeling and methodology, highlighting the modeling structures that have been developed and the various regularization approaches employed. At the end is a brief discussion describing potential areas of future development in this field

    Strategic Directions in Object-Oriented Programming

    Get PDF
    This paper has provided an overview of the field of object-oriented programming. After presenting a historical perspective and some major achievements in the field, four research directions were introduced: technologies integration, software components, distributed programming, and new paradigms. In general there is a need to continue research in traditional areas:\ud (1) as computer systems become more and more complex, there is a need to further develop the work on architecture and design; \ud (2) to support the development of complex systems, there is a need for better languages, environments, and tools; \ud (3) foundations in the form of the conceptual framework and other theories must be extended to enhance the means for modeling and formal analysis, as well as for understanding future computer systems

    Implementing Session Centered Calculi

    Get PDF
    Recently, specific attention has been devoted to the development of service oriented process calculi. Besides the foundational aspects, it is also interesting to have prototype implementations for them in order to assess usability and to minimize the gap between theory and practice. Typically, these implementations are done in Java taking advantage of its mechanisms supporting network applications. However, most of the recurrent features of service oriented applications are re-implemented from scratch. In this paper we show how to implement a service oriented calculus, CaSPiS (Calculus of Services with Pipelines and Sessions) using the Java framework IMC, where recurrent mechanisms for network applications are already provided. By using the session oriented and pattern matching communication mechanisms provided by IMC, it is relatively simple to implement in Java all CaSPiS abstractions and thus to easily write the implementation in Java of a CaSPiS process

    Semiconductor manufacturing simulation design and analysis with limited data

    Full text link
    This paper discusses simulation design and analysis for Silicon Carbide (SiC) manufacturing operations management at New York Power Electronics Manufacturing Consortium (PEMC) facility. Prior work has addressed the development of manufacturing system simulation as the decision support to solve the strategic equipment portfolio selection problem for the SiC fab design [1]. As we move into the phase of collecting data from the equipment purchased for the PEMC facility, we discuss how to redesign our manufacturing simulations and analyze their outputs to overcome the challenges that naturally arise in the presence of limited fab data. We conclude with insights on how an approach aimed to reflect learning from data can enable our discrete-event stochastic simulation to accurately estimate the performance measures for SiC manufacturing at the PEMC facility
    • …
    corecore