31,842 research outputs found
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for optimized
solution to a specific real world problem, big data system are not an exception
to any such rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings forth second facet of big data storage, big data file
formats, into picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage and its challenges and future prospects have
also been discussed
A Taxonomy of Workflow Management Systems for Grid Computing
With the advent of Grid and application technologies, scientists and
engineers are building more and more complex applications to manage and process
large data sets, and execute scientific experiments on distributed resources.
Such application scenarios require means for composing and executing complex
workflows. Therefore, many efforts have been made towards the development of
workflow management systems for Grid computing. In this paper, we propose a
taxonomy that characterizes and classifies various approaches for building and
executing workflows on Grids. We also survey several representative Grid
workflow systems developed by various projects world-wide to demonstrate the
comprehensiveness of the taxonomy. The taxonomy not only highlights the design
and engineering similarities and differences of state-of-the-art in Grid
workflow systems, but also identifies the areas that need further research.Comment: 29 pages, 15 figure
The role of learning on industrial simulation design and analysis
The capability of modeling real-world system operations has turned simulation into an indispensable problemsolving methodology for business system design and analysis. Today, simulation supports decisions ranging
from sourcing to operations to finance, starting at the strategic level and proceeding towards tactical and
operational levels of decision-making. In such a dynamic setting, the practice of simulation goes beyond
being a static problem-solving exercise and requires integration with learning. This article discusses the role
of learning in simulation design and analysis motivated by the needs of industrial problems and describes
how selected tools of statistical learning can be utilized for this purpose
Functional Regression
Functional data analysis (FDA) involves the analysis of data whose ideal
units of observation are functions defined on some continuous domain, and the
observed data consist of a sample of functions taken from some population,
sampled on a discrete grid. Ramsay and Silverman's 1997 textbook sparked the
development of this field, which has accelerated in the past 10 years to become
one of the fastest growing areas of statistics, fueled by the growing number of
applications yielding this type of data. One unique characteristic of FDA is
the need to combine information both across and within functions, which Ramsay
and Silverman called replication and regularization, respectively. This article
will focus on functional regression, the area of FDA that has received the most
attention in applications and methodological development. First will be an
introduction to basis functions, key building blocks for regularization in
functional regression methods, followed by an overview of functional regression
methods, split into three types: [1] functional predictor regression
(scalar-on-function), [2] functional response regression (function-on-scalar)
and [3] function-on-function regression. For each, the role of replication and
regularization will be discussed and the methodological development described
in a roughly chronological manner, at times deviating from the historical
timeline to group together similar methods. The primary focus is on modeling
and methodology, highlighting the modeling structures that have been developed
and the various regularization approaches employed. At the end is a brief
discussion describing potential areas of future development in this field
Strategic Directions in Object-Oriented Programming
This paper has provided an overview of the field of object-oriented programming. After presenting a historical perspective and some major achievements in the field, four research directions were introduced: technologies integration, software components, distributed programming, and new paradigms. In general there is a need to continue research in traditional areas:\ud
(1) as computer systems become more and more complex, there is a need to further develop the work on architecture and design; \ud
(2) to support the development of complex systems, there is a need for better languages, environments, and tools; \ud
(3) foundations in the form of the conceptual framework and other theories must be extended to enhance the means for modeling and formal analysis, as well as for understanding future computer systems
Implementing Session Centered Calculi
Recently, specific attention has been devoted to the development of service oriented process calculi. Besides the foundational aspects, it is also interesting to have prototype implementations for them in order to assess usability and to minimize the gap between theory and practice. Typically, these implementations are done in Java taking advantage of its mechanisms supporting network applications. However, most of the recurrent features of service oriented applications are re-implemented from scratch. In this paper we show how to implement a service oriented calculus, CaSPiS (Calculus of Services with Pipelines and Sessions) using the Java framework IMC, where recurrent mechanisms for network applications are already provided. By using the session oriented and pattern matching communication mechanisms provided by IMC, it is relatively simple to implement in Java all CaSPiS abstractions and thus to easily write the implementation in Java of a CaSPiS process
Semiconductor manufacturing simulation design and analysis with limited data
This paper discusses simulation design and analysis for Silicon Carbide (SiC) manufacturing operations management at New York Power Electronics Manufacturing Consortium (PEMC) facility. Prior work has addressed the development of manufacturing system simulation as the decision support to solve the strategic equipment portfolio selection problem for the SiC fab design [1]. As we move into the phase of collecting data from the equipment purchased for the PEMC facility, we discuss how to redesign our manufacturing simulations and analyze their outputs to overcome the challenges that naturally arise in the presence of limited fab data. We conclude with insights on how an approach aimed to reflect learning from data can enable our discrete-event stochastic simulation to accurately estimate the performance measures for SiC manufacturing at the PEMC facility
- …