Reconciling Synthesis and Decomposition: A Composite Approach to Capability Identification
Stakeholders' expectations and technology constantly evolve during the
lengthy development cycles of a large-scale computer-based system.
Consequently, the traditional approach of baselining requirements results in an
unsatisfactory system because it is ill-equipped to accommodate such change. In
contrast, systems constructed on the basis of Capabilities are more
change-tolerant; Capabilities are functional abstractions that are neither as
amorphous as user needs nor as rigid as system requirements. Rather,
Capabilities are aggregates that capture desired functionality from the users'
needs, and are designed to exhibit desirable software engineering
characteristics of high cohesion, low coupling and optimum abstraction levels.
To formulate these functional abstractions we develop and investigate two
algorithms for Capability identification: Synthesis and Decomposition. The
synthesis algorithm aggregates detailed rudimentary elements of the system to
form Capabilities. In contrast, the decomposition algorithm determines
Capabilities by recursively partitioning the overall mission of the system into
more detailed entities. Empirical analysis on a small computer-based library
system reveals that neither approach is sufficient by itself. However, a
composite algorithm based on a complementary approach reconciling the two polar
perspectives results in a more feasible set of Capabilities. In particular, the
composite algorithm formulates Capabilities using the cohesion and coupling
measures as defined by the decomposition algorithm and the abstraction level as
determined by the synthesis algorithm.
Comment: This paper appears in the 14th Annual IEEE International Conference and Workshop on the Engineering of Computer Based Systems (ECBS); 10 pages, 9 figures.
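As a rough illustration of the composite idea only (this is not code from the paper; the class, function, and threshold names below are hypothetical), the following Python sketch keeps candidate functional groupings that satisfy decomposition-style cohesion and coupling criteria while sitting at the abstraction level favoured by a synthesis-style pass:

# Hypothetical sketch of composite Capability selection: candidates are scored
# on cohesion/coupling (the decomposition view) and filtered to the abstraction
# level suggested by synthesis. Names and thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    cohesion: float      # intra-group relatedness, higher is better
    coupling: float      # inter-group dependence, lower is better
    abstraction: int     # depth in the mission/element hierarchy

def composite_select(candidates, target_abstraction,
                     min_cohesion=0.6, max_coupling=0.4):
    """Keep candidates meeting the cohesion/coupling criteria at the
    abstraction level favoured by the synthesis pass."""
    return [
        c for c in candidates
        if c.abstraction == target_abstraction
        and c.cohesion >= min_cohesion
        and c.coupling <= max_coupling
    ]

if __name__ == "__main__":
    pool = [
        Candidate("manage-catalogue", cohesion=0.8, coupling=0.3, abstraction=2),
        Candidate("whole-library-system", cohesion=0.5, coupling=0.2, abstraction=1),
        Candidate("scan-barcode", cohesion=0.9, coupling=0.7, abstraction=3),
    ]
    print([c.name for c in composite_select(pool, target_abstraction=2)])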
A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures
Scientific problems that depend on processing large amounts of data require
overcoming challenges in multiple areas: managing large-scale data
distribution, co-placement and scheduling of data with compute resources, and
storing and transferring large volumes of data. We analyze the ecosystems of
the two prominent paradigms for data-intensive applications, hereafter referred
to as the high-performance computing and the Apache-Hadoop paradigm. We propose
a common basis, terminology, and set of functional factors with which to analyze
the two paradigms. We discuss the concept of "Big Data Ogres"
and their facets as means of understanding and characterizing the most common
application workloads found across the two paradigms. We then discuss the
salient features of the two paradigms, and compare and contrast the two
approaches. Specifically, we examine common implementations and approaches of these
paradigms, shed light upon the reasons for their current "architecture" and
discuss some typical workloads that utilize them. In spite of the significant
software distinctions, we believe there is architectural similarity. We discuss
the potential integration of different implementations, across the different
levels and components. Our comparison progresses from a fully qualitative
examination of the two paradigms to a semi-quantitative methodology. We use a
simple and widely used Ogre (K-means clustering) and characterize its performance
on a range of representative platforms, covering several implementations from
both paradigms. Our experiments provide an insight into the relative strengths
of the two paradigms. We propose that the set of Ogres will serve as a
benchmark to evaluate the two paradigms along different dimensions.
Comment: 8 pages, 2 figures.
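For context, the Ogre used in the semi-quantitative comparison is standard K-means clustering. The single-node NumPy sketch below is illustrative only and is not one of the benchmarked HPC or Hadoop implementations; the data and parameters are made up. It shows the computation whose performance is being measured:

# Minimal NumPy reference for the K-means "Ogre" used in the comparison.
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct input points.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels

if __name__ == "__main__":
    data = np.random.default_rng(1).normal(size=(1000, 3))
    centers, assignment = kmeans(data, k=4)
    print(centers.shape, np.bincount(assignment))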