56,612 research outputs found
MLPerf Inference Benchmark
Machine-learning (ML) hardware and software system demand is burgeoning.
Driven by ML applications, the number of different ML inference systems has
exploded. Over 100 organizations are building ML inference chips, and the
systems that incorporate existing models span at least three orders of
magnitude in power consumption and five orders of magnitude in performance;
they range from embedded devices to data-center solutions. Fueling the hardware
are a dozen or more software frameworks and libraries. The myriad combinations
of ML hardware and ML software make assessing ML-system performance in an
architecture-neutral, representative, and reproducible manner challenging.
There is a clear need for industry-wide standard ML benchmarking and evaluation
criteria. MLPerf Inference answers that call. In this paper, we present our
benchmarking method for evaluating ML inference systems. Driven by more than 30
organizations as well as more than 200 ML engineers and practitioners, MLPerf
prescribes a set of rules and best practices to ensure comparability across
systems with wildly differing architectures. The first call for submissions
garnered more than 600 reproducible inference-performance measurements from 14
organizations, representing over 30 systems that showcase a wide range of
capabilities. The submissions attest to the benchmark's flexibility and
adaptability.Comment: ISCA 202
Recommended from our members
Systematic evaluation of software product line architectures
The architecture of a software product line is one of its most important artifacts as it represents an abstraction of the products that can be generated. It is crucial to evaluate the quality attributes of a product line architecture in order to: increase the productivity of the product line process and the quality of the products; provide a means to understand the potential behavior of the products and, consequently, decrease their time to market; and, improve the handling of the product line variability. The evaluation of product line architecture can serve as a basis to analyze the managerial and economical values of a product line for software managers and architects. Most of the current research on the evaluation of product line architecture does not take into account metrics directly obtained from UML models and their variabilities; the metrics used instead are difficult to be applied in general and to be used for quantitative analysis. This paper presents a Systematic Evaluation Method for UML-based Software Product Line Architecture, the SystEM-PLA. SystEM-PLA differs from current research as it provides stakeholders with a means to: (i) estimate and analyze potential products; (ii) use predefined basic UML-based metrics to compose quality attribute metrics; (iii) perform feasibility and trade-off analysis of a product line architecture with respect to its quality attributes; and, (iv) make the evaluation of product line architecture more flexible. An example using the SEI’s Arcade Game Maker (AGM) product line is presented as a proof of concept, illustrating SystEM-PLA activities. Metrics for complexity and extensibility quality attributes are defined and used to
perform a trade-off analysis
Softer perspectives on enhancing the patient experience using IS/IT
Purpose – This paper aims to argue that the implementation of the Choose and Book system has failed due to the inability of project sponsors to appreciate the complex and far-reaching softer implications of the implementation, especially in a complex organisation such as the NHS, which has multifarious stakeholders.
Design/methodology/approach – The authors use practice-oriented research to try and isolate key parameters. These parameters are compared with existing conventional thinking in a number of focused areas.
Findings – Like many previous NHS initiatives, the focus of this system is in its obvious link to patients. However we find that although this project has cultural, social and organisational implications, programme managers and champions of the Connecting for Health programme emphasised the technical domains to IS/IT adoption.
Research limitations/implications – This paper has been written in advance of a fully implemented Choose and Book system.
Practical implications – The paper requests that more attention be paid to the softer side of IS/IT delivery, implementation, introduction and adoption.
Originality/value – The paper shows that patient experience within the UK healthcare sector is still well below what is desired
An Approach for the Empirical Validation of Software Complexity Measures
Software metrics are widely accepted tools to control and assure software quality. A large number of software metrics with a variety of content can be found in the literature; however most of them are not adopted in industry as they are seen as irrelevant to needs, as they are unsupported, and the major reason behind this is due to improper
empirical validation. This paper tries to identify possible root causes for the improper empirical validation of the software metrics. A practical model for the empirical validation of software metrics is proposed along with root causes. The model is validated by applying it to recently proposed and well known metrics
Integrated Solution Support System for Water Management
Solving water management problems involves technical, social, economic, political and legal challenges and thus requires an integrated approach involving people from different backgrounds and roles. The integrated approach has been given a prominent role within the European Union¿s Water Framework Directive (WFD). The WFD requires an integrated approach in water management to achieve good ecological status of all water bodies. It consists amongst others of the following main planning stages: describing objectives, assessing present state, identifying gaps between objectives and present state, developing management plan, implementing measures and evaluating their impacts. The directive prescribes broad participation and consultation to achieve its objectives. Besides the obvious desktop software, such an integrated approach can benefit from using a variety of support tools. In addition to tools for specific tasks such as numerical models and questionnaires, knowledge bases on options and process support tools may be utilized. Water stress, defined as the lack of water of appropriate quality is one issue related to, but not specifically addressed by the WFD. However, like in the WFD, a participatory approach could be used to mitigate water stress. Similarly various tools can or need to be used in such a complex process. In the AquaStress Integrated project the Integrated Solution Support System (I3S ¿ I-triple-S) is developed. One of the cornerstones of the approach taken in AquaStress is that organizing available knowledge provides sufficient information to improve the possibility to make a water stress mitigation process truly end-user driven, meaning that dedicated local information is only collected after specific need is expressed by the stakeholders in the process. The novelty of the I3S lies in the combination of such knowledge stored in knowledge-bases, with adaptable workflow management facilities and with specific task-oriented tools ¿ all originating from different sources. This paper describes the I3S
Recommended from our members
AUnit - a testing framework for alloy
textWriting declarative models of software designs and analyzing them to detect defects is an effective methodology for developing more dependable software systems. However, writing such models correctly can be challenging for practitioners who may not be proficient in declarative programming, and their models themselves may be buggy. We introduce the foundations of a novel test automation framework, AUnit, which we envision for testing declarative models written in Alloy -- a first-order, relational language that is supported by its SAT-based analyzer. We take inspiration from the success of the family of xUnit frameworks that are used widely in practice for test automation, albeit for imperative or object-oriented programs. The key novelty of our work is to define a basis for unit testing for Alloy, specifically, to define the concepts of test case and test coverage as well as coverage criteria for declarative models. We reduce the problems of declarative test execution and coverage computation to partial evaluation without requiring SAT solving. Our vision is to blend how developers write unit tests in commonly used programming languages with how Alloy users formulate their models in Alloy, thereby facilitating the development and testing of Alloy models for both new Alloy users as well as experts. We illustrate our ideas using a small but complex Alloy model. While we focus on Alloy, our ideas generalize to other declarative languages (such as Z, B, ASM).Electrical and Computer Engineerin
Do System Test Cases Grow Old?
Companies increasingly use either manual or automated system testing to
ensure the quality of their software products. As a system evolves and is
extended with new features the test suite also typically grows as new test
cases are added. To ensure software quality throughout this process the test
suite is continously executed, often on a daily basis. It seems likely that
newly added tests would be more likely to fail than older tests but this has
not been investigated in any detail on large-scale, industrial software
systems. Also it is not clear which methods should be used to conduct such an
analysis. This paper proposes three main concepts that can be used to
investigate aging effects in the use and failure behavior of system test cases:
test case activation curves, test case hazard curves, and test case half-life.
To evaluate these concepts and the type of analysis they enable we apply them
on an industrial software system containing more than one million lines of
code. The data sets comes from a total of 1,620 system test cases executed a
total of more than half a million times over a time period of two and a half
years. For the investigated system we find that system test cases stay active
as they age but really do grow old; they go through an infant mortality phase
with higher failure rates which then decline over time. The test case half-life
is between 5 to 12 months for the two studied data sets.Comment: Updated with nicer figs without border around the
Cloud engineering is search based software engineering too
Many of the problems posed by the migration of computation to cloud platforms can be formulated and solved using techniques associated with Search Based Software Engineering (SBSE). Much of cloud software engineering involves problems of optimisation: performance, allocation, assignment and the dynamic balancing of resources to achieve pragmatic trade-offs between many competing technical and business objectives. SBSE is concerned with the application of computational search and optimisation to solve precisely these kinds of software engineering challenges. Interest in both cloud computing and SBSE has grown rapidly in the past five years, yet there has been little work on SBSE as a means of addressing cloud computing challenges. Like many computationally demanding activities, SBSE has the potential to benefit from the cloud; ‘SBSE in the cloud’. However, this paper focuses, instead, of the ways in which SBSE can benefit cloud computing. It thus develops the theme of ‘SBSE for the cloud’, formulating cloud computing challenges in ways that can be addressed using SBSE
- …