
    Large-scale linear regression: Development of high-performance routines

    In statistics, series of ordinary least squares (OLS) problems are used to study the linear correlation among sets of variables of interest; in many studies, the number of such variables is at least in the millions, and the corresponding datasets occupy terabytes of disk space. As the availability of large-scale datasets increases regularly, so does the challenge in dealing with them. Indeed, traditional solvers, which rely on the use of "black-box" routines optimized for one single OLS, are highly inefficient and fail to provide a viable solution for big-data analyses. As a case study, in this paper we consider a linear regression consisting of two-dimensional grids of related OLS problems that arise in the context of genome-wide association analyses, and give a careful walkthrough for the development of ols-grid, a high-performance routine for shared-memory architectures; analogous steps are relevant for tailoring OLS solvers to other applications. In particular, we first illustrate the design of efficient algorithms that exploit the structure of the OLS problems and eliminate redundant computations; then, we show how to effectively deal with datasets that do not fit in main memory; finally, we discuss how to cast the computation in terms of efficient kernels and how to achieve scalability. Importantly, each design decision along the way is justified by simple performance models. ols-grid enables the solution of $10^{11}$ correlated OLS problems operating on terabytes of data in a matter of hours.

    HL-Pow: A Learning-Based Power Modeling Framework for High-Level Synthesis

    High-level synthesis (HLS) enables designers to customize hardware designs efficiently. However, it is still challenging to foresee the correlation between power consumption and HLS-based applications at an early design stage. To overcome this problem, we introduce HL-Pow, a power modeling framework for FPGA HLS based on state-of-the-art machine learning techniques. HL-Pow incorporates an automated feature construction flow to efficiently identify and extract features that exert a major influence on power consumption, simply based upon HLS results, and a modeling flow that can build an accurate and generic power model applicable to a variety of designs with HLS. By using HL-Pow, the power evaluation process for FPGA designs can be significantly expedited because the power inference of HL-Pow is established on HLS instead of the time-consuming register-transfer level (RTL) implementation flow. Experimental results demonstrate that HL-Pow can achieve accurate power modeling that is only 4.67% (24.02 mW) away from onboard power measurement. To further facilitate power-oriented optimizations, we describe a novel design space exploration (DSE) algorithm built on top of HL-Pow to trade off between latency and power consumption. This algorithm can reach a close approximation of the real Pareto frontier while only requiring running the HLS flow for 20% of the design points in the entire design space. Comment: published as a conference paper in ASP-DAC 202

    Using similitude theory and discrete element modeling to understand the effects of digging parameters on excavation performance for rubber tire loaders

    The large sizes of mining equipment pose challenges for analysis using experiments or simulation. While scaled physical and simulation models can address this challenge, no previous work has explored how similitude theory and modeling can provide valid analysis of large equipment such as rubber tire loaders. The objective of this research was to apply similitude theory and discrete element modeling (DEM) to study the effect of different digging parameters on the penetration and the draft on the buckets of rubber tire loaders. The work sought to (1) test the hypothesis that the geometry of a rubber tire loader bucket and operating conditions significantly affect the resistive force (draft) and penetration; (2) test the hypothesis that different geometry orientations and operating conditions of a rubber tire loader bucket significantly affect draft and penetration; (3) apply DEM to scale models of rubber tire loader buckets to understand the effect of bucket geometry, orientations, and operating conditions on draft and penetration; and (4) evaluate the effectiveness of using discrete element models and similitude theory to predict draft and penetration. The results show that geometry, muckpile particle sizes, height above the floor, rake angle, speed, and motor power output are correlated with penetration and draft. This work has demonstrated that valid DEM models can be built for predicting at a larger scale. The chamfer angle of semi-spade bucket cutting blades significantly affects the draft on the buckets, and a 30° chamfer cut angle performs the best, with the lowest peak resistive forces and energy consumption. The work finds that the forces observed during the rotation phase of the simulation are lower than the forces observed during penetration. --Abstract, page iii

    A survey on run-time power monitors at the edge

    Effectively managing energy and power consumption is crucial to the success of the design of any computing system, helping mitigate the efficiency obstacles posed by the downsizing of systems while also being a valuable step towards achieving green and sustainable computing. The quality of energy and power management is strongly affected by the prompt availability of reliable and accurate information regarding the power consumption of the different parts composing the monitored target system. At the same time, effective energy and power management is even more critical for devices at the edge, which have proliferated exponentially over the past decade with the digital revolution brought by the Internet of Things. This manuscript aims to provide a comprehensive conceptual framework to classify the different approaches to implementing run-time power monitors for edge devices that have appeared in the literature, leading readers toward the solutions that best fit their application needs and the requirements and constraints of their target computing platforms. Run-time power monitors at the edge are analyzed according to both the power modeling and monitoring implementation aspects, identifying specific quality metrics for both in order to create a consistent and detailed taxonomy that encompasses the vast existing literature and provides a sound reference to the interested reader.