Search CORE

3 research outputs found

3rd Many-core Applications Research Community (MARC) Symposium. (KIT Scientific Reports ; 7598)

Author: Becker Jürgen
Göhringer Diana
Hübner Michael
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2011

This manuscript includes recent scientific work regarding the Intel Single-chip Cloud Computer and describes novel approaches for programming and run-time organization.

KITopen

DVFS-control techniques for dense linear algebra operations on multi-core processors

Author: Pedro Alonso
Manuel F. Dolz
Francisco D. Igual
Rafael Mayo
Enrique S. Quintana-Ortí
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2012

This paper analyzes the impact on power consumption of two DVFS-control strategies when applied to the execution of dense linear algebra operations on multi-core processors. The strategies considered here, prototyped as the Slack Reduction Algorithm (SRA) and the Race-to-Idle Algorithm (RIA), adjust the operation frequency of the cores during execution of a collection of tasks (in which many dense linear algebra algorithms can be decomposed) with a very different approach to save energy. A power-aware simulator, in charge of scheduling the execution of tasks to processor cores, is employed to evaluate the performance benefits of these power-control policies for two reference algorithms for the LU factorization, a key operation for the solution of linear systems of equations.

The authors from Univ. Jaume I were supported by project CICYT TIN2008-06570-C04 and FEDER.

Alonso-Jordá, P.; Dolz Zaragozá, MF.; Igual, FD.; Mayo, R.; Quintana Ortí, ES. (2012). DVFS-control techniques for dense linear algebra operations on multi-core processors. Computer Science - Research and Development. 27(4):289-298. https://doi.org/10.1007/s00450-011-0188-7

Crossref

Repositori Institucional de la Universitat Jaume I

RiuNet

Energy-efficient execution of dense linear algebra algorithms on multi-core processors

Author: Alonso-Jordá Pedro
Dolz Zaragozá Manuel Francisco
Mayo Rafael
Quintana-Ortí Enrique S.
Publication venue: Springer Science and Business Media LLC
Publication date: 12/05/2012

This paper addresses the efficient exploitation of task-level parallelism, present in many dense linear algebra operations, from the point of view of both computational performance and energy consumption. The strategies considered here, referred to as the Slack Reduction Algorithm (SRA) and the Race-to-Idle Algorithm (RIA), adjust the operation frequency of the cores during the execution of a collection of tasks (in which many dense linear algebra algorithms can be decomposed) with very different approaches to save energy. The procedures are evaluated using an energy-aware simulator, which is in charge of scheduling/mapping the execution of these tasks to the cores, leveraging dynamic frequency voltage scaling featured by current technology. Experiments with this tool and the practical integration of the RIA strategy into a runtime show the energy gains for two versions of the QR factorization.

This work was supported by project CICYT TIN2011-23283 and FEDER.

Alonso-Jordá, P.; Dolz Zaragozá, MF.; Mayo, R.; Quintana-Ortí, ES. (2013). Energy-efficient execution of dense linear algebra algorithms on multi-core processors. Cluster Computing. 16(3):497-509. https://doi.org/10.1007/s10586-012-0215-x

Crossref

Repositori Institucional de la Universitat Jaume I

RiuNet