256 research outputs found

    Inflammation in Cachexia

    Full text link

    Empirical Installation of Linear Algebra Shared-Memory Subroutines for Auto-Tuning

    Full text link
    The final publication is available at Springer via http://dx.doi.org/10.1007/s10766-013-0249-6The introduction of auto-tuning techniques in linear algebra shared-memory routines is analyzed. Information obtained in the installation of the routines is used at running time to take some decisions to reduce the total execution time. The study is carried out with routines at different levels (matrix multiplication, LU and Cholesky factorizations and linear systems symmetric or general routines) and with calls to routines in the LAPACK and PLASMA libraries with multithread implementations. Medium NUMA and large cc-NUMA systems are used in the experiments. This variety of routines, libraries and systems allows us to obtain general conclusions about the methodology to use for linear algebra shared-memory routines auto-tuning. Satisfactory execution times are obtained with the proposed methodology.Partially supported by Fundacion Seneca, Consejeria de Educacion de la Region de Murcia, 08763/PI/08, PROMETEO/2009/013 from Generalitat Valenciana, the Spanish Ministry of Education and Science through TIN2012-38341-C04-03, and the High-Performance Computing Network on Parallel Heterogeneus Architectures (CAPAP-H). The authors gratefully acknowledge the computer resources and assistance provided by the Supercomputing Centre of the Scientific Park Foundation of Murcia and by the Centre de Supercomputacio de Catalunya.Cámara, J.; Cuenca, J.; Giménez, D.; García, LP.; Vidal Maciá, AM. (2014). Empirical Installation of Linear Algebra Shared-Memory Subroutines for Auto-Tuning. International Journal of Parallel Programming. 42(3):408-434. https://doi.org/10.1007/s10766-013-0249-6S408434423Agullo, E., Demmel, J., Dongarra, J., Hadri, B., Kurzak, J., Langou, J., Ltaief, H., Luszczek, P., Tomov, S.: Numerical linear algebra on emerging architectures: the PLASMA and MAGMA projects. J. Phys. Conf. Ser. 180(1), 1–5 (2009)Alberti, P., Alonso, P., Vidal, A.M., Cuenca, J., Giménez, D.: Designing polylibraries to speed up linear algebra computations. Int. J. High Perform. Comput. Netw. 1/2/3(1), 75–84 (2004)Anderson, E., Bai, Z., Bischof, C., Demmel, J., Dongarra, J.J., Du Croz, J., Grenbaum, A., Hammarling, S., McKenney, A., Ostrouchov, S., D. Sorensen, S.: LAPACK User’s Guide. Society for Industrial and Applied Mathematics, Philadelphia (1995)Bernabé, G., Cuenca, J., Giménez, D.: Optimization techniques for 3D-FWT on systems with manycore GPUs and multicore CPUs. In: ICCS (2013)Buttari, A., Langou, J., Kurzak, J., Dongarra, J.: A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Comput. 35(1), 38–53 (2009)Cámara, J., Cuenca, J., Giménez, D., Vidal. A.M.: Empirical autotuning of two-level parallel linear algebra routines on large cc-NUMA systems. In: ISPA (2012)Caron, E., Desprez, F., Suter, F.: Parallel extension of a dynamic performance forecasting tool. Scalable Comput. Pract. Exp. 6(1), 57–69 (2005)Chen, Z., Dongarra, J., Luszczek, P., Roche, K.: Self adapting software for numerical linear algebra and LAPACK for clusters. Parallel Comput. 29, 1723–1743 (2003)Cuenca, J., Giménez, D., González, J.: Achitecture of an automatic tuned linear algebra library. Parallel Comput. 30(2), 187–220 (2004)Cuenca, J., García, L.P., Giménez, D.: Improving linear algebra computation on NUMA platforms through auto-tuned nested parallelism. In: Proceedings of the 2012 EUROMICRO Conference on Parallel, Distributed and Network Processing (2012)Frigo, M.: FFTW: An adaptive software architecture for the FFT. In: Proceedings of the ICASSP Conference, vol. 3, p. 1381 (1998)Golub, G., Van Loan, C.F.: Matrix Computations, 3rd edn. The John Hopkins University Press, Baltimore (1996)Im, E.-J., Yelick, K., Vuduc, R.: Sparsity: optimization framework for sparse matrix kernels. Int. J. High Perform. Comput. Appl. (IJHPCA) 18(1), 135–158 (2004)Intel MKL web page.: http://software.intel.com/en-us/intel-mkl/Jerez, S., Montávez, J.-P., Giménez, D.: Optimizing the execution of a parallel meteorology simulation code. In: Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium. IEEE (2009)Katagiri, T., Kise, K., Honda, H., Yuba, T.: Fiber: a generalized framework for auto-tuning software. Springer LNCS 2858, 146–159 (2003)Katagiri, T., Kise, K., Honda, H., Yuba, T.: ABCLib-DRSSED: a parallel eigensolver with an auto-tuning facility. Parallel Comput. 32(3), 231–250 (2006)Kurzak, J., Tomov, S., Dongarra, J.: Autotuning gemm kernels for the FERMI GPU. IEEE Trans. Parallel Distrib. Syst. 23(11), 2045–2057 (2012)Lastovetsky, A.L., Reddy, R., Higgins, R.: Building the functional performance model of a processor. In: SAC, pp. 746–753 (2006)Li, J., Skjellum, A., Falgout, R.D.: A poly-algorithm for parallel dense matrix multiplication on two-dimensional process grid topologies. Concurrency Pract. Exp. 9(5), 345–389 (1997)Naono, K., Teranishi, K., Cavazos, J., Suda, R., (eds.): Software Automatic Tuning. From Concepts to State-of-the-Art Results. Springer, Berlin (2010)Nath, R., Tomov, S., Dongarra, J.: An improved MAGMA gemm for FERMI graphics processing units. IJHPCA 24(4), 511–515 (2010)Petitet, A., Blackford, L.S., Dongarra, J., Ellis, B., Fagg, G.E., Roche, K., Vadhiyar, S.S.: Numerical libraries and the grid. IJHPCA 15(4), 359–374 (2001)PLASMA.: http://icl.cs.utk.edu/plasma/Püschel, M., Moura, J.M.F., Singer, B., Xiong, J., Johnson, J.R., Padua, D.A., Veloso, M.M., Johnson, R.W.: Spiral: a generator for platform-adapted libraries of signal processing algorithms. IJHPCA 18(1), 21–45 (2004)Seshagiri, L., Wu, M.-S., Sosonkina, M., Zhang, Z., Gordon, M.S., Schmidt, M.W.: Enhancing adaptive middleware for quantum chemistry applications with a database framework. In: IPDPS Workshops, pp. 1–8 (2010)Tanaka, T., Katagiri, T., Yuba, T.: d-Spline based incremental parameter estimation in automatic performance tuning. In: PARA, pp. 986–995 (2006)Vuduc, R., Demmel, J., Bilmes, J.: Statistical models for automatic performance tuning. In: International Conference on Computational Science (1), pp. 117–126 (2001)Whaley, R.C., Petitet, A., Dongarra, J.: Automated empirical optimizations of software and the ATLAS project. Parallel Comput. 27(1–2), 3–35 (2001

    Incidental prostate cancer prevalence at radical cystoprostatectomy—importance of the histopathological work-up

    Get PDF
    The reported incidental prostate cancer prevalence rates at radical cystoprostatectomy cover a range from 4 to 60%. We investigated the influence of the histopathological work-up on prostate cancer prevalence rates. We identified 114 patients who had undergone cystoprostatectomy for bladder cancer between 2000 and 2012. Complete histopathological assessment was defined as follows: (i) complete embedding of the prostate gland, (ii) sectioning of 15 or more prostate sections, and (iii) processing as whole mount slides. Prostate cancer prevalence rates derived from complete and incomplete histopathological assessments were compared. The overall prostate cancer prevalence rate was 59.6%. A mean of 14.4 macroscopic tissue sections (thickness 3-5mm) were sectioned. Sectioning ≥15 sections resulted in a prostate cancer detection rate of 75%, compared to 42.6% when sectioning <15 sections (p < 0.001). Complete embedding yielded a prostate cancer detection rate of 72.3 and of 23.1% for partly embedded prostates (p < 0.0001). Prostate cancer was detected in 68.8% of the whole mounted samples and in 38.2% of the samples sectioned as standard slides (p < 0.01); according to the criteria described by Epstein and Ohori, 44.1% of the detected prostate cancers were clinically significant. The quality of the histopathological work-up significantly influences prostate cancer detection rates and might at least partially explain the highly variable reported incidental prostate cancer prevalence rates at cystoprostatectomy (CP). The high proportion of significant prostate cancer found in our series calls for a careful surgical approach to the prostate during CP
    • …
    corecore