
    A fast method to compute orthogonal loadings partial least squares

    We give a computationally fast method for orthogonal loadings partial least squares. Our algorithm avoids the multiple regression computations at each step and yields scores and loadings identical to those of the usual method. We give a proof of the equivalence to the standard algorithm and briefly discuss the computational advantages over both orthogonal scores and orthogonal loadings partial least squares.
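
    As a point of reference, here is a minimal sketch of the standard orthogonal scores PLS (NIPALS-style) algorithm for a single response, to which the fast method is proved equivalent. It is illustrative only and does not reproduce the paper's reformulation; all names are ours.

```python
import numpy as np

def pls_orthogonal_scores(X, y, n_components):
    """Standard orthogonal-scores PLS (NIPALS-style) for a single response.

    Illustrative sketch only: the paper's fast orthogonal-loadings
    algorithm yields the same scores and loadings while avoiding the
    per-step regression computations.
    """
    X = X - X.mean(axis=0)              # center predictors
    y = y - y.mean()                    # center response
    T, P, W = [], [], []
    for _ in range(n_components):
        w = X.T @ y                     # weight vector: covariance direction
        w /= np.linalg.norm(w)
        t = X @ w                       # score vector (orthogonal across steps)
        p = X.T @ t / (t @ t)           # loading vector
        X = X - np.outer(t, p)          # deflate X
        y = y - t * (t @ y) / (t @ t)   # deflate y
        T.append(t); P.append(p); W.append(w)
    return np.column_stack(T), np.column_stack(P), np.column_stack(W)

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
y = X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=50)
T, P, W = pls_orthogonal_scores(X, y, n_components=3)
print(np.round(T.T @ T, 6))             # off-diagonals ~0: scores are orthogonal
```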

    Nonparametric estimation of a mixing density via the kernel method

    We present a method to estimate the latent distribution in a mixture model. Our method is motivated by standard kernel density estimation, but instead of using an estimate based on the unobserved latent variables, we take the expectation with respect to their distribution conditional on the data. The resulting estimator is continuous and, hence, is appropriate when there is a strong belief in the continuity of the mixing distribution. We present an asymptotic justification and discuss the associated computational problems. The method is illustrated by an example of fission track analysis where we estimate the density of the age of crystals.
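
    To fix ideas, here is a minimal sketch of the posterior-expected-kernel idea in a hypothetical Gaussian working model (all names and parameter values are ours, not the paper's fission track setting): the infeasible kernel estimate based on the unobserved latent values is replaced by the expectation of the kernel under their conditional distribution given the data, which is available in closed form for a Gaussian kernel.

```python
import numpy as np
from scipy.stats import norm

def mixing_density_estimate(x, sigma, tau, h, grid):
    """Posterior-expected kernel estimate of a mixing density g.

    Hypothetical working model:
        X_i = theta_i + eps_i,  eps_i ~ N(0, sigma^2),  theta_i ~ g.
    Instead of the infeasible KDE (1/n) * sum_i K_h(u - theta_i), we
    average E[K_h(u - theta_i) | X_i] under a N(0, tau^2) working prior;
    for a Gaussian kernel this is just a Gaussian with inflated scale.
    """
    shrink = tau**2 / (tau**2 + sigma**2)
    m = shrink * x                  # posterior means of the latent theta_i
    v = shrink * sigma**2           # common posterior variance
    s = np.sqrt(h**2 + v)           # kernel convolved with the posterior
    return norm.pdf(grid[:, None], loc=m, scale=s).mean(axis=1)

rng = np.random.default_rng(1)
theta = rng.normal(2.0, 1.0, size=400)      # latent values, never observed
x = theta + rng.normal(0.0, 0.5, size=400)  # observed data
grid = np.linspace(-2, 6, 200)
g_hat = mixing_density_estimate(x, sigma=0.5, tau=2.0, h=0.3, grid=grid)
```

    By construction the estimate is a mixture of Gaussians in u, hence continuous, matching the motivation above.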

    Explaining the saddlepoint approximation

    Saddlepoint approximations are powerful tools for obtaining accurate expressions for densities and distribution functions. We give an elementary motivation and explanation of saddlepoint approximation techniques, stressing the connection with the familiar Taylor series expansions and the Laplace approximation of integrals. Saddlepoint methods are applied to the convolution of simple densities and, using the Fourier inversion formula, the saddlepoint approximation to the density of a random variable is derived. We then apply the method to densities of sample means of iid random variables, and also demonstrate the technique for approximating the density of a maximum likelihood estimator in exponential families.
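
    A standard worked example of the sample-mean case (our choice of distribution, not necessarily the paper's): the saddlepoint approximation to the density of the mean of n iid Exp(1) variables, checked against the exact Gamma density.

```python
import numpy as np
from scipy.stats import gamma

def saddlepoint_mean_exp1(xbar, n):
    """Saddlepoint approximation to the density of the mean of n iid Exp(1)s.

    CGF of Exp(1): K(s) = -log(1 - s), s < 1.
    Saddlepoint equation K'(s_hat) = xbar gives s_hat = 1 - 1/xbar,
    with K''(s_hat) = xbar**2, and the approximation is
        f(xbar) ~ sqrt(n / (2*pi*K''(s_hat))) * exp(n*(K(s_hat) - s_hat*xbar)).
    """
    s_hat = 1.0 - 1.0 / xbar
    K = -np.log(1.0 - s_hat)
    K2 = xbar**2
    return np.sqrt(n / (2 * np.pi * K2)) * np.exp(n * (K - s_hat * xbar))

n = 5
x = np.linspace(0.2, 3.0, 6)
approx = saddlepoint_mean_exp1(x, n)
exact = gamma.pdf(x, a=n, scale=1.0 / n)  # mean of n Exp(1)s is Gamma(n, 1/n)
print(np.round(approx / exact, 4))        # constant ratio ~1.017: exact here
                                          # up to renormalization
```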

    A parametric model for heterogeneity in paired Poisson counts

    We present a model for data in the form of matched pairs of counts. Our work is motivated by a problem in fission track analysis, where the determination of a crystal age is based on the ratio of counts of spontaneous and induced tracks. It is often reasonable to assume that the counts follow a Poisson distribution but, typically, they are overdispersed and there exists a positive correlation between the numbers of spontaneous and induced tracks within the same crystal. We propose a model that allows for both overdispersion and correlation by assuming that the mean densities follow a bivariate Wishart distribution. Our model is quite general, having the usual negative binomial or Poisson models as special cases. We propose a maximum likelihood estimation method based on a stochastic implementation of the EM algorithm and we derive the asymptotic standard errors of the parameter estimates. We illustrate the method with a data set of fission track counts in matched areas of zircon crystals.
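
    As an illustration of the special case mentioned above, the following sketch simulates paired counts with a single shared gamma crystal effect, which already produces both overdispersion and positive within-pair correlation. The parameter values are invented; the full bivariate model and its stochastic EM fit are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_paired_counts(n, mu_s, mu_i, shape):
    """Simulate overdispersed, positively correlated paired Poisson counts.

    Illustrative special case only: one gamma-distributed crystal effect
    u_j (mean 1, shape `shape`) scales both mean densities, giving
    negative binomial margins and positive within-pair correlation.
    """
    u = rng.gamma(shape, 1.0 / shape, size=n)   # shared crystal effect
    ns = rng.poisson(mu_s * u)                  # spontaneous track counts
    ni = rng.poisson(mu_i * u)                  # induced track counts
    return ns, ni

ns, ni = simulate_paired_counts(2000, mu_s=10.0, mu_i=25.0, shape=4.0)
print(ns.var() / ns.mean())          # > 1: overdispersion
print(np.corrcoef(ns, ni)[0, 1])     # > 0: within-pair correlation
```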

    A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD

    In this paper, a new methodology for speeding up Matrix–Matrix Multiplication using the Single Instruction Multiple Data (SIMD) unit, on one or more cores sharing a cache, is presented. This methodology achieves higher execution speed than the state-of-the-art ATLAS library (speedups from 1.08 up to 3.5) by decreasing the number of instructions (load/store and arithmetic) and the data cache accesses and misses in the memory hierarchy. This is achieved by fully exploiting the software characteristics (e.g. data reuse) and the hardware parameters (e.g. data cache sizes and associativities) as one problem and not separately, giving high-quality solutions and a smaller search space.
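
    The data-reuse ingredient can be sketched with a cache-blocked (tiled) multiplication. The tile size below is an arbitrary placeholder, whereas the methodology derives it from the actual cache sizes and associativities and vectorizes the inner kernel with SIMD instructions; NumPy's vectorized block product merely stands in for that kernel.

```python
import numpy as np

def blocked_matmul(A, B, tile=64):
    """Cache-blocked matrix multiplication, C = A @ B.

    Each (tile x tile) block of A and B is reused many times while it is
    cache-resident, cutting accesses and misses in the memory hierarchy.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2
    C = np.zeros((n, m))
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # vectorized micro-kernel on cache-resident tiles
                C[i0:i0+tile, j0:j0+tile] += (
                    A[i0:i0+tile, k0:k0+tile] @ B[k0:k0+tile, j0:j0+tile]
                )
    return C

A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
assert np.allclose(blocked_matmul(A, B), A @ B)
```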

    Array size computation under uniform overlapping and irregular accesses

    The size required to store an array is crucial for an embedded system, as it affects the memory size, the energy per memory access, and the overall system cost. Existing techniques for finding the minimum number of resources required to store an array are less efficient for codes with large loops and irregularly occurring memory accesses: either they approximate the accessed parts of the array, overestimating the required resources, or their exploration time grows with the number of distinct accessed parts of the array. We propose a methodology to compute the minimum resources required to store an array which keeps the exploration time low and provides a near-optimal result for both regularly and irregularly occurring memory accesses and for overlapping writes and reads.
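
    The quantity being bounded can be illustrated with a toy liveness scan over an access trace. The event format and helper below are hypothetical, and the methodology itself avoids enumerating such traces for large loops; the sketch only shows what "minimum resources" means.

```python
def min_buffer_size(trace):
    """Maximum number of simultaneously live array elements in a trace.

    `trace` is a sequence of ('w', idx) / ('r', idx) events in program
    order, where each read is assumed to be the last use of its element.
    The peak size of the live set is the storage actually needed.
    """
    live, peak = set(), 0
    for op, idx in trace:
        if op == 'w':
            live.add(idx)
            peak = max(peak, len(live))
        else:
            live.discard(idx)
    return peak

# a[i] is written at iteration i and last read at iteration i + 3:
# only 4 elements are ever live, so a 4-slot circular buffer suffices.
trace = []
for i in range(20):
    trace.append(('w', i))
    if i >= 3:
        trace.append(('r', i - 3))
print(min_buffer_size(trace))   # -> 4, far below the 20 declared elements
```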

    A template-based methodology for efficient microprocessor and FPGA accelerator co-design

    Embedded applications usually require Software/Hardware (SW/HW) designs to meet hard timing constraints and the required design flexibility. Exhaustive exploration of SW/HW designs is a very time-consuming task, while ad hoc approaches and partially automatic tools usually lead to less efficient designs. To support a more efficient co-design process for FPGA platforms, we propose a systematic methodology to map an application to a SW/HW platform with a custom HW accelerator and a microprocessor core. The mapping steps of the methodology are expressed through parametric templates for the SW/HW Communication Organization, the Foreground (FG) Memory Management and the Data Path (DP) Mapping. Several performance-area trade-off design Pareto points are produced by instantiating the templates. A real-time bioimaging application is mapped onto an FPGA to evaluate the gains of our approach, i.e. 44.8% on performance compared with pure SW designs and 58% on area compared with pure HW designs.
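
    The Pareto-point selection step can be sketched as follows, with made-up (latency, area) pairs standing in for the design points produced by instantiating the templates.

```python
def pareto_front(points):
    """Return the Pareto-optimal (latency, area) design points.

    A point is dominated if some other point is no worse in both latency
    and area; the survivors form the performance-area trade-off curve.
    """
    front = []
    for p in points:
        if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points):
            front.append(p)
    return sorted(front)

# hypothetical (latency, area) design points from template instantiation
designs = [(120, 40), (80, 55), (80, 70), (60, 90), (150, 35), (60, 100)]
print(pareto_front(designs))   # -> [(60, 90), (80, 55), (120, 40), (150, 35)]
```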

    Area-throughput trade-offs for SHA-1 and SHA-256 hash functions’ pipelined designs

    High-throughput designs of hash functions are in strong demand due to the need for security in every transmitted packet of worldwide e-transactions. Thus, both optimized and non-optimized pipelined architectures have been proposed, raising important questions. What is the optimum number of pipeline stages? Is it worthwhile to develop optimized designs, or could the same results be achieved by simply increasing the number of pipeline stages of the non-optimized designs? The paper answers these questions by extensively studying many pipelined architectures of the SHA-1 and SHA-256 hashes, implemented in FPGAs, in terms of the throughput/area (T/A) factor. Guidelines for developing efficient security scheme designs are also provided.
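
    A first-order toy model (all constants invented, not the paper's measurements) shows why an optimum stage count can exist: throughput grows with the number of in-flight blocks, while clock frequency sags and area accumulates per-stage overhead, so the T/A factor peaks at an intermediate pipeline depth.

```python
def t_over_a(stages, f0=150e6, a_fixed=500, a_stage=250):
    """Toy throughput/area model for a `stages`-stage pipelined hash core.

    Assumes SHA-256's 64 rounds split across `stages` stages, so one
    512-bit block completes every 64/stages cycles; frequency degrades
    mildly with depth and area has fixed plus per-stage parts.
    """
    freq = f0 / (1 + 0.05 * (stages - 1))      # clock sags with depth
    throughput = 512 * freq * stages / 64      # bits per second
    area = a_fixed + a_stage * stages          # arbitrary area units
    return throughput / area

for k in (1, 2, 4, 8, 16):
    print(k, f"{t_over_a(k):.3e}")   # T/A peaks at an intermediate depth
```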