6 research outputs found

    Analysis of A Splitting Approach for the Parallel Solution of Linear Systems on GPU Cards

    We discuss an approach for solving sparse or dense banded linear systems $\mathbf{A}\mathbf{x} = \mathbf{b}$ on a Graphics Processing Unit (GPU) card. The matrix $\mathbf{A} \in \mathbb{R}^{N \times N}$ is possibly nonsymmetric and moderately large; i.e., $10000 \leq N \leq 500000$. The split and parallelize (SaP) approach seeks to partition the matrix $\mathbf{A}$ into diagonal sub-blocks $\mathbf{A}_i$, $i = 1, \ldots, P$, which are independently factored in parallel. The solution may choose to consider or to ignore the matrices that couple the diagonal sub-blocks $\mathbf{A}_i$. This approach, along with the Krylov subspace-based iterative method that it preconditions, is implemented in a solver called SaP::GPU, which is compared in terms of efficiency with three commonly used sparse direct solvers: PARDISO, SuperLU, and MUMPS. SaP::GPU, which runs entirely on the GPU except for several stages involved in preliminary row-column permutations, is robust and compares well in terms of efficiency with the aforementioned direct solvers. In a comparison against Intel's MKL, SaP::GPU also fares well when used to solve dense banded systems that are close to being diagonally dominant. SaP::GPU is publicly available and distributed as open source under a permissive BSD3 license. Comment: 38 pages
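
    The decoupled variant of the splitting idea described in this abstract (factor the diagonal sub-blocks independently and ignore the coupling matrices) can be illustrated with a minimal serial sketch. This is not the SaP::GPU implementation: it assumes a tridiagonal, diagonally dominant matrix, solves each block with the Thomas algorithm, and uses a plain preconditioned Richardson iteration as a stand-in for the Krylov method. All function names are hypothetical.

```python
def thomas(lower, diag, upper, rhs):
    """Solve one tridiagonal block; lower[0] and upper[-1] are never used."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def matvec(lower, diag, upper, x):
    """Multiply the full tridiagonal matrix (including couplings) by x."""
    n = len(diag)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(1, n):
        y[i] += lower[i] * x[i - 1]
    for i in range(n - 1):
        y[i] += upper[i] * x[i + 1]
    return y


def apply_block_preconditioner(lower, diag, upper, r, num_blocks):
    """Solve with the block-diagonal part only; each block is independent,
    so on parallel hardware the loop iterations could run concurrently."""
    n = len(diag)
    size = -(-n // num_blocks)  # ceiling division
    z = []
    for start in range(0, n, size):
        end = min(start + size, n)
        # Slicing drops the entries that couple neighbouring blocks,
        # because thomas() ignores lower[0] and upper[-1] of each slice.
        z.extend(thomas(lower[start:end], diag[start:end],
                        upper[start:end], r[start:end]))
    return z


def sap_decoupled_solve(lower, diag, upper, b, num_blocks=4,
                        tol=1e-10, max_iter=200):
    """Preconditioned Richardson iteration: x <- x + M^{-1} (b - A x)."""
    x = [0.0] * len(diag)
    for it in range(max_iter):
        ax = matvec(lower, diag, upper, x)
        r = [bi - yi for bi, yi in zip(b, ax)]
        if max(abs(v) for v in r) < tol:
            return x, it
        z = apply_block_preconditioner(lower, diag, upper, r, num_blocks)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x, max_iter


if __name__ == "__main__":
    n = 16
    lower = [0.0] + [-1.0] * (n - 1)
    diag = [4.0] * n
    upper = [-1.0] * (n - 1) + [0.0]
    b = matvec(lower, diag, upper, [1.0] * n)
    x, iterations = sap_decoupled_solve(lower, diag, upper, b)
    print(iterations, max(abs(v - 1.0) for v in x))
```

    Ignoring the coupling entries makes the preconditioner cheaper but less accurate, which is why the iteration count grows with the number of blocks; with a single block the preconditioner is an exact solve.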

    Bayesian analysis of finite mixture distributions using the allocation sampler

    Finite mixture distributions are receiving more and more attention from statisticians in many different fields of research because they are a very flexible class of models. They are typically used for density estimation or to model population heterogeneity. One can think of a finite mixture distribution as grouping the observations into components from which they are assumed to have arisen. In certain settings these groups have a physical interpretation. The interest in these distributions has been boosted recently because of the ever-increasing computer power available to researchers to carry out the computationally intensive tasks required in their analysis. In order to fit a finite mixture distribution taking a Bayesian approach, a posterior distribution has to be evaluated. When the number of components in the model is assumed known, this posterior distribution can be sampled from using methods such as Data Augmentation or Gibbs sampling (Tanner and Wong (1987) and Gelfand and Smith (1990)) and the Metropolis-Hastings algorithm (Hastings (1970)). However, the number of components in the model can also be considered an unknown and an object of inference. Richardson and Green (1997) and Stephens (2000a) both describe Bayesian methods to sample across models with different numbers of components. This enables an estimate of the posterior distribution of the number of components to be evaluated. Richardson and Green (1997) define a reversible jump Markov chain Monte Carlo (RJMCMC) sampler, while Stephens (2000a) uses a Markov birth-death process approach to sample from the posterior distribution. In this thesis a Markov chain Monte Carlo method, named the allocation sampler, is presented. This sampler differs from the RJMCMC method reported in Richardson and Green (1997) because the state space of the sampler is simplified by the assumption that the components' parameters and weights can be analytically integrated out of the model. This in turn has the advantage that only minimal changes are required to the sampler for mixtures of components from other parametric families. This thesis illustrates the allocation sampler's performance on both simulated and real data sets. Chapter 1 provides a background to finite mixture distributions and gives an overview of some inferential techniques that have already been used to analyse these distributions. Chapter 2 sets out the Bayesian model framework that is used throughout this thesis and defines all the required distributional results. Chapter 3 describes the allocation sampler. Chapter 4 tests the performance of the allocation sampler using simulated datasets from a collection of 15 different known mixture distributions. Chapter 5 illustrates the allocation sampler with real datasets from a number of different research fields. Chapter 6 summarises the research in the thesis and provides areas of possible future research.
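
    To make the idea of integrating the weights out of the model concrete, the sketch below computes the marginal prior of an allocation vector once the mixture weights have been integrated out. It assumes a symmetric Dirichlet prior on the weights, which the abstract does not state; the function names and the `alpha` parameter are hypothetical, and the analogous integration over component parameters (which needs a conjugate family) is not shown.

```python
from math import lgamma


def log_allocation_prior(alloc, k, alpha=1.0):
    """Log of p(g | k) with weights integrated out under an assumed
    symmetric Dirichlet(alpha, ..., alpha) prior (Dirichlet-multinomial).

    alloc: list of component labels in range(k), one per observation.
    """
    n = len(alloc)
    counts = [0] * k
    for g in alloc:
        counts[g] += 1
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for n_j in counts:
        out += lgamma(alpha + n_j) - lgamma(alpha)
    return out


def site_conditional_prior(alloc, i, k, alpha=1.0):
    """Prior part of p(g_i = j | all other allocations), which is
    proportional to (count of component j without observation i) + alpha."""
    counts = [0] * k
    for idx, g in enumerate(alloc):
        if idx != i:
            counts[g] += 1
    total = sum(c + alpha for c in counts)
    return [(c + alpha) / total for c in counts]
```

    A sampler over allocations would multiply these prior terms by the marginal likelihood of the data in each component, so that neither weights nor component parameters appear in the state space.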

    A Study of Image-based C-arm Tracking Using Minimal Fiducials

    Image-based tracking of the C-arm continues to be a critical and challenging problem for many clinical applications due to its widespread use in many computer-assisted procedures that rely upon its accuracy for further planning, registration, and reconstruction tasks. In this thesis, a variety of approaches are presented to improve current C-arm tracking methods and devices for intra-operative procedures. The first approach presents a novel two-dimensional fiducial comprising a set of coplanar conics and an improved single-image pose estimation algorithm that addresses segmentation errors using a mathematical equilibration approach. Simulation results show an improvement in the mean rotation and translation errors by factors of 4 and 1.75, respectively, as a result of using the proposed algorithm. Experiments using real data obtained by imaging a simple, precisely machined model consisting of three coplanar ellipses retrieve pose estimates that are in good agreement with those obtained by a ground truth optical tracker. This two-dimensional fiducial can be easily placed under the patient, allowing a wide field of view for the motion of the C-arm. The second approach applies learning-based techniques to two-view geometry. A demonstrative algorithm is used to simultaneously tackle matching and segmentation issues of features segmented from pairs of acquired images. The corrected features can then be used to retrieve the epipolar geometry, which can ultimately provide pose parameters using a one-dimensional fiducial. The problem of match refinement for epipolar geometry estimation is formulated in a reinforcement-learning framework. Experiments demonstrate the ability to both reject false matches and fix small localization errors in the segmentation of true noisy matches in a minimal number of steps. The third approach presents a feasibility study for an approach that entirely eliminates the use of tracking fiducials. It relies only on preoperative data to initialize a point-based model that is subsequently used to iteratively estimate the pose and the structure of the point-like intraoperative implant using three to six images simultaneously. This method is tested in the framework of prostate brachytherapy, in which preoperative data including planned 3D locations for a large number of point-like implants called seeds is usually available. Simultaneous pose estimation for the C-arm for each image and localization of the seeds is studied in a simulation environment. Results indicate mean reconstruction errors that are less than 1.2 mm for noisy plans of 84 seeds or fewer. These are attained when the 3D mean error introduced to the plan as a result of adding Gaussian noise is less than 3.2 mm.
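
    The role of epipolar geometry in rejecting false matches can be shown with a small sketch. It does not reproduce the reinforcement-learning formulation of the thesis: it swaps in a fixed residual threshold on the epipolar constraint $x_2^\top E x_1 = 0$, and it assumes normalized image coordinates and a known relative pose with no rotation. All names and numeric values are hypothetical.

```python
def cross_matrix(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v equals t cross v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def project(point):
    """Project a 3D point to homogeneous normalized image coordinates."""
    x, y, z = point
    return [x / z, y / z, 1.0]


def epipolar_residual(essential, x1, x2):
    """Return |x2^T E x1| for homogeneous image points x1 and x2."""
    e_x1 = [sum(essential[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return abs(sum(x2[i] * e_x1[i] for i in range(3)))


def filter_matches(essential, matches, threshold=1e-9):
    """Keep only the matches that satisfy the epipolar constraint."""
    return [(x1, x2) for x1, x2 in matches
            if epipolar_residual(essential, x1, x2) < threshold]


if __name__ == "__main__":
    # Second view is the first view translated by t (rotation = identity),
    # so the essential matrix reduces to E = [t]_x.
    t = (1.0, 0.0, 0.2)
    essential = cross_matrix(t)
    points = [(0.5, -0.3, 4.0), (-1.0, 0.8, 6.0)]
    view1 = [project(p) for p in points]
    view2 = [project((p[0] + t[0], p[1] + t[1], p[2] + t[2])) for p in points]
    matches = list(zip(view1, view2)) + [(view1[0], view2[1])]  # last is false
    print(len(filter_matches(essential, matches)))
```

    With noisy segmentations the residual of a true match is small rather than zero, so the threshold would have to be chosen from the expected localization error.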

    Algorithm 548: Solution of the Assignment Problem [H]
