A Memory Bandwidth-Efficient Hybrid Radix Sort on GPUs
Sorting is at the core of many database operations, such as index creation,
sort-merge joins, and user-requested output sorting. As GPUs are emerging as a
promising platform to accelerate various operations, sorting on GPUs becomes a
viable endeavour. Over the past few years, several improvements have been
proposed for sorting on GPUs, leading to the first radix sort implementations
that achieve a sorting rate of over one billion 32-bit keys per second. Yet,
state-of-the-art approaches are heavily memory bandwidth-bound, as they require
substantially more memory transfers than their CPU-based counterparts.
Our work proposes a novel approach that almost halves the amount of memory
transfers and, therefore, considerably lifts the memory bandwidth limitation.
Being able to sort two gigabytes of eight-byte records in as little as 50
milliseconds, our approach achieves a 2.32-fold improvement over the
state-of-the-art GPU-based radix sort for uniform distributions and sustains a
speed-up of at least a factor of 1.66 for skewed distributions.
To address inputs that either do not reside on the GPU or exceed the
available device memory, we build on our efficient GPU sorting approach with a
pipelined heterogeneous sorting algorithm that mitigates the overhead
associated with PCIe data transfers. Comparing the end-to-end sorting
performance to the state-of-the-art CPU-based radix sort running 16 threads,
our heterogeneous approach achieves a 2.06-fold and a 1.53-fold improvement for
sorting 64 GB of key-value pairs with a skewed and a uniform distribution,
respectively. Comment: 16 pages, accepted at SIGMOD 201
Preparation of a geologic photo map and hydrologic study of the Yemen Arab Republic
There are no author-identified significant results in this report.
Charged Higgs phenomenology in the flipped two Higgs doublet model
We study the phenomenology of the charged Higgs boson in the "flipped" two
Higgs doublet model, in which one doublet gives mass to up-type quarks and
charged leptons and the other gives mass to down-type quarks. We present the
charged Higgs branching ratios and summarize the indirect constraints. We
extrapolate existing LEP searches for H+H- and Tevatron searches for t tbar
with t --> H+ b into the flipped model and extract constraints on MH+ and the
parameter tan(beta). We finish by reviewing existing LHC charged Higgs searches
and suggest that the LHC reach in this model could be extended for charged
Higgs masses below the tb threshold by considering t tbar with t --> H+ b and
H+ --> q qbar, as has been used in Tevatron searches. Comment: 23 pages, 7 figures. V2: added refs on H+W- associated productio
Facet ridge end points in crystal shapes
Equilibrium crystal shapes (ECS) near facet ridge end points (FRE) are
generically complex. We study the body-centered solid-on-solid model on a
square lattice with an enhanced uniaxial interaction range to test the
stability of the so-called stochastic FRE point where the model maps exactly
onto one-dimensional Kardar-Parisi-Zhang type growth and the local ECS is
simple. The latter is unstable. The generic ECS contains first-order ridges
extending into the rounded part of the ECS, where two rough orientations
coexist, and first-order faceted-to-rough boundaries terminating in
Pokrovsky-Talapov type end points. Comment: Contains 4 pages, 5 eps figures. Uses RevTe
Unitary ambiguity in the extraction of the E2/M1 ratio for the transition
The resonant electric quadrupole amplitude in the transition is of great interest for the understanding of
baryon structure. Various dynamical models have been developed to extract it
from the corresponding photoproduction multipole of pions on nucleons. It is
shown that once such a model is specified, a whole class of unitarily
equivalent models can be constructed, all of them providing exactly the same
fit to the experimental data. However, they may predict quite different
resonant amplitudes. Therefore, the extraction of the E2/M1() ratio (bare or dressed) which is based on a dynamical
model using a largely phenomenological interaction is not unique. Comment: 10 pages revtex including 4 postscript figure
Learning under Distributed Weak Supervision
The availability of training data for supervision is a frequently encountered bottleneck of medical image analysis methods. While typically established by a clinical expert rater, the increase in acquired imaging data renders traditional pixel-wise segmentations less feasible. In this paper, we examine the use of a crowdsourcing platform for the distribution of super-pixel weak annotation tasks and collect such annotations from a crowd of non-expert raters. The crowd annotations are subsequently used for training a fully convolutional neural network to address the problem of fetal brain segmentation in T2-weighted MR images. Using this approach, we report encouraging results compared to highly targeted, fully supervised methods and potentially address a frequent problem impeding image analysis research.
Periodic solutions of a delayed predator-prey model with stage structure for predator
A periodic time-dependent Lotka-Volterra-type predator-prey model
with stage structure for the predator and time delays due to
negative feedback and gestation is investigated. Sufficient
conditions are derived, respectively, for the existence and global
stability of positive periodic solutions to the proposed model.
On rigidly rotating perfect fluid cylinders
The gravitational field of a rigidly rotating perfect fluid cylinder with
gamma-law equation of state is found analytically. The solution has two
parameters and is physically realistic for gamma in the interval (1.41,2].
Closed timelike curves always appear at large distances. Comment: 10 pages, Revtex (galley
Reconstruction of environmental histories to investigate patterns of larval radiated shanny (Ulvaria subbifurcata) growth and selective survival in a large bay of Newfoundland
We used otolith microstructure analysis to reconstruct the growth histories of larval radiated shanny (Ulvaria subbifurcata) collected over a 2-week period in Trinity Bay, Newfoundland. A dynamic 3-dimensional, eddy-resolving circulation model of the region provided larval drift patterns, which were combined with measurements of temperature and zooplankton abundance to assess the environmental history of the larvae. The abundance of juvenile and adult capelin (Mallotus villosus), the dominant planktivorous fish in this area, was monitored using five hydroacoustic surveys. The goal was to determine whether environmental histories are helpful in explaining spatial and temporal differences in larval shanny growth, measured as cumulative distribution functions (CDF) of growth rates. We found evidence for a selective loss of slower growing individuals and recognized considerable spatial differences in the CDF of larval growth rates. Consistent patterns in capelin abundance suggested that faster growing survivors, sampled at the end of the 2-week period, developed in areas of low predator densities. A dome-shaped relationship between temperature and larval growth was observed, explaining a significant but small amount of the overall variability (14%). Effects of experienced prey concentrations on larval growth rates could not be demonstrated.
E(5), X(5), and Prolate to Oblate Shape Phase Transitions in Relativistic Hartree Bogoliubov Theory
Relativistic mean field theory with the NL3 force is used for producing
potential energy surfaces (PES) for series of isotopes suggested as exhibiting
critical point symmetries. Relatively flat PES are obtained for nuclei showing
the E(5) symmetry, while in nuclei corresponding to the X(5) case, PES with a
bump are obtained. The PES corresponding to the Pt chain of isotopes suggest a
transition from prolate to oblate shapes at 186-Pt. Comment: 21 pages, LaTeX, including 14 .eps figure