A Tuned and Scalable Fast Multipole Method as a Preeminent Algorithm for Exascale Systems
Among the algorithms that are likely to play a major role in future exascale
computing, the fast multipole method (FMM) appears as a rising star. Our
recent work showed scaling of an FMM on GPU clusters, with problem
sizes on the order of billions of unknowns. That work led to an extremely
parallel FMM, scaling to thousands of GPUs or tens of thousands of CPUs. This
paper reports on a campaign of performance tuning and scalability studies
using multi-core CPUs, on the Kraken supercomputer. All kernels in the FMM were
parallelized using OpenMP, and a test using 10^7 particles randomly distributed
in a cube showed 78% efficiency on 8 threads. Tuning of the
particle-to-particle kernel using SIMD instructions resulted in 4x speed-up of
the overall algorithm on single-core tests with 10^3 - 10^7 particles. Parallel
scalability was studied in both strong and weak scaling. The strong scaling
test used 10^8 particles and resulted in 93% parallel efficiency on 2048
processes for the non-SIMD code and 54% for the SIMD-optimized code (which was
still 2x faster). The weak scaling test used 10^6 particles per process, and
resulted in 72% efficiency on 32,768 processes, with the largest calculation
taking about 40 seconds to evaluate more than 32 billion unknowns. This work
builds up evidence for our view that FMM is poised to play a leading role in
exascale computing, and we end the paper with a discussion of the features that
make it a particularly favorable algorithm for the emerging heterogeneous and
massively parallel architectural landscape.
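The efficiency figures quoted in this abstract can be read against the usual definitions of strong and weak scaling efficiency. The sketch below is illustrative only: the function names and the timings are hypothetical (the abstract reports no raw timings or baseline process count), and only the process counts, particle counts, and the 93% figure come from the abstract.

```python
def strong_scaling_efficiency(t_base, p_base, t_p, p):
    """Efficiency for a fixed total problem size: ideal time on p processes
    is t_base * p_base / p, so efficiency = ideal time / measured time."""
    return (t_base * p_base) / (t_p * p)


def weak_scaling_efficiency(t_base, t_p):
    """Efficiency for a fixed problem size per process: ideal time is constant."""
    return t_base / t_p


# Weak scaling test from the abstract: 10^6 particles on each of 32,768 processes.
total_unknowns = 32_768 * 10**6  # more than 32 billion, as stated

# Hypothetical timings (assumption): a 100 s single-process baseline and a
# 2048-process run chosen so that the efficiency comes out at 93%.
t_single = 100.0
t_2048 = t_single / (2048 * 0.93)
efficiency = strong_scaling_efficiency(t_single, 1, t_2048, 2048)
print(total_unknowns, round(efficiency, 2))
```

Under these definitions a code with lower parallel efficiency can still finish sooner in absolute time, which is consistent with the abstract's remark that the SIMD-optimized code was 2x faster despite its 54% efficiency.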
Cross-Disciplinary Analysis of the On-Farm Transition from Conventional to Organic Vegetable Production
This farm-scale analysis of the three-year transition to organic from conventional vegetable production tracked the changes in crop, soil, pest and management on two ranches (40 and 47 ha) in the Salinas Valley, California. Many small plantings of a diverse set of cash crop and cover crop species were used, as compared to only a few species in large monocultures in conventional production. The general trends with time were: increase in soil biological indicators, low soil nitrate pools, adequate crop nutrients, minor disease and weed problems, and sporadic mild insect damage. Some crops and cultivars consistently produced higher yields than others, relative to the maximum yield for a given crop. Differences in insect and disease damage were also observed. These results support the value of initially using a biodiverse set of taxa to reduce risk, then later choosing the best-suited varieties for optimal production. The grower used some principles of organic farming (e.g., crop diversity, crop rotation, and organic matter management), but also relied on substitution-based management, such as fertigation with soluble nutrients, initially heavy applications of organic pesticides, and use of inputs derived from off-farm sources. The organic transition was conducive to both production goals and environmental quality.
Computing the k-th Eigenvalue of Symmetric H^2-Matrices
The numerical solution of eigenvalue problems is essential in various
application areas of scientific and engineering domains. In many problem
classes, the practical interest is only a small subset of eigenvalues so it is
unnecessary to compute all of the eigenvalues. Notable examples are the
electronic structure problems where the k-th smallest eigenvalue is closely
related to the electronic properties of materials. In this paper, we consider
the k-th eigenvalue problems of symmetric dense matrices with low-rank
off-diagonal blocks. We present a linear time generalized LDL decomposition of
H^2-matrices and combine it with the bisection eigenvalue algorithm
to compute the k-th eigenvalue with controllable accuracy. In addition, if
more than one eigenvalue is required, some of the previous computations can be
reused to compute the other eigenvalues in parallel. Numerical experiments show
that our method is more efficient than the state-of-the-art dense eigenvalue
solver in LAPACK/ScaLAPACK and ELPA. Furthermore, tests on electronic state
calculations of carbon nanomaterials demonstrate that our method outperforms
the existing HSS-based bisection eigenvalue algorithm on 3D problems. Comment: 14 pages, 11 figures
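The combination of an LDL decomposition with bisection described in this abstract rests on Sylvester's law of inertia: the number of negative entries in the D factor of A - sI equals the number of eigenvalues of A below the shift s. The sketch below illustrates that idea only. It uses SciPy's dense `scipy.linalg.ldl` as a stand-in for the paper's linear-time factorization, so it does not reproduce the paper's method or its complexity; the function names and the random test matrix are assumptions made for illustration.

```python
import numpy as np
from scipy.linalg import ldl


def count_eigs_below(A, shift):
    """Number of eigenvalues of symmetric A that are strictly below shift."""
    n = A.shape[0]
    # A - shift*I = L D L^T, so D has the same inertia as A - shift*I.
    _, D, _ = ldl(A - shift * np.eye(n))
    # D is block diagonal (1x1 and 2x2 blocks); count its negative eigenvalues.
    return int(np.sum(np.linalg.eigvalsh(D) < 0.0))


def kth_eigenvalue(A, k, tol=1e-10):
    """k-th smallest eigenvalue (k = 1..n) of symmetric A by bisection."""
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    lo = float(np.min(np.diag(A) - radius)) - 1.0
    hi = float(np.max(np.diag(A) + radius)) + 1.0
    # Invariant: fewer than k eigenvalues lie below lo, at least k lie below hi.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_eigs_below(A, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Hypothetical test problem: a random dense symmetric matrix.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = 0.5 * (M + M.T)
k = 7
approx = kth_eigenvalue(A, k)
exact = np.linalg.eigvalsh(A)[k - 1]
print(abs(approx - exact))
```

The tolerance `tol` plays the role of the controllable accuracy mentioned in the abstract, and each call to `count_eigs_below` is independent of the others, which is what makes bisection for several eigenvalues easy to run in parallel.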
Locking Local Oscillator Phase to the Atomic Phase via Weak Measurement
We propose a new method to reduce the frequency noise of a Local Oscillator
(LO) to the level of white phase noise by maintaining (not destroying by
projective measurement) the coherence of the ensemble pseudo-spin of atoms over
many measurement cycles. This scheme uses weak measurement to monitor the phase
in the Ramsey method and repeats the cycle without initializing the phase; we
call it "atomic phase lock" (APL) in this paper. APL will achieve white phase
noise as long as the noise accumulated during dead time and the decoherence are
smaller than the measurement noise. A numerical simulation confirms that with
APL, Allan deviation is averaged down at a maximum rate that is proportional to
the inverse of the total measurement time, tau^-1. In contrast, current atomic
clocks that use projective measurement suppress the noise only down to the
level of white frequency noise, in which case the Allan deviation scales as tau^(-1/2).
Faraday rotation is one of the possible ways to realize weak measurement for
APL. We evaluate the strength of Faraday rotation with 171Yb+ ions trapped in a
linear rf-trap and discuss the performance of APL. The main source of
decoherence is spontaneous emission induced by the probe beam used for the Faraday
rotation measurement. One can repeat the Faraday rotation measurement until the
decoherence becomes comparable to the SNR of the measurement. We estimate this
number of cycles to be ~100 for realistic experimental parameters. Comment: 18 pages, 7 figures, submitted to New Journal of Physics
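To make the two scaling laws in this abstract concrete, the short sketch below compares how much the Allan deviation drops when the averaging time grows by a given factor, for the tau^-1 behaviour of white phase noise and the tau^(-1/2) behaviour of white frequency noise. The function name and the factor of 100 are illustrative assumptions, not values from the paper.

```python
def allan_deviation_ratio(tau_ratio, exponent):
    """Ratio sigma(tau) / sigma(tau0) for sigma proportional to tau**exponent,
    where tau_ratio = tau / tau0."""
    return tau_ratio ** exponent


# Hypothetical case: averaging 100 times longer than a reference time tau0.
tau_ratio = 100.0
white_phase = allan_deviation_ratio(tau_ratio, -1.0)      # APL regime
white_frequency = allan_deviation_ratio(tau_ratio, -0.5)  # projective regime
print(white_phase, white_frequency)
```

For the same increase in averaging time, the tau^-1 law reduces the Allan deviation by the full factor of 100, while the tau^(-1/2) law reduces it only by a factor of 10.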
Leaky Lamb Wave Along VCR Magnetic Tapes
High recording density in home-use digital VCRs requires narrow tracks, short recording wavelengths, and thin magnetic tapes. Knowledge of the Young’s modulus of the tape is essential for precise positioning of the tape on the rotating drums and thus for a stable tape-to-head interface. Magnetic tapes usually show different Young’s moduli in the machine direction (MD) and the transverse direction (TD) [1]. The anisotropy develops mainly in the polyethylene terephthalate (PET) base film through partial crystallization and crystallite orientation alignment during the stretching process applied to the tapes [2], while the original PET sheet, from which the tapes are cut, shows much less anisotropy. This situation requires the determination of Young’s moduli for both the MD and the TD of the tape. Tapes in play are straightened by tensile loads, which should be controlled according to the Young’s modulus in the MD. Too much load may distort the recorded tracks or damage the tape. In addition, a vertical load is applied to both edges of the running tape by the guiding rollers. Again, too much load may induce tape buckling. The critical load is proportional to the Young’s modulus in the TD. Large moduli are desirable in both directions.
Ensiling Characteristics of Sudangrass Silage Treated with Green Tea Leaf Waste or Green Tea Polyphenols
Green tea waste (GTW), discharged by beverage companies manufacturing tea drinks, is high in crude protein (CP) and polyphenols. Kondo et al. (2004) showed that adding GTW to forage at ensiling enhanced lactic acid fermentation and decreased pH. Ishihara et al. (2001) showed that high counts of Lactobacillus species were maintained and counts of clostridia decreased in the intestinal microflora of animals fed a diet containing green tea polyphenols (GTP). It is hypothesised that GTP might activate lactic acid bacteria and enhance silage fermentation. This study was conducted to evaluate the potential of GTW and GTP as silage additives and to explore the mechanisms by which GTW enhances lactic acid fermentation.