New Structure In The Shapley Supercluster
We present new radial velocities for 189 galaxies in a 91 sq. deg region of
the Shapley supercluster measured with the FLAIR-II spectrograph on the UK
Schmidt Telescope. The data reveal two sheets of galaxies linking the major
concentrations of the supercluster. The supercluster is not flattened in
Declination as was suggested previously and it may be at least 30 percent
larger than previously thought with a correspondingly larger contribution to
the motion of the Local Group.
Comment: LaTeX: 2 pages, 1 figure, includes conf_iap.sty style file. To appear
in proceedings of The 14th IAP Colloquium: Wide Field Surveys in Cosmology,
held in Paris, 1998 May 26--30, eds. S.Colombi, Y.Mellie
Light to Mass Variations with Environment
Large and well defined variations exist between the distribution of mass and
the light of stars on extragalactic scales. Mass concentrations in the range
10^12 - 10^13 M_sun manifest the most light per unit mass. Group halos in this
range are typically the hosts of spiral and irregular galaxies with ongoing
star formation. On average M/L_B ~ 90 M_sun/L_sun in these groups. More
massive halos have less light per unit mass. Within a given mass range, halos
that are dynamically old as measured by crossing times and galaxy morphologies
have distinctly less light per unit mass. At the other end of the mass
spectrum, below 10^12 M_sun, there is a cutoff in the manifestation of light.
Group halos in the range 10^11 - 10^12 M_sun can host dwarf galaxies but with
such low luminosities that M/L_B values can range from several hundred to
several thousand. It is suspected that there must be completely dark halos at
lower masses. Given the form of the halo mass function, it is the low relative
luminosities of the high mass halos that have the greatest cosmological
implications. Of order half the clustered mass may reside in halos with greater
than 10^14 M_sun. By contrast, only 5-10% of clustered mass would lie in
entities with less than 10^12 M_sun.
Comment: 15 pages, 9 figures, 2 tables, Accepted Astrophysical Journal 619,
000, 2005 (Jan 1
Blocked algorithms for the reduction to Hessenberg-triangular form revisited
We present two variants of Moler and Stewart's algorithm for reducing a matrix pair to Hessenberg-triangular (HT) form with increased data locality in the access to the matrices. In one of these variants, a careful reorganization and accumulation of Givens rotations enables the use of efficient level 3 BLAS. Experimental results on four different architectures, representative of current high performance processors, compare the performances of the new variants with those of the implementation of Moler and Stewart's algorithm in subroutine DGGHRD from LAPACK, Dackland and Kågström's two-stage algorithm for the HT form, and a modified version of the latter which requires considerably less flop
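For orientation, the classical unblocked reduction that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration in pure Python, not the blocked variants of the paper and not LAPACK's DGGHRD: the helper names (`givens`, `rot_rows`, `rot_cols`, `hessenberg_triangular`) are hypothetical, the orthogonal factors are not accumulated, and each rotation is applied immediately, which is exactly the poor-locality access pattern that blocking is meant to improve.

```python
import math

def givens(a, b):
    """Return (c, s) with c*a + s*b = hypot(a, b) and -s*a + c*b = 0."""
    r = math.hypot(a, b)
    return (1.0, 0.0) if r == 0.0 else (a / r, b / r)

def rot_rows(M, p, q, c, s):
    """Row p becomes c*p + s*q, row q becomes -s*p + c*q (in place)."""
    for k in range(len(M[0])):
        x, y = M[p][k], M[q][k]
        M[p][k], M[q][k] = c * x + s * y, -s * x + c * y

def rot_cols(M, p, q, c, s):
    """Column p becomes c*p + s*q, column q becomes -s*p + c*q (in place)."""
    for row in M:
        x, y = row[p], row[q]
        row[p], row[q] = c * x + s * y, -s * x + c * y

def hessenberg_triangular(A, B):
    """Reduce the pair (A, B) to (H, T): H upper Hessenberg, T upper triangular."""
    n = len(A)
    H = [list(map(float, row)) for row in A]
    T = [list(map(float, row)) for row in B]
    # Step 1: make T upper triangular with row rotations, applied to H as well.
    for j in range(n):
        for i in range(n - 1, j, -1):
            c, s = givens(T[i - 1][j], T[i][j])
            rot_rows(T, i - 1, i, c, s)
            rot_rows(H, i - 1, i, c, s)
    # Step 2: zero H below its first subdiagonal, column by column, bottom up.
    for j in range(n - 2):
        for i in range(n - 1, j + 1, -1):
            c, s = givens(H[i - 1][j], H[i][j])
            rot_rows(H, i - 1, i, c, s)   # zeroes H[i][j]
            rot_rows(T, i - 1, i, c, s)   # creates fill-in at T[i][i-1]
            c, s = givens(T[i][i], T[i][i - 1])
            rot_cols(T, i, i - 1, c, s)   # removes the fill-in at T[i][i-1]
            rot_cols(H, i, i - 1, c, s)   # leaves column j of H untouched
    return H, T
```

Because only plane rotations are applied from the left and right, the Frobenius norms of both matrices are preserved, which gives a cheap sanity check on the result.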
Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors
Asymmetric multicore processors (AMPs) have recently emerged as an appealing
technology for severely energy-constrained environments, especially in mobile
appliances where heterogeneity in applications is mainstream. In addition,
given the growing interest in low-power high performance computing, this type
of architecture is also being investigated as a means to improve the
throughput-per-Watt of complex scientific applications.
In this paper, we design and embed several architecture-aware optimizations
into a multi-threaded general matrix multiplication (gemm), a key operation of
the BLAS, in order to obtain a high performance implementation for ARM
big.LITTLE AMPs. Our solution is based on the reference implementation of gemm
in the BLIS library, and integrates a cache-aware configuration as well as
asymmetric--static and dynamic scheduling strategies that carefully tune and
distribute the operation's micro-kernels among the big and LITTLE cores of the
target processor. The experimental results on a Samsung Exynos 5422, a
system-on-chip with ARM Cortex-A15 and Cortex-A7 clusters that implements the
big.LITTLE model, show that our cache-aware versions of gemm with asymmetric
scheduling attain significant gains in performance with respect to their
architecture-oblivious counterparts while exploiting all the resources of the
AMP to deliver considerable energy efficiency
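As a rough illustration of the static asymmetric scheduling idea, the sketch below splits the rows of the output matrix unevenly between two workers standing in for the big and LITTLE clusters. It is an assumption-laden toy in pure Python, not the BLIS-based implementation described in the abstract: `big_share` (the fraction of rows given to the fast cluster) and `mr` (the row granularity of a micro-kernel) are hypothetical parameters, and Python threads are not pinned to any core.

```python
from concurrent.futures import ThreadPoolExecutor

def split_rows(m, big_share, mr):
    """Split rows 0..m-1 into (big, little) ranges; the cut is a multiple of mr."""
    cut = min(m, max(0, round(m * big_share / mr) * mr))
    return range(0, cut), range(cut, m)

def gemm_rows(A, B, C, rows):
    """Accumulate C[i][:] += A[i][:] * B for every row index i in rows."""
    for i in rows:
        for p, a in enumerate(A[i]):
            for j, b in enumerate(B[p]):
                C[i][j] += a * b

def gemm_static_asymmetric(A, B, big_share=0.75, mr=4):
    """Compute C = A * B, giving the 'big' worker a larger, fixed share of rows."""
    m, n = len(A), len(B[0])
    C = [[0] * n for _ in range(m)]
    big_rows, little_rows = split_rows(m, big_share, mr)
    # The two workers write disjoint rows of C, so no locking is needed.
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(gemm_rows, A, B, C, big_rows),
                   pool.submit(gemm_rows, A, B, C, little_rows)]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return C
```

A dynamic strategy would instead let each worker pull chunks from a shared queue as it finishes, so that the split adapts to the observed speed of each cluster rather than to a ratio fixed in advance.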