64,236 research outputs found
A 64-point Fourier transform chip for high-speed wireless LAN application using OFDM
In this article, we present a novel fixed-point 16-bit word-width 64-point FFT/IFFT processor developed primarily for the application in the OFDM based IEEE 802.11a Wireless LAN (WLAN) baseband processor. The 64-point FFT is realized by decomposing it into a 2-D structure of 8-point FFTs. This approach reduces the number of required complex multiplications compared to the conventional radix-2 64-point FFT algorithm. The complex multiplication operations are realized using shift-and-add operations. Thus, the processor does not use any 2-input digital multiplier. It also does not need any RAM or ROM for internal storage of coefficients. The proposed 64-point FFT/IFFT processor has been fabricated and tested successfully using our in-house 0.25 ?m BiCMOS technology. The core area of this chip is 6.8 mm2. The average dynamic power consumption is 41 mW @ 20 MHz operating frequency and 1.8 V supply voltage. The processor completes one parallel-to-parallel (i. e., when all input data are available in parallel and all output data are generated in parallel) 64-point FFT computation in 23 cycles. These features show that though it has been developed primarily for application in the IEEE 802.11a standard, it can be used for any application that requires fast operation as well as low power consumption
The Herschel Multi-tiered Extragalactic Survey: HerMES
The Herschel Multi-tiered Extragalactic Survey, HerMES, is a legacy program
designed to map a set of nested fields totalling ~380 deg^2. Fields range in
size from 0.01 to ~20 deg^2, using Herschel-SPIRE (at 250, 350 and 500 \mu m),
and Herschel-PACS (at 100 and 160 \mu m), with an additional wider component of
270 deg^2 with SPIRE alone. These bands cover the peak of the redshifted
thermal spectral energy distribution from interstellar dust and thus capture
the re-processed optical and ultra-violet radiation from star formation that
has been absorbed by dust, and are critical for forming a complete
multi-wavelength understanding of galaxy formation and evolution.
The survey will detect of order 100,000 galaxies at 5\sigma in some of the
best studied fields in the sky. Additionally, HerMES is closely coordinated
with the PACS Evolutionary Probe survey. Making maximum use of the full
spectrum of ancillary data, from radio to X-ray wavelengths, it is designed to:
facilitate redshift determination; rapidly identify unusual objects; and
understand the relationships between thermal emission from dust and other
processes. Scientific questions HerMES will be used to answer include: the
total infrared emission of galaxies; the evolution of the luminosity function;
the clustering properties of dusty galaxies; and the properties of populations
of galaxies which lie below the confusion limit through lensing and statistical
techniques.
This paper defines the survey observations and data products, outlines the
primary scientific goals of the HerMES team, and reviews some of the early
results.Comment: 23 pages, 17 figures, 9 Tables, MNRAS accepte
ArrayBridge: Interweaving declarative array processing with high-performance computing
Scientists are increasingly turning to datacenter-scale computers to produce
and analyze massive arrays. Despite decades of database research that extols
the virtues of declarative query processing, scientists still write, debug and
parallelize imperative HPC kernels even for the most mundane queries. This
impedance mismatch has been partly attributed to the cumbersome data loading
process; in response, the database community has proposed in situ mechanisms to
access data in scientific file formats. Scientists, however, desire more than a
passive access method that reads arrays from files.
This paper describes ArrayBridge, a bi-directional array view mechanism for
scientific file formats, that aims to make declarative array manipulations
interoperable with imperative file-centric analyses. Our prototype
implementation of ArrayBridge uses HDF5 as the underlying array storage library
and seamlessly integrates into the SciDB open-source array database system. In
addition to fast querying over external array objects, ArrayBridge produces
arrays in the HDF5 file format just as easily as it can read from it.
ArrayBridge also supports time travel queries from imperative kernels through
the unmodified HDF5 API, and automatically deduplicates between array versions
for space efficiency. Our extensive performance evaluation in NERSC, a
large-scale scientific computing facility, shows that ArrayBridge exhibits
statistically indistinguishable performance and I/O scalability to the native
SciDB storage engine.Comment: 12 pages, 13 figure
Recommended from our members
The OpenupEd quality label: benchmarks for MOOCs
In this paper we report on the development of the OpenupEd Quality Label, a self-assessment and review quality assurance process for the new European OpenupEd portal (www.openuped.eu) for MOOCs (massive open online courses). This process is focused on benchmark statements that seek to capture good practice, both at the level of the institution and at the level of individual courses. The benchmark statements for MOOCs are derived from benchmarks produced by the E xcellence e learning quality projects (E-xcellencelabel.eadtu.eu/). A process of self-assessment and review is intended to encourage quality enhancement, captured in an action plan. We suggest that a quality label for MOOCs will benefit all MOOC stakeholders
Speculative Approximations for Terascale Analytics
Model calibration is a major challenge faced by the plethora of statistical
analytics packages that are increasingly used in Big Data applications.
Identifying the optimal model parameters is a time-consuming process that has
to be executed from scratch for every dataset/model combination even by
experienced data scientists. We argue that the incapacity to evaluate multiple
parameter configurations simultaneously and the lack of support to quickly
identify sub-optimal configurations are the principal causes. In this paper, we
develop two database-inspired techniques for efficient model calibration.
Speculative parameter testing applies advanced parallel multi-query processing
methods to evaluate several configurations concurrently. The number of
configurations is determined adaptively at runtime, while the configurations
themselves are extracted from a distribution that is continuously learned
following a Bayesian process. Online aggregation is applied to identify
sub-optimal configurations early in the processing by incrementally sampling
the training dataset and estimating the objective function corresponding to
each configuration. We design concurrent online aggregation estimators and
define halting conditions to accurately and timely stop the execution. We apply
the proposed techniques to distributed gradient descent optimization -- batch
and incremental -- for support vector machines and logistic regression models.
We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big
Data analytics system -- and evaluate their performance over terascale-size
synthetic and real datasets. The results confirm that as many as 32
configurations can be evaluated concurrently almost as fast as one, while
sub-optimal configurations are detected accurately in as little as a
fraction of the time
An Efficient Multiway Mergesort for GPU Architectures
Sorting is a primitive operation that is a building block for countless
algorithms. As such, it is important to design sorting algorithms that approach
peak performance on a range of hardware architectures. Graphics Processing
Units (GPUs) are particularly attractive architectures as they provides massive
parallelism and computing power. However, the intricacies of their compute and
memory hierarchies make designing GPU-efficient algorithms challenging. In this
work we present GPU Multiway Mergesort (MMS), a new GPU-efficient multiway
mergesort algorithm. MMS employs a new partitioning technique that exposes the
parallelism needed by modern GPU architectures. To the best of our knowledge,
MMS is the first sorting algorithm for the GPU that is asymptotically optimal
in terms of global memory accesses and that is completely free of shared memory
bank conflicts.
We realize an initial implementation of MMS, evaluate its performance on
three modern GPU architectures, and compare it to competitive implementations
available in state-of-the-art GPU libraries. Despite these implementations
being highly optimized, MMS compares favorably, achieving performance
improvements for most random inputs. Furthermore, unlike MMS, state-of-the-art
algorithms are susceptible to bank conflicts. We find that for certain inputs
that cause these algorithms to incur large numbers of bank conflicts, MMS can
achieve up to a 37.6% speedup over its fastest competitor. Overall, even though
its current implementation is not fully optimized, due to its efficient use of
the memory hierarchy, MMS outperforms the fastest comparison-based sorting
implementations available to date
- …