3,536 research outputs found
A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids
The efficient implementation of collective communication operations has
received much attention. Initial efforts produced "optimal" trees based on
network communication models that assumed equal point-to-point latencies
between any two processes. This assumption is violated in most practical
settings, however, particularly in heterogeneous systems such as clusters of
SMPs and wide-area "computational Grids," with the result that collective
operations perform suboptimally. In response, more recent work has focused on
creating topology-aware trees for collective operations that minimize
communication across slower channels (e.g., a wide-area network). While these
efforts have significant communication benefits, they all limit their view of
the network to only two layers. We present a strategy based upon a multilayer
view of the network. By creating multilevel topology-aware trees we take
advantage of communication cost differences at every level in the network. We
used this strategy to implement topology-aware versions of several MPI
collective operations in MPICH-G2, the Globus Toolkit[tm]-enabled version of
the popular MPICH implementation of the MPI standard. Using information about
topology provided by MPICH-G2, we construct these multilevel topology-aware
trees automatically during execution. We present results demonstrating the
advantages of our multilevel approach by comparing it to the default
(topology-unaware) implementation provided by MPICH and a topology-aware
two-layer implementation. Comment: 16 pages, 8 figures
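
The multilevel idea in the abstract above can be illustrated with a minimal sketch. It is not the MPICH-G2 implementation: the function name `multilevel_broadcast_edges`, the `coords` mapping (one label per network level for each process, slowest level first), and the flat fan-out inside the innermost group are all assumptions made for illustration. The sketch builds a broadcast tree in which each slow link is crossed once per remote group, and the chosen group leader repeats the same procedure one level down.

```python
def multilevel_broadcast_edges(procs, coords, root, level=0):
    """Return (sender, receiver) edges of a topology-aware broadcast tree.

    procs  -- list of process ranks taking part at this level
    coords -- hypothetical mapping rank -> tuple of group labels,
              ordered from the slowest network level to the fastest
    root   -- rank that already holds the data
    """
    depth = len(coords[root])
    if level == depth:
        # All remaining processes share the fastest level: the root sends
        # to each of them directly (a flat tree, chosen for simplicity).
        return [(root, p) for p in procs if p != root]

    # Partition the processes by their label at the current level.
    groups = {}
    for p in procs:
        groups.setdefault(coords[p][level], []).append(p)

    edges = []
    for members in groups.values():
        if root in members:
            leader = root          # the root leads its own group
        else:
            leader = members[0]    # one message crosses the slow link
            edges.append((root, leader))
        # The leader distributes the data inside its group, one level down.
        edges.extend(
            multilevel_broadcast_edges(members, coords, leader, level + 1)
        )
    return edges


# Illustrative three-level layout: site -> machine -> process.
example_coords = {
    0: ("siteA", "m1"), 1: ("siteA", "m1"), 2: ("siteA", "m2"),
    3: ("siteB", "m3"), 4: ("siteB", "m3"), 5: ("siteB", "m4"),
}
example_edges = multilevel_broadcast_edges(list(range(6)), example_coords, root=0)
print(example_edges)
```

In this layout only one edge connects the two sites, whereas a topology-unaware tree may send several messages over the wide-area link.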
Cluster, Classify, Regress: A General Method For Learning Discontinuous Functions
This paper presents a method for solving the supervised learning problem in
which the output is highly nonlinear and discontinuous. It is proposed to solve
this problem in three stages: (i) cluster the pairs of input-output data
points, resulting in a label for each point; (ii) classify the data, where the
corresponding label is the output; and finally (iii) perform one separate
regression for each class, where the training data corresponds to the subset of
the original input-output pairs which have that label according to the
classifier. It has not yet been proposed to combine these 3 fundamental
building blocks of machine learning in this simple and powerful fashion. This
can be viewed as a form of deep learning, where any of the intermediate layers
can itself be deep. The utility and robustness of the methodology are
illustrated on some toy problems, including one example problem arising from
simulation of plasma fusion in a tokamak. Comment: 12 files, 6 figures
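
The three stages listed in the abstract above can be sketched on a one-dimensional toy problem. This is a minimal illustration under stated assumptions, not the authors' code: the helper names (`kmeans_1d`, `fit_line`, `fit_ccr`) are hypothetical, clustering is done on the output values only with plain k-means, the classifier is a 1-nearest-neighbour rule on the input, and each regression is a least-squares line. With distinct inputs, a 1-nearest-neighbour classifier reproduces the cluster labels on the training points, so the per-class training subsets match the description in stage (iii).

```python
from statistics import mean


def kmeans_1d(values, k=2, iters=20):
    """Stage (i): cluster scalar values with plain k-means; return labels."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels


def fit_line(xs, ys):
    """Stage (iii) helper: least-squares fit of y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx


def fit_ccr(xs, ys, k=2):
    """Cluster, classify, regress; returns a prediction function."""
    labels = kmeans_1d(ys, k)                      # (i) cluster
    models = {}
    for c in set(labels):                          # (iii) one regression per class
        cx = [x for x, lab in zip(xs, labels) if lab == c]
        cy = [y for y, lab in zip(ys, labels) if lab == c]
        models[c] = fit_line(cx, cy)

    def predict(x):
        # (ii) classify: label of the nearest training input
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        a, b = models[labels[nearest]]
        return a * x + b

    return predict


# Toy discontinuous target: y = x below 0.5, y = x + 10 from 0.5 onwards.
train_x = [i / 20 for i in range(20)]
train_y = [x if x < 0.5 else x + 10 for x in train_x]
model = fit_ccr(train_x, train_y)
print(model(0.2), model(0.8))
```

A single global regression would smear the jump at 0.5; fitting one model per class keeps each branch of the function separate.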
CosmoNet: fast cosmological parameter estimation in non-flat models using neural networks
We present a further development of a method for accelerating the calculation
of CMB power spectra, matter power spectra and likelihood functions for use in
cosmological Bayesian inference. The algorithm, called CosmoNet, is based
on training a multilayer perceptron neural network. We compute CMB power
spectra (up to ) and matter transfer functions over a hypercube in
parameter space encompassing the confidence region of a selection of
CMB (WMAP + high resolution experiments) and large scale structure surveys (2dF
and SDSS). We work in the framework of a generic 7 parameter non-flat
cosmology. Additionally we use CosmoNet to compute the WMAP 3-year, 2dF
and SDSS likelihoods over the same region. We find that the average error in
the power spectra is typically well below cosmic variance for spectra, and
experimental likelihoods are calculated to within a fraction of a log unit. We
demonstrate that marginalised posteriors generated with CosmoNet spectra
agree to within a few percent with those generated by CAMB parallelised
over 4 CPUs, but are obtained 2-3 times faster on just a single
processor. Furthermore, posteriors generated directly via CosmoNet
likelihoods can be obtained in less than 30 minutes on a single processor,
corresponding to a speed up of a factor of . We also demonstrate the
capabilities of CosmoNet by extending the CMB power spectra and matter
transfer function training to a more generic 10 parameter cosmological model,
including tensor modes, a varying equation of state of dark energy and massive
neutrinos. CosmoNet and interfaces to both CosmoMC and Bayesys are publicly
available at www.mrao.cam.ac.uk/software/cosmonet. Comment: 8 pages,
submitted to MNRAS
Predicting the Performance of MPI Applications over Different Grid Architectures
High-speed, accurate optimization algorithms are now required. In most cases, researchers need a method that predicts certain criteria with acceptable accuracy for later use in their algorithms. In parallel computing, execution time can be considered the most important of these criteria. This paper therefore presents a new execution-time prediction model for Message Passing Interface (MPI) applications executed over numerous grid scenarios. The model can predict the execution time of message-passing applications running over any grid configuration, in terms of the number of nodes and their computing powers. The experiments were run on the SimGrid simulator, which makes it easy to build many different grid configurations. Comparing the real and predicted execution times shows good accuracy: the average error ratios between the real and predicted execution times for three benchmarks are 4.36%, 5.79% and 6.81%.
The edge cloud: A holistic view of communication, computation and caching
The evolution of communication networks shows a clear shift of focus from
just improving the communications aspects to enabling new important services,
from Industry 4.0 to automated driving, virtual/augmented reality, Internet of
Things (IoT), and so on. This trend is evident in the roadmap planned for the
deployment of the fifth generation (5G) communication networks. This ambitious
goal requires a paradigm shift towards a vision that looks at communication,
computation and caching (3C) resources as three components of a single holistic
system. The further step is to bring these 3C resources closer to the mobile
user, at the edge of the network, to enable very low latency and high
reliability services. The scope of this chapter is to show that signal
processing techniques can play a key role in this new vision. In particular, we
motivate the joint optimization of 3C resources. Then we show how graph-based
representations can play a key role in building effective learning methods and
devising innovative resource allocation techniques. Comment: to appear in the book "Cooperative and Graph Signal Processing:
Principles and Applications", P. Djuric and C. Richard Eds., Academic Press,
Elsevier, 201