7,902 research outputs found
Maiter: An Asynchronous Graph Processing Framework for Delta-based Accumulative Iterative Computation
Myriad of graph-based algorithms in machine learning and data mining require
parsing relational data iteratively. These algorithms are implemented in a
large-scale distributed environment in order to scale to massive data sets. To
accelerate these large-scale graph-based iterative computations, we propose
delta-based accumulative iterative computation (DAIC). Different from
traditional iterative computations, which iteratively update the result based
on the result from the previous iteration, DAIC updates the result by
accumulating the "changes" between iterations. By DAIC, we can process only the
"changes" to avoid the negligible updates. Furthermore, we can perform DAIC
asynchronously to bypass the high-cost synchronous barriers in heterogeneous
distributed environments. Based on the DAIC model, we design and implement an
asynchronous graph processing framework, Maiter. We evaluate Maiter on local
cluster as well as on Amazon EC2 Cloud. The results show that Maiter achieves
as much as 60x speedup over Hadoop and outperforms other state-of-the-art
frameworks.Comment: ScienceCloud 2012, TKDE 201
Microgrid - The microthreaded many-core architecture
Traditional processors use the von Neumann execution model, some other
processors in the past have used the dataflow execution model. A combination of
von Neuman model and dataflow model is also tried in the past and the resultant
model is referred as hybrid dataflow execution model. We describe a hybrid
dataflow model known as the microthreading. It provides constructs for
creation, synchronization and communication between threads in an intermediate
language. The microthreading model is an abstract programming and machine model
for many-core architecture. A particular instance of this model is named as the
microthreaded architecture or the Microgrid. This architecture implements all
the concurrency constructs of the microthreading model in the hardware with the
management of these constructs in the hardware.Comment: 30 pages, 16 figure
Accumulated Gradient Normalization
This work addresses the instability in asynchronous data parallel
optimization. It does so by introducing a novel distributed optimizer which is
able to efficiently optimize a centralized model under communication
constraints. The optimizer achieves this by pushing a normalized sequence of
first-order gradients to a parameter server. This implies that the magnitude of
a worker delta is smaller compared to an accumulated gradient, and provides a
better direction towards a minimum compared to first-order gradients, which in
turn also forces possible implicit momentum fluctuations to be more aligned
since we make the assumption that all workers contribute towards a single
minima. As a result, our approach mitigates the parameter staleness problem
more effectively since staleness in asynchrony induces (implicit) momentum, and
achieves a better convergence rate compared to other optimizers such as
asynchronous EASGD and DynSGD, which we show empirically.Comment: 16 pages, 12 figures, ACML201
Towards a generic platform for developing CSCL applications using Grid infrastructure
The goal of this paper is to explore the possibility of using CSCL component-based software under a Grid infrastructure. The merge of these technologies represents an attractive, but probably quite laborious enterprise if we consider not only the benefits but also the barriers that we have to overcome. This work presents an attempt toward this direction by developing a generic platform of CSCL components and discussing the advantages that we could obtain if we adapted it to the Grid. We then propose a means that could make this adjustment possible due to the high degree of genericity that our library component is endowed with by being based on the generic programming paradigm. Finally, an application of our library is proposed both for validating the adequacy of the platform which it is based on and for indicating the possibilities gained by using it under the Grid.Peer ReviewedPostprint (published version
- …