13,863 research outputs found
Modeling Scalability of Distributed Machine Learning
Present-day machine learning is computationally intensive and processes large
amounts of data. To address these scalability issues, it is implemented in a
distributed fashion, with the work parallelized across a number of computing
nodes. It is usually hard to estimate in advance how many nodes to use for a
particular workload. We propose a simple framework for estimating the
scalability of distributed machine learning algorithms. We measure scalability
by means of the speedup an algorithm achieves with more nodes. We propose time
complexity models for gradient descent and graphical model inference, and we
validate our models with experiments on deep learning training and belief
propagation. This framework was used to study the scalability of machine
learning algorithms in Apache Spark.
Comment: 6 pages, 4 figures, appears at ICDE 201
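To make the speedup notion above concrete, here is a minimal sketch of the kind of model the abstract describes: per-iteration time split into a compute term that shrinks with the node count and a communication term that grows with it. The cost constants and the linear communication term are illustrative assumptions, not the paper's actual model.

```python
# Minimal sketch of a speedup model for data-parallel gradient descent.
# t_comp and t_comm are illustrative constants, not values from the paper.

def iteration_time(n_nodes, t_comp=1.0, t_comm=0.05):
    """Modeled time of one gradient-descent iteration on n_nodes workers."""
    compute = t_comp / n_nodes       # parallelizable gradient computation
    communicate = t_comm * n_nodes   # assumed gradient-aggregation overhead
    return compute + communicate

def speedup(n_nodes, **kw):
    """Speedup relative to a single node: T(1) / T(n)."""
    return iteration_time(1, **kw) / iteration_time(n_nodes, **kw)

for n in (1, 2, 4, 8, 16, 32):
    print(f"{n:3d} nodes -> speedup {speedup(n):5.2f}")
```

Under this toy model the speedup peaks near sqrt(t_comp / t_comm) nodes and then degrades, which illustrates why the right node count for a workload is hard to guess in advance.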
A statistical approach to the inverse problem in magnetoencephalography
Magnetoencephalography (MEG) is an imaging technique used to measure the
magnetic field outside the human head produced by the electrical activity
inside the brain. The MEG inverse problem, identifying the location of the
electrical sources from the magnetic signal measurements, is ill-posed: there
are infinitely many mathematically correct solutions. Common source
localization methods assume the source does not vary with time and do not
provide estimates of the variability of the fitted model. Here, we reformulate
the MEG inverse problem by considering time-varying locations for the sources
and their electrical moments, and we model their time evolution using a state
space model. Based on our predictive model, we investigate the inverse problem
by finding the posterior source distribution given the multiple channels of
observations at each time point, rather than fitting fixed source parameters.
Our new model is more realistic than common models and allows us to estimate
how the strength, orientation, and position of the sources vary over time. We
propose two new Monte Carlo methods based on sequential importance sampling.
Unlike the usual MCMC sampling scheme, our new methods work in this setting
without the need to tune a high-dimensional transition kernel, which would be
very costly. The dimensionality of the unknown parameters is extremely large
and the size of the data is even larger. We use Parallel Virtual Machine (PVM)
to speed up the computation.
Comment: Published at http://dx.doi.org/10.1214/14-AOAS716 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
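The sequential importance sampling idea behind the proposed methods can be illustrated generically. The following toy bootstrap particle filter tracks a one-dimensional linear-Gaussian state space model; it is a sketch of the general technique, not the authors' MEG-specific samplers, and all model parameters here are assumed for illustration.

```python
import numpy as np

# Toy sequential importance sampling (bootstrap particle filter) for a
# 1-D linear-Gaussian state space model. Illustrative only: the paper's
# samplers target a much higher-dimensional MEG source model.

rng = np.random.default_rng(0)

def particle_filter(observations, n_particles=1000,
                    sigma_state=0.5, sigma_obs=1.0):
    particles = rng.normal(0.0, 1.0, n_particles)  # draw from the prior
    means = []
    for y in observations:
        # Propagate particles through the state transition model.
        particles = particles + rng.normal(0.0, sigma_state, n_particles)
        # Weight each particle by the observation likelihood p(y | x).
        log_w = -0.5 * ((y - particles) / sigma_obs) ** 2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Posterior mean estimate at this time step.
        means.append(np.sum(w * particles))
        # Resample to avoid weight degeneracy.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]
    return np.array(means)

# Example: track a random walk observed in noise.
true_x = np.cumsum(rng.normal(0.0, 0.5, 50))
obs = true_x + rng.normal(0.0, 1.0, 50)
print(particle_filter(obs)[:5])
```

Because each time step only reweights and resamples independent particles, this scheme sidesteps the high-dimensional transition-kernel tuning that a joint MCMC sampler would require, which is the contrast the abstract draws.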
Parallel processing and expert systems
Whether it be monitoring the thermal subsystem of Space Station Freedom or controlling the navigation of the autonomous rover on Mars, NASA missions in the 1990s cannot enjoy an increased level of autonomy without the efficient implementation of expert systems. Merely increasing the computational speed of uniprocessors may not be able to guarantee that real-time demands are met for larger systems. Speedup via parallel processing must be pursued alongside the optimization of sequential implementations. Prototypes of parallel expert systems have been built at universities and industrial laboratories in the U.S. and Japan. State-of-the-art research in progress on the parallel execution of expert systems is surveyed. The survey discusses multiprocessors for expert systems, parallel languages for symbolic computations, and mapping expert systems to multiprocessors. Results to date indicate that the parallelism achieved for these systems is small. The main reasons are (1) the body of knowledge applicable in any given situation and the amount of computation executed by each rule firing are small, (2) dividing the problem-solving process into relatively independent partitions is difficult, and (3) implementation decisions that enable expert systems to be incrementally refined hamper compile-time optimization. In order to obtain greater speedups, data parallelism and application parallelism must be exploited.
Simulation in Statistics
Simulation has become a standard tool in statistics because it may be the
only tool available for analysing some classes of probabilistic models. In
this paper we review simulation tools that have been specifically derived to
address statistical challenges and, in particular, recent advances in the
areas of adaptive Markov chain Monte Carlo (MCMC) algorithms and approximate
Bayesian computation (ABC) algorithms.
Comment: Draft of an advanced tutorial paper for the Proceedings of the 2011
Winter Simulation Conference
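To make the ABC reference concrete, here is a minimal rejection-ABC sketch (an illustration of the general idea, not code from the tutorial): draw parameters from the prior, simulate a data summary for each, and keep the draws whose summaries land close to the observed summary.

```python
import numpy as np

# Minimal rejection-ABC sketch (illustrative; not from the tutorial).
# Goal: approximate the posterior of a Gaussian mean when the likelihood
# is treated as available only through simulation.

rng = np.random.default_rng(1)

# "Observed" data: 100 draws from N(2, 1); the goal is to recover the mean.
observed = rng.normal(2.0, 1.0, 100)
s_obs = observed.mean()              # summary statistic

def abc_rejection(n_draws=100_000, epsilon=0.05):
    # 1. Draw candidate parameters from a wide prior.
    theta = rng.normal(0.0, 5.0, n_draws)
    # 2. Simulate the summary of a 100-point dataset for each candidate.
    #    The sample mean of 100 N(theta, 1) draws is N(theta, 1/100).
    s_sim = rng.normal(theta, 0.1)
    # 3. Keep candidates whose simulated summary is close to the observed one.
    return theta[np.abs(s_sim - s_obs) < epsilon]

posterior = abc_rejection()
print(f"accepted {posterior.size}, posterior mean ~ {posterior.mean():.3f}")
```

The accepted draws approximate the posterior without ever evaluating a likelihood; shrinking epsilon trades acceptance rate for accuracy, which is the central tuning question the ABC literature addresses.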