Speculative Approximations for Terascale Analytics
Model calibration is a major challenge faced by the plethora of statistical
analytics packages that are increasingly used in Big Data applications.
Identifying the optimal model parameters is a time-consuming process that has
to be executed from scratch for every dataset/model combination even by
experienced data scientists. We argue that the inability to evaluate multiple
parameter configurations simultaneously and the lack of support to quickly
identify sub-optimal configurations are the principal causes. In this paper, we
develop two database-inspired techniques for efficient model calibration.
Speculative parameter testing applies advanced parallel multi-query processing
methods to evaluate several configurations concurrently. The number of
configurations is determined adaptively at runtime, while the configurations
themselves are extracted from a distribution that is continuously learned
following a Bayesian process. Online aggregation is applied to identify
sub-optimal configurations early in the processing by incrementally sampling
the training dataset and estimating the objective function corresponding to
each configuration. We design concurrent online aggregation estimators and
define halting conditions that stop the execution accurately and promptly. We apply
the proposed techniques to distributed gradient descent optimization -- batch
and incremental -- for support vector machines and logistic regression models.
We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big
Data analytics system -- and evaluate their performance over terascale-size
synthetic and real datasets. The results confirm that as many as 32
configurations can be evaluated concurrently almost as fast as one, while
sub-optimal configurations are detected accurately in as little as a
fraction of the time.
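
The two ideas in this abstract, evaluating several configurations in one shared pass and stopping poor ones early from sampled estimates, can be illustrated with a minimal sketch. It is not the GLADE PF-OLA implementation: the function names, the fixed candidate list, the normal-approximation confidence interval (`z=2.0`), and the assumption that the data are already in random order are all illustrative choices. The adaptive number of configurations and the Bayesian proposal distribution described above are left out.

```python
import math
import random


def logistic_loss(w, x, y):
    """Logistic loss of weights w on one example (x, y), with y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    # Numerically stable log(1 + exp(-margin)).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def prune_candidates(data, candidates, chunk_size=100, z=2.0):
    """Scan data in chunks, sharing each example across all active candidates.

    After every chunk, a candidate is dropped when the lower end of its
    confidence interval on the mean loss lies above the smallest upper end.
    Assumes `data` is already randomly shuffled (hypothetical precondition).
    """
    count = [0] * len(candidates)
    mean = [0.0] * len(candidates)
    m2 = [0.0] * len(candidates)  # running sum of squared deviations (Welford)
    active = set(range(len(candidates)))
    seen = 0
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        for x, y in chunk:
            for k in active:  # one data access serves every active candidate
                loss = logistic_loss(candidates[k], x, y)
                count[k] += 1
                delta = loss - mean[k]
                mean[k] += delta / count[k]
                m2[k] += delta * (loss - mean[k])
        seen += len(chunk)
        half = {}
        for k in active:
            variance = m2[k] / max(count[k] - 1, 1)
            half[k] = z * math.sqrt(variance / count[k])
        best_upper = min(mean[k] + half[k] for k in active)
        # Halting condition: keep only candidates that may still be the best.
        active = {k for k in active if mean[k] - half[k] <= best_upper}
        if len(active) == 1:
            break
    return active, mean, seen


# Synthetic demo: labels are drawn from a logistic model with weights true_w.
rng = random.Random(0)
true_w = [2.0, -1.0]
data = []
for _ in range(2000):
    x = [rng.gauss(0, 1), rng.gauss(0, 1)]
    p = 1.0 / (1.0 + math.exp(-(true_w[0] * x[0] + true_w[1] * x[1])))
    data.append((x, 1 if rng.random() < p else -1))

candidates = [[2.0, -1.0], [0.0, 0.0], [-2.0, 1.0]]
active, mean, seen = prune_candidates(data, candidates)
print(sorted(active), seen)
```

In this sketch the inner loop over `active` is where the sharing happens: each example is read once and scored against every surviving candidate, and `seen` reports how much of the dataset was scanned before the halting condition fired.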
Neural Connectivity with Hidden Gaussian Graphical State-Model
Noninvasive procedures for estimating neural connectivity are under question.
Theoretical models hold that the electromagnetic field registered at
external sensors is elicited by currents in neural space. Nevertheless, what we
observe in sensor space is a superposition of fields projected from the
whole gray matter. This is the reason for a major pitfall of noninvasive
electrophysiology methods: distorted reconstruction of neural activity and its
connectivity, known as leakage. It has been proven that current methods produce
incorrect connectomes. Related to this incorrect connectivity
modelling, they disregard either Systems Theory or Bayesian Information
Theory. We introduce a new formalism that accounts for both, the Hidden Gaussian
Graphical State-Model (HIGGS): a neural Gaussian Graphical Model (GGM) hidden
by the observation equation of Magneto/Electroencephalographic (MEEG) signals. HIGGS
is equivalent to a frequency-domain Linear State Space Model (LSSM) but with
a sparse connectivity prior. The mathematical contribution here is the theory for
high-dimensional and frequency-domain HIGGS solvers. We demonstrate that HIGGS
can attenuate the leakage effect in the most critical case: the distortion of the EEG
signal due to head volume conduction heterogeneities. Its application to EEG is
illustrated with connectivity patterns retrieved from human Steady State Visual
Evoked Potentials (SSVEP). We provide, for the first time, confirmatory evidence
for noninvasive procedures of neural connectivity: concurrent EEG and
Electrocorticography (ECoG) recordings in monkey. Open-source packages are
freely available online to reproduce the results presented in this paper and
to analyze external MEEG databases.
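
The structure described in this abstract, a Gaussian graphical model on the sources that is seen only through an observation equation, can be written down generically as follows. The notation is assumed for illustration and is not taken from the paper: $\mathbf{v}(\omega)$ is the sensor signal at frequency $\omega$, $\mathbf{L}$ the lead field, $\boldsymbol{\iota}(\omega)$ the source activity, $\boldsymbol{\Theta}$ the source precision matrix whose zero pattern encodes connectivity, and the isotropic noise term $\sigma^{2}\mathbf{I}$ is a simplifying assumption.

```latex
\begin{aligned}
\mathbf{v}(\omega) &= \mathbf{L}\,\boldsymbol{\iota}(\omega) + \boldsymbol{\xi}(\omega)
  && \text{(observation equation)} \\
\boldsymbol{\iota}(\omega) &\sim \mathcal{N}^{\mathbb{C}}\!\left(\mathbf{0},\ \boldsymbol{\Theta}^{-1}\right)
  && \text{(hidden GGM with precision } \boldsymbol{\Theta}\text{)} \\
\boldsymbol{\xi}(\omega) &\sim \mathcal{N}^{\mathbb{C}}\!\left(\mathbf{0},\ \sigma^{2}\mathbf{I}\right)
  && \text{(sensor noise, assumed isotropic)} \\
\operatorname{Cov}\!\left[\mathbf{v}(\omega)\right] &= \mathbf{L}\,\boldsymbol{\Theta}^{-1}\mathbf{L}^{\mathsf{H}} + \sigma^{2}\mathbf{I}
  && \text{(what the sensors actually see)}
\end{aligned}
```

Under these assumptions the last line shows why leakage arises: the lead field mixes every source into every sensor, so the sensor covariance is generally dense even when $\boldsymbol{\Theta}$ is sparse. A sparsity-promoting prior on $\boldsymbol{\Theta}$, whose exact form is not stated in the abstract, is what distinguishes this setup from an unconstrained frequency-domain state space model.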