The Data Big Bang and the Expanding Digital Universe: High-Dimensional, Complex and Massive Data Sets in an Inflationary Epoch
Recent and forthcoming advances in instrumentation, and giant new surveys,
are creating astronomical data sets that are not amenable to the methods of
analysis familiar to astronomers. Traditional methods are often inadequate not
merely because of the size in bytes of the data sets, but also because of the
complexity of modern data sets. Mathematical limitations of familiar algorithms
and techniques in dealing with such data sets create a critical need for new
paradigms for the representation, analysis and scientific visualization (as
opposed to illustrative visualization) of heterogeneous, multiresolution data
across application domains. Some of the problems presented by the new data sets
have been addressed by other disciplines such as applied mathematics,
statistics and machine learning and have been utilized by other sciences such
as space-based geosciences. Unfortunately, valuable results pertaining to these
problems are mostly to be found only in publications outside of astronomy. Here
we offer brief overviews of a number of concepts, techniques and developments,
some "old" and some new. These are generally unknown to most of the
astronomical community, but are vital to the analysis and visualization of
complex datasets and images. In order for astronomers to take advantage of the
richness and complexity of the new era of data, and to be able to identify,
adopt, and apply new solutions, the astronomical community needs a certain
degree of awareness and understanding of the new concepts. One of the goals of
this paper is to help bridge the gap between applied mathematics, artificial
intelligence and computer science on the one side and astronomy on the other.
Comment: 24 pages, 8 figures, 1 table. Accepted for publication in Advances in Astronomy, special issue "Robotic Astronomy".
Predicting respiratory motion for real-time tumour tracking in radiotherapy
Purpose. Radiation therapy is a local treatment aimed at cells in and around
a tumor. The goal of this study is to develop an algorithmic solution for
predicting the position of a target in 3D in real time, aiming for a short,
fixed calibration time per patient at the beginning of the procedure.
Accurate predictions of lung tumor motion are expected to improve the precision
of radiation treatment by controlling the position of a couch or a beam in
order to compensate for respiratory motion during radiation treatment.
Methods. For developing the algorithmic solution, data mining techniques are
used. A model from the exponential-smoothing family is assumed, and its
parameters are fitted by jointly minimizing the absolute displacement error and
the fluctuations of the prediction signal (jitter). The predictive performance
is evaluated retrospectively on clinical datasets capturing different behavior
(being quiet, talking, laughing), and validated in real-time on a prototype
system with respiratory motion imitation.
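The exponential-smoothing approach described above can be illustrated with double exponential smoothing (Holt's linear trend method), which tracks both level and trend of the motion signal and extrapolates a fixed horizon ahead. This is a minimal sketch of the model family, not the paper's ExSmi algorithm; the function name, parameter values, and horizon are illustrative, and in practice `alpha` and `beta` would be fitted per patient during the short calibration phase.

```python
def holt_predict(series, alpha=0.5, beta=0.3, horizon=5):
    """Double exponential smoothing (Holt's linear trend method).

    Predicts the signal `horizon` steps ahead. `alpha` (level) and
    `beta` (trend) would be tuned per patient during calibration,
    trading off displacement error against prediction jitter.
    """
    level = series[0]
    trend = series[1] - series[0]
    preds = []
    for x in series[1:]:
        # Forecast `horizon` steps ahead from the current state,
        # before seeing the new observation x.
        preds.append(level + horizon * trend)
        new_level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return preds
```

For a perfectly linear breathing trace the forecast is exact; on real respiratory traces the smoothing constants control how quickly the predictor adapts to changes such as talking or laughing.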
Results. An algorithmic solution for respiratory motion prediction (called
ExSmi) is designed. ExSmi achieves good accuracy of prediction (error
mm/s) with acceptable jitter values (5-7 mm/s), as tested on out-of-sample
data. The datasets, the code for algorithms and the experiments are openly
available for research purposes on a dedicated website.
Conclusions. The developed algorithmic solution performs well enough to be
prototyped and deployed in radiotherapy applications.
Detection of dirt impairments from archived film sequences: survey and evaluations
Film dirt is the most commonly encountered artifact in archive restoration applications. Since dirt usually appears as a temporally impulsive event, motion-compensated interframe processing is widely applied for its detection. However, motion-compensated prediction requires a high degree of complexity and can be unreliable when motion estimation fails. Consequently, many techniques using spatial or spatiotemporal filtering without motion have also been proposed as alternatives. A comprehensive survey and evaluation of existing methods is presented, in which both qualitative and quantitative performances are compared in terms of accuracy, robustness, and complexity. After analyzing these algorithms and identifying their limitations, we conclude with guidance on choosing among these algorithms and with promising directions for future research.
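The "temporally impulsive event" idea behind many of the surveyed detectors can be sketched with a simplified spike-detection-index test: a pixel is flagged as dirt when it differs strongly from both the previous and the next frame. This is an illustrative, motion-free sketch of one class of methods from the survey, not any specific evaluated algorithm; the function name and threshold are assumptions.

```python
import numpy as np

def sdi_dirt_mask(prev, cur, nxt, threshold=30):
    """Simplified spike-detection-style dirt test without motion
    compensation: a pixel is flagged as dirt when it deviates
    strongly from BOTH the previous and the next frame, i.e. it
    behaves as a temporally impulsive event. `threshold` is an
    illustrative intensity gap, not a published value.
    """
    # Cast to int to avoid unsigned-integer wraparound in the differences.
    d_prev = np.abs(cur.astype(int) - prev.astype(int))
    d_next = np.abs(cur.astype(int) - nxt.astype(int))
    return (d_prev > threshold) & (d_next > threshold)
```

Detectors of this family are cheap and robust to motion-estimation failures, but as the survey notes they can misfire on fast-moving content, which is exactly the trade-off against motion-compensated prediction.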
Forecasting Time Series with VARMA Recursions on Graphs
Graph-based techniques have emerged as a way to deal with the dimensionality
issues in modeling multivariate time series. However, there is as yet no complete
understanding of how the underlying structure can be exploited to ease this
task. This work provides contributions in this direction by considering the
forecasting of a process evolving over a graph. We make use of the
(approximate) time-vertex stationarity assumption, i.e., time-varying graph
signals whose first- and second-order statistical moments are invariant over
time and correlated to a known graph topology. The latter is combined with VAR
and VARMA models to tackle the dimensionality issues present in predicting the
temporal evolution of multivariate time series. We find that by projecting
the data to the graph spectral domain: (i) the multivariate model estimation
reduces to that of fitting a number of uncorrelated univariate ARMA models and
(ii) an optimal low-rank data representation can be exploited so as to further
reduce the estimation costs. In the case that the multivariate process can be
observed at a subset of nodes, the proposed models extend naturally to Kalman
filtering on graphs allowing for optimal tracking. Numerical experiments with
both synthetic and real data validate the proposed approach and highlight its
benefits over state-of-the-art alternatives.
Comment: submitted to the IEEE Transactions on Signal Processing.
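The key decoupling claim, that projecting the data into the graph spectral domain reduces the multivariate fit to independent univariate recursions, can be sketched as follows. This is a minimal illustration using AR(1) per graph frequency rather than the paper's full VARMA recursions; the function names and the Laplacian-eigenvector choice of graph Fourier basis are assumptions.

```python
import numpy as np

def fit_graph_ar1(X, L):
    """Fit independent AR(1) models in the graph spectral domain.

    X: (T, N) multivariate time series observed on N graph nodes.
    L: (N, N) symmetric graph Laplacian. Under (approximate)
    time-vertex stationarity, the graph Fourier transform (GFT)
    decorrelates the N channels, so the joint model reduces to N
    cheap univariate recursions.
    """
    _, U = np.linalg.eigh(L)      # eigenvectors = graph Fourier basis
    Xh = X @ U                    # project every time snapshot to the spectral domain
    a = np.zeros(U.shape[0])
    for k in range(U.shape[0]):   # one AR(1) coefficient per graph frequency
        num = Xh[1:, k] @ Xh[:-1, k]
        den = Xh[:-1, k] @ Xh[:-1, k]
        a[k] = num / den if den > 0 else 0.0
    return U, a

def predict_next(x_t, U, a):
    """One-step-ahead forecast: filter each spectral mode, transform back."""
    return U @ (a * (U.T @ x_t))
```

The low-rank option mentioned in the abstract would correspond to keeping only the spectral modes that carry significant energy, shrinking the number of univariate models to fit.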
Task-related edge density (TED) - a new method for revealing large-scale network formation in fMRI data of the human brain
The formation of transient networks in response to external stimuli or as a
reflection of internal cognitive processes is a hallmark of human brain
function. However, its identification in fMRI data of the human brain is
notoriously difficult. Here we propose a new method of fMRI data analysis that
tackles this problem by considering large-scale, task-related synchronisation
networks. Networks consist of nodes and edges connecting them, where nodes
correspond to voxels in fMRI data, and the weight of an edge is determined via
task-related changes in dynamic synchronisation between their respective time
series. Based on these definitions, we developed a new data analysis algorithm
that identifies edges in a brain network that differentially respond in unison
to a task onset and that occur in dense packs with similar characteristics.
Hence, we call this approach "Task-related Edge Density" (TED). TED proved to
be a very strong marker for dynamic network formation that easily lends itself
to statistical analysis using large-scale statistical inference. A major
advantage of TED compared to other methods is that it does not depend on any
specific hemodynamic response model, and it also does not require a
presegmentation of the data for dimensionality reduction as it can handle large
networks consisting of tens of thousands of voxels. We applied TED to fMRI data
of a fingertapping task provided by the Human Connectome Project. TED revealed
network-based involvement of a large number of brain areas that evaded
detection using traditional GLM-based analysis. We show that our proposed
method provides an entirely new window into the immense complexity of human
brain function.
Comment: 21 pages, 11 figures.
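The edge-weight idea, a task-related change in dynamic synchronisation between two voxel time series, can be sketched as the average change in windowed correlation around each task onset. This is an illustrative simplification of the concept behind TED, not the authors' exact estimator; the function name, window length, and averaging scheme are assumptions.

```python
import numpy as np

def ted_edge_weight(x, y, onsets, win=10):
    """Illustrative task-related edge weight between two voxel time
    series: the windowed correlation after each task onset minus the
    windowed correlation before it, averaged over all onsets. Edges
    whose synchronisation jumps in unison at task onset get large
    weights; `win` is an assumed window length in time points.
    """
    deltas = []
    for t in onsets:
        if t - win < 0 or t + win > len(x):
            continue  # skip onsets too close to the series boundary
        pre = np.corrcoef(x[t - win:t], y[t - win:t])[0, 1]
        post = np.corrcoef(x[t:t + win], y[t:t + win])[0, 1]
        deltas.append(post - pre)
    return float(np.mean(deltas)) if deltas else 0.0
```

Because the weight is built from correlation changes rather than a fitted response shape, a measure of this kind needs no hemodynamic response model, which matches the model-free property the abstract highlights; scaling it to tens of thousands of voxels is where the "dense packs of similar edges" machinery of TED comes in.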