Circular polarization measurement in millimeter-wavelength spectral-line VLBI observations
This paper considers the problem of accurate measurement of circular
polarization in imaging spectral-line VLBI observations in the lambda=7 mm and
lambda=3 mm wavelength bands. This capability is especially valuable for the
full observational study of compact, polarized SiO maser components in the
near-circumstellar environment of late-type, evolved stars. Circular VLBI
polarimetry provides important constraints on SiO maser astrophysics, including
the theory of polarized maser emission transport, and on the strength and
distribution of the stellar magnetic field and its dynamical role in this
critical circumstellar region. We perform an analysis here of the data model
containing the instrumental factors that limit the accuracy of circular
polarization measurements in such observations, and present a corresponding
data reduction algorithm for their correction. The algorithm is an enhancement
of existing spectral line VLBI polarimetry methods using autocorrelation data
for calibration, but with innovations in bandpass determination,
autocorrelation polarization self-calibration, and general optimizations for
the case of low SNR, as applicable at these wavelengths. We present an example
data reduction at millimeter wavelengths and derive an estimate of the predicted
accuracy of the method of m_c < 0.5% or better at lambda=7 mm and m_c < 0.5-1%
or better at lambda=3 mm. Both the strengths and weaknesses of the proposed
algorithm are discussed, along with suggestions for future work. Comment: 23 pages, 13 figures
Efficient execution of Java programs on GPU
Master's dissertation in Informatics Engineering. With the overwhelming increase in demand for computational power from fields such as Big
Data, Deep Machine Learning and image processing, Graphics Processing Units (GPUs)
have come to be seen as a valuable tool for computing the main workload involved. Nonetheless,
these solutions have limited support for object-oriented languages, which often require manual
memory handling, an obstacle to bringing together the large community of object-oriented programmers and the high-performance computing field.
In this master's thesis, different memory optimizations and their impacts were studied
in a GPU Java context using Aparapi. These include solutions for different identifiable
bottlenecks of commonly used kernels, exploiting its full capabilities by studying the GPU
hardware and the techniques currently available. The results were set against commonly used
C/OpenCL benchmarks and their respective optimizations, proving that high-level languages can
be a solution to the demand for high-performance software.
With the increase in computational power demanded by fields such as Big Data, Deep Machine Learning and image processing, graphics processing units (GPUs) have been seen as a valuable tool for executing the main workload involved. However, this solution has limited support for object-oriented languages. These often require manual memory handling, which is an obstacle to bringing together the large community of object-oriented programmers and the field of high-performance computing. In this master's dissertation, different memory optimizations and their impacts were studied using Aparapi. The optimizations studied aim to resolve identifiable bottlenecks in frequently used kernels. The results obtained were compared with popular C/OpenCL benchmarks and their respective optimizations, proving that high-level languages can be a solution for programs that require high-performance computing.
Exploring the flexibility of MIL-47(V)-type materials using force field molecular dynamics simulations
The flexibility of three MIL-47(V)-type materials (MIL-47, COMOC-2, and COMOC-3) has been explored by constructing the pressure versus volume and free energy versus volume profiles at various temperatures ranging from 100 to 400 K. This is done with first-principles-based force fields using the recently proposed QuickFF parametrization protocol. Specific terms were added for the materials at hand to describe the asymmetry of the one-dimensional vanadium oxide chain and to account for the flexibility of the organic linkers. The force fields are used in a series of molecular dynamics simulations at fixed volumes but varying unit cell shapes. The three materials show distinct pressure-volume behavior, which underlines the ability to tune the mechanical properties by varying the linkers toward different applications such as nanosprings, dampers, and shock absorbers.
Large Scale Clustering with Variational EM for Gaussian Mixture Models
How can we efficiently find large numbers of clusters in large data sets with
high-dimensional data points? Our aim is to explore the current efficiency and
large-scale limits in fitting a parametric model for clustering to data
distributions. To do so, we combine recent lines of research which have
previously focused on separate specific methods for complexity reduction. We
first show theoretically how the clustering objective of variational EM (which
reduces complexity for many clusters) can be combined with coreset objectives
(which reduce complexity for many data points). Secondly, we realize a concrete
highly efficient iterative procedure which combines and translates the
theoretical complexity gains of truncated variational EM and coresets into a
practical algorithm. For very large scales, the high efficiency of parameter
updates then requires (A) highly efficient coreset construction and (B) highly
efficient initialization procedures (seeding) in order to avoid computational
bottlenecks. Fortunately, very efficient coreset construction has become
available in the form of light-weight coresets, and very efficient
initialization has become available in the form of AFK-MC² seeding. The
resulting algorithm features balanced computational costs across all
constituting components. In applications to standard large-scale benchmarks for
clustering, we investigate the algorithm's efficiency/quality trade-off.
Compared to the best recent approaches, we observe speedups of up to one order
of magnitude, and up to two orders of magnitude compared to the k-means++
baseline. To demonstrate that the observed efficiency enables applications
previously considered unfeasible, we cluster the entire and unscaled 80 Million
Tiny Images dataset into up to 32,000 clusters. To the knowledge of the
authors, this represents the largest scale fit of a parametric data model for
clustering reported so far.
XQuery Streaming by Forest Transducers
Streaming of XML transformations is a challenging task and only very few
systems support streaming. Research approaches generally define custom
fragments of XQuery and XPath that are amenable to streaming, and then design
custom algorithms for each fragment. These languages have several shortcomings.
Here we take a more principled approach to the problem of streaming
XQuery-based transformations. We start with an elegant transducer model for
which many static analysis problems are well-understood: the Macro Forest
Transducer (MFT). We show that a large fragment of XQuery can be translated
into MFTs; indeed, a fragment of XQuery that can express important features
that are missing from other XQuery stream engines, such as GCX: our fragment of
XQuery supports XPath predicates and let-statements. We then rely on a
streaming execution engine for MFTs, one which uses a well-founded set of
optimizations from functional programming, such as strictness analysis and
deforestation. Our prototype achieves time and memory efficiency comparable to
the fastest known engine for XQuery streaming, GCX. This is surprising because
our engine relies on the OCaml built-in garbage collector and does not use any
specialized buffer management, while GCX's efficiency is due to clever and
explicit buffer management. Comment: Full version of the paper in the Proceedings of the 30th IEEE
International Conference on Data Engineering (ICDE 2014)
Tuning the Performance of a Computational Persistent Homology Package
In recent years, persistent homology has become an attractive method for data analysis. It captures topological features, such as connected components, holes, and voids, from point cloud data and summarizes the way in which these features appear and disappear in a filtration sequence. In this project, we focus on improving the performance of Eirene, a computational package for persistent homology. Eirene is a 5000-line open-source software library implemented in the dynamic programming language Julia. We use the Julia profiling tools to identify performance bottlenecks and develop novel methods to manage them, including the parallelization of some time-consuming functions on multicore/manycore hardware. Empirical results show that performance can be greatly improved.