Dynamical projections for the visualization of PDFSense data
A recent paper on visualizing the sensitivity of hadronic experiments to
nucleon structure [1] introduces the tool PDFSense which defines measures to
allow the user to judge the sensitivity of PDF fits to a given experiment. The
sensitivity is characterized by high-dimensional data residuals that are
visualized in a 3-d subspace of the first 10 principal components or using
t-SNE [2]. We show how a tour, a dynamic visualisation of high dimensional
data, can extend this tool beyond 3-d relationships. This approach enables
resolving structure orthogonal to the 2-d viewing plane used so far, and hence
a finer-tuned assessment of the sensitivity.
Comment: Format of the animations changed for easier viewing
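The core of a tour is a sequence of low-dimensional orthogonal projections of the high-dimensional residuals. A minimal sketch of that idea, using a synthetic stand-in for the residual matrix (the actual PDFSense data is not reproduced here) and random planes instead of the smoothly interpolated ones a real tour would use:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_basis(p, d=2):
    """Random p x d orthonormal basis: one projection plane of a tour."""
    A = rng.standard_normal((p, d))
    Q, _ = np.linalg.qr(A)  # QR gives orthonormal columns
    return Q

# Hypothetical stand-in for the residual data: 500 points in 10-d.
X = rng.standard_normal((500, 10))

# A tour interpolates smoothly between planes; this sketch simply samples
# a few random planes and projects the data onto each one.
frames = [X @ random_basis(X.shape[1]) for _ in range(5)]
print(frames[0].shape)  # (500, 2)
```

Structure that is orthogonal to any single 2-d view shows up as the sequence of frames is animated, which is what a static 2-d scatter plot cannot reveal.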
Capturing the impact of external interference on HPC application performance
HPC applications are large software packages with high computation and storage requirements. To meet these requirements, the architectures of supercomputers are continuously evolving and their capabilities are continuously increasing. Present-day supercomputers have achieved petaflops of computational power by utilizing thousands to millions of compute cores, connected through specialized communication networks, and are equipped with petabytes of storage using a centralized I/O subsystem. While fulfilling the high resource demands of HPC applications, such a design also entails its own challenges. Applications running on these systems own the computation resources exclusively, but share the communication interconnect and the I/O subsystem with other concurrently running applications. Simultaneous access to these shared resources causes contention and inter-application interference, leading to degraded application performance.
Inter-application interference is one of the sources of run-to-run variation. While other sources of variation, such as operating system jitter, have been investigated before, this doctoral thesis specifically focuses on inter-application interference and studies it from the perspective of an application. Variation in execution time not only causes uncertainty and affects user expectations (especially during performance analysis), but also causes suboptimal usage of HPC resources. Therefore, this thesis aims to evaluate inter-application interference, establish trends among applications under contention, and approximate the impact of external influences on the runtime of an application.
To this end, this thesis first presents a method to correlate the performance of applications running side-by-side. The method divides the runtime of a system into globally synchronized, fine-grained time slices for which application performance data is recorded separately. The evaluation of the method demonstrates that correlating application performance data can identify inter-application interference. The thesis further uses the method to study I/O interference and shows that file access patterns are a significant factor in determining the interference potential of an application.
This thesis also presents a technique to estimate the impact of external influences on an application run. The technique introduces the concept of intrinsic performance characteristics to cluster similar application execution segments. Anomalies in the cluster are the result of external interference. An evaluation with several benchmarks shows high accuracy in estimating the impact of interference from a single application run.
The contributions of this thesis will help establish interference trends and devise interference mitigation techniques. Similarly, estimating the impact of external interference will restore user expectations and help performance analysts separate application performance from external influences.
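The time-slice correlation idea above can be illustrated with a toy computation; the per-slice bandwidth numbers and the contention pattern below are invented for illustration and are not measurements from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic per-slice I/O bandwidth (MB/s) for two co-running applications:
# whenever app A bursts, app B's bandwidth drops -- a contention signature.
n_slices = 200
burst = rng.random(n_slices) < 0.3
a = np.where(burst, 900.0, 100.0) + rng.normal(0, 20, n_slices)
b = np.where(burst, 150.0, 600.0) + rng.normal(0, 20, n_slices)

# A strong negative correlation across globally synchronized time slices
# flags the application pair as mutually interfering.
r = np.corrcoef(a, b)[0, 1]
print(round(r, 2))
```

In the real method the slices are globally synchronized across the whole system and the metrics come from recorded performance data; the correlation step itself is this simple.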
A lattice of double wells for manipulating pairs of cold atoms
We describe the design and implementation of a 2D optical lattice of double
wells suitable for isolating and manipulating an array of individual pairs of
atoms in an optical lattice. Atoms in the square lattice can be placed in a
double well with any of their four nearest neighbors. The properties of the
double well (the barrier height and relative energy offset of the paired sites)
can be dynamically controlled. The topology of the lattice is phase stable
against phase noise imparted by vibrational noise on mirrors. We demonstrate
the dynamic control of the lattice by showing the coherent splitting of atoms
from single wells into double wells and observing the resulting double-slit
atom diffraction pattern. This lattice can be used to test controlled neutral
atom motion among lattice sites and should allow for testing controlled
two-qubit gates.
Comment: 9 pages, 11 figures. Accepted for publication in Physical Review
Path Similarity Analysis: a Method for Quantifying Macromolecular Pathways
Diverse classes of proteins function through large-scale conformational
changes; sophisticated enhanced sampling methods have been proposed to generate
these macromolecular transition paths. As such paths are curves in a
high-dimensional space, they have been difficult to compare quantitatively, a
prerequisite to, for instance, assess the quality of different sampling
algorithms. The Path Similarity Analysis (PSA) approach alleviates these
difficulties by utilizing the full information in 3N-dimensional trajectories
in configuration space. PSA employs the Hausdorff or Fr\'echet path
metrics---adopted from computational geometry---enabling us to quantify path
(dis)similarity, while the new concept of a Hausdorff-pair map permits the
extraction of atomic-scale determinants responsible for path differences.
Combined with clustering techniques, PSA facilitates the comparison of many
paths, including collections of transition ensembles. We use the closed-to-open
transition of the enzyme adenylate kinase (AdK)---a commonly used testbed for
the assessment of enhanced sampling algorithms---to examine multiple microsecond
equilibrium molecular dynamics (MD) transitions of AdK in its substrate-free
form alongside transition ensembles from the MD-based dynamic importance
sampling (DIMS-MD) and targeted MD (TMD) methods, and a geometrical targeting
algorithm (FRODA). A Hausdorff pairs analysis of these ensembles revealed, for
instance, that differences in DIMS-MD and FRODA paths were mediated by a set of
conserved salt bridges whose charge-charge interactions are fully modeled in
DIMS-MD but not in FRODA. We also demonstrate how existing trajectory analysis
methods relying on pre-defined collective variables, such as native contacts or
geometric quantities, can be used synergistically with PSA, as well as the
application of PSA to more complex systems such as membrane transporter
proteins.
Comment: 9 figures, 3 tables in the main manuscript; supplementary information
includes 7 texts (S1 Text - S7 Text) and 11 figures (S1 Fig - S11 Fig) (also
available from the journal site)
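The Hausdorff metric at the heart of PSA can be sketched with SciPy's `directed_hausdorff`; the two toy single-particle paths below are hypothetical, not AdK trajectories:

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff(P, Q):
    """Symmetric Hausdorff distance between two paths, each an
    (n_frames, 3N) array of configuration-space coordinates."""
    return max(directed_hausdorff(P, Q)[0], directed_hausdorff(Q, P)[0])

# Toy 1-particle (3-d) paths: a straight line and a slightly bowed one.
t = np.linspace(0, 1, 50)
P = np.stack([t, np.zeros_like(t), np.zeros_like(t)], axis=1)
Q = np.stack([t, 0.2 * np.sin(np.pi * t), np.zeros_like(t)], axis=1)

# The result is the largest deviation between the sampled paths, close to 0.2.
print(round(hausdorff(P, Q), 3))
```

For real trajectories each row would be the flattened 3N coordinates of a frame, and the directed version additionally identifies which frame pair realizes the maximum, the idea behind the Hausdorff-pair map.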
Development of Multiscale Spectro-microscopic Imaging System and Its Applications
A novel multi-modality spectro-microscopic system that combines far-field interferometry based optical microscopy imaging techniques (differential interference contrast microscopy and cross-polarized light microscopy), total internal reflection microscopy (total internal reflection fluorescence and scattering microscopy) and confocal spectroscopy (Raman spectroscopy and photoluminescence spectroscopy) is developed. Home-built post treatment stages (thermal annealing stage and solvent annealing stage) are integrated into the system to realize in situ measurements. Departing from conventional characterization methods in materials science mostly focused on structures on one length scale, the in situ multi-modality characterization system aims to uncover the structural information from the molecular level to the mesoscale. Applications of the system on the characterization of photoactive layers of bulk heterojunction solar cell, two-dimensional materials, gold nanoparticles, fabricated gold nanoparticle arrays and cells samples are shown in this dissertation
Three-dimensional Radial Visualization of High-dimensional Datasets with Mixed Features
We develop methodology for 3D radial visualization (RadViz) of
high-dimensional datasets. Our display engine is called RadViz3D and extends
the classical 2D RadViz that visualizes multivariate data in the 2D plane by
mapping every record to a point inside the unit circle. We show that
distributing anchor points at least approximately uniformly on the 3D unit
sphere provides a better visualization with minimal artificial visual
correlation for data with uncorrelated variables. Our RadViz3D methodology
therefore places equi-spaced anchor points, one for every feature, exactly for
the five Platonic solids, and approximately via a Fibonacci grid for the other
cases. Our Max-Ratio Projection (MRP) method then utilizes the group
information in high dimensions to provide distinctive lower-dimensional
projections that are then displayed using RadViz3D. Our methodology is extended
to datasets with discrete and continuous features where a Gaussianized
distributional transform is used in conjunction with copula models before
applying MRP and visualizing the result using RadViz3D. An R package radviz3d
implementing our complete methodology is available.
Comment: 12 pages, 10 figures, 1 table
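A rough sketch of the two geometric ingredients, assuming a Fibonacci-grid anchor placement and the classical RadViz convex-combination mapping (this is an illustration, not the radviz3d package itself):

```python
import numpy as np

def fibonacci_sphere(p):
    """Approximately equi-spaced anchor points on the unit sphere via a
    Fibonacci grid (used when p does not match a Platonic solid)."""
    i = np.arange(p)
    phi = (1 + 5 ** 0.5) / 2
    z = 1 - (2 * i + 1) / p         # evenly spaced heights in (-1, 1)
    theta = 2 * np.pi * i / phi     # golden-ratio rotation around the axis
    r = np.sqrt(1 - z ** 2)
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

def radviz3d(X, anchors):
    """Map each record to a convex combination of its feature anchors;
    features are min-max scaled to [0, 1] so the weights are non-negative."""
    W = (X - X.min(0)) / (X.max(0) - X.min(0))
    return (W @ anchors) / W.sum(1, keepdims=True)

rng = np.random.default_rng(2)
X = rng.random((100, 8))            # 8 features -> 8 Fibonacci anchors
Y = radviz3d(X, fibonacci_sphere(8))
print(Y.shape)                      # (100, 3); points lie inside the unit ball
```

Because each display point is a convex combination of unit vectors, every record lands inside the unit ball, and near-uniform anchor spacing is what keeps uncorrelated features from producing spurious visual correlation.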
Data exploration process based on the self-organizing map
With the advances in computer technology, the amount of data that is obtained from various sources and stored in electronic media is growing at exponential rates. Data mining is a research area that addresses the challenge of analysing this data in order to find useful information contained therein. The Self-Organizing Map (SOM) is one of the methods used in data mining. It quantizes the training data into a representative set of prototype vectors and maps them on a low-dimensional grid. The SOM is a prominent tool in the initial exploratory phase in data mining.
The thesis consists of an introduction and ten publications. In the publications, the validity of SOM-based data exploration methods has been investigated and various enhancements to them have been proposed. In the introduction, these methods are presented as parts of the data mining process, and they are compared with other data exploration methods with similar aims.
The work makes two primary contributions. Firstly, it has been shown that the SOM provides a versatile platform on top of which various data exploration methods can be efficiently constructed. New methods and measures for visualization of data, clustering, cluster characterization, and quantization have been proposed. The SOM algorithm and the proposed methods and measures have been implemented as a set of Matlab routines in the SOM Toolbox software library.
Secondly, a framework for SOM-based data exploration of table-format data - both single tables and hierarchically organized tables - has been constructed. The framework divides exploratory data analysis into several sub-tasks, most notably the analysis of samples and the analysis of variables. The analysis methods are applied autonomously and their results are provided in a report describing the most important properties of the data manifold. In such a framework, the attention of the data miner can be directed more towards the actual data exploration task, rather than on the application of the analysis methods. Because of the highly iterative nature of the data exploration, the automation of routine analysis tasks can reduce the time needed by the data exploration process considerably.
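To make the quantize-and-map idea concrete, here is a minimal sequential SOM training loop; the grid size and the learning-rate and neighborhood schedules are illustrative choices, not those of the SOM Toolbox:

```python
import numpy as np

rng = np.random.default_rng(3)

def train_som(X, grid=(6, 6), iters=2000, lr0=0.5, sigma0=2.0):
    """Minimal sequential SOM: prototype vectors attached to a 2-d grid are
    pulled toward samples, with a neighborhood that shrinks over time."""
    gy, gx = np.meshgrid(np.arange(grid[0]), np.arange(grid[1]), indexing="ij")
    coords = np.stack([gy.ravel(), gx.ravel()], axis=1).astype(float)
    W = rng.random((grid[0] * grid[1], X.shape[1]))  # prototype vectors
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(1))       # best-matching unit
        frac = t / iters
        lr = lr0 * (1 - frac)                        # decaying learning rate
        sigma = sigma0 * (1 - frac) + 0.5            # shrinking neighborhood
        d2 = ((coords - coords[bmu]) ** 2).sum(1)    # grid distance to BMU
        h = np.exp(-d2 / (2 * sigma ** 2))
        W += lr * h[:, None] * (x - W)
    return W

X = rng.random((500, 4))
W = train_som(X)
print(W.shape)  # (36, 4): one prototype per grid node
```

After training, each data sample maps to the grid node of its best-matching unit, and the grid itself becomes the low-dimensional display on which clustering and visualization methods are built.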
Chemical Vapor Deposition Grown Pristine and Chemically Doped Monolayer Graphene
Chemical vapor deposition (CVD) growth has been a popular technique for producing large-area, high-quality monolayer graphene on Cu substrates ever since its first demonstration in 2009. Pristine graphene grown in this way has zero intrinsic charge carrier density and a zero band gap. In analogy to semiconductor studies, substitutional doping with foreign atoms is a powerful way to tailor the electronic properties of this host material. Within this context, this thesis focuses on growing and characterizing both pristine and chemically doped CVD-grown monolayer graphene films at microscopic scales. We first synthesized pristine graphene on Cu single crystals in ultra-high vacuum and subsequently characterized its properties by scanning tunneling microscopy/spectroscopy (STM/S), to learn the effects of Cu substrate crystallinity on the quality of graphene growth and to understand the interactions between graphene films and Cu substrates. In the subsequent chapters, we chemically doped graphene with nitrogen (N) and boron (B) atoms and characterized the topographic and electronic structures via STM/S. We found that both N and B dopants substitutionally dope graphene films and contribute electron and hole carriers, respectively, into graphene at a rate of approximately 0.5 carrier/dopant. In addition, we compared N- and B-doped graphene films in terms of topographic features, dopant distribution, and electronic perturbations. In the last part of this thesis, we used Raman spectroscopy mapping to investigate the N dopant distribution within and across structural grains. Future experiments are briefly discussed at the end of the thesis.
Mapping lung cancer epithelial-mesenchymal transition states and trajectories with single-cell resolution.
Elucidating the spectrum of epithelial-mesenchymal transition (EMT) and mesenchymal-epithelial transition (MET) states in clinical samples promises insights into cancer progression and drug resistance. Using mass cytometry time-course analysis, we resolve lung cancer EMT states through TGFβ treatment and identify, through TGFβ withdrawal, a distinct MET state. We demonstrate significant differences between EMT and MET trajectories using a computational tool (TRACER) for reconstructing trajectories between cell states. In addition, we construct a lung cancer reference map of EMT and MET states, referred to as the EMT-MET PHENOtypic STAte MaP (PHENOSTAMP). Using a neural net algorithm, we project clinical samples onto the EMT-MET PHENOSTAMP to characterize their phenotypic profiles with single-cell resolution in terms of our in vitro EMT-MET analysis. In summary, we provide a framework for phenotypically characterizing clinical samples in the context of in vitro EMT-MET findings, which could help assess the clinical relevance of EMT in cancer in future studies.