Measurements over distributed high performance computing and storage systems
A strawman proposal is given for a framework for presenting a common set of metrics for supercomputers, workstations, file servers, mass storage systems, and the networks that interconnect them. Production control and database systems are also included. Though other applications and third-party software systems are not addressed, it is important to measure them as well.
Exact Geodesics and Shortest Paths on Polyhedral Surface
We present two algorithms for computing distances along a non-convex polyhedral surface. The first algorithm computes exact minimal-geodesic distances and the second algorithm combines these distances to compute exact shortest-path distances along the surface. Both algorithms have been extended to compute the exact minimal-geodesic paths and shortest paths. These algorithms have been implemented and validated on surfaces for which the correct solutions are known, in order to verify the accuracy and to measure the run-time performance, which is cubic or less for each algorithm. The exact-distance computations carried out by these algorithms are feasible for large-scale surfaces containing tens of thousands of vertices, and are a necessary component of near-isometric surface flattening methods that accurately transform curved manifolds into flat representations. National Institute for Biomedical Imaging and Bioengineering (R01 EB001550)
Birth/birth-death processes and their computable transition probabilities with biological applications
Birth-death processes track the size of a univariate population, but many
biological systems involve interaction between populations, necessitating
models for two or more populations simultaneously. A lack of efficient methods
for evaluating finite-time transition probabilities of bivariate processes,
however, has restricted statistical inference in these models. Researchers rely
on computationally expensive methods such as matrix exponentiation or Monte
Carlo approximation, restricting likelihood-based inference to small systems,
or indirect methods such as approximate Bayesian computation. In this paper, we
introduce the birth(death)/birth-death process, a tractable bivariate extension
of the birth-death process. We develop an efficient and robust algorithm to
calculate the transition probabilities of birth(death)/birth-death processes
using a continued fraction representation of their Laplace transforms. Next, we
identify several exemplary models arising in molecular epidemiology,
macro-parasite evolution, and infectious disease modeling that fall within this
class, and demonstrate advantages of our proposed method over existing
approaches to inference in these models. Notably, the ubiquitous stochastic
susceptible-infectious-removed (SIR) model falls within this class, and we
emphasize that computable transition probabilities newly enable direct
inference of parameters in the SIR model. We also propose a very fast method
for approximating the transition probabilities under the SIR model via a novel
branching process simplification, and compare it to the continued fraction
representation method with application to the 17th century plague in Eyam.
Although the two methods produce similar maximum a posteriori estimates, the
branching process approximation fails to capture the correlation structure in
the joint posterior distribution.
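The abstract above names matrix exponentiation as one of the expensive baselines for finite-time transition probabilities. As a rough illustration of what that baseline computes, here is a minimal sketch for a univariate linear birth-death process only. It is not the continued fraction method of the paper. It evaluates one row of exp(Qt) by uniformization, which is a choice made for this sketch, and it assumes per-individual birth and death rates and a state space truncated at `n_max`. The function name and all parameters are hypothetical.

```python
import math

def birth_death_transition_row(n0, t, birth, death, n_max, tol=1e-12, max_terms=10000):
    """Return [P(X_t = k | X_0 = n0) for k in 0..n_max] for a linear
    birth-death process, using uniformization on a truncated state space.

    Assumptions (illustrative): state i has birth rate birth*i and death
    rate death*i; births out of state n_max are switched off (truncation).
    """
    up = [birth * i if i < n_max else 0.0 for i in range(n_max + 1)]
    down = [death * i for i in range(n_max + 1)]
    rate = max(u + d for u, d in zip(up, down))  # uniformization rate

    v = [0.0] * (n_max + 1)
    v[n0] = 1.0
    if rate == 0.0 or t == 0.0:
        return v  # nothing can happen

    weight = math.exp(-rate * t)      # Poisson probability of 0 jumps
    result = [weight * x for x in v]
    total = weight                    # accumulated Poisson mass
    k = 0
    # Add Poisson-weighted powers of the jump matrix A = I + Q / rate.
    while 1.0 - total > tol and k < max_terms:
        k += 1
        new = [0.0] * (n_max + 1)
        for i, p in enumerate(v):
            if p == 0.0:
                continue
            new[i] += p * (1.0 - (up[i] + down[i]) / rate)
            if i < n_max:
                new[i + 1] += p * up[i] / rate
            if i > 0:
                new[i - 1] += p * down[i] / rate
        v = new
        weight *= rate * t / k        # Poisson probability of k jumps
        total += weight
        for i in range(n_max + 1):
            result[i] += weight * v[i]
    return result
```

With `birth=0` the process is a pure death process, so the returned row should match a binomial distribution with survival probability exp(-death * t), which gives a simple sanity check. The cost of this approach grows with the size of the truncated state space, which is the kind of limitation the abstract attributes to matrix-based methods.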
DataHub: Collaborative Data Science & Dataset Version Management at Scale
Relational databases have limited support for data collaboration, where teams
collaboratively curate and analyze large datasets. Inspired by software version
control systems like git, we propose (a) a dataset version control system,
giving users the ability to create, branch, merge, difference and search large,
divergent collections of datasets, and (b) a platform, DataHub, that gives
users the ability to perform collaborative data analysis building on this
version control system. We outline the challenges in providing dataset version
control at scale. Comment: 7 pages
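The operations listed in the abstract above (create, branch, merge, difference) can be illustrated with a toy in-memory store. This is only a sketch under strong assumptions and not DataHub's design: every version is kept as a full snapshot of a `{key: record}` dictionary, the merge base is taken to be the most recently created common ancestor, and conflicting edits raise an error. The class `DatasetRepo` and its methods are hypothetical names.

```python
_MISSING = object()  # sentinel for "key absent in this version"

class DatasetRepo:
    """Toy dataset version store: each version is a full {key: record} snapshot."""

    def __init__(self):
        self.versions = {}  # version id -> (tuple of parent ids, snapshot)
        self.heads = {}     # branch name -> version id

    def commit(self, branch, records, extra_parents=()):
        head = self.heads.get(branch)
        parents = (() if head is None else (head,)) + tuple(extra_parents)
        vid = len(self.versions)  # ids increase with creation order
        self.versions[vid] = (parents, dict(records))
        self.heads[branch] = vid
        return vid

    def branch(self, new_branch, from_branch):
        self.heads[new_branch] = self.heads[from_branch]

    def checkout(self, branch):
        return dict(self.versions[self.heads[branch]][1])

    def diff(self, branch_a, branch_b):
        a, b = self.checkout(branch_a), self.checkout(branch_b)
        return {
            "added": sorted(k for k in b if k not in a),
            "removed": sorted(k for k in a if k not in b),
            "changed": sorted(k for k in a if k in b and a[k] != b[k]),
        }

    def _ancestors(self, vid):
        seen, stack = set(), [vid]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(self.versions[v][0])
        return seen

    def merge(self, target, source):
        """Three-way merge of source into target; raises ValueError on conflict."""
        ours_id, theirs_id = self.heads[target], self.heads[source]
        common = self._ancestors(ours_id) & self._ancestors(theirs_id)
        base = self.versions[max(common)][1] if common else {}
        ours, theirs = self.versions[ours_id][1], self.versions[theirs_id][1]
        merged = {}
        for key in set(base) | set(ours) | set(theirs):
            b = base.get(key, _MISSING)
            o = ours.get(key, _MISSING)
            t = theirs.get(key, _MISSING)
            if o == t or t == b:
                value = o          # same on both sides, or only we changed it
            elif o == b:
                value = t          # only the other side changed it
            else:
                raise ValueError(f"conflict on key {key!r}")
            if value is not _MISSING:
                merged[key] = value
        return self.commit(target, merged, extra_parents=(theirs_id,))
```

Full snapshots keep the sketch short but would not scale to the large, divergent dataset collections the abstract targets; storing deltas between versions is the obvious alternative, at the cost of slower checkouts.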
Hybrid model for vascular tree structures
This paper proposes a new representation scheme of the cerebral blood
vessels. This model provides information on the semantics of the
vascular structure: the topological relationships between vessels and
the labeling of vascular accidents such as aneurysms and stenoses.
In addition, the model keeps information about the inner surface geometry
as well as the volume properties of the vascular map, i.e. the tissue
density, the blood flow velocity and the vessel wall elasticity.
The model can be constructed automatically in a pre-process from a set
of segmented MRA images. Its memory requirements are optimized on the
basis of the sparseness of the vascular structure. It allows fast
queries and efficient traversals and navigations. The visualizations
of the vessel surface can be performed at different levels of
detail. The direct rendering of the volume is fast because the model
provides a natural way to skip over empty data.
The paper analyzes the memory requirements of the model along with the
costs of the most important operations on it. Postprint (published version)
Prediction of Emerging Technologies Based on Analysis of the U.S. Patent Citation Network
The network of patents connected by citations is an evolving graph, which
provides a representation of the innovation process. A patent citing another
implies that the cited patent reflects a piece of previously existing knowledge
that the citing patent builds upon. A methodology presented here (i) identifies
actual clusters of patents: i.e. technological branches, and (ii) gives
predictions about the temporal changes of the structure of the clusters. A
predictor, called the citation vector, is defined for characterizing
technological development to show how a patent cited by other patents belongs
to various industrial fields. The clustering technique adopted is able to
detect newly emerging recombinations and to predict emerging technology
clusters. The predictive ability of our new method is illustrated on the
example of USPTO subcategory 11, Agriculture, Food, Textiles. A cluster of
patents is determined based on citation data up to 1991, which shows
significant overlap with class 442, formed at the beginning of 1997. These new
tools of predictive analytics could support policy decision making processes in
science and technology, and help formulate recommendations for action.
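The abstract above does not give the exact definition of the citation vector. One plausible reading, sketched here purely as an assumption, is that each patent is described by the share of its incoming citations that come from each technological category. The function name, the input layout (a list of `(citing, cited)` pairs plus a category lookup), and the category labels are all hypothetical.

```python
from collections import Counter

def citation_vector(patent, citations, category_of, categories):
    """Share of citations received by `patent` from each category.

    citations:   iterable of (citing_patent, cited_patent) pairs
    category_of: dict mapping a citing patent to its category label
    categories:  ordered list of labels fixing the vector layout
    Assumed definition for illustration, not taken from the paper.
    """
    counts = Counter(
        category_of[citing] for citing, cited in citations if cited == patent
    )
    total = sum(counts.values())
    # A patent with no incoming citations gets the zero vector.
    return [counts[c] / total if total else 0.0 for c in categories]


# Hypothetical toy data: three patents cite p1, from two categories.
citations = [("p2", "p1"), ("p3", "p1"), ("p4", "p1"), ("p4", "p2")]
category_of = {"p2": "A", "p3": "A", "p4": "B"}
vector_p1 = citation_vector("p1", citations, category_of, ["A", "B", "C"])
```

Vectors of this kind can be compared with any standard distance measure before clustering; which measure and clustering method the paper uses is not stated in the abstract.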