Big Data Dimensional Analysis
Collecting and analyzing large amounts of data is a growing challenge within
the scientific community. The widening gap between data and users calls for
innovative tools that address the challenges posed by big data volume,
velocity, and variety. One of the main challenges associated with big data
variety is automatically understanding the underlying structures and patterns
of the data. Such an understanding is a prerequisite for applying advanced
analytics to the data. Further, big data sets often contain anomalies and
errors that are difficult to identify a priori. Current
approaches to understanding data structure are drawn from traditional database
ontology design. These approaches are effective but often require too much
human involvement to keep pace with the volume, velocity, and variety of data
encountered by big data systems. Dimensional Data Analysis (DDA) is a proposed
technique that allows big data analysts to quickly understand the overall
structure of a big dataset and detect anomalies. DDA exploits
structures that exist in a wide class of data to quickly determine the nature
of the data and its statistical anomalies. DDA leverages existing schemas that are
employed in big data databases today. This paper presents DDA, applies it to a
number of data sets, and measures its performance. The overhead of DDA is low,
and it can be applied to existing big data systems without greatly impacting
their computing requirements.
Comment: From IEEE HPEC 201
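A minimal sketch of this kind of structural profiling, assuming a tabular CSV
input: count distinct values per column and flag columns whose value
distribution looks degenerate. The file name and the dominance heuristic are
illustrative assumptions, not the DDA algorithm from the paper.

import csv
from collections import Counter

def profile_columns(path):
    """Count distinct values and total rows for each column of a CSV."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        counters = {name: Counter() for name in reader.fieldnames}
        rows = 0
        for row in reader:
            rows += 1
            for name, value in row.items():
                counters[name][value] += 1
    for name, counter in counters.items():
        if not counter:  # no data rows
            continue
        top_value, top_count = counter.most_common(1)[0]
        # A column where one value dominates, or where every value is
        # unique, often signals a default, a key, or a data-entry error.
        print(f"{name}: {len(counter)} distinct / {rows} rows, "
              f"mode={top_value!r} (x{top_count})")

profile_columns("data.csv")  # hypothetical input file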
RadiX-Net: Structured Sparse Matrices for Deep Neural Networks
The sizes of deep neural networks (DNNs) are rapidly outgrowing the capacity
of hardware to store and train them. Research over the past few decades has
explored the prospect of sparsifying DNNs before, during, and after training by
pruning edges from the underlying topology. The resulting neural network is
known as a sparse neural network. More recent work has demonstrated the
remarkable result that certain sparse DNNs can train to the same precision as
dense DNNs at lower runtime and storage cost. An intriguing class of these
sparse DNNs is the X-Nets, which are initialized and trained upon a sparse
topology with neither reference to a parent dense DNN nor subsequent pruning.
We present an algorithm that deterministically generates RadiX-Nets: sparse DNN
topologies that, as a whole, are much more diverse than X-Net topologies, while
preserving X-Nets' desirable characteristics. We further present a
functional-analytic conjecture based on the longstanding observation that
sparse neural network topologies can attain the same expressive power as dense
counterparts.
Comment: 7 pages, 8 figures, accepted at IEEE IPDPS 2019 GrAPL workshop. arXiv
admin note: substantial text overlap with arXiv:1809.0524
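To make deterministic sparse-topology generation concrete, here is a short
Python/NumPy sketch of a radix-stride (butterfly-style) layer mask: every
neuron gets the same in- and out-degree, and stacking layers with strides 1,
radix, radix^2, ... yields full input-to-output connectivity. This pattern is
an illustrative stand-in under those assumptions, not the RadiX-Net
construction from the paper.

import numpy as np

def radix_stride_mask(width, radix, stride):
    """Boolean mask for a sparse layer: neuron i feeds the neurons at
    offsets i, i+stride, ..., i+(radix-1)*stride (mod width), so every
    neuron has identical in- and out-degree equal to radix."""
    mask = np.zeros((width, width), dtype=bool)
    for i in range(width):
        for k in range(radix):
            mask[i, (i + k * stride) % width] = True
    return mask

# Stacking layers with strides 1, radix, radix**2, ... connects every
# input to every output in log_radix(width) layers, as in an FFT butterfly.
width, radix = 16, 2
masks = [radix_stride_mask(width, radix, radix**layer) for layer in range(4)]
reach = masks[0].astype(int)
for m in masks[1:]:
    reach = reach @ m.astype(int)
print((reach > 0).all())  # True: every input reaches every output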
Transperineal prostate biopsy: analysis of a uniform core sampling pattern that yields data on tumor volume limits in negative biopsies
Background
To analyze an approach to distributing transperineal prostate biopsy cores that yields data on the volume of a tumor that might be present despite a negative biopsy, and that also increases detection efficiency.
Methods
Basic principles of sampling and probability theory are employed to analyze a transperineal biopsy pattern that uses evenly spaced parallel cores, in order to extract quantitative data on the volume of a small spherical tumor that could be present even though the biopsy did not detect it (a negative biopsy).
Results
This approach to distributing biopsy cores provides an upper limit on the volume of a small, spherical tumor that might be present when biopsies are negative, along with the probability of smaller volumes, and it provides a quantitative basis for evaluating the effectiveness of different core-spacing distances.
Conclusions
Distributing transperineal biopsy cores so they are evenly spaced provides a means to calculate the probability that a tumor of a given volume could be present when the biopsy is negative, and can improve detection efficiency.
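The upper limit follows from simple geometry: for parallel cores on a square grid, the point farthest from every core axis is the center of a grid cell, at spacing/sqrt(2) from the four nearest cores. A short Python sketch of that bound follows; the 5 mm spacing and 0.4 mm core radius are illustrative values, not parameters from the study.

import math

def max_missed_tumor(spacing_mm, core_radius_mm):
    """Largest spherical tumor that can avoid every core on a square
    grid of parallel cores. A sphere centered at a grid-cell center is
    missed only if its radius is below spacing/sqrt(2) minus the core
    radius."""
    r_max = spacing_mm / math.sqrt(2) - core_radius_mm
    volume_mm3 = (4.0 / 3.0) * math.pi * r_max**3
    return 2 * r_max, volume_mm3

# Illustrative values: 5 mm core spacing, 0.4 mm core radius.
diameter, volume = max_missed_tumor(spacing_mm=5.0, core_radius_mm=0.4)
print(f"max missed diameter: {diameter:.1f} mm, volume: {volume:.0f} mm^3")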