Impact of Digital Video Analytics on Accuracy of Chemobehavioural Phenotyping in Aquatic Toxicology
[Abstract] Chemobehavioural phenotypic analysis using small aquatic model organisms is becoming an important toolbox in aquatic ecotoxicology and neuroactive drug discovery. The analysis of the organisms' behavior is usually performed by combining digital video recording with animal tracking software. This software detects the organisms in the video frames and reconstructs their movement trajectories using image processing algorithms. In this work we investigated the impact of video file characteristics, video optimization techniques and differences in animal tracking algorithms on the accuracy of quantitative neurobehavioural endpoints. We employed larval stages of a free-swimming euryhaline crustacean, Artemia franciscana, commonly used for marine ecotoxicity testing, as a proxy model to assess the effects of video analytics on quantitative behavioural parameters. We evaluated parameters such as data processing speed, tracking precision, and the capability to perform high-throughput batch processing of video files. Finally, the software algorithms were benchmarked against one another using a model toxicant. Our data indicate that variability in video file parameters, such as resolution, frame rate, file container types, codecs and compression levels, can be a source of experimental bias in behavioural analysis. Similarly, the variability in data outputs between different tracking algorithms should be taken into account when designing standardized behavioral experiments and conducting chemobehavioural phenotyping.
Statistics in the Big Data era
It is estimated that about 90% of the currently available data have been produced over the last two years. Of these, only 0.5% are effectively analysed and used. However, these data can be a great source of wealth, the oil of the 21st century, when analysed with the right approach. In this article, we illustrate some specific features of these data and the great interest that they can hold for many fields. We then consider some challenges that they pose for statistical analysis and suggest some strategies.
Is One Hyperparameter Optimizer Enough?
Hyperparameter tuning is the black art of automatically finding a good
combination of control parameters for a data miner. While widely applied in
empirical Software Engineering, there has not been much discussion on which
hyperparameter tuner is best for software analytics. To address this gap in the
literature, this paper applied a range of hyperparameter optimizers (grid
search, random search, differential evolution, and Bayesian optimization) to
the defect prediction problem. Surprisingly, no hyperparameter optimizer was
observed to be 'best' and, for one of the two evaluation measures studied here
(F-measure), hyperparameter optimization was, in 50% of cases, no better than
using default configurations.
We conclude that hyperparameter optimization is more nuanced than previously
believed. While such optimization can certainly lead to large improvements in
the performance of classifiers used in software analytics, it remains to be
seen which specific optimizers should be applied to a new dataset.
Comment: 7 pages, 2 columns, accepted for SWAN1
Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies
We explore the trade-offs of performing linear algebra using Apache Spark,
compared to traditional C and MPI implementations on HPC platforms. Spark is
designed for data analytics on cluster computing platforms with access to local
disks and is optimized for data-parallel tasks. We examine three widely-used
and important matrix factorizations: NMF (for physical plausibility), PCA (for
its ubiquity) and CX (for data interpretability). We apply these methods to
TB-sized problems in particle physics, climate modeling and bioimaging. The
data matrices are tall-and-skinny, which enables the algorithms to map
conveniently into Spark's data-parallel model. We perform scaling experiments
on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide
tuning guidance to obtain high performance.