52 research outputs found
Scalable Solutions for Automated Single Pulse Identification and Classification in Radio Astronomy
Data collection for scientific applications is increasing exponentially and
is forecasted to soon reach peta- and exabyte scales. Applications which
process and analyze scientific data must be scalable and focus on execution
performance to keep pace. In the field of radio astronomy, in addition to
increasingly large datasets, tasks such as the identification of transient
radio signals from extrasolar sources are computationally expensive. We present
a scalable approach to radio pulsar detection written in Scala that
parallelizes candidate identification to take advantage of in-memory task
processing using Apache Spark on a YARN distributed system. Furthermore, we
introduce a novel automated multiclass supervised machine learning technique
that we combine with feature selection to reduce the time required for
candidate classification. Experimental testing on a Beowulf cluster with 15
data nodes shows that the parallel implementation of the identification
algorithm offers a speedup of up to 5X over a similar multithreaded
implementation. Further, we show that the combination of automated multiclass
classification and feature selection speeds up the execution performance of the
RandomForest machine learning algorithm by an average of 54% with less than a
2% average reduction in the algorithm's ability to correctly classify pulsars.
The generalizability of these results is demonstrated by using two real-world
radio astronomy data sets.
Comment: In Proceedings of the 47th International Conference on Parallel
Processing (ICPP 2018). ACM, New York, NY, USA, Article 11, 11 page
Do optimization methods in deep learning applications matter?
With advances in deep learning, exponential data growth, and increasing model
complexity, the development of efficient optimization methods is attracting much
research attention. Several implementations favor Conjugate Gradient
(CG) and Stochastic Gradient Descent (SGD) as practical and elegant
solutions that achieve quick convergence; however, these optimization processes
also present many limitations in learning across deep learning applications.
Recent research is exploring higher-order optimization functions as better
approaches, but these present very complex computational challenges for
practical use. Comparing first- and higher-order optimization functions, our
experiments reveal that Levenberg-Marquardt (LM) achieves significantly
better convergence but suffers from very long processing times,
increasing the training complexity of both classification and reinforcement
learning problems. Our experiments compare off-the-shelf optimization
functions (CG, SGD, LM, and L-BFGS) in standard CIFAR, MNIST, CartPole, and
FlappyBird experiments. The paper presents arguments on which optimization
functions to use and, further, which functions would benefit from
parallelization efforts to improve pretraining time and learning rate
convergence.
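The trade-off described in this abstract (fewer iterations for a higher-order method, at a higher cost per iteration) can be illustrated on a toy problem. The sketch below is not taken from the paper: the data, learning rate, fixed damping value, and tolerance are all assumptions chosen for illustration, and a practical Levenberg-Marquardt implementation would adapt the damping term between steps.

```python
# Toy comparison: plain gradient descent vs. a damped Gauss-Newton
# (Levenberg-Marquardt style) update on a two-parameter least-squares fit.
# All constants below are illustrative assumptions, not values from the paper.

XS = [0.0, 1.0, 2.0, 3.0, 4.0]
YS = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 1 + 2x


def residuals(w):
    """Residuals r_i = (w0 + w1 * x_i) - y_i of the linear model."""
    return [w[0] + w[1] * x - y for x, y in zip(XS, YS)]


def loss(w):
    """Half the sum of squared residuals."""
    return 0.5 * sum(r * r for r in residuals(w))


def gradient(w):
    """Gradient of the loss, i.e. J^T r with Jacobian rows [1, x_i]."""
    r = residuals(w)
    return [sum(r), sum(ri * x for ri, x in zip(r, XS))]


def gradient_descent(w, lr=0.02, tol=1e-8, max_iter=100_000):
    """First-order method: one cheap gradient step per iteration."""
    for it in range(1, max_iter + 1):
        g = gradient(w)
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
        if loss(w) < tol:
            return w, it
    return w, max_iter


def levenberg_marquardt(w, damping=1e-3, tol=1e-8, max_iter=100):
    """Damped Gauss-Newton: solves (J^T J + damping * I) delta = J^T r each step.

    The damping is held fixed here for brevity (a simplification).
    """
    # Entries of J^T J for Jacobian rows [1, x_i].
    a = float(len(XS))
    b = sum(XS)
    d = sum(x * x for x in XS)
    for it in range(1, max_iter + 1):
        g = gradient(w)
        a11, a22 = a + damping, d + damping
        det = a11 * a22 - b * b
        # 2x2 linear solve via Cramer's rule.
        delta0 = (a22 * g[0] - b * g[1]) / det
        delta1 = (a11 * g[1] - b * g[0]) / det
        w = [w[0] - delta0, w[1] - delta1]
        if loss(w) < tol:
            return w, it
    return w, max_iter
```

Calling `gradient_descent([0.0, 0.0])` and `levenberg_marquardt([0.0, 0.0])` returns the fitted parameters together with the iteration count, so the two counts can be compared directly. On a problem this small the linear solve is trivial; for deep networks the matrix involved grows with the square of the parameter count, which is one plausible source of the processing-time cost the abstract reports.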
Contextual Bandit Modeling for Dynamic Runtime Control in Computer Systems
Modern operating systems and microarchitectures provide a myriad of mechanisms for monitoring and affecting system operation and resource utilization at runtime. Dynamic runtime control of these mechanisms can tailor system operation to the characteristics and behavior of the current workload, resulting in improved performance. However, developing effective models for system control can be challenging. Existing methods often require extensive manual effort, computation time, and domain knowledge to identify relevant low-level performance metrics, relate low-level performance metrics and high-level control decisions to workload performance, and to evaluate the resulting control models.
This dissertation develops a general framework, based on the contextual bandit, for describing and learning effective models for runtime system control. Random profiling is used to characterize the relationship between workload behavior, system configuration, and performance. The framework is evaluated in the context of two applications of progressive complexity: first, the selection of paging modes (Shadow Paging, Hardware-Assisted Paging) in the Xen virtual machine memory manager; second, the utilization of hardware memory prefetching for multi-core, multi-tenant workloads with cross-core contention for shared memory resources, such as the last-level cache and memory bandwidth. The resulting models for both applications are competitive in comparison to existing runtime control approaches. For paging mode selection, the resulting model provides equivalent performance to the state of the art while substantially reducing the computation requirements of profiling. For hardware memory prefetcher utilization, the resulting models are the first to provide dynamic control for hardware prefetchers using workload statistics. Finally, a correlation-based feature selection method is evaluated for identifying relevant low-level performance metrics related to hardware memory prefetching.
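As a rough illustration of the contextual-bandit framing with random profiling, the sketch below samples random (context, configuration) pairs, records a reward for each, and then picks the configuration with the best mean reward per context. It is a minimal tabular sketch, not the dissertation's model: the context names, the reward table, and the noise level are invented for the example, and `simulated_performance` is a hypothetical stand-in for measuring a real workload.

```python
import random
from collections import defaultdict

# Hypothetical labels: two workload contexts and two paging-mode "arms".
CONTEXTS = ["ctx_a", "ctx_b"]
ARMS = ["shadow", "hap"]

# Invented mean rewards (higher is better); not measured values.
_BASE_REWARD = {
    ("ctx_a", "shadow"): 0.6,
    ("ctx_a", "hap"): 0.9,
    ("ctx_b", "shadow"): 0.8,
    ("ctx_b", "hap"): 0.5,
}


def simulated_performance(context, arm, rng):
    """Hypothetical reward: table value plus small bounded noise."""
    return _BASE_REWARD[(context, arm)] + rng.uniform(-0.05, 0.05)


def random_profile(n_samples, seed=0):
    """Profile random (context, arm) pairs and return the mean reward of each."""
    rng = random.Random(seed)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _ in range(n_samples):
        context = rng.choice(CONTEXTS)
        arm = rng.choice(ARMS)
        totals[(context, arm)] += simulated_performance(context, arm, rng)
        counts[(context, arm)] += 1
    return {key: totals[key] / counts[key] for key in counts}


def choose_arm(mean_rewards, context):
    """Greedy policy: the arm with the highest profiled mean for this context."""
    return max(ARMS, key=lambda arm: mean_rewards.get((context, arm), float("-inf")))
```

With `means = random_profile(400)`, `choose_arm(means, context)` returns the configuration to apply for a given context. A real controller would derive the context from low-level performance metrics instead of a fixed label, which is where the feature selection mentioned in the abstract would come in.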
Implementing Push-Pull Efficiently in GraphBLAS
We factor Beamer's push-pull, also known as direction-optimized
breadth-first-search (DOBFS), into 3 separable optimizations, and analyze them
for generalizability, asymptotic speedup, and contribution to overall speedup.
We demonstrate that masking is critical for high performance and can be
generalized to all graph algorithms where the sparsity pattern of the output is
known a priori. We show that these graph algorithm optimizations, which
together constitute DOBFS, can be neatly and separably described using linear
algebra and can be expressed in the GraphBLAS linear-algebra-based framework.
We provide experimental evidence that with these optimizations, a DOBFS
expressed in a linear-algebra-based graph framework attains competitive
performance with state-of-the-art graph frameworks on the GPU and on a
multi-threaded CPU, achieving 101 GTEPS on a Scale 22 RMAT graph.
Comment: 11 pages, 7 figures, International Conference on Parallel Processing
(ICPP) 201
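The push-pull idea can be sketched in plain Python, without GraphBLAS, to show where masking enters. This is an illustrative sketch under stated assumptions, not the paper's implementation: the graph is assumed undirected and stored as an adjacency dictionary, and the switch rule (pull when the frontier exceeds a fraction of the vertices) is a simplified stand-in for a real direction heuristic. The function name `dobfs` and the `switch_fraction` parameter are hypothetical.

```python
def dobfs(adj, source, switch_fraction=0.25):
    """Direction-optimized BFS on an undirected graph.

    adj: dict mapping each vertex to a list of its neighbours.
    Returns a dict mapping each reachable vertex to its BFS depth.
    """
    n = len(adj)
    depth = {source: 0}   # doubles as the "visited" set used for masking
    frontier = {source}
    level = 0
    while frontier:
        level += 1
        next_frontier = set()
        if len(frontier) > switch_fraction * n:
            # Pull step: every unvisited vertex looks for a parent in the frontier.
            for v in adj:
                if v in depth:
                    continue  # mask: visited vertices cannot appear in the output
                for u in adj[v]:
                    if u in frontier:
                        next_frontier.add(v)
                        break  # one parent is enough, stop scanning neighbours
        else:
            # Push step: every frontier vertex expands its neighbours.
            for u in frontier:
                for v in adj[u]:
                    if v not in depth:  # mask: skip already visited vertices
                        next_frontier.add(v)
        for v in next_frontier:
            depth[v] = level
        frontier = next_frontier
    return depth
```

Setting `switch_fraction` to `1.0` forces push at every level and `0.0` forces pull, so the two directions can be checked against each other on the same graph. In both branches the visited set plays the role of the mask: the sparsity pattern of the output (unvisited vertices only) is known before the step runs.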
Trilateral research chairs initiative : final report
This project set out to use light to solve environmental issues that are of high importance in Africa and beyond. In addition, the project chairs regarded high-quality personnel training as a major component of the research, with the goal of generating skilled scientists with the knowledge, abilities, and international network to solve long-term problems. Combined, the three research groups utilized their extensive experience in the use of light to trigger photophysical or photochemical processes, capitalizing on their complementary skills in synthesis, nanotechnology, and application of photochemistry to health, such as therapeutics or diagnosis, as well as to environmental issues. Following the growing interest in combining nanomaterials with photosensitizers for photodynamic antimicrobial chemotherapy (PACT) and photodegradation of pollutants in water sanitation, the aim of the project was to link metallic and/or metal oxide nano/micro-particles to materials that could enhance their environmental performance, for example by decorating them with metal nanostructures, or with porphyrin-type complexes such as metallophthalocyanines and metalloporphyrins, to create new hybrid materials for their intelligent use in environmental control. For this, new materials were synthesized and characterized, and their applications in PACT, degradation of pollutants, and photosterilization of potable water are being explored.