Auto-tuning Distributed Stream Processing Systems using Reinforcement Learning
Fine-tuning distributed systems is considered a craft, relying
on intuition and experience. This becomes even more challenging when the
systems need to react in near real time, as streaming engines have to do to
maintain pre-agreed service quality metrics. In this article, we present an
automated approach that builds on a combination of supervised and reinforcement
learning methods to recommend the most appropriate lever configurations based
on previous load. With this, streaming engines can be automatically tuned
without requiring a human to determine the right way and proper time to deploy
them. This opens the door to new configurations that are not being applied
today since the complexity of managing these systems has surpassed the
abilities of human experts. We show how reinforcement learning systems can find
substantially better configurations in less time than their human counterparts
and adapt to changing workloads.
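The kind of reinforcement-learning tuner the abstract describes can be sketched as a simple bandit loop over candidate lever configurations. Everything below is invented for illustration: the configuration set, the reward signal, and the assumption that a parallelism of 8 is optimal for the current load.

```python
import random

random.seed(0)  # deterministic run for the illustration

# Hypothetical lever configurations (e.g. operator parallelism levels).
CONFIGS = [1, 2, 4, 8, 16]

def choose_config(q_values, epsilon=0.1):
    """Epsilon-greedy selection: mostly exploit, occasionally explore."""
    if random.random() < epsilon:
        return random.choice(CONFIGS)
    return max(CONFIGS, key=lambda c: q_values[c])

def update(q_values, config, reward, alpha=0.5):
    """Move the configuration's value estimate toward the observed reward."""
    q_values[config] += alpha * (reward - q_values[config])

q = {c: 0.0 for c in CONFIGS}
for step in range(500):
    c = choose_config(q)
    reward = -abs(c - 8)   # toy load: parallelism 8 gives the best latency
    update(q, c, reward)

best = max(CONFIGS, key=lambda c: q[c])   # converges to 8 under this reward
```

A production tuner would replace the toy reward with measured service-quality metrics and pair it with a supervised model of the incoming load, as the article proposes.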
Machine learning for property prediction and optimization of polymeric nanocomposites: a state-of-the-art
Recently, the field of polymer nanocomposites has been an area of high scientific and industrial attention due to noteworthy improvements attained in these materials, arising from the synergistic combination of properties of a polymeric matrix and an organic or inorganic nanomaterial. The enhanced performance of those materials typically involves superior mechanical strength, toughness and stiffness, electrical and thermal conductivity, better flame retardancy and a higher barrier to moisture and gases. Nanocomposites can also display unique design possibilities, which provide exceptional advantages in developing multifunctional materials with desired properties for specific applications. On the other hand, machine learning (ML) has been recognized as a powerful predictive tool for data-driven multi-physical modelling, leading to unprecedented insights and an exploration of the system's properties beyond the capability of traditional computational and experimental analyses. This article aims to provide a brief overview of the most important findings related to the application of ML for the rational design of polymeric nanocomposites. Prediction, optimization, feature identification and uncertainty quantification are presented along with different ML algorithms used in the field of polymeric nanocomposites for property prediction, and selected examples are discussed. Finally, conclusions and future perspectives are highlighted.
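As a toy illustration of the property-prediction workflow such reviews survey, the sketch below fits a one-descriptor linear model and predicts at a new point. The descriptor (filler weight fraction), target property (elastic modulus), and data are all invented for illustration, not taken from the article.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Toy training set: (filler fraction in wt%, modulus in GPa) -- illustrative.
fractions = [0.0, 1.0, 2.0, 4.0]
moduli    = [2.0, 2.4, 2.8, 3.6]

a, b = fit_line(fractions, moduli)
predicted = a * 3.0 + b   # modulus predicted at 3 wt% filler
```

Real studies in the field replace this with richer descriptors, nonlinear models (random forests, neural networks), and uncertainty quantification on top of the fit.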
Global morphogenetic flow is accurately predicted by the spatial distribution of myosin motors.
During embryogenesis tissue layers undergo morphogenetic flow rearranging and folding into specific shapes. While developmental biology has identified key genes and local cellular processes, global coordination of tissue remodeling at the organ scale remains unclear. Here, we combine in toto light-sheet microscopy of the Drosophila embryo with quantitative analysis and physical modeling to relate cellular flow with the patterns of force generation during the gastrulation process. We find that the complex spatio-temporal flow pattern can be predicted from the measured meso-scale myosin density and anisotropy using a simple, effective viscous model of the tissue, achieving close to 90% accuracy with one time dependent and two constant parameters. Our analysis uncovers the importance of a) spatial modulation of myosin distribution on the scale of the embryo and b) the non-locality of its effect due to mechanical interaction of cells, demonstrating the need for a global perspective in the study of morphogenetic flow.
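An effective viscous model of this kind can be sketched schematically as a force balance between passive viscous stress and a myosin-set active stress. The notation below (viscosity $\eta$, time-dependent activity coefficient $\alpha(t)$, myosin density $m$, anisotropy tensor $Q$) is an illustrative assumption, not the paper's exact formulation:

```latex
% Force balance on the tissue: divergence-free total stress,
% with an active contribution proportional to the measured
% myosin density m and anisotropy tensor Q.
\nabla \cdot \sigma = 0, \qquad
\sigma = \eta \left( \nabla v + (\nabla v)^{T} \right) + \alpha(t)\, m\, Q
```

Solving such a balance for the velocity field $v$ given the measured myosin pattern is what makes the flow prediction non-local: stress generated in one region drives motion elsewhere through the mechanical coupling of cells.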
HPC-enabling technologies for high-fidelity combustion simulations
With the increase in computational power in the last decade and the forthcoming Exascale supercomputers, a new horizon in computational modelling and simulation is envisioned in combustion science. Considering the multiscale and multiphysics characteristics of turbulent reacting flows, combustion simulations are considered as one of the most computationally demanding applications running on cutting-edge supercomputers. Exascale computing opens new frontiers for the simulation of combustion systems as more realistic conditions can be achieved with high-fidelity methods. However, an efficient use of these computing architectures requires methodologies that can exploit all levels of parallelism. The efficient utilization of the next generation of supercomputers needs to be considered from a global perspective, that is, involving physical modelling and numerical methods with methodologies based on High-Performance Computing (HPC) and hardware architectures. This review introduces recent developments in numerical methods for large-eddy simulations (LES) and direct-numerical simulations (DNS) to simulate combustion systems, with focus on the computational performance and algorithmic capabilities. Due to the broad scope, a first section is devoted to describe the fundamentals of turbulent combustion, which is followed by a general description of state-of-the-art computational strategies for solving these problems. These applications require advanced HPC approaches to exploit modern supercomputers, which is addressed in the third section. The increasing complexity of new computing architectures, with tightly coupled CPUs and GPUs, as well as high levels of parallelism, requires new parallel models and algorithms exposing the required level of concurrency. Advances in terms of dynamic load balancing, vectorization, GPU acceleration and mesh adaptation have permitted to achieve highly-efficient combustion simulations with data-driven methods in HPC environments. 
Therefore, dedicated sections covering the use of high-order methods for reacting flows, integration of detailed chemistry and two-phase flows are addressed. Final remarks and directions of future work are given at the end.
The research leading to these results has received funding from the European Union’s Horizon 2020 Programme under the CoEC project (grant agreement No. 952181) and the CoE RAISE project (grant agreement No. 951733).
High variability of perezone content in rhizomes of Acourtia cordata wild plants, environmental factors related, and proteomic analysis
With the aim of exploring the source of the high variability observed in the production of perezone, in Acourtia cordata wild plants, we analyze the influence of soil parameters and phenotypic characteristics on its perezone content. Perezone is a sesquiterpene quinone responsible for several pharmacological effects and the A. cordata plants are the natural source of this metabolite. The chemistry of perezone has been widely studied, however, no studies exist related to its production under natural conditions, nor to its biosynthesis and the environmental factors that affect the yield of this compound in wild plants. We also used a proteomic approach to detect differentially expressed proteins in wild plant rhizomes and compare the profiles of high vs. low perezone-producing plants. Our results show that in perezone-producing rhizomes, the presence of high concentrations of this compound could result from a positive response to the effects of some edaphic factors, such as total phosphorus (Pt), total nitrogen (Nt), ammonium (NH4), and organic matter (O. M.), but could also be due to a negative response to the soil pH value. Additionally, we identified 616 differentially expressed proteins between high and low perezone producers. According to the functional annotation of this comparison, the upregulated proteins were grouped in valine biosynthesis, breakdown of leucine and isoleucine, and secondary metabolism such as terpenoid biosynthesis. Downregulated proteins were grouped in basal metabolism processes, such as pyruvate and purine metabolism and glycolysis/gluconeogenesis. Our results suggest that soil parameters can impact the content of perezone in wild plants. Furthermore, we used proteomic resources to obtain data on the pathways expressed when A. cordata plants produce high and low concentrations of perezone. 
These data may be useful to further explore the possible relationship between perezone production and abiotic or biotic factors and the molecular mechanisms related to high and low perezone production. This work was supported by the Programa de Mejoramiento del Profesorado PROMEP/103.5/13/6626 and Consejo Nacional de Ciencia y Tecnología CONACyT-Mexico for Ph.D. scholarship 392123/254165. The University of Alicante lab is a member of Proteored, PRB3 and is supported by grant PT17/0019, of the PE I+D+I 2013-2016, funded by ISCIII and ERDF. Roque Bru-Martínez received financial support from the University of Alicante (VIGROB-105).
Making Data Storage Efficient in the Era of Cloud Computing
Over the last decade we have entered the era of cloud computing, as many paradigm shifts are happening in how people write and deploy applications. Despite the advancement of cloud computing, data storage abstractions have not evolved much, causing inefficiencies in performance, cost, and security.
This dissertation proposes a novel approach to make data storage efficient in the era of cloud computing by building new storage abstractions and systems that bridge the gap between cloud computing and data storage and simplify development. We build four systems to address four data inefficiencies in cloud computing.
The first system, Grandet, solves the data storage inefficiency caused by the paradigm shift from upfront provisioning to a variety of pay-as-you-go cloud services. Grandet is an extensible storage system that significantly reduces storage costs for web applications deployed in the cloud. Under the hood, it supports multiple heterogeneous stores and unifies them by placing each data object at the store deemed most economical. Our results show that Grandet reduces application storage costs by an average of 42.4%, and it is fast, scalable, and easy to use.
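The "place each object at the store deemed most economical" idea can be sketched with a toy per-store price model: estimate each store's monthly cost for an object from its size and access rate, and pick the minimum. The store names and prices below are illustrative assumptions, not Grandet's actual model.

```python
# Hypothetical price model: (dollars per GB-month, dollars per 1000 GETs).
STORES = {
    "object_store": (0.023, 0.0004),
    "key_value":    (0.250, 0.0000),
    "archive":      (0.004, 0.0500),
}

def monthly_cost(store, size_gb, gets_per_month):
    """Estimated monthly cost: storage rent plus request charges."""
    per_gb, per_kget = STORES[store]
    return size_gb * per_gb + gets_per_month / 1000 * per_kget

def place(size_gb, gets_per_month):
    """Choose the store with the lowest estimated monthly cost."""
    return min(STORES, key=lambda s: monthly_cost(s, size_gb, gets_per_month))

cold = place(100.0, 10)         # large, rarely read -> cheap archival tier
hot  = place(0.001, 5_000_000)  # tiny, read-heavy -> request-free store
```

The interesting engineering is in everything this sketch omits: predicting future access patterns per object and migrating data as those predictions change.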
The second system, Unic, solves the data inefficiency caused by the paradigm shift from single-tenancy to multi-tenancy. Unic securely deduplicates general computations. It exports a cache service that allows cloud applications running on behalf of mutually distrusting users to memoize and reuse computation results, thereby improving performance. Unic achieves both integrity and secrecy through a novel use of code attestation, and it provides a simple yet expressive API that enables applications to deduplicate their own rich computations. Our results show that Unic is easy to use, speeds up applications by an average of 7.58x, with little storage overhead.
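The core memoization idea can be sketched as a cache keyed by the identity of the code plus its inputs, so an identical call (possibly from another tenant) reuses the stored result. This toy omits Unic's attestation and secrecy machinery entirely, and the names are invented for illustration.

```python
import hashlib
import json

CACHE = {}

def cache_key(code_id: str, args) -> str:
    """Content key derived from the code identity and its arguments."""
    blob = json.dumps({"code": code_id, "args": args}, sort_keys=True)
    return hashlib.sha256(blob.encode()).hexdigest()

def memoized(code_id, fn, args):
    """Compute once per (code, inputs) pair; serve repeats from the cache."""
    key = cache_key(code_id, args)
    if key not in CACHE:
        CACHE[key] = fn(*args)
    return CACHE[key]

r1 = memoized("square-v1", lambda x: x * x, (12,))
r2 = memoized("square-v1", lambda x: x * x, (12,))  # served from cache
```

In a real multi-tenant setting the hard part is trusting the key: attestation is what lets the cache believe that two tenants really ran the same code, without revealing their inputs to each other.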
The third system, Lambdata, solves the data inefficiency caused by the paradigm shift to serverless computing, where developers only write core business logic, and cloud service providers maintain all the infrastructure. Lambdata is a novel serverless computing system that enables developers to declare a cloud function's data intents, including both data read and data written. Once data intents are made explicit, Lambdata performs a variety of optimizations to improve speed, including caching data locally and scheduling functions based on code and data locality. Our results show that Lambdata achieves an average speedup of 1.51x on the turnaround time of practical workloads and reduces monetary cost by 16.5%.
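Locality-aware scheduling from declared data intents can be sketched as follows: each invocation names the objects it will read, and the scheduler picks the worker whose local cache already holds the most of them. The worker names, object names, and function signature are illustrative assumptions, not Lambdata's API.

```python
def schedule(reads, worker_caches):
    """Return the worker whose cache holds the most declared inputs."""
    return max(worker_caches,
               key=lambda w: len(worker_caches[w] & set(reads)))

# Hypothetical cluster state: which objects each worker has cached locally.
caches = {
    "worker-a": {"img/001.png", "img/002.png"},
    "worker-b": {"model/weights.bin"},
}

chosen = schedule(reads=["img/001.png", "img/002.png"], worker_caches=caches)
```

Declared *writes* matter too: knowing what a function will produce lets the scheduler route downstream functions to the worker that will already hold their inputs.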
The fourth system, CleanOS, solves the data inefficiency caused by the paradigm shift from desktop computers to smartphones always connected to the cloud. CleanOS is a new Android-based operating system that manages sensitive data rigorously and maintains a clean environment at all times. It identifies and tracks sensitive data, encrypts it with a key, and evicts that key to the cloud when the data is not in active use on the device. Our results show that CleanOS limits sensitive-data exposure drastically while incurring acceptable overheads on mobile networks.
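The evict-idle-keys idea can be sketched as a key store that drops decryption keys for data left unused past an idle limit. This toy merely deletes idle keys locally; real CleanOS evicts them to the cloud so they can be fetched back on demand. All names and the idle limit are invented for illustration.

```python
import time

class IdleKeyStore:
    """Holds decryption keys only while their data is in active use."""

    def __init__(self, idle_limit_s):
        self.idle_limit = idle_limit_s
        self.keys = {}  # key_id -> (key_bytes, last_use_time)

    def use(self, key_id, key_bytes=None):
        """Register or touch a key, refreshing its last-use timestamp."""
        if key_bytes is not None:
            self.keys[key_id] = (key_bytes, time.monotonic())
        key, _ = self.keys[key_id]
        self.keys[key_id] = (key, time.monotonic())
        return key

    def sweep(self):
        """Evict keys whose data has been idle past the limit."""
        now = time.monotonic()
        for k in [k for k, (_, t) in self.keys.items()
                  if now - t > self.idle_limit]:
            del self.keys[k]

store = IdleKeyStore(idle_limit_s=0.01)
store.use("mail", b"secret-key")
time.sleep(0.05)          # data sits idle past the limit
store.sweep()
evicted = "mail" not in store.keys
```

After eviction, the ciphertext on the device is useless to a thief; only re-fetching the key from the cloud restores access, which is what bounds exposure when a phone is lost.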
On the design of architecture-aware algorithms for emerging applications
This dissertation maps various kernels and applications to a spectrum of programming models and architectures and also presents architecture-aware algorithms for different systems. The kernels and applications discussed in this dissertation have widely varying computational characteristics. For example, we consider both dense numerical computations and sparse graph algorithms. This dissertation also covers emerging applications from image processing, complex network analysis, and computational biology.
We map these problems to diverse multicore processors and manycore accelerators. We also use new programming models (such as Transactional Memory, MapReduce, and Intel TBB) to address the performance and productivity challenges in the problems. Our experiences highlight the importance of mapping applications to appropriate programming models and architectures. We also find several limitations of current system software and architectures and directions to improve those. The discussion focuses on system software and architectural support for nested irregular parallelism, Transactional Memory, and hybrid data transfer mechanisms. We believe that the complexity of parallel programming can be significantly reduced via collaborative efforts among researchers and practitioners from different domains. This dissertation participates in the efforts by providing benchmarks and suggestions to improve system software and architectures.
Ph.D. Committee Chair: Bader, David; Committee Members: Hong, Bo; Riley, George; Vuduc, Richard; Wills, Scot.