1,718 research outputs found
A Survey on Automatic Parameter Tuning for Big Data Processing Systems
Big data processing systems (e.g., Hadoop, Spark, Storm) contain a vast number of configuration parameters controlling parallelism, I/O behavior, memory settings, and compression. Improper parameter settings can cause significant performance degradation and stability issues. However, regular users and even expert administrators struggle to understand and tune them to achieve good performance. We investigate existing approaches to parameter tuning for both batch and stream data processing systems and classify them into six categories: rule-based, cost modeling, simulation-based, experiment-driven, machine learning, and adaptive tuning. We summarize the pros and cons of each approach and raise some open research problems for automatic parameter tuning.
Generation of metabolites by an automated online metabolism method using human liver microsomes with subsequent identification by LC-MS(n), and metabolism of 11 cathinones
Human liver microsomes (HLMs) are used to simulate human xenobiotic metabolism in vitro. In forensic and clinical toxicology, HLMs are widely used, for example, to study the metabolism of new designer drugs. In this work, we present an automated online extraction system we developed for HLM experiments, which was compared to a classical offline approach. Furthermore, we present studies on the metabolism of 11 cathinones; for eight of these, the metabolism has not previously been reported. Metabolites were identified based on MS2 and MS3 scans. Fifty-three substances encompassing various classes of drugs were employed to compare the established offline and the new online methods. The metabolism of each of the following 11 cathinones was studied using the new method: 3,4-methylenedioxy-N-benzylcathinone, benzedrone, butylone, dimethylcathinone, ethylone, flephedrone, methedrone, methylone, methylethylcathinone, naphyrone, and pentylone. The agreement between the offline and the online methods was good; a total of 158 metabolites were identified. Using only the offline method, 156 (98.7%) metabolites were identified, while 151 (95.6%) were identified using only the online method. The metabolic pathways identified for the 11 cathinones included the reduction of the keto group, desalkylation, hydroxylation, and desmethylenation in cathinones containing a methylenedioxy moiety. Our method provides a straightforward approach to identifying metabolites which can then be added to the library utilized by our clinical toxicological screening method. The performance of our method compares well with that of an established offline HLM procedure, but is as automated as possible.
The Design and Implementation of Low-Latency Prediction Serving Systems
Machine learning is being deployed in a growing number of applications which demand real-time, accurate, and cost-efficient predictions under heavy query load. These applications employ a variety of machine learning frameworks and models, often composing several models within the same application. However, most machine learning frameworks and systems are optimized for model training and not deployment. In this thesis, I discuss three prediction serving systems designed to meet the needs of modern interactive machine learning applications. The key idea in this work is to utilize a decoupled, layered design that interposes systems on top of training frameworks to build low-latency, scalable serving systems. Velox introduced this decoupled architecture to enable fast online learning and model personalization in response to feedback. Clipper generalized this system architecture to be framework-agnostic and introduced a set of optimizations to reduce and bound prediction latency and improve prediction throughput, accuracy, and robustness without modifying the underlying machine learning frameworks. Finally, InferLine provisions and manages the individual stages of prediction pipelines to minimize cost while meeting end-to-end tail latency constraints.
Seamless Service Provisioning for Mobile Crowdsensing: Towards Integrating Forward and Spot Trading Markets
The challenge of exchanging and processing big data over Mobile
Crowdsensing (MCS) networks calls for the new design of responsive and seamless
service provisioning as well as proper incentive mechanisms. Although
conventional onsite spot trading of resources based on real-time network
conditions and decisions can facilitate the data sharing over MCS networks, it
often suffers from prohibitively long service provisioning delays and
unavoidable trading failures due to its reliance on timely analysis of complex
and dynamic MCS environments. These limitations motivate us to investigate an
integrated forward and spot trading mechanism (iFAST), which entails a new
hybrid service trading protocol over the MCS network architecture. In iFAST,
the sellers (i.e., mobile users with sensing resources) can provide long-term
or temporary sensing services to the buyers (i.e., sensing task owners). iFAST
enables signing long-term contracts in advance of future transactions through a
forward trading mode, via analyzing historical statistics of the market, for
which the notion of overbooking is introduced and promoted. iFAST further
enables buyers with unsatisfactory service quality to recruit temporary
sellers through a spot trading mode, upon considering the current
market/network conditions. We analyze the fundamental blocks of iFAST, and
provide a case study to demonstrate its superior performance as compared to
existing methods. Finally, future research directions on reliable service
provisioning for next-generation MCS networks are summarized.