38 research outputs found

    A reverse predictive model towards design automation of microfluidic droplet generators

    This work was presented at the 10th IWBDA workshop. Compared with test tubes, droplet-based microfluidic devices can reduce reaction volumes by a factor of 10^9 or more by encapsulating reactions in micro-scale droplets [4]. This volume reduction, alongside higher accuracy, higher sensitivity, and faster reaction times, has made droplet microfluidics a superior platform, particularly in biology and in biomedical and chemical engineering. However, a high barrier to entry prevents most life science laboratories from exploiting the advantages of microfluidics. There are two main obstacles to the widespread adoption of microfluidics: high fabrication costs and a lack of design automation tools. Recently, low-cost fabrication methods have reduced the cost of fabrication significantly [7]. Still, even with low-cost fabrication, the lack of automation tools leaves life science research groups reliant on a microfluidics expert to develop any new microfluidic device [3, 5]. In this work, we report a framework for developing reverse predictive models that can accurately automate the design process of microfluidic droplet generators. Such a model takes the prescribed performance metrics of a droplet generator as input and returns the device geometry and the fluid and flow settings that produce the desired performance. We hope this automation tool makes droplet-based microfluidics more accessible by reducing the time, cost, and knowledge needed to develop a microfluidic droplet generator that meets given performance requirements.

    Fitting Prediction Rule Ensembles with R Package pre

    Prediction rule ensembles (PREs) are sparse collections of rules, offering highly interpretable regression and classification models. This paper presents the R package pre, which derives PREs through the methodology of Friedman and Popescu (2008). The implementation and functionality of package pre are described and illustrated through an application to a dataset on the prediction of depression. Furthermore, the accuracy and sparsity of PREs are compared with those of single trees, random forests, and lasso regression on four benchmark datasets. Results indicate that pre derives ensembles with predictive accuracy comparable to that of random forests, while using a smaller number of variables for prediction.

    New developments on the cheminformatics open workflow environment CDK-Taverna

    Background: The computational processing and analysis of small molecules is at the heart of cheminformatics and structural bioinformatics and their application in, e.g., metabolomics or drug discovery. Pipelining or workflow tools allow for the Lego™-like, graphical assembly of I/O modules and algorithms into a complex workflow which can be easily deployed, modified and tested without the hassle of implementing it into a monolithic application. The CDK-Taverna project aims at building a free open-source cheminformatics pipelining solution through combination of different open-source projects such as Taverna, the Chemistry Development Kit (CDK) or the Waikato Environment for Knowledge Analysis (WEKA). A first integrated version 1.0 of CDK-Taverna was recently released to the public. Results: The CDK-Taverna project was migrated to the most up-to-date versions of its foundational software libraries with a complete re-engineering of its worker architecture (version 2.0). 64-bit computing and multi-core usage by parallel threads are now supported to allow for fast in-memory processing and analysis of large sets of molecules. Earlier deficiencies, such as workarounds for iterative data reading, are removed. The combinatorial chemistry related reaction enumeration features are considerably enhanced. Additional functionality for calculating a natural product likeness score for small molecules is implemented to identify possible drug candidates. Finally, the data analysis capabilities are extended with new workers that provide access to the open-source WEKA library for clustering and machine learning as well as training and test set partitioning. The new features are outlined with usage scenarios. Conclusions: CDK-Taverna 2.0, as an open-source cheminformatics workflow solution, has matured to become a freely available and increasingly powerful tool for the biosciences. The combination of the new CDK-Taverna worker family with the already available workflows developed by a lively Taverna community and published on myexperiment.org enables molecular scientists to quickly calculate, process and analyse molecular data as typically found in, e.g., today's systems biology scenarios.

    Gradient Boosting With Piece-Wise Linear Regression Trees

    Gradient Boosted Decision Trees (GBDT) is a very successful ensemble learning algorithm widely used across a variety of applications. Recently, several variants of GBDT training algorithms and implementations have been designed and heavily optimized in some very popular open-source toolkits, including XGBoost, LightGBM and CatBoost. In this paper, we show that both the accuracy and efficiency of GBDT can be further enhanced by using more complex base learners. Specifically, we extend gradient boosting to use piecewise linear regression trees (PL Trees), instead of piecewise constant regression trees, as base learners. We show that PL Trees can accelerate the convergence of GBDT and improve its accuracy. We also propose some optimization tricks to substantially reduce the training time of PL Trees, with little sacrifice of accuracy. Moreover, we propose several implementation techniques to speed up our algorithm on modern computer architectures with powerful Single Instruction Multiple Data (SIMD) parallelism. The experimental results show that GBDT with PL Trees can provide very competitive testing accuracy with comparable or less training time.
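
    The idea of boosting with piecewise linear base learners can be illustrated with a minimal sketch. This is not the paper's algorithm or any toolkit's API: it assumes squared loss, depth-1 trees, ridge-regularized least-squares fits in each leaf, and a coarse quantile-based split search. All function names (`fit_linear_leaf`, `fit_pl_stump`, `fit_gbdt_pl`, `predict_gbdt_pl`) and parameters are hypothetical, and none of the paper's optimization or SIMD techniques are reproduced.

```python
import numpy as np

def fit_linear_leaf(X, r, ridge=1e-6):
    # Least-squares fit r ~ w[0] + X @ w[1:], with a tiny ridge term for stability.
    A = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.linalg.solve(A.T @ A + ridge * np.eye(A.shape[1]), A.T @ r)

def predict_leaf(X, w):
    return w[0] + X @ w[1:]

def fit_pl_stump(X, r, n_thresholds=9, min_leaf=5):
    # Depth-1 tree with a linear model in each leaf; keeps the split with lowest SSE.
    best = None
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], np.linspace(0.1, 0.9, n_thresholds)):
            left = X[:, j] <= t
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            wl = fit_linear_leaf(X[left], r[left])
            wr = fit_linear_leaf(X[~left], r[~left])
            sse = (np.sum((r[left] - predict_leaf(X[left], wl)) ** 2)
                   + np.sum((r[~left] - predict_leaf(X[~left], wr)) ** 2))
            if best is None or sse < best[0]:
                best = (sse, j, t, wl, wr)
    if best is None:
        # No admissible split: fall back to a single linear leaf.
        w = fit_linear_leaf(X, r)
        return (0, np.inf, w, w)
    return best[1:]

def predict_stump(stump, X):
    j, t, wl, wr = stump
    left = X[:, j] <= t
    out = np.empty(X.shape[0])
    out[left] = predict_leaf(X[left], wl)
    out[~left] = predict_leaf(X[~left], wr)
    return out

def fit_gbdt_pl(X, y, n_rounds=20, lr=0.5):
    base = float(y.mean())
    pred = np.full(y.shape, base)
    stumps = []
    for _ in range(n_rounds):
        residual = y - pred  # negative gradient of the squared loss
        stump = fit_pl_stump(X, residual)
        pred += lr * predict_stump(stump, X)
        stumps.append(stump)
    return base, stumps, lr

def predict_gbdt_pl(model, X):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, X) for s in stumps)

# Toy data (an assumption for illustration): the slope in x1 changes with the sign of x0.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.where(X[:, 0] > 0, 2 * X[:, 1], -X[:, 1]) + 0.01 * rng.normal(size=200)

model = fit_gbdt_pl(X, y)
mse = float(np.mean((y - predict_gbdt_pl(model, X)) ** 2))
print("training MSE:", mse)
```

    On this toy target a piecewise constant stump would need many rounds to approximate each slope, whereas a linear leaf can capture it directly, which is the intuition behind the faster convergence the abstract reports.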

    Sometimes, Money Does Grow On Trees: Data-Driven Demand Response with DR-Advisor

    Real-time electricity pricing and demand response have become a clean, reliable and cost-effective way of mitigating peak demand on the electricity grid. We consider the problem of end-user demand response (DR) for large commercial buildings, which involves predicting the demand response baseline, evaluating fixed DR strategies, and synthesizing DR control actions for load curtailment in return for a financial reward. Using historical data from the building, we build a family of regression trees and learn data-driven models for predicting the power consumption of the building in real time. We present a method called DR-Advisor, which acts as a recommender system for the building's facilities manager and provides suitable control actions to meet the desired load curtailment while maintaining operations and maximizing the economic reward. We evaluate the performance of DR-Advisor for demand response using data from a real office building and a virtual test-bed.