
    Min-Max MPC based on a computationally efficient upper bound of the worst case cost

    Min-Max MPC (MMMPC) controllers [P.J. Campo, M. Morari, Robust model predictive control, in: Proc. American Control Conference, June 10–12, 1987, pp. 1021–1026] suffer from a heavy computational burden that limits their applicability in industry. Upper bounds on the worst-case value of the performance index are sometimes used to reduce this burden. This paper proposes a computationally efficient MMMPC control strategy in which the worst-case cost is approximated by an upper bound based on a diagonalization scheme. The upper bound can be computed in O(n^3) operations using only simple matrix operations, so the algorithm can be coded easily even in programming languages not oriented toward mathematics, such as those found in industrial embedded control hardware. A simulation example is given in the paper.
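    To make the bounding idea concrete, the sketch below computes a cheap upper bound on the worst-case value of a quadratic cost over box-bounded disturbances using a diagonal over-bound of the cost matrix. This is an illustrative Gershgorin-style bound, not the specific diagonalization scheme of the paper; the cost form, the variable names, and the disturbance bound eps are assumptions.

```python
import numpy as np

def worst_case_upper_bound(H, f, c, eps):
    """Upper bound of max_{|w_i| <= eps} w'Hw + 2 f'w + c.

    Illustrative only: uses a Gershgorin-style diagonal over-bound
    D >= H (D_ii = sum_j |H_ij|), not the paper's exact diagonalization scheme.
    """
    H = np.asarray(H, dtype=float)
    f = np.asarray(f, dtype=float)
    d = np.abs(H).sum(axis=1)          # diagonal over-bound of H
    quad = eps**2 * d.sum()            # max of w'Dw over the box
    lin = 2.0 * eps * np.abs(f).sum()  # max of 2 f'w over the box
    return quad + lin + c

# Example: a small 3x3 cost matrix
H = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
print(worst_case_upper_bound(H, f=np.ones(3), c=0.0, eps=0.1))
```

    Because this bound only needs absolute row sums of H, it avoids solving the inner maximization entirely and is trivial to port to embedded, non-mathematical programming environments.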

    Activity Cliff Prediction: Dataset and Benchmark

    Activity cliffs (ACs), generally defined as pairs of structurally similar molecules that are active against the same bio-target but differ significantly in binding potency, are of great importance to drug discovery. To date, the AC prediction problem, i.e., predicting whether a pair of molecules exhibits the AC relationship, has not been fully explored. In this paper, we first introduce ACNet, a large-scale dataset for AC prediction. ACNet curates over 400K Matched Molecular Pairs (MMPs) against 190 targets, including over 20K MMP-cliffs and 380K non-AC MMPs, and provides five subsets for model development and evaluation. We then propose a baseline framework to benchmark the predictive performance of molecular representations encoded by deep neural networks for AC prediction, and 16 models are evaluated in experiments. Our results show that deep learning models can achieve good performance when trained on tasks with an adequate amount of data, while the imbalanced, low-data, and out-of-distribution characteristics of the ACNet dataset remain challenging for deep neural networks. In addition, the traditional ECFP method shows a natural advantage in MMP-cliff prediction and outperforms the deep learning models on most of the data subsets. To the best of our knowledge, our work constructs the first large-scale dataset for AC prediction, which may stimulate the study of AC prediction models and prompt further breakthroughs in AI-aided drug discovery. The code and dataset are available at https://drugai.github.io/ACNet/.
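    As a rough illustration of the ECFP baseline mentioned above, the sketch below featurizes a molecule pair with Morgan fingerprints and trains an off-the-shelf classifier. It assumes RDKit and scikit-learn are installed; the pair encoding (simple concatenation) and the toy SMILES data are illustrative choices, not ACNet's actual protocol.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

def ecfp(smiles, radius=2, n_bits=2048):
    # Morgan (ECFP-like) fingerprint as a numpy bit vector.
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    return np.array(fp)

def pair_features(smiles_a, smiles_b):
    # Concatenate the two fingerprints so the model sees both molecules.
    return np.concatenate([ecfp(smiles_a), ecfp(smiles_b)])

# Toy data: (molecule A, molecule B, is_MMP-cliff label) -- placeholders only.
pairs = [("CCO", "CCN", 1), ("c1ccccc1", "c1ccccc1C", 0)]
X = np.stack([pair_features(a, b) for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])

clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict_proba(X)[:, 1])
```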

    A Hybrid Machine Learning Framework for Predicting Students’ Performance in Virtual Learning Environment

    Virtual Learning Environments (VLEs), such as Moodle and Blackboard, store vast amounts of data that help identify students' performance and engagement. As a result, researchers have been focusing their efforts on assisting educational institutions by providing machine learning models to predict at-risk students and improve their performance. However, an efficient approach is required to construct a model that can ultimately provide accurate predictions. Consequently, this study proposes a hybrid machine learning framework to predict students' performance using eight classification algorithms and three ensemble methods (Bagging, Boosting, Voting) to determine the best-performing predictive model. In addition, this study used filter-based and wrapper-based feature selection techniques to select the features of the dataset most related to students' performance. The obtained results reveal that the ensemble methods achieved higher predictive accuracy than the single classifiers. Furthermore, the accuracy of the models improved due to the feature selection techniques used in this study.
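    A minimal sketch of the ensemble-plus-feature-selection idea is given below using scikit-learn. The particular base classifiers, the soft Voting ensemble, and the k-best filter are illustrative stand-ins, not the exact configuration evaluated in the study, and the data are synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for VLE engagement/performance features.
X, y = make_classification(n_samples=500, n_features=30, random_state=0)

voting = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier()),
                ("rf", RandomForestClassifier(n_estimators=100))],
    voting="soft",
)
model = Pipeline([
    ("select", SelectKBest(f_classif, k=15)),   # filter-based feature selection
    ("ensemble", voting),                       # voting ensemble of classifiers
])
print(cross_val_score(model, X, y, cv=5).mean())
```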

    The Human Tumor Atlas Network: Charting Tumor Transitions across Space and Time at Single-Cell Resolution

    Crucial transitions in cancer—including tumor initiation, local expansion, metastasis, and therapeutic resistance—involve complex interactions between cells within the dynamic tumor ecosystem. Transformative single-cell genomics technologies and spatial multiplex in situ methods now provide an opportunity to interrogate this complexity at unprecedented resolution. The Human Tumor Atlas Network (HTAN), part of the National Cancer Institute (NCI) Cancer Moonshot Initiative, will establish a clinical, experimental, computational, and organizational framework to generate informative and accessible three-dimensional atlases of cancer transitions for a diverse set of tumor types. This effort complements both ongoing efforts to map healthy organs and previous large-scale cancer genomics approaches focused on bulk sequencing at a single point in time. Generating single-cell, multiparametric, longitudinal atlases and integrating them with clinical outcomes should help identify novel predictive biomarkers and features as well as therapeutically relevant cell types, cell states, and cellular interactions across transitions. The resulting tumor atlases should have a profound impact on our understanding of cancer biology and have the potential to improve cancer detection, prevention, and therapeutic discovery for better precision-medicine treatment of cancer patients and those at risk for cancer.

    A Bi-Criteria Active Learning Algorithm for Dynamic Data Streams

    Active learning (AL) is a promising way to efficiently build up training sets with minimal supervision. A learner deliberately queries specific instances to tune the classifier's model using as few labels as possible. The challenge in the streaming setting is that the data distribution may evolve over time, so the model must adapt. Another challenge is sampling bias, where the sampled training set does not reflect the underlying data distribution. In the presence of concept drift, sampling bias is more likely to occur, as the training set needs to represent the whole evolving data stream. To tackle these challenges, we propose a novel bi-criteria AL approach (BAL) that relies on two selection criteria, namely a label uncertainty criterion and a density-based criterion. While the first criterion selects instances that are the most uncertain in terms of class membership, the latter dynamically curbs the sampling bias by weighting the samples to reflect the true underlying distribution. To design and implement these two criteria for learning from streams, BAL adopts a Bayesian online learning approach and combines online classification and online clustering through the use of online logistic regression and online growing Gaussian mixture models, respectively. Empirical results obtained on standard synthetic and real-world benchmarks show the high performance of the proposed BAL method compared to state-of-the-art AL methods.
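    The sketch below illustrates the bi-criteria idea on a simulated stream: an online logistic regression is updated via partial_fit, and a label is queried only when prediction uncertainty and a density estimate are jointly high. The single running Gaussian used as the density term is a deliberate simplification of BAL's growing Gaussian mixture model, and the stream, budget, and threshold are illustrative assumptions (a recent scikit-learn is assumed for the "log_loss" option).

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(loss="log_loss")          # online logistic regression
mean, var, n_seen = np.zeros(2), np.ones(2), 0

def density(x):
    # Diagonal-Gaussian density estimate from running mean/variance.
    return np.exp(-0.5 * np.sum((x - mean) ** 2 / var)) / np.sqrt(np.prod(2 * np.pi * var))

budget, queried = 50, 0
for t in range(1000):
    x = rng.normal(size=2); y = int(x.sum() > 0)          # simulated stream
    n_seen += 1
    mean += (x - mean) / n_seen                           # running mean
    var = np.maximum(var + ((x - mean) ** 2 - var) / n_seen, 1e-2)  # running variance (floored)
    if t < 10:                                            # warm-up: always query
        clf.partial_fit(x.reshape(1, -1), [y], classes=[0, 1]); queried += 1
        continue
    p = clf.predict_proba(x.reshape(1, -1))[0, 1]
    uncertainty = 1.0 - abs(p - 0.5) * 2                  # 1 at p=0.5, 0 at p in {0,1}
    if queried < budget and uncertainty * density(x) > 0.05:
        clf.partial_fit(x.reshape(1, -1), [y]); queried += 1
print("labels queried:", queried)
```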

    Genome-scale Profiling Reveals Noncoding Loci Carry Higher Proportions of Concordant Data

    Many evolutionary relationships remain controversial despite whole-genome sequencing data. These controversies arise in part due to challenges associated with accurately modeling the complex phylogenetic signal coming from genomic regions experiencing distinct evolutionary forces. Here we examine how different regions of the genome support or contradict well-established hypotheses among three mammal groups using millions of orthologous parsimony-informative biallelic sites (PIBS) distributed across primate, rodent, and Pecora genomes. We compared PIBS concordance percentages among locus types (e.g. coding sequences, introns, intergenic regions), and contrasted PIBS utility over evolutionary timescales. Sites derived from noncoding sequences provided more data and proportionally more concordant sites compared with those from coding sequences (CDS) in all clades. CDS PIBS were also predominant drivers of tree incongruence in two cases of topological conflict. PIBS derived from most locus types provided surprisingly consistent support for splitting events spread across the timescales we examined, although we find evidence that CDS and intronic PIBS may, respectively and to a limited degree, inform disproportionately about older and younger splits. In this era of accessible whole-genome sequence data, these results (1) suggest benefits to more intentionally focusing on noncoding loci as robust data for tree inference, and (2) reinforce the importance of accurate modeling, especially when using CDS data.
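    As a small illustration of the kind of site classification underlying these counts, the sketch below flags an alignment column as a parsimony-informative biallelic site and checks whether its allele pattern matches a reference bipartition. The concordance rule and the toy taxa are simplified assumptions, not the paper's full analysis pipeline.

```python
from collections import Counter

def is_pibs(column):
    # Biallelic and parsimony-informative: exactly two bases, each in >= 2 taxa.
    counts = Counter(b for b in column if b in "ACGT")
    return len(counts) == 2 and all(c >= 2 for c in counts.values())

def concordant(column, taxa, split):
    """split: frozenset of taxon names on one side of the reference bipartition."""
    if not is_pibs(column):
        return None
    alleles = {}
    for taxon, base in zip(taxa, column):
        alleles.setdefault(base, set()).add(taxon)
    groups = [frozenset(g) for g in alleles.values() if len(g) >= 2]
    other = frozenset(taxa) - split
    return any(g == split or g == other for g in groups)

taxa = ["human", "chimp", "gorilla", "orang"]
print(concordant("AACC", taxa, frozenset({"human", "chimp"})))   # True  (concordant)
print(concordant("ACAC", taxa, frozenset({"human", "chimp"})))   # False (discordant)
```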

    MuDelta: Delta-Oriented Mutation Testing at Commit Time

    To effectively test program changes using mutation testing, one needs to use mutants that are relevant to the altered program behaviours. In view of this, we introduce MuDelta, an approach that identifies commit-relevant mutants: mutants that affect, and are affected by, the changed program behaviours. Our approach uses machine learning applied to a combined scheme of graph- and vector-based representations of static code features. Our results, from 50 commits in 21 Coreutils programs, demonstrate a strong prediction ability of our approach, yielding 0.80 (ROC) and 0.50 (PR curve) AUC values with 0.63 precision and 0.32 recall. These predictions are significantly higher than random guessing, which achieves 0.20 (PR curve) AUC with 0.21 precision and 0.21 recall, and subsequently lead to strong relevant tests that kill 45% more relevant mutants than randomly sampled mutants (sampled either from those residing on the changed component(s) or from the changed lines). Our results also show that MuDelta selects mutants with 27% higher fault-revealing ability in fault-introducing commits. Taken together, our results corroborate the conclusion that commit-based mutation testing is suitable and promising for evolving software.
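    The sketch below mirrors the evaluation setup at a high level: a classifier is trained on static feature vectors to score mutants and is assessed with ROC and PR-curve AUC. The synthetic features and the gradient-boosting model are placeholders; MuDelta's combined graph- and vector-based representation is not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 20))                         # stand-in static code feature vectors
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000) > 1.2).astype(int)  # "commit-relevant" labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
clf = GradientBoostingClassifier().fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]
print("ROC AUC:", roc_auc_score(y_te, scores))
print("PR  AUC:", average_precision_score(y_te, scores))
```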

    Autonomous supervision and optimization of product quality in a multi-stage manufacturing process based on self-adaptive prediction models.

    In modern manufacturing facilities, there are basically two essential phases for assuring high production quality with low (or even zero) defects and waste in order to save costs for companies. The first phase concerns the early recognition of potentially arising problems in product quality; the second concerns proper reactions upon the recognition of such problems. In this paper, we address a holistic approach for handling both issues consecutively within a predictive maintenance framework for an on-line production system. We thereby address multi-stage functionality based on (i) data-driven forecast models for (measurable) product quality criteria (QCs) at a later stage, which are established and executed using process values (and their time-series trends) recorded at an early stage of production (describing its progress), and (ii) process optimization cycles whose outputs are suggestions for proper reactions at the earlier stage in the case of forecasted downtrends or violations of allowed boundaries in product quality. The data-driven forecast models are established through a high-dimensional batch time-series modeling problem. For this, we employ a non-linear version of PLSR (partial least squares regression) by coupling PLS with generalized Takagi–Sugeno fuzzy systems (termed PLS-fuzzy). The models are able to self-adapt over time based on recursive parameter adaptation and rule evolution functionalities. Two concepts for increased flexibility during model updates are proposed: (i) a dynamic outweighing strategy for older samples with an adaptive update of the forgetting factor (steering the forgetting intensity), and (ii) an incremental update of the latent variable space spanned by the directions (loading vectors) obtained through PLS; the whole model update approach is termed SAFM-IF (self-adaptive forecast models with increased flexibility). Process optimization is achieved through multi-objective optimization using evolutionary techniques, where the (trained and updated) forecast models serve as surrogate models to guide the optimization process towards high-quality Pareto fronts (containing solution candidates). A new influence analysis between process values and QCs is suggested based on the PLS-fuzzy forecast models in order to reduce the dimensionality of the optimization space and thus to guarantee high(er) quality of solutions within a reasonable amount of time (allowing better usage in on-line mode). The methodologies have been comprehensively evaluated on real on-line process data from a (micro-fluidic) chip production system, where the early stage comprises the injection molding process and the later stage the bonding process. The results show remarkable performance in terms of low prediction errors of the PLS-fuzzy forecast models (mostly lower than those achieved by other model architectures) as well as in terms of Pareto fronts with individuals (solutions) whose fitness was close to the optimal values of the three most important target QCs (used for supervision): flatness, void events and RMSEs of the chips. Suggestions could thus be provided to experts/operators on how best to change process values and associated machining parameters in the injection molding process in order to achieve significantly higher product quality for the final chips at the end of the bonding process.
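    The sketch below isolates one ingredient of the approach, the dynamic outweighing of older samples via an adaptive forgetting factor, using plain recursive least squares on a linear model with a simulated drift. The error-driven forgetting rule and the data are illustrative assumptions; the paper's actual models are PLS-fuzzy (Takagi–Sugeno) systems with rule evolution.

```python
import numpy as np

class AdaptiveRLS:
    def __init__(self, dim, lam=0.99):
        self.w = np.zeros(dim)
        self.P = np.eye(dim) * 1e3
        self.lam = lam                      # forgetting factor (1.0 = no forgetting)

    def update(self, x, y):
        err = y - self.w @ x
        # Forget faster (lower lambda) when errors grow, i.e. on suspected drift.
        self.lam = float(np.clip(0.999 - 0.05 * min(abs(err), 1.0), 0.9, 0.999))
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)        # RLS gain vector
        self.w += k * err
        self.P = (self.P - np.outer(k, Px)) / self.lam
        return err

rng = np.random.default_rng(0)
model = AdaptiveRLS(dim=3)
true_w = np.array([1.0, -2.0, 0.5])
for t in range(500):
    if t == 250:                            # simulated process drift
        true_w = np.array([0.2, 1.5, -1.0])
    x = rng.normal(size=3)
    y = true_w @ x + 0.05 * rng.normal()
    model.update(x, y)
print("estimated weights:", np.round(model.w, 2))
```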

    Quantifying and Explaining Machine Learning Uncertainty in Predictive Process Monitoring: An Operations Research Perspective

    This paper introduces a comprehensive, multi-stage machine learning methodology that integrates information systems and artificial intelligence to enhance decision-making processes within the domain of operations research. The proposed framework addresses common limitations of existing solutions, such as the neglect of data-driven estimation of vital production parameters, the exclusive generation of point forecasts without considering model uncertainty, and the lack of explanations regarding the sources of such uncertainty. Our approach employs Quantile Regression Forests for generating interval predictions, alongside both local and global variants of SHapley Additive exPlanations (SHAP) for the examined predictive process monitoring problem. The practical applicability of the proposed methodology is substantiated through a real-world production planning case study, emphasizing the potential of prescriptive analytics in refining decision-making procedures. This paper accentuates the imperative of addressing these challenges to fully harness the extensive and rich data resources available for well-informed decision-making.
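    A minimal sketch of the interval-prediction-plus-attribution pipeline is shown below. The per-tree quantiles of a random forest are used as a rough stand-in for true Quantile Regression Forests, the global attribution step assumes the third-party shap package, and the data and model settings are illustrative.

```python
import numpy as np
import shap  # third-party package, assumed installed
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Interval prediction: empirical 10%/90% quantiles over the individual trees
# (a simplification of Quantile Regression Forests).
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
lower = np.quantile(per_tree, 0.1, axis=0)
upper = np.quantile(per_tree, 0.9, axis=0)
print(np.c_[lower, forest.predict(X[:5]), upper])

# Global explanation: mean absolute SHAP value per feature.
shap_values = shap.TreeExplainer(forest).shap_values(X[:100])
print(np.abs(shap_values).mean(axis=0))
```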

    Evaluation of Neuro-Evolution Algorithms for Tactic Volatility Aware Processes

    Our society increasingly relies on computer systems that perform a variety of tasks. From a self-driving car to a satellite in space relaying data from Mars rovers, we need these systems to perform optimally and without failure. One such point of failure these systems can encounter is the tactic volatility of an adaptation tactic. Adaptation tactics are defined workflows that allow systems to navigate their environment. Tactic volatility is the variance in the behavior of a tactic's attributes, such as cost and latency, or a combination of the two. Current systems consider these tactic attributes to be static. Studies have shown that not accounting for tactic volatility can adversely affect a system's ability to operate effectively and resiliently. To support self-adaptive systems and address their limitations, this paper proposes a Tactic Volatility Aware solution that utilizes an evolving Recurrent Neural Network (TVA-E). For this research, we used real-world data that has been made available for use by researchers and academics. This data contains real-world volatility and helps us demonstrate the positive impact of TVA-E when used in self-adaptive systems. We also employ uncertainty reduction tactics and show how they can assist in accounting for tactic volatility. This work serves as an evaluation and comparison of different machine learning methods for predicting and accounting for tactic volatility. We study four predictive mechanisms: Auto-Regressive Integrated Moving Average (ARIMA), Evolving Recurrent Neural Network (eRNN), Multi-Layer Perceptron (MLP), and Support Vector Regression (SVR). These methods are studied within our TVA-E process, and we analyze how they can enhance a self-adaptive system's performance when it accounts for tactic volatility.
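    As a small illustration of one of the studied predictive mechanisms, the sketch below fits an ARIMA model to a simulated latency series and forecasts the next few values using statsmodels. The series, the ARIMA order, and the forecast horizon are assumptions, and the eRNN, MLP, and SVR comparisons are not reproduced here.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
# Simulated tactic latency measurements with a slow drift plus noise.
latency = 100 + np.cumsum(rng.normal(0, 0.5, size=300)) + rng.normal(0, 2, size=300)

model = ARIMA(latency, order=(2, 1, 1)).fit()
forecast = model.forecast(steps=10)        # predicted latency for the next 10 steps
print(np.round(forecast, 1))
```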