19 research outputs found
An ADMM Based Framework for AutoML Pipeline Configuration
We study the AutoML problem of automatically configuring machine learning
pipelines by jointly selecting algorithms and their appropriate
hyper-parameters for all steps in supervised learning pipelines. This black-box
(gradient-free) optimization with mixed integer & continuous variables is a
challenging problem. We propose a novel AutoML scheme by leveraging the
alternating direction method of multipliers (ADMM). The proposed framework is
able to (i) decompose the optimization problem into easier sub-problems that
have a reduced number of variables and circumvent the challenge of mixed
variable categories, and (ii) incorporate black-box constraints along-side the
black-box optimization objective. We empirically evaluate the flexibility (in
utilizing existing AutoML techniques), effectiveness (against open source
AutoML toolkits),and unique capability (of executing AutoML with practically
motivated black-box constraints) of our proposed scheme on a collection of
binary classification data sets from UCI ML& OpenML repositories. We observe
that on an average our framework provides significant gains in comparison to
other AutoML frameworks (Auto-sklearn & TPOT), highlighting the practical
advantages of this framework
PaniniQA: Enhancing Patient Education Through Interactive Question Answering
Patient portal allows discharged patients to access their personalized
discharge instructions in electronic health records (EHRs). However, many
patients have difficulty understanding or memorizing their discharge
instructions. In this paper, we present PaniniQA, a patient-centric interactive
question answering system designed to help patients understand their discharge
instructions. PaniniQA first identifies important clinical content from
patients' discharge instructions and then formulates patient-specific
educational questions. In addition, PaniniQA is also equipped with answer
verification functionality to provide timely feedback to correct patients'
misunderstandings. Our comprehensive automatic and human evaluation results
demonstrate our PaniniQA is capable of improving patients' mastery of their
medical instructions through effective interactionsComment: Accepted to TACL 2023. Equal contribution for the first two authors.
This arXiv version is a pre-MIT Press publication versio
Human-centered design and evaluation of AI-empowered clinical decision support systems: a systematic review
IntroductionArtificial intelligence (AI) technologies are increasingly applied to empower clinical decision support systems (CDSS), providing patient-specific recommendations to improve clinical work. Equally important to technical advancement is human, social, and contextual factors that impact the successful implementation and user adoption of AI-empowered CDSS (AI-CDSS). With the growing interest in human-centered design and evaluation of such tools, it is critical to synthesize the knowledge and experiences reported in prior work and shed light on future work.MethodsFollowing the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we conducted a systematic review to gain an in-depth understanding of how AI-empowered CDSS was used, designed, and evaluated, and how clinician users perceived such systems. We performed literature search in five databases for articles published between the years 2011 and 2022. A total of 19874 articles were retrieved and screened, with 20 articles included for in-depth analysis.ResultsThe reviewed studies assessed different aspects of AI-CDSS, including effectiveness (e.g., improved patient evaluation and work efficiency), user needs (e.g., informational and technological needs), user experience (e.g., satisfaction, trust, usability, workload, and understandability), and other dimensions (e.g., the impact of AI-CDSS on workflow and patient-provider relationship). Despite the promising nature of AI-CDSS, our findings highlighted six major challenges of implementing such systems, including technical limitation, workflow misalignment, attitudinal barriers, informational barriers, usability issues, and environmental barriers. These sociotechnical challenges prevent the effective use of AI-based CDSS interventions in clinical settings.DiscussionOur study highlights the paucity of studies examining the user needs, perceptions, and experiences of AI-CDSS. Based on the findings, we discuss design implications and future research directions
GEMv2 : Multilingual NLG benchmarking in a single line of code
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables the comparison on equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers to benefit from each others work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.Peer reviewe
GEMv2 : Multilingual NLG benchmarking in a single line of code
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables the comparison on equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers to benefit from each others work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.Peer reviewe
Multi-Information Flow CNN and Attribute-Aided Reranking for Person Reidentification
This paper presents a multi-information flow convolutional neural network (MiF-CNN) model for person reidentification (re-id). It contains several specific multilayer convolutional structures, where the input and output of a convolutional layer are concatenated together on channel dimension. With this idea, layers of model can go deeper and feature maps can be reused by each subsequent layer. Inspired by an image caption, a person attribute recognition network is proposed based on long-short-term memory network and attention mechanism. By fusing identification results of MiF-CNN and attribute recognition, this paper introduces the attribute-aided reranking algorithm to improve the accuracy of person re-id further. Experiments on VIPeR, CUHK01, and Market1501 datasets verify the proposed MiF-CNN can be trained sufficiently with small-scale datasets and obtain outstanding accuracy of person re-id. Contrast experiments also confirm the availability of the attribute-assisted reranking algorithm
Modeling and Optimization of the Drug Extraction Production Process
Optimized control of the drug extraction production process (DEPP) aims to reduce production costs and improve economic benefit while meeting quality requirements. However, optimization of DEPP is hampered by model uncertainty. Thus, in this paper, a strategy that considers model uncertainty is proposed. Mechanistic modeling of DEPP is first discussed in the context of previous work. The predictive model used for optimization is then developed by simplifying the mechanism. Optimization for a single extraction process is first implemented, but this is found to lead to serious wastage of herbs. Hence, the optimization of a multiextraction process is then conducted. To manage the uncertainty in the model, a data-driven iterative learning control method is introduced to improve the economic benefit by adjusting the operating variables. Finally, fuzzy parameter adjustment is adopted to enhance the convergence rate of the algorithm. The effectiveness of the proposed modeling and optimization strategy is validated through a series of simulations