Retro-BLEU: Quantifying Chemical Plausibility of Retrosynthesis Routes through Reaction Template Sequence Analysis
Computer-assisted methods have emerged as valuable tools for retrosynthesis
analysis. However, quantifying the plausibility of generated retrosynthesis
routes remains a challenging task. We introduce Retro-BLEU, a statistical
metric adapted from the well-established BLEU score in machine translation, to
evaluate the plausibility of retrosynthesis routes based on reaction template
sequence analysis. We demonstrate the effectiveness of Retro-BLEU by applying
it to a diverse set of retrosynthesis routes generated by state-of-the-art
algorithms and comparing its performance with other evaluation metrics. The
results show that Retro-BLEU is capable of differentiating between plausible
and implausible routes. Furthermore, we provide insights into the strengths and
weaknesses of Retro-BLEU, paving the way for future developments and
improvements in this field.
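The core idea, scoring a candidate route's template sequence by its clipped n-gram overlap with template sequences from known routes, can be sketched as below. The function names, the bigram cutoff, and the brevity-penalty choice are illustrative assumptions; the exact Retro-BLEU formulation may differ.

```python
import math
from collections import Counter

def ngrams(seq, n):
    """All contiguous n-grams of a sequence, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def route_bleu(candidate, reference_routes, max_n=2):
    """BLEU-style plausibility score for a route's template sequence.

    candidate: sequence of reaction-template IDs for one route.
    reference_routes: template-ID sequences from known (plausible) routes.
    """
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        # Clip each n-gram count by its maximum count in any reference route.
        max_ref = Counter()
        for ref in reference_routes:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram])
                      for gram, count in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(max(clipped, 1e-9) / total)
    # Brevity penalty against the reference closest in length.
    ref_len = min((len(r) for r in reference_routes),
                  key=lambda length: abs(length - len(candidate)))
    bp = 1.0 if len(candidate) >= ref_len else math.exp(1 - ref_len / len(candidate))
    # Geometric mean of the clipped n-gram precisions.
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A route whose template bigrams all appear in reference routes scores near 1, while a route built from unseen template transitions scores near 0.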
Explainable Automated Debugging via Large Language Model-driven Scientific Debugging
Automated debugging techniques have the potential to reduce developer effort
in debugging, and have matured enough to be adopted by industry. However, one
critical issue with existing techniques is that, while developers want
rationales for the provided automatic debugging results, existing techniques
are ill-suited to provide them, as their deduction process differs
significantly from that of human developers. Inspired by the way developers
interact with code when debugging, we propose Automated Scientific Debugging
(AutoSD), a technique that, given buggy code and a bug-revealing test, prompts
large language models to automatically generate hypotheses, uses debuggers to
actively interact with buggy code, and thus automatically reaches conclusions
prior to patch generation. By aligning the reasoning of automated debugging
more closely with that of human developers, we aim to produce intelligible
explanations of how a specific patch has been generated, with the hope that the
explanation will lead to more efficient and accurate developer decisions. Our
empirical analysis on three program repair benchmarks shows that AutoSD
performs competitively with other program repair baselines, and that it can
indicate when it is confident in its results. Furthermore, we perform a human
study with 20 participants, including six professional developers, to evaluate
the utility of explanations from AutoSD. Participants with access to
explanations could judge patch correctness in roughly the same time as those
without, but their accuracy improved for five out of six real-world bugs
studied: 70% of participants answered that they wanted explanations when using
repair tools, while 55% answered that they were satisfied with the Scientific
Debugging presentation.
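The hypothesize-experiment-conclude loop described above can be sketched as follows. The `ask_llm` and `run_debugger` callables and the dict shape they exchange are hypothetical stand-ins for illustration, not AutoSD's actual interfaces.

```python
def scientific_debugging_loop(buggy_code, failing_test, ask_llm, run_debugger,
                              max_rounds=5):
    """Alternate between LLM-generated hypotheses and debugger experiments.

    ask_llm: maps a prompt string to a dict with "text" (the hypothesis),
             "command" (a debugger command to test it), and "conclusive".
    run_debugger: executes a debugger command and returns its output.
    """
    history = []  # (hypothesis, observation) pairs: the explanation trail
    for _ in range(max_rounds):
        prompt = (
            f"Code:\n{buggy_code}\nFailing test:\n{failing_test}\n"
            f"Observations so far: {history}\n"
            "State one hypothesis about the bug and a debugger command "
            "to test it."
        )
        hypothesis = ask_llm(prompt)
        observation = run_debugger(hypothesis["command"])
        history.append((hypothesis["text"], observation))
        if hypothesis.get("conclusive"):
            break  # enough evidence gathered; proceed to patch generation
    return history
```

The returned history of hypotheses and observations is exactly the kind of human-readable rationale the abstract argues developers want alongside a generated patch.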
LambdaOpt: Learn to Regularize Recommender Models in Finer Levels
Recommendation models mainly deal with categorical variables, such as
user/item ID and attributes. Besides the high-cardinality issue, the
interactions among such categorical variables are usually long-tailed, with the
head made up of highly frequent values and a long tail of rare ones. This
phenomenon results in the data sparsity issue, making it essential to
regularize the models to ensure generalization. The common practice is to
employ grid search to manually tune regularization hyperparameters based on the
validation data. However, searching the whole candidate space requires
non-trivial effort and large computational resources; even so, it may not find
the optimal choice, since different parameters may call for different
regularization strengths. In this paper, we propose a hyperparameter
optimization method, LambdaOpt, which automatically and adaptively enforces
regularization during training. Specifically, it updates the regularization
coefficients based on performance on the validation data. With LambdaOpt, the
notorious tuning of regularization hyperparameters can be avoided; more
importantly, it allows fine-grained regularization (i.e. each parameter can
have an individualized regularization coefficient), leading to better
generalized models. We show how to employ LambdaOpt on matrix factorization, a
classical model that is representative of a large family of recommender models.
Extensive experiments on two public benchmarks demonstrate the superiority of
our method in boosting the performance of top-K recommendation.
Comment: Accepted by KDD 201
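The idea of per-parameter regularization coefficients adapted on validation data can be illustrated with the toy sketch below. The sign-based lambda update, the clipping range, and all constants are illustrative assumptions, not the LambdaOpt algorithm itself.

```python
import numpy as np

# Toy sketch: ridge-style regression where each weight gets its own L2
# coefficient lam_i, nudged after every training step by a heuristic on
# the validation gradient (raise lam_i if shrinking w_i would help the
# validation loss, else lower it).
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.0, 0.0, 3.0])
X_tr = rng.normal(size=(100, 5)); y_tr = X_tr @ w_true + 0.1 * rng.normal(size=100)
X_va = rng.normal(size=(50, 5));  y_va = X_va @ w_true + 0.1 * rng.normal(size=50)

w = np.zeros(5)
lam = np.full(5, 0.1)            # one regularization coefficient per parameter
eta_w, eta_lam = 0.01, 0.005

for _ in range(500):
    # Gradient step on the regularized training objective.
    grad_w = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr) + 2 * lam * w
    w -= eta_w * grad_w
    # Adapt the coefficients using the validation gradient.
    grad_va = 2 * X_va.T @ (X_va @ w - y_va) / len(y_va)
    lam = np.clip(lam + eta_lam * np.sign(grad_va * w), 0.0, 0.3)

val_mse = float(np.mean((X_va @ w - y_va) ** 2))
```

Grid search over five independent coefficients would need a combinatorial sweep; here each `lam[i]` drifts to its own value within a single training run, which is the fine-grained behavior the abstract describes.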
Leveraging Reaction-aware Substructures for Retrosynthesis Analysis
Retrosynthesis analysis is a critical task in organic chemistry central to
many important industries. Previously, various machine learning approaches have
achieved promising results on this task by representing output molecules as
strings that are decoded autoregressively, token-by-token, with generative
models, frequently adapting text generation or machine translation models from
natural language processing. The token-by-token decoding approach is
not intuitive from a chemistry perspective because some substructures are
relatively stable and remain unchanged during reactions. In this paper, we
propose a substructure-level decoding model, where the substructures are
reaction-aware and can be automatically extracted with a fully data-driven
approach. Our approach achieved improvement over previously reported models,
and we find that the performance can be further boosted if the accuracy of
substructure extraction is improved. The substructures extracted by our
approach can provide users with better insights for decision-making compared to
existing methods. We hope this work will generate interest in this fast-growing
and highly interdisciplinary area of retrosynthesis prediction and other
related topics.
Comment: Work in progres
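As an intuition pump for substructures that "remain unchanged during reactions", one crude approximation is to find long character runs shared between product and reactant strings. This is only an illustration of the concept using a standard-library matcher; it is not the paper's data-driven, reaction-aware extraction method.

```python
from difflib import SequenceMatcher

def shared_substructure_strings(product, reactant, min_len=3):
    """Contiguous character runs shared by product and reactant strings,
    a rough stand-in for substructures preserved across a reaction.

    min_len filters out trivially short matches.
    """
    sm = SequenceMatcher(None, product, reactant, autojunk=False)
    return [product[block.a:block.a + block.size]
            for block in sm.get_matching_blocks()
            if block.size >= min_len]
```

On a toy pair where a benzene ring survives the reaction, the ring string is recovered while the changed leaving group is not.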
LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models
Creating graphic layouts is a fundamental step in graphic designs. In this
work, we present a novel generative model named LayoutDiffusion for automatic
layout generation. As layout is typically represented as a sequence of discrete
tokens, LayoutDiffusion models layout generation as a discrete denoising
diffusion process. It learns to reverse a mild forward process, in which
layouts become increasingly chaotic as the forward steps progress, while
layouts at neighboring steps do not differ too much. Designing such a mild
forward process is, however, very challenging, as a layout has both categorical
attributes and ordinal attributes. To tackle the challenge, we summarize three
critical factors for achieving a mild forward process for the layout, i.e.,
legality, coordinate proximity and type disruption. Based on the factors, we
propose a block-wise transition matrix coupled with a piece-wise linear noise
schedule. Experiments on RICO and PubLayNet datasets show that LayoutDiffusion
outperforms state-of-the-art approaches significantly. Moreover, it enables two
conditional layout generation tasks in a plug-and-play manner without
re-training and achieves better performance than existing methods.
Comment: Accepted by ICCV2023, project page: https://layoutdiffusion.github.i
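A piece-wise linear noise schedule simply interpolates the per-step noise level between a few breakpoints instead of using one global line. The sketch below shows the mechanism; the breakpoint positions and values are made-up examples, not the schedule reported for LayoutDiffusion.

```python
def piecewise_linear_schedule(t, T,
                              breakpoints=((0.0, 0.0), (0.5, 0.1), (1.0, 0.5))):
    """Noise level at step t of T, interpolated piece-wise linearly.

    breakpoints: (fraction_of_steps, noise_level) pairs in increasing order.
    A gentle slope early and a steeper one late keeps neighboring steps
    similar at the start of the forward process.
    """
    x = t / T
    for (x0, y0), (x1, y1) in zip(breakpoints, breakpoints[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return breakpoints[-1][1]
```

With these example breakpoints, the noise level rises slowly over the first half of the steps and faster over the second half, one way to realize the "mild forward process" the abstract describes.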
FANDA: A Novel Approach to Perform Follow-up Query Analysis
Recent work on Natural Language Interfaces to Databases (NLIDB) has attracted
considerable attention. NLIDBs allow users to search databases using natural
language instead of SQL-like query languages, saving users from having to learn
them. However, multi-turn interaction with an NLIDB usually involves multiple
queries, where contextual information is vital to understanding the users'
query intents. In this paper, we address a typical contextual
understanding problem, termed follow-up query analysis. In spite of its
ubiquity, follow-up query analysis has not been well studied due to two primary
obstacles: the multifarious nature of follow-up query scenarios and the lack of
high-quality datasets. Our work summarizes typical follow-up query scenarios
and provides a new FollowUp dataset with query triples on 120 tables.
Moreover, we propose a novel approach FANDA, which takes into account the
structures of queries and employs a ranking model with weakly supervised
max-margin learning. The experimental results on FollowUp demonstrate the
superiority of FANDA over multiple baselines across multiple metrics.
Comment: Accepted by AAAI 201
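A max-margin ranking objective of the kind the abstract mentions scores the correct interpretation of a follow-up query above each competing candidate by at least a fixed margin. The function below is a generic sketch of that objective, not FANDA's exact weakly supervised loss.

```python
def max_margin_ranking_loss(score_pos, score_negs, margin=1.0):
    """Hinge-style ranking loss: zero when the positive candidate
    outscores every negative by at least `margin`, otherwise the sum
    of the margin violations.

    score_pos: model score of the correct query interpretation.
    score_negs: scores of the competing (negative) interpretations.
    """
    return sum(max(0.0, margin - score_pos + s) for s in score_negs)
```

During training, minimizing this loss pushes the ranking model to separate the correct fused query from the other candidates generated for a follow-up scenario.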