Does He Wink or Does He Nod? A Challenging Benchmark for Evaluating Word Understanding of Language Models

Author: Merlo Paola
Schütze Hinrich
Senel Lutfi Kerem
Tiedemann Jörg
Tsarfaty Reut
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/04/2021

Recent progress in pretraining language models on large corpora has resulted in significant performance gains on many NLP tasks. These large models acquire linguistic knowledge during pretraining, which helps to improve performance on downstream tasks via fine-tuning. To assess what kind of knowledge is acquired, language models are commonly probed by querying them with ‘fill in the blank’ style cloze questions. Existing probing datasets mainly focus on knowledge about relations between words and entities. We introduce WDLMPro (Word Definitions Language Model Probing) to evaluate word understanding directly using dictionary definitions of words. In our experiments, three popular pretrained language models struggle to match words and their definitions. This indicates that they understand many words poorly and that our new probing task is a difficult challenge that could help guide research on LMs in the future.

Probabilistic Generative Transformer Language models for Generative Design of Molecules

Author: Fu Nihang
Hu Jianjun
Song Yuqi
Wang Qian
Wei Lai
Publication date: 19/09/2022

Self-supervised neural language models have recently found wide applications in generative design of organic molecules and protein sequences as well as representation learning for downstream structure classification and functional prediction. However, most existing deep learning models for molecule design require a big dataset and have a black-box architecture, which makes it difficult to interpret their design logic. Here we propose Generative Molecular Transformer (GMTransformer), a probabilistic neural network model for generative design of molecules. Our model is built on the blank filling language model originally developed for text processing, which has demonstrated unique advantages in learning the "molecules grammars" with high-quality generation, interpretability, and data efficiency. Benchmarked on the MOSES datasets, our models achieve high novelty and Scaf compared to other baselines. The probabilistic generation steps show potential for tinkering with molecule designs, because they can recommend how to modify existing molecules, with explanations, guided by the learned implicit molecule chemistry. The source code and datasets can be accessed freely at https://github.com/usccolumbia/GMTransformer

Comment: 13 page
