Learning to Diversify Neural Text Generation via Degenerative Model
Neural language models often fail to generate diverse and informative texts,
limiting their applicability in real-world problems. While previous approaches
have proposed to address these issues by identifying and penalizing undesirable
behaviors (e.g., repetition, overuse of frequent words) from language models,
we propose an alternative approach based on an observation: models primarily
learn attributes within examples that are likely to cause degeneration
problems. Based on this observation, we propose a new approach to prevent
degeneration problems by training two models. Specifically, we first train a
model that is designed to amplify undesirable patterns. We then enhance the
diversity of the second model by focusing on patterns that the first model
fails to learn. Extensive experiments on two tasks, namely language modeling
and dialogue generation, demonstrate the effectiveness of our approach.
Comment: IJCNLP-AACL 2023 Findings, 10 pages
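The abstract does not spell out how the second model "focuses on patterns that the first model fails to learn". One plausible reading, sketched below, is to down-weight the training loss on tokens the first model already predicts confidently. The weighting `1 - p_first`, the function name `reweighted_nll`, and the probabilities are all assumptions made for illustration, not the authors' objective.

```python
import math


def reweighted_nll(p_second, p_first):
    """Weighted mean negative log-likelihood for the second model.

    p_second[i]: probability the second model gives the i-th target token.
    p_first[i]:  probability the first (degenerative) model gives that token.
    The weight (1 - p_first[i]) is an assumed choice, not the paper's loss.
    """
    weights = [1.0 - pf for pf in p_first]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("first model is fully confident on every token")
    losses = [-math.log(ps) for ps in p_second]
    return sum(w * l for w, l in zip(weights, losses)) / total


# Hypothetical probabilities for two target tokens.
p_first = [0.9, 0.1]   # first model finds token 0 easy and token 1 hard
p_second = [0.9, 0.2]
plain = sum(-math.log(p) for p in p_second) / len(p_second)
weighted = reweighted_nll(p_second, p_first)
```

Under this weighting, the token the first model struggles with dominates the loss, so `weighted` comes out larger than the unweighted mean `plain` in this toy case.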
Evolutionary and Functional Diversity of Regulatory Factors and Sequences that Coordinate Gene Expression
Bacteria regulate gene expression through coordinated interactions between cis-regulatory sequences and trans-regulatory factors. Understanding the molecular basis for the functions of these regulatory components is not only essential for deciphering complex biological processes in diverse bacteria but also critical for rational engineering of microbial phenotypes. However, systematically dissecting the sequence-function relationships of the cis and trans regulatory components that underlie gene expression is still a key challenge. Recent technological advances have provided novel tools and methods for mapping sequence-function relationships at high throughput. This dissertation focuses on applying novel methods, enabled by the increased throughput and scalability of DNA synthesis and sequencing, to elucidate the sequence-function relationships of cis and trans components that underlie bacterial gene regulation.
In Chapter 2, the evolutionary and functional diversity of primary σ70, a universally conserved global regulator in bacteria, is studied through comparative genomics, saturation mutagenesis, and transcriptomics. By combining these approaches, we demonstrate that the sequence diversity of σ70 factors reflects functional differences that have been shaped by constraints from co-evolving regulatory sequence targets. Chapter 3 discusses systematically mapping the transcriptional activities of cis-regulatory sequences from Biosynthetic Gene Clusters (BGCs). Using Streptomyces as a host, we found key regulatory features that affected gene expression, such as GC content, transcription start sites, and sequence motifs. We further explored the regulation of BGC-derived regulatory sequences by expressing global regulatory factors and screening for regulatory sequences with altered expression levels. Finally, Chapter 4 highlights recent studies that made key contributions towards elucidating and modulating bacterial gene regulatory networks and reviews the current state of microbial systems biology and gene regulation. Together, the results and discussions presented in this dissertation seek to advance the current knowledge of sequence-function relationships of microbial regulatory components to enable better understanding, modeling, and rational engineering of bacterial gene regulation.
F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax
Despite recent advances in neural text generation, encoding the rich
diversity in human language remains elusive. We argue that the sub-optimal text
generation is mainly attributable to the imbalanced token distribution, which
particularly misdirects the learning model when trained with the
maximum-likelihood objective. As a simple yet effective remedy, we propose two
novel methods, F^2-Softmax and MefMax, for a balanced training even with the
skewed frequency distribution. MefMax assigns tokens uniquely to frequency
classes, trying to group tokens with similar frequencies and equalize frequency
mass between the classes. F^2-Softmax then decomposes a probability
distribution of the target token into a product of two conditional
probabilities of (i) frequency class, and (ii) token from the target frequency
class. Models learn more uniform probability distributions because they are
confined to subsets of vocabularies. Significant performance gains on seven
relevant metrics suggest the supremacy of our approach in improving not only
the diversity but also the quality of generated texts.
Comment: EMNLP 202
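The factorization described above can be illustrated with a small sketch: tokens are grouped into frequency classes of roughly equal mass, and the probability of a token is the product of the class probability and the within-class token probability. The greedy grouping in `assign_frequency_classes` is a simple stand-in and not the authors' MefMax procedure; all function names, counts, and scores are hypothetical.

```python
import math


def assign_frequency_classes(counts, num_classes):
    """Greedy stand-in for MefMax: walk tokens from most to least frequent and
    open a new class once the current one holds about 1/num_classes of the mass."""
    target = sum(counts.values()) / num_classes
    classes, mass, current = {}, 0.0, 0
    for token, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        if mass >= target and current < num_classes - 1:
            current += 1
            mass = 0.0
        classes[token] = current
        mass += count
    return classes


def softmax(scores):
    """Numerically stable softmax over a dict of scores."""
    top = max(scores.values())
    exps = {k: math.exp(v - top) for k, v in scores.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}


def factorized_prob(token, class_scores, token_scores, classes):
    """p(token) = p(class of token) * p(token | class)."""
    c = classes[token]
    p_class = softmax(class_scores)[c]
    # The second softmax is confined to tokens in the same frequency class.
    in_class = {t: s for t, s in token_scores.items() if classes[t] == c}
    return p_class * softmax(in_class)[token]


# Hypothetical corpus counts and model scores.
counts = {"the": 50, "a": 30, "cat": 10, "dog": 6, "zebra": 4}
classes = assign_frequency_classes(counts, num_classes=2)
class_scores = {0: 0.2, 1: -0.1}
token_scores = {"the": 1.0, "a": 0.5, "cat": 0.1, "dog": -0.3, "zebra": -1.0}
total = sum(factorized_prob(t, class_scores, token_scores, classes) for t in counts)
```

Because each within-class softmax sums to one and the class softmax sums to one, the factorized probabilities over the whole vocabulary still sum to one, as long as every class contains at least one token.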
Clustering as a precursor to efficient and near-optimal solution of small instances of the Traveling Salesperson Problem (TSP)
Humans efficiently find near-optimal (i.e., near-minimum-length) tours when solving small instances of the Traveling Salesperson Problem (TSP), a problem hard for computers. We hypothesize that this is possible because they use the following strategy: cluster the points, solve the smaller TSPs for each cluster, and then solve the TSP defined by the clusters. This study focused on the antecedent process of human clustering. 42 participants clustered 56 sets of 15 to 40 points on two occasions. We found that human clustering is generally reliable (M Fowlkes-Mallows Index = 0.75) for all problem sizes. Reliability was higher for problems that showed statistical evidence of cluster structure versus no such structure, and was not affected when the problem was flipped for the second presentation. Thus, humans are sensitive to cluster structure, and clustering is a stable foundation for solving TSP instances. This sets the stage for future research on clustering-based TSP strategies.
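The hypothesized cluster-then-solve strategy can be sketched in a few lines. This is only an illustration under simplifying assumptions: the clusters are given as input, the order of clusters is chosen by a brute-force tour over cluster centroids, and each cluster is traversed by its shortest open path without regard to entry and exit points. None of the function names or sample points come from the study.

```python
import itertools
import math


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def tour_length(tour):
    """Length of the closed tour that returns to its starting point."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))


def path_length(path):
    return sum(dist(path[i], path[i + 1]) for i in range(len(path) - 1))


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def best_path(points):
    """Brute-force shortest open path; only feasible for a handful of points."""
    return list(min(itertools.permutations(points), key=path_length))


def cluster_then_solve(clusters):
    """Order clusters by a brute-force tour of their centroids, then visit the
    points of each cluster along that cluster's shortest open path."""
    centroids = [centroid(c) for c in clusters]
    ids = list(range(len(clusters)))
    order = min(
        ([0, *p] for p in itertools.permutations(ids[1:])),
        key=lambda o: tour_length([centroids[i] for i in o]),
    )
    tour = []
    for i in order:
        tour.extend(best_path(clusters[i]))
    return tour


# Hypothetical instance: three well-separated clusters of three points each.
clusters = [
    [(0, 0), (1, 0), (0, 1)],
    [(10, 0), (11, 0), (10, 1)],
    [(5, 9), (6, 9), (5, 10)],
]
tour = cluster_then_solve(clusters)
```

The result visits every point exactly once. Its length is not guaranteed to be optimal, since the within-cluster paths are chosen independently of how neighbouring clusters connect.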
The Impact of Educational ICT on Teacher Training in Korea
Since the end of the 1990s, Korea has been striving to improve teaching quality by promoting the use of educational ICT (EICT), seen as offering teachers the possibility of improving their mastery of information. As technology has progressed, teacher training has been transformed in order to adapt the new technologies to classroom situations and to develop teachers' skills. The "EICT Plan", in place since 2002, is aimed at more than 30% of teachers, as well as heads of institutions and other education personnel. The new EICT training programmes take into account the full length of teachers' careers and the new technologies.