Pseudointelligence: A Unifying Framework for Language Model Evaluation
With large language models surpassing human performance on an increasing
number of benchmarks, we must take a principled approach for targeted
evaluation of model capabilities. Inspired by pseudorandomness, we propose
pseudointelligence, which captures the maxim that "(perceived) intelligence
lies in the eye of the beholder". That is, claims of intelligence are
meaningful only when their evaluator is taken into account. Concretely, we
propose a complexity-theoretic framework of model evaluation cast as a dynamic
interaction between a model and a learned evaluator. We demonstrate that this
framework can be used to reason about two case studies in language model
evaluation, as well as to analyze existing evaluation methods.
Comment: EMNLP 2023 Findings
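The pseudorandomness analogy can be made concrete as a distinguisher loop: an evaluator sees one response per round, drawn either from the model or from a ground-truth reference, and guesses its origin. The `evaluate` function and its interfaces below are a hypothetical sketch of this setup, not the paper's formal complexity-theoretic construction:

```python
import random

def evaluate(model, reference, evaluator, prompts, rounds=100):
    """Distinguisher-style evaluation loop. Each round, the evaluator
    sees a response from either `model` or `reference` (a fair coin
    decides which) and guesses True if it believes the response came
    from the model. Returns the evaluator's advantage over chance."""
    correct = 0
    for _ in range(rounds):
        prompt = random.choice(prompts)
        use_model = random.random() < 0.5
        response = model(prompt) if use_model else reference(prompt)
        guess = evaluator(prompt, response)  # True means "model"
        correct += (guess == use_model)
    return correct / rounds - 0.5
```

An advantage near zero means the model is, to this particular evaluator, indistinguishable from the reference source — the sense in which "(perceived) intelligence lies in the eye of the beholder."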
Grokking of Hierarchical Structure in Vanilla Transformers
For humans, language production and comprehension are sensitive to the
hierarchical structure of sentences. In natural language processing, past work
has questioned how effectively neural sequence models like transformers capture
this hierarchical structure when generalizing to structurally novel inputs. We
show that transformer language models can learn to generalize hierarchically
after training for extremely long periods -- far beyond the point when
in-domain accuracy has saturated. We call this phenomenon \emph{structural
grokking}. On multiple datasets, structural grokking exhibits inverted U-shaped
scaling in model depth: intermediate-depth models generalize better than both
very deep and very shallow transformers. When analyzing the relationship
between model-internal properties and grokking, we find that optimal depth for
grokking can be identified using the tree-structuredness metric of
\citet{murty2023projections}. Overall, our work provides strong evidence that,
with extended training, vanilla transformers discover and use hierarchical
structure.
Comment: ACL 2023
Characterizing Intrinsic Compositionality in Transformers with Tree Projections
When trained on language data, do transformers learn some arbitrary
computation that utilizes the full capacity of the architecture or do they
learn a simpler, tree-like computation, hypothesized to underlie compositional
meaning systems like human languages? There is an apparent tension between
compositional accounts of human language understanding, which are based on a
restricted bottom-up computational process, and the enormous success of neural
models like transformers, which can route information arbitrarily between
different parts of their input. One possibility is that these models, while
extremely flexible in principle, in practice learn to interpret language
hierarchically, ultimately building sentence representations close to those
predictable by a bottom-up, tree-structured model. To evaluate this
possibility, we describe an unsupervised and parameter-free method to
\emph{functionally project} the behavior of any transformer into the space of
tree-structured networks. Given an input sentence, we produce a binary tree
that approximates the transformer's representation-building process and a score
that captures how "tree-like" the transformer's behavior is on the input. While
calculation of this score does not require training any additional models, it
provably upper-bounds the fit between a transformer and any tree-structured
approximation. Using this method, we show that transformers for three different
tasks become more tree-like over the course of training, in some cases
unsupervisedly recovering the same trees as supervised parsers. These trees, in
turn, are predictive of model behavior, with more tree-like models generalizing
better on tests of compositional generalization.
Comment: Fixed title and metadata
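As a rough illustration of searching the space of tree-structured approximations, a CKY-style dynamic program can pick the binary bracketing that minimizes a total span-level cost. The `score(i, j)` interface below is a hypothetical stand-in for a measure of how much the transformer's representation of span [i, j) depends on outside context; the paper's actual estimator and its upper-bound guarantee differ in detail:

```python
def best_tree(score, i, j, memo=None):
    """Find the binary bracketing of tokens i..j-1 minimizing the sum
    of `score` over all internal-node spans. Returns (cost, tree),
    where a tree is either a token index (leaf) or a (left, right)
    pair. Memoized so the search runs in O(n^3) score evaluations."""
    if memo is None:
        memo = {}
    if j - i == 1:
        return 0.0, i          # single token: a leaf, no span cost
    if (i, j) in memo:
        return memo[(i, j)]
    best = None
    for k in range(i + 1, j):  # try every split point
        left_cost, left_tree = best_tree(score, i, k, memo)
        right_cost, right_tree = best_tree(score, k, j, memo)
        cost = left_cost + right_cost + score(i, j)
        if best is None or cost < best[0]:
            best = (cost, (left_tree, right_tree))
    memo[(i, j)] = best
    return best
```

With a context-dependence score in hand, the minimized total cost plays the role of a "tree-likeness" number: the lower it is, the closer the model's representation-building is to a bottom-up tree computation.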
Pushdown Layers: Encoding Recursive Structure in Transformer Language Models
Recursion is a prominent feature of human language, and fundamentally
challenging for self-attention due to the lack of an explicit recursive-state
tracking mechanism. Consequently, Transformer language models poorly capture
long-tail recursive structure and exhibit sample-inefficient syntactic
generalization. This work introduces Pushdown Layers, a new self-attention
layer that models recursive state via a stack tape that tracks estimated depths
of every token in an incremental parse of the observed prefix. Transformer LMs
with Pushdown Layers are syntactic language models that autoregressively and
synchronously update this stack tape as they predict new tokens, in turn using
the stack tape to softly modulate attention over tokens -- for instance,
learning to "skip" over closed constituents. When trained on a corpus of
strings annotated with silver constituency parses, Transformers equipped with
Pushdown Layers achieve dramatically better and 3-5x more sample-efficient
syntactic generalization, while maintaining similar perplexities. Pushdown
Layers are a drop-in replacement for standard self-attention. We illustrate
this by finetuning GPT2-medium with Pushdown Layers on an automatically parsed
WikiText-103, leading to improvements on several GLUE text classification
tasks.
Comment: Accepted at EMNLP 2023 (Long Papers)
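The core idea — using per-token depth estimates from a stack tape to softly modulate attention — can be sketched in a few lines. This is a simplified illustration assuming an additive logit penalty for tokens buried deeper than the current query; the paper's exact parameterization differs:

```python
import math

def pushdown_attention(scores, depths, query_depth, alpha=1.0):
    """Bias raw attention scores by stack-tape depths, then softmax.
    Tokens whose estimated depth exceeds the query's depth get their
    logits reduced by alpha per extra level, so attention learns to
    'skip' over closed constituents."""
    biased = [s - alpha * max(0, d - query_depth)
              for s, d in zip(scores, depths)]
    m = max(biased)                          # stabilize the softmax
    exps = [math.exp(b - m) for b in biased]
    z = sum(exps)
    return [e / z for e in exps]
```

In the full model these depths would come from an incremental parse that is updated synchronously with token prediction; here they are simply given as input.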
Deciphering the antagonistic effect of Streptomyces spp. and host-plant resistance induction against charcoal rot of sorghum
Two strains each of Streptomyces albus (CAI-17 and KAI-27) and Streptomyces griseus (KAI-26 and MMA-32)
and one strain of Streptomyces cavourensis (SAI-13) previously reported to have plant growth-promotion activity in chickpea,
rice and sorghum were evaluated for their antagonistic potential against Macrophomina phaseolina, which causes charcoal
rot in sorghum. The antagonistic potential of these strains against M. phaseolina was assessed through dual culture assay,
metabolite production assay, blotter paper assay in greenhouse and field disease screens. In both dual culture and metabolite
production assays, the selected strains significantly inhibited the growth of M. phaseolina (63–74%). In the blotter paper
assay, all five strains of Streptomyces spp. inhibited the pathogen (80–90%). When these five strains were tested for their
antagonistic potential under greenhouse (two trials) and field (two seasons) conditions by the toothpick method of inoculation,
significant differences were observed for charcoal rot severity. Principal component analysis, capturing 91.3% of the phenotypic
variation, revealed that the shoot samples treated with both Streptomyces and the pathogen exhibited significantly enhanced
antioxidant parameters including superoxide dismutase, catalase, ascorbate peroxidase, guaiacol peroxidase, glutathione
reductase, phenylalanine ammonia-lyase, polyphenol oxidase, and total phenolic contents when compared to shoot samples
treated with only M. phaseolina. Scanning electron microscope analysis revealed that the phloem and xylem tissues of the
Streptomyces-treated stem samples were intact compared to those of pathogen-inoculated plants. This study indicated that the
selected strains of Streptomyces spp. have the potential for biological control of charcoal rot disease in sorghum.
Identification and Characterization of a Streptomyces albus Strain and Its Secondary Metabolite Organophosphate against Charcoal Rot of Sorghum
Streptomyces albus strain CAI-21 has been previously reported to have plant growth-promotion
abilities in chickpea, pigeonpea, rice, and sorghum. The strain CAI-21 and its secondary metabolite
were evaluated for their biocontrol potential against charcoal rot disease in sorghum caused
by Macrophomina phaseolina. Results exhibited that CAI-21 significantly inhibited the growth of
the pathogen, M. phaseolina, in dual-culture (15 mm; zone of inhibition), metabolite production
(74% inhibition), and blotter paper (90% inhibition) assays. When CAI-21 was tested for its biocontrol
potential under greenhouse and field conditions following inoculation of M. phaseolina by toothpick
method, it significantly reduced the number of internodes infected (75% and 45% less, respectively)
and length of infection (75% and 51% less, respectively) over the positive control (only M. phaseolina
inoculated) plants. Under greenhouse conditions, scanning electron microscopic analysis showed
that the phloem and xylem tissues of the CAI-21-treated shoot samples were intact compared to
those of the diseased stem samples. The culture filtrate of the CAI-21 was purified by various
chromatographic techniques, and the active compound was identified as “organophosphate” by
NMR and MS. In the poisoned food technique, organophosphate was found to inhibit the growth of
M. phaseolina. This study indicates that S. albus CAI-21 and its active metabolite
organophosphate have the potential to control charcoal rot in sorghum.
Discovering the Language of Actions
This thesis takes a look at discovering language-like discrete infinities for actions. How can a stream of continuous data be parsed into skills or concepts, and can we tie the decision of what the right set of skills may be to the problem of generating plans over a continuous action space, as in the original stream of data? Can we use supervision from aligned parallel language instructions to scaffold the discovery of these named action primitives from interaction?

We present a framework for learning hierarchical policies from demonstrations, using sparse natural language annotations to guide the discovery of reusable skills for autonomous decision-making. It is formulated as a generative model of action sequences in which goals generate sequences of high-level subtask descriptions, and these descriptions generate sequences of low-level actions. The thesis describes how to train this model from primarily unannotated demonstrations by parsing demonstrations into sequences of named high-level subtasks, using only a small number of seed annotations to ground language in action. In trained models, the space of natural language commands indexes a combinatorial library of skills; agents can use these skills to plan by generating high-level instruction sequences tailored to novel goals.

The approach is evaluated in the ALFRED household simulation environment, providing natural language annotations for only 10% of demonstrations. It completes more than twice as many tasks as a standard approach to learning from demonstrations, matching the performance of instruction-following models with access to ground-truth plans during both training and evaluation.
Code, data, and additional visualizations are available at https://sites.google.com/view/skill-induction-latent-lang/.
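The two-level generative story (goals generate subtask descriptions; descriptions generate actions) can be sketched as a simple sampler. The `subtask_policy` and `action_policy` interfaces below are hypothetical stand-ins for the learned distributions, not the thesis's actual model:

```python
def sample_demo(goal, subtask_policy, action_policy, max_subtasks=5):
    """Sample a demonstration from a two-level hierarchical policy:
    repeatedly ask the high level for the next named subtask given the
    goal and the subtasks so far (None signals end-of-plan), then ask
    the low level to expand each subtask into actions."""
    subtasks, actions = [], []
    for _ in range(max_subtasks):
        s = subtask_policy(goal, subtasks)
        if s is None:                # end-of-plan token
            break
        subtasks.append(s)
        actions.extend(action_policy(s))
    return subtasks, actions
```

Planning for a novel goal then amounts to generating a fresh high-level instruction sequence and executing each named skill in turn.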
Pneumatosis intestinalis: Cost paid for rheumatoid arthritis treatment
A 75-year-old woman with rheumatoid arthritis on rituximab presented with a 1-week history of constipation and abdominal distension. Initial workup showed air in the bowel wall without perforation. Because of positive blood cultures, worsening leucocytosis and a high suspicion for perforation, an exploratory laparotomy was performed, revealing necrotic bowel, a walled-off perforation and an abscess. The patient underwent right hemicolectomy with a diversion loop ileostomy. Clinicians must recognise that monoclonal antibodies such as rituximab can mask signs of inflammation and should therefore maintain a high index of suspicion for intestinal perforation when evaluating patients with minimal symptoms and pneumatosis intestinalis.
Short-term effects of renal transplantation on coronary artery calcification: A prospective study
Cardiovascular disease is a leading cause of mortality in renal transplant recipients. Coronary artery calcification (CAC) has been found to correlate well with atherosclerosis and cardiovascular morbidity. The objective of our study was to assess the prevalence of CAC and the short-term effects of renal transplantation on CAC and carotid intima-media thickness (CIMT) in Indian renal transplant recipients. Twenty-eight renal transplant recipients were included in this prospective study. Dual-source computed tomography with calcium scoring by Agatston's method and CIMT measurement were performed at the time of transplant and repeated at six and 12 months after transplantation. The prevalence of CAC in our study patients was low (32%), probably because they were young, had been on dialysis for a short duration and had undergone live-related renal transplants. An overall improvement in biochemical parameters was observed after transplantation. Patients with a zero baseline calcium score did not show progression. Patients with a baseline calcium score greater than zero showed initial progression at six months and no further progression afterwards. There was good correlation between CIMT and CAC score. Our study suggests that renal transplantation does not reverse calcification but appears to decrease its rate of progression over the follow-up period.