Overcoming Barriers to Skill Injection in Language Modeling: Case Study in Arithmetic
Through their transfer learning abilities, highly-parameterized large
pre-trained language models have dominated the NLP landscape for a multitude of
downstream language tasks. Though linguistically proficient, the inability of
these models to incorporate the learning of non-linguistic entities (numerals
and arithmetic reasoning) limits their usage for tasks that require numeric
comprehension or strict mathematical reasoning. However, as we illustrate in
this paper, building a general purpose language model that also happens to be
proficient in mathematical reasoning is not as straightforward as training it
on a numeric dataset. In this work, we develop a novel framework that enables
language models to be mathematically proficient while retaining their
linguistic prowess. Specifically, we offer information-theoretic interventions
to overcome the catastrophic forgetting of linguistic skills that occurs while
injecting non-linguistic skills into language models.
Comment: NeurIPS 2022: Math-AI Worksho
Steering a Historical Disease Forecasting Model Under a Pandemic: Case of Flu and COVID-19
Forecasting influenza in a timely manner aids health organizations and
policymakers in adequate preparation and decision making. However, effective
influenza forecasting still remains a challenge despite increasing research
interest. It is even more challenging amidst the COVID pandemic, when the
influenza-like illness (ILI) counts are affected by various factors such as
symptomatic similarities with COVID-19 and shifts in healthcare-seeking patterns
of the general population. Under the current pandemic, historical influenza
models carry valuable expertise about the disease dynamics but face
difficulties adapting. Therefore, we propose CALI-Net, a neural transfer
learning architecture which allows us to 'steer' a historical disease
forecasting model to new scenarios where flu and COVID co-exist. Our framework
enables this adaptation by automatically learning when it should emphasize
learning from COVID-related signals and when it should learn from the
historical model. Thus, we exploit representations learned from historical ILI
data as well as the limited COVID-related signals. Our experiments demonstrate
that our approach is successful in adapting a historical forecasting model to
the current pandemic. In addition, we show that success in our primary goal,
adaptation, does not sacrifice overall performance as compared with
state-of-the-art influenza forecasting approaches.
Comment: Appears in AAAI-2
Large Multi-Modal Models (LMMs) as Universal Foundation Models for AI-Native Wireless Systems
Large language models (LLMs) and foundation models have been recently touted
as a game-changer for 6G systems. However, recent efforts on LLMs for wireless
networks are limited to a direct application of existing language models that
were designed for natural language processing (NLP) applications. To address
this challenge and create wireless-centric foundation models, this paper
presents a comprehensive vision on how to design universal foundation models
that are tailored towards the deployment of artificial intelligence (AI)-native
networks. Diverging from NLP-based foundation models, the proposed framework
promotes the design of large multi-modal models (LMMs) fostered by three key
capabilities: 1) processing of multi-modal sensing data, 2) grounding of
physical symbol representations in real-world wireless systems using causal
reasoning and retrieval-augmented generation (RAG), and 3) enabling
instructibility from the wireless environment feedback to facilitate dynamic
network adaptation thanks to logical and mathematical reasoning facilitated by
neuro-symbolic AI. In essence, these properties enable the proposed LMM
framework to build universal capabilities that cater to various cross-layer
networking tasks and alignment of intents across different domains. Preliminary
results from experimental evaluation demonstrate the efficacy of grounding
using RAG in LMMs, and showcase the alignment of LMMs with wireless system
designs. Furthermore, the enhanced rationale exhibited in the responses to
mathematical questions by LMMs, compared to vanilla LLMs, demonstrates the
logical and mathematical reasoning capabilities inherent in LMMs. Building on
those results, we present a series of open questions and challenges for LMMs.
We then conclude with a set of recommendations that chart the path towards
LMM-empowered AI-native systems.
Post salpingectomy intraluminal endometriosis in a premenopausal lady - an incidental finding often paid less attention to
Endometriosis of the fallopian tube is often incidentally picked up in hysterectomy specimens that are sent for histopathological examination for other obvious pathological conditions. Post-salpingectomy endometriosis is one such entity that is known to occur in the tip of the proximal stump of the fallopian tube years after tubal ligation. As mere visualization of the endometriotic lesions is inadequate for an accurate diagnosis, histopathologic analysis of the biopsy samples becomes mandatory for confirmation. We report a case of post-salpingectomy endometriosis which was incidentally discovered in a perimenopausal lady who was operated on for multiple fibroids of the uterus. This case highlights an entity that is not only challenging to visualize radiologically and suspect clinically, but also underrecognized, as very little attention is given to the fallopian tube during routine grossing.
Evaluation of FluSight influenza forecasting in the 2021–22 and 2022–23 seasons with a new target laboratory-confirmed influenza hospitalizations
Accurate forecasts can enable more effective public health responses during seasonal influenza epidemics. For the 2021–22 and 2022–23 influenza seasons, 26 forecasting teams provided national and jurisdiction-specific probabilistic predictions of weekly confirmed influenza hospital admissions for one to four weeks ahead. Forecast skill is evaluated using the Weighted Interval Score (WIS), relative WIS, and coverage. Six out of 23 models outperform the baseline model across forecast weeks and locations in 2021–22 and 12 out of 18 models in 2022–23. Averaging across all forecast targets, the FluSight ensemble is the 2nd most accurate model measured by WIS in 2021–22 and the 5th most accurate in the 2022–23 season. Forecast skill and 95% coverage for the FluSight ensemble and most component models degrade over longer forecast horizons. In this work, we demonstrate that while the FluSight ensemble was a robust predictor, even ensembles face challenges during periods of rapid change.
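The abstract above names the Weighted Interval Score (WIS) and interval coverage without defining them. The following is a minimal sketch of how such scores are commonly computed for a quantile forecast, assuming the standard definition from the forecast-evaluation literature (median weight 1/2, interval weight alpha/2, normalized by K + 1/2). The function names, the dictionary layout for the intervals, and the example numbers are hypothetical and are not taken from the paper.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval for outcome y."""
    width = upper - lower                            # sharpness: wide intervals cost more
    under = (2.0 / alpha) * max(lower - y, 0.0)      # penalty if y falls below the interval
    over = (2.0 / alpha) * max(y - upper, 0.0)       # penalty if y falls above the interval
    return width + under + over


def weighted_interval_score(median, intervals, y):
    """WIS for one forecast (assumed standard weights, not confirmed by the source).

    intervals maps alpha -> (lower, upper) for each central (1 - alpha) interval.
    """
    total = 0.5 * abs(y - median)                    # median term with weight 1/2
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)            # normalize by K + 1/2


def coverage(bounds, observations):
    """Fraction of observations that fall inside their (lower, upper) interval."""
    hits = sum(1 for (lo, hi), y in zip(bounds, observations) if lo <= y <= hi)
    return hits / len(observations)


if __name__ == "__main__":
    # Hypothetical forecast: median 10 admissions, 50% interval (8, 12).
    print(weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, 14.0))
    print(coverage([(8.0, 12.0), (5.0, 9.0)], [10.0, 11.0]))
```

Lower WIS is better. A relative WIS would divide a model's score by that of a baseline model on the same targets, so values below 1 indicate the model beats the baseline.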
Learning Non-linguistic Skills without Sacrificing Linguistic Proficiency
The field of Math-NLP has witnessed significant growth in recent years,
motivated by the desire to expand LLM performance to the learning of
non-linguistic notions (numerals, and subsequently, arithmetic reasoning).
However, non-linguistic skill injection typically comes at a cost for LLMs: it
leads to catastrophic forgetting of core linguistic skills, a consequence that
often remains unaddressed in the literature. Although Math-NLP has been able to
create LLMs that can closely approximate the mathematical skills of a
grade-schooler or the arithmetic reasoning skills of a calculator, the
practicality of these models fails if they concomitantly shed their linguistic
capabilities. In this work, we take a closer look into the phenomenon of
catastrophic forgetting as it pertains to LLMs and subsequently offer a novel
framework for non-linguistic skill injection for LLMs based on information
theoretic interventions and skill-specific losses that enable the learning of
strict arithmetic reasoning. Our model outperforms the state-of-the-art both on
injected non-linguistic skills and on linguistic knowledge retention, and does
so with a fraction of the non-linguistic training data (1/4) and zero
additional synthetic linguistic training data.
Comment: Accepted to ACL 2023's main conferenc
Expression of apoptosis regulating proteins p53 and bcl-2 in psoriasis
Background: Dysfunctional apoptosis has an important role in the development of several skin diseases. Psoriatic keratinocytes possess an enhanced ability to resist apoptosis, which might be one of the key pathogenetic mechanisms in psoriasis. P53 and bcl-2 are two proteins which control apoptosis. Several studies have evaluated the expression of these two proteins in the psoriatic skin, but the results are controversial. Methods: Fifty-eight cases of psoriatic skin biopsies were studied, and the grade of p53 and bcl-2 immunostaining was correlated with the histopathological indices of severity. Results: Bcl-2 expression in the epidermis strongly correlated with the expression in the basal cells and lymphocytes (P = 0.001 and 0.035). There was no correlation with epidermal hyperplasia or with p53 expression in the three compartments. Bcl-2 expression in the basal layer correlated with the p53 expression in the epidermis (P = 0.027), basal layer (P = 0.015) and the lymphocytes (P = 0.034). There was a strong correlation among the p53 expression in all the compartments. There was also a weak correlation of the p53 expression in the epidermis with the epidermal hyperplasia (P = 0.042). Conclusions: Bcl-2 does not appear to play an important role in the apoptotic process in psoriasis. In contrast, it is likely that p53 has a far more important role to play. Mutation analysis of the p53 protein is necessary to evaluate if the protein has mutated or if it is of the wild type.