How did the discussion go: Discourse act classification in social media conversations
We propose a novel attention-based hierarchical LSTM model to classify
discourse act sequences in social media conversations, aimed at mining online
discussions using textual meaning beyond the sentence level. The task is
unique in its complete categorization of the possible pragmatic roles in
informal textual discussions, in contrast to role-specific tasks such as
question-answer extraction, stance detection, or sarcasm identification. An
early attempt was made on a Reddit discussion dataset. We train our model on
the same data and present test results on two different datasets, one from
Reddit and one from Facebook. Our proposed model outperforms the previous one
in terms of domain independence: without using platform-dependent structural
features, our hierarchical LSTM with a word-relevance attention mechanism
achieves F1 scores of 71% and 66%, respectively, in predicting the discourse
roles of comments in Reddit and Facebook discussions. We also present and
analyze the efficiency of recurrent and convolutional architectures in
learning discursive representations on the same task, with different word and
comment embedding schemes. Our attention mechanism enables us to inquire into
the relevance ordering of text segments according to their roles in the
discourse. We present a human-annotator experiment that reveals important
observations about modeling and data annotation. Equipped with our text-based
discourse identification model, we examine how heterogeneous non-textual
features such as location, time, and leaning of information characterize
online discussions on Facebook.
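
To make the described architecture concrete, the following is a minimal sketch of such a model: a word-level BiLSTM with an attention layer pools each comment into a vector, and a comment-level LSTM tags every comment in the thread with a discourse act. All sizes, names, and the label count here are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch of an attention-based hierarchical LSTM tagger
# (PyTorch); sizes, names, and the label set are assumptions only.
import torch
import torch.nn as nn

class HierDiscourseTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, word_hid=128,
                 comment_hid=128, n_acts=9):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_lstm = nn.LSTM(emb_dim, word_hid, batch_first=True,
                                 bidirectional=True)
        self.att = nn.Linear(2 * word_hid, 1)      # word-relevance scores
        self.comment_lstm = nn.LSTM(2 * word_hid, comment_hid,
                                    batch_first=True)
        self.out = nn.Linear(comment_hid, n_acts)  # one act per comment

    def forward(self, tokens):
        # tokens: (n_comments, n_words), one conversation thread
        h, _ = self.word_lstm(self.emb(tokens))    # (C, W, 2*word_hid)
        a = torch.softmax(self.att(h), dim=1)      # attention over words
        comments = (a * h).sum(dim=1)              # (C, 2*word_hid)
        seq, _ = self.comment_lstm(comments.unsqueeze(0))
        return self.out(seq.squeeze(0))            # (C, n_acts) logits

The word-attention weights `a` are what would support the relevance-ordering analysis mentioned above.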
LM2: A Simple Society of Language Models Solves Complex Reasoning
Despite demonstrating emergent reasoning abilities, Large Language Models
(LLMs) often lose track of complex, multi-step reasoning. Existing studies show
that providing guidance via decomposing the original question into multiple
subproblems elicits more robustness in LLM reasoning -- a decomposer generates
the subproblems, and a solver solves each of these subproblems. However, these
techniques fail to accommodate coordination between the decomposer and the
solver modules (either in a single model or different specialized ones) -- the
decomposer does not keep track of the ability of the solver to follow the
decomposed reasoning. In this paper, we propose LM2 to address these
challenges. LM2 modularizes the decomposition, solution, and verification into
three different language models. The decomposer module identifies the key
concepts necessary to solve the problem and generates step-by-step subquestions
according to the reasoning requirement. The solver model generates the solution
to the subproblems that are then checked by the verifier module; depending upon
the feedback from the verifier, the reasoning context is constructed using the
subproblems and the solutions. These models are trained to coordinate using
policy learning. Exhaustive experimentation suggests the superiority of LM2
over existing methods on in-domain and out-of-domain reasoning problems,
outperforming the best baselines on MATH, JEEBench, and MedQA (code available
at https://github.com/LCS2-IIITD/Language_Model_Multiplex).
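
The coordination loop the abstract describes can be sketched as follows; `ask(role, prompt)` stands in for any LLM call, and every prompt and name here is an assumption for illustration rather than the released implementation (see the linked repository for that).

# Illustrative decompose-solve-verify loop; `ask(role, prompt)` is a
# stand-in for any LLM call. Prompts and names are assumptions only.
from typing import Callable

def lm2_style_answer(question: str, ask: Callable[[str, str], str],
                     max_steps: int = 8) -> str:
    context = f"Question: {question}"
    for _ in range(max_steps):
        # Decomposer: propose the next subquestion given progress so far.
        sub = ask("decomposer", context + "\nNext subquestion:")
        if sub.strip().upper().startswith("DONE"):
            break
        # Solver: answer the subquestion within the current context.
        sol = ask("solver", context + f"\nSubquestion: {sub}\nAnswer:")
        # Verifier: accept or reject; only accepted steps extend context.
        verdict = ask("verifier",
                      context + f"\nStep: {sub} -> {sol}\nValid (yes/no)?")
        if verdict.strip().lower().startswith("yes"):
            context += f"\nSubquestion: {sub}\nSolution: {sol}"
    # The final answer is produced from the accumulated reasoning context.
    return ask("solver", context + "\nFinal answer:")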
Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks
Large Language Models (LLMs) have transformed NLP with their remarkable
In-context Learning (ICL) capabilities. Automated assistants based on LLMs are
gaining popularity; however, adapting them to novel tasks is still challenging.
While colossal models excel in zero-shot performance, their computational
demands limit widespread use, and smaller language models struggle without
context. This paper investigates whether LLMs can generalize from labeled
examples of predefined tasks to novel tasks. Drawing inspiration from
biological neurons and the mechanistic interpretation of the Transformer
architecture, we explore the potential for information sharing across tasks. We
design a cross-task prompting setup with three LLMs and show that LLMs achieve
significant performance improvements despite no examples from the target task
in the context. Cross-task prompting leads to a remarkable performance boost of
107% for LLaMA-2 7B, 18.6% for LLaMA-2 13B, and 3.2% for GPT-3.5 on average
over zero-shot prompting, and performs comparably to standard in-context
learning. The effectiveness of generating pseudo-labels for in-task examples is
demonstrated, and our analyses reveal a strong correlation between the effect
of cross-task examples and model activation similarities in source and target
input tokens. This paper offers a first-of-its-kind exploration of LLMs'
ability to solve novel tasks based on contextual signals from different task
examples. (Comment: Accepted at ACL 2024 Main.)
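
A minimal sketch of what such a cross-task prompt can look like: labeled in-context examples come from a source task while the query comes from the unseen target task. The tasks, texts, and labels below are hypothetical placeholders, not the paper's datasets.

# Hypothetical cross-task prompt: in-context examples from a *source*
# task (sentiment), query from a *target* task (NLI).
source_examples = [
    ("The movie was wonderful.", "positive"),
    ("I regret buying this phone.", "negative"),
]
target_query = ("Premise: A man is cooking. Hypothesis: A person is "
                "preparing food. Label (entailment/contradiction/neutral)")

prompt = "\n\n".join(
    f"Input: {text}\nLabel: {label}" for text, label in source_examples
) + f"\n\nInput: {target_query}\nLabel:"

print(prompt)  # send to any LLM completion endpoint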
Prospective observational study of vancomycin injection in SLED patients of ethnic Indian origin
Since vancomycin is itself a nephrotoxic antibiotic, it is only sometimes recommended for Slow Low-Efficiency Dialysis (SLED) patients, against highly resistant infections; in such cases the dose is strictly monitored after intravenous injection. Blood collected from 11 patients was analyzed for vancomycin concentration by HPLC, and the half-life was evaluated for therapeutic drug monitoring. The evaluated T1/2 of vancomycin was 39.12 ± 6.81 hrs, the mean systemic clearance was 16.91 ± 6.99 mL/min, and the mean Vd was 0.57 ± 0.147 L/kg. By comparison, a previously reported study gave mean ± SD half-life, volume of distribution, and systemic clearance of 43.1 ± 21.6 hrs, 0.84 ± 0.17 L/kg, and 24.3 ± 8.39 mL/min, respectively. The t-test of the means gave t = 0.5828 with 20 degrees of freedom and a standard error of difference of 6.829, so the two-tailed P value is 0.5665, i.e. P > 0.05. In ethnic Indian SLED patients, the mean ± SD T1/2 of 39.12 ± 6.81 hrs was thus compared with that of Caucasian patients, 43.1 ± 21.6 hrs, giving a t-statistic of 0.5828 and a P value of 0.5665. It was therefore concluded that the half-life in ethnic Indian patients is shorter than in Caucasians, but that the difference is not significant. The half-life was below 40 hrs in 8 of the 11 patients. Keywords: Vancomycin assay; Slow low-efficiency dialysis; Pharmacokinetic analysis; Ethnic Indian
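
The reported statistics can be reproduced from the summary values alone. A quick check, assuming a pooled two-sample t-test with n = 11 in each group (an assumption that matches the stated 20 degrees of freedom and standard error of 6.829):

# Reproduce the reported t-test from summary statistics, assuming a
# pooled two-sample t-test with n = 11 in each group.
from scipy.stats import ttest_ind_from_stats

t, p = ttest_ind_from_stats(
    mean1=39.12, std1=6.81, nobs1=11,  # ethnic Indian SLED cohort
    mean2=43.10, std2=21.6, nobs2=11,  # reported Caucasian cohort
    equal_var=True,                    # pooled variance -> df = 20
)
print(f"t = {t:.4f}, two-tailed p = {p:.4f}")
# -> t = -0.5828, p = 0.5665 (the sign reflects mean1 < mean2)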
How to think step-by-step: A mechanistic understanding of chain-of-thought reasoning
Despite superior reasoning prowess demonstrated by Large Language Models
(LLMs) with Chain-of-Thought (CoT) prompting, a lack of understanding prevails
around the internal mechanisms of the models that facilitate CoT generation.
This work investigates the neural sub-structures within LLMs that manifest CoT
reasoning from a mechanistic point of view. From an analysis of Llama-2 7B
applied to multistep reasoning over fictional ontologies, we demonstrate that
LLMs deploy multiple parallel pathways of answer generation for step-by-step
reasoning. These parallel pathways provide sequential answers from the input
question context as well as the generated CoT. We observe a functional rift in
the middle layers of the LLM. Token representations in the initial half remain
strongly biased towards the pretraining prior, with the in-context prior taking
over in the later half. This internal phase shift manifests in different
functional components: attention heads that write the answer token appear in
the later half, attention heads that move information along ontological
relationships appear in the initial half, and so on. To the best of our
knowledge, this is the first attempt at a mechanistic investigation of CoT
reasoning in LLMs.
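
One standard tool for this kind of layer-wise analysis is the "logit lens": decode each layer's residual stream through the model's final LayerNorm and unembedding, and watch where the next-token prediction shifts from a pretraining-prior guess to the in-context answer. The sketch below illustrates that general probing technique on GPT-2 (chosen only for its size), not the paper's exact Llama-2 setup.

# Logit-lens sketch: project every layer's hidden state through the
# final LayerNorm + unembedding; illustrative, not the paper's setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# A toy fictional-ontology prompt in the spirit of the paper's probes.
prompt = "All gryphons are mome raths. Alice is a gryphon. Alice is a"
ids = tok(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    out = model(ids, output_hidden_states=True)

for layer, h in enumerate(out.hidden_states):  # embeddings + each block
    logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
    guess = tok.decode(logits.argmax().item())
    print(f"layer {layer:2d}: next-token guess = {guess!r}")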
$[1, j]$-set problem in graphs
A subset $D \subseteq V$ of a graph $G = (V, E)$ is a $[1, j]$-set if every
vertex $v \in V \setminus D$ is adjacent to at least $1$ but not more than $j$
vertices in $D$. The cardinality of a minimum $[1, j]$-set of $G$, denoted as
$\gamma_{[1,j]}(G)$, is called the $[1, j]$-domination number of $G$. Given a
graph $G$ and a positive integer $k$, the decision version of the $[1, j]$-set
problem is to decide whether $G$ has a $[1, j]$-set of cardinality at most $k$.
In this paper, we first obtain an upper bound on $\gamma_{[1,j]}(G)$ using
probabilistic methods, for graphs with bounded minimum and maximum degree. Our
bound is constructive, via the randomized algorithm of Moser and Tardos [MT10].
We also show that the $[1, j]$-set problem is NP-complete for chordal graphs.
Finally, we design two algorithms for finding $\gamma_{[1,j]}(G)$ of a tree
and a split graph, for any fixed $j$, which answers an open question posed in
[CHHM13].
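
A brute-force reference implementation of the definition is straightforward and useful for checking small cases; the dict-of-sets graph representation and all names here are illustrative.

# Brute-force [1, j]-domination number for small graphs; adjacency is
# a dict mapping each vertex to the set of its neighbours.
from itertools import combinations

def is_1j_set(adj, D, j):
    # Every vertex outside D needs between 1 and j neighbours in D.
    return all(1 <= len(adj[v] & D) <= j for v in adj if v not in D)

def gamma_1j(adj, j):
    # Smallest cardinality of a [1, j]-set, or None if none exists.
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for D in map(set, combinations(vertices, size)):
            if is_1j_set(adj, D, j):
                return size
    return None

# Example: for the 4-cycle, two adjacent vertices form a [1, 1]-set.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(gamma_1j(c4, j=1))  # -> 2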
