
    Memory Injections: Correcting Multi-Hop Reasoning Failures during Inference in Transformer-Based Language Models

    Answering multi-hop reasoning questions requires retrieving and synthesizing information from diverse sources. Large Language Models (LLMs) struggle to perform such reasoning consistently. Here we propose an approach to pinpoint and rectify multi-hop reasoning failures through targeted memory injections on LLM attention heads. First, we analyze the per-layer activations of GPT-2 models in response to single- and multi-hop prompts. We then propose a mechanism that allows users to inject pertinent prompt-specific information, which we refer to as "memories," at critical LLM locations during inference. By enabling the LLM to incorporate additional relevant information during inference, we enhance the quality of multi-hop prompt completions. We show empirically that a simple, efficient, and targeted memory injection into a key attention layer can increase the probability of the desired next token in multi-hop tasks by up to 424%.
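
    The abstract describes injecting prompt-specific "memory" vectors into attention-head outputs at inference time. Below is a minimal sketch of that idea in PyTorch, using a forward hook on a single GPT-2 attention layer; the layer index, injection scale, memory token, and prompt are illustrative assumptions, not the paper's exact procedure.

        import torch
        from transformers import GPT2LMHeadModel, GPT2Tokenizer

        tok = GPT2Tokenizer.from_pretrained("gpt2")
        model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

        LAYER = 6    # assumed injection site
        SCALE = 4.0  # assumed injection strength
        # Memory = embedding of a prompt-relevant token (assumed: " France")
        memory = model.transformer.wte(torch.tensor(tok.encode(" France"))).mean(0)

        def inject(module, inputs, output):
            # GPT2Attention returns a tuple; output[0] is the attention output
            hidden = output[0]
            hidden[:, -1, :] = hidden[:, -1, :] + SCALE * memory
            return (hidden,) + output[1:]

        handle = model.transformer.h[LAYER].attn.register_forward_hook(inject)
        prompt = "The capital of the country where the Eiffel Tower stands is"
        ids = tok(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(ids).logits
        handle.remove()
        print(tok.decode(logits[0, -1].argmax()))  # ideally " Paris"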

    Attention Lens: A Tool for Mechanistically Interpreting the Attention Head Information Retrieval Mechanism

    Transformer-based Large Language Models (LLMs) are the state of the art for natural language tasks. Recent work has attempted to decode the internal mechanisms by which LLMs arrive at their final predictions for text completion tasks by reverse engineering the role of linear layers. Yet little is known about the specific role of attention heads in producing the final token prediction. We propose Attention Lens, a tool that enables researchers to translate the outputs of attention heads into vocabulary tokens via learned attention-head-specific transformations called lenses. Preliminary findings from our trained lenses indicate that attention heads play highly specialized roles in language models. The code for Attention Lens is available at github.com/msakarvadia/AttentionLens.
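
    At its core, a lens as described here is a learned map from one attention head's output to vocabulary logits. Below is a minimal sketch of such a lens with a KL objective that matches the lens's prediction to the model's final next-token distribution; the dimensions and training objective are illustrative assumptions, and the repository above holds the authors' actual implementation.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        D_HEAD, VOCAB = 64, 50257  # GPT-2 head dimension and vocabulary size

        class Lens(nn.Module):
            """One learned lens for one attention head."""
            def __init__(self):
                super().__init__()
                self.proj = nn.Linear(D_HEAD, VOCAB)

            def forward(self, head_out):    # head_out: [batch, D_HEAD]
                return self.proj(head_out)  # vocabulary logits

        lens = Lens()
        opt = torch.optim.Adam(lens.parameters(), lr=1e-4)

        def lens_loss(head_out, final_logits):
            # Train the lens to match the model's final prediction (KL divergence)
            logp = F.log_softmax(lens(head_out), dim=-1)
            p = F.softmax(final_logits, dim=-1)
            return F.kl_div(logp, p, reduction="batchmean")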

    Investigation Into the Antidiabetic Effects of a Developed Polyherbal Nanosuspension and Its Assessment

    This study focuses on the development and evaluation of a nanosuspension containing ethanolic extracts of Tinospora cordifolia and Syzygium cumini for managing Diabetes mellitus. The main objective is to create an effective polyherbal nanosuspension by combining Tinospora cordifolia and Syzygium cumini with an optimal concentration of chitosan polymer to address Diabetes mellitus. Furthermore, both in vitro and in vivo assessments of the synthesized nanosuspensions were conducted to determine the best formulation. Methods and Findings: The ethanolic extracts of the mentioned plants were obtained using a maceration technique, followed by preliminary phytochemical screening, HPTLC analysis, and FTIR-based incompatibility assessments. The nanosuspension was prepared using the ionic gelation method by varying the chitosan polymer concentration. Comprehensive in vitro assessments were carried out, including measurements of pH, viscosity, drug content, entrapment efficiency, loading capacity, and in vitro release profiles for different formulations. The formulation with the highest drug content and optimal release characteristics was selected for further analysis of particle size, zeta potential, and surface morphology. Subsequently, the antidiabetic efficacy of the polyherbal nanosuspension was evaluated using Wistar albino rats. Discussion: FTIR analysis indicated no significant interaction between the drug and the polymer. The in vitro drug release and kinetic analyses suggested that the F5 formulation exhibited superior drug release and an improved release mechanism. The particle size was determined to be approximately 420 nm, and SEM imaging revealed particles that were nearly spherical in shape. Stability assessments of formulation F5 demonstrated consistent physical and chemical parameters over time.
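
    The abstract reports entrapment efficiency and loading capacity without stating its formulas; for reference, the standard definitions used in formulation studies of this kind are:

        \[
        \mathrm{EE}\,(\%) = \frac{W_{\text{total drug}} - W_{\text{free drug}}}{W_{\text{total drug}}} \times 100,
        \qquad
        \mathrm{LC}\,(\%) = \frac{W_{\text{total drug}} - W_{\text{free drug}}}{W_{\text{nanoparticles}}} \times 100
        \]

    where the free drug is the unentrapped fraction measured in the supernatant after the nanoparticles are separated.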

    Automated Detection and Classification of Meningioma Tumor from MR Images Using Sea Lion Optimization and Deep Learning Models

    Meningiomas are the most prevalent benign intracranial brain tumors, yet they can be life-threatening, with a life expectancy of a few months in the later stages, so this type of tumor should be recognized and detected efficiently in brain images. The cause of meningiomas is unknown; radiation exposure, particularly during childhood, is the only recognized environmental risk factor. Magnetic resonance imaging (MRI) is commonly used to detect most tumor forms, as it is a non-invasive and painless method. This study introduces a CNN-HHO integrated automated identification model that uses Sea Lion optimization methods to improve overall network optimization. In addition, various CNN models such as ResNet, VGG, and DenseNet were utilized to gauge the overall influence of combining each CNN with Sea Lion optimization. Each model is tested on our benchmark dataset for accuracy, specificity, Dice coefficient, MCC, and sensitivity, with DenseNet outperforming the other models with a precision of 98%. According to the experimental findings, the proposed methods outperform existing alternatives in the detection of brain tumors.
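
    The evaluation metrics named above (accuracy, sensitivity, specificity, Dice coefficient, MCC) all derive from the binary confusion matrix. Below is a minimal sketch of their standard definitions in Python; the counts in the example call are illustrative, not the paper's results.

        import math

        def metrics(tp, tn, fp, fn):
            accuracy    = (tp + tn) / (tp + tn + fp + fn)
            sensitivity = tp / (tp + fn)               # recall / true positive rate
            specificity = tn / (tn + fp)               # true negative rate
            dice        = 2 * tp / (2 * tp + fp + fn)  # equals F1 for binary labels
            mcc = ((tp * tn - fp * fn) /
                   math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
            return dict(accuracy=accuracy, sensitivity=sensitivity,
                        specificity=specificity, dice=dice, mcc=mcc)

        print(metrics(tp=95, tn=90, fp=5, fn=10))  # illustrative counts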

    Deep Learning-Based BoVW–CRNN Model for Lung Tumor Detection in Nano-Segmented CT Images

    Lung malignancy is one of the most common cancers diagnosed worldwide. Early detection helps identify a suitable treatment and save lives. Due to their high resolution, greater transparency, and low noise and distortion, Computed Tomography (CT) images are most commonly used for processing. In this context, this research work focuses on the multifaceted nature of lung cancer diagnosis, a quintessential, fascinating, and risky subject of oncology. The input here is a nano-image, enhanced with a Gabor filter and modified color-based histogram equalization. The lung cancer image was then segmented using the Guaranteed Convergence Particle Swarm Optimization (GCPSO) algorithm. A graphical user interface nano-measuring tool was designed to classify the tumor region. The Bag of Visual Words (BoVW) model and a Convolutional Recurrent Neural Network (CRNN) were employed for feature extraction and image classification. We achieved an average precision of 96.5%, accuracy of 99.35%, sensitivity of 97%, specificity of 99%, and F1 score of 95.5%. With the proposed solution, the overall time required for image segmentation was much smaller than with existing solutions. Notably, a biocompatible nanotechnology was developed to distinguish the malignancy region on a nanometer scale and evaluate it automatically. This novel method succeeds in producing a proficient, robust, and precise segmentation of lesions in nano-CT images.
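
    The preprocessing pipeline described above starts with Gabor-filter enhancement followed by histogram equalization. Below is a minimal sketch of that step with OpenCV; the kernel size, filter parameters, and input filename are illustrative assumptions, and plain equalizeHist stands in for the paper's modified color-based variant.

        import cv2
        import numpy as np

        img = cv2.imread("lung_ct_slice.png", cv2.IMREAD_GRAYSCALE)  # assumed input

        # Bank of Gabor filters at several orientations; keep the strongest response
        responses = []
        for theta in np.arange(0, np.pi, np.pi / 4):
            kernel = cv2.getGaborKernel(ksize=(21, 21), sigma=4.0, theta=theta,
                                        lambd=10.0, gamma=0.5, psi=0)
            responses.append(cv2.filter2D(img, cv2.CV_32F, kernel))
        enhanced = np.max(responses, axis=0)

        # Rescale to 8-bit and equalize the histogram
        enhanced = cv2.normalize(enhanced, None, 0, 255, cv2.NORM_MINMAX)
        enhanced = cv2.equalizeHist(enhanced.astype(np.uint8))
        cv2.imwrite("enhanced.png", enhanced)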

    ScholarBERT: Bigger is Not Always Better

    Transformer-based masked language models trained on general corpora, such as BERT and RoBERTa, have shown impressive performance on various downstream tasks. Increasingly, researchers are "finetuning" these models to improve performance on domain-specific tasks. Here, we report a broad study in which we applied 14 transformer-based models to 11 scientific tasks in order to evaluate how downstream performance is affected by changes along various dimensions (e.g., training data, model size, pretraining time, finetuning length). In this process, we created the largest and most diverse scientific language model to date, ScholarBERT, by training a 770M-parameter BERT model on a 221B-token scientific literature dataset spanning many disciplines. Counterintuitively, our evaluation of the 14 BERT-based models (seven versions of ScholarBERT, five science-specific large language models from the literature, BERT-Base, and BERT-Large) reveals little difference in performance across the 11 science-focused tasks, despite major differences in model size and training data. We argue that our results establish an upper bound for the performance achievable with BERT-based architectures on tasks from the scientific domain.
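
    The "finetuning" evaluated in the study is the standard procedure of adapting a pretrained BERT-style checkpoint to a downstream classification task. Below is a minimal sketch with Hugging Face Transformers; the checkpoint, dataset, and hyperparameters are illustrative stand-ins (ScholarBERT itself and the 11 science tasks are not reproduced here).

        from datasets import load_dataset
        from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                                  Trainer, TrainingArguments)

        ckpt = "bert-base-uncased"  # stand-in for a ScholarBERT checkpoint
        tok = AutoTokenizer.from_pretrained(ckpt)
        model = AutoModelForSequenceClassification.from_pretrained(ckpt, num_labels=2)

        ds = load_dataset("imdb")   # stand-in for a scientific classification task
        ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=128,
                                  padding="max_length"), batched=True)

        trainer = Trainer(
            model=model,
            args=TrainingArguments(output_dir="out", num_train_epochs=1,
                                   per_device_train_batch_size=16),
            train_dataset=ds["train"].shuffle(seed=0).select(range(2000)),
            eval_dataset=ds["test"].select(range(500)),
        )
        trainer.train()
        print(trainer.evaluate())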

    Use of Statins Among Patients Taking Levothyroxine: an Observational Drug Utilization Study Across Sites

    Context: Treatment with levothyroxine (LT4) that normalizes serum thyrotropin (TSH) is expected to restore lipid metabolism. Objective: To assess statin utilization in LT4-treated patients through an observational drug utilization study. Methods: Three sites were involved: (1) 10,723 outpatients placed on LT4 during 2006-2019, identified from the Clinical Research Data Warehouse of the University of Chicago; (2) approximately 1.4 million LT4 prescriptions prepared by primary care physicians during January-December 2018, identified from the IQVIA database of medical prescriptions in Brazil; (3) approximately 5.4 million patient interviews during 2009-2019, including approximately 0.32 million patients on LT4, identified from the Fleury Group database in Brazil. Results: At site 1, initiation of therapy with LT4 increased the frequency of statin utilization (19.1% vs 24.6%), which occurred approximately 1.5 years later (median 76 weeks) and, among those patients who were on statins, increased the intensity of treatment by 33%, despite normalization of serum TSH levels. At site 2, after matching for sex and age, the frequency of statin prescription was higher for patients using LT4: females, 2.1% vs 3.4% (odds ratio [OR] 1.656 [1.639-1.673]); males, 3.1% vs 4.4% (OR 1.435 [1.409-1.462]). At site 3, after matching for sex and age, the frequency of statin utilization was higher in patients using LT4: females, 10% vs 18% (OR 2.02 [2.00-2.04]); males, 15% vs 25% (OR 1.92 [1.88-1.96]). All P values were <.0001. Conclusion: Prescription and utilization of statins were higher in patients taking LT4. The reasons for this association should be addressed in future studies.
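
    The odds ratios reported above come from 2x2 tables of statin use by LT4 exposure. Below is a minimal sketch of the computation with a Wald 95% confidence interval; the counts are hypothetical, chosen only so the result lands near the site-3 female estimate (10% vs 18%), and are not the study's data.

        import math

        def odds_ratio_ci(a, b, c, d):
            # a/b: exposed with/without outcome; c/d: unexposed with/without outcome
            or_ = (a * d) / (b * c)
            se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # standard error of log(OR)
            lo = math.exp(math.log(or_) - 1.96 * se)
            hi = math.exp(math.log(or_) + 1.96 * se)
            return or_, lo, hi

        # Hypothetical: 1,800/10,000 LT4 users vs 1,000/10,000 non-users on statins
        print(odds_ratio_ci(a=1800, b=8200, c=1000, d=9000))  # OR ~ 1.98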