9 research outputs found

Exploring Large Language Models for Code Explanation

Author: Bhattacharya Paheli
Chakraborty Manojit
Dindorkar Ishan
Gupta Rishabh
Palepu Kartheek N S N
Pandey Vikas
Rajpurohit Rakesh
Publication date: 25/10/2023

Automating code documentation through explanatory text can prove highly beneficial in code understanding. Large Language Models (LLMs) have made remarkable strides in Natural Language Processing, especially within software engineering tasks such as code generation and code summarization. This study specifically delves into the task of generating natural-language summaries for code snippets, using various LLMs. The findings indicate that Code LLMs outperform their generic counterparts, and zero-shot methods yield superior results when dealing with datasets with dissimilar distributions between training and testing sets. Comment: Accepted at the Forum for Information Retrieval Evaluation 2023 (IRSE Track

arXiv.org e-Print Archive

Ensemble Classifier based approach for Code-Mixed Cross-Script Question Classification [Team IINTU]

Author: Debjyoti Bhattacharjee
Paheli Bhattacharya
Publication date: 11/04/2020

With the increasing popularity of social media, people post updates that aid other users in finding answers to their questions. Most of the user-generated data on social media are in code-mixed or multi-script form, where the words are represented phonetically in a non-native script. We address the problem of question classification on social-media data. We propose an ensemble classifier based approach towards question classification when the questions are written in mixed script, specifically, the Roman script for the Bengali language. We separately train Random Forests, One-Vs-Rest and k-NN classifiers and then build an ensemble classifier that combines the best from the three worlds. We achieve an accuracy of approximately 82%, suggesting that the method works well in the task.

CiteSeerX
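
The abstract above does not say how the three base classifiers are combined. As a minimal sketch, assuming simple hard majority voting (an assumption, not the authors' stated method), the combination step could look like the following. The function name majority_vote, the tie-breaking rule, and the question-class labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], priority: int = 0) -> List[str]:
    """Combine per-classifier label lists by hard majority voting.

    predictions[i][j] is the label that classifier i assigns to question j.
    When no single label has the most votes, the label from the classifier
    at index `priority` is used (an illustrative tie-breaking choice).
    """
    if not predictions:
        raise ValueError("need at least one classifier")
    n_items = len(predictions[0])
    if any(len(p) != n_items for p in predictions):
        raise ValueError("all classifiers must label the same questions")

    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        winners = [label for label, count in counts.items() if count == top]
        # A unique winner is taken directly; otherwise fall back to `priority`.
        combined.append(winners[0] if len(winners) == 1 else votes[priority])
    return combined


# Hypothetical outputs of the three separately trained classifiers
# (Random Forest, One-Vs-Rest, k-NN) on four questions.
rf_preds = ["PERSON", "LOCATION", "NUMERIC", "PERSON"]
ovr_preds = ["PERSON", "LOCATION", "TEMPORAL", "OBJECT"]
knn_preds = ["OBJECT", "LOCATION", "TEMPORAL", "MONEY"]

ensemble_preds = majority_vote([rf_preds, ovr_preds, knn_preds])
print(ensemble_preds)
```

In this sketch the first classifier acts as the tie-breaker; in practice one might instead give priority to whichever base classifier scores best on held-out data.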

Ensemble classifier based approach for code-mixed cross-script question classification

Author: Bhattacharjee Debjyoti
Bhattacharya Paheli
Publication date: 01/01/2016

With the increasing popularity of social media, people post updates that aid other users in finding answers to their questions. Most of the user-generated data on social media are in code-mixed or multi-script form, where the words are represented phonetically in a non-native script. We address the problem of question classification on social-media data. We propose an ensemble classifier based approach towards question classification when the questions are written in mixed script, specifically, the Roman script for the Bengali language. We separately train Random Forests, One-Vs-Rest and k-NN classifiers and then build an ensemble classifier that combines the best from the three worlds. We achieve an accuracy of approximately 82%, suggesting that the method works well in the task.

DR-NTU (Digital Repository of NTU)

Efficient hole transport material formed by atmospheric pressure plasma functionalization of Spiro-OMeTAD

Author: Bhattacharya Debabrata
Bowen James
Braithwaite Nicholas St John
Ghosh Paheli
Ivaturi Aruna
Kowal Jan
Krishnamurthy Satheesh
Nixon Tony
Publication venue: Elsevier BV
Publication date: 01/09/2020

A technique to increase the conductivity of Spiro-OMeTAD using an easily scalable, non-thermal atmospheric pressure plasma jet (APPJ) is reported. An investigation of plasma functionalization demonstrated an enhancement in hole conductivity by over an order of magnitude from 9.4 × 10⁻⁷ S cm⁻¹ for the pristine film to 1.15 × 10⁻⁵ S cm⁻¹ for films after 5 minutes of plasma treatment. The conductivity value after plasma functionalization was comparable to that reported for 10–25% Li-TFSI-doped Spiro-OMeTAD. The increase in conductivity was correlated with a reduction in phase value observed using electrostatic force microscopy. Kelvin probe force microscopy showed an increase in work function after plasma exposure corresponding to the p-type nature of the doping. X-ray photoelectron spectroscopy revealed surface oxidation of plasma-functionalized films, as well as variation in nitrogen chemistry, with the formation of a higher binding energy quaternary nitrogen tail. Oxidation of Spiro-OMeTAD was also confirmed by the appearance of the 500 nm absorption peak using UV–vis spectroscopy. The synergistic contribution of increase in charge density in Spiro-OMeTAD due to the energetic species in the plasma jet coupled with improvement in π-π stacking of the molecules is thought to underlie the conductivity enhancement. The enhancement in positive charges can also be attributed to the formation of quinoid structures with quaternary nitrogen +N=C formed due to loss of methyl groups during plasma surface interaction. This work opens up the possibility of using an atmospheric pressure plasma jet as a simple and effective technique for doping and functionalizing Spiro-OMeTAD thin films to circumvent the detrimental issues associated with chemical dopants.

University of Strathclyde Institutional Repository

Open Research Online

Legal IR and NLP: The History, Challenges, and State-of-the-Art

Author: Bhattacharya Paheli
Conrad Jack G.
Ganguly Debasis
Ghosh Kripabandhu
Ghosh Saptarshi
Goyal Pawan
Nigam Shubham Kumar
Paul Shounak
Publication venue: Springer Science and Business Media LLC
Publication date: 16/03/2023

Artificial Intelligence (AI), Machine Learning (ML), Information Retrieval (IR) and Natural Language Processing (NLP) are transforming the way legal professionals and law firms approach their work. The significant potential for the application of AI to Law, for instance, by creating computational solutions for legal tasks, has intrigued researchers for decades. This appeal has only been amplified with the advent of Deep Learning (DL). It is worth noting that working with legal text is far more challenging than working in other subdomains of IR/NLP, mainly due to the typical characteristics of legal text, such as considerably longer documents, complex language and a lack of large-scale annotated datasets. In this tutorial, we introduce the audience to these characteristics of legal text, and with them, the challenges associated with processing legal documents. We touch upon the history of AI and Law research, and how it has evolved over the years from relatively simple approaches to more complex ones, such as those involving DL. We organize the tutorial as follows. First, we provide a brief introduction to state-of-the-art research in the general domain of IR and NLP. We then discuss in more detail IR/NLP tasks specific to the legal domain. We outline the methodologies (both from an academic and industry perspective), and the available tools and datasets to evaluate the methodologies. This is then followed by a hands-on coding/demo session.

Enlighten