60 research outputs found
InSaAF: Incorporating Safety through Accuracy and Fairness | Are LLMs ready for the Indian Legal Domain?
Recent advancements in language technology and Artificial Intelligence have
resulted in numerous Language Models being proposed to perform various tasks in
the legal domain ranging from predicting judgments to generating summaries.
Despite their immense potential, these models have been proven to learn and
exhibit societal biases and make unfair predictions. In this study, we explore
the ability of Large Language Models (LLMs) to perform legal tasks in the
Indian landscape when social factors are involved. We present a novel metric,
$\beta$-weighted Legal Safety Score ($LSS_{\beta}$), which
encapsulates both the fairness and accuracy aspects of the LLM. We assess LLMs'
safety by considering their performance in the task and the fairness they exhibit with respect to various axes of
disparities in the Indian society. Task performance and fairness scores of
LLaMA and LLaMA-2 models indicate that the proposed metric can
effectively determine the readiness of a model for safe usage in the legal
sector. We also propose finetuning pipelines, utilising specialised legal
datasets, as a potential method to mitigate bias and improve model safety. The
finetuning procedures on LLaMA and LLaMA-2 models increase the $LSS_{\beta}$,
improving their usability in the Indian legal domain. Our code is publicly
released
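The abstract does not give the formula for the score. As a minimal sketch, assuming the metric combines a fairness score and an accuracy score (both in [0, 1]) through a β-weighted harmonic mean in the style of the F-beta score, it could be computed as below. The function name, argument names, and the direction of the weighting are illustrative assumptions, not the authors' definition.

```python
def weighted_safety_score(fairness: float, accuracy: float, beta: float = 1.0) -> float:
    """Beta-weighted harmonic mean of a fairness score and an accuracy score.

    Assumption: this mirrors the F-beta formula and is only an illustration.
    With beta > 1 the fairness term dominates; with beta < 1 accuracy dominates.
    """
    if not (0.0 <= fairness <= 1.0 and 0.0 <= accuracy <= 1.0):
        raise ValueError("fairness and accuracy must lie in [0, 1]")
    if beta <= 0:
        raise ValueError("beta must be positive")
    denominator = beta**2 * accuracy + fairness
    if denominator == 0:
        # Both scores are zero, so the combined score is zero as well.
        return 0.0
    return (1 + beta**2) * accuracy * fairness / denominator


# Hypothetical scores for a model that is fair but only moderately accurate.
balanced = weighted_safety_score(fairness=0.9, accuracy=0.5, beta=1.0)
fairness_heavy = weighted_safety_score(fairness=0.9, accuracy=0.5, beta=2.0)
```

A harmonic mean is a natural choice for a score of this kind because a model scores well only when it is both accurate and fair; a very low value on either axis pulls the combined score down.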
Quantitative assessment of rice crop damage post titli cyclone in Srikakulam, Andhra Pradesh using geo-spatial techniques
Mapping the extent of damage due to natural calamities remains one of the thrust areas in monitoring resource inventory through geo-spatial techniques. The effect of cyclone 'Titli' and heavy rains during the first fortnight of October 2018 in Srikakulam district, Andhra Pradesh State, has been demonstrated using geo-spatial technology in terms of flood-inundated rice area and corresponding yield and production loss. The pre- and post-cyclone (5 and 13 October 2018) flood inundation maps were generated using Sentinel-1A and TerraSAR-X Synthetic Aperture Radar (SAR) data respectively. The pre-cyclone rice area estimates were derived from multi-temporal Sentinel-1A SAR data, while the yield forecast is based on the combination of satellite observations and yield simulation using the ORYZA crop growth model. Intensive ground truth data collection was carried out to validate the satellite-derived rice area estimates for the pre-cyclone event. Accuracy was assessed at district, mandal and village level; an overall accuracy of 96% with a kappa coefficient of 0.92 was achieved. With the help of the flood inundation and rice area maps, mandal-wise flood-affected rice area and corresponding yield loss have been estimated. Post-cyclone ground truth data were collected for quantitative assessment of the crop-damaged area. The overall accuracy of the flood-affected rice map was 85% with a kappa coefficient of 0.70. The rice crop damage assessment with SAR data indicated that 53312 ha out of 205174 ha were affected, and the corresponding estimated yield and production are 0.8 t/ha and 189160 t respectively
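The overall accuracy and kappa coefficient quoted above are standard confusion-matrix statistics. The following sketch shows how they are computed and works out the affected share of the rice area from the two figures in the abstract. The confusion matrix is hypothetical, chosen only because it yields values close to those reported; it is not the study's validation data.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical rice / non-rice validation counts (rows: reference, columns: map).
hypothetical_matrix = [[48, 2],
                       [2, 48]]
overall, kappa = accuracy_and_kappa(hypothetical_matrix)

# Share of the mapped rice area reported as flood affected (figures from the abstract).
affected_fraction = 53312 / 205174
```

For this hypothetical matrix the overall accuracy is about 0.96 and kappa about 0.92, and the affected share works out to roughly 26% of the mapped rice area.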
Benchmarks Underestimate the Readiness of Multi-lingual Dialogue Agents
Creating multilingual task-oriented dialogue (TOD) agents is challenging due
to the high cost of training data acquisition. Following the research trend of
improving training data efficiency, we show, for the first time, that in-context
learning is sufficient to tackle multilingual TOD.
To handle the challenging dialogue state tracking (DST) subtask, we break it
down to simpler steps that are more compatible with in-context learning where
only a handful of few-shot examples are used. We test our approach on the
multilingual TOD dataset X-RiSAWOZ, which has 12 domains in Chinese, English,
French, Korean, Hindi, and code-mixed Hindi-English. Our turn-by-turn DST
accuracy on the 6 languages ranges from 55.6% to 80.3%, seemingly worse than the
SOTA results from fine-tuned models that achieve from 60.7% to 82.8%; our BLEU
scores in the response generation (RG) subtask are also significantly lower
than SOTA.
However, after manual evaluation of the validation set, we find that by
correcting gold label errors and improving dataset annotation schema, GPT-4
with our prompts can achieve (1) 89.6%-96.8% accuracy in DST, and (2) more than
99% correct response generation across different languages. This leads us to
conclude that current automatic metrics heavily underestimate the effectiveness
of in-context learning
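To illustrate why gold label errors can depress an automatic metric, here is a minimal sketch that assumes turn-by-turn DST accuracy is scored as an exact match between the predicted and gold dialogue state at each turn. The slot names and values are hypothetical and are not taken from X-RiSAWOZ.

```python
def turn_accuracy(predicted_states, gold_states):
    """Fraction of turns whose predicted state exactly matches the gold state."""
    if len(predicted_states) != len(gold_states):
        raise ValueError("predicted and gold must cover the same turns")
    if not gold_states:
        return 0.0
    matches = sum(1 for p, g in zip(predicted_states, gold_states) if p == g)
    return matches / len(gold_states)


# Hypothetical two-turn dialogue; the prediction is right on both turns.
predicted = [
    {"domain": "hotel", "area": "north"},
    {"domain": "hotel", "area": "north", "stars": "4"},
]
# The original gold annotation contains an error in the second turn.
gold_original = [
    {"domain": "hotel", "area": "north"},
    {"domain": "hotel", "area": "north", "stars": "5"},
]
# After manual correction the gold label agrees with the dialogue.
gold_corrected = [
    {"domain": "hotel", "area": "north"},
    {"domain": "hotel", "area": "north", "stars": "4"},
]

before = turn_accuracy(predicted, gold_original)
after = turn_accuracy(predicted, gold_corrected)
```

In this toy case the measured accuracy rises from 0.5 to 1.0 although the predictions never change, which is the kind of effect the abstract attributes to annotation errors.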
Novel derivative of aminobenzenesulfonamide (3c) induces apoptosis in colorectal cancer cells through ROS generation and inhibits cell migration
Background: Colorectal cancer (CRC) is the 3rd most common type of cancer worldwide. New anti-cancer agents
are needed for treating late stage colorectal cancer as most of the deaths occur due to cancer metastasis. A
recently developed compound, 3c, has been shown to have a potent antitumor effect; however, the mechanism underlying
the antitumor effect remains unknown.
Methods: 3c-induced inhibition of proliferation was measured in the absence and presence of NAC using MTT in
HT-29 and SW620 cells and xCELLigence RTCA DP instrument. 3c-induced apoptotic studies were performed using
flow cytometry. 3c-induced redox alterations were measured by ROS production using fluorescence plate reader
and flow cytometry and mitochondrial membrane potential by flow cytometry; NADPH and GSH levels were
determined by colorimetric assays. Bcl2 family protein expression, cytochrome c release and PARP activation
were assessed by western blotting. Caspase activation was measured by ELISA. The cell migration assay was done using the
real time xCELLigence RTCA DP system in SW620 cells and wound healing assay in HT-29.
Results: Many anticancer therapeutics exert their effects by inducing reactive oxygen species (ROS). In this study,
we demonstrate that 3c-induced inhibition of cell proliferation is reversed by the antioxidant, N-acetylcysteine,
suggesting that 3c acts via increased production of ROS in HT-29 cells. This was confirmed by the direct
measurement of ROS in 3c-treated colorectal cancer cells. Additionally, treatment with 3c resulted in decreased
NADPH and glutathione levels in HT-29 cells. Further, investigation of the apoptotic pathway showed increased
release of cytochrome c resulting in the activation of caspase-9, which in turn activated caspase-3 and −6. 3c also
(i) increased p53 and Bax expression, (ii) decreased Bcl2 and BclxL expression and (iii) induced PARP cleavage in
human colorectal cancer cells. Confirming our observations, NAC significantly inhibited induction of apoptosis, ROS
production, cytochrome c release and PARP cleavage. The results further demonstrate that 3c inhibits cell migration
by modulating EMT markers and inhibiting TGFβ-induced phosphorylation of Smad2 and Smad3.
Conclusions: Our findings thus demonstrate that 3c disrupts redox balance in colorectal cancer cells and support
the notion that this agent may be effective for the treatment of colorectal cancer
‘Will I Regret for This Tweet?’—Twitter User’s Behavior Analysis System for Private Data Disclosure
Twitter is an extensively used micro-blogging site where users publish their views on recent happenings. The wide reach of messages across a large audience poses a threat, as the degree of personally identifiable information disclosed might lead to user regret. The Tweet-Scan-Post system scans tweets contextually for sensitive messages. The tweet repository was generated using cyber-keywords for personal, professional and health tweets. The Rules of Sensitivity and Contextuality were defined based on standards established by various national regulatory bodies. The naive sensitivity regression function uses the Bag-of-Words model built from short text messages. The imbalanced classes in the dataset result in misclassification, with 25% sensitive and 75% insensitive tweets. The system opted for stacked classification to combat the problem of imbalanced classes. The system initially applied various state-of-the-art algorithms and predicted 26% of the tweets to be sensitive. The proposed stacked classification approach increased the overall proportion of sensitive tweets to 35%. The system contributes a vocabulary set of 201 Sensitive Privacy Keywords using the boosting approach for three tweet categories. Finally, the system formulates a sensitivity scale called TSP’s Tweet Sensitivity Scale, based on Senti-Cyber features composed of Sensitive Privacy Keywords, Cyber-keywords with Non-Sensitive Privacy Keywords and Non-Cyber-keywords, to detect the degree of disclosed sensitive information.
Tweet-scan-post: a system for analysis of sensitive private data disclosure in online social media
On the origin of an unusual dependence of (bio)chemical reactivity of ferric hydroxides on nanoparticle size
A survey of data visualization tools for analyzing large volume of data in big data platform
Effect of Coadsorption of Electrolyte Ions on the Stability of Inner-Sphere Complexes
Coadsorption of electrolyte ions can have a marked influence on the adsorption and hence interfacial reactivity of multicharged ions but has generally been overlooked in previous density functional theory (DFT) studies. The impact of this effect is demonstrated for the DFT-based interpretation of in situ FTIR spectra of the controversial binding form of weakly adsorbed carbonate on hydrated hematite nanoparticles and ferrihydrite. Using a new methodology, we show that addition of hydronium or sodium ions in the DFT models leads to assigning the weakly adsorbed carbonate to an inner-sphere monodentate mononuclear complex. FTIR data and the “bond length−vibrational frequency” correlations established by DFT suggest that the adsorption affinity of ferrihydrite toward carbonate is lower than that of 7 nm hematite NPs. As opposed to the case of carbonate, the outer-sphere adsorption form of sulfate is demonstrated to be stable in the presence of hydronium. These findings have important implications for DFT modeling of the solid−water interfaces, which has become a major complementary tool to interpret spectroscopic and macroscopic observations of adsorption phenomena
