19 research outputs found

Retrievability in an Integrated Retrieval System: An Extended Study

Author: Carevic Zeljko
Mayr Philipp
Roy Dwaipayan
Publication date: 27/03/2023

Retrievability measures the influence a retrieval system has on access to information in a given collection of items. This measure supports an evaluation of the search system from which insights can be drawn. In this paper, we investigate retrievability in an integrated search system consisting of items from various categories, particularly focussing on datasets, publications, and variables in a real-life Digital Library (DL). The traditional metrics, that is, the Lorenz curve and Gini coefficient, are employed to visualize the diversity in retrievability scores of the three retrievable document types (specifically datasets, publications, and variables). Our results show a significant popularity bias, with certain items being retrieved more often than others. In particular, certain datasets are more likely to be retrieved than other datasets in the same category. In contrast, the retrievability scores of items from the variable or publication category are more evenly distributed. We have observed that the distribution of document retrievability is more diverse for datasets than for publications and variables.
Comment: To appear in International Journal on Digital Libraries (IJDL). arXiv admin note: substantial text overlap with arXiv:2205.0093

arXiv.org e-Print Archive
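
As a rough illustration of the metrics named in the abstract above, the following sketch computes retrievability scores, the points of a Lorenz curve, and the Gini coefficient. It assumes the commonly used cumulative retrievability measure, in which a document earns one point for every query that returns it within a rank cutoff; the abstract does not state which variant the authors use. The function names and the toy rankings are hypothetical.

```python
from collections import Counter


def retrievability(rankings, cutoff, doc_ids):
    """Cumulative retrievability (assumed variant): for each document,
    count the queries that return it within the top `cutoff` ranks."""
    counts = Counter()
    for ranked_docs in rankings:
        for doc in ranked_docs[:cutoff]:
            counts[doc] += 1
    # Documents that were never retrieved get a score of 0.
    return {doc: counts[doc] for doc in doc_ids}


def lorenz_curve(scores):
    """Cumulative share of total retrievability, with documents sorted
    from least to most retrievable. The first point is always 0.0."""
    ordered = sorted(scores)
    total = sum(ordered)
    if total == 0:
        raise ValueError("all scores are zero; the curve is undefined")
    points = [0.0]
    running = 0.0
    for score in ordered:
        running += score
        points.append(running / total)
    return points


def gini(scores):
    """Gini coefficient: 0 means every document is equally retrievable,
    values near 1 mean a few documents dominate the results."""
    ordered = sorted(scores)
    n = len(ordered)
    total = sum(ordered)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero score")
    weighted = sum((2 * i - n - 1) * r for i, r in enumerate(ordered, start=1))
    return weighted / (n * total)


if __name__ == "__main__":
    # Hypothetical ranked result lists for three queries.
    toy_rankings = [
        ["d1", "d2", "d3"],
        ["d1", "d3", "d4"],
        ["d1", "d2", "d4"],
    ]
    scores = retrievability(toy_rankings, cutoff=2, doc_ids=["d1", "d2", "d3", "d4"])
    print(scores)
    print(lorenz_curve(scores.values()))
    print(gini(scores.values()))
```

Computing the coefficient separately for each item category (datasets, publications, variables) would give the kind of per-category comparison the abstract describes.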

A Comparative Analysis of Retrievability and PageRank Measures

Author: Mall Priyanshu Raj
Roy Dwaipayan
Sinha Aman
Publication date: 17/11/2023

The accessibility of documents within a collection holds a pivotal role in Information Retrieval, signifying the ease of locating specific content in a collection of documents. This accessibility can be achieved via two distinct avenues. The first is through a retrieval model using keyword or other feature-based search; the second is by navigating the links associated with documents, where such links are available. Metrics such as PageRank, Hub, and Authority illuminate the pathways through which documents can be discovered within the network of content, while the concept of Retrievability is used to quantify the ease with which a document can be found by a retrieval model. In this paper, we compare these two perspectives, PageRank and retrievability, as they quantify the importance and discoverability of content in a corpus. Through empirical experimentation on benchmark datasets, we demonstrate a subtle similarity between retrievability and PageRank, particularly distinguishable for larger datasets.
Comment: Accepted at FIRE 202

arXiv.org e-Print Archive
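
For readers unfamiliar with the link-based side of the comparison above, here is a minimal sketch of PageRank computed by power iteration. It is not the authors' implementation: the damping factor of 0.85 is the conventional default rather than a value taken from the paper, and the function name and toy link graph are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100, tol=1e-10):
    """Power-iteration PageRank over a dict mapping each document to the
    list of documents it links to. Dangling documents (no outgoing links)
    spread their rank evenly over the whole collection."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by documents without outgoing links.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}

        # Each document passes an equal share of its rank to its targets.
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share

        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical three-document link graph.
    toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
    for doc, score in sorted(pagerank(toy_links).items()):
        print(doc, round(score, 4))
```

The resulting scores could then be set against per-document retrievability scores, for example with a rank correlation, to explore the kind of relationship the abstract reports.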

Automated Attribute Extraction from Legal Proceedings

Author: Adhikary Subinay
Das Sagnik
Ghosh Kripabandhu
Roy Dwaipayan
Saha Sagnik
Sen Procheta
Publication date: 18/10/2023

The escalating number of pending cases is a growing concern worldwide. Recent advancements in digitization have opened up possibilities for leveraging artificial intelligence (AI) tools in the processing of legal documents. Adopting a structured representation for legal documents, as opposed to a mere bag-of-words flat text representation, can significantly enhance processing capabilities. To this end, we put forward a set of diverse attributes for criminal case proceedings. We use a state-of-the-art sequence labeling framework to automatically extract attributes from the legal documents. Moreover, we demonstrate the efficacy of the extracted attributes in a downstream task, namely legal judgment prediction.
Comment: Presented at the Mining and Learning in the Legal Domain (MLLD) workshop 202

arXiv.org e-Print Archive

Leveraging hierarchical self-assembly pathways for realizing colloidal photonic crystals

Author: Chakrabarti Dwaipayan
Johnston Roy L
Morphew Daniel
Neophytou Andreas
Rao Abhishek B
Sciortino Francesco
Shaw James
Publication date: 01/01/2020

Colloidal open crystals are attractive materials, especially for their photonic applications. Self-assembly appeals as a bottom-up route for structure fabrication, but self-assembly of colloidal open crystals has proven elusive because their low coordination makes them mechanically unstable. For such a bottom-up route to yield a desired colloidal open crystal, the target structure is required to be thermodynamically favored for designer building blocks and also kinetically accessible via self-assembly pathways in preference to metastable structures. Additionally, the selection of a particular polymorph poses a challenge for certain much sought-after colloidal open crystals for their applications as photonic crystals. Here, we devise hierarchical self-assembly pathways, which, starting from designer triblock patchy particles, yield in a cascade of well-separated associations first tetrahedral clusters and then tetrastack crystals. The designed pathways avoid trapping into an amorphous phase. Our analysis reveals how such a two-stage self-assembly pathway via tetrahedral clusters promotes crystallization by suppressing five- and seven-membered rings that hinder the emergence of the ordered structure. We also find that slow annealing promotes a bias toward the cubic polymorph relative to the hexagonal counterpart. Finally, we calculate the photonic band structures, showing that the cubic polymorph exhibits a complete photonic band gap for the dielectric filling fraction directly realizable from the designer triblock patchy particles. Unexpectedly, we find that the hexagonal polymorph also supports a complete photonic band gap, albeit only for an increased filling fraction, which can be realized via postassembly processing.

University of Birmingham Research Portal

Archivio della ricerca- Università di Roma La Sapienza

LeDA: a system for legal data annotation

Author: Adhikary Subinay
Ganguly Debasis
Ghosh Kripabandhu
Kumar Guha Shouvik
Roy Dwaipayan
Publication venue: IOS Press
Publication date: 07/12/2023

This paper presents LeDA, a system for Legal Data Annotation. The system offers the functionality of annotating and categorising text spans representing legal concepts that capture the topic of a document, and also supports a meta-annotator to adjudicate the ground truth created by different annotators. Notably, our system supports a dynamic update of the ontology by enabling the creation of new legal concepts. Currently employed to annotate key legal concepts, LeDA aims to construct concept-based semantic representations for tasks such as similar case retrieval and judgment prediction.

Enlighten

Genome wide association study of uric acid in Indian population and interaction of identified variants with type 2 diabetes

Author: Banerjee Priyanka
Bharadwaj Dwaipayan
Chakraborty Shraddha
Ghosh Saurabh
Giri Anil K.
Kauser Yasmeen
Parekatt Vaisak
Roy Suki
Tandon Nikhil
Undru Aditya
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

An abnormal level of Serum Uric Acid (SUA) is an important marker and risk factor for complex diseases including Type 2 diabetes. Since the genetic determinants of uric acid in Indians are totally unexplored, we tried to identify common variants associated with SUA in Indians using a Genome Wide Association Study (GWAS). Association of five known variants in the SLC2A9 and SLC22A11 genes with SUA level in 4,834 normoglycemics (1,109 in discovery and 3,725 in validation phase) was revealed, with different effect sizes in Indians compared to other major ethnic populations of the world. Combined analysis of 1,077 T2DM subjects (772 in discovery and 305 in validation phase) and normoglycemics revealed an additional GWAS signal in the ABCG2 gene. Differences in effect sizes of ABCG2 and SLC2A9 gene variants were observed between normoglycemics and T2DM patients. We identified two novel variants near the long non-coding RNA genes AL356739.1 and AC064865.1 at a nearly genome-wide significance level. Meta-analysis and in silico replication in 11,745 individuals from the AUSTWIN consortium improved the association for rs12206002 in the AL356739.1 gene to a sub-genome-wide association level. Our results extend the association of the SLC2A9, SLC22A11 and ABCG2 genes with SUA level in Indians and enrich the body of evidence for the interrelationship between SUA level and T2DM.