
106 research outputs found

Addressing the Impact of Localized Training Data in Graph Neural Networks

Author: Akansha Singh
Publication date: 24/07/2023

Graph Neural Networks (GNNs) have achieved notable success in learning from graph-structured data, owing to their ability to capture intricate dependencies and relationships between nodes. They excel in various applications, including semi-supervised node classification, link prediction, and graph generation. However, it is important to acknowledge that the majority of state-of-the-art GNN models are built upon the assumption of an in-distribution setting, which hinders their performance on real-world graphs with dynamic structures. In this article, we aim to assess the impact of training GNNs on localized subsets of the graph. Such restricted training data may lead to a model that performs well in the specific region it was trained on but fails to generalize and make accurate predictions for the entire graph. In the context of graph-based semi-supervised learning (SSL), resource constraints often lead to scenarios where the dataset is large, but only a portion of it can be labeled, affecting the model's performance. This limitation affects tasks like anomaly detection or spam detection when labeling processes are biased or influenced by human subjectivity. To tackle the challenges posed by localized training data, we approach the problem as an out-of-distribution (OOD) data issue by aligning the distributions between the training data, which represents a small portion of labeled data, and the graph inference process that involves making predictions for the entire graph. We propose a regularization method to minimize distributional discrepancies between localized training data and graph inference, improving model performance on OOD data. Extensive tests on popular GNN models show significant performance improvement on three citation GNN benchmark datasets. The regularization approach effectively enhances model adaptation and generalization, overcoming challenges posed by OOD data. Comment: 6 pages, 4 figures

arXiv.org e-Print Archive

Over-Squashing in Graph Neural Networks: A Comprehensive survey

Author: Akansha Singh
Publication date: 04/09/2023

Graph Neural Networks (GNNs) have emerged as a revolutionary paradigm in the realm of machine learning, offering a transformative approach to dissect intricate relationships inherent in graph-structured data. The foundational architecture of most GNNs involves the dissemination of information through message aggregation and transformation among interconnected nodes, a mechanism that has demonstrated remarkable efficacy across diverse applications encompassing node classification, link prediction, and recommendation systems. Nonetheless, their potential prowess encounters a restraint intrinsic to scenarios necessitating extensive contextual insights. In certain contexts, accurate predictions hinge not only upon a node's immediate local surroundings but also on interactions spanning far-reaching domains. This intricate demand for long-range information dissemination exposes a pivotal challenge recognized as "over-squashing," wherein the fidelity of information flow from distant nodes becomes distorted. This phenomenon significantly curtails the efficiency of message-passing mechanisms, particularly for tasks reliant on intricate long-distance interactions. In this comprehensive article, we illuminate the prevalent constraint of over-squashing pervading GNNs. Our exploration entails a meticulous exposition of the ongoing efforts by researchers to alleviate the ramifications posed by this limitation. Through systematic elucidation, we delve into strategies, methodologies, and innovations proposed thus far, all aimed at mitigating the detriments of over-squashing. By shedding light on this intricately woven issue, we aim to contribute to a nuanced understanding of the challenges within the GNN landscape and the evolving solutions designed to surmount them. Comment: 8 pages

arXiv.org e-Print Archive


Data-Driven Control, Modeling, and Forecasting for Residential Solar Power

Author: Bansal Akansha Singh
Publication venue: ScholarWorks@UMass Amherst
Publication date: 18/03/2022

Distributed solar generation is rising rapidly due to a continuing decline in the cost of solar modules. Most residential solar deployments today are grid-tied, enabling them to draw power from the grid when their local demand exceeds solar generation and feed power into the grid when their local solar generation exceeds demand. The electric grid was not designed to support such decentralized and intermittent energy generation by millions of individual users. This dramatic increase in solar power is placing increasing stress on the grid, which must continue to balance its supply and demand despite the potential for large solar fluctuations. To address the problem, this thesis proposes new data-driven techniques for better controlling, modeling, and forecasting residential solar power. The grid currently exercises no direct control over its connected solar capacity, but instead indirectly controls it by regulating new solar connections. This approach is highly inefficient and wastes much of the grid's potential to transmit solar. Instead, we propose Software-defined Solar-powered (SDS) systems that dynamically regulate solar flow rates into the grid and design an SDS prototype, called SunShade. Specifically, we introduce a new class of Weighted Power Point Tracking (WPPT) algorithms, inspired by Maximum Power Point Tracking (MPPT), capable of dynamically enforcing both hard and relative caps on solar power, which enables the grid to decouple rate control from admission control. In contrast, to avoid grid regulations entirely, homes can also partially or entirely defect from the grid to fully utilize their solar power without restrictions. We present a switching architecture that enables homes to dynamically switch between a local generator, battery, and solar to co-optimize their cost, carbon footprint, switching frequency, and reliability.
We introduce switching policies that reveal a tradeoff between solar utilization and reliability, such that higher solar utilization requires more switching, which can lead to lower reliability. Enabling better control of intermittent solar also requires improving solar performance models, which infer solar output based on current environmental conditions. Recent solar models primarily leverage data from ground-based weather stations, which may be far from solar sites and thus inaccurate. In addition, these weather stations report cloud cover---the most important metric for solar modeling---in coarse units of oktas. Instead, we propose developing solar models based on data from a new generation of Geostationary Operational Environmental Satellites (GOES-16 and GOES-17) that began launching in late 2017. We develop physical and machine learning (ML) models for solar performance modeling using both derived data products released by the National Oceanic and Atmospheric Administration (NOAA), as well as the satellites' raw multispectral data. We find that ML-based models using the raw multispectral data are significantly more accurate than both physical models using derived data products, such as Downward Shortwave Radiation (DSR), and prior okta-based solar models. The raw multispectral data is also beneficial since it is available at much higher spatial and temporal resolutions---1km^2 and every 5 minutes---than oktas---25km^2 and every hour. The accuracy of our ML-based models on multispectral data is also better regardless of whether they are locally trained using data only from a particular solar site or globally trained using data from many solar sites. Since global models can be trained once but used anywhere, they can also enable accurate modeling for sites with limited data, e.g., newly installed solar sites. Solar forecasting models, which predict future solar output based on environmental conditions, also help in better solar control.
Accurate near-term solar forecasts on the order of minutes to an hour are particularly important because homes and the grid must be able to adapt to large sudden changes in solar output. Current solar forecasting techniques, which primarily use Numerical Weather Prediction (NWP) algorithms, mostly leverage physics-based modeling. These physics-based models are most appropriate for forecast horizons on the order of hours to days and not near-term forecasts on the order of minutes to an hour. While there is some recent work on analyzing images from ground-based sky cameras for accurate near-term solar forecasting, it requires installing additional infrastructure. We instead propose a general model for solar nowcasting from abundant and readily available multispectral satellite data using self-supervised learning. Specifically, we develop deep auto-regressive models using convolutional neural networks (CNN) and long short-term memory networks (LSTM) that are globally trained across multiple locations to predict raw future observations of the spatio-temporal data collected by the recently launched GOES-R series of satellites. Our model estimates a location's future solar irradiance based on satellite observations, which we feed to a regression model trained on smaller site-specific solar data to provide near-term solar photovoltaic (PV) forecasts that account for site-specific characteristics

ScholarWorks@UMass Amherst

Polymorphism in Bi-based perovskite oxides: a first-principles study

Author: Canadell Enric
Diéguez Oswaldo
Singh Akansha
Singh Viveka N.
Íñiguez Jorge
Publication venue: American Physical Society (APS)
Publication date: 06/04/2018

Under normal conditions, bulk crystals of BiScO$_3$, BiCrO$_3$, BiMnO$_3$, BiFeO$_3$, and BiCoO$_3$ present three very different variations of the perovskite structure: an antipolar phase, a rhombohedral phase with a large polarization along the space diagonal of the pseudocubic unit cell, and a supertetragonal phase with even larger polarization. With the aim of understanding the causes for this variety, we have used a genetic algorithm to search for minima in the surface energy of these materials. Our results show that the number of these minima is very large when compared to that of typical ferroelectric perovskites like BaTiO$_3$ and PbTiO$_3$, and that a fine energy balance between them results in the large structural differences seen. As byproducts of our search we have identified charge-ordering structures with low energy in BiMnO$_3$, and several phases with energies that are similar to that of the ground state of BiCrO$_3$. We have also found that an inverse supertetragonal phase exists in bulk, likely to be favored in films epitaxially grown at large values of tensile misfit strain

arXiv.org e-Print Archive

Digital.CSIC

Continuities and changes in spatial patterns of under-five mortality at the district level in India (1991–2011)

Author: Masquelier Bruno
Singh Akansha
Publication venue: BioMed Central
Publication date: 01/01/2018

Background: India has the largest number of under-five deaths globally, and large variations in under-five mortality persist between states and districts. Relationships between under-five mortality and numerous socioeconomic, development and environmental health factors have been explored at the national and state levels, but the possible spatial heterogeneity in these relationships has seldom been investigated at the district level. This study seeks to unravel local variation in key determinants of under-five mortality based on the 1991 and 2011 censuses. Methods: Using geocoded district-level data from the last two census rounds (1991 and 2011) and ordinary least squares and geographically weighted regressions, we identify district-specific relationships between under-five mortality rate and a series of determinants for two periods separated by 20 years (1986–1987 and 2006–2007). To identify spatial groupings of coefficients, we perform a cluster analysis based on t-values of the geographically weighted regression. Results: The geographically weighted regression analysis shows that relationships between the under-five mortality rate and socioeconomic, development, and environmental health factors vary spatially in terms of direction, strength, and extent when considering: female literacy and labor force participation; share of scheduled castes and scheduled tribes; access to electricity; safe water and sanitation; road infrastructure; and medical facilities. This spatial heterogeneity is accompanied by significant changes over time in the roles that these factors play in under-five mortality. Important local determinants of under-five mortality in 2011 were female literacy, female labor force participation, access to sanitation facilities and electricity; while the key local determinants in 1991 were road infrastructure, safe water, and medical facilities.
We identify six different clusters based on geographically weighted regression coefficients that broadly encompass the same districts in both periods; but these clusters do not follow the regional boundaries suggested by previous studies. In particular, the high mortality states of India that are typically classified as high focus states were classified into three different clusters based on the relationship of the factors associated with under-five mortality. Conclusion: This study demonstrates the utility of combining geographically weighted regression and cluster analyses as a methodological approach to study local-level variation in public health indicators, and it could be applied in any country using aggregate-level information from census or survey data. Identifying local predictors of under-five mortality is important for designing interventions in specific districts. Additional reduction in under-five mortality will only be possible with intervention programs designed at the local level, which take into consideration local level determinants of under-five mortality

Durham Research Online

Directory of Open Access Journals

DIAL UCLouvain

L'archive ouverte de L'Ined

Identification of group specific motifs in Beta-lactamase family of proteins

Author: Saxena Akansha
Singh Harpreet
Singh Reema
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Beta-lactamases are one of the most serious threats to public health. In order to combat this threat we need to study the molecular and functional diversity of these enzymes and identify signatures specific to these enzymes. These signatures will enable us to develop inhibitors and diagnostic probes specific to lactamases. The existing classification of beta-lactamases was developed nearly 30 years ago when few lactamases were available. The DLact database contains more than 2000 beta-lactamases, which can be used to study the molecular diversity and to identify signatures specific to this family. Methods: A set of 2020 beta-lactamase proteins available in the DLact database (http://59.160.102.202/DLact) were classified using graph-based clustering of Best Bi-Directional Hits. Non-redundant (> 90 percent identical) protein sequences from each group were aligned using T-Coffee and annotated using information available in literature. Motifs specific to each group were predicted using the PRATT program. Results: The graph-based classification of beta-lactamase proteins resulted in the formation of six groups (four major groups containing 191, 726, 774 and 73 proteins, and two minor groups containing 50 and 8 proteins). Based on the information available in literature, we found that each of the four major groups corresponds to one of the four classes proposed by Ambler. The two minor groups were novel and do not contain molecular signatures of beta-lactamase proteins reported in literature. The group-specific motifs showed high sensitivity (> 70%) and very high specificity (> 90%). The motifs from three groups (corresponding to class A, C and D) had a high level of conservation at DNA as well as protein level whereas the motifs from the fourth group (corresponding to class B) showed conservation at only protein level.
Conclusion: The graph-based classification of beta-lactamase proteins corresponds with the classification proposed by Ambler, thus there is no need for formulating a new classification. However, further characterization of the two small groups may require updating the existing classification scheme. The better sensitivity and specificity of the group-specific motifs identified in this study, as compared to PROSITE motifs, and their proximity to the active site indicate that these motifs represent group-specific signatures of beta-lactamases and can be further developed into diagnostics and therapeutics.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Data Visualization and Techniques

Author: Sharma Akansha
Singh Prem Pratap
Publication venue: Advanced Research Journals
Publication date: 20/02/2015

Data visualization is the graphical representation of information. Bar charts, scatter graphs, and maps are examples of simple data visualizations that have been used for decades. Information technology combines the principles of visualization with powerful applications and large data sets to create sophisticated images and animations. A tag cloud, for instance, uses text size to indicate the relative frequency of use of a set of terms. In many cases, the data that feed a tag cloud come from thousands of Web pages, representing perhaps millions of users. All of this information is contained in a simple image that you can understand quickly and easily. More complex visualizations sometimes generate animations that demonstrate how data change over time. In an application called Gapminder, bubbles represent the countries of the world, with each nation's population reflected in the size of its bubble. You can set the x and y axes to compare life expectancy with per capita income, for example, and the tool will show how each nation's bubble moves on the graph over time. You can see that higher income generally correlates with longer life expectancy, but the visualization also clearly shows that China doesn't follow this trend: in 1975, the country had one of the lowest per capita incomes but one of the longer life expectancies. The animation also shows the steep drop in life expectancy in many sub-Saharan African countries starting in the early 1990s (corresponding to the AIDS epidemic in that part of the world) and the plummeting of life expectancy in Rwanda at the time of that nation's genocide
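The tag-cloud idea in the abstract above (text size tracking relative term frequency) can be illustrated with a minimal sketch. The function name `tag_cloud_sizes`, the point-size range, and the linear scaling are assumptions made for this example; they are not taken from the paper, and real tag-cloud tools may scale sizes differently (for instance logarithmically).

```python
from collections import Counter


def tag_cloud_sizes(terms, min_pt=10, max_pt=40):
    """Map each term to a font size (in points) based on its frequency.

    Hypothetical helper: sizes are linearly interpolated between
    min_pt (rarest term) and max_pt (most frequent term).
    """
    counts = Counter(terms)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    sizes = {}
    for term, n in counts.items():
        # When every term occurs equally often, give all of them max_pt.
        scale = 1.0 if hi == lo else (n - lo) / (hi - lo)
        sizes[term] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


if __name__ == "__main__":
    words = ["graph", "graph", "graph", "data", "data", "map"]
    for term, size in tag_cloud_sizes(words).items():
        print(f"{term}: {size}pt")
```

A renderer would then draw each term at its computed size; only the frequency-to-size mapping is sketched here.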

Advanced Research Journals

Binary Journal of Data Mining & Networking

A REVIEW ON CARISSA CARANDAS - PHYTOCHEMISTRY, ETHNO-PHARMACOLOGY, AND MICROPROPAGATION AS CONSERVATION STRATEGY

Author: Singh Akansha
Uppal Gursimran Kaur
Publication venue: Innovare Academic Sciences Pvt Ltd
Publication date: 01/05/2015

Carissa carandas is a useful food and medicinal plant of India, found to be widely distributed throughout subtropical and tropical regions. The plant has been used as a traditional medicinal plant over thousands of years in the Ayurvedic, Unani, and Homoeopathic systems of medicine. Traditionally, the whole plant and its parts were used in the treatment of various ailments. The major bioactive constituents, which impart medicinal value to the herb, are alkaloids, flavonoids, saponins and large amounts of cardiac glycosides, triterpenoids, phenolic compounds and tannins. Roots were reported to contain volatile principles including 2-acetyl phenol, lignan, carinol, sesquiterpenes (carissone, carindone), lupeol, β-sitosterol, 16β-hydroxybetulinic acid, α-amyrin, β-sitosterol glycoside, and des-N-methylnoracronycine, whereas leaves were reported to contain triterpenoid constituents as well as tannins. Fruits have been reported to contain carisol, epimer of α-amyrin, linalool, β-caryophyllene, carissone, carissic acid, carindone, ursolic acid, carinol, ascorbic acid, lupeol, and β-sitosterol. The ethnopharmacological significance of the plant has been ascribed to anti-cancer, anti-convulsant, anti-oxidant, analgesic, anti-inflammatory, anti-ulcer, anthelmintic, cardiovascular, anti-nociceptive, anti-diabetic, antipyretic, hepatoprotective, neuropharmacological, and diuretic activities, antimicrobial activities and cytotoxic potentials, in-vitro anti-oxidant and DNA damage inhibition, and constipation and diarrheal activities. The review also describes micropropagation strategies for effective conservation of this important food and medicinal plant. The review has been written with the aim of providing a direction for further clinical research to promote safe and effective herbal treatments to cure a number of diseases

Innovare Academic Sciences: E-Journals

Age- and Sex-Specific Burden of Morbidity and Disability in India: A Current Scenario

Author: Singh Akansha
Yadav Ajit Kumar
Publication venue: IntechOpen
Publication date: 11/03/2020

India is the second most populous country in the world with a population of 1.3 billion; any change in its morbidity and disability pattern is bound to bring change at the Asia level, which is a matter of concern for the developing countries. Disability-free life expectancy (DFLE) and disability-adjusted life years (DALYs) provide summary measures of health across characteristics. The assessments of epidemiological patterns and health system performance of any place and time period display its progress towards the Sustainable Development Goals (SDGs). The main aim of this study is to assess the age and sex pattern of the burden of diseases (mortality and morbidity) and disability in India. The information on disease and deaths was extracted from the 71st round of the National Sample Surveys (NSS) conducted in 2014 (NSS 2014) and the Causes of Deaths Study conducted in 2010–2013 (RGI 2010–2013), and on disability from the Census of India 2011 (ORG 2011)

IntechOpen

Crossref

Periodicals and Nation-Building: The Public Sphere, Modernity, and Modernism in Modern Review and Visva Bharati Quarterly

Author: Singh Akansha
Publication venue: Wydawnictwo Uniwersytetu Łódzkiego
Publication date: 20/12/2023

The paper analyzes selections from Modern Review and Visva Bharati Quarterly, to study the complex act of nation-building taking place in India during the first half of the twentieth century. Through these periodicals, it discusses three interconnected occurrences that contributed to the envisioning of new India: firstly, the construction of a politically aware public sphere through nationalistic sentiments and anti-imperial internationalism; secondly, India’s localization of modernity as oscillating between the colonial subjects’ reactionary modernity and the colonially administered modernity of domination; and thirdly, the emergence of a modernism that was more immersed in restructuring social and political systems of power than being restricted to formal and aesthetic novelty. Thus, drawing on writings published in Modern Review and Visva Bharati Quarterly, the paper assesses the degree to which the two periodicals realized the identity of new India

Repozytorium Uniwersytetu Łódzkiego (University of Lodz Repository)