29 research outputs found

    Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR

    Improving ASR systems is necessary to make new LLM-based use-cases accessible to people across the globe. In this paper, we focus on Indian languages, and make the case that diverse benchmarks are required to evaluate and improve ASR systems for Indian languages. To address this, we collate Vistaar as a set of 59 benchmarks across various language and domain combinations, on which we evaluate 3 publicly available ASR systems and 2 commercial systems. We also train IndicWhisper models by fine-tuning the Whisper models on publicly available training datasets across 12 Indian languages totalling 10.7K hours. We show that IndicWhisper significantly improves on the considered ASR systems on the Vistaar benchmark. Indeed, IndicWhisper has the lowest WER in 39 out of the 59 benchmarks, with an average reduction of 4.1 WER. We open-source all datasets, code and models. Comment: Accepted in INTERSPEECH 202
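
    The abstract above compares systems by word error rate (WER) but does not say how it was computed. As a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words) and whitespace tokenization, WER can be computed as follows. The function name `word_error_rate` is hypothetical and is not taken from the Vistaar code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace, with no text
    normalization. Benchmarks usually normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or a match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

    The value is a fraction; multiplying by 100 gives the percentage form in which WER is commonly reported.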

    A microbiota-directed food intervention for undernourished children

    BACKGROUND: More than 30 million children worldwide have moderate acute malnutrition. Current treatments have limited effectiveness, and much remains unknown about the pathogenesis of this condition. Children with moderate acute malnutrition have perturbed development of their gut microbiota. METHODS: In this study, we provided a microbiota-directed complementary food prototype (MDCF-2) or a ready-to-use supplementary food (RUSF) to 123 slum-dwelling Bangladeshi children with moderate acute malnutrition between the ages of 12 months and 18 months. The supplementation was given twice daily for 3 months, followed by 1 month of monitoring. We obtained weight-for-length, weight-for-age, and length-for-age z scores and mid-upper-arm circumference values at baseline and every 2 weeks during the intervention period and at 4 months. We compared the rate of change of these related phenotypes between baseline and 3 months and between baseline and 4 months. We also measured levels of 4977 proteins in plasma and 209 bacterial taxa in fecal samples. RESULTS: A total of 118 children (59 in each study group) completed the intervention. The rates of change in the weight-for-length and weight-for-age z scores are consistent with a benefit of MDCF-2 on growth over the course of the study, including the 1-month follow-up. Receipt of MDCF-2 was linked to the magnitude of change in levels of 70 plasma proteins and of 21 associated bacterial taxa that were positively correlated with the weight-for-length z score (P<0.001 for comparisons of both protein and bacterial taxa). These proteins included mediators of bone growth and neurodevelopment. CONCLUSIONS: These findings provide support for MDCF-2 as a dietary supplement for young children with moderate acute malnutrition and provide insight into mechanisms by which this targeted manipulation of microbiota components may be linked to growth.
(Supported by the Bill and Melinda Gates Foundation and the National Institutes of Health; ClinicalTrials.gov number, NCT04015999.)

    Iron Behaving Badly: Inappropriate Iron Chelation as a Major Contributor to the Aetiology of Vascular and Other Progressive Inflammatory and Degenerative Diseases

    The production of peroxide and superoxide is an inevitable consequence of aerobic metabolism, and while these particular "reactive oxygen species" (ROSs) can exhibit a number of biological effects, they are not of themselves excessively reactive and thus they are not especially damaging at physiological concentrations. However, their reactions with poorly liganded iron species can lead to the catalytic production of the very reactive and dangerous hydroxyl radical, which is exceptionally damaging, and a major cause of chronic inflammation. We review the considerable and wide-ranging evidence for the involvement of this combination of (su)peroxide and poorly liganded iron in a large number of physiological and indeed pathological processes and inflammatory disorders, especially those involving the progressive degradation of cellular and organismal performance. These diseases share a great many similarities and thus might be considered to have a common cause (i.e. iron-catalysed free radical and especially hydroxyl radical generation). The studies reviewed include those focused on a series of cardiovascular, metabolic and neurological diseases, where iron can be found at the sites of plaques and lesions, as well as studies showing the significance of iron to aging and longevity. The effective chelation of iron by natural or synthetic ligands is thus of major physiological (and potentially therapeutic) importance. As systems properties, we need to recognise that physiological observables have multiple molecular causes, and studying them in isolation leads to inconsistent patterns of apparent causality when it is the simultaneous combination of multiple factors that is responsible. This explains, for instance, the decidedly mixed effects of antioxidants that have been observed, etc. Comment: 159 pages, including 9 Figs and 2184 references

    Phenotypical Analysis of Tumor Microenvironment


    The Need for Artificial Intelligence Based Risk Factor Analysis for Age-Related Macular Degeneration: A Review

    In epidemiology, a risk factor is a variable associated with increased disease risk. Understanding the role of risk factors is significant for developing a strategy to improve global health. There is strong evidence that risk factors like smoking, alcohol consumption, previous cataract surgery, age, high-density lipoprotein (HDL) cholesterol, BMI, female gender, and focal hyper-pigmentation are independently associated with age-related macular degeneration (AMD). Currently, in the literature, statistical techniques like logistic regression, multivariable logistic regression, etc., are being used to identify AMD risk factors by employing numerical/categorical data. However, artificial intelligence (AI) techniques have not been used so far in the literature for identifying risk factors for AMD. On the other hand, AI-based tools can anticipate when a person is at risk of developing chronic diseases like cancer, dementia, asthma, etc., in providing personalized care. AI-based techniques can employ numerical/categorical and/or image data, thus resulting in multimodal data analysis, which motivates the use of AI-based tools for risk factor analysis in ophthalmology. This review summarizes the statistical techniques used to identify various risk factors and the greater benefits that AI techniques provide for AMD-related disease prediction. Additional studies are required to review different techniques for risk factor identification for other ophthalmic diseases like glaucoma, diabetic macular edema, retinopathy of prematurity, cataract, and diabetic retinopathy.

    An ImageJ macro tool for OCTA-based quantitative analysis of Myopic Choroidal neovascularization.

    Myopic choroidal neovascularization (mCNV) is one of the most common vision-threatening complications of pathological myopia among many retinal diseases. Optical Coherence Tomography Angiography (OCTA) is an emerging non-invasive imaging technique that has recently been included in the investigation and treatment of mCNV. However, there exists no standard tool for time-efficient and dependable analysis of OCTA images of mCNV. In this study, we propose a customizable ImageJ macro that automates the OCTA image processing and lets users measure nine mCNV biomarkers. We developed a three-stage image processing pipeline to process the OCTA images using the macro. The images were first manually delineated, and then denoised using a Gaussian filter. This was followed by the application of the Frangi filter and local adaptive thresholding. Finally, skeletonized images were obtained using the Mexican Hat filter. Nine vascular biomarkers including Junction Density, Vessel Diameter, and Fractal Dimension were then computed from the skeletonized images. The macro was tested on a dataset of 26 OCTA images for all biomarkers. Two trends emerged in the computed biomarker values. First, the lesion-size dependent parameters (mCNV Area (mm²): Mean = 0.65, SD = 0.46) showed high variation, whereas normalized parameters (Junction Density (n/mm): Mean = 10.24, SD = 0.63) were uniform throughout the dataset. The computed values were consistent with manual measurements within existing literature. The results show our ImageJ macro to be a convenient alternative to manual OCTA image processing, including provisions for batch processing and parameter customization, providing a systematic, reliable analysis of mCNV.

    Machine Learning-Based Diagnosis and Ranking of Risk Factors for Diabetic Retinopathy in Population-Based Studies from South India

    This paper discusses the importance of investigating DR using machine learning and a computational method to rank DR risk factors by importance using different machine learning models. The dataset was collected from four large population-based studies conducted in India between 2001 and 2010 on the prevalence of DR and its risk factors. We deployed different machine learning models on the dataset to rank the importance of the variables (risk factors). The study uses a t-test and Shapley additive explanations (SHAP) to rank the risk factors. Then, it uses five machine learning models (K-Nearest Neighbor, Decision Tree, Support Vector Machines, Logistic Regression, and Naive Bayes) to identify the unimportant risk factors based on the area under the curve criterion to predict DR. To determine the overall significance of risk variables, a weighted average of each classifier’s importance is used. The ranking of risk variables is provided to machine learning models. To construct a model for DR prediction, the combination of risk factors with the highest AUC is chosen. The results show that the risk factors glycosylated hemoglobin and systolic blood pressure were present in the top three risk factors for DR in all five machine learning models when the t-test was used for ranking. Furthermore, the risk factors, namely, systolic blood pressure and history of hypertension, were present in the top five risk factors for DR in all the machine learning models when SHAP was used for ranking. Finally, when an ensemble of the five machine learning models was employed, independently with both the t-test and SHAP, systolic blood pressure and diabetes mellitus duration were present in the top four risk factors for diabetic retinopathy. Decision Tree and K-Nearest Neighbor resulted in the highest AUCs of 0.79 (t-test) and 0.77 (SHAP). Moreover, K-Nearest Neighbor predicted DR with 82.6% (t-test) and 78.3% (SHAP) accuracy.
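
    The abstract above mentions combining each classifier's importance scores into a weighted average but gives no details of the weighting. The following is a minimal sketch of one way such a combination could work, assuming each model supplies a score per risk factor and a single non-negative weight (for example its AUC, which is an assumption here). The function name, the model and factor labels, and all numbers are hypothetical placeholders, not values from the study.

```python
def rank_by_weighted_importance(importances, weights):
    """Rank factors by the weighted average of per-model importance scores.

    importances: {model_name: {factor_name: score}}
    weights:     {model_name: weight}  (assumed here; e.g. each model's AUC)
    A factor missing from a model's scores contributes 0 for that model.
    Returns a list of (factor_name, averaged_score), highest score first.
    """
    total_weight = sum(weights[model] for model in importances)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = {}
    for model, scores in importances.items():
        for factor, score in scores.items():
            combined[factor] = combined.get(factor, 0.0) + weights[model] * score

    averaged = {factor: total / total_weight for factor, total in combined.items()}
    return sorted(averaged.items(), key=lambda item: item[1], reverse=True)


# Hypothetical placeholder inputs, chosen only to show the call shape.
example_importances = {
    "model_a": {"factor_x": 1.0, "factor_y": 0.0},
    "model_b": {"factor_x": 0.0, "factor_y": 1.0},
}
example_weights = {"model_a": 3.0, "model_b": 1.0}
example_ranking = rank_by_weighted_importance(example_importances, example_weights)
```

    With these placeholder inputs, the more heavily weighted model dominates the combined ranking, which is the intended effect of weighting by model quality.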

    Capturing variations in nuclear phenotypes

    Relating genotypes with phenotypes is important to understand diseases like cancer, but extremely challenging, given the underlying biological variability and levels of phenotypes. 3D quantitative tools are increasingly used to provide robust inferences pertaining to variations across collections of cells. We especially focus on the changes wrought to the nucleus of specific genotypes. Fibroblasts in the tumor microenvironment of mammary epithelial tissue serve as our model system and provide the context, although our methods are applicable to a broader range of biological systems. Using an image-based approach, we analyze in 3D and compare phenotypes at nuclear level using estimates of texture, morphology and spatial context based on confocal images. Our data demonstrates that deletion of TP53 in stromal fibroblasts results in reorganization of chromatin content across the nucleus, especially the nuclear periphery, while simultaneously reducing nuclear size and making it more spindly. No such shape change was observed for PTEN-deleted genotype, although there were some differences in distribution of chromatin and an increase in the local nuclear density. The relative changes in phenotypes are in line with the larger role that the TP53 plays in tumor initiation and progression. These findings play an important role in uncovering the relationships of those genes with the subcellular phenotypes, as well as formulating new hypotheses, especially pertaining to the relative impact of genes in specific pathways. More importantly, they demonstrate the efficacy of methodology of analyzing a large number of cellular phenotypes.