32 research outputs found

    XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages

    Data scarcity is a crucial issue for the development of highly multilingual NLP systems. Yet for many under-represented languages (ULs) -- languages for which NLP research is particularly far behind in meeting user needs -- it is feasible to annotate small amounts of data. Motivated by this, we propose XTREME-UP, a benchmark defined by: its focus on the scarce-data scenario rather than zero-shot; its focus on user-centric tasks -- tasks with broad adoption by speakers of high-resource languages; and its focus on under-represented languages where this scarce-data scenario is most realistic. XTREME-UP evaluates the capabilities of language models across 88 under-represented languages over 9 key user-centric technologies including ASR, OCR, MT, and information access tasks that are of general utility. We create new datasets for OCR, autocomplete, question answering, semantic parsing, and transliteration, and build on and refine existing datasets for other tasks. XTREME-UP provides a methodology for evaluating many modeling scenarios including text-only, multi-modal (vision, audio, and text), supervised parameter tuning, and in-context learning. We evaluate commonly used models on the benchmark

    XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages

    Data scarcity is a crucial issue for the development of highly multilingual NLP systems. Yet for many under-represented languages (ULs) -- languages for which NLP research is particularly far behind in meeting user needs -- it is feasible to annotate small amounts of data. Motivated by this, we propose XTREME-UP, a benchmark defined by: its focus on the scarce-data scenario rather than zero-shot; its focus on user-centric tasks -- tasks with broad adoption by speakers of high-resource languages; and its focus on under-represented languages where this scarce-data scenario tends to be most realistic. XTREME-UP evaluates the capabilities of language models across 88 under-represented languages over 9 key user-centric technologies including ASR, OCR, MT, and information access tasks that are of general utility. We create new datasets for OCR, autocomplete, semantic parsing, and transliteration, and build on and refine existing datasets for other tasks. XTREME-UP provides methodology for evaluating many modeling scenarios including text-only, multi-modal (vision, audio, and text), supervised parameter tuning, and in-context learning. We evaluate commonly used models on the benchmark. We release all code and scripts to train and evaluate model

    MasakhaNER 2.0: Africa-centric Transfer Learning for Named Entity Recognition

    African languages are spoken by over a billion people, but are underrepresented in NLP research and development. The challenges impeding progress include the limited availability of annotated datasets, as well as a lack of understanding of the settings where current methods are effective. In this paper, we make progress towards solutions for these challenges, focusing on the task of named entity recognition (NER). We create the largest human-annotated NER dataset for 20 African languages, and we study the behavior of state-of-the-art cross-lingual transfer methods in an Africa-centric setting, demonstrating that the choice of source language significantly affects performance. We show that choosing the best transfer language improves zero-shot F1 scores by an average of 14 points across 20 languages compared to using English. Our results highlight the need for benchmark datasets and models that cover typologically-diverse African languages

    Development of a kinetic metabolic model: application to Catharanthus roseus hairy root

    A kinetic metabolic model describing Catharanthus roseus hairy root growth and nutrition was developed. The metabolic network includes glycolysis, the pentose-phosphate pathway, the TCA cycle and the catabolic reactions leading to cell building blocks such as amino acids, organic acids, organic phosphates, lipids and structural hexoses. The central primary metabolic network was taken at pseudo-steady state, and metabolic flux analysis allowed the 31 metabolic fluxes to be reduced to 20 independent pathways. Hairy root specific growth rate was described as a function of the intracellular concentration of cell building blocks. Intracellular transport and accumulation kinetics for major nutrients were included. The model uses intracellular nutrients as well as energy shuttles to describe metabolic regulation. Model calibration was performed using experimental data obtained from batch and medium exchange liquid cultures of C. roseus hairy root using a minimal medium in Petri dish. The model is efficient in estimating the growth rate

    A case-control study to determine the relationship between lipid profile and different types of anaemia

    Aim: To study the relationship between lipid profile and types of anaemia. Methods: This case (N=154) control (N=154) study was carried out in the Department of General Medicine, Mahatma Gandhi Medical College, Jaipur, from January 2019 to June 2020. It included all proven cases of anaemia aged >18 years with Hb <12 gm%, irrespective of sex. A detailed history was obtained from the study subjects, with special emphasis on age, sex, occupation, and non-specific symptoms of anaemia. A fasting venous blood sample (>12 hours) was obtained for estimation of the lipid profile. T3 and T4 levels, fasting and postprandial (two hours after an oral dose of 75 g of glucose) blood sugar levels, and bone marrow aspiration cytology were assessed in selected cases based on clinical assessment. Results: The cases and controls were matched for age; the largest share of cases (35.1%) and controls (33%) were in the >50 years age group. Among cases, 61.7% were male and 38.3% female; among controls, 57.8% were male and 42.2% female. The most common presenting symptom was fatigue, present in 50% of the cases. The most common finding on general physical examination was pallor, present in 58.4% of cases. Mean total cholesterol was highest in IDA (136.6±17.9 mg/dl) compared with other types of anaemia; mean HDL was highest in vitamin B12 deficiency anaemia (31.67±5.4 mg/dl); mean LDL was highest in vitamin B12 deficiency anaemia (84.7±14.9); mean VLDL was highest in dimorphic anaemia (52.4±25.8 mg/dl); and mean TG was highest in dimorphic anaemia. Conclusion: Anaemia is associated with significant hypocholesterolaemia, with lowering of all lipid subfractions. The extent of hypocholesterolaemia is proportional to the severity of anaemia. The type of anaemia has no effect on the hypocholesterolaemia seen in anaemia

    Allopurinol reduces the severity of peritoneal adhesions in mice

    A study was designed to investigate the possibility of reducing peritoneal adhesion formation in mice by pretreatment with allopurinol. Allopurinol, at a dose of 35 mg/kg of body weight/d, significantly reduced the severity of peritoneal adhesions (P < .001), and also the neutrophil response to ischemia (P < .05). Tissue myeloperoxidase activity at the site of ischemic injury was significantly lower in the allopurinol-treated mice at the end of 2 weeks (P < .001). However, xanthine oxidase was undetectable in both control and allopurinol-treated mice. These observations suggest that allopurinol reduces the severity of peritoneal adhesion formation in mice, possibly by reducing the neutrophil response to ischemia

    Optimizing bioconversion pathways through systems analysis and metabolic engineering

    We demonstrate a general approach for metabolic engineering of biocatalytic systems comprising the use of a chemostat for strain improvement and radioisotopic tracers for the quantification of pathway fluxes. Flux determination allows the identification of target pathways for modification, as validated by subsequent overexpression of the corresponding gene. We demonstrate this method in the indene bioconversion network of Rhodococcus modified for the overproduction of 1,2-indandiol, a key precursor for the AIDS drug Crixivan