142 research outputs found
Revisiting Low Resource Status of Indian Languages in Machine Translation
Indian language machine translation performance is hampered due to the lack
of large scale multi-lingual sentence aligned corpora and robust benchmarks.
Through this paper, we provide and analyse an automated framework to obtain
such a corpus for Indian language neural machine translation (NMT) systems. Our
pipeline consists of a baseline NMT system, a retrieval module, and an
alignment module that is used to work with publicly available websites such as
press releases by the government. The main contribution towards this effort is
to obtain an incremental method that uses the above pipeline to iteratively
improve the size of the corpus as well as improve each of the components of our
system. Through our work, we also evaluate the design choices such as the
choice of pivoting language and the effect of iterative incremental increase in
corpus size. Besides providing an automated framework, our work also
yields a larger corpus than the existing corpora that are available for
Indian languages. This corpus helps us obtain
substantially improved results on the publicly available WAT evaluation
benchmark and other standard evaluation benchmarks.
Comment: 10 pages, few figures, preprint under review
Profiling allele-specific gene expression in brains from individuals with autism spectrum disorder reveals preferential minor allele usage.
One fundamental but understudied mechanism of gene regulation in disease is allele-specific expression (ASE), the preferential expression of one allele. We leveraged RNA-sequencing data from human brain to assess ASE in autism spectrum disorder (ASD). When ASE is observed in ASD, the allele with lower population frequency (minor allele) is preferentially more highly expressed than the major allele, opposite to the canonical pattern. Importantly, genes showing ASE in ASD are enriched in those downregulated in ASD postmortem brains and in genes harboring de novo mutations in ASD. Two regions, 14q32 and 15q11, containing all known orphan C/D box small nucleolar RNAs (snoRNAs), are particularly enriched in shifts to higher minor allele expression. We demonstrate that this allele shifting enhances snoRNA-targeted splicing changes in ASD-related target genes in idiopathic ASD and 15q11-q13 duplication syndrome. Together, these results implicate allelic imbalance and dysregulation of orphan C/D box snoRNAs in ASD pathogenesis
Missing Data in Randomized Clinical Trials for Weight Loss: Scope of the Problem, State of the Field, and Performance of Statistical Methods
BACKGROUND: Dropouts and missing data are nearly ubiquitous in obesity randomized controlled trials, threatening validity and generalizability of conclusions. Herein, we meta-analytically evaluate the extent of missing data, the frequency with which various analytic methods are employed to accommodate dropouts, and the performance of multiple statistical methods. METHODOLOGY/PRINCIPAL FINDINGS: We searched PubMed and Cochrane databases (2000-2006) for articles published in English and manually searched bibliographic references. Articles of pharmaceutical randomized controlled trials with weight loss or weight gain prevention as major endpoints were included. Two authors independently reviewed each publication for inclusion. 121 articles met the inclusion criteria. Two authors independently extracted treatment, sample size, drop-out rates, study duration, and statistical method used to handle missing data from all articles and resolved disagreements by consensus. In the meta-analysis, drop-out rates were substantial, with the survival (non-dropout) rates being approximated by an exponential decay curve ($e^{-\lambda t}$) where $\lambda$ was estimated to be 0.0088 (95% bootstrap confidence interval: 0.0076 to 0.0100) and $t$ represents time in weeks. The estimated drop-out rate at 1 year was 37%. Most studies used last observation carried forward as the primary analytic method to handle missing data. We also obtained 12 raw obesity randomized controlled trial datasets for empirical analyses. Analyses of raw randomized controlled trial data suggested that both mixed models and multiple imputation performed well, but that multiple imputation may be more robust when missing data are extensive. CONCLUSION/SIGNIFICANCE: Our analysis offers an equation for predictions of dropout rates useful for future study planning. Our raw data analyses suggest that multiple imputation is better than other methods for handling missing data in obesity randomized controlled trials, followed closely by mixed models. We suggest these methods supplant last observation carried forward as the primary method of analysis
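The dropout equation in this abstract can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the function names are hypothetical, and a year is assumed to be 52 weeks. Only the decay constant (0.0088 per week) and the 37% one-year figure come from the abstract.

```python
import math

# Point estimate of the weekly decay constant reported in the abstract.
LAMBDA_PER_WEEK = 0.0088


def retention(weeks: float, lam: float = LAMBDA_PER_WEEK) -> float:
    """Fraction of participants still enrolled after `weeks`: exp(-lam * t)."""
    return math.exp(-lam * weeks)


def dropout(weeks: float, lam: float = LAMBDA_PER_WEEK) -> float:
    """Fraction of participants lost after `weeks`: 1 - exp(-lam * t)."""
    return 1.0 - retention(weeks, lam)


# Assumption: one year is taken as 52 weeks.
one_year = dropout(52)
print(f"Predicted dropout at 1 year: {one_year:.0%}")  # prints "Predicted dropout at 1 year: 37%"
```

Passing the bounds of the reported confidence interval (0.0076 and 0.0100) as `lam` gives a rough planning range for the expected dropout of a trial of a given length.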
Contribution of Chondroitin Sulfate A to the Binding of Complement Proteins to Activated Platelets
Exposure of chondroitin sulfate A (CS-A) on the surface of activated platelets is well established. The aim of the present study was to investigate to what extent CS-A contributes to the binding of the complement recognition molecule C1q and the complement regulators C1 inhibitor (C1INH), C4b-binding protein (C4BP), and factor H to platelets. Human blood serum was passed over Sepharose conjugated with CS-A, and CS-A-specific binding proteins were identified by Western blotting and mass spectrometric analysis. C1q was shown to be the main protein that specifically bound to CS-A, but C4BP and factor H were also shown to interact. Binding of C1INH was dependent on the presence of C1q, and C1INH did not bind to CS-A from C1q-depleted serum. The specific interactions of these proteins with CS-A were subsequently confirmed by surface plasmon resonance analysis using purified proteins. Importantly, C1q, C4BP, and factor H were also shown to bind to activated platelets, and this interaction was inhibited by a CS-A-specific monoclonal antibody, thereby linking the binding of C1q, C4BP, and factor H to exposure of CS-A on activated platelets. CS-A-bound C1q was also shown to amplify the binding of model immune complexes to both microtiter plate-bound CS-A and activated platelets. This study supports the concept that CS-A contributes to the binding of C1q, C4BP, and factor H to platelets, thereby adding CS-A to the previously reported binding sites for these proteins on the platelet surface. CS-A-bound C1q also seems to amplify the binding of immune complexes to activated platelets, suggesting a role for this molecule in immune complex diseases
Life Events, Coping, and Posttraumatic Stress Symptoms among Chinese Adolescents Exposed to 2008 Wenchuan Earthquake, China
PURPOSE: To examine the relationship between negative life events, coping styles, and symptoms of post-traumatic stress disorder (PTSD) among adolescent survivors exposed to the 2008 Wenchuan Earthquake, China. METHODS: A survey was conducted in a sample of 2250 adolescent students from two schools in Dujiangyan District, a seriously damaged area 20 kilometers away from the epicenter, 6 months after the earthquake. Participants completed a self-administered questionnaire including demographics, negative life events, coping styles, and PTSD symptoms. RESULTS: Academic pressure was the strongest predictor of adolescents' PTSD symptoms among all negative life events. Main effects of negative life events, positive coping and negative coping on PTSD symptoms were significant in both younger and older adolescents, while the moderating effects of the two coping styles were significant only in older adolescents. CONCLUSIONS: Coping may moderate the relationship between post-earthquake negative life events and PTSD symptoms, but this effect seems to depend on the age of participants. Psychosocial coping skills training may be important in the prevention and intervention of mental health problems in adolescent survivors of traumatic earthquakes
Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models
Deep learning recommendation models (DLRMs) are used across many
business-critical services at Facebook and are the single largest AI
application in terms of infrastructure demand in its data-centers. In this
paper we discuss the SW/HW co-designed solution for high-performance
distributed training of large-scale DLRMs. We introduce a high-performance
scalable software stack based on PyTorch and pair it with the new evolution of
the Zion platform, namely ZionEX. We demonstrate the capability to train very large
DLRMs with up to 12 trillion parameters and show that we can attain 40X speedup
in terms of time to solution over previous systems. We achieve this by (i)
designing the ZionEX platform with a dedicated scale-out network, provisioned
with high bandwidth, optimal topology and efficient transport; (ii) implementing
an optimized PyTorch-based training stack supporting both model and data
parallelism; (iii) developing sharding algorithms capable of hierarchical
partitioning of the embedding tables along row and column dimensions and load
balancing them across multiple workers; (iv) adding high-performance core
operators while retaining flexibility to support optimizers with fully
deterministic updates; and (v) leveraging reduced precision communications,
a multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we
develop and briefly comment on distributed data ingestion and other supporting
services that are required for the robust and efficient end-to-end training in
production environments
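The sharding step in item (iii) can be illustrated with a toy example. This is only a sketch under stated assumptions and not the algorithm from the paper: it splits each embedding table row-wise into fixed-size shards and assigns them with a generic greedy "largest shard to least-loaded worker" heuristic, using row count as a stand-in for cost. Column-wise and hierarchical partitioning are omitted, and the table names, sizes, and function names are all hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Shard = Tuple[str, int, int]  # (table name, start row, end row), end exclusive


def shard_rows(num_rows: int, max_rows_per_shard: int) -> List[Tuple[int, int]]:
    """Split a table into contiguous row ranges [start, end)."""
    return [(start, min(start + max_rows_per_shard, num_rows))
            for start in range(0, num_rows, max_rows_per_shard)]


def balance(tables: Dict[str, int], num_workers: int,
            max_rows_per_shard: int) -> Dict[int, List[Shard]]:
    """Greedily place row-wise shards on the currently least-loaded worker."""
    shards = [(end - start, name, start, end)
              for name, rows in tables.items()
              for start, end in shard_rows(rows, max_rows_per_shard)]
    shards.sort(reverse=True)  # place the largest shards first

    heap = [(0, worker) for worker in range(num_workers)]  # (load, worker id)
    assignment: Dict[int, List[Shard]] = {w: [] for w in range(num_workers)}
    for size, name, start, end in shards:
        load, worker = heapq.heappop(heap)  # least-loaded worker so far
        assignment[worker].append((name, start, end))
        heapq.heappush(heap, (load + size, worker))
    return assignment


# Hypothetical tables, given as name -> number of rows.
tables = {"user": 10, "item": 6, "ad": 4}
assignment = balance(tables, num_workers=2, max_rows_per_shard=4)
loads = {w: sum(end - start for _, start, end in shards)
         for w, shards in assignment.items()}
print(loads)
```

A production system would weigh shards by measured lookup cost and memory footprint rather than by row count alone.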
Leveraging Rural Energy Investment for Parasitic Disease Control: Schistosome Ova Inactivation and Energy Co-Benefits of Anaerobic Digesters in Rural China
Cooking and heating remain the most energy intensive activities among the world's poor, and thus improved access to clean energies for these tasks has been highlighted as a key requirement of attaining the major objectives of the UN Millennium Development Goals. A move towards clean energy technologies such as biogas systems (which produce methane from human and animal waste) has the potential to provide immediate benefits for the control of neglected tropical diseases. Here, an assessment of the parasitic disease and energy benefits of biogas systems in Sichuan Province, China, is presented, highlighting how the public health sector can leverage the proliferation of rural energy projects for infectious disease control. ova) counted at the influent of two biogas systems were removed in the systems when adjusted for system residence time, an approximate 1-log removal attributable to sedimentation. Combined, these inactivation/removal processes underscore the promise of biogas infrastructure for reducing parasite contamination resulting from nightsoil use. When interviewed an average of 4 years after construction, villagers attributed large changes in fuel usage to the installation of biogas systems. Household coal usage decreased by 68%, wood by 74%, and crop waste by 6%. With reported energy savings valued at roughly 600 CNY per year, 2–3 years were required to recoup the capital costs of biogas systems. In villages without subsidies, no new biogas systems were implemented. Sustainable strategies that integrate rural energy needs and sanitation offer tremendous promise for long-term control of parasitic diseases, while simultaneously reducing energy costs and improving quality of life. Government policies can enhance the financial viability of such strategies by introducing fiscal incentives for joint sanitation/sustainable energy projects, along with their associated public outreach and education programs
Total prostatectomy within 6 weeks of a prostate biopsy: is it safe?
PURPOSE: Many urologists recommend a six-week time interval between a prostate biopsy and a total prostatectomy (TP) to allow the biopsy-induced inflammation to subside. Our aim was to assess whether the time interval between prostate biopsy and TP has an impact on the surgical outcome. MATERIALS AND METHODS: A retrospective analysis was performed on data from patients who underwent a TP by a single surgeon from 1992 to 2008. The patients were divided into two groups according to the time interval between biopsy and TP: Group 1 ≤ 6 weeks and Group 2 > 6 weeks. Relevant perioperative variables and outcomes were analyzed. RESULTS: 923 patients were included. There was a significant difference between the two groups in the surgeon's ability to perform a bilateral nerve-sparing procedure. Those who had a TP within six weeks of the biopsy were less likely to have a bilateral nerve-sparing procedure. No significant difference was noted in the other variables, which included Gleason score, surgical margin status, estimated blood loss, post-operative infection, incontinence, erectile function, and biochemical recurrence. CONCLUSIONS: TP can be safely performed without any increase in complications within 6 weeks of a prostate biopsy. However, a TP within six weeks of a biopsy significantly reduced the surgeon's perception of whether a bilateral nerve-sparing procedure was performed
Author Correction: Elucidating causative gene variants in hereditary Parkinson’s disease in the Global Parkinson’s Genetics Program (GP2)
Correction to: npj Parkinson’s Disease, published online 27 June 2023. In this article, the Global Parkinson’s Genetics Program (GP2) members’ names and affiliations were missing from the main author list of the original article; they are listed in the below
Piloting Upfront Xpert MTB/RIF Testing on Various Specimens under Programmatic Conditions for Diagnosis of TB & DR-TB in Paediatric Population
India accounts for one-fifth of the global TB incidence. While the exact burden of childhood TB is not known, TB remains one of the leading causes of childhood mortality in India. Bacteriological confirmation of TB in children is challenging due to difficulty in obtaining quality specimens, in the absence of which diagnosis is largely based on clinical judgement. While testing multiple specimens can potentially contribute to a higher proportion of laboratory-confirmed paediatric TB cases, the lack of high-sensitivity tests adds to the diagnostic challenge. We describe here our experiences in piloting upfront Xpert MTB/RIF testing for diagnosis of TB in the paediatric population in respiratory and extra-pulmonary specimens, as recently recommended by WHO. Xpert MTB/RIF testing was offered to all paediatric (0-14 years) presumptive TB cases (both pulmonary and extra-pulmonary) seeking care at public and private health facilities in the project areas covering 4 cities of India. Under this pilot project, 8,370 paediatric presumptive TB & presumptive DR-TB cases were tested between April and November 2014. Overall, 9,149 specimens were tested, of which 4,445 (48.6%) were non-sputum specimens. Xpert MTB/RIF gave 9,083 (99.2%, CI 99.0-99.4) valid results. Of the 8,143 presumptive TB cases enrolled, 517 (6.3%, CI 5.8-6.9) were bacteriologically confirmed. TB detection rates were twofold higher with Xpert MTB/RIF as compared to smear microscopy. Further, a total of 60 rifampicin-resistant TB cases were detected, of which 38 were detected among 512 presumptive TB cases while 22 were detected amongst 227 presumptive DR-TB cases tested under the project. Xpert MTB/RIF, with advantages of quick turnaround testing time, high proportion of interpretable results and feasibility of rapid rollout, substantially improved the diagnosis of bacteriologically confirmed TB in children, while simultaneously detecting rifampicin resistance
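The reported proportion of bacteriologically confirmed cases and its interval can be recomputed from the counts in the abstract. This is a hedged sketch: the abstract does not say how its confidence intervals were calculated, so a 95% normal-approximation (Wald) interval is assumed here, and the function name is hypothetical.

```python
import math
from typing import Tuple


def wald_ci(successes: int, n: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Proportion with a normal-approximation (Wald) confidence interval."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts from the abstract: 517 confirmed among 8,143 presumptive TB cases.
p, lo, hi = wald_ci(517, 8143)
print(f"{100 * p:.1f}% (95% CI {100 * lo:.1f}-{100 * hi:.1f})")  # prints "6.3% (95% CI 5.8-6.9)"
```

For proportions close to 0 or 1, such as the share of valid test results, the Wald interval is known to behave poorly, and a Wilson or exact interval may be closer to what the authors used.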