38 research outputs found
De novo identification of viral pathogens from cell culture hologenomes
<p>Abstract</p> <p>Background</p> <p>Fast, specific identification and surveillance of pathogens is the cornerstone of any outbreak response system, especially in the case of emerging infectious diseases and viral epidemics. This process is generally tedious and time-consuming thus making it ineffective in traditional settings. The added complexity in these situations is the non-availability of pure isolates of pathogens as they are present as mixed genomes or hologenomes. Next-generation sequencing approaches offer an attractive solution in this scenario as it provides adequate depth of sequencing at fast and affordable costs, apart from making it possible to decipher complex interactions between genomes at a scale that was not possible before. The widespread application of next-generation sequencing in this field has been limited by the non-availability of an efficient computational pipeline to systematically analyze data to delineate pathogen genomes from mixed population of genomes or hologenomes.</p> <p>Findings</p> <p>We applied next-generation sequencing on a sample containing mixed population of genomes from an epidemic with appropriate processing and enrichment. The data was analyzed using an extensive computational pipeline involving mapping to reference genome sets and <it>de-novo </it>assembly. In depth analysis of the data generated revealed the presence of sequences corresponding to <it>Japanese encephalitis </it>virus. The genome of the virus was also independently <it>de-novo </it>assembled. The presence of the virus was in addition, verified using standard molecular biology techniques.</p> <p>Conclusions</p> <p>Our approach can accurately identify causative pathogens from cell culture hologenome samples containing mixed population of genomes and in principle can be applied to patient hologenome samples without any background information. This methodology could be widely applied to identify and isolate pathogen genomes and understand their genomic variability during outbreaks.</p
Neurodevelopmental disorders in children aged 2-9 years: Population-based burden estimates across five regions in India.
BACKGROUND: Neurodevelopmental disorders (NDDs) compromise the development and attainment of full social and economic potential at individual, family, community, and country levels. Paucity of data on NDDs slows down policy and programmatic action in most developing countries despite perceived high burden. METHODS AND FINDINGS: We assessed 3,964 children (with almost equal number of boys and girls distributed in 2-<6 and 6-9 year age categories) identified from five geographically diverse populations in India using cluster sampling technique (probability proportionate to population size). These were from the North-Central, i.e., Palwal (N = 998; all rural, 16.4% non-Hindu, 25.3% from scheduled caste/tribe [SC-ST] [these are considered underserved communities who are eligible for affirmative action]); North, i.e., Kangra (N = 997; 91.6% rural, 3.7% non-Hindu, 25.3% SC-ST); East, i.e., Dhenkanal (N = 981; 89.8% rural, 1.2% non-Hindu, 38.0% SC-ST); South, i.e., Hyderabad (N = 495; all urban, 25.7% non-Hindu, 27.3% SC-ST) and West, i.e., North Goa (N = 493; 68.0% rural, 11.4% non-Hindu, 18.5% SC-ST). All children were assessed for vision impairment (VI), epilepsy (Epi), neuromotor impairments including cerebral palsy (NMI-CP), hearing impairment (HI), speech and language disorders, autism spectrum disorders (ASDs), and intellectual disability (ID). Furthermore, 6-9-year-old children were also assessed for attention deficit hyperactivity disorder (ADHD) and learning disorders (LDs). We standardized sample characteristics as per Census of India 2011 to arrive at district level and all-sites-pooled estimates. Site-specific prevalence of any of seven NDDs in 2-<6 year olds ranged from 2.9% (95% CI 1.6-5.5) to 18.7% (95% CI 14.7-23.6), and for any of nine NDDs in the 6-9-year-old children, from 6.5% (95% CI 4.6-9.1) to 18.5% (95% CI 15.3-22.3). Two or more NDDs were present in 0.4% (95% CI 0.1-1.7) to 4.3% (95% CI 2.2-8.2) in the younger age category and 0.7% (95% CI 0.2-2.0) to 5.3% (95% CI 3.3-8.2) in the older age category. All-site-pooled estimates for NDDs were 9.2% (95% CI 7.5-11.2) and 13.6% (95% CI 11.3-16.2) in children of 2-<6 and 6-9 year age categories, respectively, without significant difference according to gender, rural/urban residence, or religion; almost one-fifth of these children had more than one NDD. The pooled estimates for prevalence increased by up to three percentage points when these were adjusted for national rates of stunting or low birth weight (LBW). HI, ID, speech and language disorders, Epi, and LDs were the common NDDs across sites. Upon risk modelling, noninstitutional delivery, history of perinatal asphyxia, neonatal illness, postnatal neurological/brain infections, stunting, LBW/prematurity, and older age category (6-9 year) were significantly associated with NDDs. The study sample was underrepresentative of stunting and LBW and had a 15.6% refusal. These factors could be contributing to underestimation of the true NDD burden in our population. CONCLUSIONS: The study identifies NDDs in children aged 2-9 years as a significant public health burden for India. HI was higher than and ASD prevalence comparable to the published global literature. Most risk factors of NDDs were modifiable and amenable to public health interventions
Global age-sex-specific mortality, life expectancy, and population estimates in 204 countries and territories and 811 subnational locations, 1950–2021, and the impact of the COVID-19 pandemic: a comprehensive demographic analysis for the Global Burden of Disease Study 2021
BACKGROUND: Estimates of demographic metrics are crucial to assess levels and trends of population health outcomes. The profound impact of the COVID-19 pandemic on populations worldwide has underscored the need for timely estimates to understand this unprecedented event within the context of long-term population health trends. The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 provides new demographic estimates for 204 countries and territories and 811 additional subnational locations from 1950 to 2021, with a particular emphasis on changes in mortality and life expectancy that occurred during the 2020–21 COVID-19 pandemic period. METHODS: 22 223 data sources from vital registration, sample registration, surveys, censuses, and other sources were used to estimate mortality, with a subset of these sources used exclusively to estimate excess mortality due to the COVID-19 pandemic. 2026 data sources were used for population estimation. Additional sources were used to estimate migration; the effects of the HIV epidemic; and demographic discontinuities due to conflicts, famines, natural disasters, and pandemics, which are used as inputs for estimating mortality and population. Spatiotemporal Gaussian process regression (ST-GPR) was used to generate under-5 mortality rates, which synthesised 30 763 location-years of vital registration and sample registration data, 1365 surveys and censuses, and 80 other sources. ST-GPR was also used to estimate adult mortality (between ages 15 and 59 years) based on information from 31 642 location-years of vital registration and sample registration data, 355 surveys and censuses, and 24 other sources. Estimates of child and adult mortality rates were then used to generate life tables with a relational model life table system. For countries with large HIV epidemics, life tables were adjusted using independent estimates of HIV-specific mortality generated via an epidemiological analysis of HIV prevalence surveys, antenatal clinic serosurveillance, and other data sources. Excess mortality due to the COVID-19 pandemic in 2020 and 2021 was determined by subtracting observed all-cause mortality (adjusted for late registration and mortality anomalies) from the mortality expected in the absence of the pandemic. Expected mortality was calculated based on historical trends using an ensemble of models. In location-years where all-cause mortality data were unavailable, we estimated excess mortality rates using a regression model with covariates pertaining to the pandemic. Population size was computed using a Bayesian hierarchical cohort component model. Life expectancy was calculated using age-specific mortality rates and standard demographic methods. Uncertainty intervals (UIs) were calculated for every metric using the 25th and 975th ordered values from a 1000-draw posterior distribution. FINDINGS: Global all-cause mortality followed two distinct patterns over the study period: age-standardised mortality rates declined between 1950 and 2019 (a 62·8% [95% UI 60·5–65·1] decline), and increased during the COVID-19 pandemic period (2020–21; 5·1% [0·9–9·6] increase). In contrast with the overall reverse in mortality trends during the pandemic period, child mortality continued to decline, with 4·66 million (3·98–5·50) global deaths in children younger than 5 years in 2021 compared with 5·21 million (4·50–6·01) in 2019. An estimated 131 million (126–137) people died globally from all causes in 2020 and 2021 combined, of which 15·9 million (14·7–17·2) were due to the COVID-19 pandemic (measured by excess mortality, which includes deaths directly due to SARS-CoV-2 infection and those indirectly due to other social, economic, or behavioural changes associated with the pandemic). Excess mortality rates exceeded 150 deaths per 100 000 population during at least one year of the pandemic in 80 countries and territories, whereas 20 nations had a negative excess mortality rate in 2020 or 2021, indicating that all-cause mortality in these countries was lower during the pandemic than expected based on historical trends. Between 1950 and 2021, global life expectancy at birth increased by 22·7 years (20·8–24·8), from 49·0 years (46·7–51·3) to 71·7 years (70·9–72·5). Global life expectancy at birth declined by 1·6 years (1·0–2·2) between 2019 and 2021, reversing historical trends. An increase in life expectancy was only observed in 32 (15·7%) of 204 countries and territories between 2019 and 2021. The global population reached 7·89 billion (7·67–8·13) people in 2021, by which time 56 of 204 countries and territories had peaked and subsequently populations have declined. The largest proportion of population growth between 2020 and 2021 was in sub-Saharan Africa (39·5% [28·4–52·7]) and south Asia (26·3% [9·0–44·7]). From 2000 to 2021, the ratio of the population aged 65 years and older to the population aged younger than 15 years increased in 188 (92·2%) of 204 nations. INTERPRETATION: Global adult mortality rates markedly increased during the COVID-19 pandemic in 2020 and 2021, reversing past decreasing trends, while child mortality rates continued to decline, albeit more slowly than in earlier years. Although COVID-19 had a substantial impact on many demographic indicators during the first 2 years of the pandemic, overall global health progress over the 72 years evaluated has been profound, with considerable improvements in mortality and life expectancy. Additionally, we observed a deceleration of global population growth since 2017, despite steady or increasing growth in lower-income countries, combined with a continued global shift of population age structures towards older ages. These demographic changes will likely present future challenges to health systems, economies, and societies. The comprehensive demographic estimates reported here will enable researchers, policy makers, health practitioners, and other key stakeholders to better understand and address the profound changes that have occurred in the global health landscape following the first 2 years of the COVID-19 pandemic, and longer-term trends beyond the pandemic. FUNDING: Bill & Melinda Gates Foundation
A Flexible Class of Regenerating Codes for Distributed Storage
In the distributed storage setting introduced by Dimakis et al., B units of data are stored across n nodes in the network in such a way that the data can be recovered by connecting to any k nodes. Additionally one can repair a failed node by connecting to any d nodes while downloading at most beta units of data from each node. In this paper, we introduce a flexible framework in which the data can be recovered by connecting to any number of nodes as long as the total amount of data downloaded is at least B. Similarly, regeneration of a failed node is possible if the new node connects to the network using links whose individual capacity is bounded above by beta(max) and whose sum capacity equals or exceeds a predetermined parameter gamma. In this flexible setting, we obtain the cut-set lower bound on the repair bandwidth along with a constructive proof for the existence of codes meeting this bound for all values of the parameters. An explicit code construction is provided which is optimal in certain parameter regimes
Optimal Exact-Regenerating Codes for Distributed Storage at the MSR and MBR Points via a Product-Matrix Construction
Regenerating codes are a class of distributed storage codes that allow for efficient repair of failed nodes, as compared to traditional erasure codes. An [n, k, d] regenerating code permits the data to be recovered by connecting to any k of the n nodes in the network, while requiring that a failed node be repaired by connecting to any d nodes. The amount of data downloaded for repair is typically much smaller than the size of the source data. Previous constructions of exact-regenerating codes have been confined to the case n = d + 1. In this paper, we present optimal, explicit constructions of (a) Minimum Bandwidth Regenerating (MBR) codes for all values of [n, k, d] and (b) Minimum Storage Regenerating (MSR) codes for all [n, k, d >= 2k - 2], using a new product-matrix framework. The product-matrix framework is also shown to significantly simplify system operation. To the best of our knowledge, these are the first constructions of exact-regenerating codes that allow the number n of nodes in the network, to be chosen independent of the other parameters. The paper also contains a simpler description, in the product-matrix framework, of a previously constructed MSR code with [n = d + 1, k, d >= 2k - 1]
Enabling node repair in any erasure code for distributed storage
Erasure codes are an efficient means of storing data across a network in comparison to data replication, as they tend to reduce the amount of data stored in the network and offer increased resilience in the presence of node failures. The codes perform poorly though, when repair of a failed node is called for, as they typically require the entire file to be downloaded to repair a failed node. A new class of erasure codes, termed as regenerating codes were recently introduced, that do much better in this respect. However, given the variety of efficient erasure codes available in the literature, there is considerable interest in the construction of coding schemes that would enable traditional erasure codes to be used, while retaining the feature that only a fraction of the data need be downloaded for node repair. In this paper, we present a simple, yet powerful, framework that does precisely this. Under this framework, the nodes are partitioned into two types and encoded using two codes in a manner that reduces the problem of node-repair to that of erasure-decoding of the constituent codes. Depending upon the choice of the two codes, the framework can be used to avail one or more of the following advantages: simultaneous minimization of storage space and repair-bandwidth, low complexity of operation, fewer disk reads at helper nodes during repair, and error detection and correction
Regenerating Codes for Distributed Storage Networks
In a storage system where individual storage nodes are prone to failure, the redundant storage of data in a distributed manner across multiple nodes is a must to ensure reliability. Reed-Solomon codes possess the reconstruction property under which the stored data can be recovered by connecting to any k of the n nodes in the network across which data is dispersed. This property can be shown to lead to vastly improved network reliability over simple replication schemes. Also of interest in such storage systems is the minimization of the repair bandwidth, i.e., the amount of data needed to be downloaded from the network in order to repair a single failed node. Reed-Solomon codes perform poorly here as they require the entire data to be downloaded. Regenerating codes are a new class of codes which minimize the repair bandwidth while retaining the reconstruction property. This paper provides an overview of regenerating codes including a discussion on the explicit construction of optimum codes
Explicit and Optimal Exact-Regenerating Codes for the Minimum-Bandwidth Point in Distributed Storage
In the distributed storage setting that we consider, data is stored across n nodes in the network such that the data can be recovered by connecting to any subset of k nodes. Additionally, one can repair a failed node by connecting to any d nodes while downloading beta units of data from each. Dimakis et al. show that the repair bandwidth d beta can be considerably reduced if each node stores slightly more than the minimum required and characterize the tradeoff between the amount of storage per node and the repair bandwidth. In the exact regeneration variation, unlike the functional regeneration, the replacement for a failed node is required to store data identical to that in the failed node. This greatly reduces the complexity of system maintenance. The main result of this paper is an explicit construction of codes for all values of the system parameters at one of the two most important and extreme points of the tradeoff - the Minimum Bandwidth Regenerating point, which performs optimal exact regeneration of any failed node. A second result is a non-existence proof showing that with one possible exception, no other point on the tradeoff can be achieved for exact regeneration
Interference Alignment in Regenerating Codes for Distributed Storage: Necessity and Code Constructions
Regenerating codes are a class of recently developed codes for distributed storage that, like Reed-Solomon codes, permit data recovery from any arbitrary of nodes. However regenerating codes possess in addition, the ability to repair a failed node by connecting to any arbitrary nodes and downloading an amount of data that is typically far less than the size of the data file. This amount of download is termed the repair bandwidth. Minimum storage regenerating (MSR) codes are a subclass of regenerating codes that require the least amount of network storage; every such code is a maximum distance separable (MDS) code. Further, when a replacement node stores data identical to that in the failed node, the repair is termed as exact. The four principal results of the paper are (a) the explicit construction of a class of MDS codes for d = n - 1 >= 2k - 1 termed the MISER code, that achieves the cut-set bound on the repair bandwidth for the exact repair of systematic nodes, (b) proof of the necessity of interference alignment in exact-repair MSR codes, (c) a proof showing the impossibility of constructing linear, exact-repair MSR codes for d < 2k - 3 in the absence of symbol extension, and (d) the construction, also explicit, of high-rate MSR codes for d = k+1. Interference alignment (IA) is a theme that runs throughout the paper: the MISER code is built on the principles of IA and IA is also a crucial component to the nonexistence proof for d < 2k - 3. To the best of our knowledge, the constructions presented in this paper are the first explicit constructions of regenerating codes that achieve the cut-set bound