Processing of long transcriptome sequencing reads on the Amazon Web Services cloud computing platform
Genomes and transcriptomes are studied using sequencers that read the sequence of nucleotide residues of genomic DNA, RNA, or complementary DNA (cDNA). The analysis consists of an experimental part (obtaining primary data) and bioinformatic processing of the primary data. The bioinformatic part is run with different sets of input parameters, and selecting optimal parameter values, as a rule, requires significant computing power. The article describes a protocol for processing transcriptome data on virtual machines provided by the Amazon Web Services (AWS) cloud platform, using the example of the recently emerged technology for long-read DNA and RNA sequencing (Oxford Nanopore Technology). As a result, a virtual machine and instructions for its use have been developed, allowing a wide range of molecular biologists to independently process results obtained with Oxford Nanopore sequencing.
Genomes and transcriptomes are studied using sequencers that read the sequence of nucleotide residues of genomic DNA, RNA, or complementary DNA. Every biopolymer sequencing run consists of an experimental part (obtaining primary data) and bioinformatic processing of those data, using various sets of input parameters and substantial computing power. The article describes a protocol for processing the human transcriptome using virtual machines provided by the Amazon Web Services (AWS) cloud platform. The free and commercially available AWS offerings are considered in light of the computing resource requirements of the recently announced technology for long reads of DNA and RNA sequences (Oxford Nanopore Technology, United Kingdom). As a result, we deployed a virtual machine within the cloud solutions available on AWS and developed instructions for working with it, allowing molecular biologists to independently adapt the presented computing capabilities for processing results obtained with a nanopore sequencer.
De Sitter Gravity and Liouville Theory
We show that the spectrum of conical defects in three-dimensional de Sitter
space is in one-to-one correspondence with the spectrum of vertex operators in
Liouville conformal field theory. The classical conformal dimensions of vertex
operators are equal to the masses of the classical point particles in dS_3 that
cause the conical defect. The quantum dimensions instead are shown to coincide
with the mass of the Kerr-dS_3 solution computed with the Brown-York stress
tensor. Therefore classical de Sitter gravity encodes the quantum properties of
Liouville theory. The equality of the gravitational and the Liouville stress
tensor provides a further check of this correspondence. The Seiberg bound for
vertex operators translates on the bulk side into an upper mass bound for
classical point particles. Bulk solutions with cosmological event horizons
correspond to microscopic Liouville states, whereas those without horizons
correspond to macroscopic (normalizable) states. We also comment on recent
criticism by Dyson, Lindesay and Susskind, and point out that the
contradictions found by these authors may be resolved if the dual CFT is not
able to capture the thermal nature of de Sitter space. Indeed we find that on
the CFT side, de Sitter entropy is merely Liouville momentum, and thus has no
statistical interpretation in this approach.
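For orientation, the Liouville quantities named in the abstract above can be written out using the textbook relations below. This is a sketch in one common convention, not the paper's own notation: the coupling $b$, background charge $Q$, and momentum $\alpha$ are symbols assumed here, and normalizations differ between references.

```latex
\begin{aligned}
Q &= b + \frac{1}{b}, \qquad c = 1 + 6Q^{2}, \\
V_{\alpha} &= e^{2\alpha\phi}, \qquad \Delta_{\alpha} = \alpha\,(Q - \alpha), \\
\alpha &\le \frac{Q}{2} \quad\Longrightarrow\quad \Delta_{\alpha} \le \frac{Q^{2}}{4}.
\end{aligned}
```

The last line is the Seiberg bound for real $\alpha$. Since $\Delta_{\alpha}$ is maximal at $\alpha = Q/2$, a bound on $\alpha$ caps the conformal dimension, which is consistent with the abstract's statement that the bound becomes an upper mass bound on the bulk side.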
Non-AIDS defining cancers in the D:A:D Study - time trends and predictors of survival: a cohort study
BACKGROUND: Non-AIDS defining cancers (NADC) are an important cause of morbidity and mortality in HIV-positive individuals. Using data from a large international cohort of HIV-positive individuals, we described the incidence of NADC from 2004-2010, and described subsequent mortality and predictors of these.
METHODS: Individuals were followed from 1st January 2004/enrolment in study, until the earliest of a new NADC, 1st February 2010, death or six months after the patient's last visit. Incidence rates were estimated for each year of follow-up, overall and stratified by gender, age and mode of HIV acquisition. Cumulative risk of mortality following NADC diagnosis was summarised using Kaplan-Meier methods, with follow-up for these analyses from the date of NADC diagnosis until the patient's death, 1st February 2010 or 6 months after the patient's last visit. Factors associated with mortality following NADC diagnosis were identified using multivariable Cox proportional hazards regression.
RESULTS: Over 176,775 person-years (PY), 880 (2.1%) patients developed a new NADC (incidence: 4.98/1000PY [95% confidence interval 4.65, 5.31]). Over a third of these patients (327, 37.2%) had died by 1st February 2010. Time trends for lung cancer, anal cancer and Hodgkin's lymphoma were broadly consistent. Kaplan-Meier cumulative mortality estimates at 1, 3 and 5 years after NADC diagnosis were 28.2% [95% CI 25.1-31.2], 42.0% [38.2-45.8] and 47.3% [42.4-52.2], respectively. Significant predictors of poorer survival after diagnosis of NADC were lung cancer (compared to other cancer types), male gender, non-white ethnicity, and smoking status. Later year of diagnosis and higher CD4 count at NADC diagnosis were associated with improved survival. The incidence of NADC remained stable over the period 2004-2010 in this large observational cohort.
CONCLUSIONS: The prognosis after diagnosis of NADC, in particular lung cancer and disseminated cancer, is poor but has improved somewhat over time. Modifiable risk factors, such as smoking and low CD4 counts, were associated with mortality following a diagnosis of NADC.
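The reported incidence figure can be reproduced from the counts given in the abstract. The sketch below assumes a normal approximation to the Poisson distribution for the confidence interval; the abstract does not say which interval method the authors used, so this is an illustration rather than their procedure. The function name is hypothetical.

```python
import math


def incidence_per_1000_py(events, person_years, z=1.96):
    """Crude incidence rate per 1000 person-years with an approximate CI.

    Assumption: events are Poisson-distributed, so the standard error of
    the count is sqrt(events) and a normal approximation gives the interval.
    """
    scale = 1000 / person_years
    rate = events * scale
    half_width = z * math.sqrt(events) * scale
    return rate, rate - half_width, rate + half_width


# Counts taken from the abstract: 880 new NADC over 176,775 person-years.
rate, lower, upper = incidence_per_1000_py(880, 176_775)
print(f"{rate:.2f}/1000PY (95% CI {lower:.2f}, {upper:.2f})")

# Share of NADC patients who had died by the end of follow-up (327 of 880).
died_share = 327 / 880 * 100
print(f"{died_share:.1f}% died")
```

Under this approximation the values round to the figures quoted in the abstract (4.98, 4.65, 5.31 and 37.2%).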
The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance
INTRODUCTION
Investment in Africa over the past year with regard to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing has led to a massive increase in the number of sequences, which, to date, exceeds 100,000 sequences generated to track the pandemic on the continent. These sequences have profoundly affected how public health officials in Africa have navigated the COVID-19 pandemic.
RATIONALE
We demonstrate how the first 100,000 SARS-CoV-2 sequences from Africa have helped monitor the epidemic on the continent, how genomic surveillance expanded over the course of the pandemic, and how we adapted our sequencing methods to deal with an evolving virus. Finally, we also examine how viral lineages have spread across the continent in a phylogeographic framework to gain insights into the underlying temporal and spatial transmission dynamics for several variants of concern (VOCs).
RESULTS
Our results indicate that the number of countries in Africa that can sequence the virus within their own borders is growing and that this is coupled with a shorter turnaround time from the time of sampling to sequence submission. Ongoing evolution necessitated the continual updating of primer sets, and, as a result, eight primer sets were designed in tandem with viral evolution and used to ensure effective sequencing of the virus. The pandemic unfolded through multiple waves of infection that were each driven by distinct genetic lineages, with B.1-like ancestral strains associated with the first pandemic wave of infections in 2020. Successive waves on the continent were fueled by different VOCs, with Alpha and Beta cocirculating in distinct spatial patterns during the second wave and Delta and Omicron affecting the whole continent during the third and fourth waves, respectively. Phylogeographic reconstruction points toward distinct differences in viral importation and exportation patterns associated with the Alpha, Beta, Delta, and Omicron variants and subvariants, when considering both Africa versus the rest of the world and viral dissemination within the continent. Our epidemiological and phylogenetic inferences therefore underscore the heterogeneous nature of the pandemic on the continent and highlight key insights and challenges, for instance, recognizing the limitations of low testing proportions. We also highlight the early warning capacity that genomic surveillance in Africa has had for the rest of the world with the detection of new lineages and variants, the most recent being the characterization of various Omicron subvariants.
CONCLUSION
Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve. This is important not only to help combat SARS-CoV-2 on the continent but also because it can be used as a platform to help address the many emerging and reemerging infectious disease threats in Africa. In particular, capacity building for local sequencing within countries or within the continent should be prioritized because this is generally associated with shorter turnaround times, providing the most benefit to local public health authorities tasked with pandemic response and mitigation and allowing for the fastest reaction to localized outbreaks. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century
Tyrosine hydroxylase expression and activity in nigrostriatal dopaminergic neurons of MPTP-treated mice at the presymptomatic and symptomatic stages of parkinsonism
Progressive degeneration of nigrostriatal dopaminergic (DA-ergic) neurons is a key component in the pathogenesis of Parkinson's disease, which develops for a long time at the preclinical stage with no motor dysfunctions due to the initiation of compensatory processes. The goal of this study was to evaluate the changes in surviving nigrostriatal DA-ergic neurons, with a focus on tyrosine hydroxylase (TH), in MPTP-treated mice at the presymptomatic and early symptomatic stages of parkinsonism. According to our data, a partial degeneration of DA-ergic neurons at the presymptomatic stage was accompanied by: (i) no change in TH mRNA content in the substantia nigra (SN), suggesting a compensatory increase of TH gene expression in individual neurons; (ii) a decrease of TH protein content in the nigrostriatal system and no change in individual neurons, suggesting a slowdown of TH translation. Comparing DA-ergic neurons at the early symptomatic stage with those at the presymptomatic stage reveals: (i) a decrease of TH mRNA content in the SN and hence of gene expression in individual neurons; (ii) a decrease of TH content in the striatum and its increase in the SN and individual neurons, suggesting an acceleration of TH translation. TH activity, an index of the rate of DA synthesis, was unchanged in the SN and decreased in the striatum to the same degree at both stages of parkinsonism. Meanwhile, TH activity in individual neurons appeared to show a compensatory increase, to a higher degree at the symptomatic stage than at the presymptomatic one. These data show for the first time that DA depletion, which provokes motor dysfunction, is not a result of a decrease in TH activity and the rate of DA synthesis but is rather related to either a decrease of DA release or an increase of DA uptake in striatal DA-ergic axons.
Copyright © 2014 Elsevier B.V. All rights reserved
"Stakhanovite" genes of human chromosome 18, missing proteins and uncharacterized proteins in liver tissue and the HepG2 cell line
Missing (MP) and functionally uncharacterized proteins (uPE1) comprise less than 5% of the total number of proteins encoded by human Chr18 genes. Within half a year of the January 2020 version of NextProt, the number of entries in the MP+uPE1 datasets changed, mainly due to the achievements of antibody-based proteomics. Assuming that the proteome is closely related to the transcriptome scaffold, quantitative PCR, Illumina HiSeq, and Oxford Nanopore Technology were applied to characterize liver samples from three male donors in comparison with the HepG2 cell line. Data mining of the Expression Atlas (EMBL-EBI) and profiling of biopsy samples by orthogonal methods of transcriptome analysis showed that in HepG2 cells and the liver, the genes encoding functionally uncharacterized proteins (uPE1) are expressed at levels as low as those of the missing proteins (less than 1 copy per cell), except for the selected cases of HSBP1L1, TMEM241, C18orf21, and KLHL14. The initial expectation that uPE1 genes might be expressed at higher levels than MP genes was compromised by severe discrepancies in our semi-quantitative gene expression data and in public databanks. Such discrepancy forced us to revisit the transcriptome of Chr18, the target of the Russian C-HPP Consortium. A tanglegram of highly expressed genes and further correlation analysis showed a strong dependence on the mRNA extraction method and the analytical platform. Targeted gene expression analysis by quantitative PCR (qPCR) and high-throughput transcriptome profiling (Illumina HiSeq and ONT MinION) for the same set of samples from normal liver tissue and HepG2 cells revealed detectable expression of 250+ (92%) protein-coding genes of Chr18 (by at least one method). The expression of slightly more than 50% of protein-coding genes was detected simultaneously by all three methods. Correlation analysis of the gene expression profiles showed that the grouping of the datasets depended almost equally on both the type of biological material and the experimental method, particularly cDNA/mRNA isolation and library preparation.
Missing proteins (MP) and functionally uncharacterized proteins (uPE1) make up less than 5% of the total number of proteins encoded by genes of human chromosome 18. Over six months, starting from the January 2020 version of NextProt, the number of entries in the MP+uPE1 datasets increased. These changes are mainly due to advances in antibody-based proteomics. In this work, quantitative PCR and the Illumina HiSeq and Oxford Nanopore Technologies sequencing technologies were applied for a comparative analysis of the transcriptome profiles of liver samples from three male donors and the HepG2 cell line. Analysis of Expression Atlas (EMBL-EBI) data and of the results obtained from biological samples using orthogonal methods of transcriptome analysis showed that in liver cells and HepG2, the expression of genes encoding functionally uncharacterized proteins (uPE1) is as low as that of MP genes (less than 1 copy per cell). The exceptions were several genes: HSBP1L1, TMEM241, C18orf21 and KLHL14. Based on substantial discrepancies between previously obtained semi-quantitative gene expression data and data in public databases, it was initially assumed that the expression of uPE1 genes might be higher than that of MP genes. This discrepancy prompted a return to the transcriptome of human chromosome 18, Russia's target chromosome in the Human Proteome Project. The results on the most highly expressed genes and subsequent correlation analysis showed a dependence on the mRNA extraction method and the analytical platform.
Expression analysis of chromosome 18 target genes using quantitative PCR (qPCR) and high-throughput transcriptome profiling methods (Illumina HiSeq and ONT MinION) on identical sets of samples of normal liver tissue and the HepG2 cell line identified more than 250 (92%) protein-coding genes detectable by at least one method. Expression of more than 50% of protein-coding genes was detected by all three methods. Correlation analysis of the gene expression profiles showed that the results cluster according to the type of biological material and the experimental methods, in particular the library preparation method (isolation of cDNA, mRNA). Dependence on the choice of bioinformatic processing method was observed to a much smaller degree.
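The two detection figures in this abstract (genes detected by at least one method, and genes detected by all three) correspond to a set union and a set intersection over the per-method gene lists. The sketch below illustrates that bookkeeping on made-up placeholder data: the gene labels, the per-method lists, and the function name are all hypothetical and do not come from the study.

```python
def detection_summary(detected_by_method, all_genes):
    """Count genes detected by at least one method and by every method.

    detected_by_method maps a method name to the set of genes it detected;
    all_genes is the full set of protein-coding genes under consideration.
    """
    gene_sets = [set(genes) & all_genes for genes in detected_by_method.values()]
    any_method = set.union(*gene_sets)
    every_method = set.intersection(*gene_sets)
    total = len(all_genes)
    return {
        "any_count": len(any_method),
        "any_percent": 100 * len(any_method) / total,
        "all_count": len(every_method),
        "all_percent": 100 * len(every_method) / total,
    }


# Toy placeholder data, not results from the paper.
all_genes = {"g1", "g2", "g3", "g4", "g5"}
detected = {
    "qPCR": {"g1", "g2", "g3"},
    "Illumina HiSeq": {"g1", "g2", "g4"},
    "ONT MinION": {"g1", "g2"},
}
summary = detection_summary(detected, all_genes)
print(summary)
```

With real data, each set would hold the Chr18 genes whose expression passed the chosen detection threshold for that method. The threshold itself is a separate analysis choice that the abstract does not specify.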