40 research outputs found

    Statistical Methods for the Modelling of Label-Free Shotgun Proteomic Data in Cell Line Biomarker Discovery

    This thesis develops and implements a data analysis pipeline for biomarker discovery by high-throughput shotgun proteomics. Specifically, the solution is optimized for the analysis of tumor cell line secretomes by label-free LC-MS/MS, with proteins quantified by peptide spectral counts. The work demonstrates the incidence and relevance of batch effects in comparative label-free LC-MS/MS proteomics, and identifies the features that make a potential biomarker reproducible. The model was developed on empirical data obtained from a series of spiked experiments (samples with controlled protein mixtures), and simulations were used to evaluate its performance. The pipeline comprises two R/Bioconductor packages. The first, msmsEDA, provides exploratory data analysis (EDA) functions based on multidimensional analysis tools for LC-MS/MS experiments quantified by spectral counts. The second, msmsTests, provides inference functions based on generalized linear models (GLM) with Poisson or negative binomial distributions, or the quasi-likelihood GLM extension. Two graphical interfaces were also produced to ease use of the pipeline by non-experts in an MS lab; they are freely available on GitHub. The packages and their documentation are freely available at bioconductor.org. The model is designed to discover differentially expressed proteins in tumor cell line secretomes, using the cell as the unit of interest. It allows blocking factors as a means of correcting batch effects. Normalization to cell units is embedded in the model through offsets, so no prior data treatment is required. Together, msmsEDA and msmsTests allow for:
    • Dataset quality assessment.
    • The identification of outliers.
    • The identification of confounding factors or batch effects.
    • The discovery of potential biomarkers using the distribution that best fits the available data.
    • The improvement of reproducibility through a post-test filter based on effect size and signal levels.
    Several papers developing each data treatment step, and demonstrating its use and value in biological experiments carried out in the Tumor Biomarker lab at VHIO, have been published in peer-reviewed proteomics journals.
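
The role of offsets in a Poisson model for spectral counts can be illustrated with a minimal Python sketch. It is not the msmsTests implementation (which is an R/Bioconductor package); the function name, the sample values, and the restriction to a single two-level factor without blocking are assumptions made for illustration. In that special case the maximum-likelihood rate of each group has a closed form: total counts divided by total exposure, where the exposure is the offset on the natural scale.

```python
import math

def poisson_log2_fold_change(counts_a, exposure_a, counts_b, exposure_b):
    """Log2 rate ratio (B vs A) for a Poisson model with log(exposure) offsets.

    With one two-level factor and no blocking, the MLE of each group's rate
    is sum(counts) / sum(exposure), so no iterative fitting is needed.
    """
    rate_a = sum(counts_a) / sum(exposure_a)
    rate_b = sum(counts_b) / sum(exposure_b)
    return math.log2(rate_b / rate_a)

# Hypothetical spectral counts for one protein in two conditions, with
# hypothetical per-sample exposures (e.g. number of cells, in millions).
counts_a, cells_a = [10, 15, 5], [1.0, 1.5, 0.5]
counts_b, cells_b = [40, 50, 30], [2.0, 2.5, 1.5]

lfc = poisson_log2_fold_change(counts_a, cells_a, counts_b, cells_b)
print(lfc)  # 1.0: twice as many counts per cell in condition B
```

Raw counts in condition B are four times higher, but after accounting for exposure the per-cell rate only doubles, which is the kind of distortion the offset is meant to remove.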

    Viral quasispecies diversity and evolution : a Bioinformatics molecular approach

    Over the last 10 years, the viral hepatitis group of the Vall d'Hebron Institut de Recerca (VHIR) in Barcelona has been developing experimental and computational methodological solutions for the study of complex virus populations (quasispecies) through the application of next-generation sequencing (NGS) techniques. This book is a selection of empirical works on viral quasispecies. By offering this open publication format, the aim is, on the one hand, to provide a useful tool for all researchers interested in this field and, on the other, to bring this area of knowledge to the wider scientific community who, without necessarily being experts, wish to learn in more detail about the evolution and diversity of viruses. The first three works examine in depth the use, interpretation and utility of biodiversity indices, some specific to genetic populations and others imported from the field of ecology. The second part highlights some limitations of these diversity indices and addresses the development of integrative tools that provide a more direct interpretation in biological and clinical terms. The sections preceding the six works mentioned place the reader in the context in which the developments were carried out and explain their need and utility. The book closes with a section that gathers the general observations and conclusions of the works, and another that reflects on the limitations involved in studying complex and dynamic systems such as viral quasispecies.

    Quantifying In-Host Quasispecies Evolution

    Keywords: Mutagenesis; Quasispecies evolution; Viral treatment
    What takes decades, centuries or millennia to happen in a natural ecosystem takes only days, weeks or months in a viral quasispecies replicating in a host, especially under treatment. Some methods to quantify the evolution of a quasispecies are introduced and discussed, along with simple simulated examples to help in interpreting and understanding the results. The proposed methods treat the molecules in a quasispecies as individuals of competing species in an ecosystem: the haplotypes are the competing species, the quasispecies in a host is the ecosystem, and the evolution of the system is quantified by monitoring changes in haplotype frequencies. The correlation between the proposed indices is also discussed, and the R code used to generate the simulations, the data and the plots is provided. Finally, the virtues of the proposed indices are shown on a clinical case.
    This study was partially supported by Pla Estratègic de Recerca i Innovació en Salut (PERIS), Direcció General de Recerca i Innovació en Salut (DGRIS), Catalan Health Ministry, Generalitat de Catalunya; the Spanish Network for the Research in Infectious Diseases (REIPI RD16/0016/0003) from the European Regional Development Fund (ERDF); Centro para el Desarrollo Tecnológico Industrial (CDTI) from the Spanish Ministry of Economy and Business, grant number IDI-20200297; grants PI19/00301 and PI22/00258 from Instituto de Salud Carlos III cofinanced by the European Regional Development Fund (ERDF); and Gilead’s biomedical research project GLD21/00006.
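
The idea of quantifying evolution by monitoring changes in haplotype frequencies can be sketched in Python. The two measures below (Shannon entropy and the total variation distance between two time points) are standard choices picked for illustration; they are assumptions here and not necessarily the indices proposed in the paper. The haplotype labels and read counts are hypothetical.

```python
import math

def frequencies(read_counts):
    """Convert haplotype read counts into relative frequencies."""
    total = sum(read_counts.values())
    return {hap: n / total for hap, n in read_counts.items()}

def shannon_entropy(freqs):
    """Shannon entropy (natural log) of a haplotype distribution."""
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)

def total_variation(freqs_t0, freqs_t1):
    """Half the sum of absolute frequency changes; ranges from 0 to 1."""
    haplotypes = set(freqs_t0) | set(freqs_t1)
    return 0.5 * sum(abs(freqs_t0.get(h, 0.0) - freqs_t1.get(h, 0.0))
                     for h in haplotypes)

# Hypothetical read counts per haplotype at two sampling times.
t0 = frequencies({"hapA": 900, "hapB": 100})
t1 = frequencies({"hapA": 500, "hapB": 400, "hapC": 100})

change = total_variation(t0, t1)
print(round(shannon_entropy(t0), 3), round(shannon_entropy(t1), 3), round(change, 3))
```

In this toy case the dominant haplotype loses ground and a new one appears, so both the within-sample diversity and the between-sample distance increase.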

    Whole-genome characterization and resistance-associated substitutions in a new HCV genotype 1 subtype

    Keywords: HCV; Direct-acting antivirals; Genotype 1
    Hepatitis C virus (HCV) is a highly variable infectious agent, classified into 8 genotypes and 86 subtypes. Our laboratory has implemented an in-house, high-resolution HCV subtyping method based on next-generation sequencing (NGS) for error-free classification of the virus, using phylogenetic analysis and analysis of genetic distances between sequences from patient samples and reference sequences. During routine diagnostics, a sample from a patient from Equatorial Guinea could not be classified into any of the existing subtypes. The whole genome was analyzed to confirm that the new isolate could be classified as a new HCV subtype. In addition, naturally occurring resistance-associated substitutions (RAS) were analyzed by NGS. Whole-genome analysis based on p-distances suggests that the sample belongs to a new HCV genotype 1 subtype. Several RAS in the NS3 (S122T, D168E and I170V) and NS5A proteins (Q(1b)24K, R(1b)30Q and Y93L+Y93F) were found, which could limit the use of some inhibitors for treating this subtype. RAS studies of new subtypes are of great interest for tailoring treatment, as no data on treatment efficacy are reported. In our case, the patient has not yet been treated, and the RAS report will be used to design the most effective treatment.
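
A p-distance is the proportion of aligned sites at which two sequences differ. The Python sketch below illustrates that definition only; the toy sequences and the decision to skip gap positions are assumptions, and it does not reproduce the subtyping method described in the abstract.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap are ignored (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

# Toy aligned fragments (hypothetical, not real HCV sequences).
query = "ACGTACGTAC"
reference = "ACGTTCGTAA"
dist = p_distance(query, reference)
print(dist)  # 0.2: two of ten sites differ
```

In practice such distances would be computed against a panel of reference sequences and compared with the distances observed within and between known subtypes.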

    Population Disequilibrium as Promoter of Adaptive Explorations in Hepatitis C Virus

    Keywords: Coronavirus SARS-CoV-2; COVID-19; 2019-nCoV; Hepatitis C virus; Universal vaccines
    Replication of RNA viruses is characterized by exploration of sequence space, which facilitates their adaptation to changing environments. It is generally accepted that such exploration takes place mainly in response to positive selection, and that further diversification is boosted by modifications of virus population size, particularly bottleneck events. Our recent results with hepatitis C virus (HCV) have shown that the expansion in sequence space of a viral clone continues despite prolonged replication in a stable cell culture environment. Diagnosis of the expansion was based on the quantification of diversity indices, the occurrence of intra-population mutational waves (variations in mutant frequencies), and greater individual residue variations in mutant spectra than those anticipated from sequence alignments in data banks. In the present report, we review our previous results, and show additionally that mutational waves in amplicons from the NS5A-NS5B-coding region are equally prominent during HCV passage in the absence or presence of the mutagenic nucleotide analogues favipiravir or ribavirin. In addition, by extending our previous analysis to amplicons of the NS3- and NS5A-coding region, we provide further evidence of the incongruence between amino acid conservation scores in mutant spectra from infected patients and in the Los Alamos National Laboratory HCV data banks. We hypothesize that these observations have as a common origin a permanent state of HCV population disequilibrium, even upon extensive viral replication in the absence of external selective constraints or changes in population size. Such a persistent disequilibrium, revealed by the changing composition of the mutant spectrum, may facilitate finding alternative mutational pathways for HCV antiviral resistance. The possible significance of our model for other genetically variable viruses is discussed.
    The work at CBMSO was supported by grants SAF2014-52400-R from Ministerio de Economía y Competitividad (MINECO), SAF2017-87846-R and BFU2017-91384-EXP from Ministerio de Ciencia, Innovación y Universidades (MCIU), PI18/00210 from Instituto de Salud Carlos III, S2013/ABI-2906 (PLATESA from Comunidad de Madrid/FEDER), and S2018/BAA-4370 (PLATESA2 from Comunidad de Madrid/FEDER). C.P. is supported by the Miguel Servet program of the Instituto de Salud Carlos III (CPII19/00001), cofinanced by the European Regional Development Fund (ERDF). CIBERehd (Centro de Investigación en Red de Enfermedades Hepáticas y Digestivas) is funded by Instituto de Salud Carlos III. Institutional grants from the Fundación Ramón Areces and Banco Santander to the CBMSO are also acknowledged. The team at CBMSO belongs to the Global Virus Network (GVN). The work in Barcelona was supported by Instituto de Salud Carlos III, cofinanced by the European Regional Development Fund (ERDF), Grant No. PI19/00301, and by the Centro para el Desarrollo Tecnológico Industrial (CDTI) from the MICIU, Grant No. IDI-20200297. Work at CAB was supported by MINECO grants BIO2016-79618R and PID2019-104903RB-I00 (funded by the EU under the FEDER program) and by the Spanish State research agency (AEI) through project number MDM-2017-0737 Unidad de Excelencia “María de Maeztu”-Centro de Astrobiología (CSIC-INTA). C.G.-C. is supported by predoctoral contract PRE2018-083422 from MCIU. B.M.-G. is supported by predoctoral contract PFIS FI19/00119 from Instituto de Salud Carlos III (Ministerio de Sanidad y Consumo), cofinanced by Fondo Social Europeo (FSE).

    The Critical role of codon composition on the translation efficiency robustness of the Hepatitis A virus capsid

    Hepatoviruses show an intriguing deviated codon usage, suggesting an evolutionary signature. Abundant and rare codons in the cellular genome are scarce in the human hepatitis A virus (HAV) genome, while intermediately abundant host codons are abundant in the virus. Genotype-phenotype maps, or fitness landscapes, are a means of representing a genotype position in sequence space and uncovering how genotype relates to phenotype and fitness. Using genotype-phenotype maps of the translation efficiency, we have shown the critical role of the HAV capsid codon composition in regulating translation and determining its robustness. Adaptation to an environmental perturbation such as the artificial induction of cellular shutoff not naturally occurring in HAV infection involved movements in the sequence space and dramatic changes of the translation efficiency. Capsid rare codons, including abundant and rare codons of the cellular genome, slowed down the translation efficiency in conditions of no cellular shutoff. In contrast, rare capsid codons that are abundant in the cellular genome were efficiently translated in conditions of shutoff. Capsid regions very rich in slowly translated codons adapt to shutoff through sequence space movements from positions with highly robust translation to others with diminished translation robustness. These movements paralleled decreases of the capsid physical and biological robustness, and resulted in the diversification of capsid phenotypes. The deviated codon usage of extant hepatoviruses compared with that of their hosts may suggest the occurrence of a virus ancestor with an optimized codon usage with respect to an unknown ancient host

    Characterization of intra- and inter-host norovirus P2 genetic variability in linked individuals by amplicon sequencing

    Noroviruses are the main cause of epidemics of acute gastroenteritis at a global scale. Although chronically infected immunocompromised individuals are regarded as potential reservoirs for the emergence of new viral variants, viral quasispecies distribution and evolution patterns in acute symptomatic and asymptomatic infections have not been extensively studied. Amplicons of 450 nts from the P2 coding capsid domain were studied using a next-generation sequencing (454/GS-Junior) platform. Inter-host diversity between symptomatic and asymptomatic acutely infected individuals linked to the same outbreak, as well as their viral intra-host diversity over time, were characterized. With an average of 2848 reads per sample and a cutoff frequency of 0.1%, minor variant haplotypes were detected in 5 out of 8 specimens. Transmitted variants could not be confirmed in all infected individuals in one outbreak. The observed initial intra-host viral diversity in asymptomatically infected subjects was higher than in symptomatic ones. Viral quasispecies evolution over time within individuals was host-specific, with an average of 2.8 nt changes per day (0.0062 changes per nucleotide per day) in a given symptomatic case. Nucleotide polymorphisms were detected in 28 out of 450 analyzed nucleotide positions, 32.14% of which were synonymous and 67.86% non-synonymous. Most observed amino acid changes emerged at or near blockade epitopes A, B, D and E. Our results suggest that acutely infected individuals may contribute to norovirus genetic variability and evolution even in the absence of symptoms; such asymptomatic infections go underreported and may enhance transmission.
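
The rates and percentages quoted above follow from simple ratios, which a short Python check makes explicit. The count of 9 synonymous positions is inferred here from the reported 32.14% of 28 polymorphic positions; it is an assumption, not a figure stated in the abstract.

```python
amplicon_length = 450      # nucleotides in the P2 amplicon
changes_per_day = 2.8      # average nt changes per day (symptomatic case)
polymorphic_sites = 28     # positions with nucleotide polymorphisms
synonymous_sites = 9       # assumed: 9/28 matches the reported 32.14%
mean_reads = 2848          # average reads per sample
cutoff = 0.001             # 0.1% frequency cutoff

per_site_rate = changes_per_day / amplicon_length
syn_pct = 100 * synonymous_sites / polymorphic_sites
nonsyn_pct = 100 - syn_pct
reads_at_cutoff = cutoff * mean_reads  # reads at the cutoff for an average sample

print(f"{per_site_rate:.4f}")             # 0.0062
print(f"{syn_pct:.2f} {nonsyn_pct:.2f}")  # 32.14 67.86
print(f"{reads_at_cutoff:.1f}")           # 2.8
```

The last figure shows that, at the average depth, a haplotype sitting exactly at the 0.1% cutoff is supported by only about three reads.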

    Deep-sequencing study of HCV G4a resistance-associated substitutions in Egyptian patients failing DAA treatment

    Keywords: Resistance-associated substitutions; RAS; Subtype 4a
    Purpose: To study resistance-associated substitutions (RAS) using next-generation sequencing in Egyptian hepatitis C virus-infected patients failing direct-acting antiviral (DAA) treatment.
    Methods: The current study describes three cases of treatment failure in patients referred to Zagazig Viral Hepatitis Treatment Center (ZVHTC), Sharkia Governorate, Egypt. Patient 1 had breakthrough while receiving a daclatasvir (DCV)+sofosbuvir (SOF) regimen, patient 2 relapsed after treatment with DCV+SOF+ribavirin (RBV), and patient 3 relapsed after DCV+SOF therapy. A serum sample was collected from each patient at failure and sent to Vall d’Hebron Research Institute at Hospital Universitari Vall d’Hebron in Barcelona (Spain) for a deep-sequencing study to identify and characterize the RAS present in the samples.
    Results: The following were identified: L28M, L30S and L28M+L30S in patient 1; L30R in patient 2; and R155C, D168E, L28M, L30H, L30S, L28M+L30H, and L28M+L30S in patient 3.
    Conclusion: To the best of our knowledge, this is the first report from Egypt of patients failing DAA-based therapy that describes the associated RAS. This information will help to understand the natural history of HCV in Egyptian patients and guide the proper choice of retreatment protocols.
    This study was supported by the Spanish Ministry of Health, Consumer Affairs, and Social Welfare, grant name: Plan Estrategico Nacional contra la Hepatitis C. This study was also funded by Instituto de Salud Carlos III, PI15/00856 and PI16/00337, cofinanced by CIBERehd (Consorcio Centro de Investigacion en Red de Enfermedades Hepaticas y Digestivas), which is funded by Instituto de Salud Carlos III and Centro para el Desarrollo Tecnologico Industrial (CDTI) from the Spanish Ministry of Economy and Business, grant number IDI-2015112.

    Improving virus production through quasispecies genomic selection and molecular breedings

    Virus production is still a challenging issue in antigen manufacture, particularly with slow-growing viruses. Deep sequencing of genomic regions indicative of efficient replication may be used to identify high-fitness minority individuals suppressed by the ensemble of mutants in a virus quasispecies. Molecular breeding of quasispecies containing colonizer individuals, under regimes allowing more than one replicative cycle, is a strategy to select the fittest competitors among the colonizers. A slow-growing cell culture-adapted hepatitis A virus strain was employed as a model for this strategy. Using genomic selection in two regions predictive of efficient translation, the internal ribosome entry site and the VP1-coding region, high-fitness minority colonizer individuals were identified in a population adapted to conditions of artificially induced cellular transcription shut-off. Molecular breeding of this population with a second one, also adapted to transcription shut-off and showing an overall colonizer phenotype, allowed the selection of a fast-growing population of great biotechnological potential.

    Analysis of hepatitis B virus preS1 variability and prevalence of the rs2296651 polymorphism in a Spanish population

    The aim was to determine the variability/conservation of the domain of the hepatitis B virus (HBV) preS1 region that interacts with sodium-taurocholate cotransporting polypeptide (hereafter, NTCP-interacting domain) and the prevalence of the rs2296651 polymorphism (S267F, NTCP variant) in a Spanish population. Serum samples from 246 individuals were included and divided into 3 groups: patients with chronic HBV infection (CHB) (n = 41, 73% Caucasians), patients with resolved HBV infection (n = 100, 100% Caucasians) and an HBV-uninfected control group (n = 105, 100% Caucasians). Variability/conservation of the amino acid (aa) sequences of the NTCP-interacting domain (aa 2-48 in viral genotype D) and of a highly conserved preS1 domain associated with virion morphogenesis (aa 92-103 in viral genotype D) were analyzed by next-generation sequencing and compared in 18 CHB patients with viremia > 4 log IU/mL. The rs2296651 polymorphism was determined in all individuals in all 3 groups using an in-house real-time PCR melting curve analysis. The HBV preS1 NTCP-interacting domain showed a high degree of conservation among the examined viral genomes, especially between aa 9 and 21 (in the genotype D consensus sequence). As compared with the virion morphogenesis domain, the NTCP-interacting domain had a smaller proportion of HBV genotype-unrelated changes comprising > 1% of the quasispecies (25.5% vs 31.8%), but a larger proportion of genotype-associated viral polymorphisms (34% vs 27.3%), according to consensus sequences from GenBank patterns of HBV genotypes A to H. Variation/conservation in both domains depended on viral genotype, with genotype C being the most highly conserved and genotype E the most variable (a limited finding, as only 2 of genotype E were included). Of note, proline residues were highly conserved in both domains, and serine residues showed changes only to threonine or tyrosine in the virion morphogenesis domain. The rs2296651 polymorphism was not detected in any participant. In our CHB population, the NTCP-interacting domain was highly conserved, particularly the proline residues and essential amino acids related to the NTCP interaction, and the prevalence of rs2296651 was low/null.
    Other funding: Cofinanced by the European Regional Development Fund (ERDF); and the Gilead Fellowship Program, No. GLD14-00296.