151 research outputs found

    Finding conserved patterns in biological sequences, networks and genomes

    Biological patterns are widely used for identifying biologically interesting regions within macromolecules, classifying biological objects, predicting functions and studying evolution. Good pattern-finding algorithms help biologists formulate and validate hypotheses and gain insight into the complex mechanisms of living things. In this dissertation, we aim to improve and develop algorithms for five biological pattern-finding problems. For the multiple sequence alignment problem, we propose an alternative formulation in which a final alignment is obtained by preserving pairwise alignments specified by the edges of a given tree. In contrast with traditional NP-hard formulations, our preserving alignment formulation can be solved in polynomial time without using a heuristic, while having very good accuracy. For the path matching problem, we take advantage of the linearity of the query path to reduce the problem to finding a longest weighted path in a directed acyclic graph. We can find the k top-scoring paths in a network for a given query path in polynomial time. As many biological pathways are not linear, our graph matching approach allows a non-linear graph query to be given. Our graph matching formulation overcomes a common weakness of previous approaches, which provide no guarantee on the quality of the results. For the gene cluster finding problem, we investigate a formulation based on constraining the overall size of a cluster and develop statistical significance estimates that allow direct comparisons of clusters of different sizes. We explore both a restricted version, which requires that orthologous genes are strictly ordered within each cluster, and the unrestricted problem, which allows paralogous genes within a genome and clusters that may not appear in every genome. We solve the first problem in polynomial time and develop practical exact algorithms for the second one.
In the gene cluster querying problem, we use a querying strategy to propose an efficient approach for investigating the clustering of related genes across multiple genomes for a given gene cluster. By analyzing gene clustering in 400 bacterial genomes, we show that our algorithm is efficient enough to study gene clusters across hundreds of genomes.
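The path matching reduction mentioned in the abstract above relies on a standard fact: a longest weighted path in a directed acyclic graph can be found in linear time by dynamic programming over a topological order. The sketch below illustrates only that generic technique. It is not the dissertation's algorithm; the function name `longest_weighted_path`, the integer node numbering, and the edge-list input are assumptions made for illustration, and it returns a single best path rather than the k best.

```python
from collections import defaultdict, deque


def longest_weighted_path(num_nodes, edges):
    """Return (best_score, path) for a highest-weight path in a DAG.

    Hypothetical helper for illustration: nodes are 0..num_nodes-1 and
    edges is a list of (source, target, weight) tuples.
    """
    adjacency = defaultdict(list)
    indegree = [0] * num_nodes
    for source, target, weight in edges:
        adjacency[source].append((target, weight))
        indegree[target] += 1

    # Kahn's algorithm: repeatedly remove nodes with no remaining predecessors.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for target, _ in adjacency[node]:
            indegree[target] -= 1
            if indegree[target] == 0:
                queue.append(target)
    if len(order) != num_nodes:
        raise ValueError("graph contains a cycle, so it is not a DAG")

    # best[v] is the best score of any path ending at v (a path may start anywhere).
    best = [0.0] * num_nodes
    parent = [None] * num_nodes
    for node in order:
        for target, weight in adjacency[node]:
            if best[node] + weight > best[target]:
                best[target] = best[node] + weight
                parent[target] = node

    # Trace back from the best end node to recover the path.
    end = max(range(num_nodes), key=lambda n: best[n])
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[end], path[::-1]


if __name__ == "__main__":
    toy_edges = [(0, 1, 2.0), (0, 2, 1.0), (1, 3, 1.5), (2, 3, 4.0)]
    print(longest_weighted_path(4, toy_edges))
```

Because every node and edge is visited a constant number of times, the running time is linear in the size of the graph, which is consistent with the polynomial-time claim in the abstract.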

    Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements

    Background: Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins. Results: By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures, each containing a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states. Conclusion: Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at http://faculty.cs.tamu.edu/shsze/ssfold.

    MARVEL: A Randomized Double‐Blind, Placebo‐Controlled Trial in Patients Undergoing Endovascular Therapy: Study Rationale and Design

    BACKGROUND: Steroids have pleiotropic neuroprotective actions, including the regulation of inflammation and apoptosis, which may influence the effects of ischemia on neurons, glial cells, and blood vessels. The effect of low‐dose methylprednisolone in patients with acute ischemic stroke in the endovascular therapy era remains unknown. This trial investigates the efficacy and safety of low‐dose methylprednisolone (2 mg/kg IV for 3 days) as adjunctive therapy for patients with acute ischemic stroke undergoing endovascular therapy within 24 hours from symptom onset. METHODS: The MARVEL (Methylprednisolone as Adjunctive Therapy for Acute Large Vessel Occlusion: A Randomized Double‐Blind, Placebo‐Controlled Trial in Patients Undergoing Endovascular Therapy) trial is an investigator‐initiated, prospective, randomized, double‐blind, placebo‐controlled multicenter clinical trial. Up to 1672 eligible patients with anterior circulation large‐vessel occlusion stroke presenting within 24 hours from symptom onset are planned to be consecutively randomized to receive methylprednisolone or placebo in a 1:1 ratio across 82 stroke centers in China. RESULTS: The primary outcome is the ordinal shift in the modified Rankin scale score at 90 days. Secondary outcomes include 90‐day functional independence (modified Rankin scale score, 0–2). The primary safety end points include mortality rate at 90 days and symptomatic intracerebral hemorrhage within 48 hours of endovascular therapy. CONCLUSION: The MARVEL trial will provide evidence of the efficacy and safety of low‐dose methylprednisolone as adjunctive therapy for patients with anterior circulation large‐vessel occlusion stroke undergoing endovascular therapy.

    Fine mapping and candidate gene analysis of gynoecy trait in chieh-qua (Benincasa hispida Cogn. var. chieh-qua How)

    Gynoecy enables earlier production of hybrids and a higher yield, and it improves the efficiency of hybrid seed production. Therefore, the utilization of gynoecy is beneficial for the genetic breeding of chieh-qua. However, little has been reported so far about gynoecy-related genes in chieh-qua. Here, we used an F2 population from the cross between the gynoecious line ‘A36’ and the monoecious line ‘SX’ for genetic mapping and revealed that chieh-qua gynoecy was regulated by a single recessive gene. We fine-mapped it to a 530-kb region flanked by the markers Indel-3 and KASP145 on Chr.8, which harbors eight candidate genes. One of the candidate genes, Bhi08G000345, encoding networked protein 4 (CqNET4), contained a non-synonymous SNP resulting in the amino acid substitution of isoleucine (ATA; I) to methionine (ATG; M). CqNET4 was prominently expressed in the female flower, and only three genes related to ethylene synthesis showed significant expression differences between ‘A36’ and ‘SX.’ The results presented here support CqNET4 as the most likely candidate gene for chieh-qua gynoecy, which differs from the previously reported gynoecious genes.
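The substitution reported for CqNET4 can be checked against the standard genetic code, in which ATA encodes isoleucine and ATG encodes methionine. The snippet below is a minimal sketch of that check: the partial codon table covers only the four codons needed here, and the helper name `describe_substitution` is hypothetical rather than anything from the study.

```python
# Partial standard genetic code: only the isoleucine and methionine codons.
CODON_TO_AMINO_ACID = {"ATA": "I", "ATT": "I", "ATC": "I", "ATG": "M"}


def describe_substitution(ref_codon, alt_codon):
    """Compare two codons and report the amino acids and changed positions."""
    ref_aa = CODON_TO_AMINO_ACID[ref_codon]
    alt_aa = CODON_TO_AMINO_ACID[alt_codon]
    # 1-based positions within the codon where the nucleotides differ.
    changed = [
        i + 1
        for i, (ref_base, alt_base) in enumerate(zip(ref_codon, alt_codon))
        if ref_base != alt_base
    ]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, changed, kind


if __name__ == "__main__":
    print(describe_substitution("ATA", "ATG"))
```

A single change at the third codon position (A to G) is enough to turn isoleucine into methionine, which is why the SNP is classed as non-synonymous.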

    High prevalence of hyperglycaemia and the impact of high household income in transforming Rural China

    Background: The prevalence of hyperglycaemia and its association with socioeconomic factors have been well studied in developed countries; however, little is known about them in transforming rural China. Methods: A cross-sectional study was carried out in 4 rural communities of Deqing County, located in East China, in 2006-07, including 4,506 subjects aged 18 to 64 years. Fasting plasma glucose (FPG) was measured. Subjects were considered to have impaired fasting glucose (IFG) if FPG was in the range from 5.6 to 6.9 mmol/L and to have diabetes mellitus (DM) if FPG was 7.0 mmol/L or above. Results: The crude prevalences of IFG and DM were 5.4% and 2.2%, respectively. The average ratio of IFG/DM was 2.5, and tended to be higher for those under the age of 35 years than for older subjects. After adjustment for covariates including age (continuous), sex, BMI (continuous), smoking, alcohol drinking, and regular leisure physical activity, subjects in the high household income group had a significantly higher risk of IFG compared with the medium household income group (OR: 1.74, 95% CI: 1.11-2.72), and no significant difference in IFG was observed between the low and medium household income groups. Education and farmer occupation were not significantly associated with IFG. Conclusions: High household income was significantly associated with an increased risk of IFG. A high ratio of IFG/DM suggests a high risk of diabetes in the foreseeable future in the transforming rural communities of China.
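The case definitions in the Methods can be written as a small classifier. The sketch below uses only the thresholds stated in the abstract; the function name `classify_fpg` and the label "normal" for values below 5.6 mmol/L are assumptions for illustration. It also shows that dividing the crude prevalences (5.4% and 2.2%) gives about 2.5, in line with the reported ratio, although the abstract does not say exactly how its average ratio was computed.

```python
def classify_fpg(fpg_mmol_per_l):
    """Classify a fasting plasma glucose value using the abstract's cut-offs."""
    if fpg_mmol_per_l >= 7.0:
        return "DM"      # diabetes mellitus: 7.0 mmol/L or above
    if fpg_mmol_per_l >= 5.6:
        return "IFG"     # impaired fasting glucose: 5.6 to 6.9 mmol/L
    return "normal"      # label assumed here; the abstract does not name this group


# Ratio of the crude prevalences reported in the Results.
crude_ifg_percent = 5.4
crude_dm_percent = 2.2
ifg_to_dm_ratio = crude_ifg_percent / crude_dm_percent

if __name__ == "__main__":
    for value in (5.2, 5.6, 6.9, 7.0):
        print(value, classify_fpg(value))
    print(round(ifg_to_dm_ratio, 1))
```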

    Tirofiban for Stroke without Large or Medium-Sized Vessel Occlusion

    The effects of the glycoprotein IIb/IIIa receptor inhibitor tirofiban in patients with acute ischemic stroke but who have no evidence of complete occlusion of large or medium-sized vessels have not been extensively studied. In a multicenter trial in China, we enrolled patients with ischemic stroke without occlusion of large or medium-sized vessels and with a National Institutes of Health Stroke Scale score of 5 or more and at least one moderately to severely weak limb. Eligible patients had any of four clinical presentations: ineligible for thrombolysis or thrombectomy and within 24 hours after the patient was last known to be well; progression of stroke symptoms 24 to 96 hours after onset; early neurologic deterioration after thrombolysis; or thrombolysis with no improvement at 4 to 24 hours. Patients were assigned to receive intravenous tirofiban (plus oral placebo) or oral aspirin (100 mg per day, plus intravenous placebo) for 2 days; all patients then received oral aspirin until day 90. The primary efficacy end point was an excellent outcome, defined as a score of 0 or 1 on the modified Rankin scale (range, 0 [no symptoms] to 6 [death]) at 90 days. Secondary end points included functional independence at 90 days and a quality-of-life score. The primary safety end points were death and symptomatic intracranial hemorrhage. A total of 606 patients were assigned to the tirofiban group and 571 to the aspirin group. Most patients had small infarctions that were presumed to be atherosclerotic. The percentage of patients with a score of 0 or 1 on the modified Rankin scale at 90 days was 29.1% with tirofiban and 22.2% with aspirin (adjusted risk ratio, 1.26; 95% confidence interval, 1.04 to 1.53, P = 0.02). Results for secondary end points were generally not consistent with the results of the primary analysis. Mortality was similar in the two groups. The incidence of symptomatic intracranial hemorrhage was 1.0% in the tirofiban group and 0% in the aspirin group. 
In this trial involving heterogeneous groups of patients with stroke of recent onset or progression of stroke symptoms and nonoccluded large and medium-sized cerebral vessels, intravenous tirofiban was associated with a greater likelihood of an excellent outcome than low-dose aspirin. Incidences of intracranial hemorrhage were low but slightly higher with tirofiban.
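As a reading aid for the primary result, the crude effect sizes can be recomputed from the two reported percentages. This is only arithmetic on the published figures, with variable names chosen for illustration. The crude ratio is not expected to equal the reported risk ratio of 1.26, because that estimate is adjusted and the abstract does not list the adjustment variables.

```python
# Percentages with a modified Rankin scale score of 0 or 1 at 90 days, as reported.
excellent_tirofiban_percent = 29.1
excellent_aspirin_percent = 22.2

# Crude (unadjusted) effect measures derived from those two numbers only.
crude_risk_ratio = excellent_tirofiban_percent / excellent_aspirin_percent
absolute_difference_points = excellent_tirofiban_percent - excellent_aspirin_percent

if __name__ == "__main__":
    print(f"crude risk ratio: {crude_risk_ratio:.2f}")
    print(f"absolute difference: {absolute_difference_points:.1f} percentage points")
```

The crude ratio comes out near 1.31, slightly above the adjusted estimate, with an absolute difference of about 6.9 percentage points.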

    Methylprednisolone as Adjunct to Endovascular Thrombectomy for Large-Vessel Occlusion Stroke

    Importance: It is uncertain whether intravenous methylprednisolone improves outcomes for patients with acute ischemic stroke due to large-vessel occlusion (LVO) undergoing endovascular thrombectomy. Objective: To assess the efficacy and adverse events of adjunctive intravenous low-dose methylprednisolone to endovascular thrombectomy for acute ischemic stroke secondary to LVO. Design, Setting, and Participants: This investigator-initiated, randomized, double-blind, placebo-controlled trial was implemented at 82 hospitals in China, enrolling 1680 patients with stroke and proximal intracranial LVO presenting within 24 hours of time last known to be well. Recruitment took place between February 9, 2022, and June 30, 2023, with a final follow-up on September 30, 2023. Interventions: Eligible patients were randomly assigned to intravenous methylprednisolone (n = 839) at 2 mg/kg/d or placebo (n = 841) for 3 days adjunctive to endovascular thrombectomy. Main Outcomes and Measures: The primary efficacy outcome was disability level at 90 days as measured by the overall distribution of the modified Rankin Scale scores (range, 0 [no symptoms] to 6 [death]). The primary safety outcomes included mortality at 90 days and the incidence of symptomatic intracranial hemorrhage within 48 hours. Results: Among 1680 patients randomized (median age, 69 years; 727 female [43.3%]), 1673 (99.6%) completed the trial. The median 90-day modified Rankin Scale score was 3 (IQR, 1-5) in the methylprednisolone group vs 3 (IQR, 1-6) in the placebo group (adjusted generalized odds ratio for a lower level of disability, 1.10 [95% CI, 0.96-1.25]; P = .17). In the methylprednisolone group, there was a lower mortality rate (23.2% vs 28.5%; adjusted risk ratio, 0.84 [95% CI, 0.71-0.98]; P = .03) and a lower rate of symptomatic intracranial hemorrhage (8.6% vs 11.7%; adjusted risk ratio, 0.74 [95% CI, 0.55-0.99]; P = .04) compared with placebo.
Conclusions and Relevance: Among patients with acute ischemic stroke due to LVO undergoing endovascular thrombectomy, adjunctive methylprednisolone added to endovascular thrombectomy did not significantly improve the degree of overall disability. Trial Registration: ChiCTR.org.cn Identifier: ChiCTR210005172

    Large-scale analysis of gene clustering in bacteria

    An important strategy to study operons and their evolution is to investigate clustering of related genes across multiple bacterial genomes. Although existing algorithms are available that can identify gene clusters across two or more genomes, very few algorithms are efficient enough to study gene clusters across hundreds of genomes. We observe that a querying strategy can be used to analyze gene clusters across a large number of genomes and develop an efficient algorithm to identify all related clusters on a genome from a given query cluster. We use this algorithm to study gene clustering in 400 bacterial genomes by starting from a well-characterized list of operons in Escherichia coli K12 and perform comparative analysis of operon occurrences, gene orientations, and rearrangements both within and across clusters. We show that important biological insights can be obtained by comparing results across these categories. A software program implementing the algorithm (GCQuery) and supplementary data containing detailed results are available at http://faculty.cs.tamu.edu/shsze/gcquery