151 research outputs found
Finding conserved patterns in biological sequences, networks and genomes
Biological patterns are widely used for identifying biologically interesting regions
within macromolecules, classifying biological objects, predicting functions and studying
evolution. Good pattern finding algorithms will help biologists to formulate and
validate hypotheses in an attempt to obtain important insights into the complex
mechanisms of living things.
In this dissertation, we aim to improve and develop algorithms for five biological
pattern finding problems. For the multiple sequence alignment problem, we propose
an alternative formulation in which a final alignment is obtained by preserving pairwise
alignments specified by edges of a given tree. In contrast with traditional NP-hard
formulations, our preserving alignment formulation can be solved in polynomial
time without using a heuristic, while having very good accuracy.
For the path matching problem, we take advantage of the linearity of the query
path to reduce the problem to finding a longest weighted path in a directed acyclic
graph. We can find k paths with top scores in a network from the query path in
polynomial time. As many biological pathways are not linear, our graph matching
approach allows a non-linear graph query to be given. Our graph matching formulation
overcomes the common weakness of previous approaches that there is no
guarantee on the quality of the results.
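The reduction in the path matching paragraph relies on a standard fact: the highest-scoring path in a weighted directed acyclic graph can be found by dynamic programming over a topological order. The following is a minimal Python sketch of that general technique only. The function name `longest_path_dag`, the edge-list input, and the toy graph are assumptions made for illustration; the dissertation's actual graph construction and scoring scheme are not reproduced here.

```python
from collections import defaultdict, deque

def longest_path_dag(num_nodes, edges):
    """Return (best_score, path) for a maximum-weight path in a DAG.

    edges: list of (u, v, weight) with nodes numbered 0..num_nodes-1.
    A path may start and end at any node (hypothetical convention).
    """
    adj = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1

    # Kahn's algorithm: build a topological order of the nodes.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != num_nodes:
        raise ValueError("graph contains a cycle")

    best = [0.0] * num_nodes   # best score of any path ending at each node
    prev = [None] * num_nodes  # predecessor on that best path
    for u in order:
        for v, w in adj[u]:
            if best[u] + w > best[v]:
                best[v] = best[u] + w
                prev[v] = u

    # Trace back from the node with the highest score.
    end = max(range(num_nodes), key=lambda n: best[n])
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = prev[node]
    return best[end], path[::-1]
```

On a toy graph with edges `(0, 1, 2.0)`, `(0, 2, 1.0)`, `(1, 3, 1.0)`, `(2, 3, 5.0)`, the sketch returns the path `[0, 2, 3]` with score `6.0`. Reporting the k top-scoring paths, as the abstract describes, would require keeping several candidate scores per node rather than one; that extension is not shown.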
For the gene cluster finding problem, we investigate a formulation based on constraining the overall size of a cluster and develop statistical significance estimates that
allow direct comparisons of clusters of different sizes. We explore both a restricted
version which requires that orthologous genes are strictly ordered within each cluster,
and the unrestricted problem that allows paralogous genes within a genome and clusters
that may not appear in every genome. We solve the first problem in polynomial
time and develop practical exact algorithms for the second one.
For the gene cluster querying problem, we propose an efficient query-based
approach for investigating clustering of related genes across multiple
genomes for a given gene cluster. By analyzing gene clustering in 400 bacterial
genomes, we show that our algorithm is efficient enough to study gene clusters across
hundreds of genomes.
Predicting protein folding pathways at the mesoscopic level based on native interactions between secondary structure elements
Background
Since experimental determination of protein folding pathways remains difficult, computational techniques are often used to simulate protein folding. Most current techniques to predict protein folding pathways are computationally intensive and are suitable only for small proteins.
Results
By assuming that the native structure of a protein is known and representing each intermediate conformation as a collection of fully folded structures in which each of them contains a set of interacting secondary structure elements, we show that it is possible to significantly reduce the conformation space while still being able to predict the most energetically favorable folding pathway of large proteins with hundreds of residues at the mesoscopic level, including the pig muscle phosphoglycerate kinase with 416 residues. The model is detailed enough to distinguish between different folding pathways of structurally very similar proteins, including the streptococcal protein G and the peptostreptococcal protein L. The model is also able to recognize the differences between the folding pathways of protein G and its two structurally similar variants NuG1 and NuG2, which are even harder to distinguish. We show that this strategy can produce accurate predictions on many other proteins with experimentally determined intermediate folding states.
Conclusion
Our technique is efficient enough to predict folding pathways for both large and small proteins at the mesoscopic level. Such a strategy is often the only feasible choice for large proteins. A software program implementing this strategy (SSFold) is available at http://faculty.cs.tamu.edu/shsze/ssfold.
Perceived acceptable uncertainty regarding comparability of endovascular treatment alone versus intravenous thrombolysis plus endovascular treatment.
BACKGROUND
Most trials comparing endovascular treatment (EVT) alone versus intravenous thrombolysis with alteplase (IVT) + EVT in directly admitted patients with a stroke are non-inferiority trials. However, the margin based on the level of uncertainty regarding non-inferiority of the experimental treatment that clinicians are willing to accept to incorporate EVT alone into clinical practice remains unknown.
OBJECTIVE
To characterize what experienced stroke clinicians would consider an acceptable level of uncertainty for hypothetical decisions on whether to administer IVT or not before EVT in patients admitted directly to EVT-capable centers.
METHODS
A web-based, structured survey was distributed to a cross-section of 600 academic neurologists/neurointerventionalists. For this purpose, a response framework for a hypothetical trial comparing IVT+EVT (standard of care) with EVT alone (experimental arm) was designed. In this trial, a similar proportion of patients in each arm achieved functional independence at 90 days. Invited physicians were asked at what level of certainty they would feel comfortable skipping IVT in clinical practice, considering these hypothetical trial results.
RESULTS
There were 180 respondents (response rate: 30%) and 165 with complete answers. The median chosen acceptable uncertainty suggesting reasonable comparability between both treatments was an absolute difference in the rate of day 90 functional independence of 3% (mode 5%, IQR 1-5%), with higher chosen margins observed in interventionalists (aOR 2.20, 95% CI 1.06 to 4.67).
CONCLUSION
Physicians would generally feel comfortable skipping IVT before EVT at different certainty thresholds. Most physicians would treat with EVT alone if randomized trial data suggested that the number of patients achieving functional independence at 90 days was similar between the two groups, and one could be sufficiently sure that no more than 3 out of 100 patients would not achieve functional independence at 90 days due to skipping IVT.
MARVEL: A Randomized Double‐Blind, Placebo‐Controlled Trial in Patients Undergoing Endovascular Therapy: Study Rationale and Design
BACKGROUND
Steroids have pleiotropic neuroprotective actions including the regulation of inflammation and apoptosis which may influence the effects of ischemia on neurons, glial cells, and blood vessels. The effect of low‐dose methylprednisolone in patients with acute ischemic stroke in the endovascular therapy era remains unknown. This trial investigates the efficacy and safety of low‐dose methylprednisolone (2 mg/kg IV for 3 days) as adjunctive therapy for patients with acute ischemic stroke undergoing endovascular therapy within 24 hours from symptom onset.
METHODS
The MARVEL (Methylprednisolone as Adjunctive Therapy for Acute Large Vessel Occlusion: A Randomized Double‐Blind, Placebo‐Controlled Trial in Patients Undergoing Endovascular Therapy) trial is an investigator‐initiated, prospective, randomized, double‐blind, placebo‐controlled multicenter clinical trial. Up to 1672 eligible patients with anterior circulation large‐vessel occlusion stroke presenting within 24 hours from symptom onset are planned to be consecutively randomized to receive methylprednisolone or placebo in a 1:1 ratio across 82 stroke centers in China.
RESULTS
The primary outcome is the ordinal shift in the modified Rankin scale score at 90 days. Secondary outcomes include 90‐day functional independence (modified Rankin scale score, 0–2). The primary safety end points include mortality rate at 90 days and symptomatic intracerebral hemorrhage within 48 hours of endovascular therapy.
CONCLUSION
The MARVEL trial will provide evidence of the efficacy and safety of low‐dose methylprednisolone as adjunctive therapy for patients with anterior circulation large‐vessel occlusion stroke undergoing endovascular therapy.
Fine mapping and candidate gene analysis of gynoecy trait in chieh-qua (Benincasa hispida Cogn. var. chieh-qua How)
Gynoecy allows earlier production of hybrids, gives a higher yield, and improves the efficiency of hybrid seed production. Therefore, the utilization of gynoecy is beneficial for the genetic breeding of chieh-qua. However, little is known so far about gynoecy-related genes in chieh-qua. Here, we used an F2 population from the cross between the gynoecious line ‘A36’ and the monoecious line ‘SX’ for genetic mapping and revealed that chieh-qua gynoecy was regulated by a single recessive gene. We fine-mapped it to a 530-kb region flanked by the markers Indel-3 and KASP145 on Chr.8, which harbors eight candidate genes. One of the candidate genes, Bhi08G000345, encoding networked protein 4 (CqNET4), contained a non-synonymous SNP resulting in the amino acid substitution of isoleucine (ATA; I) to methionine (ATG; M). CqNET4 was prominently expressed in the female flower, and only three genes related to ethylene synthesis were significantly differentially expressed between ‘A36’ and ‘SX.’ The results presented here provide support for CqNET4 as the most likely candidate gene for chieh-qua gynoecy, which differs from the previously reported gynoecy genes.
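The substitution described for CqNET4 is a single-base change in the third codon position (ATA to ATG). A small Python sketch of how such a change can be classified is given below. The helper `describe_snp` is a hypothetical name, and the codon table deliberately contains only the two codons named in the abstract (ATA for isoleucine and ATG for methionine in the standard genetic code), so it is not a general translation tool.

```python
# Only the two codons named in the abstract; a full codon table is omitted.
CODON_TO_AMINO_ACID = {"ATA": "I", "ATG": "M"}

def describe_snp(ref_codon, alt_codon):
    """Summarize a single-nucleotide change between two codons."""
    diffs = [i for i, (a, b) in enumerate(zip(ref_codon, alt_codon)) if a != b]
    if len(ref_codon) != 3 or len(alt_codon) != 3 or len(diffs) != 1:
        raise ValueError("expected two codons differing at exactly one base")
    pos = diffs[0]
    ref_aa = CODON_TO_AMINO_ACID[ref_codon]
    alt_aa = CODON_TO_AMINO_ACID[alt_codon]
    return {
        "codon_position": pos + 1,          # 1-based position within the codon
        "base_change": f"{ref_codon[pos]}>{alt_codon[pos]}",
        "amino_acid_change": f"{ref_aa}>{alt_aa}",
        "synonymous": ref_aa == alt_aa,
    }

result = describe_snp("ATA", "ATG")
```

Here `result` reports an `A>G` change at codon position 3, an `I>M` amino acid change, and `synonymous` set to `False`, which matches the non-synonymous SNP described in the abstract.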
High prevalence of hyperglycaemia and the impact of high household income in transforming Rural China
Background
The prevalence of hyperglycaemia and its association with socioeconomic factors have been well studied in developed countries; however, little is known about them in transforming rural China.
Methods
A cross-sectional study was carried out in 4 rural communities of Deqing County located in East China in 2006-07, including 4,506 subjects aged 18 to 64 years. Fasting plasma glucose (FPG) was measured. Subjects were considered to have impaired fasting glucose (IFG) if FPG was in the range from 5.6 to 6.9 mmol/L and to have diabetes mellitus (DM) if FPG was 7.0 mmol/L or above.
Results
The crude prevalences of IFG and DM were 5.4% and 2.2%, respectively. The average ratio of IFG/DM was 2.5, and tended to be higher for those under the age of 35 years than older subjects. After adjustment for covariates including age (continuous), sex, BMI (continuous), smoking, alcohol drinking, and regular leisure physical activity, subjects in the high household income group had a significantly higher risk of IFG compared with the medium household income group (OR: 1.74, 95% CI: 1.11-2.72) and no significant difference in IFG was observed between the low and medium household income groups. Education and farmer occupation were not significantly associated with IFG.
Conclusions
High household income was significantly associated with an increased risk of IFG. A high ratio of IFG/DM suggests a high risk of diabetes in the foreseeable future in transforming rural Chinese communities.
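The glucose thresholds in the Methods can be written as a small classification rule. The Python sketch below is illustrative only: the function name `classify_fpg` is hypothetical, and treating every value from 5.6 up to (but not including) 7.0 mmol/L as IFG is an assumption that closes the gap between the stated 6.9 and 7.0 cut-offs. The last line recomputes the IFG/DM ratio from the crude prevalences; the abstract calls its 2.5 figure an "average ratio", so it may have been derived differently.

```python
def classify_fpg(fpg_mmol_per_l):
    """Classify fasting plasma glucose using the abstract's cut-offs."""
    if fpg_mmol_per_l >= 7.0:
        return "DM"       # diabetes mellitus
    if fpg_mmol_per_l >= 5.6:
        return "IFG"      # impaired fasting glucose (assumed: 5.6 <= FPG < 7.0)
    return "normal"

# Ratio of the crude prevalences reported in the Results (5.4% and 2.2%).
ifg_dm_ratio = round(5.4 / 2.2, 1)
```

With these definitions, `classify_fpg(5.2)` gives `"normal"`, `classify_fpg(6.0)` gives `"IFG"`, `classify_fpg(7.0)` gives `"DM"`, and `ifg_dm_ratio` is `2.5`.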
Tirofiban for Stroke without Large or Medium-Sized Vessel Occlusion
The effects of the glycoprotein IIb/IIIa receptor inhibitor tirofiban in patients with acute ischemic stroke but who have no evidence of complete occlusion of large or medium-sized vessels have not been extensively studied. In a multicenter trial in China, we enrolled patients with ischemic stroke without occlusion of large or medium-sized vessels and with a National Institutes of Health Stroke Scale score of 5 or more and at least one moderately to severely weak limb. Eligible patients had any of four clinical presentations: ineligible for thrombolysis or thrombectomy and within 24 hours after the patient was last known to be well; progression of stroke symptoms 24 to 96 hours after onset; early neurologic deterioration after thrombolysis; or thrombolysis with no improvement at 4 to 24 hours. Patients were assigned to receive intravenous tirofiban (plus oral placebo) or oral aspirin (100 mg per day, plus intravenous placebo) for 2 days; all patients then received oral aspirin until day 90. The primary efficacy end point was an excellent outcome, defined as a score of 0 or 1 on the modified Rankin scale (range, 0 [no symptoms] to 6 [death]) at 90 days. Secondary end points included functional independence at 90 days and a quality-of-life score. The primary safety end points were death and symptomatic intracranial hemorrhage. A total of 606 patients were assigned to the tirofiban group and 571 to the aspirin group. Most patients had small infarctions that were presumed to be atherosclerotic. The percentage of patients with a score of 0 or 1 on the modified Rankin scale at 90 days was 29.1% with tirofiban and 22.2% with aspirin (adjusted risk ratio, 1.26; 95% confidence interval, 1.04 to 1.53, P = 0.02). Results for secondary end points were generally not consistent with the results of the primary analysis. Mortality was similar in the two groups. The incidence of symptomatic intracranial hemorrhage was 1.0% in the tirofiban group and 0% in the aspirin group. 
In this trial involving heterogeneous groups of patients with stroke of recent onset or progression of stroke symptoms and nonoccluded large and medium-sized cerebral vessels, intravenous tirofiban was associated with a greater likelihood of an excellent outcome than low-dose aspirin. Incidences of intracranial hemorrhages were low but slightly higher with tirofiban.
Methylprednisolone as Adjunct to Endovascular Thrombectomy for Large-Vessel Occlusion Stroke
Importance
It is uncertain whether intravenous methylprednisolone improves outcomes for patients with acute ischemic stroke due to large-vessel occlusion (LVO) undergoing endovascular thrombectomy.
Objective
To assess the efficacy and adverse events of adjunctive intravenous low-dose methylprednisolone to endovascular thrombectomy for acute ischemic stroke secondary to LVO.
Design, Setting, and Participants
This investigator-initiated, randomized, double-blind, placebo-controlled trial was implemented at 82 hospitals in China, enrolling 1680 patients with stroke and proximal intracranial LVO presenting within 24 hours of time last known to be well. Recruitment took place between February 9, 2022, and June 30, 2023, with a final follow-up on September 30, 2023.
Interventions
Eligible patients were randomly assigned to intravenous methylprednisolone (n = 839) at 2 mg/kg/d or placebo (n = 841) for 3 days adjunctive to endovascular thrombectomy.
Main Outcomes and Measures
The primary efficacy outcome was disability level at 90 days as measured by the overall distribution of the modified Rankin Scale scores (range, 0 [no symptoms] to 6 [death]). The primary safety outcomes included mortality at 90 days and the incidence of symptomatic intracranial hemorrhage within 48 hours.
Results
Among 1680 patients randomized (median age, 69 years; 727 female [43.3%]), 1673 (99.6%) completed the trial. The median 90-day modified Rankin Scale score was 3 (IQR, 1-5) in the methylprednisolone group vs 3 (IQR, 1-6) in the placebo group (adjusted generalized odds ratio for a lower level of disability, 1.10 [95% CI, 0.96-1.25]; P = .17). In the methylprednisolone group, there was a lower mortality rate (23.2% vs 28.5%; adjusted risk ratio, 0.84 [95% CI, 0.71-0.98]; P = .03) and a lower rate of symptomatic intracranial hemorrhage (8.6% vs 11.7%; adjusted risk ratio, 0.74 [95% CI, 0.55-0.99]; P = .04) compared with placebo.
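As a rough check on how the reported percentages relate to the adjusted risk ratios, the crude (unadjusted) ratios can be recomputed from the rounded figures above. The Python sketch below uses hypothetical helper names and only the percentages stated in the Results. Crude ratios are not expected to equal the adjusted estimates, which come from the trial's own covariate-adjusted models.

```python
def crude_risk_ratio(risk_treated_pct, risk_control_pct):
    """Unadjusted risk ratio from two event percentages."""
    return risk_treated_pct / risk_control_pct

def absolute_risk_difference(risk_treated_pct, risk_control_pct):
    """Unadjusted difference in percentage points (treated minus control)."""
    return risk_treated_pct - risk_control_pct

# 90-day mortality: 23.2% (methylprednisolone) vs 28.5% (placebo)
mortality_rr = crude_risk_ratio(23.2, 28.5)
mortality_diff = absolute_risk_difference(23.2, 28.5)

# Symptomatic intracranial hemorrhage: 8.6% vs 11.7%
sich_rr = crude_risk_ratio(8.6, 11.7)
```

The crude values come out near 0.81 for mortality (about 5.3 percentage points lower) and near 0.74 for symptomatic intracranial hemorrhage, close to the adjusted risk ratios of 0.84 and 0.74 reported in the abstract.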
Conclusions and Relevance
Among patients with acute ischemic stroke due to LVO undergoing endovascular thrombectomy, adjunctive methylprednisolone added to endovascular thrombectomy did not significantly improve the degree of overall disability.
Trial Registration
ChiCTR.org.cn Identifier: ChiCTR210005172
Large-scale analysis of gene clustering in bacteria
An important strategy to study operons and their evolution is to investigate clustering of related genes across multiple bacterial genomes. Although existing algorithms are available that can identify gene clusters across two or more genomes, very few algorithms are efficient enough to study gene clusters across hundreds of genomes. We observe that a querying strategy can be used to analyze gene clusters across a large number of genomes and develop an efficient algorithm to identify all related clusters on a genome from a given query cluster. We use this algorithm to study gene clustering in 400 bacterial genomes by starting from a well-characterized list of operons in Escherichia coli K12 and perform comparative analysis of operon occurrences, gene orientations, and rearrangements both within and across clusters. We show that important biological insights can be obtained by comparing results across these categories. A software program implementing the algorithm (GCQuery) and supplementary data containing detailed results are available at http://faculty.cs.tamu.edu/shsze/gcquery
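The querying idea can be illustrated with a deliberately simplified sketch: given a query cluster, scan one genome for runs of query genes that lie close together. This is not the GCQuery algorithm, whose details are not given in the abstract. The function name, the `max_gap` and `min_genes` parameters, the linear (non-circular) genome model, and the toy gene order are all assumptions made for illustration.

```python
def find_related_clusters(genome, query, max_gap=2, min_genes=2):
    """Find runs of query genes in one genome.

    genome: list of gene labels in chromosomal order (treated as linear).
    query: set of gene labels making up the query cluster.
    max_gap: maximum number of non-query genes allowed between two hits.
    min_genes: minimum number of distinct query genes in a reported run.
    Returns a list of (start_index, end_index, matched_genes).
    """
    hits = [i for i, gene in enumerate(genome) if gene in query]

    # Group hit positions into runs separated by at most max_gap other genes.
    runs = []
    current = []
    for i in hits:
        if current and i - current[-1] - 1 > max_gap:
            runs.append(current)
            current = []
        current.append(i)
    if current:
        runs.append(current)

    results = []
    for run in runs:
        matched = {genome[i] for i in run}
        if len(matched) >= min_genes:
            results.append((run[0], run[-1], matched))
    return results

# Toy gene order (hypothetical); lacZ, lacY and lacA form the E. coli lac operon.
toy_genome = ["x", "lacZ", "lacY", "y", "lacA", "p", "q", "r", "s", "lacZ", "t"]
clusters = find_related_clusters(toy_genome, {"lacZ", "lacY", "lacA"}, max_gap=1)
```

In this toy example, `clusters` contains one run spanning indices 1 to 4 with all three query genes; the isolated `lacZ` at index 9 is dropped because it matches only one query gene. Each genome is scanned independently in a single pass, which hints at why a query-based strategy can scale to hundreds of genomes.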