26 research outputs found
Image_2_Construction and validation of nomograms combined with novel machine learning algorithms to predict early death of patients with metastatic colorectal cancer.TIF
Purpose: This study investigated the clinical and non-clinical characteristics that may affect the early death rate of patients with metastatic colorectal carcinoma (mCRC) and developed prognostic predictive models for mCRC. Method: Medical records of 35,639 patients with mCRC diagnosed from 2010 to 2019 were obtained from the SEER database. The patients were randomly divided into a training cohort and a validation cohort in a ratio of 7:3. X-tile software was used to identify the optimal cutoff points for age and tumor size. Univariate and multivariate logistic regression models were used to determine the independent predictors associated with overall early death and cancer-specific early death caused by mCRC, and predictive and dynamic nomograms were constructed. Logistic regression, random forest, CatBoost, LightGBM, and XGBoost were used to establish machine learning (ML) models. Receiver operating characteristic (ROC) curves and calibration plots were used to estimate the accuracy of the models, and decision curve analysis (DCA) was used to determine the clinical benefits of the ML models. Results: The optimal cutoff points for age were 58 and 77 years, and those for tumor size were 45 and 76. A total of 15 independent risk factors (age, marital status, race, tumor localization, histologic type, grade, N-stage, tumor size, surgery, radiation, chemotherapy, bone metastasis, brain metastasis, liver metastasis, and lung metastasis) were significantly associated with both the overall and the cancer-specific early death rate of patients with mCRC, and nomograms were constructed from them. Among the ML models, the random forest model predicted outcomes most accurately, followed by the logistic regression, CatBoost, XGBoost, and LightGBM models. The random forest model also provided more clinical benefit than the other models and can be used to support clinical decisions regarding overall early death and cancer-specific early death caused by mCRC. Conclusion: ML algorithms combined with nomograms may play an important role in identifying early deaths owing to mCRC and may help clinicians make clinical decisions and plan follow-up strategies.
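The evaluation steps named in the abstract (a random 7:3 split, ROC analysis, and decision curve analysis) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names, the fixed seed, and the toy labels and scores are all assumptions, and the net benefit formula is the standard decision-curve definition rather than anything quoted from the study.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle records and split them 7:3 into training and validation lists.

    The seed is a hypothetical choice made only for reproducibility.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 for early death, 0 otherwise; scores: predicted risk.
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at risk threshold pt: TP/N - FP/N * pt/(1 - pt)."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.5, 0.1]
    train, valid = split_70_30(range(10))
    print(len(train), len(valid))
    print(roc_auc(labels, scores))
    print(net_benefit(labels, scores, 0.5))
```

Comparing models as the abstract describes would amount to computing `roc_auc` and `net_benefit` for each model's predicted risks on the validation cohort; a decision curve plots net benefit over a range of thresholds rather than a single one.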
Table_1_Construction and validation of nomograms combined with novel machine learning algorithms to predict early death of patients with metastatic colorectal cancer.docx
Image_1_Construction and validation of nomograms combined with novel machine learning algorithms to predict early death of patients with metastatic colorectal cancer.TIF
Image_3_Construction and validation of nomograms combined with novel machine learning algorithms to predict early death of patients with metastatic colorectal cancer.TIF
An Array-Based Method To Identify Multivalent Inhibitors
Carbohydrate−protein interactions play a critical role in a variety of biological processes, and agonists/antagonists of these interactions are useful as biological probes and therapeutic agents. Most carbohydrate-binding proteins achieve tight binding through formation of a multivalent complex. Therefore, both ligand structure and presentation contribute to recognition. Since there are many potential combinations of structure, spacing, and orientation to consider and the optimal one cannot be predicted, high-throughput approaches for analyzing carbohydrate−protein interactions and designing inhibitors are appealing. In this report, we develop a strategy to vary neoglycoprotein density on the surface of a glycan array. This feature of presentation was combined with variations in glycan structure and glycan density to produce an array with approximately 600 combinations of glycan structure and presentation. The array platform allows one to distinguish between different types of multivalent complexes on the array surface. To illustrate the advantages of this format, it was used to rapidly identify multivalent probes for various lectins. The new array was first tested with several plant lectins, including concanavalin A (conA), Vicia villosa isolectin B4 (VVL-B4), and Ricinus communis agglutinin (RCA120). Next, it was used to rapidly identify potent multivalent inhibitors of Pseudomonas aeruginosa lectin I (PA-IL), a key protein involved in opportunistic infections of P. aeruginosa, and mouse macrophage galactose-type lectin (mMGL-2), a protein expressed on antigen-presenting cells that may be useful as a vaccine-targeting receptor. An advantage of the approach is that structural information about the lectin/receptor is not required to obtain a multivalent inhibitor/probe.
Divergent Behavior of Glycosylated Threonine and Serine Derivatives in Solid Phase Peptide Synthesis
Solid phase peptide coupling of glycosylated threonine derivatives was systematically evaluated. In contrast to glycosylated serine derivatives, which are highly prone to epimerization, glycosylated threonine derivatives undergo only negligible epimerization. Under forcing conditions, glycosylated threonine analogs undergo β-elimination rather than epimerization. Mechanistic studies and molecular modeling were used to understand the origin of the differences in reactivity.
Competition between Serum IgG, IgM, and IgA Anti-Glycan Antibodies
Anti-glycan antibodies are an abundant subpopulation of serum antibodies with critical functions in many immune processes. Changes in the levels of these antibodies can occur with the onset of disease, exposure to pathogens, or vaccination. As a result, there has been significant interest in exploiting anti-glycan antibodies as biomarkers for many diseases. Serum contains a mixture of anti-glycan antibodies that can recognize the same antigen, and competition for binding can potentially influence the detection of antibody subpopulations that are more relevant to disease processes. The most abundant antibody isotypes in serum are IgG, IgM, and IgA, but little is known regarding how these different isotypes compete for the same glycan antigen. In this study, we developed a multiplexed glycan microarray assay and applied it to evaluate how different isotypes of anti-glycan antibodies (IgA, IgG, and IgM) compete for printed glycan antigens. While IgG and IgA antibodies typically outcompete IgM for peptide or protein antigens, we found that IgM outcompete IgG and IgA for many glycan antigens. To illustrate the importance of this effect, we provide evidence that IgM competition can account for the unexpected observation that IgG of certain antigen specificities appear to be preferentially transported from mothers to fetuses. We demonstrate that IgM in maternal sera compete with IgG resulting in lower than expected IgG signals. Since cord blood contains very low levels of IgM, competition only affects maternal IgG signals, making it appear as though certain IgG antibodies are higher in cord blood than matched maternal blood. Taken together, the results highlight the importance of competition for studies involving anti-glycan antibodies.
Competition between serum IgG, IgA, and IgM anti-glycan antibodies.
(A) Addition of IgM and IgA to IgG. Polyclonal IgG isolated from serum was first profiled on the array alone. Separately, IgG was premixed with varying amounts of IgM or IgA and then profiled on the array. For each array component, the change in IgG signal in the presence of IgM or IgA was determined. The box plots depict the range of IgG changes (on a log base 2 scale) measured on the array upon addition of 4 serum equivalents of IgM or IgA. The line in the middle of the box is the median, the box spans 1 standard deviation above and below the median, and the whiskers represent 2 standard deviations above or below the median. (B) Addition of IgG and IgA to IgM. A protocol analogous to the one above was used to evaluate the effects of IgG and IgA on IgM signals. (C) Addition of IgM and IgG to IgA. A protocol analogous to the one above was used to evaluate the effects of IgG and IgM on IgA signals. The box plots demonstrate significant decreases in IgG and IgA signals in the presence of IgM for the vast majority of array components.
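The box-plot quantities described in the caption can be illustrated with a short sketch: a log base 2 change per array component, summarized by the median with a box at 1 standard deviation and whiskers at 2 standard deviations around it. The function names and the toy signals are hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
import math
import statistics


def log2_changes(signal_alone, signal_mixed):
    """Per-component change in signal on a log base 2 scale.

    A value of -1 means the signal halved when the competing isotype was added.
    """
    return [math.log2(mixed / alone) for alone, mixed in zip(signal_alone, signal_mixed)]


def box_summary(values):
    """Median with a box at +/- 1 SD and whiskers at +/- 2 SD around the median.

    Assumption: sample standard deviation (statistics.stdev).
    """
    med = statistics.median(values)
    sd = statistics.stdev(values)
    return {
        "median": med,
        "box": (med - sd, med + sd),
        "whiskers": (med - 2 * sd, med + 2 * sd),
    }


if __name__ == "__main__":
    # Toy IgG signals for three array components, invented for illustration.
    igg_alone = [100.0, 200.0, 400.0]
    igg_with_igm = [50.0, 100.0, 100.0]
    changes = log2_changes(igg_alone, igg_with_igm)
    print(changes)
    print(box_summary(changes))
```

Note that this summary differs from a conventional box plot, which uses quartiles for the box; the sketch follows the definition given in the caption.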
Site-Selective Glycosylation of Hemoglobin on Cys β93
In this work, we describe the synthesis and characterization of a novel glycosylated hemoglobin (Hb) with high oxygen affinity as a potential Hb-based oxygen carrier. Site-selective glycosylation of bovine Hb was achieved by conjugating a lactose derivative to Cys 93 on the β subunit of Hb. LC-MS analysis indicates that the reaction was quantitative, with no unmodified Hb present in the reaction product. The glycosylation site was identified by chymotrypsin digestion of the glycosylated bovine Hb followed by LC-MS/MS and from the X-ray crystal structure of the glycosylated Hb. The chemical conjugation of the lactose derivative at Cys β93 yields an oxygen carrier with a high oxygen affinity (P50 of 4.94 mmHg) and a low cooperativity coefficient (n) of 1.20. Asymmetric flow field-flow fractionation (AFFFF) coupled with multiangle static light scattering (MASLS) was used to measure the absolute molecular weight of the glycosylated Hb. AFFFF-MASLS analysis indicates that glycosylation of Hb significantly altered the α2β2−αβ equilibrium compared to native Hb. Subsequent X-ray analysis of the glycosylated Hb crystal showed that the covalently linked lactose derivative is sandwiched between the β1 and α2 (and hence by symmetry the β2 and α1) subunits of the tetramer, and the interaction between the saccharide and amino acid residues located at the interface is apparently stabilized by hydrogen bonding. The structural analysis of the glycosylated Hb helps to explain the shift in the α2β2−αβ equilibrium in terms of the hydrogen bonding interactions at the β1α2/β2α1 interface. Taken together, these results indicate that it is feasible to site-specifically glycosylate Hb. This work has great potential for developing an oxygen carrier with defined chemistry that can target oxygen delivery to low pO2 tissues and organs.
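To make the reported P50 and cooperativity coefficient concrete, the sketch below evaluates fractional oxygen saturation with the Hill equation, Y = pO2^n / (P50^n + pO2^n). The abstract does not state which model was fitted, so treating n as a Hill coefficient is an assumption, and the function name is hypothetical; only P50 = 4.94 mmHg and n = 1.20 come from the text.

```python
def hill_saturation(po2_mmhg, p50_mmhg=4.94, n=1.20):
    """Fractional O2 saturation from the Hill equation.

    Defaults are the P50 and cooperativity coefficient reported in the abstract;
    using the Hill equation itself is an assumption of this sketch.
    """
    if po2_mmhg < 0:
        raise ValueError("pO2 must be non-negative")
    numerator = po2_mmhg ** n
    return numerator / (p50_mmhg ** n + numerator)


if __name__ == "__main__":
    # By definition, saturation is one half when pO2 equals P50.
    print(hill_saturation(4.94))
    # Saturation at a few illustrative oxygen tensions (values chosen arbitrarily).
    for po2 in (2.0, 10.0, 40.0):
        print(po2, round(hill_saturation(po2), 3))
```

A low P50 means the carrier is half saturated at a low oxygen tension, which is what the abstract means by high oxygen affinity, and an n close to 1 gives a nearly hyperbolic rather than sigmoidal binding curve.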
