
    Synergistic Effects of Different Levels of Genomic Data for the Staging of Lung Adenocarcinoma: An Illustrative Study

    Lung adenocarcinoma (LUAD) is a common and highly lethal cancer. Accurate staging is a prerequisite for its effective diagnosis and treatment. Therefore, improving the accuracy of stage prediction for LUAD patients is of great clinical relevance. Previous works have mainly focused on a single type of genomic data, or on a small number of omics data types concurrently, for generating predictive models. A few of them have considered multi-omics data from genome to proteome. We used a publicly available dataset to illustrate the potential of multi-omics data for stage prediction in LUAD. In particular, we investigated the roles of the specific omics data types in the prediction process. We used a self-developed method, Omics-MKL, for stage prediction. It combines an existing feature-ranking technique, Minimum Redundancy and Maximum Relevance (mRMR), which avoids redundancy among the selected features, with multiple kernel learning (MKL), which applies different kernels to different omics data types. Each of the considered omics data types individually provided useful prediction results. Moreover, using multi-omics data delivered notably better results than using single-omics data. Gene expression and methylation information seem to play vital roles in the staging of LUAD. The Omics-MKL method retained 70 features after the selection process. Of these, 21 (30%) were methylation features and 34 (48.57%) were gene expression features. Moreover, 18 (25.71%) of the selected features are known to be related to LUAD, and 29 (41.43%) to lung cancer in general. Using multi-omics data from genome to proteome for predicting the stage of LUAD seems promising, because each omics data type may improve the accuracy of the predictions. Here, methylation and gene expression data may play particularly important roles.
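
    The abstract describes a pipeline that ranks features with mRMR and then fuses omics layers with one kernel per data type. Below is a minimal sketch of that idea using scikit-learn on synthetic data; the omics block names, the mutual-information filter standing in for mRMR, and the fixed equal kernel weights are illustrative assumptions, not the authors' Omics-MKL implementation.

```python
# Minimal sketch of an mRMR + multiple-kernel-learning pipeline for
# multi-omics stage prediction (illustrative; not the authors' Omics-MKL code).
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 120
# Hypothetical omics blocks for the same samples (expression, methylation, protein).
omics = {
    "expression":  rng.normal(size=(n, 500)),
    "methylation": rng.normal(size=(n, 300)),
    "protein":     rng.normal(size=(n, 100)),
}
y = rng.integers(0, 2, size=n)  # early vs. late stage (binary, for simplicity)

# 1) Per-block feature filtering. A mutual-information filter is used here as a
#    simple stand-in for mRMR, which additionally penalizes redundancy.
selected = {}
for name, X in omics.items():
    k = min(30, X.shape[1])
    selected[name] = SelectKBest(mutual_info_classif, k=k).fit_transform(X, y)

# 2) One kernel per omics block, combined as a weighted sum (equal weights here;
#    true MKL would learn these weights from the data).
weights = {name: 1.0 / len(selected) for name in selected}
K = sum(w * rbf_kernel(selected[name]) for name, w in weights.items())

# 3) SVM on the precomputed combined kernel.
clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```

    Predicting new samples would additionally require evaluating each kernel between test and training samples, but the block structure stays the same.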

    Benchmark study of feature selection strategies for multi-omics data

    BACKGROUND: In the last few years, multi-omics data, that is, datasets containing different types of high-dimensional molecular variables for the same samples, have become increasingly available. To date, several comparison studies focused on feature selection methods for omics data, but to our knowledge, none compared these methods for the special case of multi-omics data. Given that these data have specific structures that differentiate them from single-omics data, it is unclear whether different feature selection strategies may be optimal for such data. In this paper, using 15 cancer multi-omics datasets, we compared four filter methods, two embedded methods, and two wrapper methods with respect to their performance in the prediction of a binary outcome in several situations that may affect the prediction results. As classifiers, we used support vector machines and random forests. The methods were compared using repeated fivefold cross-validation. The accuracy, the AUC, and the Brier score served as performance metrics. RESULTS: The results suggested that, first, the chosen number of selected features affects the predictive performance for many feature selection methods, but not all. Second, whether the features were selected by data type or from all data types concurrently did not considerably affect the predictive performance, but for some methods, concurrent selection took more time. Third, regardless of which performance measure was considered, the feature selection methods mRMR, the permutation importance of random forests, and the Lasso tended to outperform the other considered methods. Here, mRMR and the permutation importance of random forests already delivered strong predictive performance when considering only a few selected features. Finally, the wrapper methods were computationally much more expensive than the filter and embedded methods. CONCLUSIONS: We recommend the permutation importance of random forests and the filter method mRMR for feature selection using multi-omics data, where, however, mRMR is considerably more computationally costly. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-04962-x.
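
    The benchmarking loop described above boils down to: select features inside each cross-validation fold, fit a classifier, and score with accuracy, AUC, and the Brier score under repeated fivefold CV. The sketch below shows that skeleton with scikit-learn; the synthetic data and the simple univariate filter are illustrative assumptions, whereas the study itself compares mRMR, random-forest permutation importance, the Lasso, and wrapper methods.

```python
# Sketch of a feature-selection benchmark with repeated 5-fold CV
# (illustrative setup; not the benchmark code from the paper).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for a concatenated multi-omics matrix (200 samples, 2000 variables).
X, y = make_classification(n_samples=200, n_features=2000, n_informative=20,
                           random_state=0)

# Feature selection sits inside the pipeline so it is refit within every fold.
models = {
    "rf":  make_pipeline(SelectKBest(f_classif, k=50),
                         RandomForestClassifier(random_state=0)),
    "svm": make_pipeline(SelectKBest(f_classif, k=50), StandardScaler(),
                         SVC(probability=True)),
}

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
scoring = {"acc": "accuracy", "auc": "roc_auc", "brier": "neg_brier_score"}

for name, model in models.items():
    res = cross_validate(model, X, y, cv=cv, scoring=scoring)
    print(name,
          "acc=%.3f" % res["test_acc"].mean(),
          "auc=%.3f" % res["test_auc"].mean(),
          "brier=%.3f" % -res["test_brier"].mean())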

    A Diagnostic Model for Kawasaki Disease Based on Immune Cell Characterization From Blood Samples

    Background: Kawasaki disease (KD) is the leading cause of acquired heart disease in children. However, distinguishing KD from febrile infections early in the disease course remains difficult. Our goal was to estimate the immune cell composition in KD patients and febrile controls (FC), and to develop a tool for KD diagnosis. Methods: We used a machine-learning algorithm, CIBERSORT, to estimate the proportions of 22 immune cell types based on blood samples from children with KD and FC. Using these immune cell compositions, a diagnostic score for predicting KD was then constructed based on LASSO regression for binary outcomes. Results: In the training set (n = 496), a model was fit which consisted of eight types of immune cells. The area under the curve (AUC) values for diagnosing KD in a held-out test set (n = 212) and an external validation set (n = 36) were 0.80 and 0.77, respectively. The most common cell types in KD blood samples were monocytes, neutrophils, CD4+ naïve and CD8+ T cells, and M0 macrophages. The diagnostic score was highly correlated with genes that had been previously reported as associated with KD, such as interleukins and chemokine receptors, and enriched in reported pathways, such as the IL-6/JAK/STAT3 and TNFα signaling pathways. Conclusion: Altogether, the diagnostic score for predicting KD could potentially serve as a biomarker. Prospective studies could evaluate how incorporating the diagnostic score into a clinical algorithm would improve diagnostic accuracy further.
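
    The diagnostic score step described above is essentially an L1-penalized logistic regression on estimated immune cell proportions, evaluated by AUC on held-out samples. A minimal sketch follows; the cell-type list, the random proportions, and the train/test split are illustrative assumptions, and the upstream CIBERSORT deconvolution that produces the proportions is not reproduced here.

```python
# Sketch of building a LASSO-based diagnostic score from immune cell
# proportions (illustrative; CIBERSORT deconvolution is assumed done upstream).
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
cell_types = ["monocytes", "neutrophils", "CD4_naive_T", "CD8_T",
              "M0_macrophages", "B_naive", "NK_resting", "Tregs"]  # hypothetical subset
n = 708  # matches the training + test set sizes quoted above (496 + 212)

# Hypothetical CIBERSORT output: per-sample proportions of each cell type.
props = rng.dirichlet(np.ones(len(cell_types)), size=n)
X = pd.DataFrame(props, columns=cell_types)
y = rng.integers(0, 2, size=n)  # 1 = Kawasaki disease, 0 = febrile control

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1,
                                          stratify=y)

# L1-penalized logistic regression; the linear predictor is the diagnostic score.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X_tr, y_tr)
score = X_te.values @ lasso.coef_.ravel() + lasso.intercept_[0]
print("held-out AUC:", roc_auc_score(y_te, score))
```

    With random data the AUC hovers around 0.5; the point of the sketch is only the structure of the scoring step.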

    Smc5/6 coordinates formation and resolution of joint molecules with chromosome morphology to ensure meiotic divisions

    Structural Maintenance of Chromosomes (SMC) complexes underpin two fundamental features of meiosis: homologous recombination and chromosome segregation. While meiotic functions of the cohesin and condensin complexes have been delineated, the role of the third SMC complex, Smc5/6, remains enigmatic. Here we identify specific, essential meiotic functions for the Smc5/6 complex in homologous recombination and the regulation of cohesin. We show that Smc5/6 is enriched at centromeres and cohesin-association sites, where it regulates sister-chromatid cohesion and the timely removal of cohesin from chromosomal arms, respectively. Smc5/6 also localizes to recombination hotspots, where it promotes the normal formation and resolution of a subset of joint-molecule intermediates. In this regard, Smc5/6 functions independently of the major crossover pathway defined by the MutLγ complex. Furthermore, we show that Smc5/6 is required for stable chromosomal localization of the XPF-family endonuclease Mus81-Mms4/Eme1. Our data suggest that the Smc5/6 complex is required for specific recombination and chromosomal processes throughout meiosis and that, in its absence, attempts at cell division with unresolved joint molecules and residual cohesin lead to severe recombination-induced meiotic catastrophe.

    A Temperature-dependent Model of Ratio of Specific Heats Applying in Diesel Engine

    Rate of heat release (ROHR) is a standard tool when engineers tune and develop new engines. The ratio of specific heats, γ, is considered an essential parameter for accurate ROHR calculation, as it couples the engine system energy with other thermodynamic properties. The γ model is a function of various factors, such as temperature, air-fuel ratio, and pressure. To improve the accuracy of the ROHR calculation for the Scania diesel engine in the NTNU machinery laboratory, where a constant-value γ model is currently applied, three existing temperature-dependent γ models are investigated in this work. More complex models that consider additional factors are not discussed here, as real-time computation is required. An engine experiment comprising constant-speed and propeller-curve test cycles is carried out to supply the raw data. A reference model based on Zacharias's formulas is established to evaluate the accuracy of the three candidate models. The models are evaluated in both the γ domain and the ROHR domain. Three error criteria, MRE, RMSE, and NRMSE, are used to quantify model error. Finally, a retrofitted γ model that applies a γ offset is proposed to further improve accuracy.
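
    For context, these γ models feed into the standard single-zone apparent heat-release formula dQ/dθ = γ/(γ−1)·p·dV/dθ + 1/(γ−1)·V·dp/dθ. The sketch below evaluates that formula with a simple linear γ(T); the linear coefficients, the trapped mass, and the synthetic pressure and volume traces are illustrative assumptions, not one of the thesis's candidate models.

```python
# Sketch: apparent rate of heat release with a temperature-dependent gamma.
# The linear gamma(T) coefficients and the p, V traces are illustrative only.
import numpy as np

R_GAS = 287.0  # J/(kg K), specific gas constant of air (approximation)

def gamma_linear(T, a=1.38, b=6.0e-5):
    """Illustrative linear model: gamma decreases with temperature."""
    return a - b * T

def rohr(theta, p, V, m, gamma_func):
    """Single-zone apparent heat-release rate dQ/dtheta [J/deg]."""
    T = p * V / (m * R_GAS)          # ideal-gas bulk temperature
    g = gamma_func(T)
    dV = np.gradient(V, theta)
    dp = np.gradient(p, theta)
    return g / (g - 1.0) * p * dV + 1.0 / (g - 1.0) * V * dp

# Synthetic crank-angle, pressure, and volume traces (placeholders for measured data).
theta = np.linspace(-60.0, 60.0, 241)                 # deg relative to TDC
V = 1e-3 * (1.0 + 0.5 * (theta / 60.0) ** 2)          # m^3, toy volume curve
p = 5e6 * np.exp(-((theta - 5.0) / 25.0) ** 2) + 2e5  # Pa, toy pressure curve
m = 2.0e-3                                            # kg trapped charge (assumed)

dQ = rohr(theta, p, V, m, gamma_linear)
print("peak apparent heat-release rate: %.1f J/deg" % dQ.max())
```

    Swapping gamma_linear for a constant γ in the same routine is the comparison the thesis is concerned with.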

    Scaled boundary finite element method for fluid-structure interaction

    This study presents the first attempt to extend the scaled boundary finite element method (SBFEM) to fluid-structure interaction problems. A fluid velocity-to-pressure relationship based on the SBFEM and acoustic approximations is developed. An FEM/SBFEM coupling procedure is presented. A new SBFEM formulation, which involves a zero matrix for a reservoir with an absorptive flat bottom, is derived.
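
    For background, the acoustic approximation that typically underlies such velocity-to-pressure formulations in dam-reservoir interaction is summarized below. These are the standard governing and boundary equations of the acoustic reservoir model, stated here as context; they are not equations taken from the thesis itself.

```latex
% Standard acoustic reservoir model commonly used in dam--reservoir FSI
% (background only; not taken from the thesis).
\begin{align*}
  \nabla^2 p &= \frac{1}{c^2}\,\frac{\partial^2 p}{\partial t^2}
    && \text{pressure wave equation in the fluid} \\
  \frac{\partial p}{\partial n} &= -\rho\,\ddot{u}_n
    && \text{fluid--structure interface (normal structural acceleration)} \\
  p &= 0
    && \text{free surface, gravity waves neglected} \\
  \frac{\partial p}{\partial n} &= -q\,\frac{\partial p}{\partial t},
    \qquad \alpha = \frac{1 - qc}{1 + qc}
    && \text{absorptive bottom with reflection coefficient } \alpha
\end{align*}
```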

    Temperature dependent mechanical property of PZT film: an investigation by nanoindentation.

    Load-depth curves of an unpoled lead zirconate titanate (PZT) film composite were measured as a function of temperature by the nanoindentation technique. Its reduced modulus and hardness were calculated by the standard Oliver-Pharr method. The true modulus and hardness of the PZT film were then assessed by decoupling the influence of the substrate using the methods proposed by Zhou et al. and Korsunsky et al., respectively. Results show that the indentation depth and modulus increase, while the hardness decreases, at elevated temperature. The increase in indentation depth and the decrease in hardness are thought to be caused by a reduction in the critical stress needed to excite dislocation initiation at high temperature. The increase in true modulus is attributed to a reduction in the recoverable indentation depth induced by back-switched domains. The influence of residual stress on the indentation behavior of the PZT film composite was also investigated by measuring its load-depth curves under pre-load strains.
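
    The Oliver-Pharr evaluation mentioned above reduces to two relations: hardness H = P_max/A_c and reduced modulus E_r = (√π/2β)·S/√A_c, with the sample modulus recovered from 1/E_r = (1−ν_s²)/E_s + (1−ν_i²)/E_i. The sketch below applies these formulas to placeholder numbers; the inputs are not measured PZT data, and the substrate-decoupling methods of Zhou et al. and Korsunsky et al. are not reproduced.

```python
# Sketch of the Oliver-Pharr evaluation of hardness and reduced modulus from a
# nanoindentation unloading curve (placeholder numbers; substrate decoupling omitted).
import math

def oliver_pharr(P_max, S, A_c, beta=1.05):
    """Hardness and reduced modulus from peak load P_max [N], unloading
    stiffness S [N/m], and projected contact area A_c [m^2]."""
    H = P_max / A_c
    E_r = math.sqrt(math.pi) / (2.0 * beta) * S / math.sqrt(A_c)
    return H, E_r

def sample_modulus(E_r, nu_sample, E_indenter=1141e9, nu_indenter=0.07):
    """Solve 1/E_r = (1 - nu_s^2)/E_s + (1 - nu_i^2)/E_i for the sample modulus.
    Diamond indenter properties (E ~ 1141 GPa, nu ~ 0.07) are typical values."""
    inv_sample_term = 1.0 / E_r - (1.0 - nu_indenter**2) / E_indenter
    return (1.0 - nu_sample**2) / inv_sample_term

# Placeholder indentation result (not measured PZT data).
H, E_r = oliver_pharr(P_max=5e-3, S=2.5e5, A_c=2.0e-12)
print("hardness: %.2f GPa" % (H / 1e9))
print("reduced modulus: %.1f GPa" % (E_r / 1e9))
print("apparent sample modulus: %.1f GPa" % (sample_modulus(E_r, nu_sample=0.3) / 1e9))
```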

    Design of a digital NHit trigger circuit for nuclear and particle physics experiment

    In nuclear and particle physics experiments, experimental background and detector noise make it necessary to pick out valid physics events through trigger selection and to suppress the background. To meet trigger-selection requirements based on hit multiplicity (NHit) at high event rates, this paper presents a high-performance digital trigger circuit. The circuit has 13 high-speed serial communication interfaces supporting optical-fiber data transmission and Gigabit network communication, as well as an onboard 32 Gbit DDR4 cache and a high-end FPGA for large-capacity high-speed storage and real-time data processing. The real-time hardware NHit trigger algorithm runs on this circuit, enabling fast trigger selection and readout of the front-end data. The circuit is also easy to expand and can be used flexibly in different physics experiments. Testing and verification show that the transmission rate of a single optical-fiber interface reaches 8.125 Gb/s, the uplink transmission rate of SiTCP reaches 949.3 Mb/s, and the actual read and write rates of the DDR4 cache reach 102.6 Gb/s, which meet the data-transmission and caching requirements of the digital trigger circuit design.
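
    The core of a hit-multiplicity trigger is to count how many distinct channels fire within a coincidence window and assert a trigger when that count reaches a threshold. The Python sketch below models that logic in software; the window length, threshold, channel count, and dead-time behavior are illustrative assumptions, since the actual design is FPGA firmware on the described board.

```python
# Software model of an NHit (hit-multiplicity) trigger: issue a trigger whenever
# at least NHIT_THRESHOLD distinct channels fire within a sliding coincidence
# window. Parameters are illustrative; the real design runs in FPGA firmware.
from collections import deque

NHIT_THRESHOLD = 20      # minimum number of distinct hit channels (assumed)
WINDOW_NS = 100          # coincidence window length in nanoseconds (assumed)

def nhit_trigger(hits):
    """hits: iterable of (timestamp_ns, channel_id), sorted by timestamp.
    Yields the timestamp at which each trigger fires."""
    window = deque()          # hits currently inside the coincidence window
    channels = {}             # channel_id -> count of its hits in the window
    for t, ch in hits:
        window.append((t, ch))
        channels[ch] = channels.get(ch, 0) + 1
        # Drop hits that have fallen out of the coincidence window.
        while window and t - window[0][0] > WINDOW_NS:
            old_t, old_ch = window.popleft()
            channels[old_ch] -= 1
            if channels[old_ch] == 0:
                del channels[old_ch]
        if len(channels) >= NHIT_THRESHOLD:
            yield t
            window.clear()    # simple dead-time model: reset after a trigger
            channels.clear()

# Example: a burst of hits on 25 channels within 50 ns produces one trigger.
burst = [(1000 + 2 * i, i) for i in range(25)]
print(list(nhit_trigger(burst)))
```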