9 research outputs found

    COMBSS: Best Subset Selection via Continuous Optimization

    Full text link
    The problem of best subset selection in linear regression is considered with the aim of finding a fixed-size subset of features that best fits the response. This is particularly challenging when the total number of available features is very large compared to the number of data samples. Existing optimal methods for solving this problem tend to be slow, while fast methods tend to have low accuracy. Ideally, new methods perform best subset selection faster than existing optimal methods with comparable accuracy, or more accurately than methods of comparable computational speed. Here, we propose a novel continuous optimization method that identifies a subset solution path: a small set of models of varying size that consists of candidates for the single best subset of features, which is optimal in a specific sense in linear regression. Our method turns out to be fast, making best subset selection possible when the number of features is well in excess of thousands. Because of its outstanding overall performance, framing the best subset selection challenge as a continuous optimization problem opens new research directions for feature extraction for a large variety of regression models.

    Dilatancy Equation Based on the Property-Dependent Plastic Potential Theory for Geomaterials

    No full text
    Conventional plastic potential theory assumes that the directions of the stress and the strain rate are coaxial, so the resulting dilatancy equation ignores the noncoaxiality of granular soil, which is inconsistent with extensive laboratory tests. To reasonably describe the noncoaxial effects on dilatancy, the energy dissipation of plastic flow is derived from the property-dependent plastic potential theory for geomaterials, which incorporates noncoaxiality. The potential theory links the plastic strain of granular materials with their fabric, so the noncoaxiality is naturally related to the mesoscopic properties of the material. When the fabric is isotropic, the dilatancy equation degenerates into the form of the critical state theory, and when the fabric is anisotropic, it naturally describes the effects of noncoaxiality. In the plane stress state, a comparison between a simple shear test and the prediction of the dilatancy equation shows that the equation can reasonably describe the effect of noncoaxiality on dilatancy with the introduction of microscopic fabric parameters, and its physical significance is clear. This paper can provide a reference for the theoretical description of the macro and micro mechanical properties of geomaterials.

    Comprehensive analysis of cuproptosis-related lncRNAs signature to predict prognosis in bladder urothelial carcinoma

    No full text
    Background: Cuproptosis-related genes (CRGs) have recently been discovered to regulate the occurrence and development of various tumors by controlling cuproptosis, a novel type of copper ion-dependent cell death. Although cuproptosis is mediated by lipoylated tricarboxylic acid cycle proteins, the relationship between cuproptosis-related long noncoding RNAs (crlncRNAs) in bladder urothelial carcinoma (BLCA) and clinical outcomes, tumor microenvironment (TME) modification, and immunotherapy remains unknown. In this paper, we tried to discover the importance of lncRNAs for BLCA. Methods: The BLCA-related lncRNAs and clinical data were first obtained from The Cancer Genome Atlas (TCGA). CRGs were obtained through coexpression, Cox regression and Lasso regression. A prognosis model was then established for verification. In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis, gene ontology (GO) analysis, principal component analysis (PCA), half-maximal inhibitory concentration (IC50) prediction, and immune status and drug susceptibility analyses were carried out. Results: We identified 277 crlncRNAs and 16 survival-related lncRNAs. According to the 8-crlncRNA risk model, patients could be divided into a high-risk group and a low-risk group. Progression-free survival (PFS), independent prognostic analysis, concordance index (C-index), receiver operating characteristic (ROC) curve and nomogram all confirmed the excellent predictive capability of the 8-lncRNA risk model for BLCA. In the gene mutation burden survival analysis, noticeable differences were observed between high- and low-risk patients. We also found that the two groups of patients might respond differently to immune targets and anti-tumor drugs. Conclusion: The nomogram with 8-lncRNA may help guide treatment of BLCA. More clinical studies are necessary to verify the nomogram.

    Discrepancy Bounds for Deterministic Acceptance-Rejection Samplers

    Full text link
    The Monte Carlo method is one of the widely used numerical methods for simulating probability distributions. Its convergence rate is independent of the dimension but slow. Quasi-Monte Carlo methods, which can be seen as a deterministic version of Monte Carlo methods, have been developed to improve the convergence rate to achieve greater accuracy, which partially depends on generating samples with small discrepancy. Putting the quasi-Monte Carlo idea into statistical sampling is a good way to improve the convergence rate and widen practical applications. In this thesis we focus on constructing low-discrepancy point sets with respect to non-uniform target measures using the acceptance-rejection sampler. We consider the acceptance-rejection samplers based on different driver sequences. The driver sequence is chosen such that the discrepancy between the empirical distribution and the target distribution is small. Hence digital nets, stratified inputs and lattice point sets are used for this purpose. The central contribution in this work is the establishment of discrepancy bounds for samples generated by acceptance-rejection samplers. Together with a Koksma-Hlawka type inequality, we obtain an improvement of the numerical integration error for non-uniform measures. Furthermore we introduce a quality criterion for measuring the goodness of driver sequences in the acceptance-rejection method. Explicit constructions of driver sequences yield a convergence order beyond plain Monte Carlo for samples generated by the deterministic acceptance-rejection samplers in dimension one. The proposed algorithms are numerically tested and compared with the standard acceptance-rejection algorithm using pseudo-random inputs. The empirical evidence confirms that adapting low-discrepancy sequences in the acceptance-rejection sampler outperforms the original algorithm.
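
The idea of a deterministic acceptance-rejection sampler described in the last abstract can be illustrated with a minimal sketch. This is not the thesis's construction: it swaps in a two-dimensional Halton sequence as the driver (the abstract names digital nets, stratified inputs and lattice point sets instead), and the target density `2x` on `[0, 1]`, the bound `2.0`, and all function names are hypothetical choices made for illustration.

```python
def van_der_corput(n, base):
    """Return the n-th element (n >= 1) of the van der Corput sequence."""
    value, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        value += digit / denom
    return value


def halton_2d(count):
    """Deterministic driver: first `count` points of the Halton sequence in bases 2 and 3."""
    return [(van_der_corput(i, 2), van_der_corput(i, 3))
            for i in range(1, count + 1)]


def accept_reject(density, bound, driver):
    """Keep x whenever the driver point (x, u) lies under the graph of density / bound."""
    return [x for x, u in driver if u * bound <= density(x)]


def target_density(x):
    # Hypothetical target: density 2x on [0, 1], bounded above by 2.
    return 2.0 * x


samples = accept_reject(target_density, 2.0, halton_2d(1000))
sample_mean = sum(samples) / len(samples)
```

Because the driver is deterministic, repeated runs return the same samples. For this target the exact mean is 2/3, so `sample_mean` gives a quick sanity check; replacing `halton_2d` with pseudo-random pairs recovers the standard acceptance-rejection algorithm for comparison.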