61 research outputs found

    Convergence Guarantees for Stochastic Subgradient Methods in Nonsmooth Nonconvex Optimization

    In this paper, we investigate the convergence properties of the stochastic gradient descent (SGD) method and its variants, especially in training neural networks built from nonsmooth activation functions. We develop a novel framework that assigns different timescales to stepsizes for updating the momentum terms and variables, respectively. Under mild conditions, we prove the global convergence of our proposed framework in both single-timescale and two-timescale cases. We show that our proposed framework encompasses a wide range of well-known SGD-type methods, including heavy-ball SGD, SignSGD, Lion, normalized SGD and clipped SGD. Furthermore, when the objective function adopts a finite-sum formulation, we prove the convergence properties for these SGD-type methods based on our proposed framework. In particular, we prove that these SGD-type methods find the Clarke stationary points of the objective function with randomly chosen stepsizes and initial points under mild assumptions. Preliminary numerical experiments demonstrate the high efficiency of our analyzed SGD-type methods. Comment: 30 pages; the introduction is modified and some typos are corrected.
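As a rough illustration of the idea of giving the momentum term and the variable their own stepsizes, the following minimal sketch runs a heavy-ball-style stochastic subgradient update on the nonsmooth function f(x) = |x|. It is not the paper's framework: the test function, the noise model, the stepsize schedules `theta` and `eta`, and the function name `two_timescale_heavy_ball` are all assumptions chosen for illustration. The only property borrowed from the abstract is that the two stepsizes decay at different rates (here `eta / theta` tends to 0).

```python
import random


def two_timescale_heavy_ball(x0, steps=2000, seed=0):
    """Heavy-ball-style stochastic subgradient sketch on f(x) = |x|.

    Hypothetical schedules (assumptions, not taken from the paper):
      theta_k = (k + 1) ** -0.5        stepsize for the momentum term
      eta_k   = 0.5 * (k + 1) ** -0.75 stepsize for the variable
    """
    rng = random.Random(seed)
    x, m = x0, 0.0
    for k in range(steps):
        # One element of the Clarke subdifferential of |x|:
        # sign(x) for x != 0, and 0 (which lies in [-1, 1]) at x == 0.
        subgrad = (x > 0) - (x < 0)
        # Stochastic oracle: subgradient plus zero-mean Gaussian noise.
        g = subgrad + rng.gauss(0.0, 0.5)
        theta = (k + 1) ** -0.5
        eta = 0.5 * (k + 1) ** -0.75
        # Momentum update on the faster timescale.
        m += theta * (g - m)
        # Variable update on the slower timescale.
        x -= eta * m
    return x


if __name__ == "__main__":
    print(two_timescale_heavy_ball(3.0))
```

Starting from x0 = 3.0, the iterate should end up close to the minimizer x = 0, which is the only Clarke stationary point of |x|. Replacing `m` by its sign in the variable update would give a SignSGD-like variant of the same sketch.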

    CDOpt: A Python Package for a Class of Riemannian Optimization

    Optimization over the embedded submanifold defined by constraints c(x) = 0 has attracted much interest over the past few decades due to its wide applications in various areas. Plenty of related optimization packages have been developed based on Riemannian optimization approaches, which rely on some basic geometrical materials of Riemannian manifolds, including retractions, vector transports, etc. These geometrical materials can be challenging to determine in general. Existing packages only accommodate a few well-known manifolds whose geometrical materials are easily accessible. For other manifolds that are not contained in these packages, users have to develop the geometrical materials by themselves. In addition, it is not always tractable to adapt advanced features from various state-of-the-art unconstrained optimization solvers to Riemannian optimization approaches. We introduce CDOpt (available at https://cdopt.github.io/), a user-friendly Python package for a class of Riemannian optimization. Based on constraint dissolving approaches, Riemannian optimization problems are transformed into their equivalent unconstrained counterparts in CDOpt. Therefore, solving Riemannian optimization problems through CDOpt directly benefits from various existing solvers and the rich expertise gained over decades for unconstrained optimization. Moreover, all the computations in CDOpt related to any manifold in question are conducted on its constraint expression, hence users can easily define new manifolds in CDOpt without any background in differential geometry. Furthermore, CDOpt extends the neural layers from PyTorch and Flax, thus allowing users to train manifold-constrained neural networks directly with solvers for unconstrained optimization. Extensive numerical experiments demonstrate that CDOpt is highly efficient and robust in solving various classes of Riemannian optimization problems. Comment: 31 pages.

    Molecular-size dependence of glycogen enzymatic degradation and its importance for diabetes

    Glycogen, a hyperbranched glucose polymer, is the blood-sugar reservoir in animals. Liver glycogen comprises small β particles, which can join together as large composite α particles. It has been shown that the binding between β particles within α particles in the liver of diabetic mice is more fragile than in healthy mice. This could be linked to the loss of blood-sugar control characteristic of diabetes if the rate per monomer unit of the enzymatic degradation to glucose of α particles were significantly slower than that of β particles. This is tested here by examining the in vitro time evolution of the molecular size distribution of glycogen from the livers of healthy and diabetic mice and rats, containing distinct components of both α and β particles; this treatment is analogous to the “competitive growth” method used to explore mechanisms in emulsion polymerization. Simulations of the time evolution of the molecular size distribution were also performed. It is found that the degradation rate per monomer unit is indeed faster for the smaller particles, supporting the hypothesis of a causal link between the chemical fragility of glycogen from diabetic liver and poor control of blood-sugar release. Comparison between simulations and experiment indicates that α and β particles have significant structural differences.

    Targeted metabolomics analysis of nucleosides and the identification of biomarkers for colorectal adenomas and colorectal cancer

    The morbidity and mortality of colorectal cancer (CRC) have been increasing in recent years, and early detection of CRC can improve the survival rate of patients. RNA methylation plays crucial roles in many biological processes and has been implicated in the initiation of various diseases, including cancer. Serum contains a variety of biomolecules and is an important clinical sample for biomarker discovery. In this study, we developed a targeted metabolomics method for the quantitative analysis of nucleosides in human serum samples using liquid chromatography with tandem mass spectrometry (LC-MS/MS). We successfully quantified the concentrations of nucleosides in serum samples from 51 healthy controls, 37 patients with colorectal adenomas, and 55 patients with CRC. The results showed that the concentrations of N6-methyladenosine (m6A), N1-methyladenosine (m1A), and 3-methyluridine (m3U) were increased in patients with CRC, whereas the concentrations of N2-methylguanosine (m2G), 2′-O-methyluridine (Um), and 2′-O-methylguanosine (Gm) were decreased in patients with CRC, compared with the healthy controls and patients with colorectal adenomas. Moreover, the levels of Um and Gm were lower in patients with colorectal adenomas than in healthy controls. Interestingly, the levels of Um and Gm gradually decreased in the following order: healthy controls, colorectal adenoma patients, CRC patients. These results revealed that the aberrations of these nucleosides were tightly correlated with colorectal adenomas and CRC. In addition, the present work will stimulate future investigations into the regulatory roles of these nucleosides in the initiation and development of CRC.

    Implications for biological function of lobe dependence of the molecular structure of liver glycogen

    Liver glycogen, a complex branched polymer of glucose, plays a major role in controlling blood-sugar levels. Understanding its molecular structure is important for diabetes, especially since it has been found that this structure is more fragile in diabetic than in healthy mouse liver. However, there are differences in metabolic processes between liver lobes, which would be expected to be reflected in differing glycogen molecular structures. This structure was examined for separated lobe regions in rat livers, using size-exclusion chromatography (SEC) and fluorophore-assisted carbohydrate electrophoresis. The results show that the SEC weight distribution of glycogen, and the molecular weight distribution of individual branches (chains), from different lobes are similar. This shows that (a) molecular structural characterization of glycogen from whole-liver biopsy is representative (which is convenient because the commonest animal model for diabetes is the mouse, whose livers are very small), and (b) molecular structure is conserved (regulated) in different lobes, which suggests that this structure plays an important role in blood-sugar regulation.

    Associations between body composition profile and hypertension in different fatty liver phenotypes

    Background: It is currently unclear whether and how the association between body composition and hypertension varies based on the presence and severity of fatty liver disease (FLD). Methods: FLD was diagnosed using ultrasonography among 6,358 participants. The association between body composition and hypertension was analyzed in the whole population, as well as separately in subgroups of non-FLD, mild FLD, and moderate/severe FLD populations. The mediation effect of FLD in their association was explored. Results: Fat-related anthropometric measurements and lipid metabolism indicators were positively associated with hypertension in both the whole population and the non-FLD subgroup. The strength of this association was slightly reduced in the mild FLD subgroup. Notably, only waist-to-hip ratio and waist-to-height ratio showed significant associations with hypertension in the moderate/severe FLD subgroup. Furthermore, FLD accounted for 17.26% to 38.90% of the association between multiple body composition indicators and the risk of hypertension. Conclusions: The association between body composition and hypertension becomes gradually weaker as FLD becomes more severe. FLD plays a significant mediating role in their association.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.

    A Revised Inverse Data Envelopment Analysis Model Based on Radial Models

    In recent years, there has been an increasing interest in applying inverse data envelopment analysis (DEA) to a wide range of disciplines, and most applications have adopted radial-based inverse DEA models. However, results given by existing radial-based inverse DEA models can be unreliable because they neglect slacks when evaluating the overall efficiency level of decision-making units (DMUs), whereas classic radial DEA models measure the efficiency level through not only the radial efficiency index but also slacks. This paper points out these disadvantages with a counterexample, in which current inverse DEA models yield the result that outputs should increase when inputs decrease. We show that these unreasonable results are the consequence of existing inverse DEA models’ failure to preserve the DMU’s efficiency level. To rectify this problem, we propose a revised model for the situation where the investigated DMU has no slacks. Compared to existing radial inverse DEA models, our revised model preserves the radial efficiency index and eliminates all slacks, thus fulfilling the requirement of efficiency-level invariance. Numerical examples are provided to illustrate the validity and limitations of the revised model.

    An Efficient Orthonormalization-Free Approach for Sparse Dictionary Learning and Dual Principal Component Pursuit

    Sparse dictionary learning (SDL) is a classic representation learning method and has been widely used in data analysis. Recently, ℓ_m-norm (m ≥ 3, m ∈ ℕ) maximization has been proposed to solve SDL, which reshapes the problem into an optimization problem with orthogonality constraints. In this paper, we first propose an ℓ_m-norm maximization model for solving dual principal component pursuit (DPCP) based on the similarities between DPCP and SDL. Then, we propose a smooth unconstrained exact penalty model and show its equivalence with the ℓ_m-norm maximization model. Based on our penalty model, we develop an efficient first-order algorithm for solving our penalty model (PenNMF) and show its global convergence. Extensive experiments illustrate the high efficiency of PenNMF when compared with other state-of-the-art algorithms for solving ℓ_m-norm maximization with orthogonality constraints.