31 research outputs found

    Statistical Analysis of Fixed Mini-Batch Gradient Descent Estimator

    Full text link
    We study here a fixed mini-batch gradient descent (FMGD) algorithm to solve optimization problems with massive datasets. In FMGD, the whole sample is split into multiple non-overlapping partitions. Once the partitions are formed, they are fixed throughout the rest of the algorithm. For convenience, we refer to the fixed partitions as fixed mini-batches. In each iteration, the gradients are then sequentially calculated on each fixed mini-batch. Because the size of each fixed mini-batch is typically much smaller than the whole sample size, its gradient can be computed cheaply. This greatly reduces the computation cost of each iteration and makes FMGD computationally efficient and practically feasible. To demonstrate the theoretical properties of FMGD, we start with a linear regression model with a constant learning rate, and study its numerical convergence and statistical efficiency properties. We find that a sufficiently small learning rate is required for both numerical convergence and statistical efficiency. Nevertheless, an extremely small learning rate might lead to painfully slow numerical convergence. To solve this problem, a diminishing learning rate scheduling strategy can be used, which leads to an FMGD estimator with faster numerical convergence and better statistical efficiency. Finally, FMGD algorithms with random shuffling and a general loss function are also studied.
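The fixed-partition scheme described in the abstract can be sketched in a few lines for the linear-regression case. This is an illustrative reconstruction, not the authors' code; the function name and all tuning values (batch count, learning rate, epoch count) are chosen for illustration only:

```python
import numpy as np

def fmgd_linear_regression(X, y, n_batches=10, lr=0.05, n_epochs=200, seed=0):
    """Fixed mini-batch gradient descent (FMGD) for least squares.

    The sample is split ONCE into fixed, non-overlapping mini-batches;
    every epoch then cycles through the same batches in the same order,
    which is what distinguishes FMGD from ordinary SGD with resampling.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Partition the indices once; the partitions stay fixed afterwards.
    batches = np.array_split(rng.permutation(n), n_batches)
    beta = np.zeros(p)
    for _ in range(n_epochs):
        for b in batches:
            # Gradient of the least-squares loss on one fixed mini-batch.
            grad = X[b].T @ (X[b] @ beta - y[b]) / len(b)
            beta -= lr * grad
    return beta
```

With a constant learning rate the iterates converge to a limit close to, but not identical with, the full-sample OLS estimator, which is the gap the abstract's diminishing-learning-rate schedule closes.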

    RECOST: External Knowledge Guided Data-efficient Instruction Tuning

    Full text link
    In the current landscape of large language models (LLMs), instruction tuning is an essential step. Given its high computing overhead, data-efficient instruction tuning was proposed to reduce the training data size by selecting high-quality instructional data. Nevertheless, we argue that most current data-efficient instruction-tuning methods are highly dependent on the quality of the original instruction-tuning dataset. For datasets synthesized by LLMs, a common scenario in this field, dirty samples will even be selected with a higher probability than other samples. To address these challenges, we utilize external knowledge (relevant examples or paragraphs) to evaluate samples synthesized by LLMs with an in-context-based relative predictive entropy. Based on this new metric, we propose a framework, dubbed RECOST, which integrates external-knowledge-based re-ranking and diversity-consistent sampling into a single pipeline. Through extensive experiments on several synthetic datasets (Alpaca and Alpaca-gpt4), we demonstrate the effectiveness of our method and achieve even better results with only 1% of the full dataset.
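The core scoring idea, entropy with external knowledge in context versus entropy without it, can be sketched independently of any particular LLM. The paper defines its own exact metric; the toy functions below only illustrate the "relative predictive entropy" intuition on arrays of next-token probabilities, and both names are hypothetical:

```python
import numpy as np

def predictive_entropy(token_probs):
    """Mean per-token entropy of a model's next-token distributions.

    token_probs: array of shape (T, V); each row is a probability
    distribution over the vocabulary for one generated token.
    """
    token_probs = np.asarray(token_probs)
    ent = -(token_probs * np.log(token_probs + 1e-12)).sum(axis=1)
    return ent.mean()

def relative_predictive_entropy(probs_plain, probs_with_knowledge):
    """Entropy drop when external knowledge is added to the context.

    A large drop suggests the sample is well supported by the external
    knowledge; a dirty synthetic sample should show little or no drop.
    """
    return predictive_entropy(probs_plain) - predictive_entropy(probs_with_knowledge)
```

Re-ranking would then sort candidate samples by this score before the diversity-consistent sampling stage.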

    An Asymptotic Analysis of Minibatch-Based Momentum Methods for Linear Regression Models

    Full text link
    Momentum methods have been shown to accelerate the convergence of the standard gradient descent algorithm in practice and theory. In particular, minibatch-based gradient descent methods with momentum (MGDM) are widely used to solve large-scale optimization problems with massive datasets. Despite the success of MGDM methods in practice, their theoretical properties are still underexplored. To this end, we investigate the theoretical properties of MGDM methods based on linear regression models. We first study the numerical convergence properties of the MGDM algorithm and further provide the theoretically optimal tuning parameter specification to achieve a faster convergence rate. In addition, we explore the relationship between the statistical properties of the resulting MGDM estimator and the tuning parameters. Based on these theoretical findings, we give the conditions for the resulting estimator to achieve the optimal statistical efficiency. Finally, extensive numerical experiments are conducted to verify our theoretical results.
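For the linear-regression setting the abstract studies, the MGDM update is the classical heavy-ball recursion applied per mini-batch. The sketch below is a generic illustration under that assumption, not the paper's specification; the function name and tuning values are arbitrary:

```python
import numpy as np

def mgdm_linear_regression(X, y, n_batches=10, lr=0.05, momentum=0.9,
                           n_epochs=100, seed=0):
    """Minibatch gradient descent with heavy-ball momentum for least squares.

    Each step combines the current mini-batch gradient with a velocity
    term that accumulates past gradients, which is what accelerates
    convergence relative to plain minibatch gradient descent.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    batches = np.array_split(rng.permutation(n), n_batches)
    beta, v = np.zeros(p), np.zeros(p)
    for _ in range(n_epochs):
        for b in batches:
            grad = X[b].T @ (X[b] @ beta - y[b]) / len(b)
            v = momentum * v - lr * grad   # velocity: decayed sum of gradients
            beta = beta + v
    return beta
```

The learning rate and momentum weight here play the role of the tuning parameters whose optimal specification the paper derives.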

    RAP-SAM: Towards Real-Time All-Purpose Segment Anything

    Full text link
    Advanced by the transformer architecture, vision foundation models (VFMs) have achieved remarkable progress in performance and generalization ability. The Segment Anything Model (SAM) is one remarkable model that achieves generalized segmentation. However, most VFMs cannot run in real time, which makes it difficult to transfer them into products. On the other hand, current real-time segmentation methods mainly serve a single purpose, such as semantic segmentation of driving scenes. We argue that diverse outputs are needed for real applications. Thus, this work explores a new real-time segmentation setting, named all-purpose segmentation in real time, to transfer VFMs into real-time deployment. It contains three different tasks: interactive segmentation, panoptic segmentation, and video segmentation. We aim to use one model to achieve all of these tasks in real time. We first benchmark several strong baselines. Then, we present Real-Time All-Purpose SAM (RAP-SAM), which contains an efficient encoder and an efficient decoupled decoder to perform prompt-driven decoding. Moreover, we explore different training strategies and tuning methods to further boost co-training performance. Our code and model are available at https://github.com/xushilin1/RAP-SAM/. Project page: https://xushilin1.github.io/rap_sam.

    Dynamics of Dalk Glacier in East Antarctica Derived from Multisource Satellite Observations Since 2000

    No full text
    Monitoring variability in outlet glaciers can improve the understanding of feedbacks associated with calving, ocean thermal forcing, and climate change. In this study, we present a remote-sensing investigation of Dalk Glacier in East Antarctica to analyze its dynamic changes. Terminus positions and surface ice velocities were estimated from Landsat and Sentinel-1 data, and a high-precision WorldView digital elevation model (DEM) was generated to determine the location of the potential ice rumple. We detected cyclic behavior in glacier terminus changes and similar periodic increases in surface velocity since 2000. The terminus retreated in 2006, 2009, 2010, and 2016 and advanced in other years. The surface velocity of Dalk Glacier has a 5-year cycle with interannual speed-ups in 2007, 2012, and 2017. Our observations show the relationship between velocity changes and terminus variations, as well as the driving role of the ice rumple. Ice velocity often increases after calving events and continuous retreats. The loss of buttressing provided by the ice rumple may be a primary factor in increases in ice velocity. Given the restraint imposed by the ice rumple, the surface velocity remains relatively stable when the glacier advances. The calving events may be linked to the unstable terminus caused by the ice rumple.

    General method of calculating annular laminar pressure drop of drilling fluids with different rheological models

    No full text
    The traditional annular laminar flow analysis method used in drilling engineering has poor accuracy and does not apply to complicated rheological models. A general method for calculating the annular laminar pressure drop of drilling fluids with different rheological models was proposed, compared with the traditional method, and verified using experimental data from published articles. Based on the annular slot flow model, the pipe flow equation was generalized to the annulus to establish an annular flow equation, and the relationship between the annular flow rate and the shear stress at the pipe wall was established by combining the annular flow equation with the fluid's rheological equation. Given the annular flow rate, the shear stress at the pipe wall can be obtained, and the annular pressure drop then follows. Compared with traditional methods, the general method applies to any rheological model of annular laminar flow and offers good universality, a simple modeling process, and high accuracy. Calculations based on experimental data from published articles show that, whether the flow rate is high or low, the results of the general method match the measured results well, compensating for the error of the traditional method. Key words: annular laminar pressure drop, annular flow equation, non-newtonian fluid, rheological equation, traditional method, general method, drilling fluid
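The "general method" idea, any rheological equation plugged into the slot-flow relationship between flow rate and wall shear stress, can be illustrated numerically. Under the narrow-slot approximation the shear stress varies linearly across the gap, so the flow rate follows from integrating the shear rate twice, and a root search inverts flow rate back to pressure gradient. This is a hedged sketch of that idea only, not the paper's derivation; geometry and fluid numbers are illustrative:

```python
import numpy as np

def slot_flow_rate(dp_dl, r_in, r_out, gamma_of_tau, n_grid=2000):
    """Laminar annular flow rate via the narrow-slot approximation.

    The annulus is unrolled into a slot of width W = pi*(r_in + r_out)
    and height h = r_out - r_in.  Shear stress is linear across the
    half-gap, tau(y) = dp_dl * y, with y measured from the mid-plane.
    gamma_of_tau is ANY rheological model, passed as shear rate vs stress.
    """
    h = r_out - r_in
    W = np.pi * (r_in + r_out)
    y = np.linspace(0.0, h / 2, n_grid)
    dy = y[1] - y[0]
    gdot = gamma_of_tau(dp_dl * y)                  # shear-rate profile
    # u(y) = integral from y to h/2 of gdot (no-slip at the wall).
    u = np.cumsum(gdot[::-1])[::-1] * dy
    # Q = 2 * W * integral from 0 to h/2 of u (trapezoidal rule).
    return 2 * W * np.sum((u[:-1] + u[1:]) * 0.5) * dy

def pressure_gradient(Q, r_in, r_out, gamma_of_tau, lo=1e-3, hi=1e7, iters=80):
    """Bisection: invert the monotone flow-rate relation to get dp/dL."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if slot_flow_rate(mid, r_in, r_out, gamma_of_tau) < Q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For a power-law fluid one would pass `gamma_of_tau = lambda tau: (tau / K) ** (1 / n)`; a Newtonian fluid is simply `tau / mu`, which recovers the analytic slot result Q = W·h³·(dp/dL)/(12·mu).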

    3-D Gabor-based anisotropic diffusion for speckle noise suppression in dynamic ultrasound images

    No full text
    Speckle noise contaminates medical ultrasound images, and the suppression of speckle noise is helpful for image interpretation. Traditional ultrasound denoising (i.e., despeckling) methods are developed on two-dimensional static images. However, one of the advantages of ultrasonography is its nature of dynamic imaging. A method for dynamic ultrasound despeckling is expected to incorporate both the spatial and temporal information in successive images of dynamic ultrasound and thus yield better denoising performance. Here we regard a dynamic ultrasound video as a three-dimensional (3-D) image with two dimensions in the spatial domain and one in the temporal domain, and we propose a despeckling algorithm for dynamic ultrasound named 3-D Gabor-based anisotropic diffusion (GAD-3D). GAD-3D expands the classic two-dimensional Gabor-based anisotropic diffusion (GAD) into the 3-D domain. First, we propose a robust 3-D Gabor-based edge detector by capturing edges with the 3-D Gabor transformation. Then we embed this novel detector into the partial differential equation of GAD to guide the 3-D diffusion process. In the simulation experiment, when the noise variance is as high as 0.14, GAD-3D improves the Pratt's figure of merit, mean structural similarity index, and peak signal-to-noise ratio by 24.32%, 10.98%, and 6.51%, respectively, compared with the best values of seven other methods. Experimental results on clinical dynamic ultrasonography suggest that GAD-3D outperforms the other seven methods in noise reduction and detail preservation. GAD-3D is effective for dynamic ultrasound despeckling and may be potentially valuable for disease assessment in dynamic medical ultrasonography.
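The diffusion backbone of such a method can be sketched as classical Perona-Malik-style anisotropic diffusion on a 3-D volume. The sketch below deliberately substitutes a plain gradient-based conductance for the paper's 3-D Gabor edge detector so it stays self-contained; it is an illustration of the diffusion step, not GAD-3D itself:

```python
import numpy as np

def anisotropic_diffusion_3d(vol, n_iter=10, kappa=0.1, dt=1 / 6):
    """Perona-Malik-style diffusion on a 3-D volume (2 spatial + 1 temporal axis).

    GAD-3D would replace the exponential conductance below with a
    3-D Gabor-based edge detector; the structure of the update is the same.
    dt = 1/6 keeps the explicit 3-D scheme stable (6 neighbors).
    """
    u = vol.astype(float).copy()
    for _ in range(n_iter):
        total = np.zeros_like(u)
        for axis in range(3):
            # Forward and backward neighbor differences, replicated borders.
            fwd = np.diff(u, axis=axis, append=np.take(u, [-1], axis=axis))
            bwd = -np.diff(u, axis=axis, prepend=np.take(u, [0], axis=axis))
            # Edge-stopping conductance: small across strong edges.
            total += np.exp(-(fwd / kappa) ** 2) * fwd
            total += np.exp(-(bwd / kappa) ** 2) * bwd
        u += dt * total
    return u
```

Smoothing along the third (temporal) axis is what lets the dynamic-ultrasound setting exploit frame-to-frame redundancy that 2-D despeckling cannot.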