18 research outputs found
Optimal Subsampling Bootstrap for Massive Data
The bootstrap is a widely used procedure for statistical inference because of
its simplicity and attractive statistical properties. However, the vanilla
version of bootstrap is no longer feasible computationally for many modern
massive datasets due to the need to repeatedly resample the entire data.
Therefore, several improvements to the bootstrap method have been made in
recent years, which assess the quality of estimators by subsampling the full
dataset before resampling the subsamples. Naturally, the performance of these
modern subsampling methods is influenced by tuning parameters such as the size
of subsamples, the number of subsamples, and the number of resamples per
subsample. In this paper, we develop a novel hyperparameter selection
methodology for selecting these tuning parameters. Formulated as an
optimization problem to find the optimal value of some measure of accuracy of
an estimator subject to computational cost, our framework provides closed-form
solutions for the optimal hyperparameter values for subsampled bootstrap,
subsampled double bootstrap and bag of little bootstraps, at no or little extra
time cost. Using the mean squared error as a proxy for the accuracy measure, we
apply our methodology to study, compare and improve the performance of these
modern versions of the bootstrap developed for massive data through a
simulation study. The results are promising.
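The three tuning parameters named in the abstract (subsample size, number of subsamples, resamples per subsample) can be illustrated with a minimal sketch of the bag of little bootstraps for the standard error of a sample mean. This is not the paper's selection methodology: the function name, the choice of the mean as estimator, and all parameter values are assumptions made for illustration, and the parameters are set by hand rather than by the closed-form solutions the paper derives.

```python
import numpy as np

def blb_standard_error(data, subsample_size, n_subsamples, n_resamples, seed=0):
    """Bag-of-little-bootstraps estimate of the standard error of the mean.

    Hypothetical helper for illustration; the three tuning parameters are
    supplied by the caller, not optimized.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = data.size
    uniform = np.full(subsample_size, 1.0 / subsample_size)
    per_subsample = []
    for _ in range(n_subsamples):
        # Draw a small subsample without replacement from the full data.
        subsample = rng.choice(data, size=subsample_size, replace=False)
        # Each resample has nominal size n but is stored as multinomial
        # counts over the subsample, so memory stays O(subsample_size).
        counts = rng.multinomial(n, uniform, size=n_resamples)
        estimates = counts @ subsample / n  # weighted means, one per resample
        per_subsample.append(estimates.std(ddof=1))
    # Average the per-subsample quality assessments.
    return float(np.mean(per_subsample))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=20_000)
    print(blb_standard_error(x, subsample_size=int(x.size ** 0.6),
                             n_subsamples=10, n_resamples=50))
```

Larger values of any of the three parameters reduce the variability of the estimate but raise the computational cost, which is the trade-off the abstract frames as an optimization problem.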
MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning
Large language models (LLMs) have demonstrated impressive in-context learning
(ICL) capabilities, where an LLM makes predictions for a given test input
together with a few input-output pairs (demonstrations). Nevertheless, the
inclusion of demonstrations leads to a quadratic increase in the computational
overhead of the self-attention mechanism. Existing solutions attempt to distill
lengthy demonstrations into compact vectors. However, they often require
task-specific retraining or compromise the LLM's in-context learning performance.
To mitigate these challenges, we present Meta dEmonstratioN Distillation
(MEND), where a language model learns to distill any lengthy demonstrations
into vectors without retraining for a new downstream task. We exploit
knowledge distillation to enhance alignment between MEND and the LLM, achieving
both efficiency and effectiveness simultaneously. MEND is endowed with the
meta-knowledge of distilling demonstrations through a two-stage training
process, which includes meta-distillation pretraining and fine-tuning.
Comprehensive evaluations across seven diverse ICL task partitions using
decoder-only (GPT-2) and encoder-decoder (T5) attest to MEND's prowess. It not
only matches but often outperforms the Vanilla ICL as well as other
state-of-the-art distillation models, while significantly reducing the
computational demands. This innovation promises enhanced scalability and
efficiency for the practical deployment of large language models.
Comment: ICLR 202
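The quadratic overhead mentioned in the abstract can be made concrete with a back-of-the-envelope sketch. It assumes that self-attention cost scales with the square of the sequence length and ignores constant factors, causal masking, and all non-attention layers; the function name and the token counts are hypothetical and are not taken from the paper.

```python
def attention_cost_ratio(n_input_tokens, n_context_tokens):
    """Ratio of pairwise attention-score counts with and without extra context.

    Hypothetical helper: assumes cost is proportional to (sequence length)^2.
    """
    with_context = (n_input_tokens + n_context_tokens) ** 2
    without_context = n_input_tokens ** 2
    return with_context / without_context

if __name__ == "__main__":
    # Raw demonstrations: 900 demonstration tokens prepended to a 100-token input.
    print(attention_cost_ratio(100, 900))
    # Distilled demonstrations: the same input with 100 compact vectors instead.
    print(attention_cost_ratio(100, 100))
```

Under these assumptions, replacing long demonstrations with a short sequence of distilled vectors shrinks the attention cost by the square of the length reduction, which is the motivation the abstract gives for distillation.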
Optimal subsampling bootstrap for massive data
The bootstrap is a widely used procedure for statistical inference because of its simplicity and attractive statistical properties. However, the vanilla version of bootstrap is no longer feasible computationally for many modern massive datasets due to the need to repeatedly resample the entire data. Therefore, several improvements to the bootstrap method have been made in recent years, which assess the quality of estimators by subsampling the full dataset before resampling the subsamples. Naturally, the performance of these modern subsampling methods is influenced by tuning parameters such as the size of subsamples, the number of subsamples, and the number of resamples per subsample. In this paper, we develop a novel hyperparameter selection methodology for selecting these tuning parameters. Formulated as an optimization problem to find the optimal value of some measure of accuracy of an estimator subject to computational cost, our framework provides closed-form solutions for the optimal hyperparameter values for subsampled bootstrap, subsampled double bootstrap and bag of little bootstraps, at no or little extra time cost. Using the mean square errors as a proxy of the accuracy measure, we apply our methodology to study, compare and improve the performance of these modern versions of bootstrap developed for massive data through numerical study. The results are promising.
Numerical Simulation Study on Welding Process of Upper Frame of Hydropower Unit
To optimize the welding process of the upper frame of the hydropower unit, a thermal elastic–plastic (TEP) finite element model of the typical T-joint of the upper frame was established, and the effectiveness and accuracy of the model were verified by welding tests. The effect of welding speed and interlayer cooling time on welding residual stress and deformation was analyzed, and a welding process that meets the requirements was obtained. The inherent strain was then derived from the TEP results, and the inherent strain method (ISM) was used to predict the overall deformation of the upper frame under three welding sequence schemes, from which the optimal welding sequence was identified.
Unveiling poly(rC)-binding protein 2 as the target protein for curcusone C against prostate cancer: mechanism validation through click chemistry-activity based proteomics profiling approach
Background: Prostate cancer is a disease that seriously affects men. However, current interventional therapies for prostate cancer patients have some unavoidable limitations, most of which stem from low selectivity and highly toxic side effects caused by unclear drug targets. In this study, we identified the target protein of Curcusone C, a compound with potential anti-prostate cancer activity, and verified its target and mechanism of action. Methods: The click chemistry-activity based proteomics profiling (CC-ABPP) method was used to find the target protein of Curcusone C against prostate cancer. Competitive CC-ABPP, drug affinity responsive target stability (DARTS) and surface plasmon resonance (SPR) methods were used to verify the target protein. Moreover, the potential mechanism was validated by western blot in vitro and by hematoxylin-eosin (HE) staining, detection of apoptosis in tumor tissue (TUNEL), and immunohistochemistry (IHC) in vivo. Results: We found that poly(rC)-binding protein 2 (PCBP2) was the target protein of Curcusone C. In addition, Curcusone C might disrupt the Bax/Bcl-2 balance in PC-3 cells by inhibiting the expression of the target protein PCBP2, thereby inducing mitochondrial damage and activation of the mitochondrial apoptosis pathway, and ultimately inducing apoptosis of prostate cancer cells. Conclusions: Curcusone C is a potential compound with anti-prostate cancer activity, and this effect occurs by targeting the PCBP2 protein, which in turn may affect the TGF/Smad signaling pathway and Bax/Bcl-2 balance. Our results laid a material and theoretical foundation for Curcusone C, to be widely used in anti-prostate cancer
Core–Shell Structure, Biodegradation, and Drug Release Behavior of Poly(lactic acid)/Poly(ethylene glycol) Block Copolymer Micelles Tuned by Macromolecular Stereostructure
Poly(ethylene glycol)-b-poly(L-lactic acid)-b-poly(D-lactic acid)
(PEG-b-PLLA-b-PDLA) stereoblock copolymers were synthesized by sequential
ring-opening polymerization. Their micelle formation, precise micelle
structure, biodegradation, and drug release behavior were systematically
investigated and compared with those of PEG-b-poly(lactic acid) (PEG-b-PLA)
diblock copolymers with various PLA stereostructures and of a
PEG-b-PLLA/PEG-b-PDLA enantiomeric mixture. Stereoblock copolymers having
comparable PLLA and PDLA block lengths and enantiomerically mixed copolymers
assemble into stereocomplexed core–shell micelles, while the isotactic and
atactic PEG-b-PLA copolymers form homocrystalline and amorphous micelles,
respectively. The PLA segments in stereoblock copolymer micelles show lower
crystallinity than those in the isotactic and enantiomerically mixed ones,
which is attributed to the short block length and the presence of a covalent
junction between the PLLA and PDLA blocks. As indicated by the synchrotron
radiation small-angle X-ray scattering results, the stereoblock copolymer
micelles have a larger size, micellar aggregation number, and core radius, a
smaller core density, and looser packing of core-forming segments than the
isotactic and enantiomerically mixed copolymer micelles. These unique
structural characteristics cause the stereoblock copolymer micelles to possess
higher drug loading content and slower degradation and drug release rates.
High-Refractive-Index Chip with Periodically Fine-Tuning Gratings for Tunable Virtual-Wavevector Spatial Frequency Shift Universal Super-Resolution Imaging.
Funder: Zhejiang University Education Foundation Global Partnership Fund
Funder: Open Foundation of the State Key Laboratory of Modern Optical Instrumentation
Funder: Zhejiang University Micro‐Nano Fabrication Center
Continued research in fields such as materials science and biomedicine requires the development of a super-resolution imaging technique with a large field of view (FOV) and deep subwavelength resolution that is compatible with both fluorescent and nonfluorescent samples. Existing on-chip super-resolution methods exclusively focus on either fluorescent or nonfluorescent imaging, and, as such, there is an urgent requirement for a more general technique that is capable of both modes of imaging. In this study, to realize labeled and label-free super-resolution imaging on a single scalable photonic chip, a universal super-resolution imaging method based on the tunable virtual-wavevector spatial frequency shift (TVSFS) principle is introduced. Using this principle, imaging resolution can be improved more than threefold over the diffraction limit of a linear optical system. Here, diffractive units are fabricated on the chip's surface to provide wavevector-variable evanescent wave illumination, enabling tunable spatial frequency shifts in the Fourier space. A large FOV and resolutions of λ/4.7 and λ/7.1 were achieved for label-free and fluorescently labeled samples, respectively, using a gallium phosphide (GaP) chip. With its large FOV, compatibility with different imaging modes, and monolithic integration, the proposed TVSFS chip may advance fields such as cell engineering, precision industry inspection, and chemical research.