75 research outputs found

    Cascaded Detail-Preserving Networks for Super-Resolution of Document Images

    OCR accuracy usually depends on the quality of the input document image, and various kinds of degraded document images hamper OCR results. Among these, low-resolution images are a common and challenging case. In this paper, we propose cascaded networks for document image super-resolution. Our model is composed of Detail-Preserving Networks, each with a small magnification. The loss function includes perceptual terms designed to simultaneously preserve the original patterns and enhance character edges. These networks share the same architecture but are trained with different parameters, and are then assembled into a pipeline model with a larger magnification. Low-resolution images are upscaled gradually by passing through each Detail-Preserving Network until the final high-resolution images are produced. Through extensive experiments on two scanned document image datasets, we demonstrate that the proposed approach outperforms recent state-of-the-art image super-resolution methods, and that combining it with a standard OCR system leads to significant improvements in recognition results.
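
    The cascading idea in this abstract (several small-magnification stages chained into one larger magnification) can be sketched as follows. This is only an illustration under stated assumptions: `upscale_x2_nearest` is a hypothetical stand-in that uses nearest-neighbour duplication in place of a trained Detail-Preserving Network, and the x2 factor per stage is assumed, not taken from the paper.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def upscale_x2_nearest(img: Image) -> Image:
    """Hypothetical x2 stage: nearest-neighbour upscaling, standing in for a trained network."""
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(2)]  # duplicate each pixel horizontally
        out.append(wide)
        out.append(list(wide))  # duplicate each row vertically
    return out


def cascade(stages: List[Callable[[Image], Image]], img: Image) -> Image:
    """Pass the image through each stage in order; the stage magnifications multiply."""
    for stage in stages:
        img = stage(img)
    return img


low_res: Image = [[0.0, 1.0], [1.0, 0.0]]
# Two assumed x2 stages give an overall x4 magnification: 2x2 -> 4x4 -> 8x8.
high_res = cascade([upscale_x2_nearest, upscale_x2_nearest], low_res)
print(len(high_res), len(high_res[0]))  # 8 8
```

    In a real pipeline each entry of `stages` would be a separately trained model, which is what lets every stage use its own parameters while sharing one architecture.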

    Learning to Generalize Provably in Learning to Optimize

    Learning to optimize (L2O), which automates the design of optimizers through data-driven approaches, has gained increasing popularity. However, current L2O methods often suffer from poor generalization in at least two respects: (i) applying the L2O-learned optimizer to unseen optimizees, in terms of lowering their loss function values (optimizer generalization, or "generalizable learning of optimizers"); and (ii) the test performance of an optimizee (itself a machine learning model) trained by the optimizer, in terms of accuracy on unseen data (optimizee generalization, or "learning to generalize"). While optimizer generalization has been studied recently, optimizee generalization has not been rigorously studied in the L2O context, which is the aim of this paper. We first theoretically establish an implicit connection between the local entropy and the Hessian, and hence unify their roles in the handcrafted design of generalizable optimizers as equivalent metrics of the landscape flatness of loss functions. We then propose to incorporate these two metrics as flatness-aware regularizers into the L2O framework in order to meta-train optimizers to learn to generalize, and theoretically show that such generalization ability can be learned during the L2O meta-training process and then transformed to the optimizee loss function. Extensive experiments consistently validate the effectiveness of our proposals, with substantially improved generalization on multiple sophisticated L2O models and diverse optimizees. Our code is available at: https://github.com/VITA-Group/Open-L2O/tree/main/Model_Free_L2O/L2O-Entropy. Comment: This paper is accepted in AISTATS 202
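
    To make the notion of a flatness-aware regularizer concrete, here is a toy one-dimensional sketch. It is not the paper's regularizer: the finite-difference `curvature` estimate stands in for the Hessian, and `flatness_regularized`, `sharp_loss`, `flat_loss`, and the weight `lam` are all hypothetical names chosen for illustration.

```python
from typing import Callable


def curvature(f: Callable[[float], float], x: float, eps: float = 1e-3) -> float:
    """Central finite-difference estimate of the second derivative f''(x)."""
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / (eps * eps)


def flatness_regularized(f: Callable[[float], float], x: float, lam: float = 0.1) -> float:
    """Loss plus a penalty on local curvature; a larger value means a sharper point."""
    return f(x) + lam * abs(curvature(f, x))


def sharp_loss(x: float) -> float:
    return 10.0 * x * x  # narrow minimum at 0, second derivative 20


def flat_loss(x: float) -> float:
    return 0.1 * x * x  # wide minimum at 0, second derivative 0.2


# Both losses are 0 at x = 0, but the penalty separates the sharp minimum from the flat one.
print(flatness_regularized(sharp_loss, 0.0), flatness_regularized(flat_loss, 0.0))
```

    The point of the sketch is only that two minima with identical loss values can be ranked by a curvature term, which is the role a flatness metric plays when added to a training objective.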

    MCAD: Multi-teacher Cross-modal Alignment Distillation for efficient image-text retrieval

    With the success of large-scale visual-language pretraining models and the wide application of image-text retrieval in industry, reducing model size and streamlining deployment on terminal devices have become urgent needs. The mainstream model structures for image-text retrieval are single-stream and dual-stream, both aiming to close the semantic gap between the visual and textual modalities. Dual-stream models excel at offline indexing and fast inference, while single-stream models achieve more accurate cross-modal alignment by employing adequate feature fusion. We propose a multi-teacher cross-modal alignment distillation (MCAD) technique to integrate the advantages of single-stream and dual-stream models. By incorporating the fused single-stream features into the image and text features of the dual-stream model, we formulate new modified teacher features and logits. We then conduct both logit and feature distillation to boost the capability of the student dual-stream model, achieving high retrieval performance without increasing inference complexity. Extensive experiments demonstrate the remarkable performance and high efficiency of MCAD on image-text retrieval tasks. Furthermore, we implement a mobile CLIP model on Snapdragon chips with only 93M running memory and 30ms search latency, without apparent performance degradation relative to the original large CLIP.
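
    The two distillation signals named in this abstract, logit distillation and feature distillation, are commonly written as a KL divergence between softened score distributions and a mean squared error between feature vectors. The sketch below shows those generic forms only; how MCAD actually builds its teacher features and logits is not specified here, and the function names and the `temperature` default are assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distillation_loss(teacher: List[float], student: List[float],
                            temperature: float = 2.0) -> float:
    """KL(teacher || student) between softened retrieval-score distributions."""
    p = softmax(teacher, temperature)
    q = softmax(student, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def feature_distillation_loss(teacher: List[float], student: List[float]) -> float:
    """Mean squared error between teacher and student feature vectors."""
    return sum((t - s) ** 2 for t, s in zip(teacher, student)) / len(teacher)


# Hypothetical similarity scores of one query against three candidates.
teacher_scores = [4.0, 1.0, 0.5]
student_scores = [2.0, 1.5, 1.0]
print(logit_distillation_loss(teacher_scores, student_scores))
print(feature_distillation_loss([1.0, 2.0], [1.0, 4.0]))  # 2.0
```

    Because both losses only touch the student's outputs during training, the student keeps its dual-stream structure at inference time, which is consistent with the abstract's claim of no added inference complexity.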

    Enhanced soil aggregate stability limits colloidal phosphorus loss potentials in agricultural systems

    Background: Colloid-facilitated phosphorus (P) transport is recognized as an important pathway for the loss of soil P in agricultural systems; however, information regarding soil aggregate-associated colloidal P (Pcoll) is lacking. To elucidate the effects of aggregate size on the potential loss of Pcoll in agricultural systems, soils (0–20 cm depth) from six land-use types were sampled in Zhejiang Province in the Yangtze River Delta region, China. The aggregate size fractions (2–8 mm, 0.26–2 mm, 0.053–0.26 mm and < 0.053 mm) were separated using the wet sieving method. Colloidal P and other soil parameters in the aggregates were analyzed. Results: Our study demonstrated that 0.26–2 mm small macroaggregates had the highest total P (TP) content. In acidic soils, the highest Pcoll content was observed in the 0.26–2 mm aggregates and the lowest in the < 0.053 mm (silt + clay) particles, the opposite of the pattern found in alkaline and neutral soils. Paddy soils contained less Pcoll than other land-use types. The proportion of Pcoll in total dissolved P (TDP) was dominated by the < 0.053 mm (silt + clay) particles. Aggregate size strongly influenced the loss potential of Pcoll in paddy soils, where Pcoll contributed up to 83% of TDP in the silt + clay-sized particles. The Pcoll content was positively correlated with TP, Al, Fe, and the mean weight diameter. Aggregate-associated total carbon (TC), total nitrogen (TN), C/P, and C/N had significant negative effects on the contribution of Pcoll to potential soil P loss. The Pcoll content of the aggregates was controlled by the aggregate-associated TP and Al content, as well as the soil pH value. The potential loss of Pcoll from aggregates was controlled by their organic matter content. Conclusion: We concluded that management practices that increase soil aggregate stability or organic carbon content will limit Pcoll loss in agricultural systems.

    Multi-dimensional variables and feature parameter selection for aboveground biomass estimation of potato based on UAV multispectral imagery

    Aboveground biomass (AGB) is an essential indicator for assessing plant development and guiding agricultural production management in the field. Efficient and accurate access to crop AGB information can therefore provide timely and precise yield estimation, which offers strong support for securing food supply and trade. In this study, spectral, texture, geometric, and frequency-domain variables were extracted from drone multispectral imagery, and the importance of each variable for different combinations of these dimensions was computed by three feature parameter selection methods. The variables selected from the different combinations were used to estimate potato AGB. The results showed that, compared with no feature parameter selection, the accuracy and robustness of the AGB prediction models improved significantly after parameter selection. The random forest out-of-bag (RF-OOB) method proved to be the most effective feature selection method; in combination with RF regression, the coefficient of determination (R²) of the AGB validation model reached 0.90, with a root mean square error (RMSE), mean absolute error (MAE), and normalized RMSE (nRMSE) of 71.68 g/m², 51.27 g/m², and 11.56%, respectively. Meanwhile, the regression models built with the RF-OOB method and variables of all four dimensions mitigated the underestimation of high AGB values. Moreover, the precision of the AGB estimates improved as the dimensionality of the parameters increased. This work can contribute to a rapid, efficient, and non-destructive means of obtaining crop AGB information and provide technical support for high-throughput plant phenotype screening.
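
    For readers unfamiliar with the four validation metrics reported above, a minimal sketch of how they are usually computed follows. The sample values are invented for illustration and are not data from the study, and normalizing RMSE by the mean of the observed values is an assumption, since the abstract does not state which normalization was used.

```python
import math
from typing import Dict, List


def regression_metrics(observed: List[float], predicted: List[float]) -> Dict[str, float]:
    """Return R2, RMSE, MAE, and nRMSE (percent of the observed mean, an assumed convention)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "mae": sum(abs(r) for r in residuals) / n,
        "nrmse_percent": 100.0 * rmse / mean_obs,
    }


# Invented AGB values in g/m2, for illustration only.
observed_agb = [100.0, 200.0, 300.0, 400.0]
predicted_agb = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(observed_agb, predicted_agb)
print(metrics)
```

    RMSE and MAE carry the units of the target (g/m² here), while R² and nRMSE are unitless, which is why the abstract reports the first two in g/m² and the last as a percentage.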

    Viral neutralization by antibody-imposed physical disruption

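    Neutralizing antibodies are a class of immunoglobulins with which the body defends against viral invasion, and they are the main effector molecules through which vaccines act. The known mechanisms of action of neutralizing antibodies mainly include blocking virus-cell interactions and mediating immune opsonization. Recently, results from the team of Professor Xia Ningshao at our university revealed a new neutralization mechanism in which an antibody induces in situ disintegration of the virus. The study reveals for the first time a neutralization mechanism based on direct physical collision by the antibody and proposes a method for eliciting such neutralizing antibodies, which may aid the design of protective antiviral antibodies and vaccines and applies to a range of pathogens, not only hepatitis E virus. Professor Xia Ningshao, Professor Li Shaowei, and Associate Professor Gu Ying of the State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics are the co-corresponding authors of the paper; Dr. Zheng Qingbing, master's student Jiang Jie, doctoral student He Maozhou, and Associate Professor Zheng Zizheng are the co-first authors. In adaptive immunity, organisms produce neutralizing antibodies (nAbs) to eliminate invading pathogens. Here, we explored whether viral neutralization could be attained through the physical disruption of a virus upon nAb binding. We report the neutralization mechanism of a potent nAb, 8C11, against the hepatitis E virus (HEV), a nonenveloped positive-sense single-stranded RNA virus associated with abundant acute hepatitis. The 8C11 binding flanks the protrusion spike of the HEV virus-like particles (VLPs) and leads to tremendous physical collision between the antibody and the capsid, dissociating the VLPs into homodimer species within 2 h. Cryo-electron microscopy reconstruction of the dissociation intermediates at an earlier (15-min) stage revealed smeared protrusion spikes and a loss of icosahedral symmetry, with the capsid core remaining unchanged. This structural disruption leads to the presence of only a few native HEV virions in the ultracentrifugation pellet and exposes the viral genome. Conceptually, we propose a strategy to raise collision-inducing nAbs against single spike moieties that feature in the context of the entire pathogen at positions where the neighboring space cannot afford to accommodate an antibody. This rationale may facilitate unique vaccine development and antimicrobial antibody design. This research was supported by grants from the Natural Science Foundation of Fujian Province (Grant 2017J07005), the National Science and Technology Major Project of Infectious Diseases (Grant 2018ZX10101001-002), and the National Natural Science Foundation of China (Grants 81871247, 81991490, and 81571996). The research was funded by, among others, the National Natural Science Foundation of China (Major Program, Cross-Strait Joint Program, and General Program), the Fujian Provincial Natural Science Fund for Distinguished Young Scholars, and the National Science and Technology Major Project of Infectious Diseases.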