
    Linearized Alternating Direction Method with Parallel Splitting and Adaptive Penalty for Separable Convex Programs in Machine Learning

    Full text link
    Many problems in machine learning and other fields can be (re)formulated as linearly constrained separable convex programs. In most cases there are multiple blocks of variables. However, the traditional alternating direction method (ADM) and its linearized version (LADM, obtained by linearizing the quadratic penalty term) are designed for the two-block case and cannot be naively generalized to the multi-block case, so there is great demand for extending ADM-based methods to multiple blocks. In this paper, we propose LADM with parallel splitting and adaptive penalty (LADMPSAP) to solve multi-block separable convex programs efficiently. When all the component objective functions have bounded subgradients, we obtain convergence results that are stronger than those of ADM and LADM, e.g., allowing the penalty parameter to be unbounded and proving sufficient and necessary conditions for global convergence. We further propose a simple optimality measure and reveal the convergence rate of LADMPSAP in an ergodic sense. For programs with extra convex set constraints, with refined parameter estimation we devise a practical version of LADMPSAP for faster convergence. Finally, we generalize LADMPSAP to handle programs with more difficult objective functions by linearizing part of the objective function as well. LADMPSAP is particularly suitable for sparse representation and low-rank recovery problems because its subproblems have closed-form solutions and the sparsity and low-rankness of the iterates can be preserved during the iterations. It is also highly parallelizable and hence well suited to parallel or distributed computing. Numerical experiments testify to the advantages of LADMPSAP in speed and numerical accuracy. Comment: Preliminary version published on Asian Conference on Machine Learning 201
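
    For concreteness, the following is a minimal numerical sketch of the LADMPSAP scheme described above, for min sum_i f_i(x_i) subject to sum_i A_i x_i = b: each block is updated in parallel via a proximal step on the linearized augmented Lagrangian, the multiplier is updated by dual ascent, and the penalty is allowed to grow. The simple multiplicative penalty rule and the prox_list interface are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def ladmpsap(A_list, b, prox_list, beta=1.0, rho=1.1, beta_max=1e6,
             eta_list=None, max_iter=200, tol=1e-6):
    """Sketch of LADM with parallel splitting and adaptive penalty for
    min sum_i f_i(x_i)  s.t.  sum_i A_i x_i = b.
    prox_list[i](v, t) should return argmin_x f_i(x) + (1/(2t))||x - v||^2.
    The adaptive rule for beta is simplified here (assumption)."""
    n = len(A_list)
    xs = [np.zeros(A.shape[1]) for A in A_list]
    lam = np.zeros(b.shape)                       # Lagrange multiplier
    if eta_list is None:                          # eta_i > n * ||A_i||^2 is a sufficient choice
        eta_list = [1.01 * n * np.linalg.norm(A, 2) ** 2 for A in A_list]
    for _ in range(max_iter):
        resid = sum(A @ x for A, x in zip(A_list, xs)) - b
        # parallel (Jacobi-style) block updates: all blocks use the same residual
        new_xs = []
        for A, x, prox, eta in zip(A_list, xs, prox_list, eta_list):
            grad = A.T @ (lam + beta * resid)     # gradient of the linearized penalty
            new_xs.append(prox(x - grad / (beta * eta), 1.0 / (beta * eta)))
        xs = new_xs
        resid = sum(A @ x for A, x in zip(A_list, xs)) - b
        lam = lam + beta * resid                  # dual ascent step
        beta = min(beta_max, rho * beta)          # non-decreasing (adaptive) penalty
        if np.linalg.norm(resid) < tol:
            break
    return xs
```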

    Study of Rail Transit and Urban Spatial Structure Based on Urban Economics

    Get PDF
    The spatial changes in the utilization intensity of urban land are determined by the dual substitution relationship between transportation costs and rent, together with factor substitution (for producers) or consumption substitution (for residents). Land use intensity in turn directly shapes urban spatial form. This paper studies the relationship between rail transit construction and urban spatial form from the perspectives of urban economics, urban traffic conditions, and spatial structure evolution. It takes the metropolitan areas of Tokyo and Singapore as sample cases to analyse the influence of rail transit on urban development.

    AdvMono3D: Advanced Monocular 3D Object Detection with Depth-Aware Robust Adversarial Training

    Full text link
    Monocular 3D object detection plays a pivotal role in the field of autonomous driving, and numerous deep learning-based methods have made significant breakthroughs in this area. Despite the advancements in detection accuracy and efficiency, these models tend to fail when faced with adversarial attacks, rendering them ineffective. Therefore, bolstering the adversarial robustness of 3D detection models has become a crucial issue that demands immediate attention and innovative solutions. To mitigate this issue, we propose a depth-aware robust adversarial training method for monocular 3D object detection, dubbed DART3D. Specifically, we first design an adversarial attack that iteratively degrades the 2D and 3D perception capabilities of 3D object detection models (IDP), which serves as the foundation for our subsequent defense mechanism. In response to this attack, we propose an uncertainty-based residual learning method for adversarial training. Our adversarial training approach capitalizes on the inherent uncertainty, enabling the model to significantly improve its robustness against adversarial attacks. We conducted extensive experiments on the KITTI 3D datasets, demonstrating that DART3D surpasses direct adversarial training (the most popular approach) under attacks in 3D object detection AP_{R40} of the car category for the Easy, Moderate, and Hard settings, with improvements of 4.415%, 4.112%, and 3.195%, respectively.
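
    As a rough illustration of adversarial training in this setting, the sketch below uses a generic PGD-style iterative attack as a stand-in for the paper's IDP attack; the actual depth-aware attack and the uncertainty-based residual loss are not reproduced, and all function names are hypothetical.

```python
import torch

def pgd_attack(model, images, targets, loss_fn, eps=4/255, alpha=1/255, steps=5):
    """Generic iterative attack on image inputs (a stand-in for IDP, which
    jointly degrades 2D and 3D perception; that attack is not reproduced)."""
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = loss_fn(model(adv), targets)
        grad = torch.autograd.grad(loss, adv)[0]
        adv = adv.detach() + alpha * grad.sign()              # ascend the detection loss
        adv = images + torch.clamp(adv - images, -eps, eps)   # project back to the eps-ball
        adv = adv.clamp(0, 1)
    return adv.detach()

def adversarial_training_step(model, optimizer, images, targets, loss_fn):
    """One step of plain adversarial training: craft attacks, then train on them."""
    adv = pgd_attack(model, images, targets, loss_fn)
    optimizer.zero_grad()
    loss = loss_fn(model(adv), targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```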

    Dependency of the Finite-Impulse-Response-Based Head-Related Impulse Response Model on Filter Order

    Get PDF
    Various approaches have been reported for HRIR modeling to lighten the high computational cost of 3-D audio systems without sacrificing the quality of the rendered sounds. The performance of these HRIR models has usually been evaluated in terms of the objective estimation errors between the original measured HRIRs and the modeled HRIRs. However, it is still unclear how well these objective evaluation results match psychoacoustic evaluations. In this research, an efficient finite-impulse-response (FIR) model, essentially based on the minimum-phase modeling technique, is studied as a case study. The dependency of the modeling accuracy on the FIR filter order is examined with both objective estimation errors and psychoacoustic tests. In the psychoacoustic tests, the MIT HRIR database is used, and stimuli synthesized with the measured HRIRs are compared with those synthesized with FIR models of different orders in terms of sound source localization difference and sound quality difference. Results indicate that the measured hundred-sample-length HRIRs can be sufficiently modeled by the low-order FIR model from a perceptual point of view, and they provide a relationship between perceptual sound localization/quality difference and the objective estimation results that should be useful for evaluating other HRIR modeling approaches.
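
    The minimum-phase modeling idea can be sketched as follows: reconstruct a minimum-phase impulse response from the magnitude spectrum of a measured HRIR via the real cepstrum and truncate it to the desired filter order. This is a generic illustration of the technique named in the abstract, not the authors' exact implementation.

```python
import numpy as np

def minimum_phase_fir(hrir, order):
    """Minimum-phase FIR approximation of a measured HRIR via the real
    cepstrum: the magnitude response is kept, the excess (pure-delay) phase
    is discarded, and the result is truncated to `order` taps."""
    n = len(hrir)
    spec = np.maximum(np.abs(np.fft.fft(hrir, n)), 1e-12)   # avoid log(0)
    cep = np.fft.ifft(np.log(spec)).real                    # real cepstrum
    win = np.zeros(n)                                        # fold the cepstrum
    win[0] = 1.0
    win[1:(n + 1) // 2] = 2.0
    if n % 2 == 0:
        win[n // 2] = 1.0
    min_phase = np.fft.ifft(np.exp(np.fft.fft(cep * win))).real
    return min_phase[:order]                                 # low-order FIR model
```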

    Fearless Luminance Adaptation: A Macro-Micro-Hierarchical Transformer for Exposure Correction

    Full text link
    Photographs taken with less-than-ideal exposure settings often display poor visual quality. Since the correction procedures vary significantly, it is difficult for a single neural network to handle all exposure problems. Moreover, the inherent limitations of convolutions hinder the model's ability to restore faithful color or details in extremely over-/under-exposed regions. To overcome these limitations, we propose a Macro-Micro-Hierarchical transformer, which consists of a macro attention to capture long-range dependencies, a micro attention to extract local features, and a hierarchical structure for coarse-to-fine correction. Specifically, the complementary macro-micro attention designs enhance locality while allowing global interactions. The hierarchical structure enables the network to correct exposure errors of different scales layer by layer. Furthermore, we propose a contrast constraint and couple it seamlessly into the loss function, where the corrected image is pulled towards the positive sample and pushed away from dynamically generated negative samples, so that the remaining color distortion and loss of detail can be removed. We also extend our method as an image enhancer for low-light face recognition and low-light semantic segmentation. Experiments demonstrate that our approach obtains more attractive results than state-of-the-art methods, both quantitatively and qualitatively. Comment: Accepted by ACM MM 202
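
    A contrast constraint of the kind described above can be sketched as a ratio of distances in some feature space, pulling the prediction toward the positive (well-exposed reference) and pushing it away from negatives. The feature extractor, negative sampling scheme, and weighting used in the paper are not specified here and are assumptions.

```python
import torch

def contrast_loss(pred_feat, pos_feat, neg_feats, eps=1e-7):
    """Generic contrastive regularizer: small when the prediction is close to
    the positive reference and far from the negative samples in feature space
    (e.g. features from a fixed pretrained network -- an assumption here)."""
    d_pos = torch.nn.functional.l1_loss(pred_feat, pos_feat)
    d_neg = sum(torch.nn.functional.l1_loss(pred_feat, n) for n in neg_feats)
    return d_pos / (d_neg / len(neg_feats) + eps)
```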

    Multiple Methods to Partition Evapotranspiration in a Maize Field

    Get PDF
    Partitioning evapotranspiration (ET) into soil evaporation E and plant transpiration T is important, but it remains a theoretical and technical challenge. The isotopic technique is considered an effective method, but it is difficult to quantify the isotopic composition of transpiration δ_T and evaporation δ_E directly and continuously, and few previous studies have determined δ_T successfully under a non-steady state (NSS). Here, multiple methods were used to partition ET in a maize field, and a new flow-through chamber system was refined to provide direct and continuous measurement of δ_T and δ_E. An eddy covariance and lysimeter (EC-L)-based method and two isotope-based methods [isotope combined with the Craig–Gordon model (Iso-CG) and isotope using chamber measurement (Iso-M)] were applied to partition ET. Results showed that the transpiration fraction F_T from Iso-CG was consistent with EC-L at both diurnal and growing-season time scales, but F_T calculated by Iso-M was smaller than that from Iso-CG and EC-L. The chamber system presented here to determine δ_T under NSS and isotope steady state (ISS) was robust, but there could be some deviation in measuring δ_E. F_T varied from 52% to 91%, with a mean of 78% during the entire growing season, and it was well described by a function of LAI, with the nonlinear relationship F_T = 0.71·LAI^0.14. The results demonstrate the feasibility of the isotope-based chamber system for partitioning ET. This technique and its further development may enable accurate and continuous field ET partitioning and improve understanding of water cycling through the soil–plant–atmosphere continuum.
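
    The isotope-based partitioning itself rests on a standard two-end-member mixing model, sketched below; the paper's Iso-CG and Iso-M variants differ in how δ_T and δ_E are obtained (Craig–Gordon model versus chamber measurement), which is not reproduced here, and the example values are purely illustrative.

```python
def transpiration_fraction(delta_et, delta_e, delta_t):
    """Two-end-member isotope mixing model for ET partitioning:
    F_T = (delta_ET - delta_E) / (delta_T - delta_E),
    where delta_ET, delta_E, delta_T are the isotopic compositions (per mil)
    of total evapotranspiration, soil evaporation, and transpiration."""
    return (delta_et - delta_e) / (delta_t - delta_e)

# Illustrative values (not from the paper): delta_ET = -6.0, delta_E = -20.0,
# delta_T = -2.0 per mil  ->  F_T = 14 / 18 ≈ 0.78, i.e. about 78% transpiration.
print(transpiration_fraction(-6.0, -20.0, -2.0))
```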

    Online low-rank representation learning for joint multi-subspace recovery and clustering

    Get PDF
    Benefiting from global rank constraints, the low-rank representation (LRR) method has been shown to be an effective solution to subspace learning. However, the global mechanism also means that the LRR model is not suitable for handling large-scale or dynamic data. For large-scale data, the LRR method suffers from high time complexity, and for dynamic data, it has to recompute a complex rank minimization for the entire data set whenever new samples are added, making it prohibitively expensive. Existing attempts at online LRR either take a stochastic approach or build the representation purely from a small sample set and treat new input as out-of-sample data. The former often requires multiple runs for good performance and thus takes longer to run, and the latter formulates online LRR as an out-of-sample classification problem and is less robust to noise. In this paper, a novel online low-rank representation subspace learning method is proposed for both large-scale and dynamic data. The proposed algorithm is composed of two stages: static learning and dynamic updating. In the first stage, the subspace structure is learned from a small number of data samples. In the second stage, the intrinsic principal components of the entire data set are computed incrementally by utilizing the learned subspace structure, and the low-rank representation matrix is also solved incrementally by an efficient online singular value decomposition (SVD) algorithm. The time complexity is reduced dramatically for large-scale data, and repeated computation is avoided for dynamic problems. We further perform theoretical analysis comparing the proposed online algorithm with the batch LRR method. Finally, experimental results on typical tasks of subspace recovery and subspace clustering show that the proposed algorithm performs comparably to or better than batch methods, including batch LRR, and significantly outperforms state-of-the-art online methods.
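
    The incremental SVD step can be illustrated with a generic Brand-style block update that appends new columns to an existing factorization without refactorizing the whole data matrix; this is a stand-in for the paper's online SVD, whose exact formulation may differ.

```python
import numpy as np

def incremental_svd(U, S, Vt, new_cols, rank):
    """Update a rank-truncated SVD X ≈ U diag(S) Vt when new columns C arrive,
    without recomputing the SVD of the full data matrix (Brand-style update)."""
    C = np.atleast_2d(new_cols)
    if C.shape[0] != U.shape[0]:
        C = C.T
    proj = U.T @ C                       # component of C inside the current subspace
    resid = C - U @ proj                 # component orthogonal to it
    Q, R = np.linalg.qr(resid)
    k, c = S.size, C.shape[1]
    K = np.zeros((k + Q.shape[1], k + c))
    K[:k, :k] = np.diag(S)
    K[:k, k:] = proj
    K[k:, k:] = R
    Up, Sp, Vpt = np.linalg.svd(K, full_matrices=False)
    U_new = np.hstack([U, Q]) @ Up
    V_blk = np.zeros((Vt.shape[1] + c, k + c))   # block-diagonal right factor
    V_blk[:Vt.shape[1], :k] = Vt.T
    V_blk[Vt.shape[1]:, k:] = np.eye(c)
    Vt_new = (V_blk @ Vpt.T).T
    return U_new[:, :rank], Sp[:rank], Vt_new[:rank]   # re-truncate to target rank
```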

    ASF-Net: Robust Video Deraining via Temporal Alignment and Online Adaptive Learning

    Full text link
    In recent times, learning-based methods for video deraining have demonstrated commendable results. However, there are two critical challenges that these methods have yet to address: exploiting temporal correlations among adjacent frames and ensuring adaptability to unknown real-world scenarios. To overcome these challenges, we revisit video deraining from the perspectives of paradigm design and learning strategy construction. Specifically, we propose a new computational paradigm, the Alignment-Shift-Fusion Network (ASF-Net), which incorporates a temporal shift module. This module is novel to this field and enables deeper exploration of temporal information by facilitating the exchange of channel-level information within the feature space. To fully exploit the model's characterization capability, we further construct a LArge-scale RAiny video dataset (LARA), which also supports the development of this community. On the basis of the newly constructed dataset, we explore the parameter learning process by developing an innovative re-degraded learning strategy. This strategy bridges the gap between synthetic and real-world scenes, resulting in stronger scene adaptability. Our proposed approach exhibits superior performance on three benchmarks and compelling visual quality in real-world scenarios, underscoring its efficacy. The code is available at https://github.com/vis-opt-group/ASF-Net
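
    The temporal shift idea can be illustrated with a generic shift operation in the spirit of TSM: a fraction of feature channels is moved forward or backward along the frame axis so that adjacent frames exchange channel-level information at essentially no extra cost. The shift ratio and the module's exact placement inside ASF-Net are assumptions.

```python
import torch

def temporal_shift(x, shift_ratio=0.125):
    """Shift a fraction of channels forward/backward along the frame axis.
    x: tensor of shape (batch, frames, channels, height, width)."""
    b, t, c, h, w = x.shape
    fold = max(1, int(c * shift_ratio))
    out = torch.zeros_like(x)
    out[:, 1:, :fold] = x[:, :-1, :fold]                    # first chunk: shift forward in time
    out[:, :-1, fold:2 * fold] = x[:, 1:, fold:2 * fold]    # second chunk: shift backward
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]               # remaining channels unchanged
    return out
```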

    From Text to Pixels: A Context-Aware Semantic Synergy Solution for Infrared and Visible Image Fusion

    Full text link
    With the rapid progression of deep learning technologies, multi-modality image fusion has become increasingly prevalent in object detection tasks. Despite its popularity, the inherent disparities in how different sources depict scene content make fusion a challenging problem. Current fusion methodologies identify shared characteristics between the two modalities and integrate them within this shared domain using either iterative optimization or deep learning architectures. These approaches often neglect the intricate semantic relationships between modalities, resulting in a superficial understanding of inter-modal connections and, consequently, suboptimal fusion outcomes. To address this, we introduce a text-guided multi-modality image fusion method that leverages the high-level semantics of textual descriptions to integrate semantics from infrared and visible images. This method capitalizes on the complementary characteristics of the diverse modalities, bolstering both the accuracy and robustness of object detection. A codebook is utilized to obtain a streamlined and concise representation of the fused intra- and inter-domain dynamics, fine-tuned for optimal performance in detection tasks. We present a bilevel optimization strategy that establishes a nexus between the joint problem of fusion and detection, optimizing both processes concurrently. Furthermore, we introduce the first dataset of paired infrared and visible images accompanied by text prompts, paving the way for future research. Extensive experiments on several datasets demonstrate that our method not only produces visually superior fusion results but also achieves a higher detection mAP than existing methods, achieving state-of-the-art results. Comment: 10 pages, 12 figures, 3 tables, conference
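
    Very loosely, coupling fusion and detection can be sketched as a shared training step in which the detector consumes the fused image and both losses are optimized together. The paper's actual bilevel formulation, codebook, and text conditioning are not reproduced here, and every name below is hypothetical.

```python
import torch

def joint_fusion_detection_step(fusion_net, det_net, opt_fusion, opt_det,
                                ir, vis, text_emb, targets,
                                fusion_loss_fn, det_loss_fn, det_weight=0.5):
    """One joint training step: fuse IR/visible inputs (optionally conditioned
    on a text embedding), run detection on the fused image, and back-propagate
    a weighted sum of the fusion and detection losses through both networks.
    All module and loss names are placeholders, not the paper's API."""
    fused = fusion_net(ir, vis, text_emb)
    loss = fusion_loss_fn(fused, ir, vis) + det_weight * det_loss_fn(det_net(fused), targets)
    opt_fusion.zero_grad()
    opt_det.zero_grad()
    loss.backward()
    opt_fusion.step()
    opt_det.step()
    return loss.item()
```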