195 research outputs found

    Snap-Shot Decentralized Stochastic Gradient Tracking Methods

    In decentralized optimization, $m$ agents form a network and communicate only with their neighbors, which gives advantages in data ownership, privacy, and scalability. At the same time, decentralized stochastic gradient descent (\texttt{SGD}) methods, as popular decentralized algorithms for training large-scale machine learning models, have shown their superiority over centralized counterparts. Distributed stochastic gradient tracking (\texttt{DSGT})~\citep{pu2021distributed} has been recognized as a popular, state-of-the-art decentralized \texttt{SGD} method due to its theoretical guarantees. However, the theoretical analysis of \texttt{DSGT}~\citep{koloskova2021improved} shows that its iteration complexity is $\tilde{\mathcal{O}}\left(\frac{\bar{\sigma}^2}{m\mu\varepsilon} + \frac{\sqrt{L}\bar{\sigma}}{\mu(1 - \lambda_2(W))^{1/2} C_W \sqrt{\varepsilon}}\right)$, where $W$ is a doubly stochastic mixing matrix that represents the network topology and $C_W$ is a parameter that depends on $W$. This indicates that the convergence of \texttt{DSGT} is heavily affected by the topology of the communication network. To overcome this weakness, we resort to the snap-shot gradient tracking technique and propose two novel algorithms. We further show that the two proposed algorithms are more robust to the topology of the communication network while keeping a similar algorithmic structure and the same communication strategy as \texttt{DSGT}. Compared with \texttt{DSGT}, their iteration complexities are $\mathcal{O}\left(\frac{\bar{\sigma}^2}{m\mu\varepsilon} + \frac{\sqrt{L}\bar{\sigma}}{\mu(1 - \lambda_2(W))\sqrt{\varepsilon}}\right)$ and $\mathcal{O}\left(\frac{\bar{\sigma}^2}{m\mu\varepsilon} + \frac{\sqrt{L}\bar{\sigma}}{\mu(1 - \lambda_2(W))^{1/2}\sqrt{\varepsilon}}\right)$, which reduce the impact of network topology (no $C_W$).
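
    For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard gradient tracking recursion, not the snap-shot variants proposed in the paper, whose update rules the abstract does not give. The function name `gradient_tracking`, its arguments, and the constant step size are assumptions made for this sketch.

```python
import numpy as np

def gradient_tracking(W, grads, x0, step, iters):
    """Baseline gradient tracking over a network (illustrative sketch only).

    W     : (m, m) doubly stochastic mixing matrix
    grads : list of m callables; grads[i](x) returns agent i's local
            (possibly stochastic) gradient at x
    x0    : (m, d) array of initial iterates, one row per agent
    """
    m = len(grads)
    x = x0.copy()
    g = np.stack([grads[i](x[i]) for i in range(m)])
    y = g.copy()  # each tracker starts at the agent's own local gradient
    for _ in range(iters):
        # Local descent step along the tracked direction, then neighbor averaging.
        x = W @ (x - step * y)
        g_new = np.stack([grads[i](x[i]) for i in range(m)])
        # Mix the trackers and add the change in local gradients, so the
        # network average of y stays equal to the average local gradient.
        y = W @ y + g_new - g
        g = g_new
    return x
```

    With deterministic local gradients of simple quadratics on a small ring, every row of the returned array approaches the minimizer of the average objective.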

    PPFL: A Personalized Federated Learning Framework for Heterogeneous Population

    Personalization aims to characterize individual preferences and is widely applied across many fields. However, conventional personalized methods operate in a centralized manner and potentially expose the raw data when pooling individual information. In this paper, with privacy considerations, we develop a flexible and interpretable personalized framework within the paradigm of Federated Learning, called PPFL (Population Personalized Federated Learning). By leveraging canonical models to capture fundamental characteristics among the heterogeneous population and employing membership vectors to reveal clients' preferences, it models the heterogeneity as clients' varying preferences for these characteristics and provides substantial insights into client characteristics, which is lacking in existing Personalized Federated Learning (PFL) methods. Furthermore, we explore the relationship between our method and three main branches of PFL methods: multi-task PFL, clustered FL, and decoupling PFL, and demonstrate the advantages of PPFL. To solve PPFL (a non-convex constrained optimization problem), we propose a novel random block coordinate descent algorithm and present the convergence property. We conduct experiments on both pathological and practical datasets, and the results validate the effectiveness of PPFL. Comment: 38 pages, 11 figures

    Reasonable Planning of King County's E-bus Replacement Plan

    With global warming intensifying and air pollution becoming more severe, governments around the world have begun looking for ways to address the problem. Among the candidate solutions, sustainable urban transportation systems attract particular attention because of their clear contribution to reducing greenhouse gas emissions and pollutants. In this paper, we focus on the ecological impact that the promotion of e-buses will have. We also consider the potential financial burden of the transition in order to produce a workable plan for the government to implement. In the first part, to construct a model that measures the ecological impact of the transition in one area, we start by identifying King County as a metropolitan area suitable for prediction. We then collect data on its local bus fleet and e-bus transition plan, and determine the number of buses of each type (diesel, hybrid, and electric) in recent years together with the emissions of each type. We use both ARIMA and least-squares regression to predict the number of these buses in each future year up to the year in which the local government aims to complete the plan, and keep the result of the better-performing model. From this we obtain the emissions of carbon dioxide, nitrogen oxides, and PM10 in each year and evaluate the ecological impact. We also predict the emissions that would result if the bus fleet stayed the same, and observe a decline of over 90% when we compare the post-transition values with this control group. Next, to estimate the financial cost, we identify the main components of the transition and classify them into long-term costs and short-term costs, the latter changing as the plan is gradually implemented. We build a model for each factor we identified and use a Riemann sum to unify the long-term and short-term costs. Based on the data predicted in the first part and data from the local government's website, we estimate the financial cost at about 50 million dollars for King County. By analyzing government grants in other areas, we roughly verify that 50 million dollars would be an acceptable cost. Finally, in developing our 10-year roadmap, we look further into the population distribution of King County and the existing characteristics of its public transportation. Our starting point is to classify the urban pattern into four types of bus operation routes, ranked by passenger flow from high to low. We then define the total number of vehicles to be replaced for each type in order to develop replacement plans. Based on our assumptions, we calculate the carrying capacity of each type of bus in line with the passenger flow. We show that the total carrying capacity of the buses meets the transportation needs of all of King County. We then apply the approach to other regions.

    Enhancing Traffic Prediction with Learnable Filter Module

    Modeling future traffic conditions often relies heavily on complex spatial-temporal neural networks to capture spatial and temporal correlations, which can overlook the inherent noise in the data. This noise, often manifesting as unexpected short-term peaks or drops in traffic observation, is typically caused by traffic accidents or inherent sensor vibration. In practice, such noise can be challenging to model due to its stochastic nature and can lead to overfitting risks if a neural network is designed to learn this behavior. To address this issue, we propose a learnable filter module to filter out noise in traffic data adaptively. This module leverages the Fourier transform to convert the data to the frequency domain, where noise is filtered based on its pattern. The denoised data is then recovered to the time domain using the inverse Fourier transform. Our approach focuses on enhancing the quality of the input data for traffic prediction models, which is a critical yet often overlooked aspect in the field. We demonstrate that the proposed module is lightweight, easy to integrate with existing models, and can significantly improve traffic prediction performance. Furthermore, we validate our approach with extensive experimental results on real-world datasets, showing that it effectively mitigates noise and enhances prediction accuracy
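
    The transform, filter, inverse-transform pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the per-frequency weights are a fixed low-pass mask here, whereas in the paper they are learned, and the names `frequency_filter` and `low_pass_weights` are hypothetical.

```python
import numpy as np

def frequency_filter(series, weights):
    """Scale each frequency component of a 1-D series, then invert.

    series  : 1-D real array of observations (e.g. traffic readings)
    weights : array of length len(series) // 2 + 1, one weight per rfft bin
              (a trainable parameter in a learned filter; fixed here)
    """
    spectrum = np.fft.rfft(series)        # time domain -> frequency domain
    filtered = spectrum * weights         # attenuate the unwanted components
    return np.fft.irfft(filtered, n=len(series))  # back to the time domain

def low_pass_weights(n, keep):
    """Fixed mask that keeps the `keep` lowest-frequency bins of a length-n series."""
    weights = np.zeros(n // 2 + 1)
    weights[:keep] = 1.0
    return weights
```

    Applying `frequency_filter` with an all-ones weight vector returns the input unchanged, which makes a convenient sanity check before swapping in other weights.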

    Accelerated Computation of Free Energy Profile at ab Initio Quantum Mechanical/Molecular Mechanics Accuracy via a Semi-Empirical Reference Potential. I. Weighted Thermodynamics Perturbation

    The free energy profile (FE profile) is an essential quantity for estimating reaction rates and validating reaction mechanisms. For chemical reactions in the condensed phase or enzymatic reactions, computing the FE profile at the ab initio (ai) quantum mechanical/molecular mechanics (QM/MM) level is still far too expensive. Semiempirical (SE) methods can be hundreds or thousands of times faster than ai methods. However, the accuracy of SE methods is often unsatisfactory because of the approximations they adopt. In this work, we propose a new method termed MBAR+wTP, in which the ai QM/MM free energy profile is computed by a weighted thermodynamic perturbation (TP) correction to the SE profile generated by multistate Bennett acceptance ratio (MBAR) analysis of trajectories from umbrella sampling (US). The weight factors used in the TP calculations are a byproduct of the MBAR analysis in the post-processing of the US trajectories, and are often discarded after the free energy calculations. The results show that this approach can enhance the efficiency of ai FE profile calculations by several orders of magnitude.
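
    As a rough illustration of a weighted thermodynamic perturbation step, the sketch below evaluates the generic weighted exponential average $-k_BT \ln \sum_i w_i e^{-\Delta U_i / k_BT}$ over sampled configurations. The abstract does not state the exact estimator, so this form, the function name `weighted_tp_correction`, and its arguments are assumptions. Here `delta_u` stands for the ai-minus-SE energy difference of each configuration and `weights` for the per-configuration weights that an MBAR analysis would supply.

```python
import numpy as np

def weighted_tp_correction(delta_u, weights, kT):
    """Weighted exponential-average free energy correction (illustrative).

    delta_u : energy differences U_ai - U_SE, one per sampled configuration
    weights : non-negative per-configuration weights (normalized internally)
    kT      : thermal energy in the same units as delta_u
    """
    delta_u = np.asarray(delta_u, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize so the weights sum to one
    a = -delta_u / kT
    a_max = a.max()                      # shift the exponent for numerical stability
    return -kT * (a_max + np.log(np.sum(w * np.exp(a - a_max))))
```

    If every configuration has the same energy difference, the correction equals that difference regardless of the weights, which is a quick consistency check.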

    DiffTraj: Generating GPS Trajectory with Diffusion Probabilistic Model

    Pervasive integration of GPS-enabled devices and data acquisition technologies has led to an exponential increase in GPS trajectory data, fostering advancements in spatial-temporal data mining research. Nonetheless, GPS trajectories contain personal geolocation information, raising serious privacy concerns when working with raw data. A promising approach to address this issue is trajectory generation, which involves replacing original data with generated, privacy-free alternatives. Despite the potential of trajectory generation, the complex nature of human behavior and its inherent stochastic characteristics pose challenges in generating high-quality trajectories. In this work, we propose a spatial-temporal diffusion probabilistic model for trajectory generation (DiffTraj). This model effectively combines the generative abilities of diffusion models with the spatial-temporal features derived from real trajectories. The core idea is to reconstruct and synthesize geographic trajectories from white noise through a reverse trajectory denoising process. Furthermore, we propose a Trajectory UNet (Traj-UNet) deep neural network to embed conditional information and accurately estimate noise levels during the reverse process. Experiments on two real-world datasets show that DiffTraj can be intuitively applied to generate high-fidelity trajectories while retaining the original distributions. Moreover, the generated results can support downstream trajectory analysis tasks and significantly outperform other methods in terms of geo-distribution evaluations.

    MUSIED: A Benchmark for Event Detection from Multi-Source Heterogeneous Informal Texts

    Event detection (ED) identifies and classifies event triggers from unstructured texts, serving as a fundamental task for information extraction. Despite the remarkable progress achieved in the past several years, most research efforts focus on detecting events from formal texts (e.g., news articles, Wikipedia documents, financial announcements). Moreover, the texts in each dataset are either from a single source or multiple yet relatively homogeneous sources. With massive amounts of user-generated text accumulating on the Web and inside enterprises, identifying meaningful events in these informal texts, usually from multiple heterogeneous sources, has become a problem of significant practical value. As a pioneering exploration that expands event detection to the scenarios involving informal and heterogeneous texts, we propose a new large-scale Chinese event detection dataset based on user reviews, text conversations, and phone conversations in a leading e-commerce platform for food service. We carefully investigate the proposed dataset's textual informality and multi-source heterogeneity characteristics by inspecting data samples quantitatively and qualitatively. Extensive experiments with state-of-the-art event detection methods verify the unique challenges posed by these characteristics, indicating that multi-source informal event detection remains an open problem and requires further efforts. Our benchmark and code are released at \url{https://github.com/myeclipse/MUSIED}. Comment: Accepted at EMNLP 202

    mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar

    Millimeter Wave (mmWave) Radar is gaining popularity as it can work in adverse environments such as smoke, rain, snow, and poor lighting. Prior work has explored the possibility of reconstructing 3D skeletons or meshes from noisy and sparse mmWave radar signals. However, it is unclear how accurately the 3D body can be reconstructed from mmWave signals across scenes and how this compares with cameras, which are important aspects to consider when either using mmWave radars alone or combining them with cameras. To answer these questions, an automatic 3D body annotation system is first designed and built with multiple sensors to collect a large-scale dataset. The dataset consists of synchronized and calibrated mmWave radar point clouds and RGB(D) images in different scenes, together with skeleton/mesh annotations for the humans in the scenes. With this dataset, we train state-of-the-art methods with inputs from different sensors and test them in various scenarios. The results demonstrate that 1) despite the noise and sparsity of the generated point clouds, the mmWave radar can achieve better reconstruction accuracy than the RGB camera but worse than the depth camera; 2) the reconstruction from the mmWave radar is moderately affected by adverse weather conditions, while the RGB(D) camera is severely affected. Further, analysis of the dataset and the results sheds light on improving the reconstruction from the mmWave radar and the combination of signals from different sensors. Comment: ACM Multimedia 2022, Project Page: https://chen3110.github.io/mmbody/index.htm