312 research outputs found
Optimizing the Long-Term Operating Plan of Railway Marshalling Station for Capacity Utilization Analysis
Not only is the operating plan the basis for organizing a marshalling station's operation, but it is also used to analyze in detail the capacity utilization of each facility in the marshalling station. In this paper, a long-term operating plan is optimized mainly for capacity utilization analysis. Firstly, a model is developed to minimize railcars' average staying time under constraints such as minimum time intervals and marshalling track capacity. Secondly, an algorithm based on a genetic algorithm (GA) and simulation is designed to solve this model. It divides the plan for the whole planning horizon into many subplans and optimizes them with the GA one by one in order to obtain a satisfactory plan with less computing time. Finally, some numerical examples are constructed to analyze (1) the convergence of the algorithm, (2) the effect of some algorithm parameters, and (3) the influence of arrival train flow on the algorithm.
Crew Scheduling Considering both Crew Duty Time Difference and Cost on Urban Rail System
The urban rail crew scheduling problem is to allocate train services to crews based on a given train timetable while satisfying all operational and contractual requirements. In this paper, we present a new mathematical programming model with the aim of minimizing both the costs related to crew duty and the variance of duty time spreads. In addition to incorporating the commonly encountered crew scheduling constraints, it also takes into consideration the constraint of arranging for crews to have a meal within a specific meal period of the day rather than after a minimum continuous service time. The proposed model is solved by an ant colony algorithm built on the construction of an ant travel network and the design of an ant travel path choosing strategy. The performance of the model and the algorithm is evaluated through a case study on the Changsha urban rail system. The results indicate that the proposed method can obtain a satisfactory crew schedule for urban rail systems with a relatively small computational time.
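The two-part objective described in this abstract (duty cost plus variance of duty time) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `schedule_objective`, the linear cost per minute, and the weighting factor `variance_weight` are hypothetical and are not taken from the paper, which does not give its formulation here.

```python
from statistics import pvariance


def schedule_objective(duty_minutes, cost_per_minute=1.0, variance_weight=0.5):
    """Score a crew schedule: lower is better.

    duty_minutes    -- length of each crew duty in minutes
    cost_per_minute -- hypothetical linear cost rate (assumption)
    variance_weight -- hypothetical trade-off weight (assumption)
    """
    # Cost term: total paid duty time across all crews.
    total_cost = cost_per_minute * sum(duty_minutes)
    # Fairness term: population variance of the duty lengths.
    spread_variance = pvariance(duty_minutes)
    return total_cost + variance_weight * spread_variance


# Two schedules with the same total duty time but different balance.
balanced = [420, 430, 410]
uneven = [300, 540, 420]
print(schedule_objective(balanced) < schedule_objective(uneven))
```

With equal total duty time, the balanced schedule scores lower because only the variance term differs.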
Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
In this paper, we present a new approach for model acceleration by exploiting
spatial sparsity in visual data. We observe that the final prediction in vision
Transformers is only based on a subset of the most informative tokens, which is
sufficient for accurate image recognition. Based on this observation, we
propose a dynamic token sparsification framework to prune redundant tokens
progressively and dynamically based on the input to accelerate vision
Transformers. Specifically, we devise a lightweight prediction module to
estimate the importance score of each token given the current features. The
module is added to different layers to prune redundant tokens hierarchically.
While the framework is inspired by our observation of the sparse attention in
vision Transformers, we find the idea of adaptive and asymmetric computation
can be a general solution for accelerating various architectures. We extend our
method to hierarchical models including CNNs and hierarchical vision
Transformers as well as more complex dense prediction tasks that require
structured feature maps by formulating a more generic dynamic spatial
sparsification framework with progressive sparsification and asymmetric
computation for different spatial locations. By applying lightweight fast paths
to less informative features and using more expressive slow paths to more
important locations, we can maintain the structure of feature maps while
significantly reducing the overall computations. Extensive experiments
demonstrate the effectiveness of our framework on various modern architectures
and different visual recognition tasks. Our results clearly demonstrate that
dynamic spatial sparsification offers a new and more effective dimension for
model acceleration. Code is available at
https://github.com/raoyongming/DynamicViT
Comment: Accepted to T-PAMI. Journal version of our NeurIPS 2021 work:
arXiv:2106.02034. Code is available at
https://github.com/raoyongming/DynamicVi
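The progressive token pruning idea in this abstract can be sketched in plain Python. This is a simplified illustration, not the authors' implementation: the names `prune_tokens` and `progressive_prune` are hypothetical, a fixed keep ratio per stage is an assumption, and the importance scores are supplied by hand and reused across stages, whereas the abstract describes a learned prediction module that re-estimates them from the current features.

```python
import math


def prune_tokens(tokens, scores, keep_ratio):
    """Keep the highest-scoring fraction of tokens, preserving their order."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    keep_count = max(1, math.ceil(len(tokens) * keep_ratio))
    # Rank token indices by importance score, highest first.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    # Restore the original ordering of the surviving tokens.
    kept = sorted(ranked[:keep_count])
    return [tokens[i] for i in kept], [scores[i] for i in kept]


def progressive_prune(tokens, scores, keep_ratio, stages):
    """Apply pruning repeatedly, as if at several layers of the network."""
    for _ in range(stages):
        tokens, scores = prune_tokens(tokens, scores, keep_ratio)
    return tokens


tokens = list("abcdefgh")
scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4]
print(progressive_prune(tokens, scores, keep_ratio=0.5, stages=2))
```

Each stage halves the token count here, so later layers would process progressively fewer tokens.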
UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models
Diffusion probabilistic models (DPMs) have demonstrated a very promising
ability in high-resolution image synthesis. However, sampling from a
pre-trained DPM is time-consuming due to the multiple evaluations of the
denoising network, making it more and more important to accelerate the sampling
of DPMs. Despite recent progress in designing fast samplers, existing methods
still cannot generate satisfying images in many applications where fewer steps
(e.g., 10) are favored. In this paper, we develop a unified corrector (UniC)
that can be applied after any existing DPM sampler to increase the order of
accuracy without extra model evaluations, and derive a unified predictor (UniP)
that supports arbitrary order as a byproduct. Combining UniP and UniC, we
propose a unified predictor-corrector framework called UniPC for the fast
sampling of DPMs, which has a unified analytical form for any order and can
significantly improve the sampling quality over previous methods, especially in
extremely few steps. We evaluate our methods through extensive experiments
including both unconditional and conditional sampling using pixel-space and
latent-space DPMs. Our UniPC can achieve 3.87 FID on CIFAR10 (unconditional)
and 7.51 FID on ImageNet 256×256 (conditional) with only 10 function
evaluations. Code is available at https://github.com/wl-zhao/UniPC.
Comment: Accepted by NeurIPS 2023. Project page:
https://unipc.ivg-research.xy
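For readers unfamiliar with predictor-corrector schemes, the general idea can be illustrated with Heun's method for an ordinary differential equation. This is a substituted textbook technique, not UniPC: unlike the corrector described in the abstract, Heun's corrector costs one extra function evaluation per step. The test problem dy/dt = -y and all function names are assumptions chosen for illustration.

```python
import math


def euler_step(f, t, y, h):
    """Predictor only: first-order explicit Euler update."""
    return y + h * f(t, y)


def heun_step(f, t, y, h):
    """Predictor-corrector: Euler prediction, then trapezoidal correction."""
    slope_start = f(t, y)
    y_pred = y + h * slope_start          # predictor
    slope_end = f(t + h, y_pred)          # extra evaluation at the prediction
    return y + 0.5 * h * (slope_start + slope_end)  # corrector


def integrate(step, f, y0, t0, t1, n_steps):
    """Advance y from t0 to t1 in n_steps equal steps using `step`."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = step(f, t, y, h)
        t += h
    return y


def decay(t, y):
    """Right-hand side of dy/dt = -y, whose exact solution is exp(-t)."""
    return -y


exact = math.exp(-1.0)
euler_error = abs(integrate(euler_step, decay, 1.0, 0.0, 1.0, 10) - exact)
heun_error = abs(integrate(heun_step, decay, 1.0, 0.0, 1.0, 10) - exact)
print(heun_error < euler_error)
```

With the same 10 steps, the corrected scheme lands closer to the exact solution, which is the sense in which a corrector raises the order of accuracy.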
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
The recovery of occluded human meshes presents challenges for current methods
due to the difficulty in extracting effective image features under severe
occlusion. In this paper, we introduce DPMesh, an innovative framework for
occluded human mesh recovery that capitalizes on the profound diffusion prior
about object structure and spatial relationships embedded in a pre-trained
text-to-image diffusion model. Unlike previous methods reliant on conventional
backbones for vanilla feature extraction, DPMesh seamlessly integrates the
pre-trained denoising U-Net with potent knowledge as its image backbone and
performs a single-step inference to provide occlusion-aware information. To
enhance the perception capability for occluded poses, DPMesh incorporates
well-designed guidance via condition injection, which produces effective
controls from 2D observations for the denoising U-Net. Furthermore, we explore
a dedicated noisy key-point reasoning approach to mitigate disturbances arising
from occlusion and crowded scenarios. This strategy fully unleashes the
perceptual capability of the diffusion prior, thereby enhancing accuracy.
Extensive experiments affirm the efficacy of our framework, as we outperform
state-of-the-art methods on both occlusion-specific and standard datasets. The
persuasive results underscore its ability to achieve precise and robust 3D
human mesh recovery, particularly in challenging scenarios involving occlusion
and crowded scenes.
Comment: Accepted by IEEE/CVF Conference on Computer Vision and Pattern
Recognition (CVPR) 202
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
Talking head synthesis is a promising approach for the video production
industry. Recently, a lot of effort has been devoted in this research area to
improve the generation quality or enhance the model generalization. However,
there are few works able to address both issues simultaneously, which is
essential for practical applications. To this end, in this paper, we turn
attention to the emerging powerful Latent Diffusion Models, and model the
talking head generation as an audio-driven temporally coherent denoising
process (DiffTalk). More specifically, instead of employing audio signals as
the single driving factor, we investigate the control mechanism of the talking
face, and incorporate reference face images and landmarks as conditions for
personality-aware generalized synthesis. In this way, the proposed DiffTalk is
capable of producing high-quality talking head videos in synchronization with
the source audio, and more importantly, it can be naturally generalized across
different identities without any further fine-tuning. Additionally, our
DiffTalk can be gracefully tailored for higher-resolution synthesis with
negligible extra computational cost. Extensive experiments show that the
proposed DiffTalk efficiently synthesizes high-fidelity audio-driven talking
head videos for generalized novel identities. For more video results, please
refer to https://sstzal.github.io/DiffTalk/.
Comment: Project page https://sstzal.github.io/DiffTalk
Integrated Optimization of Service-Oriented Train Plan and Schedule on Intercity Rail Network with Varying Demand
To improve the service level of a train operating plan, we propose an integrated optimization method for train planning and train scheduling, which are generally optimized separately. Based on a cost analysis of both passenger travel and enterprise operation, and a constraint analysis of train operation, we construct a multiobjective function and build an integrated optimization model with the aim of reducing both passenger travel costs and enterprise operating costs. Then, a solution algorithm is established based on the simulated annealing algorithm. Finally, using the Changzhutan intercity rail network as an example, we analyze the optimized results and the influence of the model parameters on the results.
Reliability analysis of all components in structural systems based on adaptive point estimate method and the principle of maximum entropy
Date: May 14 (Mon), 2018
Place: ROHM Plaza Meeting Room, Kyoto University Katsura Campus, Kyoto, JAPAN
Supported by JSPS-NSFC Japan-China Scientific Cooperation Project
Organized by Structural Engineering of Buildings Laboratory, Department of Architecture and Architectural Engineering, Kyoto Universit
PEACE: Prototype lEarning Augmented transferable framework for Cross-domain rEcommendation
Online service platforms, which help merchants provide and customers access a
variety of services through miniapps, occupy a critical position in effective
content delivery, and recommending items to customers in a new domain launched
by the service provider has become increasingly urgent. However,
the non-negligible gap between the source and diversified target domains poses
a considerable challenge to cross-domain recommendation systems, which often
leads to performance bottlenecks in industrial settings. While entity graphs
have the potential to serve as a bridge between domains, rudimentary
utilization still fails to distill useful knowledge and even induces the negative
transfer issue. To this end, we propose PEACE, a Prototype lEarning Augmented
transferable framework for Cross-domain rEcommendation. For domain gap
bridging, PEACE is built upon a multi-interest and entity-oriented pre-training
architecture which could not only benefit the learning of generalized knowledge
in a multi-granularity manner, but also help leverage more structural
information in the entity graph. Then, we bring the prototype learning into the
pre-training over source domains, so that representations of users and items
are greatly improved by the contrastive prototype learning module and the
prototype enhanced attention mechanism for adaptive knowledge utilization. To
ease the pressure of online serving, PEACE is carefully deployed in a
lightweight manner, and significant performance improvements are observed in
both online and offline environments.
Comment: Accepted by WSDM 202