Causal conditional hidden Markov model for multimodal traffic prediction
Multimodal traffic flow can reflect the health of the transportation system,
and its prediction is crucial to urban traffic management. Recent works
overemphasize spatio-temporal correlations of traffic flow, ignoring the
physical concepts that lead to the generation of observations and their causal
relationship. Spatio-temporal correlations are considered unstable under the
influence of different conditions, and spurious correlations may exist in
observations. In this paper, we analyze the physical concepts affecting the
generation of multimodal traffic flow from the perspective of the observation
generation principle and propose a Causal Conditional Hidden Markov Model
(CCHMM) to predict multimodal traffic flow. In the latent-variable inference
stage, a posterior network disentangles the causal representations of the
concepts of interest from conditional information and observations, and a
causal propagation module mines their causal relationship. In the data
generation stage, a prior network samples the causal latent variables from the
prior distribution and feeds them into the generator to generate multimodal
traffic flow. We use a mutually supervised training method for the prior and
posterior to enhance the identifiability of the model. Experiments on
real-world datasets show that CCHMM can effectively disentangle causal
representations of concepts of interest and identify causality, and accurately
predict multimodal traffic flow.
Comment: 8 pages, 5 figures
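The abstract describes an inference stage (a posterior network plus a causal propagation module) and a generation stage (a prior network feeding a generator). As a rough sketch of that split only, the PyTorch code below uses Gaussian latents and linear encoders; every module name, size, and distributional choice is an illustrative assumption, not the paper's actual CCHMM.

```python
# Minimal sketch of the two-stage design described above, assuming Gaussian
# latents and linear encoders. All module names, sizes, and distributional
# choices are illustrative assumptions, not the paper's actual architecture.
import torch
import torch.nn as nn

class PosteriorNet(nn.Module):
    """Inference stage: infer causal latents from observations x_t
    and conditional information c_t."""
    def __init__(self, obs_dim, cond_dim, latent_dim):
        super().__init__()
        self.enc = nn.Linear(obs_dim + cond_dim, 2 * latent_dim)

    def forward(self, x_t, c_t):
        mu, log_var = self.enc(torch.cat([x_t, c_t], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterize
        return z, mu, log_var

class PriorNet(nn.Module):
    """Generation stage: sample z_t from a prior conditioned on z_{t-1}
    and c_t (a stand-in for the causal propagation step)."""
    def __init__(self, cond_dim, latent_dim):
        super().__init__()
        self.trans = nn.Linear(latent_dim + cond_dim, 2 * latent_dim)

    def forward(self, z_prev, c_t):
        mu, log_var = self.trans(torch.cat([z_prev, c_t], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()
        return z, mu, log_var

class Generator(nn.Module):
    """Decode causal latents back into multimodal traffic observations."""
    def __init__(self, latent_dim, obs_dim):
        super().__init__()
        self.dec = nn.Linear(latent_dim, obs_dim)

    def forward(self, z_t):
        return self.dec(z_t)
```

Training would push the prior and posterior distributions toward each other (e.g., with a KL term) so that, at prediction time, the prior alone can drive the generator, mirroring the mutually supervised prior/posterior training the abstract mentions.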
Shallow and deep convolutional networks for saliency prediction
The prediction of salient areas in images has traditionally been addressed with hand-crafted features based on neuroscience principles. This paper, however, addresses the problem with a completely data-driven approach by training a convolutional neural network (convnet). The learning process is formulated as the minimization of a loss function that measures the Euclidean distance between the predicted saliency map and the provided ground truth. The recent publication of large datasets for saliency prediction has provided enough data to train end-to-end architectures that are both fast and accurate. Two designs are proposed: a shallow convnet trained from scratch, and another, deeper solution whose first three layers are adapted from a network trained for classification. To the authors' knowledge, these are the first end-to-end CNNs trained and tested for the purpose of saliency prediction.
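As an illustration of this setup (not the paper's exact architecture), the sketch below trains a small end-to-end convnet that outputs a one-channel saliency map and minimizes the Euclidean (MSE) loss against the ground truth; the layer widths, input size, and optimizer settings are assumptions.

```python
# Minimal sketch of an end-to-end saliency convnet trained with a
# Euclidean-distance loss. Layer widths and the 96x96 input size are
# illustrative assumptions, not the paper's exact design.
import torch
import torch.nn as nn

class ShallowSaliencyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, kernel_size=1),   # one-channel saliency map
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        )

    def forward(self, x):
        return self.features(x)

model = ShallowSaliencyNet()
criterion = nn.MSELoss()  # Euclidean-distance loss against the ground truth
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

images = torch.randn(4, 3, 96, 96)   # dummy image batch
targets = torch.rand(4, 1, 96, 96)   # ground-truth saliency maps in [0, 1]
optimizer.zero_grad()
loss = criterion(model(images), targets)
loss.backward()
optimizer.step()
```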
Gene expression profile indicates involvement of NO in Camellia sinensis pollen tube growth at low temperature
DEGs identified from the comparison between control (CsPT-CK) and 4 °C-treated (CsPT-LT) pollen tubes. All of the samples were replicated three times. CK and LT FPKM: fragments per kilobase per million reads for each unigene in the CK and LT libraries, respectively. log2Ratio (LT/CK): log2 of the ratio between the LT and CK FPKM values. An absolute log2Ratio > 1 and a probability > 0.7 were used as the thresholds for assigning significance. Annotations of DEGs against the NR, NT, Swiss-Prot, KEGG, COG, and GO databases are all reported in the tables. “-”: no hit. (XLS 381 kb)
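The significance rule above translates directly into code. Below is a small pandas sketch; the file name and the columns fpkm_ck, fpkm_lt, and probability are hypothetical stand-ins for the supplementary table's actual layout.

```python
# Sketch of the significance rule stated above: |log2(LT/CK)| > 1 and
# probability > 0.7. The file name and column names ("fpkm_ck", "fpkm_lt",
# "probability") are hypothetical assumptions about the table's layout.
import numpy as np
import pandas as pd

degs = pd.read_csv("unigene_expression.csv")  # hypothetical input file

degs["log2_ratio"] = np.log2(degs["fpkm_lt"] / degs["fpkm_ck"])
significant = degs[(degs["log2_ratio"].abs() > 1) & (degs["probability"] > 0.7)]
print(f"{len(significant)} DEGs pass the |log2(LT/CK)| > 1, probability > 0.7 cut")
```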
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Self-attention based models such as vision transformers (ViTs) have emerged
as a highly competitive architectural alternative to convolutional neural
networks (CNNs) in computer vision. Despite increasingly strong variants with
ever-higher recognition accuracies, existing ViTs are typically demanding in
computation and model size owing to the quadratic complexity of
self-attention. Although several successful design choices of prior CNNs
(e.g., convolutions and hierarchical multi-stage structures) have been
reintroduced into recent ViTs, they are still insufficient to meet the tight
resource constraints of mobile devices. This has motivated a very recent
attempt to develop light ViTs based on the state-of-the-art MobileNet-v2,
which still leaves a performance gap. In this work, pushing further along this
under-studied direction, we introduce EdgeViTs, a new family of light-weight
ViTs that, for
the first time, enable attention-based vision models to compete with the best
light-weight CNNs in the tradeoff between accuracy and on-device efficiency.
This is realized by introducing a highly cost-effective local-global-local
(LGL) information exchange bottleneck based on optimal integration of
self-attention and convolutions. For device-dedicated evaluation, rather than
relying on inaccurate proxies like the number of FLOPs or parameters, we adopt
a practical approach of focusing directly on on-device latency and, for the
first time, energy efficiency. Specifically, we show that our models are
Pareto-optimal when both accuracy-latency and accuracy-energy trade-offs are
considered, achieving strict dominance over other ViTs in almost all cases and
competing with the most efficient CNNs. Code is available at
https://github.com/saic-fi/edgevit.
Comment: Accepted in ECCV 2022
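As one plausible instantiation of a local-global-local exchange (an assumption for illustration, not the official EdgeViTs block; see the linked repository for that), the sketch below aggregates neighborhood context with a depthwise convolution, runs self-attention only over a subsampled grid of delegate tokens, and propagates the global context back with a depthwise transposed convolution.

```python
# Rough sketch of a local-global-local (LGL) style block: aggregate local
# context with a depthwise conv, attend only over subsampled "delegate"
# tokens, then broadcast the result back. Layer choices are assumptions,
# not the official implementation (https://github.com/saic-fi/edgevit).
import torch
import torch.nn as nn

class LGLBlock(nn.Module):
    def __init__(self, dim, sample_rate=4, num_heads=4):
        super().__init__()
        self.r = sample_rate
        self.local_agg = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # local aggregation
        self.pool = nn.AvgPool2d(sample_rate)                           # pick delegate tokens
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.local_prop = nn.ConvTranspose2d(dim, dim, sample_rate,
                                             stride=sample_rate, groups=dim)

    def forward(self, x):                      # x: (B, C, H, W), H and W divisible by r
        x = x + self.local_agg(x)              # local information exchange
        b, c, h, w = x.shape
        d = self.pool(x)                       # (B, C, H/r, W/r) delegate grid
        t = d.flatten(2).transpose(1, 2)       # (B, N, C) token sequence
        t, _ = self.attn(t, t, t)              # sparse global self-attention
        d = t.transpose(1, 2).reshape(b, c, h // self.r, w // self.r)
        return x + self.local_prop(d)          # broadcast global context back

block = LGLBlock(dim=64)
out = block(torch.randn(2, 64, 32, 32))        # -> torch.Size([2, 64, 32, 32])
```

The point of the pattern is cost: full self-attention over H×W tokens scales with (HW)², whereas attending only over the (H/r)×(W/r) delegates cuts the attention term by roughly a factor of r⁴, while the two depthwise convolutions keep the per-pixel cost linear.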