129 research outputs found
Fixed-point Factorized Networks
In recent years, Deep Neural Network (DNN) based methods have achieved
remarkable performance in a wide range of tasks and have been among the most
powerful and widely used techniques in computer vision. However, DNN-based
methods are both computationally intensive and resource-consuming, which
hinders their application on embedded systems such as smartphones. To
alleviate this problem, we introduce novel Fixed-point Factorized Networks
(FFN) for pretrained models to reduce the computational complexity as well as
the storage requirements of networks. The resulting networks have weights of
only -1, 0, and 1, which eliminates most of the resource-consuming
multiply-accumulate operations (MACs). Extensive experiments on the
large-scale ImageNet classification task show that the proposed FFN requires
only one-thousandth of the multiply operations while achieving comparable
accuracy.
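For intuition about why ternary weights remove multiplications, here is a minimal sketch of projecting a float weight matrix onto {-1, 0, +1}; the magnitude-threshold rule (0.7 times the mean absolute weight) is a common heuristic we assume for illustration, not the paper's fixed-point factorization algorithm.

```python
# Minimal sketch, not the authors' algorithm: project a float weight matrix
# onto ternary values {-1, 0, +1}, so that multiply-accumulates reduce to
# additions/subtractions at inference time. The threshold delta is an
# assumed heuristic; FFN's fixed-point factorization is more involved.
import numpy as np

def ternarize(W: np.ndarray) -> np.ndarray:
    delta = 0.7 * np.abs(W).mean()   # heuristic cutoff (assumption)
    T = np.zeros_like(W)
    T[W > delta] = 1.0
    T[W < -delta] = -1.0
    return T

W = np.random.randn(4, 4)
print(ternarize(W))                  # entries are only -1, 0, or 1
```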
From Hashing to CNNs: Training Binary Weight Networks via Hashing
Deep convolutional neural networks (CNNs) have shown appealing performance on
various computer vision tasks in recent years. This motivates people to deploy
CNNs in real-world applications. However, most state-of-the-art CNNs require
large memory and computational resources, which hinders deployment on
mobile devices. Recent studies show that low-bit weight representations can
greatly reduce storage and memory demands and also enable efficient network
inference. To achieve this goal, we propose a novel approach named BWNH to
train Binary Weight Networks via Hashing. In this paper, we first reveal the
strong connection between inner-product preserving hashing and binary weight
networks, and show that training binary weight networks can be intrinsically
regarded as a hashing problem. Based on this perspective, we propose an
alternating optimization method to learn the hash codes instead of directly
learning binary weights. Extensive experiments on CIFAR10, CIFAR100, and
ImageNet demonstrate that our proposed BWNH outperforms the current
state of the art by a large margin.
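To make the hashing view concrete, the sketch below reconstructs, under our own assumptions, the kind of alternating subproblem such methods address: keep a layer's inner products with data approximately intact, <x, w> ~ alpha * <x, b> with b binary, alternating between the scale alpha and coordinate-wise updates of the code. This is illustrative only, not BWNH's exact algorithm.

```python
# Illustrative reconstruction under assumptions: approximate inner products
# X @ w by alpha * (X @ b) with b in {-1, +1}^n, alternating between the
# closed-form scale and greedy coordinate flips of the binary code.
import numpy as np

def binary_code_for_inner_products(X, w, iters=5):
    b = np.sign(w)
    b[b == 0] = 1.0
    for _ in range(iters):
        Xb = X @ b
        alpha = float((X @ w) @ Xb) / float(Xb @ Xb + 1e-12)  # best scale
        for j in range(b.size):                               # coordinate flips
            for s in (1.0, -1.0):
                b_try = b.copy()
                b_try[j] = s
                if np.linalg.norm(X @ w - alpha * (X @ b_try)) < \
                        np.linalg.norm(X @ w - alpha * (X @ b)):
                    b = b_try
    return alpha, b

X = np.random.randn(32, 8)   # synthetic activations (assumption)
w = np.random.randn(8)
alpha, b = binary_code_for_inner_products(X, w)
print(alpha, b)
```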
Estimating Marginal Hazard Ratios by Simultaneously Using A Set of Propensity Score Models: A Multiply Robust Approach
The inverse probability weighted Cox model is frequently used to estimate marginal hazard ratios. Its validity requires a crucial condition: that the propensity score model is correctly specified. To provide protection against misspecification of the propensity score model, we propose a weighted estimation method rooted in empirical likelihood theory. The proposed estimator is multiply robust in that it is guaranteed to be consistent when a set of postulated propensity score models contains a correctly specified model. Our simulation studies demonstrate satisfactory finite-sample performance of the proposed method in terms of consistency and efficiency. We apply the proposed method to compare the risk of postoperative hospitalization between sleeve gastrectomy and Roux-en-Y gastric bypass using data from a large medical claims and billing database. We further extend the development to multi-site studies, enabling each site to postulate multiple site-specific propensity score models.
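To sketch what "multiply robust via empirical likelihood" can look like in formulas (our reading of the abstract; the notation A_i for treatment, X_i for covariates, and pi_k for the K postulated propensity models is ours, not the paper's), calibration weights for the treated group may be obtained as:

```latex
% Schematic only: empirical-likelihood weights calibrated against K
% postulated propensity score models; consistency then requires only
% that some \pi_k be correctly specified.
\begin{align*}
&\max_{w_i \ge 0}\; \prod_{i:\,A_i=1} w_i
\quad \text{subject to} \quad
\sum_{i:\,A_i=1} w_i = 1, \\
&\sum_{i:\,A_i=1} w_i\,\pi_k(X_i;\hat\gamma_k)
  = \frac{1}{n}\sum_{j=1}^{n}\pi_k(X_j;\hat\gamma_k),
\qquad k = 1,\dots,K.
\end{align*}
```

The resulting weights would then replace the inverse-probability weights in the weighted Cox partial likelihood; this is a plausible reconstruction of the approach, not the paper's exact estimating equations.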
Total thyroidectomy may be more reasonable as initial surgery in unilateral multifocal papillary thyroid microcarcinoma: a single-center experience
The ethics statement of our study, approved by the Ethics Committee of the First Hospital of Jilin University. (DOC 58 kb)
Feature Distilled Tracking
Feature extraction and representation is one of the most important components of fast, accurate, and robust visual tracking. Very deep convolutional neural networks (CNNs) provide effective tools for feature extraction with good generalization ability. However, extracting features with very deep CNN models requires high-performance hardware due to their high computational complexity, which prohibits their use in real-time applications. To alleviate this problem, we aim to obtain small and fast-to-execute shallow models based on model compression for visual tracking. Specifically, we propose a small feature distilled network (FDN) for tracking that imitates the intermediate representations of a much deeper network. The FDN extracts rich visual features at higher speed than the original deeper network. To further speed up, we introduce a shift-and-stitch method that reduces the arithmetic operations while keeping the spatial resolution of the distilled feature maps unchanged. Finally, a scale-adaptive discriminative correlation filter is learned on the distilled features to handle scale variation of the target during tracking. Comprehensive experimental results on object tracking benchmark datasets show that the proposed approach achieves a 5x speed-up with performance competitive with state-of-the-art deep trackers.
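For intuition, a minimal sketch of feature-map distillation in PyTorch follows; the tiny teacher/student architectures and the 1x1 adapter for matching channel counts are placeholders we invented, not the paper's FDN design or its shift-and-stitch trick.

```python
# Minimal sketch of feature-map distillation: train a small "student" to
# imitate an intermediate representation of a deeper frozen "teacher"
# with an L2 regression loss. Architectures here are illustrative only.
import torch
import torch.nn as nn

teacher = nn.Sequential(  # stands in for a deep pretrained CNN (assumption)
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1),
).eval()
for p in teacher.parameters():
    p.requires_grad = False

student = nn.Sequential(  # shallow, fast-to-execute network
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 128, 1),            # 1x1 adapter to match teacher channels
)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
x = torch.randn(8, 3, 64, 64)         # synthetic image batch
loss = nn.functional.mse_loss(student(x), teacher(x))
loss.backward()
opt.step()
```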
Portal Vein Thrombosis in Liver Cirrhosis
In liver cirrhosis, portal vein thrombosis (PVT), defined as thrombosis occurring within the main portal vein and the intrahepatic portal branches, is one of the most common complications. The high incidence of PVT in the setting of liver cirrhosis is mainly due to a hypercoagulable state and altered blood flow dynamics in the portal vein. The clinical manifestations of PVT vary among patients, so its diagnosis depends mainly on imaging examinations such as ultrasound, computed tomography, and magnetic resonance imaging. The overall goals of treatment for PVT are to reduce its risk factors and thereby prevent further expansion of the thrombus, to maintain portal patency, and to prevent and treat the symptoms of PVT with anticoagulants, local thrombolysis, transjugular intrahepatic portosystemic shunt, and/or surgery. In the future, with progress in vascular imaging and innovations in clinical antithrombotic drugs, PVT could be prevented and treated effectively.
One Fits All: Power General Time Series Analysis by Pretrained LM
Although we have witnessed great success of pre-trained models in natural
language processing (NLP) and computer vision (CV), limited progress has been
made in general time series analysis. Unlike NLP and CV, where a unified model
can be used to perform different tasks, specially designed approaches still
dominate each time series analysis task, such as classification, anomaly
detection, forecasting, and few-shot learning. The main challenge blocking
the development of pre-trained models for time series analysis is the lack of
a large amount of training data. In this work, we address this challenge by
leveraging language or CV models, pre-trained on billions of tokens, for time
series analysis. Specifically, we refrain from altering the self-attention and
feedforward layers of the residual blocks in the pre-trained language or image
model. This model, known as the Frozen Pretrained Transformer (FPT), is
evaluated through fine-tuning on all major types of tasks involving time
series. Our results demonstrate that models pre-trained on natural language or
images can deliver comparable or state-of-the-art performance on all main
time series analysis tasks, as illustrated in Figure 1. We also find, both
theoretically and empirically, that the self-attention module behaves
similarly to principal component analysis (PCA), an observation that helps
explain how the transformer bridges the domain gap and that is a crucial step
towards understanding the universality of pre-trained transformers. The code
is publicly available at https://github.com/DAMO-DI-ML/One_Fits_All. Comment: NeurIPS 2023 Spotlight
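As a concrete illustration of the freezing strategy described above, here is a minimal sketch, assuming a HuggingFace GPT-2 backbone, that fixes the self-attention and feed-forward parameters while leaving the rest (layer norms, embeddings) trainable; the name-matching rule is specific to GPT-2's module naming and is our assumption, not code from the paper's repository.

```python
# Minimal sketch (not the authors' code): freeze the self-attention and
# feed-forward blocks of a pretrained GPT-2 so only the remaining
# parameters (layer norms, embeddings, any task head) are fine-tuned.
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2")
for name, param in model.named_parameters():
    # "attn" / "mlp" match GPT-2's attention and feed-forward submodules
    if "attn" in name or "mlp" in name:
        param.requires_grad = False

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # e.g. ln_1/ln_2 layer norms, wte/wpe embeddings
```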
Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method
Generative models have made significant advances in the creation of
realistic videos, which raises security concerns. However, this emerging risk
has not been adequately addressed due to the absence of a benchmark dataset
for AI-generated videos. In this paper, we first construct a video dataset
using advanced diffusion-based video generation algorithms with various
semantic contents. In addition, typical video lossy operations that occur over
network transmission are applied to generate degraded samples. Then, by
analyzing local and global temporal defects of current AI-generated videos, we
construct a novel detection framework that adaptively learns local motion
information and global appearance variation to expose fake videos. Finally,
experiments are conducted to evaluate the generalization and robustness of
different spatial- and temporal-domain detection methods; the results can
serve as a baseline and demonstrate the research challenge for future studies.
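As a highly schematic illustration of the local-and-global idea (our own toy construction; the paper's actual architecture and training details are not given in this abstract), a two-branch detector might consume frame differences for local motion cues and raw frames for global appearance, fusing the logits:

```python
# Toy two-branch detector sketch, under our own assumptions: one branch
# sees frame differences (local temporal defects), the other sees raw
# frames (global appearance variation); real-vs-fake logits are summed.
import torch
import torch.nn as nn

class TwoBranchDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.motion = nn.Sequential(      # local branch on frame differences
            nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, 2))
        self.appearance = nn.Sequential(  # global branch on raw frames
            nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(16, 2))

    def forward(self, video):             # video: (B, C, T, H, W)
        diffs = video[:, :, 1:] - video[:, :, :-1]   # local motion cues
        return self.motion(diffs) + self.appearance(video)

clip = torch.randn(2, 3, 8, 64, 64)
print(TwoBranchDetector()(clip).shape)    # (2, 2) real-vs-fake logits
```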
- …