221 research outputs found
Look at Adjacent Frames: Video Anomaly Detection without Offline Training
We propose a solution to detect anomalous events in videos without the need
to train a model offline. Specifically, our solution is based on a
randomly-initialized multilayer perceptron that is optimized online to
reconstruct video frames, pixel-by-pixel, from their frequency information.
Based on the information shifts between adjacent frames, an incremental learner
is used to update parameters of the multilayer perceptron after observing each
frame, thus allowing anomalous events to be detected along the video stream.
Traditional solutions that require no offline training are limited to operating
on videos with only a few abnormal frames. Our solution breaks this limit and
achieves strong performance on benchmark datasets.
Comment: Accepted in ECCV 2022 RW
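The score-then-update loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a single linear layer over hypothetical sine/cosine encodings of pixel position stands in for the multilayer perceptron, frames are 1-D toy signals, and `features`, `score_then_update`, `make_stream`, the learning rate, and the warm-up length are all assumptions made for illustration.

```python
import math
import random


def features(x):
    # Hypothetical frequency encoding of a pixel position x in [0, 1).
    return [1.0, math.sin(2 * math.pi * x), math.cos(2 * math.pi * x)]


def predict(w, phi):
    return sum(wi * pi for wi, pi in zip(w, phi))


def score_then_update(w, frame, lr=0.2, steps=5):
    """Score `frame` with the current weights, then adapt the weights to it."""
    n = len(frame)
    phis = [features(i / n) for i in range(n)]
    # Anomaly score: mean squared reconstruction error BEFORE the update.
    error = sum((predict(w, p) - y) ** 2 for p, y in zip(phis, frame)) / n
    # Incremental update: a few gradient steps on this frame only.
    for _ in range(steps):
        grad = [0.0] * len(w)
        for p, y in zip(phis, frame):
            residual = predict(w, p) - y
            for j, pj in enumerate(p):
                grad[j] += 2 * residual * pj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return error, w


def make_stream(n_frames=30, n_pixels=16, anomaly_at=20):
    # Toy stream: identical smooth frames, with a bright patch in one frame.
    frames = []
    for t in range(n_frames):
        frame = [0.5 + 0.3 * math.sin(2 * math.pi * i / n_pixels)
                 for i in range(n_pixels)]
        if t == anomaly_at:
            for i in range(4, 8):
                frame[i] += 0.5
        frames.append(frame)
    return frames


random.seed(0)
weights = [random.uniform(-0.5, 0.5) for _ in range(3)]  # random init, no offline training
scores = []
for frame in make_stream():
    score, weights = score_then_update(weights, frame)
    scores.append(score)

warmup = 5  # assumed: ignore the first frames, where random weights dominate the error
most_anomalous = max(range(warmup, len(scores)), key=scores.__getitem__)
```

Each frame is scored before the weights adapt to it, so a frame that differs sharply from its predecessors yields a high reconstruction error. The first few frames also score high simply because the model starts from random weights, which is why the sketch skips a short warm-up.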
Fast Hybrid Cascade for Voxel-based 3D Object Classification
Voxel-based 3D object classification has been frequently studied in recent
years. The previous methods often directly convert the classic 2D convolution
into a 3D form applied to an object with binary voxel representation. In this
paper, we investigate the reason why binary voxel representation is not very
suitable for 3D convolution and how to simultaneously improve the performance
in both accuracy and speed. We show that, with a two-layer fully connected
network, giving each voxel a signed distance value improves accuracy by about
30% compared with the binary voxel representation. We then propose a
fast fully connected and convolution hybrid cascade network for voxel-based 3D
object classification. This three-stage cascade network divides 3D models
into three categories: easy, moderate and hard. Consequently, the mean
inference time (0.3 ms) is about 5x and 2x faster than that of the
state-of-the-art point cloud and voxel-based methods, respectively, while
achieving the highest accuracy among voxel-based methods (92%).
Experiments with ModelNet and MNIST verify the performance of the proposed
hybrid cascade network.
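The abstract does not specify how the cascade routes samples, so the following is only a sketch of one common early-exit scheme: each stage returns class probabilities, and a sample leaves the cascade at the first stage whose top probability clears a threshold. The function `cascade_predict`, the thresholds, and the toy stages (which read precomputed probabilities instead of running a network) are hypothetical.

```python
def cascade_predict(sample, stages, thresholds):
    """Return (label, stage_index) from the first sufficiently confident stage."""
    for stage_index, stage in enumerate(stages):
        probs = stage(sample)
        label = max(range(len(probs)), key=probs.__getitem__)
        is_last = stage_index == len(stages) - 1
        # The last stage always answers; earlier stages answer only if confident.
        if is_last or probs[label] >= thresholds[stage_index]:
            return label, stage_index


def toy_stage(k):
    # Stand-in for a real classifier: looks up hypothetical stage-k probabilities.
    return lambda sample: sample["stage_probs"][k]


stages = [toy_stage(0), toy_stage(1), toy_stage(2)]
thresholds = [0.9, 0.8]  # assumed exit thresholds for the first two stages

easy = {"stage_probs": [[0.95, 0.05], [0.99, 0.01], [0.99, 0.01]]}
moderate = {"stage_probs": [[0.60, 0.40], [0.15, 0.85], [0.05, 0.95]]}
hard = {"stage_probs": [[0.55, 0.45], [0.60, 0.40], [0.30, 0.70]]}

results = [cascade_predict(s, stages, thresholds) for s in (easy, moderate, hard)]
```

Under this scheme, samples that the cheap first stage already classifies confidently never reach the more expensive stages, which is how a cascade can lower the mean inference time.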
An Unified Search and Recommendation Foundation Model for Cold-Start Scenario
In modern commercial search engines and recommendation systems, data from
multiple domains is available to jointly train the multi-domain model.
Traditional methods train multi-domain models in the multi-task setting, with
shared parameters to learn the similarity of multiple tasks, and task-specific
parameters to learn the divergence of features, labels, and sample
distributions of individual tasks. With the development of large language
models, LLMs can extract global domain-invariant text features that serve both
search and recommendation tasks. We propose a novel framework called S&R
Multi-Domain Foundation, which uses an LLM to extract domain-invariant features,
and Aspect Gating Fusion to merge the ID feature, domain-invariant text
features and task-specific heterogeneous sparse features to obtain the
representations of query and item. Additionally, samples from multiple search
and recommendation scenarios are trained jointly with a Domain Adaptive
Multi-Task module to obtain the multi-domain foundation model. We apply the
S&R Multi-Domain Foundation model to cold-start scenarios in a
pretrain-finetune manner, which achieves better performance than other SOTA
transfer learning methods. The S&R Multi-Domain Foundation model has been
successfully deployed in Alipay Mobile Application's online services, such as
content query recommendation and service card recommendation.
Comment: CIKM 2023, 6 page
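The abstract does not give the form of Aspect Gating Fusion. As a generic illustration of gated fusion only, the sketch below mixes three equally sized feature vectors with softmax weights. The function `gated_fusion` and the gate logits (which a real model would compute from learned parameters) are hypothetical.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(features, gate_logits):
    """Weighted sum of same-length feature vectors; weights are softmax(gate_logits)."""
    weights = softmax(gate_logits)
    dim = len(features[0])
    fused = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return fused, weights


# Toy stand-ins for the ID, text, and sparse features named in the abstract.
id_feature = [1.0, 0.0]
text_feature = [0.0, 1.0]
sparse_feature = [2.0, 2.0]

# Equal logits give equal weights, so the fused vector is the plain average.
fused, weights = gated_fusion([id_feature, text_feature, sparse_feature],
                              [0.0, 0.0, 0.0])
```

Raising one gate logit relative to the others shifts the fused representation toward that feature, which is the basic mechanism a gate provides for weighting heterogeneous inputs.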
2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection
This technical report introduces the winning solution of the team Segment Any
Anomaly for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge.
Going beyond uni-modal prompts, e.g., language prompts, we present a novel
framework, i.e., Segment Any Anomaly + (SAA+), for zero-shot anomaly
segmentation with multi-modal prompts for the regularization of cascaded modern
foundation models. Inspired by the great zero-shot generalization ability of
foundation models like Segment Anything, we first explore their assembly (SAA)
to leverage diverse multi-modal prior knowledge for anomaly localization.
Subsequently, we further introduce multimodal prompts (SAA+) derived from
domain expert knowledge and target image context to enable the non-parameter
adaptation of foundation models to anomaly segmentation. The proposed SAA+
model achieves state-of-the-art performance on several anomaly segmentation
benchmarks, including VisA and MVTec-AD, in the zero-shot setting. We will
release the code of our winning solution for the CVPR2023 VAN.
Comment: The first two authors contributed equally. CVPR workshop challenge
report. arXiv admin note: substantial text overlap with arXiv:2305.1072
Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training
Image-text retrieval is a central problem for understanding the semantic
relationship between vision and language, and serves as the basis for various
visual and language tasks. Most previous works either simply learn
coarse-grained representations of the overall image and text, or elaborately
establish the correspondence between image regions or pixels and text words.
However, the close relations between coarse- and fine-grained representations
for each modality are important for image-text retrieval but have been largely
neglected.
As a result, such previous works inevitably suffer from low retrieval accuracy
or heavy computational cost. In this work, we address image-text retrieval from
a novel perspective by combining coarse- and fine-grained representation
learning into a unified framework. This framework is consistent with human
cognition, as humans simultaneously pay attention to the entire sample and
regional elements to understand the semantic content. To this end, a
Token-Guided Dual Transformer (TGDT) architecture which consists of two
homogeneous branches for image and text modalities, respectively, is proposed
for image-text retrieval. The TGDT incorporates both coarse- and fine-grained
retrievals into a unified framework and beneficially leverages the advantages
of both retrieval approaches. A novel training objective called Consistent
Multimodal Contrastive (CMC) loss is proposed accordingly to ensure the intra-
and inter-modal semantic consistencies between images and texts in the common
embedding space. Equipped with a two-stage inference method based on the mixed
global and local cross-modal similarity, the proposed method achieves
state-of-the-art retrieval performance with extremely low inference time when
compared with representative recent approaches.
Comment: Code is publicly available: https://github.com/LCFractal/TGD
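The two-stage inference mentioned in this abstract can be sketched as a coarse shortlist followed by a fine re-ranking. This is an assumption-laden illustration, not the released TGDT code: `two_stage_retrieve`, the mixing weight `alpha`, the shortlist size `top_k`, and the token-level similarity (mean over query tokens of the best-matching candidate token) are all hypothetical choices, and the toy vectors are unit-norm so a dot product equals cosine similarity.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def local_similarity(query_tokens, cand_tokens):
    # Assumed fine-grained score: each query token takes its best candidate match.
    best = [max(dot(q, c) for c in cand_tokens) for q in query_tokens]
    return sum(best) / len(best)


def two_stage_retrieve(query, candidates, top_k=2, alpha=0.5):
    # Stage 1: cheap global similarity over all candidates, keep the top_k.
    shortlist = sorted(candidates,
                       key=lambda c: dot(query["global"], c["global"]),
                       reverse=True)[:top_k]

    # Stage 2: re-rank only the shortlist with a mixed global + local score.
    def mixed(c):
        g = dot(query["global"], c["global"])
        l = local_similarity(query["tokens"], c["tokens"])
        return alpha * g + (1 - alpha) * l

    return [c["name"] for c in sorted(shortlist, key=mixed, reverse=True)]


query = {"global": [1.0, 0.0], "tokens": [[1.0, 0.0], [0.0, 1.0]]}
candidates = [
    {"name": "A", "global": [1.0, 0.0], "tokens": [[1.0, 0.0]]},
    {"name": "B", "global": [0.8, 0.6], "tokens": [[1.0, 0.0], [0.0, 1.0]]},
    {"name": "C", "global": [0.0, 1.0], "tokens": [[0.0, 1.0]]},
]
ranked = two_stage_retrieve(query, candidates)
```

In this toy case, candidate C is dropped by the global stage, and B overtakes A in the second stage because its tokens cover both query tokens. The expensive token-level comparison runs on only `top_k` candidates, which is where the inference-time saving would come from.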
A stable variational formulation for non-ordinary state-based peridynamics
The paper builds a stable variational formulation for non-ordinary state-based peridynamics (NOSB-PD). First, a new force state vector is formulated by introducing the first Piola-Kirchhoff stress from continuum mechanics. The consistency between the new governing equation of the proposed peridynamic model and classical continuum mechanics is proved. Second, a stable variational formulation of non-ordinary state-based peridynamics is developed to unify the boundary conditions in peridynamics and continuum mechanics. The zero-mode oscillations of non-ordinary state-based peridynamics are also eliminated by a penalty method in the numerical implementation. Numerical examples are presented to validate the proposed method. The numerical solutions also indicate that the proposed method captures the general nonlinear behavior of solid materials well.
Functional Organization of a Neural Network for Aversive Olfactory Learning in Caenorhabditis elegans
Many animals use their olfactory systems to learn to avoid dangers, but how neural circuits encode naive and learned olfactory preferences, and switch between those preferences, is poorly understood. Here, we map an olfactory network, from sensory input to motor output, which regulates the learned olfactory aversion of Caenorhabditis elegans for the smell of pathogenic bacteria. Naive animals prefer smells of pathogens but animals trained with pathogens lose this attraction. We find that two different neural circuits subserve these preferences, with one required for the naive preference and the other specifically for the learned preference. Calcium imaging and behavioral analysis reveal that the naive preference reflects the direct transduction of the activity of olfactory sensory neurons into motor response, whereas the learned preference involves modulations to signal transduction to downstream neurons to alter motor response. Thus, two different neural circuits regulate a behavioral switch between naive and learned olfactory preferences.
Organismic and Evolutionary Biology, Physic
Endovascular treatment strategy and clinical outcome of tentorial dural arteriovenous fistula
Introduction: To evaluate treatment strategies and clinical outcomes following endovascular embolization of tentorial dural arteriovenous fistulas. Methods: We retrospectively analyzed 19 patients with tentorial dural arteriovenous fistulas admitted to the Department of Neurosurgery at Jiangsu Provincial People’s Hospital between October 2015 and May 2022, all treated with endovascular therapy. We collected patients’ clinical presentation, imaging data, postoperative complications, and prognosis, and analyzed the safety and clinical outcomes of endovascular treatment of tentorial dural arteriovenous fistulas. Results: Imaging cure was achieved in 18 patients, with the arterial route chosen for embolization in 17 patients and the venous route in one patient; one patient received partial embolization. Staged embolization was performed in four patients. At postoperative follow-up of 9–83 months (37.8 ± 21.2), all 19 patients had recovered well (mRS score ≤ 2). Three patients experienced perioperative complications: intraoperative Onyx reflux into the middle cerebral artery in one patient; postoperative permanent limited left visual field loss and deafness in the left ear in one patient; and transient diplopia, vertigo, and decreased pain and temperature sensation of the left limb in one patient, with no abnormalities on post-procedure magnetic resonance examinations. A total of 17 patients completed a postoperative digital subtraction angiography review during follow-up, and one patient had a recurrence of an arteriovenous fistula. Conclusion: Endovascular treatment of tentorial dural arteriovenous fistulas is safe and effective. Reduction of the Borden or Cognard classification via eliminating cortical venous reflux through multi-staged embolization or combined open surgery is a reasonable goal of treatment where complete obliteration of the fistula is not achievable.
Coronary angiography review in 21 children with Kawasaki disease complicated with coronary artery disease
Objective: To analyze the progression of severe coronary artery lesions in children with Kawasaki disease by coronary angiography, and to evaluate the diagnostic value of echocardiography in these children. Methods: This retrospective analysis enrolled children with Kawasaki disease whose coronary artery lesions were graded Ⅳ or above at Shanghai Children's Medical Center, Shanghai Jiao Tong University School of Medicine, from January 2013 to January 2023. The subjects were required to have undergone coronary angiography at least twice, and their clinical and imaging data were collected to analyze the progression of the lesions. Echocardiography results were compared with the coronary angiography results. Results: A total of 21 children were included (15 males and 6 females), with a median age at onset of 3 years and 6 months, a median age at initial coronary angiography of 7 years and 11 months, a median interval of 4 years and 5 months between onset and initial angiography, a median age at angiographic review of 9 years and 2 months, and a median interval of 1 year and 3 months between initial angiography and review. Coronary stenosis or occlusion was detected in 13 children at the initial angiography, of whom 6 underwent coronary artery bypass grafting (CABG) and had an angiographic review 1 year later. The review showed that the bypass grafts were unobstructed and no obvious stenosis was observed. Fifteen children had progression of the lesions detected by echocardiography during subsequent follow-up and underwent angiographic review, of whom 8 had significant progression of the coronary lesions. Intracoronary balloon dilatation was performed in 1 case, and CABG was performed in another case.
Sixteen lesions of coronary stenosis or occlusion were detected at the initial angiography in the 21 children, while only 1 lesion of coronary stenosis was detected by echocardiography during the same period. Twenty-eight medium- to large-sized coronary aneurysms were detected at the initial angiography in the 21 children, and the diameters of the 28 aneurysms measured by echocardiography and coronary angiography were subjected to Bland-Altman analysis, which showed that the difference in maximum diameter between the 2 methods was (1.63±2.33) mm, with a 95% CI of -2.95 to 6.21 mm. Conclusion: Coronary artery lesions due to Kawasaki disease may be progressive; in children with severe lesions, coronary artery stenosis or occlusion may be missed or misdiagnosed by echocardiography, and echocardiographic measurements of aneurysm diameter may contain some error. Regular coronary angiography review is needed.
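The Bland-Altman figures in this abstract can be checked with a short calculation. In a Bland-Altman analysis, the limits of agreement are conventionally the mean difference plus or minus 1.96 standard deviations. Assuming the reported interval is of that kind (the abstract labels it a 95% CI), the sketch below recomputes it from the reported mean and SD. The paired measurements `method_a` and `method_b` are made-up values used only to show the full computation.

```python
import statistics


def limits_from_summary(bias, sd):
    """Limits of agreement from a mean difference and its standard deviation."""
    return bias - 1.96 * sd, bias + 1.96 * sd


def bland_altman(a, b):
    """Bias, SD of differences, and limits of agreement for paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, sd, limits_from_summary(bias, sd)


# Recompute the interval from the reported summary values (1.63 mm, 2.33 mm).
low, high = limits_from_summary(1.63, 2.33)

# Hypothetical paired diameters in mm, for illustration only.
method_a = [5.0, 6.0, 8.0]
method_b = [4.0, 6.0, 6.0]
bias, sd, (toy_low, toy_high) = bland_altman(method_a, method_b)
```

From the rounded summary values this gives roughly -2.94 to 6.20 mm, which matches the reported -2.95 to 6.21 mm to within rounding and suggests the interval is the limits of agreement.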