221 research outputs found

    Look at Adjacent Frames: Video Anomaly Detection without Offline Training

    We propose a solution to detect anomalous events in videos without the need to train a model offline. Specifically, our solution is based on a randomly-initialized multilayer perceptron that is optimized online to reconstruct video frames, pixel-by-pixel, from their frequency information. Based on the information shifts between adjacent frames, an incremental learner updates the parameters of the multilayer perceptron after observing each frame, making it possible to detect anomalous events along the video stream. Traditional solutions that require no offline training are limited to operating on videos with only a few abnormal frames. Our solution breaks this limit and achieves strong performance on benchmark datasets.
    Comment: Accepted in ECCV 2022 RW

    Fast Hybrid Cascade for Voxel-based 3D Object Classification

    Voxel-based 3D object classification has been frequently studied in recent years. Previous methods often directly convert the classic 2D convolution into a 3D form applied to an object with a binary voxel representation. In this paper, we investigate why the binary voxel representation is not well suited to 3D convolution and how to improve both accuracy and speed at the same time. We show that by giving each voxel a signed distance value, accuracy improves by about 30% over the binary voxel representation when using a two-layer fully connected network. We then propose a fast hybrid cascade network, combining fully connected and convolutional stages, for voxel-based 3D object classification. This three-stage cascade network divides 3D models into three categories: easy, moderate and hard. Consequently, the mean inference time (0.3 ms) is about 5x and 2x faster than that of state-of-the-art point cloud and voxel-based methods, respectively, while achieving the highest accuracy (92%) among voxel-based methods. Experiments with ModelNet and MNIST verify the performance of the proposed hybrid cascade network.

    An Unified Search and Recommendation Foundation Model for Cold-Start Scenario

    In modern commercial search engines and recommendation systems, data from multiple domains is available to jointly train a multi-domain model. Traditional methods train multi-domain models in the multi-task setting, with shared parameters to learn the similarity of multiple tasks, and task-specific parameters to learn the divergence of features, labels, and sample distributions of individual tasks. With the development of large language models (LLMs), an LLM can extract global domain-invariant text features that serve both search and recommendation tasks. We propose a novel framework called S&R Multi-Domain Foundation, which uses an LLM to extract domain-invariant features, and Aspect Gating Fusion to merge the ID feature, domain-invariant text features and task-specific heterogeneous sparse features to obtain the representations of query and item. Additionally, samples from multiple search and recommendation scenarios are trained jointly with a Domain Adaptive Multi-Task module to obtain the multi-domain foundation model. We apply the S&R Multi-Domain Foundation model to cold-start scenarios in a pretrain-finetune manner, which achieves better performance than other SOTA transfer learning methods. The S&R Multi-Domain Foundation model has been successfully deployed in the Alipay Mobile Application's online services, such as content query recommendation and service card recommendation.
    Comment: CIKM 2023, 6 page

    2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection

    This technical report introduces the winning solution of the team Segment Any Anomaly for the CVPR2023 Visual Anomaly and Novelty Detection (VAND) challenge. Going beyond uni-modal prompts, e.g., language prompts, we present a novel framework, i.e., Segment Any Anomaly + (SAA++), for zero-shot anomaly segmentation with multi-modal prompts for the regularization of cascaded modern foundation models. Inspired by the great zero-shot generalization ability of foundation models like Segment Anything, we first explore their assembly (SAA) to leverage diverse multi-modal prior knowledge for anomaly localization. Subsequently, we further introduce multimodal prompts (SAA++) derived from domain expert knowledge and target image context to enable the non-parameter adaptation of foundation models to anomaly segmentation. The proposed SAA++ model achieves state-of-the-art performance on several anomaly segmentation benchmarks, including VisA and MVTec-AD, in the zero-shot setting. We will release the code of our winning solution for the CVPR2023 VAN.
    Comment: The first two authors contributed equally. CVPR workshop challenge report. arXiv admin note: substantial text overlap with arXiv:2305.1072

    Efficient Token-Guided Image-Text Retrieval with Consistent Multimodal Contrastive Training

    Image-text retrieval is a central problem for understanding the semantic relationship between vision and language, and serves as the basis for various visual and language tasks. Most previous works either simply learn coarse-grained representations of the overall image and text, or elaborately establish the correspondence between image regions or pixels and text words. However, the close relations between coarse- and fine-grained representations for each modality are important for image-text retrieval but have been largely neglected. As a result, such previous works inevitably suffer from low retrieval accuracy or heavy computational cost. In this work, we address image-text retrieval from a novel perspective by combining coarse- and fine-grained representation learning into a unified framework. This framework is consistent with human cognition, as humans simultaneously pay attention to the entire sample and regional elements to understand the semantic content. To this end, a Token-Guided Dual Transformer (TGDT) architecture, which consists of two homogeneous branches for the image and text modalities, respectively, is proposed for image-text retrieval. The TGDT incorporates both coarse- and fine-grained retrieval into a unified framework and leverages the advantages of both retrieval approaches. A novel training objective called Consistent Multimodal Contrastive (CMC) loss is proposed accordingly to ensure the intra- and inter-modal semantic consistency between images and texts in the common embedding space. Equipped with a two-stage inference method based on the mixed global and local cross-modal similarity, the proposed method achieves state-of-the-art retrieval performance with extremely low inference time when compared with representative recent approaches.
    Comment: Code is publicly available: https://github.com/LCFractal/TGD

    A stable variational formulation for non-ordinary state-based peridynamics

    The paper builds a stable variational formulation for non-ordinary state-based peridynamics (NOSB-PD). First, the force state vector is reformulated by introducing the first Piola-Kirchhoff stress from continuum mechanics. The consistency between the new governing equation of the proposed peridynamic model and classical continuum mechanics is proved. Second, a stable variational formulation of non-ordinary state-based peridynamics is developed to unify the boundary conditions in peridynamics and continuum mechanics. The zero-mode oscillations of non-ordinary state-based peridynamics are also eliminated by a penalty method in the numerical implementation. Numerical examples are presented to validate the proposed method. The numerical solutions also indicate that the proposed method captures the general nonlinear behavior of solid materials well.

    Endovascular treatment strategy and clinical outcome of tentorial dural arteriovenous fistula

    Introduction: To evaluate treatment strategies and clinical outcomes following endovascular embolization of tentorial dural arteriovenous fistulas. Methods: We retrospectively analyzed 19 patients with tentorial dural arteriovenous fistulas admitted to the Department of Neurosurgery at Jiangsu Provincial People’s Hospital between October 2015 and May 2022, all treated with endovascular therapy. We collected and analyzed the patients’ clinical presentation, imaging data, postoperative complications, and prognosis to assess the safety and clinical outcomes of endovascular treatment of tentorial dural arteriovenous fistulas. Results: Imaging cure was achieved in 18 patients, with the arterial route chosen for embolization in 17 patients and the venous route in one patient; one patient received partial embolization. Staged embolization was performed in four patients. At postoperative follow-up of 9–83 months (37.8 ± 21.2), all 19 patients had recovered well (mRS score ≤ 2). Three patients experienced perioperative complications: intraoperative Onyx reflux into the middle cerebral artery in one patient; postoperative permanent limited left visual field loss and deafness in the left ear in one patient; and transient diplopia, vertigo, and decreased pain and temperature sensation of the left limb in one patient, with no abnormalities on post-procedure magnetic resonance examinations. A total of 17 patients completed a postoperative digital subtraction angiography review during follow-up, and one patient had a recurrence of an arteriovenous fistula. Conclusion: Endovascular treatment of tentorial dural arteriovenous fistulas is safe and effective. Reduction of the Borden or Cognard classification via eliminating cortical venous reflux through multi-staged embolization or combined open surgery is a reasonable goal of treatment where complete obliteration of the fistula is not achievable.

    Coronary angiography review in 21 children with Kawasaki disease complicated with coronary artery disease

    Objective: To analyze the progression of severe coronary artery lesions due to Kawasaki disease in children by coronary angiography, and to evaluate the diagnostic value of echocardiography in these children. Methods: A retrospective analysis enrolled children with Kawasaki disease whose coronary artery lesions were graded Ⅳ or above at Shanghai Children's Medical Center, Shanghai Jiao Tong University School of Medicine, from January 2013 to January 2023. The subjects were required to have undergone at least two coronary angiograms, and their clinical and imaging data were collected to analyze the progression of the lesions. Echocardiography results were compared with the coronary angiography results. Results: A total of 21 children were included, 15 males and 6 females, with a median age at onset of 3 years and 6 months, a median age at initial coronary angiography of 7 years and 11 months, a median interval of 4 years and 5 months between onset and initial angiography, a median age at angiographic review of 9 years and 2 months, and a median interval of 1 year and 3 months between initial angiography and review. Coronary stenosis or occlusion was detected in 13 children at the initial angiography, of whom 6 underwent coronary artery bypass grafting (CABG) and had their angiographic review 1 year later. The review showed that the bypass grafts were unobstructed, with no obvious stenosis. Fifteen children had progression of the lesions detected by echocardiography during subsequent follow-up and underwent angiographic review, of whom 8 had significant progression of the coronary lesions. Intracoronary balloon dilatation was performed in 1 case, and CABG was performed in another case. Sixteen lesions of coronary stenosis or occlusion were detected at the initial angiography in the 21 children, while only 1 lesion of coronary stenosis was detected by echocardiography during the same period. Twenty-eight medium- to large-sized coronary aneurysms were detected at the initial angiography in the 21 children, and the diameters of the 28 aneurysms measured by echocardiography and coronary angiography were compared using Bland-Altman analysis. The analysis showed that the difference in maximum diameter between the 2 methods was (1.63±2.33) mm, with a 95% CI of -2.95 to 6.21 mm. Conclusion: Coronary artery lesions due to Kawasaki disease may be progressive; in children with severe lesions, coronary artery stenosis or occlusion may be missed or misdiagnosed by echocardiography, and its measurements of aneurysm diameters may contain some error. Regular coronary angiography review is needed.