26 research outputs found

Tuberculosis Detection Through Image Segmentation Using the K-Means Algorithm (DETEKSI PENYAKIT TUBERKULOSIS MELALUI SEGMENTASI CITRA MENGGUNAKAN ALGORITMA K-MEANS)

Author: Agung Ignatius Wiseto Prasetyo
Marco Niki
Sadikin Nanda Dwi Husna
Sadikin Nandi Dwi Husni
Wati Sesilia
Publication venue: Universitas Sriwijaya
Publication date: 16/10/2023

Tuberculosis has become a very dangerous disease; its rapid and easy transmission makes it the most dangerous infectious disease in the world today. Detecting the Mycobacterium tuberculosis bacterium is therefore needed to speed up diagnosis, so that patients can be treated promptly and transmission can be stopped. In this study, an image segmentation approach that combines the LAB color model with the K-Means clustering algorithm is proposed to accurately separate the image regions containing tuberculosis bacteria from the background. First, the microscopic image is converted to the LAB color space in order to extract the color component that is most sensitive to intensity differences in images of Mycobacterium tuberculosis. Next, the K-Means clustering algorithm groups the image pixels into several clusters based on their intensity differences. Experimental results show that this approach can isolate the regions containing Mycobacterium tuberculosis in microscopic images with high accuracy and efficiency. Although the high accuracy was established by visual inspection, it is important to note that validating the segmentation accuracy remains a challenge because there is no objective way to confirm the presence of tuberculosis bacteria in the resulting images. Nevertheless, the results give a strong indication that the proposed segmentation approach has potential as a first step toward more advanced systems for automatic detection of tuberculosis bacteria.
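The clustering step described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the image has already been converted to LAB by some imaging library and that a single channel has been flattened into a list of numbers. The function name `kmeans_1d`, the sample values, and the rule "the cluster with the larger center is the foreground" are all hypothetical.

```python
def kmeans_1d(values, k=2, iters=20):
    """Cluster scalar pixel values into k groups with plain Lloyd iterations."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centers evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Hypothetical values of one LAB channel for a tiny 2x4 image, flattened row by row.
channel = [12.0, 15.0, 11.0, 80.0, 14.0, 78.0, 82.0, 13.0]
labels, centers = kmeans_1d(channel, k=2)

# Assumption: the cluster with the larger center marks the candidate bacteria region.
foreground = centers.index(max(centers))
mask = [lab == foreground for lab in labels]
```

With these sample values the two clusters separate cleanly, and `mask` flags the three bright pixels. On real micrographs the number of clusters and the choice of channel would need tuning.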

PRSNet: A Masked Self-Supervised Learning Pedestrian Re-Identification Method

Author: Dong Zhicheng
Xiang Hao
Xiao Zhijie
Publication date: 11/03/2023

In recent years, self-supervised learning has attracted widespread academic attention and addressed many key problems in computer vision. The present research focus is on how to construct a good pretext task that improves the network's learning of high-level semantic information in images, so that model reasoning is accelerated during pre-training for the current task. Existing feature extraction networks are pre-trained on the ImageNet dataset and cannot extract the fine-grained information in pedestrian images well, and the existing pretext tasks of contrastive self-supervised learning may destroy the original properties of pedestrian images. To solve these problems, this paper designs a mask-reconstruction pretext task to obtain a robust pre-trained model and uses it for the pedestrian re-identification task. The network is optimized with an improved centroid-based triplet loss, and the masked image is added as an additional sample in the loss calculation, so that the trained network copes better with pedestrian matching in practical applications. The method achieves about 5% higher mAP and about 1% higher Rank-1 on the Market1501 and CUHK03 datasets than existing self-supervised pedestrian re-identification methods, and ablation experiments demonstrate its feasibility. Our model code is located at https://github.com/ZJieX/prsnet
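To make the idea of a centroid-based triplet loss concrete, here is a minimal sketch of one common reading: the anchor is pulled toward the centroid of its own identity and pushed away from the centroid of another identity. The abstract does not give the exact formulation, so the function names, the margin value, and the way a masked-image embedding is simply appended to the positives are assumptions for illustration only.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length embedding vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def centroid_triplet_loss(anchor, positives, negatives, margin=0.3):
    """Hinge loss on distances from the anchor to the two class centroids."""
    d_pos = math.dist(anchor, centroid(positives))
    d_neg = math.dist(anchor, centroid(negatives))
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D embeddings. The last positive stands in for a masked-image sample.
anchor = [0.0, 0.0]
positives = [[0.0, 1.0], [0.0, -1.0], [0.0, 0.0]]
far_negatives = [[3.0, 4.0], [3.0, 4.0]]
near_negatives = [[0.1, 0.0]]

easy_loss = centroid_triplet_loss(anchor, positives, far_negatives)
hard_loss = centroid_triplet_loss(anchor, positives, near_negatives)
```

When the other identity's centroid is far away, the loss is zero. When it lies within the margin, the loss is positive and would drive the embeddings apart during training.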

arXiv.org e-Print Archive

SimSwap: An Efficient Framework For High Fidelity Face Swapping

Author: Bao Jianmin
Brock Andrew
Deng Jiankang
Goodfellow Ian J.
Gulrajani Ishaan
He Kaiming
Huang Xun
Ioffe Sergey
Isola Phillip
Korshunova Iryna
Liu Ming-Yu
Liu Ziwei
Natsume Ryota
Nirkin Yuval
Park Taesung
Wang Ting-Chun
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/06/2021

We propose an efficient framework, called Simple Swap (SimSwap), aiming for generalized and high fidelity face swapping. In contrast to previous approaches that either lack the ability to generalize to arbitrary identities or fail to preserve attributes like facial expression and gaze direction, our framework is capable of transferring the identity of an arbitrary source face into an arbitrary target face while preserving the attributes of the target face. We overcome the above defects in the following two ways. First, we present the ID Injection Module (IIM), which transfers the identity information of the source face into the target face at the feature level. By using this module, we extend the architecture of an identity-specific face swapping algorithm to a framework for arbitrary face swapping. Second, we propose the Weak Feature Matching Loss, which efficiently helps our framework preserve the facial attributes in an implicit way. Extensive experiments on wild faces demonstrate that SimSwap is able to achieve competitive identity performance while preserving attributes better than previous state-of-the-art methods. The code is available on GitHub: https://github.com/neuralchen/SimSwap. Comment: Accepted by ACMMM 202

arXiv.org e-Print Archive

AnoDODE: Anomaly Detection with Diffusion ODE

Author: Hu Xianyao
Jin Congming
Publication date: 10/10/2023

Anomaly detection is the process of identifying atypical data samples that significantly deviate from the majority of the dataset. In the realm of clinical screening and diagnosis, detecting abnormalities in medical images holds great importance. Typically, clinical practice provides access to a vast collection of normal images, while abnormal images are relatively scarce. We hypothesize that abnormal images and their associated features tend to manifest in low-density regions of the data distribution. Following this assumption, we turn to diffusion ODEs for unsupervised anomaly detection, given their tractability and superior performance in density estimation tasks. More precisely, we propose a new anomaly detection method based on diffusion ODEs by estimating the density of features extracted from multi-scale medical images. Our anomaly scoring mechanism depends on computing the negative log-likelihood of features extracted from medical images at different scales, quantified in bits per dimension. Furthermore, we propose a reconstruction-based anomaly localization suitable for our method. Our proposed method not only identifies anomalies but also provides interpretability at both the image and pixel levels. Through experiments on the BraTS2021 medical dataset, our proposed method outperforms existing methods. These results confirm the effectiveness and robustness of our method. Comment: 11 pages, 5 figures
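The "bits per dimension" unit mentioned in this abstract is a standard normalization of a negative log-likelihood: divide the NLL in nats by the number of dimensions and by ln 2. The sketch below shows only that conversion. The function name and the sample numbers are hypothetical, and how the paper combines scores across scales is not stated in the abstract.

```python
import math


def bits_per_dim(nll_nats, num_dims):
    """Convert a negative log-likelihood in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))


# Hypothetical NLL values (in nats) for the same 8-dimensional feature vector
# under a density model: a lower likelihood gives a higher anomaly score.
typical_score = bits_per_dim(8 * math.log(2), num_dims=8)
atypical_score = bits_per_dim(24 * math.log(2), num_dims=8)
```

Normalizing by dimension makes scores comparable between feature maps of different sizes, which matters when features come from several image scales.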

arXiv.org e-Print Archive

Pattern Anomaly Detection based on Sequence-to-Sequence Regularity Learning

Author: Cheng Yuzhen
Li Min
Publication venue: Faculty of Mechanical Engineering in Slavonski Brod; Faculty of Electrical Engineering, Computer Science and Information Technology Osijek; Faculty of Civil Engineering in Osijek
Publication date: 01/01/2023

Anomaly detection in traffic surveillance videos is a challenging task due to the ambiguity of anomaly definition and the complexity of scenes. In this paper, we propose to detect anomalous trajectories for vehicle behavior analysis via learning regularities in data. First, we train a sequence-to-sequence model under the autoencoder architecture and propose a new reconstruction error function for model optimization and anomaly evaluation. As such, the model is forced to learn the regular trajectory patterns in an unsupervised manner. Then, at the inference stage, we use the learned model to encode the test trajectory sample into a compact representation and generate a new trajectory sequence in the learned regular pattern. An anomaly score is computed based on the deviation of the generated trajectory from the test sample. Finally, we identify the anomalous trajectories with an adaptive threshold. We evaluate the proposed method on two real-world traffic datasets, and the experiments show favorable results against state-of-the-art algorithms. This paper's research on sequence-to-sequence regularity learning can provide theoretical and practical support for pattern anomaly detection.
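The scoring and thresholding steps described in this abstract can be illustrated as follows. This is a sketch under stated assumptions: the mean point-wise Euclidean distance stands in for the paper's own reconstruction error function, which the abstract does not specify, and the "mean plus k standard deviations" rule is one hypothetical form of adaptive threshold. The trajectories and validation scores are made up.

```python
import math
import statistics


def anomaly_score(original, generated):
    """Mean Euclidean distance between matching points of two trajectories."""
    return sum(math.dist(p, q) for p, q in zip(original, generated)) / len(original)


def adaptive_threshold(normal_scores, k=2.0):
    """Threshold derived from the scores of trajectories assumed to be normal."""
    return statistics.mean(normal_scores) + k * statistics.pstdev(normal_scores)


# Hypothetical scores of regular trajectories from a validation set.
validation_scores = [0.10, 0.12, 0.09, 0.11, 0.10]
threshold = adaptive_threshold(validation_scores)

# A test trajectory and the sequence a trained model might generate for it.
test_traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
generated = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.9)]
score = anomaly_score(test_traj, generated)
is_anomalous = score > threshold
```

Because the model only learns regular patterns, an irregular test trajectory is regenerated poorly, and the resulting deviation pushes its score above the threshold.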

FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests

Author: Abdari Ali
Falcon Alex
Serra Giuseppe
Publication date: 06/09/2023

Nowadays, many people frequently have to search for new accommodation options. Searching for a suitable apartment is a time-consuming process, especially because visiting the apartments is often mandatory to assess the truthfulness of the advertisements found on the Web. While this process could be alleviated by visiting the apartments in the metaverse, the Web-based recommendation platforms are not suitable for the task. To address this shortcoming, in this paper, we define a new problem called text-to-apartment recommendation, which requires ranking the apartments based on their relevance to a textual query expressing the user's interests. To tackle this problem, we introduce FArMARe, a multi-task approach that supports cross-modal contrastive training with a furniture-aware objective. Since public datasets related to indoor scenes do not contain detailed descriptions of the furniture, we collect and annotate a dataset comprising more than 6000 apartments. Thorough experimentation with three different methods and two raw feature extraction procedures reveals the effectiveness of FArMARe in dealing with the problem at hand. Comment: accepted for presentation at the ICCV2023 CV4Metaverse workshop
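The ranking step of text-to-apartment recommendation can be sketched once a text query and each apartment are embedded in a shared vector space. The embeddings, apartment names, and the use of cosine similarity below are assumptions for illustration. The abstract does not describe the scoring function, and learning the embeddings is the part FArMARe addresses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_apartments(query_vec, apartment_vecs):
    """Return apartment names sorted from most to least similar to the query."""
    return sorted(
        apartment_vecs,
        key=lambda name: cosine(query_vec, apartment_vecs[name]),
        reverse=True,
    )


# Hypothetical 2-D embeddings of a text query and three apartments.
query = [1.0, 0.0]
apartments = {
    "apt_a": [0.9, 0.1],
    "apt_b": [0.0, 1.0],
    "apt_c": [0.5, 0.5],
}
ranking = rank_apartments(query, apartments)
```

Contrastive training aims to place matching text and apartment embeddings close together, so that a simple similarity ranking like this one returns relevant apartments first.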

arXiv.org e-Print Archive

Our Deep CNN Face Matchers Have Developed Achromatopsia

Author: Annan Joyce
Bhatta Aman
Bowyer Kevin W.
King Micheal C.
Mery Domingo
Wu Haiyu
Publication date: 10/09/2023

Modern deep CNN face matchers are trained on datasets containing color images. We show that such matchers achieve essentially the same accuracy on the grayscale or the color version of a set of test images. We then consider possible causes for deep CNN face matchers "not seeing color". Popular web-scraped face datasets actually have 30 to 60% of their identities with one or more grayscale images. We analyze whether this grayscale element in the training set impacts the accuracy achieved, and conclude that it does not. Further, we show that even with a 100% grayscale training set, comparable accuracy is achieved on color or grayscale test images. Then we show that the skin region of an individual's images in a web-scraped training set exhibits significant variation in its mapping to color space. This suggests that color, at least for web-scraped, in-the-wild face datasets, carries limited identity-related information for training state-of-the-art matchers. Finally, we verify that comparable accuracy is achieved from training using single-channel grayscale images, implying that a larger dataset can be used within the same memory limit, with a less computationally intensive early layer.
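The memory argument in the final sentence follows from simple counting: a single-channel image stores one value per pixel instead of three. The sketch below converts a tiny RGB image to grayscale with the widely used ITU-R BT.601 luma weights. The choice of weights and the helper names are assumptions, since the abstract does not say how the grayscale images were produced.

```python
def to_grayscale(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of single luma values (BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in rgb_image]


def value_count(image, channels):
    """Number of stored values for an image with the given channel count."""
    return len(image) * len(image[0]) * channels


# Hypothetical 1x2 RGB image: one white pixel and one black pixel.
rgb = [[(255, 255, 255), (0, 0, 0)]]
gray = to_grayscale(rgb)

color_values = value_count(rgb, channels=3)
gray_values = value_count(gray, channels=1)
```

Here the grayscale copy holds a third of the values of the color copy, so roughly three times as many images fit in the same memory budget, and the first convolution layer processes one input channel instead of three.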

arXiv.org e-Print Archive

Learning Global-Local Correspondence with Semantic Bottleneck for Logical Anomaly Detection

Author: Luo Donghao
Luo Wei
Qiang Zhenfeng
Yao Haiming
Yu Wenyong
Zhang Xiaotian
Publication date: 10/03/2023

This paper presents a novel framework, named Global-Local Correspondence Framework (GLCF), for visual anomaly detection with logical constraints. Visual anomaly detection has become an active research area in various real-world applications, such as industrial anomaly detection and medical disease diagnosis. However, most existing methods focus on identifying local structural degeneration anomalies and often fail to detect high-level functional anomalies that involve logical constraints. To address this issue, we propose a two-branch approach that consists of a local branch for detecting structural anomalies and a global branch for detecting logical anomalies. To facilitate local-global feature correspondence, we introduce a novel semantic bottleneck enabled by the visual Transformer. Moreover, we develop feature estimation networks for each branch separately to detect anomalies. Our proposed framework is validated using various benchmarks, including the industrial datasets MVTec AD and MVTec LOCO AD and the Retinal-OCT medical dataset. Experimental results show that our method outperforms existing methods, particularly in detecting logical anomalies. Comment: Submission to IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

arXiv.org e-Print Archive

Spatiotemporal Self-supervised Learning for Point Clouds in the Wild

Author: Ke Wei
Salzmann Mathieu
Süsstrunk Sabine
Wu Yanhao
Zhang Tong
Publication date: 28/03/2023

Self-supervised learning (SSL) has the potential to benefit many applications, particularly those where manually annotating data is cumbersome. One such situation is the semantic segmentation of point clouds. In this context, existing methods employ contrastive learning strategies and define positive pairs by performing various augmentations of point clusters in a single frame. As such, these methods do not exploit the temporal nature of LiDAR data. In this paper, we introduce an SSL strategy that leverages positive pairs in both the spatial and temporal domains. To this end, we design (i) a point-to-cluster learning strategy that aggregates spatial information to distinguish objects; and (ii) a cluster-to-cluster learning strategy based on unsupervised object tracking that exploits temporal correspondences. We demonstrate the benefits of our approach via extensive experiments performed by self-supervised training on two large-scale LiDAR datasets and transferring the resulting models to other point cloud segmentation benchmarks. Our results show that our method outperforms state-of-the-art point cloud SSL methods. Comment: CVPR accepted

arXiv.org e-Print Archive