26 research outputs found
TUBERCULOSIS DETECTION THROUGH IMAGE SEGMENTATION USING THE K-MEANS ALGORITHM
Tuberculosis has become a highly dangerous disease; its fast and easy transmission makes it the most dangerous infectious disease in the world today. Detecting Mycobacterium tuberculosis bacteria is therefore needed to speed up patient diagnosis, so that patients can be treated promptly and transmission can be halted. In this study, an image segmentation approach that combines the LAB color model with the K-Means clustering algorithm is proposed to accurately separate the regions of an image containing tuberculosis bacteria from the background. First, the microscopic image is converted to the LAB color space in order to extract the color component most sensitive to intensity differences in images of Mycobacterium tuberculosis. Next, by applying K-Means clustering, the image pixels are grouped into several clusters according to their intensity differences. Experimental results show that this approach can isolate the regions containing Mycobacterium tuberculosis in microscopic images with high accuracy and efficiency. Although the high accuracy was established by visual inspection, it is important to note that validating the segmentation accuracy remains a challenge because there is no objective way to confirm the presence of tuberculosis bacteria in the output images. Nevertheless, the results give a strong indication that the proposed segmentation approach has potential as a first step toward developing more advanced automatic tuberculosis bacteria detection systems.
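The clustering step described in this abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: it assumes the pixels have already been converted to LAB triples, that two clusters (stained regions and background) are wanted, and that an evenly spaced initialization is acceptable. The function name `kmeans` and the sample pixel values are hypothetical.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means over a list of equal-length numeric tuples."""
    # Assumption: evenly spaced initial centers, chosen only for determinism.
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(pt, centers[c])))
            for pt in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, new_labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centers


# Hypothetical LAB pixels: three background-like and three stained-like values.
pixels = [(80, 0, 0), (82, 1, -1), (79, -1, 1),
          (40, 60, -30), (42, 58, -29), (41, 61, -31)]
labels, centers = kmeans(pixels, k=2)
```

In practice the cluster whose center lies closest to the stain color would be kept as the candidate bacteria mask; how that cluster is chosen is not stated in the abstract.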
PRSNet: A Masked Self-Supervised Learning Pedestrian Re-Identification Method
In recent years, self-supervised learning has attracted widespread academic
attention and addressed many key issues in computer vision. Current research
focuses on how to construct a good pretext task that lets the network learn
high-level semantic information from images, so that model reasoning is
accelerated during pre-training of the current task. Existing feature
extraction networks are pre-trained on the ImageNet dataset and cannot extract
the fine-grained information in pedestrian images well, and the existing
pretext tasks of contrastive self-supervised learning may destroy the original
properties of pedestrian images. To solve these problems, this paper designs a
mask-reconstruction pretext task to obtain a robust pre-trained model and uses
it for the pedestrian re-identification task. The network is optimized with an
improved centroid-based triplet loss, and the masked image is added as an
additional sample to the loss calculation, so that the trained network can
better cope with pedestrian matching in practical applications. This method
achieves about 5% higher mAP and about 1% higher Rank1 on Market1501 and
CUHK03 than existing self-supervised pedestrian re-identification methods, and
ablation experiments demonstrate the feasibility of the method. Our model code
is located at https://github.com/ZJieX/prsnet
SimSwap: An Efficient Framework For High Fidelity Face Swapping
We propose an efficient framework, called Simple Swap (SimSwap), aiming for
generalized and high fidelity face swapping. In contrast to previous approaches
that either lack the ability to generalize to arbitrary identity or fail to
preserve attributes like facial expression and gaze direction, our framework is
capable of transferring the identity of an arbitrary source face into an
arbitrary target face while preserving the attributes of the target face. We
overcome the above defects in the following two ways. First, we present the ID
Injection Module (IIM) which transfers the identity information of the source
face into the target face at feature level. By using this module, we extend the
architecture of an identity-specific face swapping algorithm to a framework for
arbitrary face swapping. Second, we propose the Weak Feature Matching Loss
which efficiently helps our framework to preserve the facial attributes in an
implicit way. Extensive experiments on wild faces demonstrate that our SimSwap
is able to achieve competitive identity performance while preserving attributes
better than previous state-of-the-art methods. The code is already available on
GitHub: https://github.com/neuralchen/SimSwap.
Comment: Accepted by ACMMM 202
AnoDODE: Anomaly Detection with Diffusion ODE
Anomaly detection is the process of identifying atypical data samples that
significantly deviate from the majority of the dataset. In the realm of
clinical screening and diagnosis, detecting abnormalities in medical images
holds great importance. Typically, clinical practice provides access to a vast
collection of normal images, while abnormal images are relatively scarce. We
hypothesize that abnormal images and their associated features tend to manifest
in low-density regions of the data distribution. Following this assumption, we
turn to diffusion ODEs for unsupervised anomaly detection, given their
tractability and superior performance in density estimation tasks. More
precisely, we propose a new anomaly detection method based on diffusion ODEs by
estimating the density of features extracted from multi-scale medical images.
Our anomaly scoring mechanism depends on computing the negative log-likelihood
of features extracted from medical images at different scales, quantified in
bits per dimension. Furthermore, we propose a reconstruction-based anomaly
localization suitable for our method. Our proposed method not only identifies
anomalies but also provides interpretability at both the image and pixel
levels. Through experiments on the BraTS2021 medical dataset, our proposed
method outperforms existing methods. These results confirm the effectiveness
and robustness of our method.
Comment: 11 pages, 5 figures
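The scoring unit mentioned in this abstract, bits per dimension, is the negative log-likelihood divided by the number of dimensions and converted from nats to bits. The sketch below shows that conversion; averaging across scales is an assumption made here for illustration (the abstract does not say how scales are combined), and both function names are hypothetical.

```python
import math


def bits_per_dim(nll_nats, num_dims):
    """Convert a negative log-likelihood in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))


def multiscale_score(nll_per_scale, dims_per_scale):
    """Average bits per dimension over scales (the averaging is an assumption)."""
    bpds = [bits_per_dim(n, d) for n, d in zip(nll_per_scale, dims_per_scale)]
    return sum(bpds) / len(bpds)
```

A higher score means the features fall in a lower-density region of the learned distribution, which is the abstract's working definition of an anomaly.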
Pattern Anomaly Detection based on Sequence-to-Sequence Regularity Learning
Anomaly detection in traffic surveillance videos is a challenging task due to the ambiguity of anomaly definition and the complexity of scenes. In this paper, we propose to detect anomalous trajectories for vehicle behavior analysis via learning regularities in data. First, we train a sequence-to-sequence model under the autoencoder architecture and propose a new reconstruction error function for model optimization and anomaly evaluation. As such, the model is forced to learn the regular trajectory patterns in an unsupervised manner. Then, at the inference stage, we use the learned model to encode the test trajectory sample into a compact representation and generate a new trajectory sequence in the learned regular pattern. An anomaly score is computed based on the deviation of the generated trajectory from the test sample. Finally, we identify the anomalous trajectories with an adaptive threshold. We evaluate the proposed method on two real-world traffic datasets and the experiments show favorable results against state-of-the-art algorithms. This paper's research on sequence-to-sequence regularity learning can provide theoretical and practical support for pattern anomaly detection
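The scoring and thresholding stages of this abstract can be illustrated with a small sketch. It is not the paper's reconstruction error function or its threshold rule, neither of which is given here: the score is assumed to be the mean point-wise Euclidean deviation, and the adaptive threshold is assumed to be the mean plus `k` standard deviations of scores from reference trajectories. All function names are hypothetical.

```python
import math
import statistics


def trajectory_anomaly_score(test_traj, generated_traj):
    """Mean point-wise Euclidean deviation between two equal-length trajectories."""
    dists = [math.dist(p, q) for p, q in zip(test_traj, generated_traj)]
    return sum(dists) / len(dists)


def adaptive_threshold(reference_scores, k=3.0):
    """Assumed rule: mean plus k population standard deviations."""
    return statistics.mean(reference_scores) + k * statistics.pstdev(reference_scores)


def is_anomalous(score, reference_scores, k=3.0):
    """Flag a trajectory whose score exceeds the adaptive threshold."""
    return score > adaptive_threshold(reference_scores, k)
```

Because the threshold is derived from the observed score distribution rather than fixed in advance, it shifts with the scene, which is the sense in which it is adaptive in this sketch.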
FArMARe: a Furniture-Aware Multi-task methodology for Recommending Apartments based on the user interests
Nowadays, many people frequently have to search for new accommodation
options. Searching for a suitable apartment is a time-consuming process,
especially because visiting them is often mandatory to assess the truthfulness
of the advertisements found on the Web. While this process could be alleviated
by visiting the apartments in the metaverse, the Web-based recommendation
platforms are not suitable for the task. To address this shortcoming, in this
paper, we define a new problem called text-to-apartment recommendation, which
requires ranking the apartments based on their relevance to a textual query
expressing the user's interests. To tackle this problem, we introduce FArMARe,
a multi-task approach that supports cross-modal contrastive training with a
furniture-aware objective. Since public datasets related to indoor scenes do
not contain detailed descriptions of the furniture, we collect and annotate a
dataset comprising more than 6000 apartments. Thorough experimentation with
three different methods and two raw feature extraction procedures reveals the
effectiveness of FArMARe in dealing with the problem at hand.
Comment: accepted for presentation at the ICCV2023 CV4Metaverse workshop
Our Deep CNN Face Matchers Have Developed Achromatopsia
Modern deep CNN face matchers are trained on datasets containing color
images. We show that such matchers achieve essentially the same accuracy on the
grayscale or the color version of a set of test images. We then consider
possible causes for deep CNN face matchers "not seeing color". Popular
web-scraped face datasets actually have 30 to 60% of their identities with one
or more grayscale images. We analyze whether this grayscale element in the
training set impacts the accuracy achieved, and conclude that it does not.
Further, we show that even with a 100% grayscale training set, comparable
accuracy is achieved on color or grayscale test images. Then we show that the
skin region of an individual's images in a web-scraped training set exhibits
significant variation in their mapping to color space. This suggests that
color, at least for web-scraped, in-the-wild face datasets, carries limited
identity-related information for training state-of-the-art matchers. Finally,
we verify that comparable accuracy is achieved from training using
single-channel grayscale images, implying that a larger dataset can be used
within the same memory limit, with a less computationally intensive early
layer
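The memory argument at the end of this abstract is simple arithmetic: a single-channel image stores one value per pixel instead of three. The sketch below illustrates it under stated assumptions: the abstract does not say how grayscale images are produced, so the common ITU-R BT.601 luma weights are used here, and the image size and memory budget are hypothetical.

```python
def rgb_to_gray(pixel):
    """Collapse an (R, G, B) pixel to one value using BT.601 luma weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def images_in_budget(budget_bytes, height, width, channels, bytes_per_value=1):
    """Number of uncompressed images of the given shape that fit in a budget."""
    return budget_bytes // (height * width * channels * bytes_per_value)


# Hypothetical budget sized to hold exactly 100 color images of 112x112 pixels.
budget = 112 * 112 * 3 * 100
color_count = images_in_budget(budget, 112, 112, channels=3)
gray_count = images_in_budget(budget, 112, 112, channels=1)
```

With these numbers the same budget holds three times as many grayscale images as color ones.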
Learning Global-Local Correspondence with Semantic Bottleneck for Logical Anomaly Detection
This paper presents a novel framework, named Global-Local Correspondence
Framework (GLCF), for visual anomaly detection with logical constraints. Visual
anomaly detection has become an active research area in various real-world
applications, such as industrial anomaly detection and medical disease
diagnosis. However, most existing methods focus on identifying local structural
degeneration anomalies and often fail to detect high-level functional anomalies
that involve logical constraints. To address this issue, we propose a
two-branch approach that consists of a local branch for detecting structural
anomalies and a global branch for detecting logical anomalies. To facilitate
local-global feature correspondence, we introduce a novel semantic bottleneck
enabled by the visual Transformer. Moreover, we develop feature estimation
networks for each branch separately to detect anomalies. Our proposed framework
is validated using various benchmarks, including industrial datasets, MVTec AD,
MVTec LOCO AD, and the Retinal-OCT medical dataset. Experimental results show
that our method outperforms existing methods, particularly in detecting logical
anomalies.
Comment: Submission to IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO
TECHNOLOGY
Spatiotemporal Self-supervised Learning for Point Clouds in the Wild
Self-supervised learning (SSL) has the potential to benefit many
applications, particularly those where manually annotating data is cumbersome.
One such situation is the semantic segmentation of point clouds. In this
context, existing methods employ contrastive learning strategies and define
positive pairs by performing various augmentations of point clusters in a single
frame. As such, these methods do not exploit the temporal nature of LiDAR data.
In this paper, we introduce an SSL strategy that leverages positive pairs in
both the spatial and temporal domain. To this end, we design (i) a
point-to-cluster learning strategy that aggregates spatial information to
distinguish objects; and (ii) a cluster-to-cluster learning strategy based on
unsupervised object tracking that exploits temporal correspondences. We
demonstrate the benefits of our approach via extensive experiments performed by
self-supervised training on two large-scale LiDAR datasets and transferring the
resulting models to other point cloud segmentation benchmarks. Our results
show that our method outperforms the state-of-the-art point cloud SSL
methods.
Comment: CVPR accepted