44 research outputs found

    TUBERCULOSIS DETECTION THROUGH IMAGE SEGMENTATION USING THE K-MEANS ALGORITHM

    Tuberculosis is a highly dangerous disease; its fast and easy transmission makes it the most dangerous infectious disease in the world today. Detecting Mycobacterium tuberculosis bacteria is therefore needed to speed up patient diagnosis, so that patients can be treated promptly and transmission can be stopped. In this study, an image segmentation approach that combines the LAB color model with the K-Means clustering algorithm is proposed to accurately separate the image regions containing tuberculosis bacteria from the background. First, the microscopic image is converted to the LAB color space in order to extract the color component that is most sensitive to intensity differences in images of Mycobacterium tuberculosis. Next, by applying the K-Means clustering algorithm, the image pixels are grouped into several clusters according to their intensity differences. Experimental results show that this approach is able to isolate regions containing Mycobacterium tuberculosis in microscopic images with high accuracy and efficiency. Although the high accuracy was established by visual inspection, it is important to note that validating the segmentation accuracy remains a challenge because there is no objective way to confirm the presence of tuberculosis bacteria in the resulting images. Nevertheless, the results give a strong indication that the proposed segmentation approach has potential as a first step toward developing more advanced automatic tuberculosis bacteria detection systems.
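
The two steps described in the abstract above (conversion to the LAB color space, then K-Means clustering of pixels) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `srgb_to_lab`, `kmeans`, and `segment_pixels` are hypothetical, clustering on the a and b chroma channels is an assumption (the abstract does not say which component is used), and the farthest-point initialization is a choice made here so the result is deterministic.

```python
def srgb_to_lab(rgb):
    """Convert one 8-bit sRGB pixel to CIE LAB (D65 white point)."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    # Linear sRGB -> XYZ, each axis divided by the D65 reference white.
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm with farthest-point initialization (assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


def segment_pixels(pixels, k=2):
    """Cluster RGB pixels on their LAB (a, b) channels; returns one label per pixel."""
    features = [srgb_to_lab(p)[1:] for p in pixels]  # assumption: drop L, keep (a, b)
    labels, _ = kmeans(features, k)
    return labels


if __name__ == "__main__":
    # Toy data: six bluish "background" pixels and two reddish "stained" pixels.
    pixels = [(40, 60, 200)] * 6 + [(200, 30, 120)] * 2
    print(segment_pixels(pixels, k=2))
```

On a real micrograph, the pixel list would come from an image loader, and a library implementation of the color conversion and clustering would normally replace the hand-written versions shown here.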

    Cross-resolution Face Recognition via Identity-Preserving Network and Knowledge Distillation

    Cross-resolution face recognition has become a challenging problem for modern deep face recognition systems. It aims at matching a low-resolution probe image with high-resolution gallery images registered in a database. Existing methods mainly leverage prior information from high-resolution images by either reconstructing facial details with super-resolution techniques or learning a unified feature space. To address this challenge, this paper proposes a new approach that forces the network to focus on the discriminative information stored in the low-frequency components of a low-resolution image. A cross-resolution knowledge distillation paradigm is first employed as the learning framework. Then, an identity-preserving network, WaveResNet, and a wavelet similarity loss are designed to capture low-frequency details and boost performance. Finally, an image degradation model is conceived to simulate more realistic low-resolution training data. Extensive experimental results show that the proposed method consistently outperforms the baseline model and other state-of-the-art methods across a variety of image resolutions.

    Feasibility of Universal Anomaly Detection without Knowing the Abnormality in Medical Images

    Many anomaly detection approaches, especially deep learning methods, have been recently developed to identify abnormal image morphology by only employing normal images during training. Unfortunately, many prior anomaly detection methods were optimized for a specific "known" abnormality (e.g., brain tumor, bone fracture, cell types). Moreover, even though only the normal images were used in the training process, the abnormal images were often employed during the validation process (e.g., epoch selection, hyper-parameter tuning), which might unintentionally leak the supposedly "unknown" abnormality. In this study, we investigated these two essential aspects of universal anomaly detection in medical images by (1) comparing various anomaly detection methods across four medical datasets, (2) investigating the unavoidable but often neglected issue of how to select the optimal anomaly detection model without bias during the validation phase using only normal images, and (3) proposing a simple decision-level ensemble method to leverage the advantages of different kinds of anomaly detection without knowing the abnormality. The results of our experiments indicate that none of the evaluated methods consistently achieved the best performance across all datasets. Our proposed method enhanced the robustness of performance in general (average AUC 0.956).
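
One way a decision-level ensemble that uses only normal images could look is sketched below. The abstract does not specify the fusion rule, so everything here is an assumption: each detector's anomaly score is standardized with the mean and standard deviation of its scores on normal-only validation images, and the standardized scores are averaged. The names `fit_normalizer` and `ensemble_score` are hypothetical.

```python
from statistics import mean, pstdev


def fit_normalizer(normal_scores):
    """Mean and standard deviation of one detector's scores on normal images."""
    mu = mean(normal_scores)
    sigma = pstdev(normal_scores) or 1.0  # guard against a constant detector
    return mu, sigma


def ensemble_score(sample_scores, normalizers):
    """Average of per-detector z-scores; higher means more anomalous."""
    z_scores = [
        (score - mu) / sigma
        for score, (mu, sigma) in zip(sample_scores, normalizers)
    ]
    return mean(z_scores)


if __name__ == "__main__":
    # Scores of two hypothetical detectors on three normal validation images.
    normal_validation = [[0.1, 0.2, 0.3], [10.0, 12.0, 14.0]]
    normalizers = [fit_normalizer(scores) for scores in normal_validation]

    typical = ensemble_score([0.2, 12.0], normalizers)
    unusual = ensemble_score([0.9, 20.0], normalizers)
    print(typical, unusual)
```

Because the normalizers are fitted on normal images alone, no abnormal sample is needed to put detectors with different score ranges on a common scale.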

    Two-Step Active Learning for Instance Segmentation with Uncertainty and Diversity Sampling

    Training high-quality instance segmentation models requires an abundance of labeled images with instance masks and classifications, which is often expensive to procure. Active learning addresses this challenge by striving for optimum performance with minimal labeling cost by selecting the most informative and representative images for labeling. Despite its potential, active learning has been less explored in instance segmentation compared to other tasks like image classification, which require less labeling. In this study, we propose a post-hoc active learning algorithm that integrates uncertainty-based sampling with diversity-based sampling. Our proposed algorithm is not only simple and easy to implement, but it also delivers superior performance on various datasets. Its practical application is demonstrated on a real-world overhead imagery dataset, where it increases the labeling efficiency fivefold. Comment: UNCV ICCV 202
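
A two-step selection that combines uncertainty and diversity sampling might be sketched as below. The abstract gives no algorithmic detail, so the ordering (shortlist by uncertainty first, then pick a diverse subset) and the greedy farthest-point rule are assumptions, as are the names `two_step_select`, `pool_size`, and `budget`. Uncertainty scores and image embeddings are taken as given inputs.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def two_step_select(uncertainty, embeddings, pool_size, budget):
    """Return indices of images to label.

    Step 1 keeps the `pool_size` most uncertain images.
    Step 2 greedily picks `budget` of them that are far apart in embedding space.
    """
    ranked = sorted(
        range(len(uncertainty)), key=lambda i: uncertainty[i], reverse=True
    )
    shortlist = ranked[:pool_size]
    if not shortlist or budget <= 0:
        return []

    chosen = [shortlist[0]]  # start from the most uncertain image
    while len(chosen) < min(budget, len(shortlist)):
        remaining = [i for i in shortlist if i not in chosen]
        # Pick the candidate whose nearest already-chosen image is farthest away.
        chosen.append(
            max(
                remaining,
                key=lambda i: min(
                    sq_dist(embeddings[i], embeddings[j]) for j in chosen
                ),
            )
        )
    return chosen


if __name__ == "__main__":
    uncertainty = [0.90, 0.85, 0.80, 0.10]
    embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (9.0, 9.0)]
    print(two_step_select(uncertainty, embeddings, pool_size=3, budget=2))
```

In the toy data, images 0 and 1 are both highly uncertain but nearly identical in embedding space, so the diversity step skips image 1 in favor of the more distant image 2.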

    PRSNet: A Masked Self-Supervised Learning Pedestrian Re-Identification Method

    In recent years, self-supervised learning has attracted widespread academic attention and addressed many of the key issues of computer vision. Current research focuses on how to construct a good pretext task that improves the network's learning of high-level semantic information in images, so that model reasoning is accelerated during pre-training of the current task. Existing feature extraction networks are pre-trained on the ImageNet dataset and cannot extract the fine-grained information in pedestrian images well, and the existing pretext tasks of contrastive self-supervised learning may destroy the original properties of pedestrian images. To solve these problems, this paper designs a mask reconstruction pretext task to obtain a pre-trained model with strong robustness and uses it for the pedestrian re-identification task. The network is optimized with an improved centroid-based triplet loss, and the masked image is added as an additional sample to the loss calculation, so that the trained network can better cope with pedestrian matching in practical applications. This method achieves about 5% higher mAP on the Market-1501 and CUHK03 datasets than existing self-supervised pedestrian re-identification methods, and about 1% higher Rank-1; ablation experiments are conducted to demonstrate the feasibility of the method. Our model code is located at https://github.com/ZJieX/prsnet

    Long Story Short: a Summarize-then-Search Method for Long Video Question Answering

    Large language models such as GPT-3 have demonstrated an impressive capability to adapt to new tasks without requiring task-specific training data. This capability has been particularly effective in settings such as narrative question answering, where the diversity of tasks is immense, but the available supervision data is small. In this work, we investigate if such language models can extend their zero-shot reasoning abilities to long multimodal narratives in multimedia content such as drama, movies, and animation, where the story plays an essential role. We propose Long Story Short, a framework for narrative video QA that first summarizes the narrative of the video to a short plot and then searches parts of the video relevant to the question. We also propose to enhance visual matching with CLIPCheck. Our model outperforms state-of-the-art supervised models by a large margin, highlighting the potential of zero-shot QA for long videos. Comment: Published in BMVC 202

    SimSwap: An Efficient Framework For High Fidelity Face Swapping

    We propose an efficient framework, called Simple Swap (SimSwap), aiming for generalized and high fidelity face swapping. In contrast to previous approaches that either lack the ability to generalize to arbitrary identity or fail to preserve attributes like facial expression and gaze direction, our framework is capable of transferring the identity of an arbitrary source face into an arbitrary target face while preserving the attributes of the target face. We overcome the above defects in the following two ways. First, we present the ID Injection Module (IIM), which transfers the identity information of the source face into the target face at feature level. By using this module, we extend the architecture of an identity-specific face swapping algorithm to a framework for arbitrary face swapping. Second, we propose the Weak Feature Matching Loss, which efficiently helps our framework to preserve the facial attributes in an implicit way. Extensive experiments on wild faces demonstrate that our SimSwap is able to achieve competitive identity performance while preserving attributes better than previous state-of-the-art methods. The code is already available on GitHub: https://github.com/neuralchen/SimSwap. Comment: Accepted by ACMMM 202

    CCFace: Classification Consistency for Low-Resolution Face Recognition

    In recent years, deep face recognition methods have demonstrated impressive results on in-the-wild datasets. However, these methods have shown a significant decline in performance when applied to real-world low-resolution benchmarks like TinyFace or SCFace. To address this challenge, we propose a novel classification consistency knowledge distillation approach that transfers the learned classifier from a high-resolution model to a low-resolution network. This approach helps in finding discriminative representations for low-resolution instances. To further improve the performance, we designed a knowledge distillation loss using the adaptive angular penalty inspired by the success of the popular angular margin loss function. The adaptive penalty reduces overfitting on low-resolution samples and alleviates the convergence issue of the model integrated with data augmentation. Additionally, we utilize an asymmetric cross-resolution learning approach based on the state-of-the-art semi-supervised representation learning paradigm to improve discriminability on low-resolution instances and prevent them from forming a cluster. Our proposed method outperforms state-of-the-art approaches on low-resolution benchmarks, with a three percent improvement on TinyFace while maintaining performance on high-resolution benchmarks. Comment: 2023 IEEE International Joint Conference on Biometrics (IJCB)