Tuberculosis Detection via Image Segmentation Using the K-Means Algorithm
Tuberculosis has become a highly dangerous disease; its rapid and easy transmission has made it one of the most dangerous infectious diseases in the world today. Detection of Mycobacterium tuberculosis bacteria is therefore needed to speed up patient diagnosis, so that patients can be treated promptly and transmission can be stopped. In this study, an image segmentation approach that combines the LAB color model and the K-Means clustering algorithm is proposed to accurately separate the regions containing tuberculosis bacteria from the image background. First, the microscopic image is converted to the LAB color space to extract the color component most sensitive to intensity differences in images of Mycobacterium tuberculosis. Next, K-Means clustering groups the image pixels into several clusters based on their intensity differences. Experimental results show that this approach can isolate regions containing Mycobacterium tuberculosis in microscopic images with high accuracy and efficiency. Although high accuracy was obtained through visual inspection, it is important to note that validating this segmentation accuracy remains challenging due to the lack of an objective way to confirm the presence of tuberculosis bacteria in the resulting images. Nevertheless, these results strongly indicate that the proposed segmentation approach has potential as a first step toward a more sophisticated automatic tuberculosis detection system
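The pipeline described above (color-space conversion, then intensity clustering) can be sketched with a minimal 1-D K-Means on a single channel. This is an illustrative sketch on synthetic data, not the paper's implementation: a real pipeline would first convert the micrograph to LAB (e.g., with OpenCV or scikit-image) and cluster the chosen channel.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=20):
    """Minimal 1-D K-Means with deterministic init (centers spread over the range)."""
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Assign each value to its nearest center, then recompute the centers.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers

# Synthetic stand-in for one LAB channel: dark background, bright stained patch.
img = np.full((64, 64), 0.1)
img[20:30, 20:30] = 0.9                      # bacilli-like bright region
labels, centers = kmeans_1d(img.ravel(), k=2)
mask = labels.reshape(img.shape) == np.argmax(centers)  # brighter cluster = foreground
```

On this toy image the brighter cluster exactly recovers the 10x10 patch; on real micrographs the number of clusters and the channel choice would need tuning.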
Cross-resolution Face Recognition via Identity-Preserving Network and Knowledge Distillation
Cross-resolution face recognition has become a challenging problem for modern
deep face recognition systems. It aims at matching a low-resolution probe image
with high-resolution gallery images registered in a database. Existing methods
mainly leverage prior information from high-resolution images by either
reconstructing facial details with super-resolution techniques or learning a
unified feature space. To address this challenge, this paper proposes a new
approach that enforces the network to focus on the discriminative information
stored in the low-frequency components of a low-resolution image. A
cross-resolution knowledge distillation paradigm is first employed as the
learning framework. Then, an identity-preserving network, WaveResNet, and a
wavelet similarity loss are designed to capture low-frequency details and boost
performance. Finally, an image degradation model is conceived to simulate more
realistic low-resolution training data. Extensive experimental
results show that the proposed method consistently outperforms the baseline
model and other state-of-the-art methods across a variety of image resolutions.
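The low-frequency focus described above can be illustrated with a toy wavelet-similarity loss. This is a sketch under assumptions (single-channel feature maps, a Haar approximation band via 2x2 block averaging, cosine similarity), not the paper's WaveResNet or its exact loss:

```python
import numpy as np

def haar_lowpass(x):
    """Haar approximation band: mean of each non-overlapping 2x2 block."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def wavelet_similarity_loss(f_student, f_teacher):
    """1 - cosine similarity between the low-frequency bands of two feature maps."""
    a = haar_lowpass(f_student).ravel()
    b = haar_lowpass(f_teacher).ravel()
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

f = np.random.default_rng(0).normal(size=(8, 8))
```

Identical maps give a loss near 0 and opposite maps near 2, so minimizing it pulls the student's low-frequency content toward the teacher's.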
Feasibility of Universal Anomaly Detection without Knowing the Abnormality in Medical Images
Many anomaly detection approaches, especially deep learning methods, have
been recently developed to identify abnormal image morphology by only employing
normal images during training. Unfortunately, many prior anomaly detection
methods were optimized for a specific "known" abnormality (e.g., brain tumor,
bone fracture, cell types). Moreover, even though only normal images were
used in the training process, abnormal images were often employed during
the validation process (e.g., epoch selection, hyper-parameter tuning), which
might unintentionally leak the supposedly "unknown" abnormality. In this study,
we investigated these two essential aspects regarding universal anomaly
detection in medical images by (1) comparing various anomaly detection methods
across four medical datasets, (2) investigating the inevitable but often
neglected issues on how to unbiasedly select the optimal anomaly detection
model during the validation phase using only normal images, and (3) proposing a
simple decision-level ensemble method to leverage the advantage of different
kinds of anomaly detection without knowing the abnormality. The results of our
experiments indicate that none of the evaluated methods consistently achieved
the best performance across all datasets. Our proposed method enhanced the
robustness of performance in general (average AUC 0.956).
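A decision-level ensemble of the kind described above can be sketched as follows. The paper's exact combination rule is not given here; this assumes a simple rank-normalize-then-average scheme so that detectors with different score scales contribute equally:

```python
import numpy as np

def ensemble_scores(score_lists):
    """Decision-level ensemble: rank-normalize each detector's anomaly scores
    to [0, 1], then average, so no single detector's scale dominates."""
    out = np.zeros(len(score_lists[0]))
    for s in score_lists:
        s = np.asarray(s, dtype=float)
        ranks = s.argsort().argsort()        # rank of each sample, 0 .. n-1
        out += ranks / (len(s) - 1)          # normalize ranks to [0, 1]
    return out / len(score_lists)

# Two detectors on different scales that agree on the ordering.
ens = ensemble_scores([[0.1, 0.9, 0.5], [10, 90, 50]])
```

Because only ranks are used, a detector that outputs raw reconstruction errors and one that outputs calibrated probabilities combine without any per-method threshold tuning.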
Two-Step Active Learning for Instance Segmentation with Uncertainty and Diversity Sampling
Training high-quality instance segmentation models requires an abundance of
labeled images with instance masks and classifications, which is often
expensive to procure. Active learning addresses this challenge by striving for
optimum performance with minimal labeling cost by selecting the most
informative and representative images for labeling. Despite its potential,
active learning has been less explored in instance segmentation compared to
other tasks like image classification, which require less labeling. In this
study, we propose a post-hoc active learning algorithm that integrates
uncertainty-based sampling with diversity-based sampling. Our proposed
algorithm is not only simple and easy to implement, but it also delivers
superior performance on various datasets. Its practical application is
demonstrated on a real-world overhead imagery dataset, where it increases the
labeling efficiency fivefold.
Comment: UNCV ICCV 202
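The two-step selection described above can be sketched as follows. This is an illustrative sketch, not the paper's algorithm: it assumes per-image uncertainty scores and embeddings are already available, and uses greedy farthest-point sampling for the diversity step:

```python
import numpy as np

def two_step_select(uncertainty, embeddings, n_candidates, n_select):
    """Step 1: keep the n_candidates most uncertain images.
    Step 2: greedy farthest-point sampling on their embeddings for diversity."""
    cand = np.argsort(-uncertainty)[:n_candidates]   # most uncertain first
    emb = embeddings[cand]
    chosen = [0]                                     # seed with the most uncertain
    for _ in range(n_select - 1):
        # Distance of every candidate to its nearest already-chosen point.
        d = np.min(np.linalg.norm(emb[:, None] - emb[chosen][None], axis=-1), axis=1)
        chosen.append(int(np.argmax(d)))             # pick the farthest candidate
    return cand[chosen]

# Six images: indices 0-3 are uncertain, but 0, 1, 3 are near-duplicates.
unc = np.array([0.9, 0.8, 0.7, 0.6, 0.1, 0.05])
emb = np.array([[0., 0.], [0., 0.1], [10., 0.], [0., 0.2], [5., 5.], [6., 6.]])
picked = two_step_select(unc, emb, n_candidates=4, n_select=2)
```

With a labeling budget of two, the sketch picks image 0 and the distant image 2 rather than two near-duplicate uncertain images, which is the intended effect of combining the two criteria.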
PRSNet: A Masked Self-Supervised Learning Pedestrian Re-Identification Method
In recent years, self-supervised learning has attracted widespread academic
attention and addressed many key issues in computer vision. Current research
focuses on how to construct a good pretext task that lets the network learn
higher-level semantic information from images, so that inference on the
downstream task is accelerated by pre-training. Existing feature extraction
networks are pre-trained on the ImageNet dataset and cannot extract the
fine-grained information in pedestrian images well, and existing contrastive
self-supervised pretext tasks may destroy the original properties of
pedestrian images. To address these problems, this paper designs a
mask-reconstruction pretext task to obtain a robust pre-trained model and
applies it to the pedestrian re-identification task. The network is optimized
with a centroid-based improvement of the triplet loss, and the masked image is
added as an additional sample in the loss computation, so that after training
the network can better handle pedestrian matching in practical applications.
This method achieves about 5% higher mAP on Market1501 and CUHK03 than
existing self-supervised pedestrian re-identification methods, and about 1%
higher Rank-1; ablation experiments demonstrate the feasibility of the method.
Our model code is located at https://github.com/ZJieX/prsnet
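The centroid-based triplet loss with the masked view as an extra positive might look like the following minimal sketch. The paper's exact formulation is not reproduced here; the margin value and the use of simple means as centroids are assumptions:

```python
import numpy as np

def centroid_triplet_loss(anchor, positives, neg_centroid, margin=0.3):
    """Centroid triplet: pull the anchor toward the mean of its positives
    (including the mask-reconstructed view as an extra sample) and push it
    away from a negative identity's centroid."""
    pos_c = np.mean(positives, axis=0)               # centroid of positive views
    d_pos = np.linalg.norm(anchor - pos_c)
    d_neg = np.linalg.norm(anchor - neg_centroid)
    return max(0.0, d_pos - d_neg + margin)

anchor = np.array([0., 0.])
pos = np.array([[0., 0.], [0., 0.2]])               # original view + masked view
loss_far = centroid_triplet_loss(anchor, pos, np.array([5., 0.]))   # easy negative
loss_near = centroid_triplet_loss(anchor, pos, np.array([0., 0.2])) # hard negative
```

An easy negative far from the anchor yields zero loss, while a hard negative inside the margin produces a positive penalty that drives the embeddings apart.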
Long Story Short: a Summarize-then-Search Method for Long Video Question Answering
Large language models such as GPT-3 have demonstrated an impressive
capability to adapt to new tasks without requiring task-specific training data.
This capability has been particularly effective in settings such as narrative
question answering, where the diversity of tasks is immense, but the available
supervision data is small. In this work, we investigate if such language models
can extend their zero-shot reasoning abilities to long multimodal narratives in
multimedia content such as drama, movies, and animation, where the story plays
an essential role. We propose Long Story Short, a framework for narrative video
QA that first summarizes the narrative of the video to a short plot and then
searches parts of the video relevant to the question. We also propose to
enhance visual matching with CLIPCheck. Our model outperforms state-of-the-art
supervised models by a large margin, highlighting the potential of zero-shot QA
for long videos.
Comment: Published in BMVC 202
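The "search" step of a summarize-then-search pipeline can be illustrated with a toy retriever. The paper uses a language model for this step; the word-overlap scoring below is only a hypothetical stand-in to show the control flow:

```python
def search_relevant_parts(plot_segments, question, top_k=1):
    """Toy stand-in for the 'search' step: rank summarized plot segments by
    word overlap with the question and return the top_k matches."""
    q = set(question.lower().split())
    scored = sorted(plot_segments,
                    key=lambda s: len(q & set(s.lower().split())),
                    reverse=True)
    return scored[:top_k]

segments = ["Alice meets Bob at the station", "A storm destroys the village"]
best = search_relevant_parts(segments, "Who does Alice meet at the station?")
```

Only the retrieved segment (and its video span) would then be passed to the QA model, which is what keeps the approach tractable for long videos.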
SimSwap: An Efficient Framework For High Fidelity Face Swapping
We propose an efficient framework, called Simple Swap (SimSwap), aiming for
generalized and high fidelity face swapping. In contrast to previous approaches
that either lack the ability to generalize to arbitrary identity or fail to
preserve attributes like facial expression and gaze direction, our framework is
capable of transferring the identity of an arbitrary source face into an
arbitrary target face while preserving the attributes of the target face. We
overcome the above defects in the following two ways. First, we present the ID
Injection Module (IIM) which transfers the identity information of the source
face into the target face at feature level. By using this module, we extend the
architecture of an identity-specific face swapping algorithm to a framework for
arbitrary face swapping. Second, we propose the Weak Feature Matching Loss
which efficiently helps our framework to preserve the facial attributes in an
implicit way. Extensive experiments on wild faces demonstrate that our SimSwap
is able to achieve competitive identity performance while preserving attributes
better than previous state-of-the-art methods. The code is already available on
github: https://github.com/neuralchen/SimSwap
Comment: Accepted by ACMMM 202
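The Weak Feature Matching Loss described above can be sketched as follows. This assumes the common formulation of matching only the last few (high-level) discriminator feature maps with an L1 distance; the exact layers and weighting used by SimSwap are not specified here:

```python
import numpy as np

def weak_feature_matching_loss(feats_result, feats_target, last_k=3):
    """Match only the last few (high-level) discriminator feature maps of the
    swapped face against the target face, leaving low-level texture free to
    carry the new identity."""
    total = 0.0
    for fr, ft in zip(feats_result[-last_k:], feats_target[-last_k:]):
        total += np.mean(np.abs(fr - ft))    # L1 distance per feature map
    return total / last_k

# Four feature maps per image, as a stand-in for discriminator activations.
fa = [np.zeros((2, 2)) for _ in range(4)]
fb = [np.ones((2, 2)) for _ in range(4)]
```

Because the earliest maps are excluded, the constraint is "weak": it preserves attributes like expression and gaze encoded in high-level features without forcing pixel-level similarity to the target.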
CCFace: Classification Consistency for Low-Resolution Face Recognition
In recent years, deep face recognition methods have demonstrated impressive
results on in-the-wild datasets. However, these methods have shown a
significant decline in performance when applied to real-world low-resolution
benchmarks like TinyFace or SCFace. To address this challenge, we propose a
novel classification consistency knowledge distillation approach that transfers
the learned classifier from a high-resolution model to a low-resolution
network. This approach helps in finding discriminative representations for
low-resolution instances. To further improve the performance, we designed a
knowledge distillation loss using the adaptive angular penalty inspired by the
success of the popular angular margin loss function. The adaptive penalty
reduces overfitting on low-resolution samples and alleviates the convergence
issue of the model integrated with data augmentation. Additionally, we utilize
an asymmetric cross-resolution learning approach based on the state-of-the-art
semi-supervised representation learning paradigm to improve discriminability on
low-resolution instances and prevent them from forming a cluster. Our proposed
method outperforms state-of-the-art approaches on low-resolution benchmarks,
with a three percent improvement on TinyFace while maintaining performance on
high-resolution benchmarks.
Comment: 2023 IEEE International Joint Conference on Biometrics (IJCB
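The adaptive angular penalty can be sketched in the ArcFace style, with the margin scaled by a per-sample quality factor so low-resolution samples are penalized less. This is a sketch under assumptions (the quality scaling and base margin are illustrative), not CCFace's exact formula:

```python
import numpy as np

def adaptive_angular_logit(cos_theta, quality, base_margin=0.5):
    """ArcFace-style additive angular margin on the target-class logit,
    scaled down for low-quality (low-resolution) samples to reduce
    overfitting and ease convergence with heavy augmentation."""
    m = base_margin * quality                # quality in [0, 1]; 0 = no margin
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    return np.cos(theta + m)
```

A high-quality sample gets the full margin (a harder target), while a heavily downsampled one with quality near 0 is trained with almost no extra penalty.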