66 research outputs found

Few-shot adaptation for morphology-independent cell instance segmentation

Authors: Brume Voke; Doretto Gianfranco; Zaveri Ram J.
Publication date: 26/02/2024

Microscopy data collections are becoming larger and more frequent. Accurate and precise quantitative analysis tools like cell instance segmentation are necessary to benefit from them. This is challenging due to the variability in the data, which requires retraining the segmentation model to maintain high accuracy on new collections. This is needed especially for segmenting cells with elongated and non-convex morphology like bacteria. We propose to reduce the amount of annotation and computing power needed for retraining the model by introducing a few-shot domain adaptation approach that requires annotating only one to five cells of the new data to process and that quickly adapts the model to maintain high accuracy. Our results show a significant boost in accuracy after adaptation to very challenging bacteria datasets. Comment: ISBI 202

arXiv.org e-Print Archive

Automated Classification of Vowel Category and Speaker Type in the High-Frequency Spectrum

Authors: Donai Jeremy J.; Doretto Gianfranco; Motiian Saeid
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2016

The high-frequency region of vowel signals (above the third formant or F3) has received little research attention. Recent evidence, however, has documented the perceptual utility of high-frequency information in the speech signal above the traditional frequency bandwidth known to contain important cues for speech and speaker recognition. The purpose of this study was to determine if high-pass filtered vowels could be separated by vowel category and speaker type in a supervised learning framework. Mel-frequency cepstral coefficients (MFCCs) were extracted from productions of six vowel categories produced by two male, two female, and two child speakers. Results revealed that the filtered vowels were well separated by vowel category and speaker type using MFCCs from the high-frequency spectrum. This demonstrates the presence of useful information for automated classification from the high-frequency region and is the first study to report findings of this nature in a supervised learning framework.

Multidisciplinary Digital Publishing Institute

Crossref

Directory of Open Access Journals

PubMed Central

The Research Repository @ WVU (West Virginia University)

Self-supervised Interest Point Detection and Description for Fisheye and Perspective Images

Authors: Doretto Gianfranco; Gu Yu; Mera-Trujillo Marcela; Patel Shivang
Publication date: 02/06/2023

Keypoint detection and matching is a fundamental task in many computer vision problems, from shape reconstruction, to structure from motion, to AR/VR applications and robotics. It is a well-studied problem with remarkable successes such as SIFT, and more recent deep learning approaches. While great robustness is exhibited by these techniques with respect to noise, illumination variation, and rigid motion transformations, less attention has been placed on image distortion sensitivity. In this work, we focus on the case when this is caused by the geometry of the cameras used for image acquisition, and consider the keypoint detection and matching problem between the hybrid scenario of a fisheye and a projective image. We build on a state-of-the-art approach and derive a self-supervised procedure that enables training an interest point detector and descriptor network. We also collected two new datasets for additional training and testing in this unexplored scenario, and we demonstrate that current approaches are suboptimal because they are designed to work in traditional projective conditions, while the proposed approach turns out to be the most effective. Comment: CVPR Workshop on Omnidirectional Computer Vision, 202

arXiv.org e-Print Archive

Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation

Authors: Doretto Gianfranco; Hanyu Taisei; Le Ngan; Nguyen Anh; Pham Thang; Tran Minh; Vo Khoa; Yamazaki Kashu
Publication date: 05/10/2023

Precise 3D environmental mapping is pivotal in robotics. Existing methods often rely on predefined concepts during training or are time-intensive when generating semantic maps. This paper presents Open-Fusion, a groundbreaking approach for real-time open-vocabulary 3D mapping and queryable scene representation using RGB-D data. Open-Fusion harnesses the power of a pre-trained vision-language foundation model (VLFM) for open-set semantic comprehension and employs the Truncated Signed Distance Function (TSDF) for swift 3D scene reconstruction. By leveraging the VLFM, we extract region-based embeddings and their associated confidence maps. These are then integrated with 3D knowledge from TSDF using an enhanced Hungarian-based feature-matching mechanism. Notably, Open-Fusion delivers outstanding annotation-free 3D segmentation for open-vocabulary without necessitating additional 3D training. Benchmark tests on the ScanNet dataset against leading zero-shot methods highlight Open-Fusion's superiority. Furthermore, it seamlessly combines the strengths of region-based VLFM and TSDF, facilitating real-time 3D scene comprehension that includes object concepts and open-world semantics. We encourage the readers to view the demos on our project page: https://uark-aicv.github.io/OpenFusio

arXiv.org e-Print Archive

SAM3D: Segment Anything Model in Volumetric Medical Images

Authors: Adjeroh Donald; Bui Nhat-Tan; Choudhary Arabinda; Doretto Gianfranco; Hoang Dinh-Hieu; Le Ngan; Patel Brijesh; Tran Minh-Triet
Publication date: 05/03/2024

Image segmentation remains a pivotal component in medical image analysis, aiding in the extraction of critical information for precise diagnostic practices. With the advent of deep learning, automated image segmentation methods have risen to prominence, showcasing exceptional proficiency in processing medical imagery. Motivated by the Segment Anything Model (SAM), a foundational model renowned for its remarkable precision and robust generalization capabilities in segmenting 2D natural images, we introduce SAM3D, an innovative adaptation tailored for 3D volumetric medical image analysis. Unlike current SAM-based methods that segment volumetric data by converting the volume into separate 2D slices for individual analysis, our SAM3D model processes the entire 3D volume image in a unified approach. Extensive experiments are conducted on multiple medical image datasets to demonstrate that our network attains competitive results compared with other state-of-the-art methods in 3D medical segmentation tasks while being significantly more efficient in terms of parameters. Code and checkpoints are available at https://github.com/UARK-AICV/SAM3D. Comment: Accepted at ISBI 202

arXiv.org e-Print Archive

Current Topological and Machine Learning Applications for Bias Detection in Text

Authors: Atalls Shadi; Carlsson Gunnar; Choudhary Ashok; Doretto Gianfranco; Farrelly Colleen; Hathaway Quincy A.; Himeur Yassine; Mansoor Wathiq; Paul Rahul; Singh Yashbir
Publication date: 22/11/2023

Institutional bias can impact patient outcomes, educational attainment, and legal system navigation. Written records often reflect bias, and once bias is identified, it is possible to refer individuals for training to reduce bias. Many machine learning tools exist to explore text data and create predictive models that can search written records to identify real-time bias. However, few previous studies investigate large language model embeddings and geometric models of biased text data to understand geometry's impact on bias modeling accuracy. To overcome this issue, this study utilizes the RedditBias database to analyze textual biases. Four transformer models, including BERT and RoBERTa variants, were explored. Post-embedding, t-SNE allowed two-dimensional visualization of data. KNN classifiers differentiated bias types, with lower k-values proving more effective. Findings suggest BERT, particularly mini BERT, excels in bias classification, while multilingual models lag. The recommendation emphasizes refining monolingual models and exploring domain-specific biases.

arXiv.org e-Print Archive

Artificial Intelligence

Author: Doretto Gianfranco
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2019

The Research Repository @ WVU (West Virginia University)