Robust Brain MRI Image Classification with SIBOW-SVM
The majority of primary Central Nervous System (CNS) tumors in the brain are
among the most aggressive diseases affecting humans. Early detection of brain
tumor types, whether benign or malignant, glial or non-glial, is critical for
cancer prevention and treatment, ultimately improving human life expectancy.
Magnetic Resonance Imaging (MRI) stands as the most effective technique to
detect brain tumors by generating comprehensive brain images through scans.
However, human examination can be error-prone and inefficient due to the
complexity, size, and location variability of brain tumors. Recently, automated
classification techniques using machine learning (ML) methods, such as
Convolutional Neural Networks (CNNs), have demonstrated significantly higher
accuracy than manual screening, while maintaining low computational costs.
Nonetheless, deep learning-based image classification methods, including CNNs,
face challenges in estimating class probabilities without proper model
calibration. In this paper, we propose a novel brain tumor image classification
method, called SIBOW-SVM, which integrates the Bag-of-Features (BoF) model with
SIFT feature extraction and weighted Support Vector Machines (wSVMs). This new
approach effectively captures hidden image features, enabling the
differentiation of various tumor types and accurate label predictions.
Additionally, the SIBOW-SVM is able to estimate the probabilities of images
belonging to each class, thereby providing high-confidence classification
decisions. We have also developed scalable and parallelizable algorithms to
facilitate the practical implementation of SIBOW-SVM for massive images. As a
benchmark, we apply the SIBOW-SVM to a public data set of brain tumor MRI
images containing four classes: glioma, meningioma, pituitary, and normal. Our
results show that the new method outperforms state-of-the-art methods,
including CNNs.
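
As a rough illustration of the Bag-of-Features step mentioned in this abstract, the Python sketch below assigns local descriptors to their nearest visual word and builds a normalized histogram for one image. It is not the authors' implementation: the function name `bof_histogram`, the 2-D toy descriptors, and the hand-picked codebook are assumptions made for illustration (real SIFT descriptors are 128-dimensional, and a codebook would typically be learned by clustering). The weighted SVM stage is omitted.

```python
from math import dist

def bof_histogram(descriptors, codebook):
    """Return an L1-normalized visual-word histogram for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Hard-assign the descriptor to its nearest visual word (Euclidean).
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]

# Toy data (hypothetical): three visual words, four local descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.1, 0.9)]
hist = bof_histogram(descriptors, codebook)
print(hist)  # [0.25, 0.5, 0.25]
```

A fixed-length histogram like this is what a downstream classifier would consume, regardless of how many local features each image produces.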
Efficient Match Pair Retrieval for Large-scale UAV Images via Graph Indexed Global Descriptor
SfM (Structure from Motion) has been extensively used for UAV (Unmanned
Aerial Vehicle) image orientation. Its efficiency is directly influenced by
feature matching. Although image retrieval has been extensively used for match
pair selection, it incurs high computational costs due to the large number of
local features and the large size of the codebook. Thus, this paper
proposes an efficient match pair retrieval method and implements an integrated
workflow for parallel SfM reconstruction. First, an individual codebook is
trained online by considering the redundancy of UAV images and local features,
which avoids the ambiguity of training codebooks from other datasets. Second,
local features of each image are aggregated into a single high-dimensional
global descriptor through VLAD (Vector of Locally Aggregated Descriptors)
aggregation using the trained codebook, which remarkably reduces the number
of features and the burden of nearest neighbor searching in image indexing.
Third, the global descriptors are indexed via an HNSW (Hierarchical Navigable
Small World) based graph structure for nearest neighbor searching. Match
pairs are then retrieved by using an adaptive threshold selection strategy and
utilized to create a view graph for divide-and-conquer based parallel SfM
reconstruction. Finally, the performance of the proposed solution has been
verified using three large-scale UAV datasets. The test results demonstrate
that the proposed solution accelerates match pair retrieval with a speedup
ratio ranging from 36 to 108 and improves the efficiency of SfM reconstruction
with competitive accuracy in both relative and absolute orientation.
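
The VLAD aggregation named in this abstract can be sketched in a few lines of Python. This is a minimal, textbook-style version under stated assumptions, not the paper's pipeline: the function name `vlad`, the 2-D toy descriptors, and the two-word codebook are hypothetical, only a global L2 normalization is applied, and the HNSW indexing and adaptive threshold steps are left out.

```python
from math import dist, sqrt

def vlad(descriptors, codebook):
    """Aggregate local descriptors into one L2-normalized VLAD vector."""
    dim = len(codebook[0])
    residuals = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        # Nearest codeword under Euclidean distance.
        k = min(range(len(codebook)), key=lambda i: dist(d, codebook[i]))
        # Accumulate the residual (descriptor minus its codeword).
        for j in range(dim):
            residuals[k][j] += d[j] - codebook[k][j]
    # Concatenate the per-codeword residual sums into one vector.
    flat = [v for row in residuals for v in row]
    norm = sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Toy data (hypothetical): two codewords, three local descriptors.
codebook = [(0.0, 0.0), (4.0, 4.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (4.0, 6.0)]
g = vlad(descriptors, codebook)
print(len(g))  # 4, i.e. number of codewords times descriptor dimension
```

The output length depends only on the codebook size and descriptor dimension, so each image is represented by a single vector that a nearest neighbor index can store directly.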
Prompt, Plan, Perform: LLM-based Humanoid Control via Quantized Imitation Learning
In recent years, reinforcement learning and imitation learning have shown
great potential for controlling humanoid robots' motion. However, these methods
typically create simulation environments and rewards for specific tasks, so
they require multiple policies and have limited capability for tackling
complex and unknown tasks. To overcome these issues, we present a
novel approach that combines adversarial imitation learning with large language
models (LLMs). This innovative method enables the agent to learn reusable
skills with a single policy and solve zero-shot tasks under the guidance of
LLMs. In particular, we utilize the LLM as a strategic planner for applying
previously learned skills to novel tasks through the comprehension of
task-specific prompts. This empowers the robot to perform the specified actions
in a sequence. To improve our model, we incorporate codebook-based vector
quantization, allowing the agent to generate suitable actions in response to
unseen textual commands from LLMs. Furthermore, we design general reward
functions that consider the distinct motion features of humanoid robots,
ensuring the agent imitates the motion data while maintaining goal orientation
without additional guiding direction approaches or policies. To the best of our
knowledge, this is the first framework that controls humanoid robots using a
single learning policy network and an LLM as a planner. Extensive experiments
demonstrate that our method is efficient and adaptive in complicated motion
tasks.
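
The codebook-based vector quantization mentioned in this abstract amounts, at its core, to snapping a continuous vector to its nearest codebook entry. The Python sketch below shows only that lookup. The names `quantize` and `skill_codebook`, and the 3-D toy values, are assumptions for illustration; how the codebook is trained and how the resulting codes drive the policy are not described in the abstract and are not modeled here.

```python
def quantize(latent, codebook):
    """Return (index, code) of the codebook entry closest to `latent`."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    # Pick the entry with the smallest squared Euclidean distance.
    index = min(range(len(codebook)), key=lambda k: sq_dist(latent, codebook[k]))
    return index, codebook[index]

# Toy codebook (hypothetical): three discrete codes in a 3-D latent space.
skill_codebook = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
index, code = quantize((0.8, 0.1, 0.2), skill_codebook)
print(index, code)  # 1 (1.0, 0.0, 0.0)
```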