An Alarm System For Segmentation Algorithm Based On Shape Model
It is usually hard for a learning system to predict correctly on rare events
that never occur in the training data, and there is no exception for
segmentation algorithms. Meanwhile, manual inspection of each case to locate
the failures becomes infeasible due to the trend of large data scale and
limited human resource. Therefore, we build an alarm system that will set off
alerts when the segmentation result is possibly unsatisfactory, assuming no
corresponding ground truth mask is provided. One plausible solution is to
project the segmentation results into a low dimensional feature space; then
learn classifiers/regressors to predict their qualities. Motivated by this, in
this paper, we learn a feature space using the shape information which is a
strong prior shared among different datasets and robust to the appearance
variation of input data. The shape feature is captured using a Variational
Auto-Encoder (VAE) network that is trained with only the ground truth masks.
During testing, the segmentation results with bad shapes shall not fit the
shape prior well, resulting in large loss values. Thus, the VAE is able to
evaluate the quality of segmentation result on unseen data, without using
ground truth. Finally, we learn a regressor in the one-dimensional feature
space to predict the qualities of segmentation results. Our alarm system is
evaluated on several recent state-of-the-art segmentation algorithms for 3D medical
segmentation tasks. Compared with other standard quality assessment methods,
our system consistently provides more reliable prediction on the qualities of
segmentation results.
Comment: Accepted to ICCV 2019 (10 pages, 4 figures)
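The last step of the abstract above, a regressor on a one-dimensional feature, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: it assumes the VAE loss of each segmentation has already been computed, assumes quality is measured as a Dice score, and uses an ordinary least-squares line as the regressor. The function names `fit_line` and `should_alarm`, the sample values, and the 0.7 threshold are all hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def should_alarm(loss, slope, intercept, threshold=0.7):
    """Raise an alarm when the predicted quality falls below the threshold."""
    predicted_quality = slope * loss + intercept
    return predicted_quality < threshold


# Hypothetical training pairs: (shape-prior loss, measured quality).
# These numbers are made up for illustration only.
vae_losses = [0.1, 0.2, 0.4, 0.8]
dice_scores = [0.95, 0.90, 0.80, 0.60]

slope, intercept = fit_line(vae_losses, dice_scores)

# At test time only the loss is available; no ground truth mask is needed.
print(should_alarm(0.9, slope, intercept))
print(should_alarm(0.1, slope, intercept))
```

A larger loss maps to a lower predicted quality here because the fitted slope is negative, which matches the abstract's claim that badly shaped segmentations fit the shape prior poorly.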
End-to-End Adversarial Shape Learning for Abdomen Organ Deep Segmentation
Automatic segmentation of abdomen organs using medical imaging has many
potential applications in clinical workflows. Recently, the state-of-the-art
performance for organ segmentation has been achieved by deep learning models,
i.e., convolutional neural networks (CNNs). However, it is challenging to train
conventional CNN-based segmentation models that are aware of the shape and
topology of organs. In this work, we tackle this problem by introducing a novel
end-to-end shape learning architecture -- organ point-network. It takes deep
learning features as inputs and generates organ shape representations as points
that are located on the organ surface. We later present a novel adversarial shape
learning objective function to optimize the point-network to capture shape
information better. We train the point-network together with a CNN-based
segmentation model in a multi-task fashion so that the shared network
parameters can benefit from both shape learning and segmentation tasks. We
demonstrate our method with three challenging abdomen organs including liver,
spleen, and pancreas. The point-network generates surface points with
fine-grained details, which we find critical for improving organ segmentation.
Consequently, the deep segmentation model is improved by the introduced shape
learning as significantly better Dice scores are observed for spleen and
pancreas segmentation.
Comment: Accepted to International Workshop on Machine Learning in Medical
Imaging (MLMI2019)
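For readers unfamiliar with the Dice score reported in the abstract above, the following minimal sketch computes it for two binary masks. It uses the standard definition, twice the overlap divided by the sum of the two mask sizes. Representing the masks as flat lists of 0/1 values, the function name `dice_score`, and the convention that two empty masks score 1.0 are assumptions made for this illustration.

```python
def dice_score(pred, target):
    """Dice coefficient: 2 * |overlap| / (|pred| + |target|) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical five-voxel masks, for illustration only.
pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
print(dice_score(pred, target))
```

In this example the masks overlap on two voxels and each contains three foreground voxels, so the score is 4/6.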
DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding
This paper presents DavarOCR, an open-source toolbox for OCR and document
understanding tasks. DavarOCR currently implements 19 advanced algorithms,
covering 9 different task forms. DavarOCR provides detailed usage instructions
and the trained models for each algorithm. Compared with the previous
open-source OCR toolbox, DavarOCR has relatively more complete support for the
sub-tasks of the cutting-edge technology of document understanding. In order to
promote the development and application of OCR technology in academia and
industry, we pay more attention to the use of modules that different
sub-domains of technology can share. DavarOCR is publicly released at
https://github.com/hikopensource/Davar-Lab-OCR.
Comment: Short paper, Accepted by ACM MM202
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
Large language models (LLMs) have recently demonstrated remarkable
capabilities to comprehend human intentions, engage in reasoning, and design
planning-like behavior. To further unleash the power of LLMs to accomplish
complex tasks, there is a growing trend to build agent frameworks that equip
LLMs, such as ChatGPT, with tool-use abilities to connect with massive external
APIs. In this work, we introduce ModelScope-Agent, a general and customizable
agent framework for real-world applications, based on open-source LLMs as
controllers. It provides a user-friendly system library, with customizable
engine design to support model training on multiple open-source LLMs, while
also enabling seamless integration with both model APIs and common APIs in a
unified way. To equip the LLMs with tool-use abilities, a comprehensive
framework has been proposed spanning over tool-use data collection, tool
retrieval, tool registration, memory control, customized model training, and
evaluation for practical real-world applications. Finally, we showcase
ModelScopeGPT, a real-world intelligent assistant of ModelScope Community based
on the ModelScope-Agent framework, which is able to connect open-source LLMs
with more than 1000 public AI models and localized community knowledge in
ModelScope. The ModelScope-Agent library
(https://github.com/modelscope/modelscope-agent) and online demo
(https://modelscope.cn/studios/damo/ModelScopeGPT/summary) are now
publicly available.
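The tool registration and tool retrieval steps named in the abstract above can be illustrated with a toy sketch. This is not the ModelScope-Agent API: the `ToolRegistry` class, its methods, the two example tools, and the keyword-overlap scoring are all hypothetical and stand in for whatever retrieval method the framework actually uses.

```python
from typing import Callable, Dict, List, Tuple


class ToolRegistry:
    """Hypothetical registry mapping tool names to descriptions and callables."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tuple[str, Callable[..., str]]] = {}

    def register(self, name: str, description: str, func: Callable[..., str]) -> None:
        """Make a tool available to the agent under the given name."""
        self._tools[name] = (description, func)

    def retrieve(self, query: str, top_k: int = 1) -> List[str]:
        """Rank tools by word overlap between the query and each description."""
        query_words = set(query.lower().split())
        scored = []
        for name, (description, _) in self._tools.items():
            overlap = len(query_words & set(description.lower().split()))
            scored.append((overlap, name))
        # Highest overlap first; ties are broken alphabetically by name.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [name for overlap, name in scored[:top_k] if overlap > 0]

    def call(self, name: str, **kwargs) -> str:
        """Invoke a registered tool with keyword arguments."""
        return self._tools[name][1](**kwargs)


# Two made-up tools, for illustration only.
registry = ToolRegistry()
registry.register("weather", "look up the weather forecast for a city",
                  lambda city: f"Sunny in {city}")
registry.register("translate", "translate text into another language",
                  lambda text: text.upper())

selected = registry.retrieve("what is the weather in Paris")
print(selected)
print(registry.call(selected[0], city="Paris"))
```

In a real agent the LLM controller would choose the tool and fill in its arguments; the keyword match here only shows where retrieval sits between registration and invocation.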
CancerUniT: Towards a Single Unified Model for Effective Detection, Segmentation, and Diagnosis of Eight Major Cancers Using a Large Collection of CT Scans
Human readers or radiologists routinely perform full-body multi-organ
multi-disease detection and diagnosis in clinical practice, while most medical
AI systems are built to focus on single organs with a narrow list of a few
diseases. This might severely limit AI's clinical adoption. A certain number of
AI models need to be assembled non-trivially to match the diagnostic process of
a human reading a CT scan. In this paper, we construct a Unified Tumor
Transformer (CancerUniT) model to jointly detect tumor existence & location and
diagnose tumor characteristics for eight major cancers in CT scans. CancerUniT
is a query-based Mask Transformer model with the output of multi-tumor
prediction. We decouple the object queries into organ queries, tumor detection
queries and tumor diagnosis queries, and further establish hierarchical
relationships among the three groups. This clinically-inspired architecture
effectively assists inter- and intra-organ representation learning of tumors
and facilitates the resolution of these complex, anatomically related
multi-organ cancer image reading tasks. CancerUniT is trained end-to-end using
a curated large-scale collection of CT images from 10,042 patients, including eight major types
of cancers and occurring non-cancer tumors (all are pathology-confirmed with 3D
tumor masks annotated by radiologists). On the test set of 631 patients,
CancerUniT has demonstrated strong performance under a set of clinically
relevant evaluation metrics, substantially outperforming both multi-disease
methods and an assembly of eight single-organ expert models in tumor detection,
segmentation, and diagnosis. This moves one step closer towards a universal
high-performance cancer screening tool.
Comment: ICCV 2023 Camera Ready Version
Cytosolic delivery of siRNA by ultra-high affinity dsRNA binding proteins
Protein-based methods of siRNA delivery are capable of uniquely specific targeting, but are limited by technical challenges such as low potency or poor biophysical properties. Here, we engineered a series of ultra-high affinity siRNA binders based on the viral protein p19 and developed them into siRNA carriers targeted to the epidermal growth factor receptor (EGFR). Combined in trans with a previously described endosome-disrupting agent composed of the pore-forming protein Perfringolysin O (PFO), potent silencing was achieved in vitro with no detectable cytotoxicity. Despite concerns that excessively strong siRNA binding could prevent the discharge of siRNA from its carrier, higher affinity continually led to stronger silencing. We found that this improvement was due to both increased uptake of siRNA into the cell and improved pharmacodynamics inside the cell. Mathematical modeling predicted the existence of an affinity optimum that maximizes silencing, after which siRNA sequestration decreases potency. Our study characterizing the affinity dependence of silencing suggests that siRNA-carrier affinity can significantly affect the intracellular fate of siRNA and may serve as a handle for improving the efficiency of delivery. The two-agent delivery system presented here possesses notable biophysical properties and potency, and provides a platform for the cytosolic delivery of nucleic acids.
The Medical Segmentation Decathlon
International challenges have become the de facto standard for comparative
assessment of image analysis algorithms given a specific task. Segmentation is
so far the most widely investigated medical image processing task, but the
various segmentation challenges have typically been organized in isolation,
such that algorithm development was driven by the need to tackle a single
specific clinical problem. We hypothesized that a method capable of performing
well on multiple tasks will generalize well to a previously unseen task and
potentially outperform a custom-designed solution. To investigate the
hypothesis, we organized the Medical Segmentation Decathlon (MSD) - a
biomedical image analysis challenge, in which algorithms compete in a multitude
of both tasks and modalities. The underlying data set was designed to explore
the axis of difficulties typically encountered when dealing with medical
images, such as small data sets, unbalanced labels, multi-site data and small
objects. The MSD challenge confirmed that algorithms with consistently good
performance on a set of tasks preserved their good average performance on a
different set of previously unseen tasks. Moreover, by monitoring the MSD
winner for two years, we found that this algorithm continued generalizing well
to a wide range of other clinical problems, further confirming our hypothesis.
Three main conclusions can be drawn from this study: (1) state-of-the-art image
segmentation algorithms are mature, accurate, and generalize well when
retrained on unseen tasks; (2) consistent algorithmic performance across
multiple tasks is a strong surrogate of algorithmic generalizability; (3) the
training of accurate AI segmentation models is now commoditized to non-AI
experts.