Attention and Pooling based Sigmoid Colon Segmentation in 3D CT images
Segmentation of the sigmoid colon is a crucial aspect of treating
diverticulitis. It enables accurate identification and localisation of
inflammation, which in turn helps healthcare professionals make informed
decisions about the most appropriate treatment options. This research presents
a novel deep learning architecture for segmenting the sigmoid colon from
Computed Tomography (CT) images using a modified 3D U-Net architecture. Several
variations of the 3D U-Net model with modified hyper-parameters were examined
in this study. Pyramid pooling (PyP) and channel-spatial Squeeze and Excitation
(csSE) were also used to improve the model performance. The networks were
trained using manual annotations of the sigmoid colon. A five-fold cross-validation
procedure was used on a test dataset to evaluate the network's performance. As
indicated by the maximum Dice similarity coefficient (DSC) of 56.92+/-1.42%,
the application of PyP and csSE techniques improves segmentation precision. We
explored ensemble methods including averaging, weighted averaging, majority
voting, and max ensemble. The results show that the averaging and majority
voting approaches, with a threshold value of 0.5 and a consistent weight
distribution among the top three models, produced comparable and optimal
results, with a DSC of 88.11+/-3.52%. The results indicate that the application of a modified 3D U-Net
architecture is effective for segmenting the sigmoid colon in Computed
Tomography (CT) images. In addition, the study highlights the potential
benefits of integrating ensemble methods to improve segmentation precision.
Comment: 8 Pages, 6 figures, Accepted at IEEE DICTA 202
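
The abstract above names four ensemble strategies and reports results as a Dice similarity coefficient, but does not spell out how either is computed. The following is a minimal sketch under common definitions, not the authors' implementation. The function names, the flat-list representation of voxel probabilities, and the toy numbers are all assumptions made for illustration; only the 0.5 threshold comes from the abstract, and equal default weights are one reading of its "consistent weight distribution".

```python
def dice(pred, truth):
    """Dice similarity coefficient between two flat binary masks."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 1.0 if total == 0 else 2.0 * intersection / total


def average_ensemble(probs, weights=None, threshold=0.5):
    """Fuse per-model probabilities by (weighted) averaging, then threshold."""
    if weights is None:
        weights = [1.0 / len(probs)] * len(probs)  # equal weights by default
    fused = [sum(w * p for w, p in zip(weights, column)) for column in zip(*probs)]
    return [int(value >= threshold) for value in fused]


def majority_vote(probs, threshold=0.5):
    """Binarise each model's output, then keep voxels most models agree on."""
    votes = [[int(value >= threshold) for value in model] for model in probs]
    return [int(2 * sum(column) > len(probs)) for column in zip(*votes)]


def max_ensemble(probs, threshold=0.5):
    """Keep a voxel if the most confident model exceeds the threshold."""
    return [int(max(column) >= threshold) for column in zip(*probs)]


# Toy example: three hypothetical models, four voxels.
probs = [
    [0.9, 0.2, 0.6, 0.1],
    [0.8, 0.4, 0.3, 0.2],
    [0.7, 0.7, 0.4, 0.1],
]
truth = [1, 0, 1, 0]

for name, mask in [
    ("average", average_ensemble(probs)),
    ("weighted", average_ensemble(probs, weights=[0.5, 0.3, 0.2])),
    ("majority", majority_vote(probs)),
    ("max", max_ensemble(probs)),
]:
    print(name, mask, round(dice(mask, truth), 3))
```

In this toy case the max ensemble is the most permissive of the four, since a single confident model is enough to label a voxel as foreground, while averaging and majority voting both require broader agreement.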
Relation preserving triplet mining for stabilising the triplet loss in re-identification systems
Object appearances change dramatically with pose variations. This creates a
challenge for embedding schemes that seek to map instances with the same object
ID to locations that are as close as possible. This issue becomes significantly
heightened in complex computer vision tasks such as re-identification (reID). In
this paper, we suggest that these dramatic appearance changes are indications
that an object ID is composed of multiple natural groups, and it is
counterproductive to forcefully map instances from different groups to a common
location. This leads us to introduce Relation Preserving Triplet Mining (RPTM),
a feature-matching-guided triplet mining scheme that ensures that triplets
will respect the natural subgroupings within an object ID. We use this triplet
mining mechanism to establish a pose-aware, well-conditioned triplet loss by
implicitly enforcing view consistency. This allows a single network to be
trained with fixed parameters across datasets while providing state-of-the-art
results. Code is available at https://github.com/adhirajghosh/RPTM_reid.
Comment: WACV 202
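
The abstract above builds on the triplet loss without stating its form. As background, here is a minimal sketch of the standard margin-based triplet loss on plain Python vectors. It is not the RPTM mining scheme itself: the function names, the Euclidean distance, the margin of 0.3, and the toy embeddings are assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard margin-based triplet loss.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [3.0, 4.0]        # same ID, but far from the anchor
near_negative = [0.0, 1.0]   # different ID, close to the anchor
far_negative = [6.0, 8.0]    # different ID, far from the anchor

print(triplet_loss(anchor, positive, near_negative))
print(triplet_loss(anchor, positive, far_negative))
```

The first triplet pairs the anchor with a same-ID positive that lies far away in embedding space, which yields a large loss. This is the situation the abstract describes, where a positive drawn from a different natural group of the same ID forces the network to pull dissimilar views together; how the positive is chosen is what a mining scheme controls.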