Towards Quantitative Endoscopy with Vision Intelligence
In this thesis, we address quantitative endoscopy with vision-based intelligence. Specifically, our work centers on video reconstruction in endoscopy, where challenges such as texture scarcity, illumination variation, and multimodality prevent prior methods from working effectively and robustly. To this end, we combine the expressivity of deep learning with the rigor and accuracy of non-linear optimization to develop a series of methods that confront these challenges. We first propose a retrospective sparse reconstruction method that estimates an accurate, dense point cloud and a highly complete camera trajectory from a monocular endoscopic video with state-of-the-art performance. To enable this, a deep image feature descriptor replaces the hand-crafted local descriptor to boost feature matching performance in a typical sparse reconstruction algorithm. A retrospective surface reconstruction pipeline is then proposed to estimate a textured surface model from a monocular endoscopic video, using self-supervised depth and descriptor learning together with a surface fusion technique. We show that the proposed method outperforms a popular dense reconstruction method and that the estimated reconstructions are in good agreement with surface models obtained from CT scans. To align video-reconstructed surface models with pre-operative imaging such as CT, we introduce a global point cloud registration algorithm that is robust to the resolution mismatch that often occurs in such multi-modal scenarios. Specifically, we develop a geometric feature descriptor in which a novel network normalization technique helps a 3D network produce more consistent and distinctive geometric features for samples with different resolutions. In our evaluation, the proposed geometric descriptor achieves state-of-the-art performance. Finally, we develop a real-time SLAM system that estimates surface geometry and camera trajectory from a monocular endoscopic video, using deep representations for geometry and appearance together with non-linear factor graph optimization. We show that the proposed SLAM system performs favorably compared with a state-of-the-art feature-based SLAM system.
Temporal similarity metrics for latent network reconstruction: The role of time-lag decay
When investigating the spreading of a piece of information or the diffusion
of an innovation, we often lack information on the underlying propagation
network. Reconstructing the hidden propagation paths based on the observed
diffusion process is a challenging problem which has recently attracted
attention from diverse research fields. To address this reconstruction problem,
based on static similarity metrics commonly used in the link prediction
literature, we introduce new node-node temporal similarity metrics. The new
metrics take as input the time-series of multiple independent spreading
processes, based on the hypothesis that two nodes are more likely to be
connected if they were often infected at similar points in time. This
hypothesis is implemented by introducing a time-lag function which penalizes
distant infection times. We find that the choice of this time-lag function
strongly affects the metrics' reconstruction accuracy, depending on the
network's clustering coefficient, and we provide an extensive comparative analysis of
static and temporal similarity metrics for network reconstruction. Our findings
shed new light on the notion of similarity between pairs of nodes in complex
networks.
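A minimal sketch of the idea described in this abstract, not the authors' implementation: each pair of nodes is scored by summing, over independent spreading processes, a time-lag function of the gap between their infection times. The exponential decay `exp(-lag / tau)`, the parameter `tau`, the dict-based cascade format, and the function names `temporal_similarity` and `rank_candidate_links` are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from itertools import combinations


def temporal_similarity(cascades, tau=1.0):
    """Score node pairs by how close in time they were infected.

    cascades: list of dicts mapping node -> infection time, one dict per
              independent spreading process (hypothetical input format).
    tau:      decay scale of the assumed exponential time-lag function.
    Returns a dict mapping (i, j) with i < j to an accumulated score.
    """
    scores = {}
    for cascade in cascades:
        # Only nodes infected in the same process contribute to a pair.
        for i, j in combinations(sorted(cascade), 2):
            lag = abs(cascade[i] - cascade[j])
            # Distant infection times are penalized by the decay.
            scores[(i, j)] = scores.get((i, j), 0.0) + math.exp(-lag / tau)
    return scores


def rank_candidate_links(scores, k):
    """Return the k highest-scoring pairs as candidate network links."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: three spreading processes over nodes a, b, c.
cascades = [
    {"a": 0.0, "b": 1.0, "c": 5.0},
    {"a": 2.0, "b": 2.5},
    {"b": 0.0, "c": 4.0},
]
scores = temporal_similarity(cascades, tau=1.0)
top = rank_candidate_links(scores, 1)
```

Changing `tau` changes how sharply late co-infections are discounted, which is one simple way to explore the abstract's point that the choice of time-lag function affects reconstruction accuracy.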
Learning to Count Isomorphisms with Graph Neural Networks
Subgraph isomorphism counting is an important problem on graphs, as many
graph-based tasks exploit recurring subgraph patterns. Classical methods
usually boil down to a backtracking framework that needs to navigate a huge
search space with prohibitive computational costs. Some recent studies resort
to graph neural networks (GNNs) to learn a low-dimensional representation for
both the query and input graphs, in order to predict the number of subgraph
isomorphisms on the input graph. However, typical GNNs employ a node-centric
message passing scheme that receives and aggregates messages on nodes, which is
inadequate in complex structure matching for isomorphism counting. Moreover, on
an input graph, the space of possible query graphs is enormous, and different
parts of the input graph will be triggered to match different queries. Thus,
expecting a fixed representation of the input graph to match diversely
structured query graphs is unrealistic. In this paper, we propose a novel GNN
called Count-GNN for subgraph isomorphism counting, to deal with the above
challenges. At the edge level, given that an edge is an atomic unit of encoding
graph structures, we propose an edge-centric message passing scheme, where
messages on edges are propagated and aggregated based on the edge adjacency to
preserve fine-grained structural information. At the graph level, we modulate
the input graph representation conditioned on the query, so that the input
graph can be adapted to each query individually to improve their matching.
Finally, we conduct extensive experiments on a number of benchmark datasets to
demonstrate the superior performance of Count-GNN.
Comment: AAAI-23 main trac
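The edge-centric message passing described in this abstract can be illustrated with a small sketch. This is an assumption-laden toy, not the Count-GNN architecture: it uses plain sum aggregation with no learned weights, treats a directed edge (w, u) as adjacent to (u, v) when it ends where (u, v) starts, and the function name `edge_message_passing` is hypothetical.

```python
def edge_message_passing(edges, features, rounds=1):
    """Propagate features between adjacent directed edges.

    edges:    list of (u, v) directed edges.
    features: dict mapping each edge to a list of floats.
    rounds:   number of message passing iterations.
    Each round, edge (u, v) adds the current features of every edge
    (w, u) that ends at its source node u to its own features.
    """
    # Index edges by the node they point to, to find edges ending at u.
    incoming = {}
    for (u, v) in edges:
        incoming.setdefault(v, []).append((u, v))

    h = {e: list(features[e]) for e in edges}
    for _ in range(rounds):
        new_h = {}
        for (u, v) in edges:
            agg = list(h[(u, v)])  # start from the edge's own state
            for neighbor in incoming.get(u, []):
                agg = [a + b for a, b in zip(agg, h[neighbor])]
            new_h[(u, v)] = agg
        h = new_h
    return h


# Toy directed triangle 0 -> 1 -> 2 -> 0 with scalar edge features.
edges = [(0, 1), (1, 2), (2, 0)]
features = {(0, 1): [1.0], (1, 2): [2.0], (2, 0): [4.0]}
h = edge_message_passing(edges, features, rounds=1)
```

Here the state lives on edges rather than nodes, so each update sees which edges chain together, which is the kind of structural information the abstract argues node-centric schemes handle poorly.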