
    Non-Iterative Scribble-Supervised Learning with Pacing Pseudo-Masks for Medical Image Segmentation

    Scribble-supervised medical image segmentation tackles the limitation of sparse masks. Conventional approaches alternate between two stages: labeling pseudo-masks and optimizing network parameters. However, such an iterative two-stage paradigm is unwieldy and can become trapped in poor local optima, since the network undesirably regresses to erroneous pseudo-masks. To address these issues, we propose a non-iterative method, named PacingPseudo, in which a stream of varying (pacing) pseudo-masks teaches a network via consistency training. Our motivation lies first in the non-iterative process. Interestingly, it can be achieved gracefully by a siamese architecture, wherein a stream of pseudo-masks naturally assimilates a stream of predicted masks during training. Second, we make the consistency training effective with two necessary designs: (i) entropy regularization to obtain high-confidence pseudo-masks for effective teaching; and (ii) distorted augmentations to create discrepancy between the pseudo-mask and predicted-mask streams for consistency regularization. Third, we devise a new memory bank mechanism that provides an extra source of ensemble features to complement the scarce labeled pixels. The efficacy of PacingPseudo is validated on three public medical image datasets covering the segmentation of abdominal multi-organs, cardiac structures, and myocardium. Extensive experiments demonstrate that PacingPseudo improves the baseline by large margins and consistently outcompetes several previous methods. In some cases, PacingPseudo achieves performance comparable to its fully-supervised counterpart, showing the feasibility of our method for challenging scribble-supervised segmentation applications. The code and scribble annotations will be publicly available.
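
    A minimal sketch of the consistency-training step described in the abstract, assuming a shared-weight (siamese) segmentation network in PyTorch; the distorted augmentation, loss weights, and the 255 ignore label are illustrative assumptions, and the memory bank mechanism is omitted.

```python
import torch
import torch.nn.functional as F

def consistency_step(model, images, scribbles, optimizer, lam_ent=0.1, lam_cons=1.0):
    """One training step: scribble cross-entropy + entropy regularization +
    consistency between a weak and a distorted view. Hypothetical helper;
    255 marks unlabeled pixels in the scribble mask (assumed convention)."""
    weak = images                                      # weak view (original image)
    strong = images + 0.1 * torch.randn_like(images)   # stand-in for a distorted augmentation

    logits_w = model(weak)                             # (B, C, H, W)
    logits_s = model(strong)

    # Supervised loss on the sparse scribble labels only.
    loss_sup = F.cross_entropy(logits_w, scribbles, ignore_index=255)

    # Entropy regularization -> high-confidence pseudo-masks on the weak view.
    prob_w = logits_w.softmax(dim=1)
    loss_ent = -(prob_w * prob_w.clamp_min(1e-8).log()).sum(dim=1).mean()

    # The pseudo-mask stream (weak view, stop-gradient) teaches the distorted view.
    pseudo = prob_w.argmax(dim=1).detach()
    loss_cons = F.cross_entropy(logits_s, pseudo)

    loss = loss_sup + lam_ent * loss_ent + lam_cons * loss_cons
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```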

    AI as Extraherics: Fostering Higher-order Thinking Skills in Human-AI Interaction

    As artificial intelligence (AI) technologies, including generative AI, continue to evolve, concerns have arisen about over-reliance on AI, which may lead to human deskilling and diminished cognitive engagement. Over-reliance on AI can also lead users to accept AI-provided information without critical examination, causing negative consequences such as being misled by hallucinated content. This paper introduces extraheric AI, a human-AI interaction conceptual framework that fosters users' higher-order thinking skills, such as creativity, critical thinking, and problem-solving, during task completion. Unlike existing human-AI interaction designs, which replace or augment human cognition, extraheric AI fosters cognitive engagement by posing questions or providing alternative perspectives to users, rather than direct answers. We discuss interaction strategies, evaluation methods aligned with cognitive load theory and Bloom's taxonomy, and future research directions to ensure that human cognitive skills remain a crucial element in AI-integrated environments, promoting a balanced partnership between humans and AI.

    Online Streaming Video Super-Resolution with Convolutional Look-Up Table

    Online video streaming faces fundamental limitations in transmission bandwidth and computational capacity, and super-resolution is a promising solution. However, applying existing video super-resolution methods to online streaming is non-trivial. Existing video codecs and streaming protocols (e.g., WebRTC) dynamically change the video quality both spatially and temporally, which leads to diverse and dynamic degradations. Furthermore, online streaming imposes a strict latency requirement that makes most existing methods inapplicable. As a result, this paper focuses on the rarely explored problem setting of online streaming video super-resolution. To facilitate research on this problem, we construct a new benchmark dataset named LDV-WebRTC based on a real-world online streaming system. Leveraging this benchmark, we propose a novel method specifically designed for online video streaming, which combines convolutions and Look-Up Tables (LUTs) in a hybrid model to achieve a better performance-latency trade-off. To tackle the changing degradations, we propose a mixture-of-expert-LUT module, in which a set of LUTs specialized for different degradations is built and adaptively combined. Experiments show that our method achieves 720P video super-resolution at around 100 FPS, while significantly outperforming existing LUT-based methods and offering competitive performance compared to efficient CNN-based methods.
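
    A hedged sketch of the adaptive mixing idea behind a mixture-of-expert-LUT module, in PyTorch; the per-intensity LUTs, gating network, and shapes are simplifications for illustration and not the paper's actual spatial LUT design for super-resolution.

```python
import torch
import torch.nn as nn

class MixtureOfLUTs(nn.Module):
    """Simplified sketch: K lookup-table "experts" whose outputs are blended
    with weights predicted from the degraded input frame, so different
    degradations are handled by different LUT combinations."""
    def __init__(self, num_experts=4, levels=256):
        super().__init__()
        # Each expert LUT maps a quantized intensity level to a corrected value.
        self.luts = nn.Parameter(torch.rand(num_experts, levels))
        # Tiny gating network: one weight per expert, predicted from the input.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(8), nn.Flatten(),
            nn.Linear(8 * 8, num_experts), nn.Softmax(dim=1))

    def forward(self, x):                      # x: (B, 1, H, W) in [0, 1]
        idx = (x * 255).long().clamp(0, 255)   # quantize to LUT indices
        # Evaluate every expert: (B, K, H, W)
        outs = torch.stack([lut[idx.squeeze(1)] for lut in self.luts], dim=1)
        w = self.gate(x)                       # (B, K) degradation-adaptive weights
        return (w[:, :, None, None] * outs).sum(dim=1, keepdim=True)

frame = torch.rand(2, 1, 64, 64)
print(MixtureOfLUTs()(frame).shape)            # torch.Size([2, 1, 64, 64])
```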

    Software-Defined Virtual Synchronous Condenser

    Synchronous condensers (SCs) play an important role in integrating wind energy into relatively weak power grids. However, the design of SCs usually depends on specific application requirements and may not adapt well to the frequently changing grid conditions caused by the transition from conventional to renewable power generation. This paper devises a software-defined virtual synchronous condenser (SDViSC) method to address these challenges. Our contributions are fourfold: 1) the design of a virtual synchronous condenser (ViSC) that enables full-converter wind turbines to provide built-in SC functionalities; 2) engineering of SDViSCs to transfer hardware-based ViSC controllers into software services, where a Tustin transformation-based software-defined control algorithm guarantees accurate tracking of fast dynamics under limited communication bandwidth; 3) a software-defined networking-enhanced SDViSC communication scheme that improves communication reliability and reduces bandwidth occupation; and 4) a prototype of SDViSC on our real-time, cyber-in-the-loop digital twin of a large wind farm in an RTDS environment. Extensive test results validate the excellent performance of SDViSC in supporting reliable and resilient operation of wind farms under various physical and cyber conditions.
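
    The Tustin (bilinear) transformation named in contribution 2) maps a continuous-time controller to a sampled difference equation via s -> (2/T)(1 - z^-1)/(1 + z^-1), which is what lets a hardware controller run as a software service. Below is a minimal sketch for a PI controller discretized this way; the gains, 1 kHz sample time, and class name are illustrative assumptions, not the SDViSC implementation.

```python
class TustinPI:
    """PI controller U(s) = (Kp + Ki/s) E(s) discretized with the Tustin method."""
    def __init__(self, kp, ki, dt):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.prev_err = 0.0
        self.prev_out = 0.0

    def step(self, err):
        # Difference equation from the bilinear substitution:
        # u[k] = u[k-1] + Kp*(e[k] - e[k-1]) + (Ki*T/2)*(e[k] + e[k-1])
        out = (self.prev_out
               + self.kp * (err - self.prev_err)
               + 0.5 * self.ki * self.dt * (err + self.prev_err))
        self.prev_err, self.prev_out = err, out
        return out

ctrl = TustinPI(kp=1.2, ki=8.0, dt=0.001)   # assumed 1 kHz control loop
print(ctrl.step(0.05))                      # tracking error in -> command out
```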

    VeCAF: Vision-language Collaborative Active Finetuning with Training Objective Awareness

    Finetuning a pretrained vision model (PVM) is a common technique for learning downstream vision tasks. However, the conventional finetuning process with randomly sampled data points results in diminished training efficiency. To address this drawback, we propose a novel approach, Vision-language Collaborative Active Finetuning (VeCAF). With the emerging availability of labels and natural language annotations of images through web-scale crawling or controlled generation, VeCAF makes use of this information to perform parametric data selection for PVM finetuning. VeCAF incorporates the finetuning objective to select significant data points that effectively guide the PVM towards faster convergence to meet the performance goal. This process is assisted by the inherent semantic richness of the text embedding space, which we use to augment image features. Furthermore, the flexibility of text-domain augmentation allows VeCAF to handle out-of-distribution scenarios without external data. Extensive experiments show the leading performance and high computational efficiency of VeCAF, which is superior to baselines in both in-distribution and out-of-distribution image classification tasks. On ImageNet, VeCAF uses up to 3.3x fewer training batches than full finetuning to reach the target performance, and achieves an accuracy improvement of 2.7% over the state-of-the-art active finetuning method with the same number of batches.
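
    A hedged sketch of objective-aware data selection in the spirit described above, in PyTorch; the scoring rule (per-sample loss plus distance to a text-embedding centroid), the additive text augmentation of image features, and all names are illustrative assumptions, not VeCAF's actual selection procedure.

```python
import torch
import torch.nn.functional as F

def select_finetuning_batch(image_emb, text_emb, labels, classifier, k=256, alpha=0.5):
    """Score candidates by (i) current finetuning loss, so examples that drive
    convergence are preferred, and (ii) distance of their text-augmented
    embedding to the caption centroid; return the top-k indices.
    image_emb: (N, D) PVM features, text_emb: (N, D) caption embeddings."""
    with torch.no_grad():
        fused = F.normalize(image_emb + text_emb, dim=1)   # text-domain augmentation
        logits = classifier(fused)                         # (N, C)
        loss = F.cross_entropy(logits, labels, reduction="none")

        centroid = F.normalize(text_emb.mean(dim=0, keepdim=True), dim=1)
        dist = 1.0 - (fused * centroid).sum(dim=1)         # cosine distance

        score = alpha * loss + (1.0 - alpha) * dist
    return score.topk(k).indices                           # data points to finetune on

# Toy usage with random features and a linear probe (all values synthetic).
clf = torch.nn.Linear(512, 10)
idx = select_finetuning_batch(torch.randn(1000, 512), torch.randn(1000, 512),
                              torch.randint(0, 10, (1000,)), clf, k=64)
print(idx.shape)                                           # torch.Size([64])
```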

    Characterizing the Structural Pattern Predicting Medication Response in Herpes Zoster Patients Using Multivoxel Pattern Analysis

    Herpes zoster (HZ) can cause a blistering skin rash with severe neuropathic pain. Pharmacotherapy is the most common treatment for HZ patients. However, most patients are elderly or immunocompromised and thus often suffer from side effects or easily develop intractable post-herpetic neuralgia (PHN) if medication fails. It is challenging for clinicians to tailor treatment to individual patients, owing to the lack of prognostic information on the neurological pathogenesis underlying HZ. In the current study, we aimed to characterize the brain structural pattern of HZ before treatment with medication that could help predict medication response. High-resolution structural magnetic resonance imaging (MRI) scans of 14 right-handed HZ patients with poor response (aged 61.0 ± 7.0 years, 8 males) and 15 age- (p = 0.58) and gender-matched (p = 0.20) patients with good response (aged 62.6 ± 8.3 years, 5 males) were acquired and analyzed. Multivoxel pattern analysis (MVPA) with a searchlight algorithm and a support vector machine (SVM) was applied to identify the spatial pattern of gray matter (GM) volume with high prediction accuracy. The predictive regions, with an accuracy higher than 79%, were located within the cerebellum, posterior insular cortex (pIC), middle and orbital frontal lobes (mFC and OFC), anterior and middle cingulum (ACC and MCC), precuneus (PCu), and cuneus. Among these regions, the mFC, pIC, and MCC displayed significant increases in GM volume in patients with poor response compared to those with good response. The combination of structural MRI and MVPA may be a useful tool for exploring neuroanatomical imaging biomarkers of HZ-related pain associated with medication response.
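
    A simplified sketch of searchlight MVPA with an SVM using scikit-learn on synthetic data; the cube-shaped searchlight, fold count, grid size, and accuracy threshold are assumptions for illustration and this is not the authors' pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def searchlight_accuracy(volumes, labels, radius=2, n_folds=5):
    """For each voxel, train an SVM on the GM-volume values inside a small cube
    around it and record cross-validated accuracy. volumes: (n_subjects, X, Y, Z)."""
    n, X, Y, Z = volumes.shape
    acc = np.zeros((X, Y, Z))
    for x in range(radius, X - radius):
        for y in range(radius, Y - radius):
            for z in range(radius, Z - radius):
                patch = volumes[:, x - radius:x + radius + 1,
                                   y - radius:y + radius + 1,
                                   z - radius:z + radius + 1].reshape(n, -1)
                acc[x, y, z] = cross_val_score(SVC(kernel="linear"), patch,
                                               labels, cv=n_folds).mean()
    return acc  # threshold the map (e.g. > 0.79) to obtain predictive regions

# Toy example: 29 subjects, tiny 8x8x8 grid, binary response label (synthetic).
rng = np.random.default_rng(0)
acc_map = searchlight_accuracy(rng.normal(size=(29, 8, 8, 8)),
                               rng.integers(0, 2, 29), radius=1, n_folds=3)
print(acc_map.max())
```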

    Cuckoo search algorithm with deep search


    Recurrent Feature Propagation and Edge Skip-Connections for Automatic Abdominal Organ Segmentation

    Automatic segmentation of abdominal organs in computed tomography (CT) images can support radiation therapy and image-guided surgery workflows. Developing such automatic solutions remains challenging, mainly owing to complex organ interactions and blurry boundaries in CT images. To address these issues, we focus on effective spatial context modeling and explicit edge segmentation priors. Accordingly, we propose a 3D network trained end-to-end with four main components: a shared encoder, an edge detector, a decoder with edge skip-connections (ESCs), and a recurrent feature propagation head (RFP-Head). To capture wide-range spatial dependencies, the RFP-Head propagates and harvests local features through directed acyclic graphs (DAGs) formulated with recurrent connections in an efficient slice-wise manner, with regard to the spatial arrangement of image units. To leverage edge information, the edge detector learns edge prior knowledge specifically tuned for semantic segmentation by exploiting intermediate features from the encoder under edge supervision. The ESCs then aggregate this edge knowledge with multi-level decoder features to learn a hierarchy of discriminative features that explicitly model the complementarity between organs' interiors and edges for segmentation. We conduct extensive experiments on two challenging abdominal CT datasets with eight annotated organs. Experimental results show that the proposed network outperforms several state-of-the-art models, especially for the segmentation of small and complicated structures (gallbladder, esophagus, stomach, pancreas, and duodenum). The code will be publicly available.
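
    A hedged sketch of how an edge skip-connection might fuse edge-detector features with decoder features in 3D, in PyTorch; the channel counts, normalization, and fusion by concatenation plus convolution are assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class EdgeSkipConnection(nn.Module):
    """Decoder features at one level are fused with the edge detector's feature
    map so the decoder sees an explicit boundary prior for that scale."""
    def __init__(self, dec_channels, edge_channels):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv3d(dec_channels + edge_channels, dec_channels,
                      kernel_size=3, padding=1),
            nn.InstanceNorm3d(dec_channels),
            nn.ReLU(inplace=True))

    def forward(self, dec_feat, edge_feat):
        # Resize edge features to this decoder level's resolution, then fuse.
        edge_feat = nn.functional.interpolate(edge_feat, size=dec_feat.shape[2:],
                                              mode="trilinear", align_corners=False)
        return self.fuse(torch.cat([dec_feat, edge_feat], dim=1))

esc = EdgeSkipConnection(dec_channels=32, edge_channels=8)
out = esc(torch.randn(1, 32, 16, 32, 32), torch.randn(1, 8, 32, 64, 64))
print(out.shape)                                # torch.Size([1, 32, 16, 32, 32])
```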