122 research outputs found

    Aberrant resting-state brain activity in Huntington's disease: A voxel-based meta-analysis

    Get PDF
    Introduction: Functional neuroimaging could provide abundant information about the underlying pathophysiological mechanisms of the clinical triad of motor, cognitive and psychiatric impairment in Huntington's disease (HD). Methods: We performed a voxel-based meta-analysis using the anisotropic effect size-signed differential mapping (AES-SDM) method. Results: Six studies (78 symptomatic HD, 102 premanifest HD and 131 healthy controls) were included in total. Altered resting-state brain activity was primarily detected in the bilateral medial part of the superior frontal gyrus, bilateral anterior cingulate/paracingulate gyrus, left insula, left striatum, right cortico-spinal projection area, right inferior temporal gyrus, right thalamus, right cerebellum and right gyrus rectus. Premanifest and symptomatic HD patients showed different alteration patterns in the subgroup analyses. Discussion: The robust and consistent abnormalities in the specific brain regions identified in the current study could help to understand the pathophysiology of HD and to explore reliable neuroimaging biomarkers for monitoring disease progression, or even predicting disease onset in premanifest HD patients.
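
    As a rough illustration of the pooling step behind such voxel-based meta-analyses, the sketch below applies inverse-variance random-effects (DerSimonian-Laird) pooling to toy per-study voxel effect-size maps. The actual AES-SDM procedure (effect-size reconstruction from peak coordinates, anisotropic kernels, permutation thresholding) is considerably more involved; all data here are synthetic.

        # Hypothetical sketch: DerSimonian-Laird random-effects pooling of
        # per-study voxel effect-size maps, the pooling idea underlying
        # seed-based d-mapping style meta-analyses. Toy data only.
        import numpy as np

        def dersimonian_laird(d, var):
            """Pool effect sizes d (n_studies, n_voxels) with variances var."""
            w = 1.0 / var                                    # fixed-effect weights
            d_fe = (w * d).sum(0) / w.sum(0)                 # fixed-effect mean
            q = (w * (d - d_fe) ** 2).sum(0)                 # Cochran's Q per voxel
            c = w.sum(0) - (w ** 2).sum(0) / w.sum(0)
            tau2 = np.clip((q - (len(d) - 1)) / c, 0, None)  # between-study variance
            w_re = 1.0 / (var + tau2)                        # random-effects weights
            d_re = (w_re * d).sum(0) / w_re.sum(0)
            z = d_re / np.sqrt(1.0 / w_re.sum(0))            # voxel-wise z statistic
            return d_re, z

        rng = np.random.default_rng(0)
        d = rng.normal(0.3, 0.2, size=(6, 1000))     # 6 studies, 1000 voxels (toy)
        var = rng.uniform(0.02, 0.05, size=(6, 1000))
        d_pooled, z = dersimonian_laird(d, var)
        print(d_pooled[:3], z[:3])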

    Saliency-guided and Patch-based Mixup for Long-tailed Skin Cancer Image Classification

    Full text link
    Medical image datasets often exhibit long-tailed distributions due to the inherent challenges of medical data collection and annotation. In long-tailed contexts, a few common disease categories account for most of the data, while only a few samples are available for the rare disease categories, resulting in poor performance of deep learning methods. To address this issue, previous approaches have employed class re-sampling or re-weighting techniques, which often encounter challenges such as overfitting to tail classes or difficulties in optimization during training. In this work, we propose a novel approach, namely Saliency-guided and Patch-based Mixup (SPMix), for long-tailed skin cancer image classification. Specifically, given a tail-class image and a head-class image, we generate a new tail-class image by mixing them under the guidance of saliency mapping, which preserves and augments the discriminative features of the tail classes without interference from the head-class features. Extensive experiments on the ISIC2018 dataset demonstrate the superiority of SPMix over existing state-of-the-art methods. Comment: IEEE ISBI202
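
    A minimal sketch of the core mixing idea, assuming a precomputed saliency map for the tail-class image (in practice it might come from gradients or Grad-CAM): patches where the tail image is most salient are kept, and the rest come from the head image. The patch size, keep ratio and exact recipe below are illustrative assumptions, not the paper's implementation.

        # Hypothetical sketch of saliency-guided patch mixup (toy arrays).
        import numpy as np

        def saliency_patch_mixup(tail_img, head_img, saliency, patch=16, keep_ratio=0.5):
            """tail_img/head_img: (H, W, C) arrays; saliency: (H, W) for tail_img."""
            H, W, _ = tail_img.shape
            gh, gw = H // patch, W // patch
            # Mean saliency per patch of the tail image.
            s = saliency[:gh*patch, :gw*patch].reshape(gh, patch, gw, patch).mean((1, 3))
            k = int(keep_ratio * gh * gw)
            thresh = np.partition(s.ravel(), -k)[-k]    # keep the top-k salient patches
            mixed = head_img.copy()
            for i in range(gh):
                for j in range(gw):
                    if s[i, j] >= thresh:
                        ys, xs = i * patch, j * patch
                        mixed[ys:ys+patch, xs:xs+patch] = tail_img[ys:ys+patch, xs:xs+patch]
            return mixed

        rng = np.random.default_rng(0)
        tail, head = rng.random((224, 224, 3)), rng.random((224, 224, 3))
        sal = rng.random((224, 224))
        print(saliency_patch_mixup(tail, head, sal).shape)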

    A Secure and Efficient Multi-Object Grasping Detection Approach for Robotic Arms

    Full text link
    Robotic arms are widely used in industrial automation. However, with the wide application of deep learning in robotic arms, new challenges arise, such as allocating computing power for grasping and meeting the growing demand for security. In this work, we propose a robotic arm grasping approach based on deep learning and edge-cloud collaboration. This approach realizes arbitrary grasp planning for the robotic arm while accounting for grasp efficiency and information security. In addition, an encoder and decoder trained with a GAN allow images to be encrypted while being compressed, which protects privacy. The model achieves 92% accuracy on the OCID dataset, the image compression ratio reaches 0.03%, and the structural difference value is higher than 0.91.
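
    The sketch below illustrates the general encoder/decoder split with a plain convolutional autoencoder: only the compact bottleneck code would be transmitted from edge to cloud, and it is unreadable without the decoder weights. The paper's GAN training, compression ratio and security analysis are not reproduced; the architecture here is an assumption.

        # Hypothetical sketch: autoencoder whose bottleneck code both
        # compresses the grasp image and acts as an opaque transmission format.
        import torch
        import torch.nn as nn

        class CodecAE(nn.Module):
            def __init__(self):
                super().__init__()
                self.enc = nn.Sequential(                 # edge side: image -> code
                    nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
                    nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
                    nn.Conv2d(64, 8, 4, 2, 1),            # 8-channel bottleneck
                )
                self.dec = nn.Sequential(                 # cloud side: code -> image
                    nn.ConvTranspose2d(8, 64, 4, 2, 1), nn.ReLU(),
                    nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
                    nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Sigmoid(),
                )

            def forward(self, x):
                code = self.enc(x)            # transmit only this compact code
                return self.dec(code), code

        x = torch.rand(1, 3, 64, 64)
        recon, code = CodecAE()(x)
        print(code.shape, recon.shape)        # (1, 8, 8, 8), (1, 3, 64, 64)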

    Multiscale Superpixel Structured Difference Graph Convolutional Network for VL Representation

    Full text link
    Within the multimodal field, the key to integrating vision and language lies in establishing a good alignment strategy. Recently, benefiting from the success of self-supervised learning, significant progress has been made in multimodal semantic representation based on pre-trained models for vision and language. However, there is still room for improvement in visual semantic representation. The lack of spatial semantic coherence and vulnerability to noise make it challenging for current pixel- or patch-based methods to accurately extract complex scene boundaries. To this end, this paper develops superpixels as a comprehensive and compact representation of learnable image data, which effectively reduces the number of visual primitives for subsequent processing by clustering perceptually similar pixels. To mine more precise topological relations, we propose a Multiscale Difference Graph Convolutional Network (MDGCN). It parses the entire image as a fine-to-coarse hierarchy of constituent visual patterns and captures multiscale features by progressively merging adjacent superpixels as graph nodes. Moreover, we predict the differences between adjacent nodes through the graph structure, facilitating the aggregation of key information across graph nodes to reason about actual semantic relations. Afterward, we design a multi-level fusion rule in a bottom-up manner to avoid deviations in understanding by learning complementary spatial information at different regional scales. Our proposed method can be applied to multiple downstream learning tasks. Extensive experiments demonstrate that our method is competitive with other state-of-the-art methods in visual reasoning. Our code will be released upon publication.
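
    As a rough sketch of the superpixel-graph idea, the code below clusters an image into superpixels with SLIC, builds an adjacency graph over touching superpixels, and applies one plain GCN propagation step; MDGCN's multiscale merging, difference prediction and fusion rules are not shown.

        # Hypothetical sketch: superpixels as graph nodes plus one GCN step.
        import numpy as np
        from skimage.data import astronaut
        from skimage.segmentation import slic

        img = astronaut().astype(np.float64) / 255.0
        labels = slic(img, n_segments=100, compactness=10, start_label=0)
        n = labels.max() + 1

        # Node features: mean colour per superpixel.
        feats = np.stack([img[labels == i].mean(0) for i in range(n)])

        # Adjacency: superpixels that touch horizontally or vertically.
        A = np.zeros((n, n))
        for dy, dx in ((0, 1), (1, 0)):
            a = labels[:labels.shape[0]-dy, :labels.shape[1]-dx]
            b = labels[dy:, dx:]
            m = a != b
            A[a[m], b[m]] = A[b[m], a[m]] = 1

        # One GCN step: H' = D^-1/2 (A + I) D^-1/2 H (identity weights here).
        A_hat = A + np.eye(n)
        d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(1))
        H = (d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]) @ feats
        print(H.shape)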

    Graph-guided Architecture Search for Real-time Semantic Segmentation

    Full text link
    Designing a lightweight semantic segmentation network often requires researchers to find a trade-off between performance and speed, which is always empirical due to the limited interpretability of neural networks. To release researchers from these tedious mechanical trials, we propose a Graph-guided Architecture Search (GAS) pipeline to automatically search for real-time semantic segmentation networks. Unlike previous works that use a simplified search space and stack a repeatable cell to form a network, we introduce a novel search mechanism with a new search space, where a lightweight model can be effectively explored through cell-level diversity and a latency-oriented constraint. Specifically, to produce cell-level diversity, the cell-sharing constraint is eliminated by searching each cell independently. Then a graph convolutional network (GCN) is seamlessly integrated as a communication mechanism between cells. Finally, a latency-oriented constraint is incorporated into the search process to balance speed and performance. Extensive experiments on the Cityscapes and CamVid datasets demonstrate that GAS achieves a new state-of-the-art trade-off between accuracy and speed. In particular, on the Cityscapes dataset, GAS achieves the new best performance of 73.5% mIoU at a speed of 108.4 FPS on a Titan Xp. Comment: CVPR202
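
    A minimal sketch of a latency-oriented search objective, assuming per-operator latencies have been measured: the expected latency under softmax-relaxed architecture weights is added to the task loss, making the speed/accuracy trade-off differentiable. GAS's GCN-based cell communication and actual search space are not reproduced.

        # Hypothetical sketch: differentiable latency term for architecture search.
        import torch

        op_latency = torch.tensor([0.8, 1.5, 3.2, 5.0])   # measured ms per op (toy)
        alpha = torch.zeros(10, 4, requires_grad=True)    # 10 edges x 4 candidate ops

        def expected_latency(alpha):
            probs = torch.softmax(alpha, dim=-1)          # relaxed op choice per edge
            return (probs * op_latency).sum()             # differentiable latency estimate

        task_loss = torch.tensor(1.0)                     # stand-in for segmentation loss
        lam = 0.01                                        # latency penalty weight
        loss = task_loss + lam * expected_latency(alpha)
        loss.backward()                                   # gradients flow into alpha
        print(expected_latency(alpha).item(), alpha.grad.abs().sum().item())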

    Controllable Multi-Objective Re-ranking with Policy Hypernetworks

    Full text link
    Multi-stage ranking pipelines have become widely used in modern recommender systems, where the final stage aims to return a ranked list of items that balances a number of requirements such as user preference, diversity, novelty, etc. Linear scalarization is arguably the most widely used technique to merge multiple requirements into one optimization objective, by summing up the requirements with certain preference weights. Existing final-stage ranking methods often adopt a static model where the preference weights are determined during offline training and kept unchanged during online serving. Whenever a modification of the preference weights is needed, the model has to be re-trained, which is inefficient in both time and resources. Meanwhile, the most appropriate weights may vary greatly for different groups of target users or at different time periods (e.g., during holiday promotions). In this paper, we propose a framework called controllable multi-objective re-ranking (CMR), which incorporates a hypernetwork to generate parameters for a re-ranking model according to different preference weights. In this way, CMR can adapt the preference weights to environment changes in an online manner, without retraining the models. Moreover, we classify practical business-oriented tasks into four main categories and seamlessly incorporate them into a newly proposed re-ranking model based on an Actor-Evaluator framework, which serves as a reliable real-world testbed for CMR. Offline experiments on a dataset collected from the Taobao App showed that CMR improved several popular re-ranking models by using them as underlying models. Online A/B tests also demonstrated the effectiveness and trustworthiness of CMR.
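
    A minimal sketch of the hypernetwork idea, assuming a toy linear scorer: the hypernetwork maps a preference-weight vector to the scorer's parameters, so preferences can be changed at serving time without retraining. CMR's Actor-Evaluator design and task taxonomy are not reproduced.

        # Hypothetical sketch: hypernetwork generating re-ranking scorer weights.
        import torch
        import torch.nn as nn

        ITEM_DIM, N_OBJ = 16, 3    # item feature size, number of objectives

        class HyperScorer(nn.Module):
            def __init__(self):
                super().__init__()
                # Hypernetwork: preference vector -> (weight, bias) of linear scorer.
                self.hyper = nn.Sequential(
                    nn.Linear(N_OBJ, 64), nn.ReLU(),
                    nn.Linear(64, ITEM_DIM + 1),
                )

            def forward(self, items, prefs):
                """items: (n, ITEM_DIM); prefs: (N_OBJ,) preference weights."""
                params = self.hyper(prefs)
                w, b = params[:ITEM_DIM], params[ITEM_DIM]
                return items @ w + b                  # one score per item

        model = HyperScorer()
        items = torch.randn(20, ITEM_DIM)
        prefs = torch.tensor([0.6, 0.3, 0.1])         # tweakable at serving time
        order = model(items, prefs).argsort(descending=True)
        print(order[:5])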

    Evaluating Open-QA Evaluation

    Full text link
    This study focuses on the evaluation of the Open Question Answering (Open-QA) task, which can directly estimate the factuality of large language models (LLMs). Current automatic evaluation methods have shown limitations, indicating that human evaluation still remains the most reliable approach. We introduce a new task, Evaluating QA Evaluation (QA-Eval), and the corresponding dataset EVOUNA, designed to assess the accuracy of AI-generated answers in relation to standard answers within Open-QA. Our evaluation of these methods uses human-annotated results to measure their performance. Specifically, the work investigates methods that show high correlation with human evaluations, deeming them more reliable. We also discuss the pitfalls of current methods and ways to improve LLM-based evaluators. We believe this new QA-Eval task and the corresponding dataset EVOUNA will facilitate the development of more effective automatic evaluation tools and prove valuable for future research in this area. All resources are available at https://github.com/wangcunxiang/QA-Eval and are released under the Apache-2.0 License.
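
    As a concrete illustration of what QA-Eval measures, the sketch below scores a simple normalized-substring-match evaluator (a common Open-QA baseline) against hand-written human judgments; the examples are invented for illustration and are not from EVOUNA.

        # Hypothetical sketch: agreement of an automatic Open-QA judge with humans.
        import re

        def normalize(s):
            return re.sub(r"[^a-z0-9 ]", "", s.lower()).strip()

        def auto_judge(candidate, gold_answers):
            """Accept the answer if any gold answer appears in it (normalized)."""
            cand = normalize(candidate)
            return any(normalize(g) in cand for g in gold_answers)

        examples = [  # (model answer, gold answers, human judgment)
            ("It was Marie Curie.", ["Marie Curie"], True),
            ("Paris, the capital of France", ["Paris"], True),
            ("I believe it was in 1947.", ["1945"], False),
        ]
        agree = sum(auto_judge(c, g) == h for c, g, h in examples)
        print(f"agreement with humans: {agree}/{len(examples)}")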

    Lumen contour segmentation in IVOCT based on N-type CNN

    Get PDF
    Automatic segmentation of the lumen contour plays an important role in medical imaging and diagnosis; it is the first step towards evaluating the morphology of the vessels under analysis and identifying possible atherosclerotic lesions. Meanwhile, quantitative information can only be obtained through segmentation, motivating novel methods that can be successfully applied to intravascular optical coherence tomography (IVOCT) images. This paper proposes a new end-to-end neural network (N-Net) for automatic lumen segmentation in IVOCT images, based on a multi-scale deep neural network. The architecture of the N-Net contains a multi-scale input layer, an N-type convolutional network and a cross-entropy loss function. The multi-scale input layer is designed to avoid the loss of information caused by pooling in the traditional U-Net and also enriches the detailed information in each layer. The N-type convolutional network serves as the backbone of the whole deep architecture. Finally, the loss function guarantees the fidelity between the output of the proposed method and the manually labeled ground truth. To enlarge the training set, data augmentation is also introduced. We evaluated our method in terms of loss, accuracy, recall, Dice similarity coefficient, Jaccard similarity coefficient and specificity. The experimental results demonstrate the superior performance of the proposed N-Net architecture compared to existing networks, enhancing the precision of automatic lumen segmentation and increasing the detail recovered at the edges of the vascular lumen.
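
    A minimal sketch of the multi-scale input idea, assuming a toy three-level encoder: a resized copy of the raw frame is concatenated at each level so pooling does not discard fine detail. The full N-type architecture, decoder and loss are not reproduced.

        # Hypothetical sketch: multi-scale input injection in a U-Net-like encoder.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class MultiScaleEncoder(nn.Module):
            def __init__(self, ch=(16, 32, 64)):
                super().__init__()
                self.blocks = nn.ModuleList()
                in_ch = 1
                for c in ch:
                    # Each level also receives a resized copy of the raw input.
                    self.blocks.append(nn.Sequential(
                        nn.Conv2d(in_ch + 1, c, 3, padding=1), nn.ReLU(),
                    ))
                    in_ch = c

            def forward(self, x):
                feats, h = [], x
                for i, block in enumerate(self.blocks):
                    scaled = F.interpolate(x, scale_factor=0.5 ** i, mode="bilinear",
                                           align_corners=False) if i else x
                    h = block(torch.cat([h, scaled], dim=1))
                    feats.append(h)
                    h = F.max_pool2d(h, 2)
                return feats

        outs = MultiScaleEncoder()(torch.rand(1, 1, 128, 128))
        print([o.shape for o in outs])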

    MetaGPT: Meta Programming for Multi-Agent Collaborative Framework

    Full text link
    Recently, remarkable progress has been made in automated task-solving through the use of multiple agents driven by large language models (LLMs). However, existing works primarily focus on simple tasks and, mainly due to the hallucination problem, lack exploration and investigation of complicated tasks. This kind of hallucination is amplified as multiple intelligent agents interact with each other, resulting in failures when tackling complicated problems. Therefore, we introduce MetaGPT, an innovative framework that infuses effective human workflows as a meta-programming approach into LLM-driven multi-agent collaboration. In particular, MetaGPT first encodes Standardized Operating Procedures (SOPs) into prompts, fostering structured coordination. It then mandates modular outputs, bestowing agents with domain expertise paralleling that of human professionals, to validate outputs and reduce compounded errors. In this way, MetaGPT leverages the assembly-line work model to assign diverse roles to various agents, thus establishing a framework that can effectively and cohesively deconstruct complex multi-agent collaborative problems. Our experiments on collaborative software engineering tasks illustrate MetaGPT's capability to produce comprehensive solutions with higher coherence than existing conversational and chat-based multi-agent systems. This underscores the potential of incorporating human domain knowledge into multi-agent systems, opening up novel avenues for tackling intricate real-world challenges. The GitHub repository of this project is publicly available at: https://github.com/geekan/MetaGPT
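
    A minimal sketch of the SOP-as-prompt idea: each role is a prompt template plus the artifact it must produce, and artifacts flow down an assembly line. The roles and the `call_llm` stand-in below are illustrative assumptions, not MetaGPT's actual code.

        # Hypothetical sketch: SOP encoded as an ordered list of role prompts.
        def call_llm(prompt):                 # placeholder for any chat-model API
            return f"[LLM output for: {prompt[:40]}...]"

        SOP = [  # (role, instruction template, artifact produced)
            ("ProductManager", "Write a PRD with user stories for: {task}", "prd"),
            ("Architect", "Design module interfaces from this PRD:\n{prd}", "design"),
            ("Engineer", "Implement code following this design:\n{design}", "code"),
            ("QAEngineer", "Write tests and review this code:\n{code}", "report"),
        ]

        def run_pipeline(task):
            artifacts = {"task": task}
            for role, template, out_key in SOP:
                prompt = f"You are the {role}. " + template.format(**artifacts)
                artifacts[out_key] = call_llm(prompt)   # structured hand-off
            return artifacts

        print(run_pipeline("a CLI todo app")["report"])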