474 research outputs found

    Double-Flow GAN model for the reconstruction of perceived faces from brain activities

    Faces play an important role in human visual perception, and reconstructing perceived faces from brain activity is challenging because it is difficult to extract high-level features and to maintain consistency across multiple face attributes, such as expression, identity, and gender. In this study, we proposed a novel reconstruction framework, Double-Flow GAN, that enhances the capability of the discriminator and handles imbalances in images from certain domains that are too easy for generators. We also designed a pretraining process that uses features extracted from images as conditions, making it possible to pretrain the conditional fMRI reconstruction model on a larger image-only dataset. Moreover, we developed a simple pretrained model that performs fMRI alignment to alleviate the difficulty of cross-subject reconstruction caused by variations in brain structure among subjects. We conducted experiments with our proposed method and with state-of-the-art reconstruction models. The results demonstrated that our method outperformed the previous reconstruction models and exhibited good generation ability.

    Mining Gaze for Contrastive Learning toward Computer-Assisted Diagnosis

    Obtaining large-scale radiology reports for medical images can be difficult for various reasons, which limits the effectiveness of contrastive pre-training in the medical image domain and underscores the need for alternative methods. In this paper, we propose eye-tracking as an alternative to text reports, as it allows gaze signals to be collected passively without disturbing radiologists' routine diagnosis process. By tracking the gaze of radiologists as they read and diagnose medical images, we can understand their visual attention and clinical reasoning. When a radiologist has similar gazes for two medical images, it may indicate semantic similarity for diagnosis, and these images should be treated as positive pairs when pre-training a computer-assisted diagnosis (CAD) network through contrastive learning. Accordingly, we introduce Medical contrastive Gaze Image Pre-training (McGIP) as a plug-and-play module for contrastive learning frameworks. McGIP uses radiologists' gaze to guide contrastive pre-training. We evaluate our method using two representative types of medical images and two common types of gaze data. The experimental results demonstrate the practicality of McGIP, indicating its high potential for various clinical scenarios and applications. Comment: *These authors contributed equally. Accepted by AAAI 202
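
The positive-pair idea in the McGIP abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the flattened gaze heatmaps, the cosine similarity measure, the `0.9` threshold, and the function names `cosine_similarity` and `gaze_positive_pairs` are all assumptions chosen for illustration. The sketch only shows how images with similar gaze patterns could be marked as positive pairs before contrastive pre-training.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, flattened gaze heatmaps."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty heatmap is treated as similar to nothing
    return dot / (norm_a * norm_b)


def gaze_positive_pairs(heatmaps, threshold=0.9):
    """Return index pairs (i, j) with i < j whose gaze similarity meets the threshold.

    The threshold is a hypothetical hyperparameter, not a value from the paper.
    """
    return [
        (i, j)
        for i, j in combinations(range(len(heatmaps)), 2)
        if cosine_similarity(heatmaps[i], heatmaps[j]) >= threshold
    ]


# Toy data: images 0 and 1 attract gaze in the same region, image 2 does not.
heatmaps = [
    [0.0, 0.8, 0.2, 0.0],
    [0.0, 0.7, 0.3, 0.0],
    [0.9, 0.0, 0.0, 0.1],
]
pairs = gaze_positive_pairs(heatmaps)
print(pairs)
```

In a contrastive framework, pairs selected this way would be added to the positives used by the loss, alongside the usual augmented views of the same image.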

    Federated PAC-Bayesian Learning on Non-IID data

    Existing research has either adapted the Probably Approximately Correct (PAC) Bayesian framework for federated learning (FL) or used information-theoretic PAC-Bayesian bounds while introducing their theorems, but few works have considered the non-IID challenges in FL. Our work presents the first non-vacuous federated PAC-Bayesian bound tailored for non-IID local data. This bound assumes unique prior knowledge for each client and variable aggregation weights. We also introduce an objective function and an innovative Gibbs-based algorithm for the optimization of the derived bound. The results are validated on real-world datasets.

    Growth of nonsymmetric operads

    The paper concerns the Gelfand-Kirillov dimension and the generating series of nonsymmetric operads. An analogue of Bergman's gap theorem is proved, namely, no finitely generated locally finite nonsymmetric operad has Gelfand-Kirillov dimension strictly between $1$ and $2$. For every $r\in \{0\}\cup \{1\}\cup [2,\infty)$ or $r=\infty$, we construct a single-element generated nonsymmetric operad with Gelfand-Kirillov dimension $r$. We also provide counterexamples to two expectations of Khoroshkin and Piontkovski about the generating series of operads. Comment: 32 pages, 9 figures

    ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models

    Large language models (LLMs) have recently demonstrated their potential in clinical applications, providing valuable medical knowledge and advice. For example, a large dialog LLM like ChatGPT has successfully passed part of the US medical licensing exam. However, LLMs currently have difficulty processing images, which makes it challenging for them to interpret medical images, a rich source of information that supports clinical decisions. On the other hand, computer-aided diagnosis (CAD) networks for medical images have seen significant success in the medical field by using advanced deep-learning algorithms to support clinical decision-making. This paper presents a method for integrating LLMs into medical-image CAD networks. The proposed framework uses LLMs to enhance the output of multiple CAD networks, such as diagnosis networks, lesion segmentation networks, and report generation networks, by summarizing and reorganizing the information presented in natural language text format. The goal is to merge the strengths of LLMs' medical domain knowledge and logical reasoning with the vision understanding capability of existing medical-image CAD models to create a system that is more user-friendly and understandable for patients than conventional CAD systems. In the future, LLMs' medical knowledge can also be used to improve the performance of vision-based medical-image CAD models.
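
The integration step described in the ChatCAD abstract above, where outputs of several CAD networks are presented as natural language text for an LLM to summarize, can be sketched roughly as follows. This is an illustrative assumption, not the paper's code: the function names `describe_diagnosis` and `build_prompt`, the `0.5` probability threshold, the prompt wording, and the example findings are all hypothetical, and no actual LLM is called.

```python
def describe_diagnosis(probabilities, threshold=0.5):
    """Turn per-disease probabilities into one plain-English sentence."""
    # Keep only findings at or above the (hypothetical) threshold, in stable order.
    likely = sorted(name for name, p in probabilities.items() if p >= threshold)
    if not likely:
        return "The diagnosis network found no likely abnormality."
    return "The diagnosis network suggests: " + ", ".join(likely) + "."


def build_prompt(probabilities, lesion_note, draft_report):
    """Combine the text form of each CAD network's output into one LLM prompt."""
    parts = [
        describe_diagnosis(probabilities),
        "Lesion segmentation network: " + lesion_note,
        "Report generation network: " + draft_report,
        "Summarize these findings for the patient in plain language.",
    ]
    return "\n".join(parts)


# Made-up example outputs from three CAD networks.
prompt = build_prompt(
    {"cardiomegaly": 0.82, "edema": 0.10, "pneumonia": 0.64},
    "One lesion in the left lower lobe.",
    "Enlarged cardiac silhouette.",
)
print(prompt)
```

The resulting string would then be sent to whichever LLM the system uses; that call is deliberately left out because its API depends on the chosen model.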

    Recordism: A social-scientific prospect of blockchain from social, legal, financial, and technological perspectives

    Blockchain has the potential to reform the architecture of cyberspace and transform the storage, circulation and exchange of information through decentralization, transparency and de-identification. This means that ordinary participants can simultaneously be traders, miners, retailers, and customers, which breaks down barriers and reduces the information gap between participants in the community, and contributes to a futuristic metaverse with an open, progressive and equal ideology. Such information transformation empowered by blockchain also profoundly impacts our methodological cognition, the legal governance of cyberspace, and financial and technological development. This study explores the main question: what are the implications of the blockchain-driven information revolution for society and the social sciences? To answer this question, the paper adopts four perspectives: methodological, legal, financial and technical. By analyzing these four perspectives, the paper aims to provide a more comprehensive account of the blockchain-driven impact on society, the social sciences, and technology, and to contribute to current scholarship. Additionally, regarded as an innovative methodological cognition, blockchain grows on top of other technologies while helping to advance them. This paper concludes that although there are a few frictions between blockchain and the current social architecture, blockchain is much more than the technology itself: it can be a representative of the community, acting as a source of trust, a watcher of governance, a law enforcer for virtual activities, and an incubator for future technologies.