24 research outputs found

    Woodpecker: Hallucination Correction for Multimodal Large Language Models

    Hallucination is a big shadow hanging over the rapidly evolving Multimodal Large Language Models (MLLMs), referring to the phenomenon that the generated text is inconsistent with the image content. In order to mitigate hallucinations, existing studies mainly resort to an instruction-tuning manner that requires retraining the models with specific data. In this paper, we pave a different way, introducing a training-free method named Woodpecker. Just as a woodpecker heals trees, it picks out and corrects hallucinations from the generated text. Concretely, Woodpecker consists of five stages: key concept extraction, question formulation, visual knowledge validation, visual claim generation, and hallucination correction. Implemented in a post-remedy manner, Woodpecker can easily serve different MLLMs, while being interpretable by accessing intermediate outputs of the five stages. We evaluate Woodpecker both quantitatively and qualitatively and show the huge potential of this new paradigm. On the POPE benchmark, our method obtains a 30.66%/24.33% improvement in accuracy over the baseline MiniGPT-4/mPLUG-Owl. The source code is released at https://github.com/BradyFU/Woodpecker. Comment: 16 pages, 7 figures. Code Website: https://github.com/BradyFU/Woodpecke

    A Cross Sectional Examination of the Relation Between Depression and Frequency of Leisure Time Physical Exercise among the Elderly in Jinan, China

    Depression has become a major global public health problem. Many studies have shown the positive effects of physical exercise on depression. However, few studies have examined the relationship between frequency of leisure time physical exercise and depression without considering the time and intensity of exercise among middle-aged and elderly people of urban communities in northern China. We conducted a cross-sectional survey that included 1604 participants among urban residents aged 50 years or older in China to evaluate how the frequency of physical exercise was related to depression. Our study showed that the prevalence of depression in the urban community of Jinan is 16.52%. For physical exercise, the odds ratios (ORs) and 95% confidence intervals (CIs) for 1~2 times per week, 3~4 times per week and ≥5 times per week were 1.137 (0.661, 1.953), 0.516 (0.304, 0.875) and 0.548 (0.392, 0.768) respectively, with adjustment for age, gender, marital status, BMI, hypertension, previously diagnosed type 2 diabetes, triglyceride, total cholesterol, soy food intake, milk food intake, vegetable and fruit intake and meat intake. We concluded that physically exercising three times a week is associated with a low prevalence of depression
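As a reading aid for the odds ratios quoted above, the following minimal sketch shows how an unadjusted odds ratio and its 95% Wald confidence interval are computed from a 2×2 table, and how to check whether an interval excludes the null value of 1. The abstract's ORs are adjusted for covariates, which this sketch does not reproduce. The function names and the 2×2 counts are hypothetical, chosen only for illustration; the three intervals at the end are the ones reported in the abstract.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome,   b: exposed without outcome
    c: unexposed with outcome, d: unexposed without outcome
    z: normal quantile (1.96 gives a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def excludes_null(lower, upper):
    """True when the interval does not contain OR = 1."""
    return upper < 1.0 or lower > 1.0


# Hypothetical counts, not taken from the study.
example_or, example_lo, example_hi = odds_ratio_ci(20, 80, 40, 60)

# Intervals as reported in the abstract (adjusted ORs).
reported = {
    "1~2 times per week": (0.661, 1.953),
    "3~4 times per week": (0.304, 0.875),
    ">=5 times per week": (0.392, 0.768),
}
significant = {label: excludes_null(lo, hi) for label, (lo, hi) in reported.items()}
```

Under this check, only the 3~4 and ≥5 times-per-week intervals exclude 1, which is consistent with the abstract's conclusion about exercising at least three times a week.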

    Predicting Station-Level Short-Term Passenger Flow in a Citywide Metro Network Using Spatiotemporal Graph Convolutional Neural Networks

    Predicting the passenger flow of metro networks is of great importance for traffic management and public safety. However, such predictions are very challenging, as passenger flow is affected by complex spatial dependencies (nearby and distant) and temporal dependencies (recent and periodic). In this paper, we propose a novel deep-learning-based approach, named STGCNNmetro (spatiotemporal graph convolutional neural networks for metro), to collectively predict two types of passenger flow volumes—inflow and outflow—in each metro station of a city. Specifically, instead of representing metro stations by grids and employing conventional convolutional neural networks (CNNs) to capture spatiotemporal dependencies, STGCNNmetro transforms the city metro network to a graph and makes predictions using graph convolutional neural networks (GCNNs). First, we apply stereogram graph convolution operations to seamlessly capture the irregular spatiotemporal dependencies along the metro network. Second, a deep structure composed of GCNNs is constructed to capture the distant spatiotemporal dependencies at the citywide level. Finally, we integrate three temporal patterns (recent, daily, and weekly) and fuse the spatiotemporal dependencies captured from these patterns to form the final prediction values. The STGCNNmetro model is an end-to-end framework which can accept raw passenger flow-volume data, automatically capture the effective features of the citywide metro network, and output predictions. We test this model by predicting the short-term passenger flow volume in the citywide metro network of Shanghai, China. Experiments show that the STGCNNmetro model outperforms seven well-known baseline models (LSVR, PCA-kNN, NMF-kNN, Bayesian, MLR, M-CNN, and LSTM). We additionally explore the sensitivity of the model to its parameters and discuss the distribution of prediction errors
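To make the idea of convolving over a metro graph rather than a grid more concrete, here is a minimal sketch of one generic graph convolution step using the widely used symmetric normalization with self-loops. This is not the paper's "stereogram" operation, whose details the abstract does not give. The three-station line, the feature values, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for an adjacency matrix A."""
    n = len(adj)
    # Add self-loops so each station keeps its own signal.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def graph_conv(adj, features, weights):
    """One layer: ReLU(A_norm @ X @ W)."""
    propagated = matmul(normalized_adjacency(adj), features)
    out = matmul(propagated, weights)
    return [[max(0.0, v) for v in row] for row in out]


# Hypothetical metro line with three stations: 0 - 1 - 2.
adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
# Per-station features: [inflow, outflow] in one time interval (made-up values).
features = [[10.0, 8.0], [20.0, 22.0], [5.0, 7.0]]
# Identity weights so the output shows pure neighborhood mixing.
weights = [[1.0, 0.0], [0.0, 1.0]]

hidden = graph_conv(adj, features, weights)
```

Each output row mixes a station's own flow with that of its direct neighbors; stacking several such layers lets information travel between stations that are further apart, which is the motivation the abstract gives for a deep structure.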

    Short-Term Prediction of Bus Passenger Flow Based on a Hybrid Optimized LSTM Network

    The accurate prediction of bus passenger flow is key to public transport management and the smart city. A long short-term memory (LSTM) network, a deep learning method for modeling sequences, is an efficient way to capture the time dependency of passenger flow. In recent years, an increasing number of researchers have sought to apply the LSTM model to passenger flow prediction. However, few of them pay attention to the optimization procedure during model training. In this article, we propose a hybrid optimized LSTM network based on Nesterov accelerated adaptive moment estimation (Nadam) and the stochastic gradient descent algorithm (SGD). This method trains the model with high efficiency and accuracy, solving the problems of inefficient training and misconvergence that exist in complex models. We employ a hybrid optimized LSTM network to predict the actual passenger flow in Qingdao, China and compare the prediction results with those obtained by non-hybrid LSTM models and conventional methods. In particular, the proposed model brings an additional 4%–20% performance improvement compared with non-hybrid LSTM models. We have also tried combinations of other optimization algorithms and applications in different models, finding that optimizing LSTM by switching from Nadam to SGD is the best choice. The sensitivity of the model to its parameters is also explored, which provides guidance for applying this model to bus passenger flow data modelling. The good performance of the proposed model at different temporal and spatial scales shows that it is robust and effective, and it can provide insightful support and guidance for dynamic bus scheduling and regional coordination scheduling.
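The optimizer switch described above can be illustrated on a toy problem. The sketch below runs a simplified form of the Nadam update for a fixed number of steps and then continues with plain gradient descent. It assumes a one-dimensional quadratic loss instead of an LSTM, and a fixed step count as the switching rule; the abstract does not state the actual switching criterion, and all names and hyperparameters here are hypothetical.

```python
import math


def grad(w, target=3.0):
    """Gradient of the toy loss (w - target)^2."""
    return 2.0 * (w - target)


def train(nadam_steps=50, sgd_steps=100, lr_nadam=0.1, lr_sgd=0.1,
          beta1=0.9, beta2=0.999, eps=1e-8):
    """Phase 1: simplified Nadam. Phase 2: plain gradient descent."""
    w, m, v = 0.0, 0.0, 0.0

    # Phase 1: adaptive steps with a Nesterov-style look-ahead on momentum.
    for t in range(1, nadam_steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = beta1 * m / (1 - beta1 ** (t + 1)) + (1 - beta1) * g / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= lr_nadam * m_hat / (math.sqrt(v_hat) + eps)
    w_after_nadam = w

    # Phase 2: switch to gradient descent (assumed fixed switch point).
    for _ in range(sgd_steps):
        w -= lr_sgd * grad(w)
    return w_after_nadam, w


w_after_nadam, w_final = train()
```

On this toy loss the second phase shrinks the remaining error by a constant factor per step, which mirrors the general motivation for such hybrids: an adaptive method for fast early progress, followed by a simpler method for stable final convergence.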

    Just Noticeable Visual Redundancy Forecasting: A Deep Multimodal-Driven Approach

    Just noticeable difference (JND) refers to the maximum visual change that human eyes cannot perceive, and it has a wide range of applications in multimedia systems. However, most existing JND approaches only focus on a single modality, and rarely consider the complementary effects of multimodal information. In this article, we investigate the JND modeling from an end-to-end homologous multimodal perspective, namely hmJND-Net. Specifically, we explore three important visually sensitive modalities, including saliency, depth, and segmentation. To better utilize homologous multimodal information, we establish an effective fusion method via summation enhancement and subtractive offset, and align homologous multimodal features based on a self-attention driven encoder-decoder paradigm. Extensive experimental results on eight different benchmark datasets validate the superiority of our hmJND-Net over eight representative methods

    Comparative Analysis of Parallel Hybrid Magnet Memory Machines with Different PM Arrangements

    This paper presents a comparative analysis of two parallel hybrid magnet memory machines (PHMMMs) with different permanent magnet (PM) arrangements. The proposed machines are both geometrically characterized by a parallel U-shaped hybrid PM configuration and several q-axis magnetic barriers. The configurations and operating principles of the investigated machines are introduced first. The effect of magnet arrangements on the performance of the proposed machines is then evaluated with a simplified magnetic circuit model. Furthermore, the electromagnetic characteristics of the proposed machines are investigated and compared by the finite-element method (FEM). Experiments on one prototype are carried out to validate the FEM results.

    Integrative Analysis of Transcriptome-Wide Association Study and Gene-Based Association Analysis Identifies In Silico Candidate Genes Associated with Juvenile Idiopathic Arthritis

    Genome-wide association study (GWAS) of juvenile idiopathic arthritis (JIA) suffers from low power due to limited sample size and is hard to interpret because most signals lie in non-coding regions. Gene-level analysis could alleviate these issues. Using GWAS summary statistics, we performed two typical gene-level analyses of JIA, transcriptome-wide association studies (TWAS) using FUnctional Summary-based ImputatiON (FUSION) and gene-based analysis using eQTL Multi-marker Analysis of GenoMic Annotation (eMAGMA), followed by comprehensive enrichment analysis. Among 33 significant genes shared by the two methods, 11 were previously reported, including TYK2 (P_FUSION = 5.12 × 10⁻⁶, P_eMAGMA = 1.94 × 10⁻⁷ for whole blood), IL-6R (P_FUSION = 8.63 × 10⁻⁷, P_eMAGMA = 2.74 × 10⁻⁶ for cells EBV-transformed lymphocytes), and Fas (P_FUSION = 5.21 × 10⁻⁵, P_eMAGMA = 1.08 × 10⁻⁶ for muscle skeletal). Some new plausible JIA-associated genes are also reported, including IL-27 (P_FUSION = 2.10 × 10⁻⁷, P_eMAGMA = 3.93 × 10⁻⁸ for Liver), LAT (P_FUSION = 1.53 × 10⁻⁴, P_eMAGMA = 4.62 × 10⁻⁷ for Artery Aorta), and MAGI3 (P_FUSION = 1.30 × 10⁻⁵, P_eMAGMA = 1.73 × 10⁻⁷ for Muscle Skeletal). Enrichment analysis further highlighted 4 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and 10 Gene Ontology (GO) terms. Our findings can benefit the understanding of genetic determinants and potential therapeutic targets for JIA.