52 research outputs found

    Driving Policy Prediction based on Deep Learning Models

    In this project, we implemented an end-to-end system that takes in combined visual features of video frames from a normal camera and depth information from a point cloud scanner, and predicts driving policies (vehicle speed and steering angle). We verified the safety of our system by comparing the predicted results with the standard behaviors of experienced real-world drivers. Our test results show that the predictions can be considered accurate in at least half of the testing cases (50% to 80%, depending on the model), and that using combined features improved performance in most cases compared with using video frames only. Comment: 5 pages, 9 figures

    DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents

    Vision-language pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text. While existing vision-language pretraining models primarily focus on understanding a single image associated with a single piece of text, they often ignore alignment at the intra-document level, where a document consists of multiple sentences and multiple images. In this work, we propose DocumentCLIP, a salience-aware contrastive learning framework to make vision-language pretraining models comprehend the interaction between images and longer text within documents. Our model benefits real-world multimodal document understanding for sources such as news articles, magazines, and product descriptions, which contain linguistically and visually richer content. To the best of our knowledge, we are the first to explore multimodal intra-document links by contrastive learning. In addition, we collect a large Wikipedia dataset for pretraining, which provides various topics and structures. Experiments show DocumentCLIP not only outperforms the state-of-the-art baselines in the supervised setting, but also achieves the best zero-shot performance in the wild after human evaluation. Our code is available at https://github.com/FuxiaoLiu/DocumentCLIP. Comment: 8 pages, 5 figures. In submission

    COVID-VTS: Fact Extraction and Verification on Short Video Platforms

    We introduce a new benchmark, COVID-VTS, for fact-checking multi-modal information involving short-duration videos with COVID-19-focused information from both the real world and machine generation. We propose TwtrDetective, an effective model incorporating cross-media consistency checking to detect token-level malicious tampering in different modalities and to generate explanations. Due to the scarcity of training data, we also develop an efficient and scalable approach to automatically generate misleading video posts by event manipulation or adversarial matching. We investigate several state-of-the-art models and demonstrate the superiority of TwtrDetective. Comment: 11 pages, 5 figures, accepted to EACL202

    Towards Understanding In-Context Learning with Contrastive Demonstrations and Saliency Maps

    We investigate the role of various demonstration components in the in-context learning (ICL) performance of large language models (LLMs). Specifically, we explore the impacts of ground-truth labels, input distribution, and complementary explanations, particularly when these are altered or perturbed. We build on previous work, which offers mixed findings on how these elements influence ICL. To probe these questions, we employ explainable NLP (XNLP) methods and utilize saliency maps of contrastive demonstrations for both qualitative and quantitative analysis. Our findings reveal that flipping ground-truth labels significantly affects the saliency, though it's more noticeable in larger LLMs. Our analysis of the input distribution at a granular level reveals that changing sentiment-indicative terms in a sentiment analysis task to neutral ones does not have as substantial an impact as altering ground-truth labels. Finally, we find that the effectiveness of complementary explanations in boosting ICL performance is task-dependent, with limited benefits seen in sentiment analysis tasks compared to symbolic reasoning tasks. These insights are critical for understanding the functionality of LLMs and guiding the development of effective demonstrations, which is increasingly relevant in light of the growing use of LLMs in applications such as ChatGPT. Our research code is publicly available at https://github.com/paihengxu/XICL. Comment: 10 pages, 5 figures

    Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

    Despite the promising progress in multi-modal tasks, current large multi-modal models (LMMs) are prone to hallucinating descriptions that are inconsistent with the associated image and human instructions. This paper addresses this issue by introducing the first large and diverse visual instruction tuning dataset, named Large-scale Robust Visual (LRV)-Instruction. Our dataset comprises 400k visual instructions generated by GPT4, covering 16 vision-and-language tasks with open-ended instructions and answers. Unlike existing studies that primarily focus on positive instruction samples, we design LRV-Instruction to include both positive and negative instructions for more robust visual instruction tuning. Our negative instructions are designed at three semantic levels: (i) Nonexistent Object Manipulation, (ii) Existent Object Manipulation, and (iii) Knowledge Manipulation. To efficiently measure the hallucination generated by LMMs, we propose GPT4-Assisted Visual Instruction Evaluation (GAVIE), a stable approach to evaluating visual instruction tuning as human experts would. GAVIE does not require human-annotated groundtruth answers and can adapt to diverse instruction formats. We conduct comprehensive experiments to investigate the hallucination of LMMs. Our results demonstrate that existing LMMs exhibit significant hallucinations when presented with our negative instructions, particularly Existent Object and Knowledge Manipulation instructions. Moreover, we successfully mitigate hallucination by finetuning MiniGPT4 and mPLUG-Owl on LRV-Instruction while improving performance on several public datasets compared to state-of-the-art methods. Additionally, we observed that a balanced ratio of positive and negative instances in the training data leads to a more robust model. Comment: 40 pages, 32 figures. Under Review

    Fatigue Behavior of an Al-12.7Si-0.7Mg Alloy Processed by Extrusion and Heat Treatment

    The fatigue life of a hot-extruded Al-12.7Si-0.7Mg alloy under T1, T4, and T6 conditions was studied. The microstructure and tensile properties of the alloy were investigated in order to analyze the fatigue behavior. The results of the fatigue test showed that the extruded Al-12.7Si-0.7Mg alloy provided greater fatigue life than a cast Al-Si alloy, which was explained by the refined microstructure promoted by hot extrusion: fine Si particles uniformly distributed in an Al matrix of fine equiaxed grains. The fatigue property of the alloy in the T6 condition was higher than that in the T4 and T1 conditions due to strengthening precipitation.

    Akkermansia muciniphila Enhances Egg Quality and the Lipid Profile of Egg Yolk by Improving Lipid Metabolism

    Akkermansia muciniphila (A. muciniphila) has shown potential as a probiotic for the prevention and treatment of non-alcoholic fatty liver disease in both humans and mice. However, relatively little is known about the effects of A. muciniphila on lipid metabolism, productivity, and product quality in laying hens. In this study, we explored whether A. muciniphila supplementation could improve lipid metabolism and egg quality in laying hens and sought to identify the underlying mechanism. In the first experiment, 80 Hy-Line Brown laying hens were divided into four groups, one of which was fed a normal diet (control group), while the other three groups were administered a high-energy, low-protein diet to induce fatty liver hemorrhagic syndrome (FLHS). Among the three FLHS groups, one was treated with phosphate-buffered saline, one with live A. muciniphila, and one with pasteurized A. muciniphila. In the second experiment, 140 Hy-Line Brown laying hens were divided into two groups and respectively fed a basal diet supplemented or not with A. muciniphila lyophilized powder. The results showed that, in laying hens with FLHS, treatment with either live or pasteurized A. muciniphila efficiently decreased body weight, abdominal fat deposition, and lipid content in both serum and the liver; downregulated the mRNA expression of lipid synthesis-related genes and upregulated that of lipid transport-related genes in the liver; promoted the growth of short-chain fatty acids (SCFAs)-producing microorganisms and increased the cecal SCFAs content; and improved the yolk lipid profile. Additionally, the supplementation of lyophilized powder of A. muciniphila to aged laying hens reduced abdominal fat deposition and total cholesterol (TC) levels in both serum and the liver, suppressed the mRNA expression of cholesterol synthesis-related genes in the liver, reduced TC content in the yolk, increased eggshell thickness, and reshaped the composition of the gut microbiota. 
    Collectively, our findings demonstrated that A. muciniphila can modulate lipid metabolism, thereby promoting laying hen health as well as egg quality and nutritive value. Live, pasteurized, and lyophilized A. muciniphila preparations all have the potential for use as additives for improving laying hen production.

    Research progress of E3 ubiquitin ligase regulating biological behavior of human placental trophoblast cells

    E3 ubiquitin ligases are important components of the ubiquitin-proteasome system. This family includes many proteins, which can catalyze the ubiquitination of a variety of protein substrates and promote their degradation by the proteasome system. Recent studies have shown that E3 ubiquitin ligases play a key role in the process of fetal development and placental formation. They affect the biological behavior of placental trophoblast cells, leading to a series of pregnancy complications that seriously threaten mothers and babies. This review focuses on the regulation, targets, and mechanisms by which E3 ubiquitin ligases act on the biological behavior of human placental trophoblast cells.