
134 research outputs found

Research on Building Mechanism of System for Intelligent Service Mobile Robot

Author: Ma Jiachen
Xie Wei
Yang Mingli
Publication venue: IntechOpen
Publication date: 02/12/2011

IntechOpen

Crossref

Dolphins: Multimodal Language Model for Driving

Author: Cao Yulong
Ma Yingzi
Pavone Marco
Sun Jiachen
Xiao Chaowei
Publication date: 01/12/2023

The quest continues for fully autonomous vehicles (AVs) capable of navigating complex real-world scenarios with human-like understanding and responsiveness. In this paper, we introduce Dolphins, a novel vision-language model architected to imbibe human-like abilities as a conversational driving assistant. Dolphins is adept at processing multimodal inputs comprising video (or image) data, text instructions, and historical control signals to generate informed outputs corresponding to the provided instructions. Building upon the open-sourced pretrained Vision-Language Model, OpenFlamingo, we first enhance Dolphins's reasoning capabilities through an innovative Grounded Chain of Thought (GCoT) process. We then tailor Dolphins to the driving domain by constructing driving-specific instruction data and conducting instruction tuning. Using the BDD-X dataset, we design and consolidate four distinct AV tasks into Dolphins to foster a holistic understanding of intricate driving scenarios. As a result, the distinctive features of Dolphins fall along two dimensions: (1) the ability to provide a comprehensive understanding of complex and long-tailed open-world driving scenarios and solve a spectrum of AV tasks, and (2) the emergence of human-like capabilities including gradient-free instant adaptation via in-context learning and error recovery via reflection.
Comment: The project page is available at https://vlm-driver.github.io

arXiv.org e-Print Archive

Improved OOD Generalization via Conditional Invariant Regularizer

Author: Li Zhenguo
Ma Zhi-Ming
Sun Jiachen
Wang Ruoyu
Yi Mingyang
Publication date: 14/07/2022

Recently, generalization on out-of-distribution (OOD) data with correlation shift has attracted great attention. The correlation shift is caused by spurious attributes that correlate with the class label, as the correlation between them may differ between training and test data. For this problem, we show that models that are conditionally independent of the spurious attributes, given the class label, are OOD generalizable. Based on this, we propose a metric, Conditional Spurious Variation (CSV), which controls the OOD generalization error, to measure such conditional independence. To improve OOD generalization, we regularize the training process with the proposed CSV. Under mild assumptions, our training objective can be formulated as a nonconvex-concave mini-max problem. An algorithm with a provable convergence rate is proposed to solve the problem. Extensive empirical results verify our algorithm's efficacy in improving OOD generalization.

arXiv.org e-Print Archive

Cache-Enabled in Cooperative Cognitive Radio Networks for Transmission Performance

Author: Ma Chaofan
Man Jiabao
Song Houbing
Xu Huifang
Yang Jiachen
Zheng Gan
Publication venue: Scholarly Commons
Publication date: 22/07/2019

The proliferation of mobile devices that support the acceleration of data services (especially smartphones) has resulted in a dramatic increase in mobile traffic. Mobile data has also increased exponentially, already exceeding the throughput of the backhaul. To improve spectrum utilization and increase mobile network traffic, we study the cooperation between primary and secondary networks via content caching. We consider that the secondary base station assists the primary user by pre-caching some popular primary contents. Thus, the secondary base station can obtain more licensed bandwidth to serve its own user. We mainly focus on the time delay from the backhaul link to the secondary base station. First, in terms of the content caching and the transmission strategies, we provide a cooperation scheme to maximize the secondary user’s effective data transmission rates under the constraint of the primary user’s target rate. Then, we investigate the impact of the caching allocation and prove that the formulated problem is concave with respect to the caching capacity allocation for any given power allocation. Furthermore, we obtain the joint caching and power allocation by an effective bisection search algorithm. Finally, our results show that the content caching cooperation scheme can achieve significant performance gain for the primary and secondary systems over the traditional two-hop relay cooperation without caching.

Loughborough University Institutional Repository

Embry-Riddle Aeronautical University

Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022

Author: Ba Zhongjie
Kapoor Ashish
Lei Jiachen
Ma Shuang
Ren Kui
Vemprala Sai
Publication date: 18/11/2022

In this report, we present our approach and empirical results of applying masked autoencoders to two egocentric video understanding tasks of the Ego4D Challenge 2022, namely Object State Change Classification and PNR Temporal Localization. As team TheSSVL, we placed 2nd in both tasks. Our code will be made available.
Comment: 5 pages

arXiv.org e-Print Archive

DataElixir: Purifying Poisoned Dataset to Mitigate Backdoor Attacks via Diffusion Models

Author: Chen Kai
Lan Yibing
Lv Peizhuo
Ma Hualong
Meng Guozhu
Zhou Jiachen
Publication date: 19/12/2023

Dataset sanitization is a widely adopted proactive defense against poisoning-based backdoor attacks, aimed at filtering out and removing poisoned samples from training datasets. However, existing methods have shown limited efficacy in countering the ever-evolving trigger functions, and often lead to considerable degradation of benign accuracy. In this paper, we propose DataElixir, a novel sanitization approach tailored to purify poisoned datasets. We leverage diffusion models to eliminate trigger features and restore benign features, thereby turning the poisoned samples into benign ones. Specifically, with multiple iterations of the forward and reverse process, we extract intermediary images and their predicted labels for each sample in the original dataset. Then, we identify anomalous samples in terms of the presence of label transition of the intermediary images, detect the target label by quantifying distribution discrepancy, select their purified images considering pixel and feature distance, and determine their ground-truth labels by training a benign model. Experiments conducted on 9 popular attacks demonstrate that DataElixir effectively mitigates various complex attacks while exerting minimal impact on benign accuracy, surpassing the performance of baseline defense methods.
Comment: Accepted by AAAI202

arXiv.org e-Print Archive