
200 research outputs found

Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks

Authors: Gao Zhimin, Li Wanqing, Liu Song, Ogunbona Philip, Wang Pichao, Zhang Yuyao
Publication date: 01/01/2016

This paper addresses the problem of continuous gesture recognition from sequences of depth maps using convolutional neural networks (ConvNets). The proposed method first segments individual gestures from a depth sequence based on quantity of movement (QOM). For each segmented gesture, an Improved Depth Motion Map (IDMM), which converts the depth sequence into one image, is constructed and fed to a ConvNet for recognition. The IDMM effectively encodes both spatial and temporal information and allows fine-tuning with existing ConvNet models for classification without introducing millions of parameters to learn. The proposed method is evaluated on the Large-scale Continuous Gesture Recognition of the ChaLearn Looking at People (LAP) challenge 2016. It achieved a performance of 0.2655 (Mean Jaccard Index) and ranked 3rd in this challenge.
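
The segmentation step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define QOM precisely, so the sketch assumes it is the number of pixels whose depth differs from the first frame by more than a threshold, and that a gesture is a run of consecutive frames whose QOM exceeds a second threshold. The function names, both thresholds and the toy frames are hypothetical.

```python
def quantity_of_movement(frame, reference, depth_threshold):
    """Count pixels whose depth differs from the reference by more than depth_threshold."""
    return sum(
        1
        for row, ref_row in zip(frame, reference)
        for value, ref_value in zip(row, ref_row)
        if abs(value - ref_value) > depth_threshold
    )


def segment_gestures(frames, depth_threshold=10, qom_threshold=2):
    """Return inclusive (start, end) frame index pairs for runs of high movement."""
    reference = frames[0]  # assumption: the first frame shows the resting pose
    segments = []
    start = None
    for index, frame in enumerate(frames):
        moving = quantity_of_movement(frame, reference, depth_threshold) > qom_threshold
        if moving and start is None:
            start = index  # a gesture begins
        elif not moving and start is not None:
            segments.append((start, index - 1))  # the gesture ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(frames) - 1))  # sequence ended mid-gesture
    return segments


# Toy 2x2 depth frames, invented for illustration.
rest = [[100, 100], [100, 100]]
move = [[150, 150], [150, 100]]
frames = [rest, move, move, rest, rest, move, rest]
print(segment_gestures(frames))
```

Each returned pair would then delimit the frames that are collapsed into one motion-map image for classification.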

arXiv.org e-Print Archive

Crossref

Research Online

Research on Keywords Variations in Linguistics Based on TF-IDF and N-gram

Authors: Li Yuyao, Liu Xingyu, Wen Xueyi
Publication venue: University of Zagreb, Faculty of Electrical Engineering and Computing
Publication date: 01/01/2022

The rapid development of natural language processing (NLP) holds great promise for bridging the divide among languages. One of its main innovative applications is to use broad data to explore the historical trend of a subject. However, since Saussure pioneered modern linguistics, relatively little research has examined the field's own variations in a way that comprehensively reveals linguistic trends. To trace the changes in linguistic research hotspots, we use a dataset of more than 30,000 linguistics-related publications with their titles from the Web of Science and apply NLP techniques to the data consisting of their keywords and publication years. It is found that the co-occurrence relationship between keywords, N-grams, and their relationship with years can effectively present changes in linguistic research themes. This research is intended to provide further insights and new methods that can be applied in the field of linguistics and related disciplines.
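
One way to combine TF-IDF and N-grams over keywords and publication years is sketched below. The abstract does not give the exact weighting or preprocessing, so this sketch makes several assumptions: each publication year is treated as one "document", terms are word n-grams taken from keyword strings, and the score is the textbook term frequency times log inverse document frequency. The function names and the toy keywords are invented for illustration.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Split a keyword string into lowercase word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_by_year(keywords_by_year, n=2):
    """Score each n-gram per year, treating every year as one document."""
    counts = {
        year: Counter(gram for keyword in keywords for gram in ngrams(keyword, n))
        for year, keywords in keywords_by_year.items()
    }
    num_years = len(counts)
    # Number of years in which each n-gram appears at least once.
    doc_freq = Counter(term for year_counts in counts.values() for term in year_counts)
    scores = {}
    for year, year_counts in counts.items():
        total = sum(year_counts.values())
        scores[year] = {
            term: (count / total) * math.log(num_years / doc_freq[term])
            for term, count in year_counts.items()
        }
    return scores


# Toy data, invented for illustration only.
keywords_by_year = {
    2000: ["second language acquisition", "generative grammar"],
    2010: ["second language acquisition", "corpus linguistics"],
    2020: ["second language acquisition", "neural machine translation"],
}
scores = tfidf_by_year(keywords_by_year)
print(sorted(scores[2020].items(), key=lambda item: -item[1]))
```

Under this weighting, n-grams that appear in every year score zero, while n-grams confined to a few years stand out as candidate markers of a shifting research theme.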

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Learning-based Single-step Quantitative Susceptibility Mapping Reconstruction Without Brain Extraction

Authors: Cao Steven, Guan Xiaojun, Liu Chunlei, Wei Hongjiang, Yan Fuhua, Yeom Kristen W., Zhang Yuyao
Publication date: 15/05/2019

Quantitative susceptibility mapping (QSM) estimates the underlying tissue magnetic susceptibility from MRI gradient-echo phase signal and typically requires several processing steps. These steps involve phase unwrapping, brain volume extraction, background phase removal and solving an ill-posed inverse problem. The resulting susceptibility map is known to suffer from inaccuracy near the edges of the brain tissues, in part due to imperfect brain extraction, edge erosion of the brain tissue and the lack of phase measurement outside the brain. This inaccuracy has thus hindered the application of QSM for measuring the susceptibility of tissues near the brain edges, e.g., quantifying cortical layers and generating superficial venography. To address these challenges, we propose a learning-based QSM reconstruction method that directly estimates the magnetic susceptibility from total phase images without the need for brain extraction and background phase removal, referred to as autoQSM. The neural network has a modified U-net structure and is trained using QSM maps computed by a two-step QSM method. 209 healthy subjects with ages ranging from 11 to 82 years were employed for patch-wise network training. The network was validated on data dissimilar to the training data, e.g. in vivo mouse brain data and brains with lesions, which suggests that the network has generalized and learned the underlying mathematical relationship between magnetic field perturbation and magnetic susceptibility. AutoQSM was able to recover magnetic susceptibility of anatomical structures near the edges of the brain including the veins covering the cortical surface, spinal cord and nerve tracts near the mouse brain boundaries. The advantages of high-quality maps, no need for brain volume extraction and high reconstruction speed demonstrate its potential for future applications.

Comment: 26 pages

arXiv.org e-Print Archive

eScholarship - University of California

NeuRI: Diversifying DNN Generation via Inductive Rule Inference

Authors: Liu Jiawei, Peng Jinjun, Wang Yuyao, Zhang Lingming
Publication date: 04/09/2023

Deep Learning (DL) is prevalently used in various industries to improve decision-making and automate processes, driven by the ever-evolving DL libraries and compilers. The correctness of DL systems is crucial for trust in DL applications. As such, the recent wave of research has been studying the automated synthesis of test-cases (i.e., DNN models and their inputs) for fuzzing DL systems. However, existing model generators only subsume a limited number of operators, lacking the ability to pervasively model operator constraints. To address this challenge, we propose NeuRI, a fully automated approach for generating valid and diverse DL models composed of hundreds of types of operators. NeuRI adopts a three-step process: (i) collecting valid and invalid API traces from various sources; (ii) applying inductive program synthesis over the traces to infer the constraints for constructing valid models; and (iii) using hybrid model generation which incorporates both symbolic and concrete operators. Our evaluation shows that NeuRI improves branch coverage of TensorFlow and PyTorch by 24% and 15% over the state-of-the-art model-level fuzzers. NeuRI finds 100 new bugs for PyTorch and TensorFlow in four months, with 81 already fixed or confirmed. Of these, 9 bugs are labelled as high priority or security vulnerability, constituting 10% of all high-priority bugs of the period. Open-source developers regard error-inducing tests reported by us as "high-quality" and "common in practice"

arXiv.org e-Print Archive

Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation

Authors: Liu Jiawei, Wang Yuyao, Xia Chunqiu Steven, Zhang Lingming
Publication date: 02/05/2023

Program synthesis has been long studied with recent approaches focused on directly using the power of Large Language Models (LLMs) to generate code according to user intent written in natural language. Code evaluation datasets, containing curated synthesis problems with input/output test-cases, are used to measure the performance of various LLMs on code synthesis. However, test-cases in these datasets can be limited in both quantity and quality for fully assessing the functional correctness of the generated code. Such limitation in the existing benchmarks begs the following question: In the era of LLMs, is the code generated really correct? To answer this, we propose EvalPlus -- a code synthesis benchmarking framework to rigorously evaluate the functional correctness of LLM-synthesized code. In short, EvalPlus takes in the base evaluation dataset and uses an automatic input generation step to produce and diversify large amounts of new test inputs using both LLM-based and mutation-based input generators to further validate the synthesized code. We extend the popular HUMANEVAL benchmark and build HUMANEVAL+ with 81x additionally generated tests. Our extensive evaluation across 14 popular LLMs demonstrates that HUMANEVAL+ is able to catch significant amounts of previously undetected wrong code synthesized by LLMs, reducing the pass@k by 15.1% on average! Moreover, we even found several incorrect ground-truth implementations in HUMANEVAL. Our work not only indicates that prior popular code synthesis evaluation results do not accurately reflect the true performance of LLMs for code synthesis but also opens up a new direction to improve programming benchmarks through automated test input generation
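
The abstract reports a drop in pass@k without defining the metric. The sketch below uses the unbiased estimator that is commonly used with the original HumanEval benchmark: if n samples are generated for a problem and c of them pass every test, then pass@k = 1 - C(n - c, k) / C(n, k). The function name and the sample counts are invented for illustration and are not results from the paper.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased estimate of the chance that at least one of k drawn samples is correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Invented numbers: 10 samples for one problem. 6 pass the base tests,
# but only 4 still pass once extra generated tests are added.
base = pass_at_k(n=10, c=6, k=1)
extended = pass_at_k(n=10, c=4, k=1)
print(base, extended)
```

Adding tests can only keep c the same or lower it, so a stricter test suite can never raise pass@k for a fixed set of samples.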

arXiv.org e-Print Archive

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

Authors: Du Simon S., Liu Yuyao, Shi Ruizhe, Xu Huazhe, Ze Yanjie
Publication date: 27/11/2023

Offline reinforcement learning (RL) aims to find a near-optimal policy using pre-collected datasets. In real-world scenarios, data collection could be costly and risky; therefore, offline RL becomes particularly challenging when the in-domain data is limited. Given recent advances in Large Language Models (LLMs) and their few-shot learning prowess, this paper introduces Language Models for Motion Control (LaMo), a general framework based on Decision Transformers to effectively use pre-trained Language Models (LMs) for offline RL. Our framework highlights four crucial components: (1) Initializing Decision Transformers with sequentially pre-trained LMs, (2) employing the LoRA fine-tuning method, in contrast to full-weight fine-tuning, to combine the pre-trained knowledge from LMs and in-domain knowledge effectively, (3) using the non-linear MLP transformation instead of linear projections, to generate embeddings, and (4) integrating an auxiliary language prediction loss during fine-tuning to stabilize the LMs and retain their original abilities on languages. Empirical results indicate LaMo achieves state-of-the-art performance in sparse-reward tasks and closes the gap between value-based offline RL methods and decision transformers in dense-reward tasks. In particular, our method demonstrates superior performance in scenarios with limited data samples.

Comment: 24 pages, 16 tables
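
Component (2) above relies on LoRA, which keeps a pre-trained weight matrix W frozen and learns a low-rank update, so a layer computes W x + (alpha / r) * B (A x). The sketch below shows that idea in plain Python for a single linear layer. It is a generic illustration of LoRA, not the LaMo code; the function names, shapes and numbers are invented for the example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(weight, lora_a, lora_b, x, alpha):
    """Compute W x + (alpha / r) * B (A x), where W is frozen and only A, B are trained."""
    rank = len(lora_a)  # r: number of rows of A
    base = matvec(weight, x)
    update = matvec(lora_b, matvec(lora_a, x))
    return [b + (alpha / rank) * u for b, u in zip(base, update)]


def trainable_parameters(d_out, d_in, rank):
    """Parameters in A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


weight = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]  # frozen, 2 x 3
lora_a = [[0.5, 0.5, 0.5]]                   # trainable, r x 3 with r = 1
lora_b = [[0.0], [0.0]]                      # trainable, 2 x r, zero-initialised
x = [1.0, 2.0, 3.0]
y = lora_forward(weight, lora_a, lora_b, x, alpha=1.0)
print(y, trainable_parameters(768, 768, 8), 768 * 768)
```

Because B starts at zero, the adapted layer initially reproduces the pre-trained layer exactly, and the number of trained parameters grows with r * (d_in + d_out) rather than d_in * d_out.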

arXiv.org e-Print Archive

Graph Descriptive Order Improves Reasoning with Large Language Model

Authors: Chen Lizhe, Cheng Xueqi, Feng Wenjie, Ge Yuyao, Liu Shenghua, Mei Lingrui
Publication date: 24/02/2024

In recent years, large language models have achieved state-of-the-art performance across multiple domains. However, the progress in the field of graph reasoning with LLMs remains limited. Our work delves into this gap by thoroughly investigating graph reasoning with LLMs. In this work, we reveal the impact of the order of graph description on LLMs' graph reasoning performance, which significantly affects LLMs' reasoning abilities. By altering this order, we enhance the performance of LLMs from 42.22% to 70%. Furthermore, we introduce the Scaled Graph Reasoning benchmark for assessing LLMs' performance across various graph sizes and evaluate the relationship between LLMs' graph reasoning abilities and graph size. We discover that the graph reasoning performance of LLMs does not monotonically decrease with the increase in graph size. The experiments span several mainstream models, including GPT-3.5, LLaMA-2-7B, and LLaMA-2-13B, to offer a comprehensive evaluation.
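
To make "order of graph description" concrete, the sketch below renders the same edge list as text in two different orders: a shuffled order and a breadth-first order from a chosen start node. The abstract does not say which orderings the authors compared, so the breadth-first choice, the sentence template and all names here are assumptions made purely for illustration.

```python
import random
from collections import deque


def describe(edges):
    """Render an edge list as plain-text sentences for a prompt."""
    return " ".join(f"Node {u} is connected to node {v}." for u, v in edges)


def bfs_edge_order(edges, start):
    """List undirected edges in the order a breadth-first traversal first meets them.

    Only edges in the connected component of `start` are returned.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)
    seen_nodes = {start}
    seen_edges = set()
    ordered = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency.get(node, [])):
            edge = frozenset((node, neighbour))
            if edge not in seen_edges:
                seen_edges.add(edge)
                ordered.append((node, neighbour))
            if neighbour not in seen_nodes:
                seen_nodes.add(neighbour)
                queue.append(neighbour)
    return ordered


edges = [(2, 3), (0, 1), (1, 2), (0, 2)]
shuffled = edges[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the example is repeatable

bfs_prompt = describe(bfs_edge_order(edges, start=0))
random_prompt = describe(shuffled)
print(bfs_prompt)
print(random_prompt)
```

Both prompts encode the same graph, so any difference in a model's answers between them would be attributable to description order alone.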

arXiv.org e-Print Archive

Prompt2NeRF-PIL: Fast NeRF Generation via Pretrained Implicit Latent

Authors: Liu Jianmeng, Meng Zeyuan, Tai Yu-Wing, Tang Chi-Keung, Zhang Yuyao
Publication date: 05/12/2023

This paper explores promptable NeRF generation (e.g., text prompt or single image prompt) for direct conditioning and fast generation of NeRF parameters for the underlying 3D scenes, thus undoing complex intermediate steps while providing full 3D generation with conditional control. Unlike previous diffusion-CLIP-based pipelines that involve tedious per-prompt optimizations, Prompt2NeRF-PIL is capable of generating a variety of 3D objects with a single forward pass, leveraging a pre-trained implicit latent space of NeRF parameters. Furthermore, in zero-shot tasks, our experiments demonstrate that the NeRFs produced by our method serve as semantically informative initializations, significantly accelerating the inference process of existing prompt-to-NeRF methods. Specifically, we will show that our approach speeds up the text-to-NeRF model DreamFusion and the 3D reconstruction speed of the image-to-NeRF method Zero-1-to-3 by 3 to 5 times

arXiv.org e-Print Archive