Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks
This paper addresses the problem of continuous gesture recognition from
sequences of depth maps using convolutional neural networks (ConvNets). The
proposed method first segments individual gestures from a depth sequence based
on quantity of movement (QOM). For each segmented gesture, an Improved Depth
Motion Map (IDMM), which converts the depth sequence into one image, is
constructed and fed to a ConvNet for recognition. The IDMM effectively encodes
both spatial and temporal information and allows the fine-tuning with existing
ConvNet models for classification without introducing millions of parameters to
learn. The proposed method is evaluated on the Large-scale Continuous Gesture
Recognition track of the ChaLearn Looking at People (LAP) challenge 2016,
where it achieved a Mean Jaccard Index of 0.2655
Research on Keywords Variations in Linguistics Based on TF-IDF and N-gram
The rapid development of natural language processing (NLP) holds great promise for bridging the divide among languages. One of its main innovative applications is to use large-scale data to explore the historical trends of a subject. However, since Saussure pioneered modern linguistics, relatively little work has comprehensively examined variations within the field to reveal linguistic trends. To trace the changes in linguistic research hotspots, we use a dataset of more than 30,000 titles of linguistics-related literature from the Web of Science and apply NLP techniques to their keywords and publication years. We find that the co-occurrence relationships between keywords, their n-grams, and their relationship with publication years can effectively reveal changes in linguistic research themes. This research is intended to provide further insights and new methods applicable to linguistics and related disciplines
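The TF-IDF side of such a keyword-trend analysis can be sketched with the standard weighting formula. The yearly keyword sets below are invented for illustration and are not from the paper's dataset.

```python
import math
from collections import Counter

# Minimal TF-IDF over keyword "documents", one per publication year.
# Data is illustrative, not from the Web of Science dataset.
docs = {
    1990: ["syntax", "phoneme", "syntax"],
    2010: ["corpus", "ngram", "syntax"],
    2020: ["ngram", "embedding", "corpus"],
}

def tf_idf(docs):
    n = len(docs)
    # document frequency: in how many years each keyword appears
    df = Counter(w for kws in docs.values() for w in set(kws))
    scores = {}
    for year, kws in docs.items():
        tf = Counter(kws)
        scores[year] = {w: (c / len(kws)) * math.log(n / df[w])
                        for w, c in tf.items()}
    return scores

scores = tf_idf(docs)
```

Keywords with high TF-IDF in one period but not another ("phoneme" early, "embedding" late in this toy data) are exactly the kind of signal used to trace shifting research themes.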
Learning-based Single-step Quantitative Susceptibility Mapping Reconstruction Without Brain Extraction
Quantitative susceptibility mapping (QSM) estimates the underlying tissue
magnetic susceptibility from MRI gradient-echo phase signal and typically
requires several processing steps. These steps involve phase unwrapping, brain
volume extraction, background phase removal and solving an ill-posed inverse
problem. The resulting susceptibility map is known to suffer from inaccuracy
near the edges of the brain tissues, in part due to imperfect brain extraction,
edge erosion of the brain tissue and the lack of phase measurement outside the
brain. This inaccuracy has thus hindered the application of QSM for measuring
the susceptibility of tissues near the brain edges, e.g., quantifying cortical
layers and generating superficial venography. To address these challenges, we
propose a learning-based QSM reconstruction method that directly estimates the
magnetic susceptibility from total phase images without the need for brain
extraction and background phase removal, referred to as autoQSM. The neural
network has a modified U-net structure and is trained using QSM maps computed
by a two-step QSM method. A total of 209 healthy subjects with ages ranging
from 11 to 82 years were employed for patch-wise network training. The network was validated
on data dissimilar to the training data, e.g. in vivo mouse brain data and
brains with lesions, which suggests that the network has generalized and
learned the underlying mathematical relationship between magnetic field
perturbation and magnetic susceptibility. AutoQSM was able to recover magnetic
susceptibility of anatomical structures near the edges of the brain including
the veins covering the cortical surface, spinal cord and nerve tracts near the
mouse brain boundaries. The advantages of high-quality maps, no need for brain
volume extraction and high reconstruction speed demonstrate its potential for
future applications. Comment: 26 pages
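The patch-wise training step mentioned above can be illustrated with a simple tiling routine. This is a generic 2-D sketch with made-up patch size and stride; autoQSM's actual 3-D patch settings are not specified here.

```python
# Sketch of patch-wise sampling for network training: tile a 2-D slice into
# patches with a given size and stride. Parameters are illustrative only.
def extract_patches(image, size, stride):
    h, w = len(image), len(image[0])
    patches = []
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            patches.append([row[j:j + size] for row in image[i:i + size]])
    return patches

img = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 ramp image
p = extract_patches(img, size=2, stride=2)
print(len(p))  # 4 patches
```

Training on patches rather than whole volumes keeps memory bounded and multiplies the number of training samples per subject.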
NeuRI: Diversifying DNN Generation via Inductive Rule Inference
Deep Learning (DL) is prevalently used in various industries to improve
decision-making and automate processes, driven by the ever-evolving DL
libraries and compilers. The correctness of DL systems is crucial for trust in
DL applications. As such, the recent wave of research has been studying the
automated synthesis of test-cases (i.e., DNN models and their inputs) for
fuzzing DL systems. However, existing model generators only subsume a limited
number of operators, lacking the ability to pervasively model operator
constraints. To address this challenge, we propose NeuRI, a fully automated
approach for generating valid and diverse DL models composed of hundreds of
types of operators. NeuRI adopts a three-step process: (i) collecting valid and
invalid API traces from various sources; (ii) applying inductive program
synthesis over the traces to infer the constraints for constructing valid
models; and (iii) using hybrid model generation which incorporates both
symbolic and concrete operators. Our evaluation shows that NeuRI improves
branch coverage of TensorFlow and PyTorch by 24% and 15% over the
state-of-the-art model-level fuzzers. NeuRI finds 100 new bugs for PyTorch and
TensorFlow in four months, with 81 already fixed or confirmed. Of these, 9 bugs
are labelled as high-priority or security vulnerabilities, constituting 10% of
all high-priority bugs of the period. Open-source developers regard
error-inducing tests reported by us as "high-quality" and "common in practice"
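The inductive-inference step (ii) can be illustrated with a toy version: given valid and invalid recorded shapes for a hypothetical matmul-like operator, keep the candidate constraints that separate the two sets. This is a drastic simplification of NeuRI's inductive program synthesis, not its actual rule language.

```python
# Toy inductive rule inference from API traces: which candidate constraints
# hold on all valid traces but are violated by some invalid trace?
valid = [((2, 3), (3, 4)), ((5, 1), (1, 5))]      # accepted shape pairs
invalid = [((2, 3), (4, 4)), ((5, 2), (1, 5))]    # rejected shape pairs

candidates = {
    "inner_dims_equal": lambda a, b: a[1] == b[0],
    "same_rank": lambda a, b: len(a) == len(b),
}

inferred = [name for name, rule in candidates.items()
            if all(rule(a, b) for a, b in valid)
            and not all(rule(a, b) for a, b in invalid)]
print(inferred)  # ['inner_dims_equal']
```

Constraints inferred this way can then guide a generator toward models that pass operator validity checks by construction.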
Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Program synthesis has been long studied with recent approaches focused on
directly using the power of Large Language Models (LLMs) to generate code
according to user intent written in natural language. Code evaluation datasets,
containing curated synthesis problems with input/output test-cases, are used to
measure the performance of various LLMs on code synthesis. However, test-cases
in these datasets can be limited in both quantity and quality for fully
assessing the functional correctness of the generated code. Such limitation in
the existing benchmarks begs the following question: In the era of LLMs, is the
code generated really correct? To answer this, we propose EvalPlus -- a code
synthesis benchmarking framework to rigorously evaluate the functional
correctness of LLM-synthesized code. In short, EvalPlus takes in the base
evaluation dataset and uses an automatic input generation step to produce and
diversify large amounts of new test inputs using both LLM-based and
mutation-based input generators to further validate the synthesized code. We
extend the popular HUMANEVAL benchmark and build HUMANEVAL+ with 81x
additionally generated tests. Our extensive evaluation across 14 popular LLMs
demonstrates that HUMANEVAL+ is able to catch significant amounts of previously
undetected wrong code synthesized by LLMs, reducing the pass@k by 15.1% on
average! Moreover, we even found several incorrect ground-truth implementations
in HUMANEVAL. Our work not only indicates that prior popular code synthesis
evaluation results do not accurately reflect the true performance of LLMs for
code synthesis but also opens up a new direction to improve programming
benchmarks through automated test input generation
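The mutation-based side of the input generation described above can be sketched as differential testing against a trusted reference. The task ("absolute value") and the subtly wrong candidate are invented for illustration; EvalPlus itself operates on HumanEval-style problems.

```python
import random

# Sketch of mutation-based input generation: mutate seed inputs and compare
# a candidate solution against a reference implementation. Problem and
# candidate are made up for illustration.
def reference(x):
    return abs(x)

def candidate(x):          # subtly wrong: fails on negative inputs
    return x

def mutate(x, rng):
    return rng.choice([x + 1, x - 1, -x, x * 2])

def find_counterexample(seeds, rounds=100, seed=0):
    rng = random.Random(seed)
    pool = list(seeds)
    for _ in range(rounds):
        x = mutate(rng.choice(pool), rng)
        pool.append(x)
        if candidate(x) != reference(x):
            return x
    return None

cex = find_counterexample([0, 1])  # finds a negative input the candidate mishandles
```

Inputs that survive curation this way expand the base test suite, which is how HUMANEVAL+ exposes wrong code that the original tests missed.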
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning
Offline reinforcement learning (RL) aims to find a near-optimal policy using
pre-collected datasets. In real-world scenarios, data collection could be
costly and risky; therefore, offline RL becomes particularly challenging when
the in-domain data is limited. Given recent advances in Large Language Models
(LLMs) and their few-shot learning prowess, this paper introduces a
general framework based on Decision Transformers to effectively use pre-trained
Language Models (LMs) for offline RL. Our framework highlights four crucial
components: (1) Initializing Decision Transformers with sequentially
pre-trained LMs, (2) employing the LoRA fine-tuning method, in contrast to
full-weight fine-tuning, to combine the pre-trained knowledge from LMs and
in-domain knowledge effectively, (3) using the non-linear MLP transformation
instead of linear projections, to generate embeddings, and (4) integrating an
auxiliary language prediction loss during fine-tuning to stabilize the LMs and
retain their original abilities on language. Empirical results indicate that
the framework achieves state-of-the-art performance in sparse-reward tasks
and closes the gap between value-based offline RL methods and decision
transformers in dense-reward tasks. In particular, our method demonstrates
superior performance in scenarios with limited data samples. Comment: 24 pages, 16 tables
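Component (2), the LoRA fine-tuning method, can be illustrated with its core arithmetic: the frozen weight W is left untouched and only a low-rank update A·B is learned, so the effective weight is W + A·B. The matrices below are toy nested lists, not the framework's actual parameters.

```python
# Sketch of the LoRA low-rank update: effective weight = W + A @ B, where
# W is frozen and only the small factors A (d x r) and B (r x d) train.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

W = [[1, 0], [0, 1]]            # frozen pre-trained weight (2x2)
A = [[1], [2]]                  # trainable factor, rank r = 1 (2x1)
B = [[3, 4]]                    # trainable factor (1x2)

W_eff = add(W, matmul(A, B))    # W + A@B
print(W_eff)  # [[4, 4], [6, 9]]
```

Because only A and B receive gradients, the pre-trained knowledge in W is preserved while a small number of parameters absorbs the in-domain data.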
Graph Descriptive Order Improves Reasoning with Large Language Model
In recent years, large language models have achieved state-of-the-art
performance across multiple domains. However, the progress in the field of
graph reasoning with LLM remains limited. Our work delves into this gap by
thoroughly investigating graph reasoning with LLMs. In this work, we reveal the
impact of the order of graph description on LLMs' graph reasoning performance,
which significantly affects LLMs' reasoning abilities. By altering this order,
we enhance the performance of LLMs from 42.22% to 70%. Furthermore, we
introduce the Scaled Graph Reasoning benchmark for assessing LLMs' performance
across various graph sizes and evaluate the relationship between LLMs' graph
reasoning abilities and graph size. We discover that the graph reasoning
performance of LLMs does not monotonically decrease with the increase in graph
size. The experiments span several mainstream models, including GPT-3.5,
LLaMA-2-7B, and LLaMA-2-13B, to offer a comprehensive evaluation
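One concrete way to vary the graph description order, as studied above, is to serialize edges in BFS order from a start node instead of an arbitrary order. The graph below is illustrative; the paper's benchmark graphs are not reproduced here.

```python
from collections import deque

# Emit a graph's edges in BFS order from a start node, one possible
# "descriptive order" for prompting an LLM. Graph is illustrative.
def bfs_edge_order(adj, start):
    seen, order, q = {start}, [], deque([start])
    while q:
        u = q.popleft()
        for v in adj[u]:
            order.append((u, v))
            if v not in seen:
                seen.add(v)
                q.append(v)
    return order

adj = {0: [1, 2], 1: [3], 2: [], 3: []}
print(bfs_edge_order(adj, 0))  # [(0, 1), (0, 2), (1, 3)]
```

Comparing model accuracy on the same graph under BFS-ordered versus shuffled edge lists is the kind of controlled experiment the reported 42.22% to 70% gap suggests.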
Prompt2NeRF-PIL: Fast NeRF Generation via Pretrained Implicit Latent
This paper explores promptable NeRF generation (e.g., text prompt or single
image prompt) for direct conditioning and fast generation of NeRF parameters
for the underlying 3D scenes, thus avoiding complex intermediate steps while
providing full 3D generation with conditional control. Unlike previous
diffusion-CLIP-based pipelines that involve tedious per-prompt optimizations,
Prompt2NeRF-PIL is capable of generating a variety of 3D objects with a single
forward pass, leveraging a pre-trained implicit latent space of NeRF
parameters. Furthermore, in zero-shot tasks, our experiments demonstrate that
the NeRFs produced by our method serve as semantically informative
initializations, significantly accelerating the inference process of existing
prompt-to-NeRF methods. Specifically, we show that our approach speeds up
the text-to-NeRF model DreamFusion and the 3D reconstruction of the
image-to-NeRF method Zero-1-to-3 by 3 to 5 times