190 research outputs found

DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules

Authors: Held William, Liu Yanchen, Yang Diyi
Publication date: 22/05/2023

Existing large language models (LLMs), which mainly focus on Standard American English (SAE), often perform significantly worse when applied to other English dialects. While existing mitigations tackle discrepancies for individual target dialects, they assume access to high-accuracy dialect identification systems. The boundaries between dialects are inherently flexible, making it difficult to categorize language into discrete predefined categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic Aggregation), a modular approach to imbue SAE-trained models with multi-dialectal robustness by composing adapters which handle specific linguistic features. The compositional architecture of DADA allows for both targeted adaptation to specific dialect variants and simultaneous adaptation to various dialects. We show that DADA is effective for both single task and instruction finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.

arXiv.org e-Print Archive

Custom Sine Waves Are Enough for Imitation Learning of Bipedal Gaits with Different Styles

Authors: Liu Yanchen, Wu Qi, Zhang Chong
Publication date: 08/04/2022

Only recently has robust bipedal locomotion been achieved through reinforcement learning. However, existing implementations rely heavily on insights and efforts from human experts, which is costly for the iterative design of robot systems. Also, the styles of the learned motion are strictly limited to that of the reference. In this paper, we propose a new way to learn bipedal locomotion from a simple sine wave as the reference for foot heights. With the naive human insight that the two feet should be lifted alternately and periodically, we experimentally demonstrate on the Cassie robot that a simple reward function is able to make the robot learn to walk end-to-end and efficiently without any explicit knowledge of the model. With custom sine waves, the learned gait pattern can also have customized styles. Code will be released at github.com/WooQi57/sin-cassie-rl. Comment: 7 pages, 11 figures, submitted to ICM
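The idea of a sine-wave foot-height reference can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the half-wave clipping, the exponential reward shape, and all parameter values (`period`, `amplitude`, `scale`) are assumptions chosen only for illustration.

```python
import math


def foot_height_refs(t, period=0.8, amplitude=0.1):
    """Return hypothetical (left, right) foot-height references at time t.

    Each foot follows the positive half of a sine wave. The right foot is
    shifted by half a period, so the feet are lifted alternately.
    """
    phase = 2.0 * math.pi * t / period
    left = amplitude * max(0.0, math.sin(phase))
    right = amplitude * max(0.0, math.sin(phase + math.pi))
    return left, right


def height_tracking_reward(actual, reference, scale=50.0):
    """Assumed reward shape: 1.0 for perfect tracking, decaying with error."""
    squared_error = sum((a - r) ** 2 for a, r in zip(actual, reference))
    return math.exp(-scale * squared_error)


if __name__ == "__main__":
    for step in range(5):
        t = step * 0.2
        print(t, foot_height_refs(t))
```

Changing `period` or `amplitude` changes the step frequency or step height of the reference, which is one way to picture the "custom styles" the abstract mentions.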

arXiv.org e-Print Archive

A mathematical theory of resolution limits for super-resolution of positive sources

Authors: Ammari Habib, He Yanchen, Liu Ping
Publication date: 24/11/2022

This work analyzes the superresolving capacity for number and location recovery in the super-resolution of positive sources. Specifically, we introduce the computational resolution limits for number detection and for location recovery in the one-dimensional super-resolution problem and quantitatively characterize their dependence on the cutoff frequency, signal-to-noise ratio, and sparsity of the sources. As a direct consequence, we show that targeting the sparsest positive solution in the super-resolution problem already provides the optimal resolution order. These results are generalized to multi-dimensional spaces. Our estimates indicate that there exist phase transitions in the corresponding reconstructions, which are confirmed by numerical experiments. Our theory fills in an important piece of the puzzle toward fully understanding the super-resolution of positive sources.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Hexagon -- On Machine Parallelism/Skew Check

Authors: Liu Yanchen, Philips Greg, Sullivan Byron
Publication venue: DigitalCommons@URI
Publication date: 01/01/2019

With industrial development, precision engineering has found a wide range of applications. In the precision machining of materials, however, subtle changes can cause parts to exceed their tolerances. This project designs a solution that measures the relative parallelism (skew) of two sides of a triangular beam during the final milling step. The relative parallelism must be within a tolerance of less than 25 micrometers. The triangular beam is used in Hexagon’s Global S, a coordinate measuring machine (CMM). Air bearings hold the triangular beam and move along the beam’s surface. The direction of motion, in the machine’s own frame, is the X-axis, so the relative parallelism of the two sides affects the measurement accuracy along the X-axis. The solution must mount on the milling machine and be non-contact; the parts cannot be moved or touched. Non-contact displacement sensors were therefore considered. After thorough research and selection, the Omega inductive sensor LD 701-5/10 was purchased and tested. The AR-700 laser displacement sensor from Acuity, the LJ-V7060 displacement sensor from Keyence, and the CapaNCDT 6019 capacitive sensor from Micro-Epsilon were also considered. After comparing the displacement sensors on performance, noise, accuracy, precision, price, and other factors, the inductive displacement sensor was chosen. The displacement sensor measures the side face of the triangular beam and must be perpendicular to that face during measurement. The measuring process is simple. Before measuring, the first surface is milled and is assumed to be completely horizontal and smooth. The triangular beam is then rotated and the measurement is taken; the sensor measures the skew of the first milled face. If the difference between the maximum and minimum readings displayed by the sensor is within 25 microns, the position of the rotated beam is correct and the beam can continue to be processed. If the difference is larger than 25 microns, the position of the rotated beam needs to be adjusted. The same principle applies to the second and third faces. By improving the measurement technique for the triangular beam, processing efficiency is improved and material is saved.
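The pass/fail rule described in the abstract (spread between maximum and minimum sensor readings compared with a 25-micron tolerance) can be sketched as below. The function name and the sample readings are hypothetical; only the 25-micron threshold comes from the abstract.

```python
def skew_within_tolerance(readings_um, tolerance_um=25.0):
    """Return (spread, ok) for a list of displacement readings in micrometers.

    spread is the difference between the largest and smallest reading;
    ok is True when the spread does not exceed the tolerance.
    """
    if not readings_um:
        raise ValueError("at least one reading is required")
    spread = max(readings_um) - min(readings_um)
    return spread, spread <= tolerance_um


if __name__ == "__main__":
    # Hypothetical readings taken along one face of the beam.
    spread, ok = skew_within_tolerance([3.0, 10.5, 21.0])
    print(spread, "continue milling" if ok else "adjust beam position")
```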

DigitalCommons@URI

Investigating the Margins: Bernard of Parma’s Glossa ordinaria on Religious Marginality in the High Middle Ages

Author: Liu Yanchen
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2022

The Glossa ordinaria compiled by Bernard of Parma (d. 1266) on Pope Gregory IX’s 1234 Decretales, commonly known as the Liber extra, is among the most influential canon law commentaries during the High and Late Middle Ages. Interrogating this source, this dissertation examines the legal status of selected marginal religious groups in medieval Europe—apostates, heretics, Jews, Muslims, and practitioners of magic. Soon after its emergence, Bernard’s Glossa was studied by law school students—that is, future Church judges, lawyers, inquisitors, and even popes—from the mid-thirteenth century on, and was the standard commentary copied into the margins of manuscripts of the Decretales. Yet, modern scholarship ignores this source almost entirely. This study treats this issue through transcription, translation, comparison, and analysis of texts from selected medieval manuscripts of the Decretales and the Glossa, including the earliest surviving exemplars (c. 1240). It explicates the Romano-canonical judicial terminology and principles employed by the Glossa. Furthermore, it scrutinizes the Glossa’s manner of using legal allegations and tracks the excerpts which it inherits from commentarial literature. Finally, it examines how the Glossa treats the selected marginal religious groups, and thus uncovers how this source can serve as a window for us into medieval society from the perspective of the learned or academic law. More broadly, this work contributes to a fuller understanding of the development of medieval canonical science, the operation of the ecclesiastical-legal system, and the mechanism through which the institutional Church defined its own religious boundaries

Columbia University Academic Commons

Classification of C3 and C4 Vegetation Types Using MODIS and ETM+ Blended High Spatio-Temporal Resolution Data

Authors: Bo Yanchen, He Yaqian, Liu Xiaolong, Zhang Jian
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2015

The distribution of C3 and C4 vegetation plays an important role in the global carbon cycle and climate change. Knowledge of the distribution of C3 and C4 vegetation at a high spatial resolution over local or regional scales helps us to understand their ecological functions and climate dependencies. In this study, we classified C3 and C4 vegetation at a high resolution for spatially heterogeneous landscapes. First, we generated a land surface reflectance dataset with high spatial and temporal resolution by blending MODIS (Moderate Resolution Imaging Spectroradiometer) and ETM+ (Enhanced Thematic Mapper Plus) data. The blended data exhibited a high correlation (R² = 0.88) with the satellite-derived ETM+ data. Time-series NDVI (Normalized Difference Vegetation Index) data were then generated from the blended high spatio-temporal resolution data to capture the phenological differences between the C3 and C4 vegetation. The time-series NDVI revealed that the C3 vegetation turns green earlier in spring and senesces later in autumn than the C4 vegetation, while the C4 vegetation has a higher NDVI value than the C3 vegetation during summer. Based on these distinguishing characteristics, the time-series NDVI was used to extract the C3 and C4 classification features. Five features were selected from the 18 classification features according to the ground investigation data, and subsequently used for the C3 and C4 classification. The overall accuracy of the C3 and C4 vegetation classification was 85.75% with a kappa of 0.725 in our study area.
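For readers unfamiliar with the two reported scores, the following sketch computes overall accuracy and Cohen's kappa from a confusion matrix using their standard definitions. The matrix values are invented for illustration and are not the study's data.

```python
def accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa.

    confusion[i][j] is the number of samples of true class i
    that were assigned to class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical two-class (C3 vs. C4) confusion matrix.
    print(accuracy_and_kappa([[40, 10], [5, 45]]))
```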

Crossref

Directory of Open Access Journals

The Research Repository @ WVU (West Virginia University)

SISSA: Real-time Monitoring of Hardware Functional Safety and Cybersecurity with In-vehicle SOME/IP Ethernet Traffic

Authors: Li Xingyu, Li Yufeng, Liu Qi, Liu Yanchen, Sun Ke
Publication date: 20/02/2024

Scalable service-Oriented Middleware over IP (SOME/IP) is an Ethernet communication standard protocol in the Automotive Open System Architecture (AUTOSAR), promoting ECU-to-ECU communication over the IP stack. However, SOME/IP lacks a robust security architecture, making it susceptible to potential attacks. In addition, random hardware failures of ECUs can disrupt SOME/IP communication. In this paper, we propose SISSA, a SOME/IP communication traffic-based approach for modeling and analyzing in-vehicle functional safety and cybersecurity. Specifically, SISSA models hardware failures with the Weibull distribution and addresses five potential attacks on SOME/IP communication, including Distributed Denial-of-Service, Man-in-the-Middle, and abnormal communication processes, assuming a malicious user accesses the in-vehicle network. Subsequently, SISSA designs a series of deep learning models with various backbones to extract features from SOME/IP sessions among ECUs. We adopt residual self-attention to accelerate the model's convergence and enhance detection accuracy, determining whether an ECU is under attack, facing functional failure, or operating normally. Additionally, we have created and annotated a dataset encompassing various classes, including indicators of attack, functionality, and normalcy. This contribution is noteworthy due to the scarcity of publicly accessible datasets with such characteristics. Extensive experimental results show the effectiveness and efficiency of SISSA.
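As background on the Weibull failure model mentioned in the abstract, the sketch below evaluates the standard two-parameter Weibull reliability and hazard functions and draws sample failure times. It is not SISSA's implementation; the function names and the scale and shape values are assumptions for illustration only.

```python
import math
import random


def weibull_reliability(t, scale, shape):
    """Probability that a component survives beyond time t."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, scale, shape):
    """Instantaneous failure rate at time t."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def sample_failure_times(n, scale, shape, seed=0):
    """Draw n failure times from a Weibull distribution."""
    rng = random.Random(seed)
    # random.weibullvariate takes (scale, shape) in that order.
    return [rng.weibullvariate(scale, shape) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical ECU with scale 1000 hours and shape 1.5 (wear-out).
    print(weibull_reliability(500.0, 1000.0, 1.5))
    print(sample_failure_times(3, 1000.0, 1.5))
```

A shape parameter above 1 gives a failure rate that rises with age, a value of exactly 1 gives a constant rate, and a value below 1 gives a falling rate.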

arXiv.org e-Print Archive

Investigating the Fairness of Large Language Models for Predictions on Tabular Data

Authors: Gautam Srishti, Lakkaraju Himabindu, Liu Yanchen, Ma Jiaqi
Publication date: 23/10/2023

Recent literature has suggested the potential of using large language models (LLMs) to make predictions for tabular tasks. However, LLMs have been shown to exhibit harmful social biases that reflect the stereotypes and inequalities present in society. Given this, and the widespread use of tabular data in many high-stakes applications, it is imperative to explore the following questions: what sources of information do LLMs draw upon when making predictions for tabular tasks; whether and to what extent are LLM predictions for tabular tasks influenced by social biases and stereotypes; and what are the consequential implications for fairness? Through a series of experiments, we delve into these questions and show that LLMs tend to inherit social biases from their training data, which significantly impact their fairness in tabular prediction tasks. Furthermore, our investigations show that in the context of bias mitigation, though in-context learning and fine-tuning have a moderate effect, the fairness metric gap between different subgroups is still larger than that in traditional machine learning models, such as Random Forest and shallow Neural Networks. This observation emphasizes that the social biases are inherent within the LLMs themselves and inherited from their pre-training corpus, not only from the downstream task datasets. In addition, we demonstrate that label-flipping of in-context examples can significantly reduce biases, further highlighting the presence of inherent bias within LLMs.
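The abstract does not say which fairness metric is used, so the sketch below picks one common choice, the statistical parity gap (difference in positive-prediction rates between two subgroups), purely as an assumption to show what a "fairness metric gap between subgroups" can look like. The function names and data are hypothetical.

```python
def positive_rate(predictions):
    """Fraction of binary predictions equal to 1."""
    return sum(predictions) / len(predictions)


def statistical_parity_gap(predictions, groups, group_a, group_b):
    """Positive-prediction rate of group_a minus that of group_b.

    A value of 0.0 means both subgroups receive positive
    predictions at the same rate.
    """
    preds_a = [p for p, g in zip(predictions, groups) if g == group_a]
    preds_b = [p for p, g in zip(predictions, groups) if g == group_b]
    return positive_rate(preds_a) - positive_rate(preds_b)


if __name__ == "__main__":
    # Hypothetical binary predictions and subgroup labels.
    preds = [1, 1, 0, 1, 0, 0]
    groups = ["a", "a", "a", "b", "b", "b"]
    print(statistical_parity_gap(preds, groups, "a", "b"))
```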

arXiv.org e-Print Archive

Linking in situ LAI and Fine Resolution Remote Sensing Data to Map Reference LAI over Cropland and Grassland Using Geostatistical Regression Method

Authors: Bo Yanchen, Chai Leilei, He Yaqian, Li Aihua, Liu Xiaolong
Publication venue: IUScholarWorks
Publication date: 01/08/2016

Leaf Area Index (LAI) is an important parameter of vegetation structure. A number of moderate resolution LAI products have been produced to meet the urgent need for large-scale vegetation monitoring. High resolution LAI reference maps are necessary to validate these LAI products. This study used a geostatistical regression (GR) method to estimate LAI reference maps by linking in situ LAI and Landsat TM/ETM+ and SPOT-HRV data over two cropland and two grassland sites. To explore how the choice of vegetation index (VI) affects the estimated LAI reference maps, this study established GR models for different VIs, including the difference vegetation index (DVI), normalized difference vegetation index (NDVI), and ratio vegetation index (RVI). To further assess the performance of the GR model, the results from the GR and Reduced Major Axis (RMA) models were compared. The results show that the performance of the GR model varies between the cropland and grassland sites. At the cropland sites, the GR model based on DVI provides the best estimation, while at the grassland sites, the GR model based on DVI performs poorly. Compared to the RMA model, the GR model improves the accuracy of reference LAI maps in terms of root mean square errors (RMSE) and bia
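The three vegetation indices named in the abstract have standard definitions in terms of near-infrared (NIR) and red reflectance, sketched below. The function name and the reflectance values are hypothetical and do not come from the study.

```python
def vegetation_indices(nir, red):
    """Compute DVI, NDVI and RVI from NIR and red reflectance.

    DVI  = NIR - Red
    NDVI = (NIR - Red) / (NIR + Red)
    RVI  = NIR / Red
    """
    return {
        "DVI": nir - red,
        "NDVI": (nir - red) / (nir + red),
        "RVI": nir / red,
    }


if __name__ == "__main__":
    # Hypothetical reflectance values for a vegetated pixel.
    print(vegetation_indices(nir=0.5, red=0.1))
```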

Boise State University - ScholarWorks