DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules
Existing large language models (LLMs) that mainly focus on Standard American
English (SAE) often perform significantly worse when applied
to other English dialects. While existing mitigations tackle discrepancies for
individual target dialects, they assume access to high-accuracy dialect
identification systems. The boundaries between dialects are inherently
flexible, making it difficult to categorize language into discrete predefined
categories. In this paper, we propose DADA (Dialect Adaptation via Dynamic
Aggregation), a modular approach to imbue SAE-trained models with
multi-dialectal robustness by composing adapters which handle specific
linguistic features. The compositional architecture of DADA allows for both
targeted adaptation to specific dialect variants and simultaneous adaptation to
various dialects. We show that DADA is effective for both single-task and
instruction-finetuned language models, offering an extensible and interpretable
framework for adapting existing LLMs to different English dialects.
Custom Sine Waves Are Enough for Imitation Learning of Bipedal Gaits with Different Styles
Only recently has robust bipedal locomotion been achieved through
reinforcement learning. However, existing implementations rely heavily on
insights and efforts from human experts, which is costly for the iterative
design of robot systems. Also, styles of the learned motion are strictly
limited to that of the reference. In this paper, we propose a new way to learn
bipedal locomotion from a simple sine wave as the reference for foot heights.
With the naive human insight that the two feet should be lifted
alternately and periodically, we experimentally demonstrate on the Cassie
robot that a simple reward function is able to make the robot learn to walk
end-to-end and efficiently without any explicit knowledge of the model. With
custom sine waves, the learned gait pattern can also have customized styles.
Code will be released at github.com/WooQi57/sin-cassie-rl.
Comment: 7 pages, 11 figures, submitted to ICM
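The idea of a sine-wave reference for foot heights can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the clamping of the wave at zero, the half-period phase offset, the exponential reward shape, and all parameter values are assumptions made only for illustration.

```python
import math


def foot_height_references(t, period=0.8, max_height=0.1):
    """Return (left, right) target foot heights in metres at time t (seconds).

    Each foot follows the positive half of a sine wave, and the two feet are
    half a period out of phase so that they are lifted alternately.
    period and max_height are illustrative values, not taken from the paper.
    """
    phase = 2.0 * math.pi * t / period
    left = max(0.0, max_height * math.sin(phase))
    right = max(0.0, max_height * math.sin(phase + math.pi))
    return left, right


def tracking_reward(actual, target, scale=50.0):
    """Hypothetical reward: 1.0 for perfect tracking, decaying with squared error."""
    squared_error = sum((a - b) ** 2 for a, b in zip(actual, target))
    return math.exp(-scale * squared_error)


# A quarter of the way through the cycle the left foot is at its peak
# while the right foot stays on the ground.
left, right = foot_height_references(t=0.2)
```

Changing `period` or `max_height` changes the stepping frequency and foot clearance, which is one plausible reading of how a custom sine wave could yield a customized gait style.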
Hexagon -- On Machine Parallelism/Skew Check
With industrial development, precision engineering has found a wide range of applications. In precision machining, however, subtle changes can cause parts to exceed their tolerances. This project designs a solution that measures the relative parallelism (skew) of two sides of a triangular beam during the final milling step. The tolerance on relative parallelism must be less than 25 micrometers. The triangular beam is used in Hexagon’s Global S, a coordinate measuring machine (CMM). Air bearings hold the triangular beam and travel along the beam’s surface. The direction of travel, relative to the machine itself, is the X-axis, so the relative parallelism of the two sides affects the measurement accuracy along the X-axis. The solution must mount on the milling machine and be non-contact: the parts cannot be moved or touched. Non-contact displacement sensors were therefore considered. After thorough research and selection, the Omega LD 701-5/10 inductive sensor was purchased and tested. The AR-700 laser displacement sensor from Acuity, the LJ-V7060 displacement sensor from Keyence, and the CapaNCDT 6019 capacitive sensor from Micro-epsilon were also considered. After comparing the sensors on performance, noise, accuracy, precision, price, and other factors, the inductive displacement sensor was chosen. The displacement sensor measures the side face of the triangular beam and must be perpendicular to that face during measurement. The measuring process is simple. Before measuring, the first surface is milled and is assumed to be completely horizontal and smooth. After the first surface is milled, the triangular beam is rotated and the measurement is taken. The sensor measures the skew of the first milled face.
If the difference between the maximum and minimum readings displayed by the sensor is within 25 microns, the position of the rotated beam is correct and the triangular beam can continue to be processed. If the difference is larger than 25 microns, the position of the rotated beam needs to be adjusted. The same principle applies to the second and third faces. This improvement in the triangular beam measurement technology improves processing efficiency and saves material.
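The acceptance rule described above (maximum reading minus minimum reading compared against 25 microns) can be sketched as follows. The function names and the sample readings are hypothetical; only the 25-micron tolerance comes from the abstract.

```python
TOLERANCE_UM = 25.0  # tolerance stated in the abstract, in micrometres


def skew_um(readings_um):
    """Skew estimate: spread between the largest and smallest sensor reading."""
    return max(readings_um) - min(readings_um)


def within_tolerance(readings_um, tolerance_um=TOLERANCE_UM):
    """True if the rotated beam can be processed further without repositioning."""
    return skew_um(readings_um) <= tolerance_um


# Hypothetical displacement readings (micrometres) along the milled face.
readings = [100.0, 112.5, 108.0]
beam_ok = within_tolerance(readings)
```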
A mathematical theory of resolution limits for super-resolution of positive sources
The super-resolving capacity for number and location recovery in the
super-resolution of positive sources is analyzed in this work. Specifically, we
introduce computational resolution limits for number detection and location
recovery, respectively, in the one-dimensional super-resolution problem
and quantitatively characterize their dependency on the cutoff frequency,
signal-to-noise ratio, and the sparsity of the sources. As a direct
consequence, we show that targeting the sparsest positive solution in the
super-resolution already provides the optimal resolution order. These results
are generalized to multi-dimensional spaces. Our estimates indicate that there
exist phase transitions in the corresponding reconstructions, which are
confirmed by numerical experiments. Our theory fills in an important piece of
the puzzle towards fully understanding the super-resolution of positive sources.
Investigating the Margins: Bernard of Parma’s Glossa ordinaria on Religious Marginality in the High Middle Ages
The Glossa ordinaria compiled by Bernard of Parma (d. 1266) on Pope Gregory IX’s 1234 Decretales, commonly known as the Liber extra, is among the most influential canon law commentaries during the High and Late Middle Ages. Interrogating this source, this dissertation examines the legal status of selected marginal religious groups in medieval Europe—apostates, heretics, Jews, Muslims, and practitioners of magic. Soon after its emergence, Bernard’s Glossa was studied by law school students—that is, future Church judges, lawyers, inquisitors, and even popes—from the mid-thirteenth century on, and was the standard commentary copied into the margins of manuscripts of the Decretales. Yet, modern scholarship ignores this source almost entirely.
This study treats this issue through transcription, translation, comparison, and analysis of texts from selected medieval manuscripts of the Decretales and the Glossa, including the earliest surviving exemplars (c. 1240). It explicates the Romano-canonical judicial terminology and principles employed by the Glossa. Furthermore, it scrutinizes the Glossa’s manner of using legal allegations and tracks the excerpts which it inherits from commentarial literature.
Finally, it examines how the Glossa treats the selected marginal religious groups, and thus uncovers how this source can serve as a window into medieval society from the perspective of the learned or academic law. More broadly, this work contributes to a fuller understanding of the development of medieval canonical science, the operation of the ecclesiastical-legal system, and the mechanism through which the institutional Church defined its own religious boundaries.
Classification of C3 and C4 Vegetation Types Using MODIS and ETM+ Blended High Spatio-Temporal Resolution Data
The distribution of C3 and C4 vegetation plays an important role in the global carbon cycle and climate change. Knowledge of the distribution of C3 and C4 vegetation at a high spatial resolution over local or regional scales helps us to understand their ecological functions and climate dependencies. In this study, we classified C3 and C4 vegetation at a high resolution for spatially heterogeneous landscapes. First, we generated a land surface reflectance dataset with high spatial and temporal resolution by blending MODIS (Moderate Resolution Imaging Spectroradiometer) and ETM+ (Enhanced Thematic Mapper Plus) data. The blended data exhibited a high correlation (R² = 0.88) with the satellite-derived ETM+ data. Time-series NDVI (Normalized Difference Vegetation Index) data were then generated from the blended high spatio-temporal resolution data to capture the phenological differences between the C3 and C4 vegetation. The time-series NDVI revealed that the C3 vegetation turns green earlier in spring and senesces later in autumn than the C4 vegetation, while the C4 vegetation has a higher NDVI value than the C3 vegetation during summer. Based on these distinguishing characteristics, the time-series NDVI was used to extract the C3 and C4 classification features. Five features were selected from the 18 classification features according to the ground investigation data and subsequently used for the C3 and C4 classification. The overall accuracy of the C3 and C4 vegetation classification was 85.75%, with a kappa of 0.725, in our study area.
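For readers unfamiliar with NDVI, the standard definition is (NIR - Red) / (NIR + Red), computed from near-infrared and red reflectance. The sketch below applies it to a short time series. The function name, the zero-denominator convention, and the reflectance values are illustrative assumptions, not data from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # assumption: treat a zero denominator as NDVI = 0
    return (nir - red) / total


# Hypothetical (nir, red) reflectance pairs for one pixel at successive dates.
observations = [(0.30, 0.20), (0.45, 0.12), (0.50, 0.10)]
ndvi_series = [ndvi(nir, red) for nir, red in observations]
```

A per-pixel series like `ndvi_series` is the kind of input from which phenological features such as green-up timing could then be derived.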
SISSA: Real-time Monitoring of Hardware Functional Safety and Cybersecurity with In-vehicle SOME/IP Ethernet Traffic
Scalable service-Oriented Middleware over IP (SOME/IP) is a standard Ethernet
communication protocol in the Automotive Open System Architecture
(AUTOSAR), promoting ECU-to-ECU communication over the IP stack. However,
SOME/IP lacks a robust security architecture, making it susceptible to
potential attacks. In addition, random hardware failure of an ECU will disrupt SOME/IP
communication. In this paper, we propose SISSA, a SOME/IP communication
traffic-based approach for modeling and analyzing in-vehicle functional safety
and cyber security. Specifically, SISSA models hardware failures with the
Weibull distribution and addresses five potential attacks on SOME/IP
communication, including Distributed Denial-of-Services, Man-in-the-Middle, and
abnormal communication processes, assuming a malicious user accesses the
in-vehicle network. Subsequently, SISSA designs a series of deep learning
models with various backbones to extract features from SOME/IP sessions among
ECUs. We adopt residual self-attention to accelerate the model's convergence
and enhance detection accuracy, determining whether an ECU is under attack,
facing functional failure, or operating normally. Additionally, we have created
and annotated a dataset encompassing various classes, including indicators of
attack, functionality, and normalcy. This contribution is noteworthy due to the
scarcity of publicly accessible datasets with such characteristics. Extensive
experimental results show the effectiveness and efficiency of SISSA.
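As a rough illustration of modeling hardware failures with the Weibull distribution, the sketch below evaluates the standard two-parameter Weibull reliability function and draws random failure times. It is not SISSA's actual model: the function names and the scale and shape values are assumptions chosen only for the example.

```python
import math
import random


def weibull_reliability(t, scale, shape):
    """Probability that a component is still working at time t (two-parameter Weibull)."""
    return math.exp(-((t / scale) ** shape))


def sample_failure_times(n, scale, shape, seed=0):
    """Draw n random failure times; random.weibullvariate takes (scale, shape)."""
    rng = random.Random(seed)
    return [rng.weibullvariate(scale, shape) for _ in range(n)]


# Hypothetical parameters: characteristic life of 1000 hours, wear-out behaviour (shape > 1).
survival_at_500h = weibull_reliability(500.0, scale=1000.0, shape=1.5)
failure_times = sample_failure_times(5, scale=1000.0, shape=1.5)
```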
Investigating the Fairness of Large Language Models for Predictions on Tabular Data
Recent literature has suggested the potential of using large language models
(LLMs) to make predictions for tabular tasks. However, LLMs have been shown to
exhibit harmful social biases that reflect the stereotypes and inequalities
present in society. Given this, and given the widespread use of tabular
data in many high-stakes applications, it is imperative to explore the following
questions: what sources of information do LLMs draw upon when making
predictions for tabular tasks; whether and to what extent are LLM predictions
for tabular tasks influenced by social biases and stereotypes; and what are the
consequential implications for fairness? Through a series of experiments, we
delve into these questions and show that LLMs tend to inherit social biases
from their training data, which significantly impact their fairness in tabular
prediction tasks. Furthermore, our investigations show that in the context of
bias mitigation, though in-context learning and fine-tuning have a moderate
effect, the fairness metric gap between different subgroups is still larger
than that in traditional machine learning models, such as Random Forest and
shallow Neural Networks. This observation emphasizes that the social biases are
inherent within the LLMs themselves and inherited from their pre-training
corpus, not only from the downstream task datasets. In addition, we demonstrate
that label-flipping of in-context examples can significantly reduce biases,
further highlighting the presence of inherent bias within LLMs.
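One common way to quantify a fairness gap between subgroups is the statistical parity difference: the difference in positive-prediction rates between two groups. The abstract does not say which fairness metrics were used, so the metric choice, the function names, and the toy data below are all assumptions for illustration.

```python
def positive_rate(predictions):
    """Fraction of binary predictions (0 or 1) that are positive."""
    return sum(predictions) / len(predictions)


def statistical_parity_gap(predictions, groups, group_a, group_b):
    """Positive-prediction rate of group_a minus that of group_b."""
    preds_a = [p for p, g in zip(predictions, groups) if g == group_a]
    preds_b = [p for p, g in zip(predictions, groups) if g == group_b]
    return positive_rate(preds_a) - positive_rate(preds_b)


# Toy example: model predictions and a hypothetical binary subgroup label.
predictions = [1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "b", "b", "b"]
gap = statistical_parity_gap(predictions, groups, "a", "b")
```

A gap of zero would mean both subgroups receive positive predictions at the same rate; larger absolute values indicate a larger disparity.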
Linking in situ LAI and Fine Resolution Remote Sensing Data to Map Reference LAI over Cropland and Grassland Using Geostatistical Regression Method
Leaf Area Index (LAI) is an important parameter of vegetation structure. A number of moderate-resolution LAI products have been produced to meet the urgent need for large-scale vegetation monitoring. High-resolution LAI reference maps are necessary to validate these LAI products. This study used a geostatistical regression (GR) method to estimate LAI reference maps by linking in situ LAI with Landsat TM/ETM+ and SPOT-HRV data over two cropland and two grassland sites. To explore how the choice of vegetation index (VI) affects the estimated LAI reference maps, this study established GR models for different VIs, including the difference vegetation index (DVI), normalized difference vegetation index (NDVI), and ratio vegetation index (RVI). To further assess the performance of the GR model, the results from the GR and Reduced Major Axis (RMA) models were compared. The results show that the performance of the GR model varies between the cropland and grassland sites. At the cropland sites, the GR model based on DVI provides the best estimation, while at the grassland sites, the GR model based on DVI performs poorly. Compared to the RMA model, the GR model improves the accuracy of reference LAI maps in terms of root mean square errors (RMSE) and bia
- …