Model People Auscultation System Based on Capacitive Sensor
Medical teaching requires auscultation training, so a manikin-based auscultation training system was designed on the capacitive sensing principle. A PIC32 CPU with a charge-time measurement unit served as the system core. Capacitive sensors were placed at different parts of the manikin; the sampled signals were digitized and processed, with a jitter-cancellation algorithm and dynamic average filtering used to improve signal quality, and the corresponding simulated auscultation audio was then played. At the same time, the acquired data were sent to a workstation through a ZigBee RF module for further processing. Experimental results showed that the system could simulate the audio signals of the different parts of the manikin and was useful for improving training effectiveness; the dynamic average filtering and jitter-cancellation algorithms played an important role in keeping the system stable.
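As a rough illustration of the two algorithms named above, the Python sketch below combines a moving-average filter with a simple change-threshold jitter cancellation; the window size, threshold, and all names are assumptions, since the abstract gives no implementation details.

    from collections import deque

    class CapacitiveFilter:
        """Dynamic moving-average filter with simple jitter cancellation.

        Illustrative only: window size and jitter threshold are assumed,
        not taken from the paper.
        """

        def __init__(self, window=8, jitter_threshold=3.0):
            self.samples = deque(maxlen=window)
            self.jitter_threshold = jitter_threshold
            self.last_output = None

        def update(self, raw_value):
            self.samples.append(raw_value)
            avg = sum(self.samples) / len(self.samples)
            # Jitter cancellation: ignore changes smaller than the threshold
            # so the output does not flicker around a stable touch level.
            if self.last_output is None or abs(avg - self.last_output) >= self.jitter_threshold:
                self.last_output = avg
            return self.last_output

    f = CapacitiveFilter()
    for raw in [100, 102, 99, 101, 140, 142, 141, 143]:
        print(f.update(raw))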
Causality-Aided Trade-off Analysis for Machine Learning Fairness
There has been an increasing interest in enhancing the fairness of machine
learning (ML). Despite the growing number of fairness-improving methods, we
lack a systematic understanding of the trade-offs among factors considered in
the ML pipeline when fairness-improving methods are applied. This understanding
is essential for developers to make informed decisions regarding the provision
of fair ML services. Nonetheless, it is extremely difficult to analyze the
trade-offs when there are multiple fairness parameters and other crucial
metrics involved, coupled, and even in conflict with one another.
This paper uses causality analysis as a principled method for analyzing
trade-offs between fairness parameters and other crucial metrics in ML
pipelines. To practically and effectively conduct causality analysis, we propose
a set of domain-specific optimizations to facilitate accurate causal discovery
and a unified, novel interface for trade-off analysis based on well-established
causal inference methods. We conduct a comprehensive empirical study using
three real-world datasets on a collection of widely used fairness-improving
techniques. Our study obtains actionable suggestions for users and developers
of fair ML. We further demonstrate the versatile usage of our approach in
selecting the optimal fairness-improving method, paving the way for more
ethical and socially responsible AI technologies.
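The abstract does not describe the optimizations or the interface concretely; as a minimal sketch of the underlying causal-inference step, the Python example below estimates the effect of applying a fairness-improving method on accuracy by backdoor adjustment over a single observed confounder. The variable names and synthetic data are hypothetical.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    n = 1000
    # Hypothetical observational data: dataset "difficulty" confounds both
    # whether a fairness-improving method is applied and the final accuracy.
    difficulty = rng.integers(0, 2, n)
    treated = (rng.random(n) < 0.3 + 0.4 * difficulty).astype(int)
    accuracy = 0.85 - 0.10 * difficulty - 0.03 * treated + rng.normal(0, 0.01, n)

    df = pd.DataFrame({"difficulty": difficulty, "treated": treated, "accuracy": accuracy})

    # Backdoor adjustment: average the stratum-wise treated/untreated contrast
    # over the marginal distribution of the confounder.
    ate = 0.0
    for stratum, grp in df.groupby("difficulty"):
        contrast = grp[grp.treated == 1].accuracy.mean() - grp[grp.treated == 0].accuracy.mean()
        ate += contrast * (len(grp) / n)
    print(f"Estimated accuracy cost of the fairness method: {ate:.3f}")

The naive treated-vs-untreated accuracy gap would be biased here, because difficulty both lowers accuracy and raises the chance of treatment; the stratified estimate recovers the assumed -0.03 causal effect, which is exactly the kind of trade-off signal such an analysis is after.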
Benchmarking and Explaining Large Language Model-based Code Generation: A Causality-Centric Approach
While code generation has been widely used in various software development
scenarios, the quality of the generated code is not guaranteed. This has been a
particular concern in the era of large language model (LLM)-based code
generation, where an LLM, deemed a complex and powerful black-box model, is
instructed by a high-level natural language specification, namely a prompt, to
generate code. Nevertheless, effectively evaluating and explaining the code
generation capability of LLMs is inherently challenging, given the complexity
of LLMs and the lack of transparency.
Inspired by the recent progress in causality analysis and its application in
software engineering, this paper launches a causality analysis-based approach
to systematically analyze the causal relations between the LLM input prompts
and the generated code. To handle various technical challenges in this study,
we first propose a novel causal graph-based representation of the prompt and
the generated code, which is established over the fine-grained,
human-understandable concepts in the input prompts. The formed causal graph is
then used to identify the causal relations between the prompt and the derived
code. We illustrate the insights that our framework can provide by studying
three popular LLMs with over 12 prompt adjustment strategies. The results of
these studies illustrate the potential of our technique to provide insights
into LLM effectiveness, and aid end-users in understanding predictions.
Additionally, we demonstrate that our approach provides actionable insights to
improve the quality of the LLM-generated code by properly calibrating the
prompt.
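The paper's graph construction is not spelled out in the abstract; purely as an illustration of the representation, the Python sketch below encodes prompt concepts and generated-code properties as nodes of a directed causal graph, using networkx for convenience. Every node and edge here is a hypothetical example, not the paper's actual graph.

    import networkx as nx

    # Hypothetical causal graph: the left-hand nodes are fine-grained prompt
    # concepts, the right-hand nodes are measurable properties of the code.
    g = nx.DiGraph()
    g.add_edges_from([
        ("function_signature_in_prompt", "compilable"),
        ("example_io_in_prompt", "passes_tests"),
        ("docstring_style", "passes_tests"),
        ("compilable", "passes_tests"),
    ])

    # Causal relations a downstream analysis would try to confirm or refute:
    for concept in ["function_signature_in_prompt", "example_io_in_prompt"]:
        print(concept, "->", sorted(nx.descendants(g, concept)))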
An Empirical Study on Large Language Models in Accuracy and Robustness under Chinese Industrial Scenarios
Recent years have witnessed the rapid development of large language models
(LLMs) in various domains. To better serve the large number of Chinese users,
many commercial vendors in China have adopted localization strategies, training
and providing local LLMs specifically customized for Chinese users.
Furthermore, looking ahead, one of the key future applications of LLMs will be
practical deployment in industrial production by enterprises and users in those
sectors. However, the accuracy and robustness of LLMs in industrial scenarios
have not been well studied. In this paper, we present a comprehensive empirical
study on the accuracy and robustness of LLMs in the context of the Chinese
industrial production area. We manually collected 1,200 domain-specific
problems from 8 different industrial sectors to evaluate LLM accuracy.
Furthermore, we designed a metamorphic testing framework containing four
industrial-specific stability categories with eight abilities, totaling 13,631
questions with variants to evaluate LLM robustness. In total, we evaluated nine
different LLMs developed by Chinese vendors, as well as four different LLMs
developed by global vendors. Our major findings include: (1) Current LLMs
exhibit low accuracy in Chinese industrial contexts, with all LLMs scoring less
than 0.6. (2) The robustness scores vary across industrial sectors, and local
LLMs overall perform worse than global ones. (3) LLM robustness differs
significantly across abilities. Global LLMs are more robust under
logical-related variants, while advanced local LLMs perform better on problems
related to understanding Chinese industrial terminology. Our study results
provide valuable guidance for understanding and promoting the industrial domain
capabilities of LLMs from both development and industrial enterprise
perspectives. The results further motivate possible research directions and
tooling support.
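The framework itself is not shown in the abstract; the Python sketch below illustrates the general metamorphic-testing idea behind such robustness evaluation, where an answer should be invariant under a meaning-preserving rewrite of the question. The query function and the synonym rule are hypothetical placeholders.

    # A robust model should answer a question and its meaning-preserving
    # variant identically; disagreement counts as a robustness failure.
    SYNONYMS = {"torque": "rotational force"}  # hypothetical rewrite rule

    def make_variant(question: str) -> str:
        for term, synonym in SYNONYMS.items():
            question = question.replace(term, synonym)
        return question

    def query_llm(question: str) -> str:
        # Placeholder: substitute a real LLM API call here.
        return "42 N*m"

    def is_robust(question: str) -> bool:
        # Metamorphic relation: the answer must survive the rewrite.
        return query_llm(question) == query_llm(make_variant(question))

    print(is_robust("What torque does the motor shaft require?"))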
Synteny analysis in Rosids with a walnut physical map reveals slow genome evolution in long-lived woody perennials.
Background: Mutations often accompany DNA replication. Since there may be fewer cell cycles per year in the germlines of long-lived than short-lived angiosperms, the genomes of long-lived angiosperms may be diverging more slowly than those of short-lived angiosperms. Here we test this hypothesis.

Results: We first constructed a genetic map for walnut, a woody perennial. All linkage groups were short, and recombination rates were greatly reduced in the centromeric regions. We then used the genetic map to construct a walnut bacterial artificial chromosome (BAC) clone-based physical map, which contained 15,203 exonic BAC-end sequences, and used it to quantify synteny between the walnut genome and the genomes of three long-lived woody perennials, Vitis vinifera, Populus trichocarpa, and Malus domestica, and three short-lived herbs, Cucumis sativus, Medicago truncatula, and Fragaria vesca. Every measure of synteny we used showed that the genomes of the woody perennials were less diverged from the walnut genome than those of the herbs. We also estimated the nucleotide substitution rate at silent codon positions in the walnut lineage. It was one-fifth and one-sixth of the published nucleotide substitution rates in the Medicago and Arabidopsis lineages, respectively. We uncovered a whole-genome duplication in the walnut lineage, dated it to the neighborhood of the Cretaceous-Tertiary boundary, and allocated the 16 walnut chromosomes into eight homoeologous pairs. We pointed out that during polyploidy-dysploidy cycles, the dominant tendency is to reduce the chromosome number.

Conclusion: Slow rates of nucleotide substitution are accompanied by slow rates of synteny erosion during genome divergence in woody perennials.
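For context on the rate estimate mentioned above, the standard relation for converting silent-site divergence into a per-year substitution rate (a textbook identity, not a formula quoted from the paper) is:

    % K_s: observed substitutions per silent site between two lineages
    % T:   time since their divergence, in years
    % the factor 2 counts substitutions accumulating along both lineages
    r = \frac{K_s}{2T}

A smaller r for the walnut lineage, at a fixed divergence time, therefore directly implies the slower synteny erosion reported in the conclusion.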
On the Feasibility of Specialized Ability Stealing for Large Language Code Models
Recent progress in large language code models (LLCMs) has led to a dramatic
surge in their use in software development. Nevertheless, it is widely known that
training a well-performing LLCM requires substantial human effort for data
collection and high-quality annotation. Additionally, the training dataset may be
proprietary (or partially open source to the public), and the training process
is often conducted on a large-scale cluster of GPUs with high costs. Inspired
by the recent success of imitation attacks in stealing computer vision and
natural language models, this work launches the first imitation attack on
LLCMs: by querying a target LLCM with carefully-designed queries and collecting
the outputs, the adversary can train an imitation model whose behavior closely
matches that of the target LLCM. We systematically investigate the effectiveness
of launching imitation attacks under different query schemes and different LLCM
tasks. We also design novel methods to polish the LLCM outputs, resulting in an
effective imitation training process. We summarize our findings and provide
lessons harvested in this study that can help better depict the attack surface
of LLCMs. Our research contributes to the growing body of knowledge on
imitation attacks and defenses in deep neural models, particularly in the
domain of code-related tasks.
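The abstract leaves the query schemes and polishing methods unspecified; the Python skeleton below shows the general shape of the imitation-attack loop. The query function, the example prompt, and the trivial polishing step are placeholders, not the paper's actual design.

    def query_target_llcm(prompt: str) -> str:
        # Placeholder: call the victim model's public API here.
        return "def add(a, b):\n    return a + b"

    def polish(output: str) -> str:
        # The paper designs dedicated polishing methods; whitespace
        # stripping is only a trivial stand-in.
        return output.strip()

    # Carefully designed queries would go here.
    prompts = ["Write a Python function that adds two numbers."]
    imitation_data = [(p, polish(query_target_llcm(p))) for p in prompts]

    # The collected pairs would then fine-tune a local model, e.g.:
    # finetune(local_model, imitation_data)  # hypothetical trainer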
VRPTEST: Evaluating Visual Referring Prompting in Large Multimodal Models
With recent advancements in Large Multimodal Models (LMMs) across various
domains, a novel prompting method called visual referring prompting has
emerged, showing significant potential in enhancing human-computer interaction
within multimodal systems. This method offers a more natural and flexible
approach to human interaction with these systems compared to traditional text
descriptions or coordinates. However, the categorization of visual referring
prompting remains undefined, and its impact on the performance of LMMs has yet
to be formally examined. In this study, we conduct the first comprehensive
analysis of LMMs using a variety of visual referring prompting strategies. We
introduce a benchmark dataset called VRPTEST, comprising 3 different visual
tasks and 2,275 images, spanning diverse combinations of prompt strategies.
Using VRPTEST, we conduct a comprehensive evaluation of eight versions of
prominent open-source and proprietary foundation models, including two early
versions of GPT-4V. We develop an automated assessment framework based on
software metamorphic testing techniques to evaluate the accuracy of LMMs
without the need for human intervention or manual labeling. We find that the
current proprietary models generally outperform the open-source ones, showing
an average accuracy improvement of 22.70%; however, there is still potential
for improvement. Moreover, our quantitative analysis shows that the choice of
prompt strategy significantly affects the accuracy of LMMs, with variations
ranging from -17.5% to +7.3%. Further case studies indicate that an appropriate
visual referring prompting strategy can improve LMMs' understanding of context
and location information, while an unsuitable one might lead to answer
rejection. We also provide insights on minimizing the negative impact of visual
referring prompting on LMMs.
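VRPTEST's metamorphic relations are not detailed in the abstract; the Python sketch below shows the shape of one such label-free check, where two visual referring styles that mark the same region should yield the same answer. All three helper functions are hypothetical placeholders.

    def mark_with_box(image, region):
        ...  # placeholder: draw a bounding box around the region

    def mark_with_arrow(image, region):
        ...  # placeholder: draw an arrow pointing at the region

    def query_lmm(image, question):
        ...  # placeholder: call the large multimodal model

    def consistent(image, region, question) -> bool:
        # Metamorphic relation: two equivalent referring styles must not
        # change the answer, so disagreement flags an error with no
        # human labeling required.
        a = query_lmm(mark_with_box(image, region), question)
        b = query_lmm(mark_with_arrow(image, region), question)
        return a == b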