Examining Scientific Writing Styles from the Perspective of Linguistic Complexity
Publishing articles in high-impact English journals is difficult for scholars
around the world, especially for non-native English-speaking scholars (NNESs),
most of whom struggle with proficiency in English. In order to uncover the
differences in English scientific writing between native English-speaking
scholars (NESs) and NNESs, we collected a large-scale data set containing more
than 150,000 full-text articles published in PLoS between 2006 and 2015. We
divided these articles into three groups according to the ethnic backgrounds of
the first and corresponding authors, obtained by Ethnea, and examined the
scientific writing styles in English from a two-fold perspective of linguistic
complexity: (1) syntactic complexity, including measurements of sentence length
and sentence complexity; and (2) lexical complexity, including measurements of
lexical diversity, lexical density, and lexical sophistication. The
observations suggest marginal differences between groups in syntactical and
lexical complexity. Comment: 6 figures
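The abstract names its measures without defining them. As a rough illustration only, the sketch below computes three common proxies: per-sentence word counts (sentence length), type-token ratio (lexical diversity), and the share of non-function words (lexical density). The tokenizer, the tiny `FUNCTION_WORDS` stop list, and all function names are assumptions made for this example, not the measures or tooling used in the paper.

```python
import re

# Hypothetical, deliberately tiny list of function words (an assumption for
# this sketch; a real study would use a POS tagger or a curated list).
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "on", "to", "and", "or", "is", "are",
    "was", "were", "it", "that", "this", "with", "for", "by", "as", "at", "be",
}

def tokenize(text):
    """Lowercased alphabetic tokens (apostrophes kept inside words)."""
    return [t.lower() for t in re.findall(r"[A-Za-z']+", text)]

def sentence_lengths(text):
    """Word count of each non-empty sentence, splitting on '.', '!' and '?'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(tokenize(s)) for s in sentences]

def type_token_ratio(text):
    """Lexical diversity proxy: distinct tokens divided by total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def lexical_density(text):
    """Lexical density proxy: share of tokens that are not function words."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)
```

Type-token ratio is sensitive to text length, so comparisons across full-text articles would normally need a length-normalized variant; this sketch does not attempt that.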
Mitigating Propagation Failures in PINNs using Evolutionary Sampling
Despite the success of physics-informed neural networks (PINNs) in
approximating partial differential equations (PDEs), it is known that PINNs can
sometimes fail to converge to the correct solution in problems involving
complicated PDEs. This is reflected in several recent studies on characterizing
and mitigating the "failure modes" of PINNs. While most of these studies have
focused on balancing loss functions or adaptively tuning PDE coefficients, what
is missing is a thorough understanding of the connection between failure modes
of PINNs and sampling strategies used for training PINNs. In this paper, we
provide a novel perspective on failure modes of PINNs by hypothesizing that the
training of PINNs relies on successful "propagation" of the solution from initial
and/or boundary condition points to interior points. We show that PINNs with
poor sampling strategies can get stuck at trivial solutions if there are
propagation failures. We additionally demonstrate that propagation failures are
characterized by highly imbalanced PDE residual fields where very high
residuals are observed over very narrow regions. To mitigate propagation
failures, we propose a novel evolutionary sampling (Evo) method that can
incrementally accumulate collocation points in regions of high PDE residuals
with little to no computational overhead. We provide an extension of Evo to
respect the principle of causality while solving time-dependent PDEs. We
theoretically analyze the behavior of Evo and empirically demonstrate its
efficacy and efficiency in comparison with baselines on a variety of PDE
problems. Comment: 34 pages, 46 figures, 2 tables
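The abstract says only that Evo incrementally accumulates collocation points in regions of high PDE residuals. The sketch below shows one way such a step could look: points whose absolute residual exceeds the population mean are retained, and the rest are replaced by fresh samples so the population size stays fixed. The retention rule, the 1D domain, `toy_residual`, and `evolve_points` are all assumptions for illustration; the abstract does not confirm that this is the paper's update rule.

```python
import math
import random

def toy_residual(x):
    """Stand-in for a PDE residual: a narrow peak around x = 0.5 (assumed)."""
    return math.exp(-((x - 0.5) / 0.05) ** 2)

def evolve_points(points, residual_fn, sample_fn):
    """One sampling step: keep points with above-average |residual|,
    then refill with fresh samples so the population size is unchanged."""
    scores = [abs(residual_fn(p)) for p in points]
    threshold = sum(scores) / len(scores)
    retained = [p for p, s in zip(points, scores) if s > threshold]
    fresh = [sample_fn() for _ in range(len(points) - len(retained))]
    return retained + fresh

# Usage: start from uniform samples on [0, 1) and apply a few steps.
rng = random.Random(0)
points = [rng.random() for _ in range(200)]
for _ in range(20):
    points = evolve_points(points, toy_residual, rng.random)
```

In a real PINN, `residual_fn` would evaluate the network's PDE residual at each collocation point, and the step would run between training iterations. Because scores are already needed for the loss, the extra cost is small, which matches the abstract's claim of little to no overhead.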
Beyond Discriminative Regions: Saliency Maps as Alternatives to CAMs for Weakly Supervised Semantic Segmentation
In recent years, several Weakly Supervised Semantic Segmentation (WS3)
methods have been proposed that use class activation maps (CAMs) generated by a
classifier to produce pseudo-ground truths for training segmentation models.
While CAMs are good at highlighting discriminative regions (DR) of an image,
they are known to disregard regions of the object that do not contribute to the
classifier's prediction, termed non-discriminative regions (NDR). In contrast,
attribution methods such as saliency maps provide an alternative approach for
assigning a score to every pixel based on its contribution to the
classification prediction. This paper provides a comprehensive comparison
between saliencies and CAMs for WS3. Our study includes multiple perspectives
on understanding their similarities and dissimilarities. Moreover, we provide
new evaluation metrics that perform a comprehensive assessment of WS3
performance of alternative methods w.r.t. CAMs. We demonstrate the
effectiveness of saliencies in addressing the limitation of CAMs through our
empirical studies on benchmark datasets. Furthermore, we propose random
cropping as a stochastic aggregation technique that improves the performance of
saliency, making it a strong alternative to CAM for WS3. Comment: 24 pages, 13 figures, 4 tables
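The abstract does not spell out how random cropping is used as a stochastic aggregation technique. One plausible reading is sketched below: compute a saliency map on several random crops, paste each back at its location, and average per pixel over the crops that covered it. The function name, the square crop, the nested-list image format, and the averaging rule are assumptions for this example, and `saliency_fn` is a placeholder for whatever attribution method is in use.

```python
import random

def aggregate_over_crops(image, saliency_fn, crop, n_crops, rng):
    """Average per-pixel saliency over random square crops.

    image: 2D list of numbers; crop: side length, assumed <= both dimensions.
    saliency_fn: maps a crop (2D list) to a same-shaped 2D list of scores.
    Pixels never covered by any crop get a score of 0.0.
    """
    h, w = len(image), len(image[0])
    total = [[0.0] * w for _ in range(h)]
    count = [[0] * w for _ in range(h)]
    for _ in range(n_crops):
        top = rng.randrange(h - crop + 1)
        left = rng.randrange(w - crop + 1)
        patch = [row[left:left + crop] for row in image[top:top + crop]]
        sal = saliency_fn(patch)
        for i in range(crop):
            for j in range(crop):
                total[top + i][left + j] += sal[i][j]
                count[top + i][left + j] += 1
    return [
        [total[i][j] / count[i][j] if count[i][j] else 0.0 for j in range(w)]
        for i in range(h)
    ]
```

Averaging by coverage count keeps pixels near the border, which fall inside fewer crops, from being systematically under-scored.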
Stereotypical Images of STEM Professionals and STEM Career Interests in Chinese Elementary School Students
This study investigated stereotypical images of STEM professions and STEM career interests among Chinese elementary school students. The relationships between stereotypical images of STEM professionals and STEM career interests were also examined. Data were gathered from two elementary schools in China, forming a convenience sample of 318 students enrolled in 3rd to 6th grade. Quantitative data on stereotypes about STEM professionals’ social skills, positive images of STEM professionals, views on STEM implications for society, and STEM career interests were collected with a Likert-scale questionnaire. Follow-up structured interviews were conducted with 12 participants. Elementary school students held strong stereotypes about STEM professionals’ social skills, a somewhat positive image of STEM professionals, and very positive views on STEM implications for society. However, their STEM career interests were not very high. In addition, students’ stereotypes about STEM professionals’ social skills had a minor negative effect on their STEM career interests, while their positive image of STEM professionals and their views on STEM implications for society were significantly correlated with their STEM career interests.
Bias Assessment and Mitigation in LLM-based Code Generation
Utilizing state-of-the-art Large Language Models (LLMs), automatic code
generation models play a pivotal role in enhancing the productivity and
efficiency of software development coding procedures. As the adoption of LLMs
becomes more widespread in software coding ecosystems, a pressing issue has
emerged: does the generated code contain social biases, such as those related
to age, gender, and race? This issue concerns the integrity, fairness, and
ethical foundation of software applications that depend on the code generated
by these models, yet is under-explored in the literature. This paper presents a
novel bias assessment framework that is specifically designed for code
generation tasks. Based on this framework, we conduct an extensive evaluation
on the bias of nine state-of-the-art LLM-based code generation models. Our
findings reveal that 31.45% to 79.93% of the code functions generated by the
evaluated code generation models are biased, and that the functionality of
9.68% to 37.37% of code functions is affected by the bias. Biases therefore not only
exist in code generation models but, in some cases, directly affect the
functionality of the generated code, posing risks of unintended and possibly
harmful software behaviors. To mitigate bias in code generation models, we
propose three mitigation strategies, which can decrease the biased code ratio
to a very low level of 0.4% to 4.57%.