A Comprehensive Analysis of Team Streakiness in Major League Baseball: 1962-2016
A baseball team would be considered “streaky” if its record exhibits an unusually high number of consecutive wins or losses, compared to what might be expected if the team’s performance does not really depend on whether or not they won their previous game. If an average team in Major League Baseball (i.e., with a record of 81-81) is not streaky, we assume its win probability would be stable at around 50% for most games, outside of peculiar details of day-to-day outcomes, such as whether the game is home or away, who is the starting pitcher, and so on.
In this paper, we investigate win outcomes for every major league team between 1962 (the year both leagues expanded to play 162 games per season) and the present in order to find out if teams exhibit any significant streakiness. We use a statistical “runs test” based on the observed sequences of winning streaks and losing streaks accumulated during the season. Overall, our findings are consistent with what we would expect if no teams exhibited a nonrandom streakiness that belied their overall record. That is, major league baseball teams, as a whole, are not streaky.
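The abstract does not say which form of the runs test was used. As a minimal sketch, assuming the classical Wald-Wolfowitz runs test with a normal approximation, a season can be encoded as a sequence of 'W' and 'L' outcomes and scored as follows. The function name `runs_test` and the encoding are illustrative assumptions, not taken from the paper.

```python
import math


def runs_test(outcomes):
    """Wald-Wolfowitz runs test on a sequence of 'W'/'L' outcomes.

    Returns (runs, expected_runs, z). A strongly negative z means fewer
    runs than chance would give, i.e. a streakier-than-random season.
    Assumes the sequence contains only 'W' and 'L'.
    """
    n_w = outcomes.count("W")
    n_l = outcomes.count("L")
    n = n_w + n_l
    if n_w == 0 or n_l == 0 or n < 3:
        raise ValueError("need at least 3 games with both a win and a loss")

    # A run is a maximal block of identical consecutive outcomes.
    runs = 1 + sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)

    # Mean and variance of the run count under the no-streakiness null.
    expected = 2 * n_w * n_l / n + 1
    variance = 2 * n_w * n_l * (2 * n_w * n_l - n) / (n ** 2 * (n - 1))

    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z


if __name__ == "__main__":
    # Hypothetical 81-81 season played as one long winning streak
    # followed by one long losing streak.
    print(runs_test("W" * 81 + "L" * 81))
```

Under this approximation, an 81-81 team is expected to produce about 82 runs in a 162-game season; substantially fewer runs would point toward streakiness, and substantially more toward unusually regular alternation.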
Trusta: Reasoning about Assurance Cases with Formal Methods and Large Language Models
Assurance cases can be used to argue for the safety of products in safety
engineering. In safety-critical areas, the construction of assurance cases is
indispensable. Trustworthiness Derivation Trees (TDTs) enhance assurance cases
by incorporating formal methods, making automatic reasoning about assurance
cases possible. We present Trustworthiness Derivation Tree Analyzer
(Trusta), a desktop application designed to automatically construct and verify
TDTs. The tool has a built-in Prolog interpreter in its backend, and is
supported by the constraint solvers Z3 and MONA. Therefore, it can solve
constraints about logical formulas involving arithmetic, sets, Horn clauses,
etc. Trusta also utilizes large language models to make the creation and
evaluation of assurance cases more convenient. It allows for interactive human
examination and modification. We evaluated top language models like
ChatGPT-3.5, ChatGPT-4, and PaLM 2 for generating assurance cases. Our tests
showed a 50%-80% similarity between machine-generated and human-created cases.
In addition, Trusta can extract formal constraints from text in natural
languages, facilitating an easier interpretation and validation process. This
extraction is subject to human review and correction, blending the best of
automated efficiency with human insight. To our knowledge, this marks the first
integration of large language models in automatically creating and reasoning
about assurance cases, bringing a novel approach to a traditional challenge.
In several industrial case studies, Trusta quickly found subtle issues that
are typically missed in manual inspection, demonstrating its practical value
in enhancing the assurance case development process.
Comment: 38 pages
Neural-Symbolic Entangled Framework for Complex Query Answering
Answering complex queries over knowledge graphs (KG) is an important yet
challenging task because of the KG incompleteness issue and cascading errors
during reasoning. Recent query embedding (QE) approaches embed the entities
and relations in a KG and the first-order logic (FOL) queries into a
low-dimensional space, answering queries by dense similarity search. However,
previous works mainly concentrate on the target answers, ignoring intermediate
entities' usefulness, which is essential for relieving the cascading error
problem in logical query answering. In addition, these methods are usually
designed with their own geometric or distributional embeddings to handle
logical operators like union, intersection, and negation, at the cost of
accuracy in the basic projection operator, and they cannot absorb
other embedding methods into their models. In this work, we propose a Neural and
Symbolic Entangled framework (ENeSy) for complex query answering, which enables
the neural and symbolic reasoning to enhance each other to alleviate the
cascading error and KG incompleteness. The projection operator in ENeSy could
be any embedding method with the capability of link prediction, and the other
FOL operators are handled without parameters. With both neural and symbolic
reasoning results contained, ENeSy answers queries in ensembles. ENeSy achieves
SOTA performance on several benchmarks, especially in the setting where the
model is trained only on the link prediction task.
Comment: Paper accepted by NeurIPS202
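The abstract states that the non-projection FOL operators are handled without parameters but does not give their form. One common parameter-free choice, shown here purely as an assumed illustration and not as ENeSy's actual definition, is product fuzzy logic over per-entity membership scores in [0, 1]. The function names below are hypothetical.

```python
def intersect(a, b):
    """Fuzzy AND: elementwise product of two membership vectors."""
    return [x * y for x, y in zip(a, b)]


def union(a, b):
    """Fuzzy OR: probabilistic sum, x + y - x*y, elementwise."""
    return [x + y - x * y for x, y in zip(a, b)]


def negate(a):
    """Fuzzy NOT: complement of each membership score."""
    return [1.0 - x for x in a]


if __name__ == "__main__":
    # Hypothetical membership scores for three entities under two sub-queries.
    q1 = [1.0, 0.5, 0.0]
    q2 = [0.5, 0.5, 1.0]
    print(intersect(q1, q2))
    print(union(q1, q2))
    print(negate(q1))
```

None of these operators carries trainable weights, so in a design of this kind only the projection step would need a learned link-prediction model.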
Self-Guard: Empower the LLM to Safeguard Itself
The jailbreak attack can bypass the safety measures of a Large Language Model
(LLM), generating harmful content. This misuse of LLMs has led to negative
societal consequences. Currently, there are two main approaches to address
jailbreak attacks: safety training and safeguards. Safety training focuses on
further training the LLM to enhance its safety. On the other hand, safeguards
involve implementing external models or filters to prevent harmful outputs.
However, safety training has constraints in its ability to adapt to new attack
types and often leads to a drop in model performance. Safeguards have proven to
be of limited help. To tackle these issues, we propose a novel approach called
Self-Guard, which combines the strengths of both safety methods. Self-Guard
includes two stages. In the first stage, we enhance the model's ability to
assess harmful content, and in the second stage, we instruct the model to
consistently perform harmful content detection on its own responses. The
experiment has demonstrated that Self-Guard is robust against jailbreak
attacks. In the bad case analysis, we find that the LLM occasionally provides
harmless responses to harmful queries. Additionally, we evaluated the general
capabilities of the LLM before and after safety training, providing evidence
that Self-Guard does not degrade the LLM's performance. In
sensitivity tests, Self-Guard not only avoids inducing over-sensitivity in the
LLM but can even mitigate this issue.
Political connections and tax-induced earnings management: Evidence from China
We use the occasion of a change in tax policy that raised the tax rate for many of the listed companies in China to examine tax-induced earnings management (TEM) from the perspective of political connections. We find that when the tax rate increased, only those affected firms with politically connected management engaged in TEM. This suggests that, in addition to the motivation for managing earnings, the capability to influence tax authorities is also an important determinant of TEM. We also find that TEM helped the firms with politically connected management to reduce their tax burden.
Cluster analysis for the overall health status of elderly, multimorbid patients with diabetes
Purpose: To evaluate the overall health status and health-related abilities and problems of elderly patients with diabetes and multimorbidity compared with those with diabetes only. Additionally, we aimed to identify different subgroups of elderly, multimorbid patients with diabetes.
Methods: This cross-sectional study included 538 elderly patients with diabetes. The participants completed a series of questionnaires on self-rated health (SRH), diabetes self-management, self-efficacy, health literacy, depression, and diabetes distress. Differences in health-related abilities and problems were compared between elderly patients with diabetes and multimorbidity and those with diabetes only, with adjustments for covariates using propensity score matching. A cluster analysis was also performed to identify the overall health status subgroups of elderly, multimorbid patients with diabetes. Additionally, we conducted a multinomial logistic regression analysis to examine the predictors of health-related abilities and problem-cluster group membership.
Results: Elderly patients with diabetes and multimorbidity experienced more health-related abilities and problems than those with diabetes only, particularly within the domains of depression (p < 0.001) and diabetes distress. The level of health literacy (p < 0.001) and self-management (p = 0.013) in elderly, multimorbid patients with diabetes was also significantly higher than that in elderly patients with diabetes only. Cluster analysis of elderly, multimorbid patients with diabetes revealed three distinct overall health status clusters. Multinomial logistic regression analysis indicated that age (OR = 1.090, p = 0.043), sex (OR = 0.503, p = 0.024), living situation (OR = 2.769, p = 0.011), BMI (OR = 0.838, p = 0.034), regular exercise (OR = 2.912, p = 0.041 in poor vs. good; OR = 3.510, p < 0.001 in intermediate vs. good), and cerebral infarction (OR = 26.280, p < 0.001) independently and significantly predicted cluster membership.
Conclusion: Compared with elderly patients with diabetes only, those with diabetes and multimorbidity experienced more health-related abilities and problems within the domains of depression and diabetes distress. Additionally, the level of health literacy and self-management in elderly, multimorbid patients with diabetes was significantly higher than that in those with diabetes only. Among the multimorbid diabetes group, old age, male sex, living without a partner, slightly lower BMIs, not exercising regularly, and experiencing cerebral infarctions were all positively correlated with worse overall health status.
The Invasive MED/Q Bemisia tabaci Genome: A Tale of Gene Loss and Gene Gain
Background: The sweetpotato whiteflies Bemisia tabaci MED/Q and MEAM1/B are two economically important invasive species that cause considerable damage to agricultural crops through direct feeding and indirect vectoring of plant pathogens. Recently, a draft genome of B. tabaci MED/Q has been assembled. In this study, we focus on the genomic comparison between MED/Q and MEAM1/B, with a special interest in MED/Q’s genomic signatures that may contribute to the highly invasive nature of this emerging insect pest.
Results: The genomes of both species share similarity in syntenic blocks, but have significant divergence in the gene coding sequence. Expansion of cytochrome P450 monooxygenases and UDP glycosyltransferases in the MED/Q and MEAM1/B genomes is functionally validated for mediating insecticide resistance in MED/Q using in vivo RNAi. The amino acid biosynthesis pathways in the MED/Q genome are partitioned among the host and endosymbiont genomes in a manner distinct from other hemipterans. Evidence of horizontal gene transfer to the host genome may explain their obligate relationship. Putative loss-of-function in the immune deficiency-signaling pathway due to gene loss is a shared ancestral trait among hemipteran insects.
Conclusions: The expansion of detoxification gene families, such as P450s, may contribute to the development of insecticide resistance traits and a broad host range in MED/Q and MEAM1/B, and facilitate the species’ invasions into intensively managed cropping systems. Numerical and compositional changes in multiple gene families (gene loss and gene gain) in the MED/Q genome set a foundation for future hypothesis testing that will advance our understanding of adaptation, viral transmission, symbiosis, and plant-insect-pathogen tritrophic interactions.