144 research outputs found

    A Comprehensive Analysis of Team Streakiness in Major League Baseball: 1962-2016

    A baseball team would be considered “streaky” if its record exhibits an unusually high number of consecutive wins or losses, compared to what might be expected if the team’s performance did not depend on whether or not it won its previous game. If an average team in Major League Baseball (i.e., with a record of 81-81) is not streaky, we assume its win probability would be stable at around 50% for most games, outside of peculiar details of day-to-day outcomes, such as whether the game is home or away, who is the starting pitcher, and so on. In this paper, we investigate win outcomes for every major league team between 1962 (the year both leagues expanded to play 162 games per season) and the present in order to find out if teams exhibit any significant streakiness. We use a statistical “runs test” based on the observed sequences of winning streaks and losing streaks accumulated during the season. Overall, our findings are consistent with what we would expect if no teams exhibited a nonrandom streakiness that belied their overall record. That is, major league baseball teams, as a whole, are not streaky.
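
The abstract above does not say which form of runs test the authors used. As a rough illustration only, the following sketch applies the standard Wald-Wolfowitz runs test with a normal approximation to a season written as a string of wins and losses. The function name `runs_test` and the `"W"`/`"L"` encoding are assumptions made for this example, not details from the paper.

```python
import math


def runs_test(outcomes):
    """Wald-Wolfowitz runs test on a sequence of 'W'/'L' outcomes.

    Hypothetical helper for illustration; returns (runs, expected_runs, z, p).
    A strongly negative z means fewer runs than chance predicts, i.e. a
    streaky record; a strongly positive z means unusually frequent switching.
    """
    n_w = outcomes.count("W")
    n_l = outcomes.count("L")
    n = n_w + n_l
    if n != len(outcomes):
        raise ValueError("outcomes may contain only 'W' and 'L'")
    if n_w == 0 or n_l == 0:
        raise ValueError("need at least one win and one loss")

    # A new run starts at every position where the outcome changes.
    runs = 1 + sum(1 for a, b in zip(outcomes, outcomes[1:]) if a != b)

    # Mean and variance of the run count if the order of games is random.
    expected = 2 * n_w * n_l / n + 1
    variance = 2 * n_w * n_l * (2 * n_w * n_l - n) / (n ** 2 * (n - 1))

    z = (runs - expected) / math.sqrt(variance)
    # Two-sided p-value from the normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2))
    return runs, expected, z, p


if __name__ == "__main__":
    # Toy 20-game record: ten straight wins, then ten straight losses.
    runs, expected, z, p = runs_test("W" * 10 + "L" * 10)
    print(runs, expected, round(z, 2), p < 0.05)
```

The normal approximation is only reasonable when both the win and loss counts are fairly large, which holds for a 162-game season but not for very short sequences.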

    Trusta: Reasoning about Assurance Cases with Formal Methods and Large Language Models

    Assurance cases can be used to argue for the safety of products in safety engineering. In safety-critical areas, the construction of assurance cases is indispensable. Trustworthiness Derivation Trees (TDTs) enhance assurance cases by incorporating formal methods, making automatic reasoning about assurance cases possible. We present Trustworthiness Derivation Tree Analyzer (Trusta), a desktop application designed to automatically construct and verify TDTs. The tool has a built-in Prolog interpreter in its backend, and is supported by the constraint solvers Z3 and MONA. Therefore, it can solve constraints about logical formulas involving arithmetic, sets, Horn clauses, etc. Trusta also utilizes large language models to make the creation and evaluation of assurance cases more convenient. It allows for interactive human examination and modification. We evaluated top language models like ChatGPT-3.5, ChatGPT-4, and PaLM 2 for generating assurance cases. Our tests showed a 50%-80% similarity between machine-generated and human-created cases. In addition, Trusta can extract formal constraints from text in natural languages, facilitating an easier interpretation and validation process. This extraction is subject to human review and correction, blending the best of automated efficiency with human insight. To our knowledge, this marks the first integration of large language models in automatically creating and reasoning about assurance cases, bringing a novel approach to a traditional challenge. Through several industrial case studies, Trusta has proven to quickly find some subtle issues that are typically missed in manual inspection, demonstrating its practical value in enhancing the assurance case development process. Comment: 38 pages

    Neural-Symbolic Entangled Framework for Complex Query Answering

    Answering complex queries over knowledge graphs (KG) is an important yet challenging task because of the KG incompleteness issue and cascading errors during reasoning. Recent query embedding (QE) approaches embed the entities and relations in a KG and the first-order logic (FOL) queries into a low-dimensional space, answering queries by dense similarity search. However, previous works mainly concentrate on the target answers, ignoring the usefulness of intermediate entities, which is essential for relieving the cascading error problem in logical query answering. In addition, these methods are usually designed with their own geometric or distributional embeddings to handle logical operators like union, intersection, and negation, at the cost of accuracy on the basic operator, projection, and they cannot absorb other embedding methods into their models. In this work, we propose a Neural and Symbolic Entangled framework (ENeSy) for complex query answering, which enables neural and symbolic reasoning to enhance each other to alleviate the cascading error and KG incompleteness. The projection operator in ENeSy could be any embedding method with the capability of link prediction, and the other FOL operators are handled without parameters. With both neural and symbolic reasoning results contained, ENeSy answers queries in ensembles. ENeSy achieves SOTA performance on several benchmarks, especially in the setting of training the model only with the link prediction task. Comment: Paper accepted by NeurIPS202
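
The abstract above states only that the non-projection FOL operators are handled without parameters and does not give their definitions. One common parameter-free choice, assumed here purely for illustration and not taken from the paper, is to represent each intermediate answer set as a vector of membership scores in [0, 1] over all entities and to combine such vectors with product fuzzy logic. All function names below are hypothetical.

```python
def intersection(a, b):
    """Fuzzy AND (product t-norm): an entity must score well in both sets."""
    return [x * y for x, y in zip(a, b)]


def union(a, b):
    """Fuzzy OR (probabilistic sum): an entity may score well in either set."""
    return [x + y - x * y for x, y in zip(a, b)]


def negation(a):
    """Fuzzy NOT: complement of each membership score."""
    return [1.0 - x for x in a]


def top_k(scores, k):
    """Indices of the k highest-scoring entities, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


if __name__ == "__main__":
    # Toy membership scores over three entities (made-up numbers).
    a = [1.0, 0.5, 0.0]
    b = [0.5, 0.5, 1.0]
    print(intersection(a, b))
    print(union(a, b))
    print(negation(a))
    print(top_k(intersection(a, negation(b)), 1))
```

Because these operators have no learned weights, only the projection step would need training in such a setup, which fits the abstract's remark that the model can be trained with the link prediction task alone.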

    Self-Guard: Empower the LLM to Safeguard Itself

    The jailbreak attack can bypass the safety measures of a Large Language Model (LLM), generating harmful content. This misuse of LLMs has led to negative societal consequences. Currently, there are two main approaches to address jailbreak attacks: safety training and safeguards. Safety training focuses on further training the LLM to enhance its safety. On the other hand, safeguards involve implementing external models or filters to prevent harmful outputs. However, safety training has constraints in its ability to adapt to new attack types and often leads to a drop in model performance. Safeguards have proven to be of limited help. To tackle these issues, we propose a novel approach called Self-Guard, which combines the strengths of both safety methods. Self-Guard includes two stages. In the first stage, we enhance the model's ability to assess harmful content, and in the second stage, we instruct the model to consistently perform harmful content detection on its own responses. The experiment has demonstrated that Self-Guard is robust against jailbreak attacks. In the bad case analysis, we find that the LLM occasionally provides harmless responses to harmful queries. Additionally, we evaluated the general capabilities of the LLM before and after safety training, providing evidence that Self-Guard does not result in the LLM's performance degradation. In sensitivity tests, Self-Guard not only avoids inducing over-sensitivity in the LLM but can even mitigate this issue.

    Political connections and tax-induced earnings management: Evidence from China

    We use the occasion of a change in tax policy that raised the tax rate for many of the listed companies in China to examine tax-induced earnings management (TEM) from the perspective of political connections. We find that when the tax rate increased, only those affected firms with politically connected management engaged in TEM. This suggests that, in addition to the motivation for managing earnings, the capability to influence tax authorities is also an important determinant of TEM. We also find that TEM helped the firms with politically connected management to reduce their tax burden.

    Cluster analysis for the overall health status of elderly, multimorbid patients with diabetes

    Purpose: To evaluate the overall health status and health-related abilities and problems of elderly patients with diabetes and multimorbidity compared with those with diabetes only. Additionally, we aimed to identify different subgroups of elderly, multimorbid patients with diabetes. Methods: This cross-sectional study included 538 elderly patients with diabetes. The participants completed a series of questionnaires on self-rated health (SRH), diabetes self-management, self-efficacy, health literacy, depression, and diabetes distress. Differences in health-related abilities and problems were compared between elderly patients with diabetes and multimorbidity and those with diabetes only, with adjustments for covariates using propensity score matching. A cluster analysis was also performed to identify the overall health status subgroups of elderly, multimorbid patients with diabetes. Additionally, we conducted a multinomial logistic regression analysis to examine the predictors of health-related abilities and problem-cluster group membership. Results: Elderly patients with diabetes and multimorbidity experienced more health-related abilities and problems than those with diabetes only, particularly within the domains of depression (p < 0.001), and diabetes distress. The level of health literacy (p < 0.001) and self-management (p = 0.013) in elderly, multimorbid patients with diabetes was also significantly higher than that in elderly patients with diabetes only. Cluster analysis of elderly, multimorbid patients with diabetes revealed three distinct overall health status clusters. Multinomial logistic regression analysis indicated that age (OR = 1.090, p = 0.043), sex (OR = 0.503, p = 0.024), living situation (OR = 2.769, p = 0.011), BMI (OR = 0.838, p = 0.034), regular exercise (OR = 2.912, p = 0.041 in poor vs. good; OR = 3.510, p < 0.001 in intermediate vs. good), and cerebral infarction (OR = 26.280, p < 0.001) independently and significantly predicted cluster membership. Conclusion: Compared with elderly patients with diabetes only, those with diabetes and multimorbidity experienced more health-related abilities and problems within the domains of depression, and diabetes distress. Additionally, the level of health literacy and self-management in elderly, multimorbid patients with diabetes was significantly higher than that in those with diabetes only. Among the multimorbid diabetes group, old age, male sex, living without a partner, slightly lower BMIs, not exercising regularly, and experiencing cerebral infarctions were all positively correlated with worse overall health status.

    The Invasive MED/Q Bemisia tabaci Genome: A Tale of Gene Loss and Gene Gain

    Background: Sweetpotato whitefly, Bemisia tabaci MED/Q and MEAM1/B, are two economically important invasive species that cause considerable damage to agricultural crops through direct feeding and indirect vectoring of plant pathogens. Recently, a draft genome of B. tabaci MED/Q has been assembled. In this study, we focus on the genomic comparison between MED/Q and MEAM1/B, with a special interest in MED/Q’s genomic signatures that may contribute to the highly invasive nature of this emerging insect pest. Results: The genomes of both species share similarity in syntenic blocks, but have significant divergence in the gene coding sequence. Expansion of cytochrome P450 monooxygenases and UDP glycosyltransferases in the MED/Q and MEAM1/B genomes is functionally validated for mediating insecticide resistance in MED/Q using in vivo RNAi. The amino acid biosynthesis pathways in the MED/Q genome are partitioned among the host and endosymbiont genomes in a manner distinct from other hemipterans. Evidence of horizontal gene transfer to the host genome may explain their obligate relationship. Putative loss-of-function in the immune deficiency-signaling pathway due to gene loss is a shared ancestral trait among hemipteran insects. Conclusions: The expansion of detoxification gene families, such as P450s, may contribute to the development of insecticide resistance traits and a broad host range in MED/Q and MEAM1/B, and facilitate the species’ invasions into intensively managed cropping systems. Numerical and compositional changes in multiple gene families (gene loss and gene gain) in the MED/Q genome set a foundation for future hypothesis testing that will advance our understanding of adaptation, viral transmission, symbiosis, and plant-insect-pathogen tritrophic interactions.