Don't Forget Your ABC's: Evaluating the State-of-the-Art in Chat-Oriented Dialogue Systems
Despite tremendous advancements in dialogue systems, stable evaluation still
requires human judgments producing notoriously high-variance metrics due to
their inherent subjectivity. Moreover, methods and labels in dialogue
evaluation are not fully standardized, especially for open-domain chats, with a
lack of work to compare and assess the validity of those approaches. The use of
inconsistent evaluation can misinform the performance of a dialogue system,
which becomes a major hurdle to enhance it. Thus, a dimensional evaluation of
chat-oriented open-domain dialogue systems that reliably measures several
aspects of dialogue capabilities is desired. This paper presents a novel human
evaluation method to estimate the rates of many dialogue system behaviors. Our
method is used to evaluate four state-of-the-art open-domain dialogue systems
and compared with existing approaches. The analysis demonstrates that our
behavior method is more suitable than alternative Likert-style or comparative
approaches for dimensional evaluation of these systems. Comment: Accepted to ACL 2023; first two authors contributed equally
Leveraging Large Language Models for Automated Dialogue Analysis
Developing high-performing dialogue systems benefits from the automatic
identification of undesirable behaviors in system responses. However, detecting
such behaviors remains challenging, as it draws on a breadth of general
knowledge and understanding of conversational practices. Although recent
research has focused on building specialized classifiers for detecting specific
dialogue behaviors, the behavior coverage is still incomplete and there is a
lack of testing on real-world human-bot interactions. This paper investigates
the ability of a state-of-the-art large language model (LLM), ChatGPT-3.5, to
perform dialogue behavior detection for nine categories in real human-bot
dialogues. We aim to assess whether ChatGPT can match specialized models and
approximate human performance, thereby reducing the cost of behavior detection
tasks. Our findings reveal that neither specialized models nor ChatGPT have yet
achieved satisfactory results for this task, falling short of human
performance. Nevertheless, ChatGPT shows promising potential and often
outperforms specialized detection models. We conclude with an in-depth
examination of the prevalent shortcomings of ChatGPT, offering guidance for
future research to enhance LLM capabilities. Comment: Accepted to SIGDIAL 202
Factors Important to Older Adults Who Disagree With a Deprescribing Recommendation.
IMPORTANCE
Little is known about why older adults decline deprescribing recommendations, primarily because interventional studies rarely capture the reasons.
OBJECTIVE
To examine factors important to older adults who disagree with a deprescribing recommendation given by a primary care physician to a hypothetical patient experiencing polypharmacy.
DESIGN, SETTING, AND PARTICIPANTS
This online, vignette-based survey study was conducted from December 1, 2020, to March 31, 2021, with participants 65 years or older in the United Kingdom, the US, Australia, and the Netherlands. The primary outcome of the main study was disagreement with a deprescribing recommendation. A content analysis was subsequently conducted of the free-text reasons provided by participants who strongly disagreed or disagreed with deprescribing. Data were analyzed from August 22, 2022, to February 12, 2023.
MAIN OUTCOMES AND MEASURES
Attitudes, beliefs, fears, and recommended actions of older adults in response to deprescribing recommendations.
RESULTS
Of the 899 participants included in the analysis, the mean (SD) age was 71.5 (4.9) years; 456 participants (50.7%) were men. Attitudes, beliefs, and fears reported by participants included doubts about deprescribing (361 [40.2%]), valuing medications (139 [15.5%]), and a preference to avoid change (132 [14.7%]). Valuing medications was reported more commonly among participants who strongly disagreed compared with those who disagreed with deprescribing (48 of 205 [23.4%] vs 91 of 694 [13.1%], respectively; P < .001) or had personal experience with the same medication class as the vignette compared with no experience (93 of 517 [18.0%] vs 46 of 318 [12.1%], respectively; P = .02). Participants shared that improved communication (225 [25.0%]), alternative strategies (138 [15.4%]), and consideration of medication preferences (137 [15.2%]) may increase their agreement with deprescribing. Participants who disagreed compared with those who strongly disagreed were more interested in additional communication (196 [28.2%] vs 29 [14.2%], respectively; P < .001), alternative strategies (117 [16.9%] vs 21 [10.2%], respectively; P = .02), or consideration of medication preferences (122 [17.6%] vs 15 [7.3%], respectively; P < .001).
CONCLUSIONS AND RELEVANCE
In this survey study, older adults who disagreed with a deprescribing recommendation were more interested in additional communication, alternative strategies, or consideration of medication preferences compared with those who strongly disagreed. These findings suggest that identifying the degree of disagreement with deprescribing could be used to tailor patient-centered communication about deprescribing in older adults.
Relationships between Endogenous Plasma Biomarkers of Constitutive Cytochrome P450 3A Activity and Single-Time-Point Oral Midazolam Microdose Phenotype in Healthy Subjects
Due to high basal interindividual variation in cytochrome P450 3A (CYP3A) activity and susceptibility to drug interactions, there has been interest in the application of efficient probe drug phenotyping strategies, as well as endogenous biomarkers for assessment of in vivo CYP3A activity. The biomarkers 4β-hydroxycholesterol (4βHC) and 6β-hydroxycortisol (6βHCL) are sensitive to CYP3A induction and inhibition. However, their utility for the assessment of constitutive CYP3A activity remains uncertain. We investigated whether endogenous plasma biomarkers (4βHC and 6βHCL) are associated with basal CYP3A metabolic activity in healthy subjects assessed by a convenient single-time-point oral midazolam (MDZ) phenotyping strategy. Plasma 4βHC and 6βHCL metabolic ratios (MRs) were analysed in 51 healthy adult participants. CYP3A activity was determined after administration of an oral MDZ microdose (100 μg). Simple linear and multiple linear regression analyses were performed to assess relationships between MDZ oral clearance, biomarkers and subject covariates. Among study subjects, basal MDZ oral clearance, 4βHC and 6βHCL MRs ranged 6.5-, 10- and 13-fold, respectively. Participant age and alcohol consumption were negatively associated with MDZ oral clearance (p = 0.03 and p = 0.045, respectively), while weight and female sex were associated with lower plasma 4βHC MR (p = 0.0003 and p = 0.032, respectively). Neither 4βHC nor 6βHCL MRs were associated with MDZ oral clearance. Plasma 4βHC and 6βHCL MRs do not relate to MDZ single-time-point metabolic phenotype in the assessment of constitutive CYP3A activity among healthy individuals.
Train Small, Model Big: Scalable Physics Simulators via Reduced Order Modeling and Domain Decomposition
Numerous cutting-edge scientific technologies originate at the laboratory
scale, but transitioning them to practical industry applications is a
formidable challenge. Traditional pilot projects at intermediate scales are
costly and time-consuming. An alternative, the E-pilot, relies on high-fidelity
numerical simulations, but even these simulations can be computationally
prohibitive at larger scales. To overcome these limitations, we propose a
scalable, physics-constrained reduced order model (ROM) method. ROM identifies
critical physics modes from small-scale unit components, projecting governing
equations onto these modes to create a reduced model that retains essential
physics details. We also employ Discontinuous Galerkin Domain Decomposition
(DG-DD) to apply ROM to unit components and interfaces, enabling the
construction of large-scale global systems without data at such large scales.
This method is demonstrated on the Poisson and Stokes flow equations, showing
that it can solve equations about times faster with only
relative error. Furthermore, ROM takes one order of magnitude less memory than
the full order model, enabling larger scale predictions at a given memory
limitation. Comment: 40 pages, 12 figures. Submitted to Computer Methods in Applied
Mechanics and Engineering
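The projection step described in the abstract above (identify dominant modes from sample solutions, then project the governing equations onto them) can be illustrated with a minimal sketch. It assumes a 1D finite-difference Poisson problem, a hypothetical parametrized source term, and a plain SVD-based (proper orthogonal decomposition) basis with Galerkin projection. It does not implement the paper's Discontinuous Galerkin domain decomposition, and the grid size, parameter range, and number of modes are illustrative choices, not values from the paper.

```python
import numpy as np

def poisson_matrix(n):
    """Finite-difference matrix for -u'' on (0, 1) with zero Dirichlet boundaries."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def source(x, mu):
    """Hypothetical parametrized source term (an assumption for illustration)."""
    return np.sin(mu * np.pi * x)

n = 200
x = np.linspace(0.0, 1.0, n + 2)[1:-1]  # interior grid points
A = poisson_matrix(n)

# 1. Collect full-order solutions (snapshots) for a few training parameters.
train_mus = np.linspace(1.0, 3.0, 15)
snapshots = np.column_stack(
    [np.linalg.solve(A, source(x, mu)) for mu in train_mus]
)

# 2. Extract the dominant modes of the snapshots with an SVD.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
r = 8                      # number of retained modes (illustrative)
Phi = U[:, :r]             # orthonormal reduced basis, shape (n, r)

# 3. Galerkin projection: the n-by-n system becomes an r-by-r system.
A_r = Phi.T @ A @ Phi

def solve_rom(mu):
    """Solve the reduced system and lift the result back to the full grid."""
    return Phi @ np.linalg.solve(A_r, Phi.T @ source(x, mu))

# Compare against the full-order solve at a parameter not in the training set.
mu_test = 2.2
u_full = np.linalg.solve(A, source(x, mu_test))
u_rom = solve_rom(mu_test)
rel_err = np.linalg.norm(u_rom - u_full) / np.linalg.norm(u_full)
print(f"reduced size: {r} of {n}, relative error: {rel_err:.2e}")
```

In this toy setting the reduced system has 8 unknowns instead of 200, which is where the memory and run-time savings of a reduced order model come from; the speedup and error figures reported for the paper's method are not reproduced here.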
Response to COVID-19 vaccination in patients on cancer therapy: Analysis in a SARS-CoV-2-naïve population
Background: Cancer patients have increased morbidity and mortality from COVID-19, but may respond poorly to vaccination. The Evaluation of COVID-19 Vaccination Efficacy and Rare Events in Solid Tumors (EVEREST) study, comparing seropositivity between cancer patients and healthy controls in a low SARS-CoV-2 community-transmission setting, allows determination of vaccine response with minimal interference from infection. Methods: Solid tumor patients from The Canberra Hospital, Canberra, Australia, and healthy controls who received COVID-19 vaccination between March 2021 and January 2022 were included. Blood samples were collected at baseline, pre-second vaccine dose and at 1, 3 (primary endpoint), and 6 months post-second dose. SARS-CoV-2 anti-spike-RBD (S-RBD) and anti-nucleocapsid IgG antibodies were measured. Results: Ninety-six solid tumor patients and 20 healthy controls were enrolled, with median age 62 years, and 60% were female. Participants received either AZD1222 (65%) or BNT162b2 (35%) COVID-19 vaccines. Seropositivity 3 months post vaccination was 87% (76/87) in patients and 100% (20/20) in controls (p = .12). Seropositivity was observed in 84% of patients on chemotherapy, 80% on immunotherapy, and 96% on targeted therapy (differences not statistically significant). Seropositivity in cancer patients increased from 40% (6/15) after first dose, to 95% (35/37) 1 month after second dose, then dropped to 87% (76/87) 3 months after second dose. Conclusion: Most patients and all controls became seropositive after two vaccine doses. Antibody concentrations and seropositivity showed a decrease between 1 and 3 months post vaccination, highlighting need for booster vaccinations. SARS-CoV-2 infection amplifies S-RBD antibody responses; however, it cannot be adequately identified using nucleocapsid serology. This underlines the value of our COVID-naïve population in studying vaccine immunogenicity.
How are falls and fear of falling associated with objectively measured physical activity in a cohort of community-dwelling older men?
BACKGROUND: Falls affect approximately one third of community-dwelling older adults each year and have serious health and social consequences. Fear of falling (FOF) (lack of confidence in maintaining balance during normal activities) affects many older adults, irrespective of whether they have actually experienced falls. Both falls and fear of falling may result in restrictions of physical activity, which in turn have health consequences. To date, the relations of (i) falls and (ii) fear of falling with physical activity have not been investigated using objectively measured activity data, which permit examination of different intensities of activity and sedentary behaviour.
METHODS: Cross-sectional study of 1680 men aged 71-92 years recruited from primary care practices who were part of an on-going population-based cohort. Men reported falls history in previous 12 months, FOF, health status and demographic characteristics. Men wore a GT3x accelerometer over the hip for 7 days.
RESULTS: Among the 12% of men who had recurrent falls, daily activity levels were lower than among non-fallers: 942 (95% CI 503, 1381) fewer steps/day, 12 (95% CI 2, 22) minutes less in light activity, 10 (95% CI 5, 15) minutes less in moderate to vigorous PA [MVPA] and 22 (95% CI 9, 35) minutes more in sedentary behaviour. 16% (n = 254) of men reported FOF, of whom 52% (n = 133) had fallen in the past year. Physical activity deficits were even greater in the men who reported that they were fearful of falling than in men who had fallen. Men who were fearful of falling took 1766 (95% CI 1391, 2142) fewer steps/day than men who were not fearful, and spent 27 (95% CI 18, 36) minutes less in light PA, 18 (95% CI 13, 22) minutes less in MVPA, and 45 (95% CI 34, 56) minutes more in sedentary behaviour. The significant differences in activity levels between (i) fallers and non-fallers and (ii) men who were fearful of falling or not fearful were mediated by similar variables: lower exercise self-efficacy, fewer excursions from home and more mobility difficulties.
CONCLUSIONS: Falls and in particular fear of falling are important barriers to older people gaining health benefits of walking and MVPA. Future studies should assess the longitudinal associations between falls and physical activity.