
    PULSAR: Graph based Positive Unlabeled Learning with Multi Stream Adaptive Convolutions for Parkinson's Disease Recognition

    Parkinson's disease (PD) is a neurodegenerative disorder that affects movement, speech, and coordination. Timely diagnosis and treatment can improve the quality of life for PD patients. However, access to clinical diagnosis is limited in low- and middle-income countries (LMICs). Therefore, the development of automated screening tools for PD can have a huge social impact, particularly in the public health sector. In this paper, we present PULSAR, a novel method to screen for PD from webcam-recorded videos of the finger-tapping task from the Movement Disorder Society - Unified Parkinson's Disease Rating Scale (MDS-UPDRS). PULSAR is trained and evaluated on data collected from 382 participants (183 self-reported as PD patients). We used an adaptive graph convolutional neural network to dynamically learn the spatiotemporal graph edges specific to the finger-tapping task. We enhanced this idea with a multi-stream adaptive convolution model to learn features from different modalities of data critical to detecting PD, such as the relative location of the finger joints and the velocity and acceleration of tapping. As the labels of the videos are self-reported, there could be cases of undiagnosed PD among the samples labeled non-PD. We leveraged the idea of Positive Unlabeled (PU) Learning, which does not need labeled negative data. Our experiments show a clear benefit of modeling the problem in this way. PULSAR achieved 80.95% accuracy on the validation set and a mean accuracy of 71.29% (2.49% standard deviation) on an independent test set, despite being trained with a limited amount of data. This is especially promising as labeled data is scarce in the health care sector. We hope PULSAR will make PD screening more accessible to everyone. The proposed techniques could be extended to the assessment of other movement disorders, such as ataxia and Huntington's disease.
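
    The abstract above mentions three input streams: relative joint location, velocity, and acceleration. As a rough illustration only, the sketch below derives such streams from a sequence of 2D hand keypoints using frame-to-frame differences. The function names, the choice of joint 0 as the reference point, and the use of plain finite differences are assumptions made for this example; the abstract does not say how PULSAR computes its streams.

```python
# Hypothetical sketch: build position / velocity / acceleration streams
# from per-frame 2D keypoints. Not taken from the PULSAR implementation.

def relative_to_root(frames, root=0):
    """Express every joint relative to a reference joint (assumed: index 0)."""
    return [
        [(x - frame[root][0], y - frame[root][1]) for x, y in frame]
        for frame in frames
    ]

def finite_difference(frames):
    """Per-joint difference between consecutive frames (one frame shorter)."""
    return [
        [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(prev, curr)]
        for prev, curr in zip(frames, frames[1:])
    ]

def build_streams(frames):
    """Return (position, velocity, acceleration) streams in per-frame units."""
    position = relative_to_root(frames)
    velocity = finite_difference(position)
    acceleration = finite_difference(velocity)
    return position, velocity, acceleration

# Toy input: 3 frames, 2 joints each (made-up coordinates).
toy_frames = [
    [(0, 0), (1, 0)],
    [(0, 0), (1, 1)],
    [(0, 0), (1, 3)],
]
position, velocity, acceleration = build_streams(toy_frames)
```

    The differences here are per frame; dividing by the frame interval would convert them to physical units if the frame rate is known.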

    Global, regional, and national sex-specific burden and control of the HIV epidemic, 1990-2019, for 204 countries and territories: the Global Burden of Diseases Study 2019

    Background: The sustainable development goals (SDGs) aim to end HIV/AIDS as a public health threat by 2030. Understanding the current state of the HIV epidemic and its change over time is essential to this effort. This study assesses the current sex-specific HIV burden in 204 countries and territories and measures progress in the control of the epidemic. Methods: To estimate age-specific and sex-specific trends in 48 of 204 countries, we extended the Estimation and Projection Package Age-Sex Model to also implement the spectrum paediatric model. We used this model in cases where age and sex specific HIV-seroprevalence surveys and antenatal care-clinic sentinel surveillance data were available. For the remaining 156 of 204 locations, we developed a cohort-incidence bias adjustment to derive incidence as a function of cause-of-death data from vital registration systems. The incidence was input to a custom Spectrum model. To assess progress, we measured the percentage change in incident cases and deaths between 2010 and 2019 (threshold >75% decline), the ratio of incident cases to number of people living with HIV (incidence-to-prevalence ratio threshold <0·03), and the ratio of incident cases to deaths (incidence-to-mortality ratio threshold <1·0). Findings: In 2019, there were 36·8 million (95% uncertainty interval [UI] 35·1–38·9) people living with HIV worldwide. There were 0·84 males (95% UI 0·78–0·91) per female living with HIV in 2019, 0·99 male infections (0·91–1·10) for every female infection, and 1·02 male deaths (0·95–1·10) per female death. Global progress in incident cases and deaths between 2010 and 2019 was driven by sub-Saharan Africa (with a 28·52% decrease in incident cases, 95% UI 19·58–35·43, and a 39·66% decrease in deaths, 36·49–42·36). Elsewhere, the incidence remained stable or increased, whereas deaths generally decreased. 
In 2019, the global incidence-to-prevalence ratio was 0·05 (95% UI 0·05–0·06) and the global incidence-to-mortality ratio was 1·94 (1·76–2·12). No regions met suggested thresholds for progress. Interpretation: Sub-Saharan Africa had both the highest HIV burden and the greatest progress between 1990 and 2019. The number of incident cases and deaths in males and females approached parity in 2019, although there remained more females with HIV than males with HIV. Globally, the HIV epidemic is far from the UNAIDS benchmarks on progress metrics. Funding: The Bill & Melinda Gates Foundation, the National Institute of Mental Health of the US National Institutes of Health (NIH), and the National Institute on Aging of the NIH
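
    The three progress measures named in the abstract above (percentage decline between 2010 and 2019, incidence-to-prevalence ratio, and incidence-to-mortality ratio) reduce to simple arithmetic. The sketch below applies the stated thresholds to made-up counts. The function name, the dictionary keys, and the input numbers are hypothetical and are not GBD estimates.

```python
# Hypothetical sketch of the progress thresholds described in the abstract:
#   decline in incident cases / deaths, 2010 to 2019: > 75%
#   incidence-to-prevalence ratio (IPR):              < 0.03
#   incidence-to-mortality ratio (IMR):               < 1.0

def assess_progress(incident_2010, incident_2019,
                    deaths_2010, deaths_2019, prevalence_2019):
    """Compute the three progress measures and check each threshold."""
    incident_decline = 100.0 * (incident_2010 - incident_2019) / incident_2010
    death_decline = 100.0 * (deaths_2010 - deaths_2019) / deaths_2010
    ipr = incident_2019 / prevalence_2019
    imr = incident_2019 / deaths_2019
    return {
        "incident_decline_pct": incident_decline,
        "death_decline_pct": death_decline,
        "ipr": ipr,
        "imr": imr,
        "meets_incident_decline": incident_decline > 75.0,
        "meets_death_decline": death_decline > 75.0,
        "meets_ipr": ipr < 0.03,
        "meets_imr": imr < 1.0,
    }

# Made-up counts for a single location, for illustration only.
result = assess_progress(
    incident_2010=1000, incident_2019=200,
    deaths_2010=500, deaths_2019=100,
    prevalence_2019=10000,
)
```

    With the reported global ratios (0·05 and 1·94), neither ratio check would pass, which matches the abstract's statement that no regions met the suggested thresholds.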

    Global burden and strength of evidence for 88 risk factors in 204 countries and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021

    Background: Understanding the health consequences associated with exposure to risk factors is necessary to inform public health policy and practice. To systematically quantify the contributions of risk factor exposures to specific health outcomes, the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 aims to provide comprehensive estimates of exposure levels, relative health risks, and attributable burden of disease for 88 risk factors in 204 countries and territories and 811 subnational locations, from 1990 to 2021. Methods: The GBD 2021 risk factor analysis used data from 54 561 total distinct sources to produce epidemiological estimates for 88 risk factors and their associated health outcomes for a total of 631 risk–outcome pairs. Pairs were included on the basis of data-driven determination of a risk–outcome association. Age-sex-location-year-specific estimates were generated at global, regional, and national levels. Our approach followed the comparative risk assessment framework predicated on a causal web of hierarchically organised, potentially combinative, modifiable risks. Relative risks (RRs) of a given outcome occurring as a function of risk factor exposure were estimated separately for each risk–outcome pair, and summary exposure values (SEVs), representing risk-weighted exposure prevalence, and theoretical minimum risk exposure levels (TMRELs) were estimated for each risk factor. These estimates were used to calculate the population attributable fraction (PAF; ie, the proportional change in health risk that would occur if exposure to a risk factor were reduced to the TMREL). The product of PAFs and disease burden associated with a given outcome, measured in disability-adjusted life-years (DALYs), yielded measures of attributable burden (ie, the proportion of total disease burden attributable to a particular risk factor or combination of risk factors). 
Adjustments for mediation were applied to account for relationships involving risk factors that act indirectly on outcomes via intermediate risks. Attributable burden estimates were stratified by Socio-demographic Index (SDI) quintile and presented as counts, age-standardised rates, and rankings. To complement estimates of RR and attributable burden, newly developed burden of proof risk function (BPRF) methods were applied to yield supplementary, conservative interpretations of risk–outcome associations based on the consistency of underlying evidence, accounting for unexplained heterogeneity between input data from different studies. Estimates reported represent the mean value across 500 draws from the estimate's distribution, with 95% uncertainty intervals (UIs) calculated as the 2·5th and 97·5th percentile values across the draws. Findings: Among the specific risk factors analysed for this study, particulate matter air pollution was the leading contributor to the global disease burden in 2021, contributing 8·0% (95% UI 6·7–9·4) of total DALYs, followed by high systolic blood pressure (SBP; 7·8% [6·4–9·2]), smoking (5·7% [4·7–6·8]), low birthweight and short gestation (5·6% [4·8–6·3]), and high fasting plasma glucose (FPG; 5·4% [4·8–6·0]). For younger demographics (ie, those aged 0–4 years and 5–14 years), risks such as low birthweight and short gestation and unsafe water, sanitation, and handwashing (WaSH) were among the leading risk factors, while for older age groups, metabolic risks such as high SBP, high body-mass index (BMI), high FPG, and high LDL cholesterol had a greater impact. 
From 2000 to 2021, there was an observable shift in global health challenges, marked by a decline in the number of all-age DALYs broadly attributable to behavioural risks (decrease of 20·7% [13·9–27·7]) and environmental and occupational risks (decrease of 22·0% [15·5–28·8]), coupled with a 49·4% (42·3–56·9) increase in DALYs attributable to metabolic risks, all reflecting ageing populations and changing lifestyles on a global scale. Age-standardised global DALY rates attributable to high BMI and high FPG rose considerably (15·7% [9·9–21·7] for high BMI and 7·9% [3·3–12·9] for high FPG) over this period, with exposure to these risks increasing annually at rates of 1·8% (1·6–1·9) for high BMI and 1·3% (1·1–1·5) for high FPG. By contrast, the global risk-attributable burden and exposure to many other risk factors declined, notably for risks such as child growth failure and unsafe water source, with age-standardised attributable DALYs decreasing by 71·5% (64·4–78·8) for child growth failure and 66·3% (60·2–72·0) for unsafe water source. We separated risk factors into three groups according to trajectory over time: those with a decreasing attributable burden, due largely to declining risk exposure (eg, diet high in trans-fat and household air pollution) but also to proportionally smaller child and youth populations (eg, child and maternal malnutrition); those for which the burden increased moderately in spite of declining risk exposure, due largely to population ageing (eg, smoking); and those for which the burden increased considerably due to both increasing risk exposure and population ageing (eg, ambient particulate matter air pollution, high BMI, high FPG, and high SBP). Interpretation: Substantial progress has been made in reducing the global disease burden attributable to a range of risk factors, particularly those related to maternal and child health, WaSH, and household air pollution. 
Maintaining efforts to minimise the impact of these risk factors, especially in low SDI locations, is necessary to sustain progress. Successes in moderating the smoking-related burden by reducing risk exposure highlight the need to advance policies that reduce exposure to other leading risk factors such as ambient particulate matter air pollution and high SBP. Troubling increases in high FPG, high BMI, and other risk factors related to obesity and metabolic syndrome indicate an urgent need to identify and implement interventions
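
    Two steps in the abstract above lend themselves to a small worked example: multiplying the PAF by the DALY burden to get attributable burden, and summarising draws by their mean with a 95% UI taken from the 2·5th and 97·5th percentiles. The sketch below does both on simulated draws. The distributions, their parameters, and the function names are assumptions for illustration and are not GBD inputs or code.

```python
# Hypothetical sketch: attributable burden = PAF x DALYs, computed per draw,
# then summarised as mean and 95% UI (2.5th and 97.5th percentiles).
import random
import statistics

def attributable_burden_draws(paf_draws, daly_draws):
    """Multiply PAF and DALY draws pairwise to get attributable-DALY draws."""
    return [paf * daly for paf, daly in zip(paf_draws, daly_draws)]

def summarise_draws(draws):
    """Return (mean, lower, upper) with the UI bounds at the 2.5th/97.5th percentiles."""
    # n=40 yields cut points at every 2.5%; the first and last are the UI bounds.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return statistics.fmean(draws), cuts[0], cuts[-1]

# Simulated inputs: 500 draws each, with made-up means and spreads.
rng = random.Random(0)
paf_draws = [min(max(rng.gauss(0.08, 0.007), 0.0), 1.0) for _ in range(500)]
daly_draws = [rng.gauss(2.8e9, 5e7) for _ in range(500)]

mean, lower, upper = summarise_draws(
    attributable_burden_draws(paf_draws, daly_draws)
)
```

    Carrying the product through draw by draw, instead of multiplying two point estimates, lets the uncertainty in both inputs propagate into the final interval.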

    Modeling and mediating conversational norm violations

    Thesis (Ph. D.)--University of Rochester. Department of Computer Science, 2021. This thesis focuses on improving human-human interaction during group discussions through feedback from effective human-machine interaction. Identifying verbal and non-verbal attributes and allowing people to be aware of them is crucial for maintaining a safe and effective exchange of ideas. This thesis work explores capturing the related behavioral and affective attributes from audio, video, and language information from videoconferencing-based group meetings, and developing automated feedback systems (chatbot, conversational agents, visualization) for mediation purposes. First, I present a multi-modal dataset for interpersonal disrespect or toxicity from 59 YouTube News Show dyadic remote discussion videos, and propose an algorithm to define a speaker-wise toxicity score. Our models recognize the attributes of disrespect with an accuracy of over 60% using visual features and close to 80% using audio features. Next, I elaborate on developing automated systems to capture multi-modal features from meetings and designing privacy-preserving feedback. The work handles behavioral and contextual features such as talk-time, turn-taking, interruption, volume, sentiment, valence, attitude, shared smile, attention, anger, surprise, engagement, questions, and consensus. To keep heated discussions respectful, we develop a videoconferencing platform integrated with real-time feedback processed on the client side. Validated by 40 participants, our findings reveal that real-time feedback can reduce expressiveness during the discussion, yet improves the follow-up discussion even without feedback. For post-meeting reflection, we develop a fully automated collaboration platform ‘CoCo’ that is capable of holding video conferencing meetings, processing data post-session in bulk on the server side, and presenting feedback through an interactive chatbot. 
An evaluation with 39 participants shows improved group dynamics in successive discussions. We also explore post-session feedback in an in situ workplace setting. We survey the challenges faced in remote meetings (N = 150) and, based on the identified needs, design and evaluate a wireframe prototype (N = 16) and an interactive feedback dashboard named ‘MeetingCoach’ (N = 23). The study supports our hypotheses that actionable suggestions, personalized modeling, and privacy-preserving feedback can potentially improve meeting effectiveness and inclusivity. For pre-meeting training, we present a suggestive chatbot that incorporates the motivational interviewing (MI) technique for improving conversational skills. An evaluation in a consensus-based résumé evaluation study with 21 participants showcases the effectiveness of the suggestive MI chatbot in encouraging users to apply the information delivered by agents. We highlight strategies on how to reach a consensus that fulfills individual and team goals. We explore improving agent capabilities in terms of empathy and affect. We design the dialogue of an empathetic conversational agent and evaluate it in a Wizard-of-Oz-based study with 34 participants. Our results show improved human-machine interaction mitigating negative affect. We also explore sentiment detection on a human-machine interaction dataset. We build and compare multimodal LSTM fusion (accuracy, no threshold = 67.5%; accuracy, threshold = 71.8%) and hierarchical (accuracy, no threshold = 60.9%; accuracy, threshold = 71.8%) models. The results show the importance of increasing agent capabilities in becoming more affective and interactive to effectively interact with users. Overall, the findings of this thesis work provide useful information to the research community regarding modeling conversations to understand group dynamics and mediating discussions through effective feedback agents.

    Feedback strategies on verbal and nonverbal cues to improve communication skills

    Thesis (Ph. D.)--University of Rochester. Department of Computer Science, 2020. In this thesis, we present findings on designing and validating real-time and post feedback on nonverbal skills in face-to-face communication with a humanoid agent. The technical challenges included a real-time machine learning framework that can automatically process the audio-video data via a webcam, allowing the users to converse in natural language and receive live and post feedback on smile intensity, volume modulation, pauses, synchronicity, body language, eye contact, sentiment, and turn-taking. Our initial exploration included designing a Wizard-of-Oz experiment validating the form factors (i.e., flashing icons using the traffic light analogy) for real-time feedback using 46 college students. Using the data, we trained a hidden Markov model to generate feedback. Feedback on verbal cues was generated by performing sentiment and word category analysis. For post feedback, we summarized the nonverbal feedback using a support vector machine. The technical contributions were validated in three unique contexts: 1) helping individuals with autism; 2) helping the elderly with their social skills; 3) helping physicians improve their interaction skills with patients. Applications to speed-dating and autism: In a randomized controlled study with 47 college students, we found that the feedback helped improve eye contact and gesture. In a preliminary study with nine teenagers with autism, we identified several design guidelines, including briefing the users, making positive acknowledgments, and personalizing dialogue. Applications to aging: In a pilot study with 25 older adults, participants found the feedback useful and were able to reflect on the feedback. In a subsequent longitudinal study with 18 older adults, participants improved their eye contact and smiling. 
Applications to patient-physician communication: In the context of patient-physician communication, we conducted a study with eight clinicians where they found the feedback intuitive and easy to follow. Additionally, we identified two communication behaviors of physicians that help improve patients’ prognosis understanding - 1) lecturing style of a conversational structure by maximizing entropy, and 2) the positive language patterns (i.e., sentiment trajectory) using k-means clustering. We used a data set that includes conversations between physicians (N=38) and late-stage cancer patients (N=382). With statistical analysis, we show that physicians who were lecturing their patients and did not vary their positive sentiment had patients with prognosis misunderstanding. During global pandemics (e.g., COVID-19), when social distancing is recommended, most communication is taking place online. This indicates the need for online communication training programs that can overcome social and global boundaries

    Multimodal representation learning and its application to human behavior analysis

    Thesis (Ph. D.)--University of Rochester. Department of Computer Science, 2022. This thesis aims to learn the joint representation of text, acoustic, and visual modalities to understand spoken language in face-to-face communications. Being able to mix and align those modalities appropriately helps humans to display sentiment, humor, and credible argument in daily conversations. The creative usage of these behaviors removes barriers in communication, grabs the attention of the audience, and even helps to build trust. Building algorithms for understanding these behavioral tasks is a difficult problem in AI. These tasks not only demand machine learning algorithms that create efficient fusion across modalities and incorporate world knowledge and reasoning, but also require large, complete datasets. To address these limitations, we design behavioral datasets and a series of multimodal machine learning algorithms. First, we present some key insights about credibility by analyzing the verbal and non-verbal features. The pre-trained facial expressions from baseline questions help to classify the relevant section as truth vs. bluff (70% accuracy >> 52% human accuracy). Analyzing interrogation answers in the context of facial expressions reveals interesting linguistic patterns of deceivers (e.g., less cognitively-inclined words, shorter answers). These patterns are absent when we analyze the language modality alone. Next, we develop UR-FUNNY, the first video dataset (16k instances, 19 hours) for humor detection. It is extracted from TED Talk videos using the laughter marker of the audience. We study the multimodal structure of humor and the importance of having a context story for building up the punchline. We design neural networks to detect multimodal humor and show the effectiveness of humor-centric features like ambiguity and superiority based on linguistic theories. 
To investigate the properties of high-quality arguments, we propose a set of features such as clarity, content variation, body movements, and pauses. These features are interpretable and can distinguish (p < 0.05) the quality of arguments. A hierarchical neural network model named MARQ is introduced to summarize long multimodal sequences for modeling argument quality. We build general multimodal fusion solutions for modeling spoken language. A neural attachment named MAG is designed to integrate raw acoustic and facial expressions in pre-trained language models during fine-tuning. We also propose a framework to convert non-verbal expressions into a textual format that reduces the multimodal model’s complexity and adds interpretability to the model’s decisions. These approaches achieve state-of-the-art results across diverse multimodal behavioral tasks of sentiment analysis, humor, and sarcasm detection. Finally, we develop a deep learning framework that learns rich acoustic-facial expressions in a self-supervised manner from millions of human speech videos (2-30 seconds long; 2000 hours). Fusing these representations with pre-trained language embeddings creates a rich contextual multimodal message and significantly improves the performances across these behavioral tasks. The findings of this thesis provide useful information about communication behaviors to the research community. We envision it contributing to developing many potential applications like a personal assistant with social skills, and airport security to designing feedback systems for people who lack communication skills