Search CORE

13 research outputs found

Approaches to Capacity Building for Machine Learning and Artificial Intelligence Applications in Health

Author: Frank Sullivan
Garth Gibson
P. Alison Paprica
Yin Aphinyanaphongs
Publication venue: 'Swansea University'
Publication date: 01/09/2018

Many health systems and research institutes are interested in supplementing their traditional analyses of linked data with machine learning (ML) and other artificial intelligence (AI) methods and tools. However, the availability of individuals with the skills required to develop and/or implement ML/AI is a constraint, as there is high demand for ML/AI talent in many sectors. The three organizations presenting are all actively involved in training and capacity building for ML/AI broadly, and each has a focus on, and/or discrete initiatives for, particular trainees.

P. Alison Paprica, Vector Institute for Artificial Intelligence, Institute for Clinical Evaluative Sciences, University of Toronto, Canada. Alison is VP, Health Strategy and Partnerships at Vector, responsible for health strategy and also playing a lead role in “1000AIMs”, a Vector-led initiative in support of the Province of Ontario’s $30 million investment to increase the number of AI-related master’s program graduates to 1,000 per year within five years.

Frank Sullivan, University of St Andrews, Scotland. Frank is a family physician and an associate director of HDRUK@Scotland. Health Data Research UK (https://hdruk.ac.uk/) has recently provided funding to six sites across the UK to address challenging healthcare issues through the use of data science. A Doctoral Training Scheme in AI for 50 PhD students has also been announced. Each site works in close partnership with National Health Service bodies and the public to translate research findings into benefits for patients and populations.

Yin Aphinyanaphongs, INTREPID NYU clinical training program for incoming clinical fellows. Yin is the Director of the Clinical Informatics Training Program at NYU Langone Health. He is deeply interested in the intersection of computer science and health care, and as a physician and a scientist he has a unique perspective on how to train medical professionals for a data-driven world. One version of this teaching process is demonstrated in the INTREPID clinical training program, in which Yin teaches clinicians to work with large-scale data within the R environment and to generate hypotheses and insights.

The session will begin with three brief presentations, followed by a facilitated session in which all participants share their insights about the essential skills and competencies required for different kinds of ML/AI applications and contributions. Live polling and voting will be used at the end of the session to capture participants’ views on the key learnings and takeaway points. The intended outputs and outcomes of the session are:

• Participants will have a better understanding of the skills and competencies required for individuals to contribute to AI applications in health in various ways
• Participants will gain knowledge about different options for capacity building, from targeted enhancement of the skills of clinical fellows, to producing large numbers of applied master’s graduates, to doctoral-level training

After the session, the co-leads will work together to create a resource that summarizes the learnings from the session and make them public (through publication in a peer-reviewed journal and/or through the IPDLN website).

Directory of Open Access Journals

LLMs Understand Glass-Box Models, Discover Surprises, and Suggest Repairs

Author: Aphinyanaphongs Yin
Bordt Sebastian
Caruana Rich
Kellis Manolis
Lengerich Benjamin J.
Nori Harsha
Nunnally Mark E.
Publication date: 02/08/2023

We show that large language models (LLMs) are remarkably good at working with interpretable models that decompose complex outcomes into univariate graph-represented components. By adopting a hierarchical approach to reasoning, LLMs can provide comprehensive model-level summaries without ever requiring the entire model to fit in context. This approach enables LLMs to apply their extensive background knowledge to automate common tasks in data science such as detecting anomalies that contradict prior knowledge, describing potential reasons for the anomalies, and suggesting repairs that would remove the anomalies. We use multiple examples in healthcare to demonstrate the utility of these new capabilities of LLMs, with particular emphasis on Generalized Additive Models (GAMs). Finally, we present the package TalkToEBM as an open-source LLM-GAM interface.

arXiv.org e-Print Archive

Learning Boolean Queries for Article Quality Filtering

Author: Constantin Aliferis
Yin Aphinyanaphongs

... have the ability to identify high quality content-specific articles in the domain of internal medicine. These models, though powerful, cannot be used in Boolean search engines nor can the content of the models be verified via human inspection. In this paper, we use decision trees combined with several feature selection methods to generate Boolean query filters for the same domain and task. The resulting trees are generated automatically and exhibit high performance. The trees are understandable, manageable, and able to be validated by humans. The subsequent Boolean queries are sensible and can be readily used as filters by Boolean search engines
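The idea of reading a Boolean query off a decision tree can be illustrated with a minimal sketch. It is not the authors' implementation: the tree below is built by hand rather than learned, and the terms (`randomized`, `editorial`, `cohort`) and all class and function names are hypothetical. Each root-to-leaf path that ends in an "include" leaf becomes one AND clause, and the clauses are joined with OR.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    include: bool  # True if articles reaching this leaf are kept by the filter


@dataclass
class Node:
    term: str                     # test: does the article contain this term?
    present: Union["Node", Leaf]  # branch followed when the term is present
    absent: Union["Node", Leaf]   # branch followed when the term is absent


def collect_clauses(tree: Union[Node, Leaf], path: List[str]) -> List[str]:
    """Return one AND clause for every path that ends in an 'include' leaf."""
    if isinstance(tree, Leaf):
        return [" AND ".join(path)] if tree.include else []
    return (
        collect_clauses(tree.present, path + [tree.term])
        + collect_clauses(tree.absent, path + [f"NOT {tree.term}"])
    )


def tree_to_query(tree: Union[Node, Leaf]) -> str:
    """Join the per-path AND clauses with OR to form a single Boolean query."""
    return " OR ".join(f"({clause})" for clause in collect_clauses(tree, []))


# Hypothetical, hand-built tree; a real one would be learned from labeled articles.
EXAMPLE_TREE = Node(
    term="randomized",
    present=Node(term="editorial", present=Leaf(False), absent=Leaf(True)),
    absent=Node(term="cohort", present=Leaf(True), absent=Leaf(False)),
)

if __name__ == "__main__":
    print(tree_to_query(EXAMPLE_TREE))
```

Because every clause corresponds to a single path through the tree, a reader can check each clause against the tree by inspection, which is the kind of human validation the abstract describes.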

CiteSeerX

Text Categorization Models for Identifying Unproven Cancer Treatments on the Web

Author: Constantin Aliferis
Yin Aphinyanaphongs

The nature of the internet as a non-peer-reviewed (and, more generally, largely unregulated) publication medium has allowed widespread promotion of inaccurate and unproven medical claims on an unprecedented scale. Patients with conditions that are not currently fully treatable are particularly susceptible to unproven and dangerous promises about miracle treatments. In extreme cases, fatal adverse outcomes have been documented. Most commonly, the cost is financial, psychological, and delayed application of imperfect but proven scientific modalities. To help protect patients, who may be desperately ill and thus prone to exploitation, we explored the use of machine learning techniques to identify web pages that make unproven claims. This feasibility study shows that the resulting models can identify web pages that make unproven claims in a fully automatic manner, and substantially better than previous web tools and state-of-the-art search engine technology.

CiteSeerX

TEXT CLASSIFICATION FOR AUTOMATIC DETECTION OF E-CIGARETTE USE AND USE FOR SMOKING CESSATION FROM TWITTER: A FEASIBILITY PILOT

Author: Armine Lulejian
Duncan Penfold Brown
Paul Krebs
Richard Bonneau
Yin Aphinyanaphongs
Publication date: 23/04/2020

Rapid increases in e-cigarette use and potential exposure to harmful byproducts have shifted public health focus to e-cigarettes as a possible drug of abuse. Effective surveillance of use and prevalence would allow appropriate regulatory responses. An ideal surveillance system would collect usage data in real time, focus on populations of interest, include populations unable to take the survey, allow a breadth of questions to answer, and enable geo-location analysis. Social media streams may provide this ideal system. To realize this use case, a foundational question is whether we can detect e-cigarette use at all. This work reports two pilot tasks that use text classification to automatically identify tweets indicating e-cigarette use and/or e-cigarette use for smoking cessation. We build and define both datasets and compare the performance of four state-of-the-art classifiers and a keyword search on each task. Our results demonstrate excellent classifier performance, with areas under the curve of up to 0.90 and 0.94 for the two tasks. These promising initial results form the foundation for further studies to realize the ideal surveillance solution.
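For readers unfamiliar with the area-under-the-curve figure quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It is an illustration only, not the evaluation code used in the study; the function name, labels, and scores are made up. The sketch uses the pairwise interpretation of AUC: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. a tweet indicating use), 0 otherwise.
    scores: classifier scores, where higher means more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    # Count correctly ordered pairs; a tied pair contributes one half.
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical labels and scores, for illustration only.
EXAMPLE_LABELS = [1, 1, 0, 0, 1, 0]
EXAMPLE_SCORES = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

if __name__ == "__main__":
    print(f"AUC = {roc_auc(EXAMPLE_LABELS, EXAMPLE_SCORES):.3f}")
```

In the made-up example, 8 of the 9 positive-negative pairs are ordered correctly. The pairwise loop is quadratic in the number of examples, which is acceptable for a small pilot dataset; a rank-based formula would scale better.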

CiteSeerX

Thrombosis in Hospitalized Patients with COVID-19 in a New York City Health System

Author: Aphinyanaphongs Yin
Berger Jeffrey S.
Bilaloglu Seda
Hochman Judith
Iturrate Eduardo
Jones Simon
Publication venue: 'American Medical Association (AMA)'
Publication date: 25/08/2020

George Washington University: Health Sciences Research Commons (HSRC)

An SVM-based high-quality article classifier for systematic reviews

Author: Aphinyanaphongs
Bastian
Chalmers
Cohen
Cohen
Cohen
Jinwook Choi
Kilicoglu
Lewis
Mallett
Matwin
Porter
Sackett
Seunghee Kim
Tom
Yin
Publication venue: 'Elsevier BV'

Crossref

Thrombosis in Hospitalized Patients With COVID-19 in a New York City Health System

Author: Bunce
Connors
Cui
Eduardo Iturrate
Grimnes
Jeffrey S. Berger
Judith Hochman
Klok
Seda Bilaloglu
Simon Jones
Swartz
Yin Aphinyanaphongs
Publication venue: 'American Medical Association (AMA)'

Crossref