1,343 research outputs found

    Feature engineering and a proposed decision-support system for systematic reviewers of medical evidence

    Objectives: Evidence-based medicine depends on the timely synthesis of research findings. An important source of synthesized evidence resides in systematic reviews. However, a bottleneck in review production involves dual screening of citations with titles and abstracts to find eligible studies. For this research, we tested the effect of various kinds of textual information (features) on performance of a machine learning classifier. Based on our findings, we propose an automated system to reduce screening burden, as well as offer quality assurance. Methods: We built a database of citations from 5 systematic reviews that varied with respect to domain, topic, and sponsor. Consensus judgments regarding eligibility were inferred from published reports. We extracted 5 feature sets from citations: alphabetic, alphanumeric+, indexing, features mapped to concepts in systematic reviews, and topic models. To simulate a two-person team, we divided the data into random halves. We optimized the parameters of a Bayesian classifier, then trained and tested models on alternate data halves. Overall, we conducted 50 independent tests. Results: All tests of summary performance (mean F3) surpassed the corresponding baseline, P<0.0001. The ranks for mean F3, precision, and classification error were statistically different across feature sets averaged over reviews; P-values for Friedman's test were .045, .002, and .002, respectively. Differences in ranks for mean recall were not statistically significant. Alphanumeric+ features were associated with best performance; mean reduction in screening burden for this feature type ranged from 88% to 98% for the second pass through citations and from 38% to 48% overall. Conclusions: A computer-assisted, decision support system based on our methods could substantially reduce the burden of screening citations for systematic review teams and solo reviewers. Additionally, such a system could deliver quality assurance both by confirming concordant decisions and by naming studies associated with discordant decisions for further consideration. © 2014 Bekhuis et al.
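The abstract above reports performance as mean F3. As a reading aid, here is a minimal sketch of the standard F-beta measure with beta = 3, which weights recall more heavily than precision (a sensible choice when missing an eligible study costs more than screening an extra one). The function name `f_beta` and the confusion counts are illustrative assumptions, not values or code from the study.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 3.0) -> float:
    """Standard F-beta from confusion counts; beta > 1 favours recall."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical screening run (made-up counts): 45 eligible citations found,
# 30 ineligible citations flagged, 5 eligible citations missed.
score = f_beta(tp=45, fp=30, fn=5)
print(f"F3 = {score:.3f}")
```

With precision 0.6 and recall 0.9 in this made-up example, F3 lands much closer to recall than the balanced F1 would.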

    Automated Coding of Under-Studied Medical Concept Domains: Linking Physical Activity Reports to the International Classification of Functioning, Disability, and Health

    Linking clinical narratives to standardized vocabularies and coding systems is a key component of unlocking the information in medical text for analysis. However, many domains of medical concepts lack well-developed terminologies that can support effective coding of medical text. We present a framework for developing natural language processing (NLP) technologies for automated coding of under-studied types of medical information, and demonstrate its applicability via a case study on physical mobility function. Mobility is a component of many health measures, from post-acute care and surgical outcomes to chronic frailty and disability, and is coded in the International Classification of Functioning, Disability, and Health (ICF). However, mobility and other types of functional activity remain under-studied in medical informatics, and neither the ICF nor commonly-used medical terminologies capture functional status terminology in practice. We investigated two data-driven paradigms, classification and candidate selection, to link narrative observations of mobility to standardized ICF codes, using a dataset of clinical narratives from physical therapy encounters. Recent advances in language modeling and word embedding were used as features for established machine learning models and a novel deep learning approach, achieving a macro F-1 score of 84% on linking mobility activity reports to ICF codes. Both classification and candidate selection approaches present distinct strengths for automated coding in under-studied domains, and we highlight that the combination of (i) a small annotated data set; (ii) expert definitions of codes of interest; and (iii) a representative text corpus is sufficient to produce high-performing automated coding systems. 
This study has implications for the ongoing growth of NLP tools for a variety of specialized applications in clinical care and research. Comment: Updated final version, published in Frontiers in Digital Health, https://doi.org/10.3389/fdgth.2021.620828. 34 pages (23 text + 11 references); 9 figures, 2 tables.

    Supporting Career Development and Employment: Benefits Planning, Assistance and Outreach (BPA&O) and Protection and Advocacy for Beneficiaries of Social Security (PABSS)

    This training curriculum is dedicated to increasing knowledge and understanding of the Social Security Administration's disability and return-to-work programs and work incentive provisions as prescribed in the Social Security Act and the Ticket to Work and Work Incentives Improvement Act of 1999, as well as other federal benefit programs. These informational resources were compiled and edited to provide continuing education and print materials for benefits specialists and protection and advocacy personnel on the interplay of these benefit programs and their impact on employment.

    Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

    This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn

    The Difference in Oral Reading Fluency Scores among Rural, Urban, and Suburban School Locations When Using Istation

    The purpose of this quantitative, causal-comparative study was to determine whether there is a difference in reading fluency scores among students in kindergarten through second grade using the Istation reading program. The study is motivated by the link between students' oral reading fluency and overall reading comprehension. It focused on approximately 3,000 kindergarten through second grade students from a total of nine elementary schools in central North Carolina. Three schools were located in a rural area of the community, three in an urban area, and three in a suburban area. The students' beginning-of-year and middle-of-year Istation scores were analyzed using an ANCOVA. The results for the first hypothesis showed a significant difference between the rural and suburban groups and between the rural and urban groups. However, there was no significant difference between the urban and suburban groups. The results for the second hypothesis likewise showed a significant difference between the rural and suburban groups and between the rural and urban groups, but no significant difference between the urban and suburban groups. For the third and final hypothesis, the null hypothesis was not rejected. The study concluded that test scores differed significantly by school location. Recommendations for further research include comparing students' mid-year scores with their end-of-year scores, as well as studying locations in other areas of the United States to determine any correlations.

    Sentiment analysis of clinical narratives: A scoping review

    A clinical sentiment is a judgment, thought or attitude promoted by an observation with respect to the health of an individual. Sentiment analysis has drawn attention in the healthcare domain for secondary use of data from clinical narratives, with a variety of applications including predicting the likelihood of emerging mental illnesses or clinical outcomes. The current state of research has not yet been summarized. This study presents results from a scoping review aiming at providing an overview of sentiment analysis of clinical narratives in order to summarize existing research and identify open research gaps. The scoping review was carried out in line with the PRISMA-ScR (Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews) guideline. Studies were identified by searching 4 electronic databases (e.g., PubMed, IEEE Xplore) in addition to conducting backward and forward reference list checking of the included studies. We extracted information on use cases, methods and tools applied, used datasets and performance of the sentiment analysis approach. Of 1,200 citations retrieved, 29 unique studies were included in the review covering a period of 8 years. Most studies apply general domain tools (e.g. TextBlob) and sentiment lexicons (e.g. SentiWordNet) for realizing use cases such as prediction of clinical outcomes; others proposed new domain-specific sentiment analysis approaches based on machine learning. Accuracy values between 71.5-88.2% are reported. Data used for evaluation and test are often retrieved from MIMIC databases or i2b2 challenges. Latest developments related to artificial neural networks are not yet fully considered in this domain. We conclude that future research should focus on developing a gold standard sentiment lexicon, adapted to the specific characteristics of clinical narratives. 
Efforts have to be made to either augment existing or create new high-quality labeled data sets of clinical narratives. Last, the suitability of state-of-the-art machine learning methods for natural language processing and in particular transformer-based models should be investigated for their application for sentiment analysis of clinical narratives
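The review above notes that most studies rely on general-domain tools and sentiment lexicons. As a rough illustration of what lexicon-based scoring means, the sketch below averages the weights of lexicon words found in a note. The lexicon entries, their weights, the function name `sentiment_score`, and the sample note are all hypothetical; this is not the behaviour of TextBlob, SentiWordNet, or any tool from the reviewed studies.

```python
import re

# Toy lexicon with made-up polarity weights (hypothetical, for illustration only).
LEXICON = {
    "improved": 1.0,
    "stable": 0.5,
    "pain": -0.5,
    "worsening": -1.0,
    "distress": -1.0,
}


def sentiment_score(text: str, lexicon: dict = LEXICON) -> float:
    """Mean polarity of the lexicon words found in text; 0.0 if none match."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


# Invented example sentence, not real clinical data.
note = "Patient improved overnight and remains stable, mild pain on movement."
print(round(sentiment_score(note), 3))
```

A lookup like this ignores negation ("no pain") and clinical context, which is one reason the review calls for a gold-standard lexicon adapted to clinical narratives.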

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, several actors involved and a vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with existing information systems needed to support the day-to-day operations management in hospitals. A cross-sectional survey was used and data chosen with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65 % (n = 111). The physicians reported that information systems support their decision making to some extent, but they do not improve access to information nor are they tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer one information system to access important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision making process. Peer reviewed.

    Doctor of Philosophy

    Dissertation. Health information technology (HIT) in conjunction with quality improvement (QI) methodologies can promote higher quality care at lower costs. Unfortunately, most inpatient hospital settings have been slow to adopt HIT and QI methodologies. Successful adoption requires close attention to workflow. Workflow is the sequence of tasks, processes, and the set of people or resources needed for those tasks that are necessary to accomplish a given goal. Assessing the impact on workflow is an important component of determining whether a HIT implementation will be successful, but little research has been conducted on the impact of eMeasure (electronic performance measure) implementation on workflow. One solution to addressing implementation challenges such as the lack of attention to workflow is an implementation toolkit. An implementation toolkit is an assembly of instruments such as checklists, forms, and planning documents. We developed an initial eMeasure Implementation Toolkit for the heart failure (HF) eMeasure to allow QI and information technology (IT) professionals and their team to assess the impact of implementation on workflow. During the development phase of the toolkit, we undertook a literature review to determine the components of the toolkit. We conducted stakeholder interviews with HIT and QI key informants and subject matter experts (SMEs) at the US Department of Veterans Affairs (VA). Key informants provided a broad understanding about the context of workflow during eMeasure implementation. Using snowball sampling, we also interviewed other SMEs, recommended by the key informants, who suggested tools and provided information essential to the toolkit development. The second phase involved evaluation of the toolkit for relevance and clarity by experts in non-VA settings. The experts evaluated the sections of the toolkit that contained the tools, via a survey. 
The final toolkit provides a distinct set of resources and tools, which were iteratively developed during the research and available to users in a single source document. The research methodology provided a strong unified overarching implementation framework in the form of the Promoting Action on Research Implementation in Health Services (PARIHS) model in combination with a sociotechnical model of HIT that strengthened the overall design of the study

    Automatic production and integration of knowledge to the support of the decision and planning activities in medical-clinical diagnosis, treatment and prognosis.

    The concept of medical procedure refers to the set of activities carried out by health care professionals to solve or mitigate the health problems that affect a patient. Decision making within a medical procedure has long been one of the most interesting research areas in medical informatics, and it is the research context of this thesis. The motivation for this research rests on three main aspects: there are no knowledge models for all the medical-clinical activities that can be induced from medical data, there are no inductive learning solutions for all the medical-clinical activities, and there is no integral model that formalizes the concept of medical procedure. Therefore, our main objective is to develop a computable, knowledge-based model that integrates all the decision and planning activities for medical-clinical diagnosis, treatment and prognosis. To achieve this objective, first, we explain the research problem. Second, we describe the background of the work from both the medical and the informatics contexts. Third, we explain the development of the research proposal, based on four main contributions: a novel knowledge representation model, based on data, for the planning activity in medical-clinical diagnosis and treatment; a novel inductive learning methodology for the planning activity in medical-clinical diagnosis and treatment; a novel inductive learning methodology for the decision activity in medical-clinical prognosis; and finally, a novel computable model, based on data and knowledge, that integrates the decision and planning activities of medical-clinical diagnosis, treatment and prognosis.

    Study of Machine Learning Methods in Intelligent Transportation Systems

    Machine learning and data mining are currently hot topics of research and are applied in database, artificial intelligence, statistics, and so on to discover valuable knowledge and the patterns in big data available to users. Data mining is predominantly about processing unstructured data and extracting meaningful information from them for end users to help take business decisions. Machine learning techniques use mathematical algorithms to find a pattern or extract meaning out from big data. The popularity of such techniques in analyzing business problems has been enhanced by the arrival of big data. The main objective of this thesis is to study the importance of big data and machine learning and their impact on transportation industry. This thesis is primarily a review of the important machine learning algorithms and their applications in the field of big data. The author has tried to showcase the need to extract meaningful information from the vast amount of big data in the form of traffic data available in today’s world and also listed different machine learning techniques that can be used to extract this knowledge required in order to facilitate better decision making for transportation applications. The analysis is done by using five different multivariate analysis and machine learning techniques in data mining namely cluster analysis, multivariate linear regression, hierarchical multiple regression, factor analysis and discriminant analysis in two different software packages namely SPSS and R. As part of the analysis, the author has tried to explain how knowledge extracted from random traffic data containing variables such as age of the driver, sex of the driver, the day of the week, atmospheric condition and blood alcohol content of the driver can play an important role in predicting the traffic crash. The data taken into account is accident data, which was obtained from Fatality Analysis Reporting System (FARS) ranging from the year 1999 to 2009. 
It is concluded that traffic accidents were mostly impacted by the atmospheric conditions, blood alcohol content followed by the day of the week