    The Use and Misuse of Biomedical Data: Is Bigger Really Better?

    Very large biomedical research databases, containing electronic health records (EHR) and genomic data from millions of patients, have recently been heralded for their potential to accelerate scientific discovery and produce dramatic improvements in medical treatments. Research enabled by these databases may also lead to profound changes in law, regulation, social policy, and even litigation strategies. Yet is “big data” necessarily better data? This paper makes an original contribution to the legal literature by focusing on what can go wrong in the process of biomedical database research and what precautions are necessary to avoid critical mistakes. We address three main reasons for a cautious approach to such research and to relying on its outcomes for purposes of public policy or litigation. First, the data contained in databases is surprisingly likely to be incorrect or incomplete. Second, systematic biases, arising from both the nature of the data and the preconceptions of investigators, are serious threats to the validity of biomedical database research, especially in answering causal questions. Third, data mining of biomedical databases makes it easier for individuals with political, social, or economic agendas to generate ostensibly scientific but misleading research findings for the purpose of manipulating public opinion and swaying policy makers. In short, this paper sheds much-needed light on the problems of credulous and uninformed uses of biomedical databases. An understanding of the pitfalls of big data analysis is of critical importance to anyone who will rely on or dispute its outcomes, including lawyers, policy makers, and the public at large. The article also recommends technical, methodological, and educational interventions to combat the dangers of database errors and abuses.
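    The paper itself is a legal analysis and contains no code, but the statistical pitfall behind its second and third points can be made concrete. The following minimal simulation (all variable names and parameter values are hypothetical, chosen for illustration) shows how screening a large database for many associations surfaces "significant" findings by chance alone:

```python
# Illustrative simulation only (not from the paper): mining many unrelated
# variables against an outcome yields spurious "significant" associations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_patients, n_variables = 10_000, 500

outcome = rng.normal(size=n_patients)              # outcome unrelated to any variable
data = rng.normal(size=(n_patients, n_variables))  # synthetic, pure-noise "biomarkers"

# Pearson correlation of each variable with the outcome.
r = (data - data.mean(0)).T @ (outcome - outcome.mean()) / (
    n_patients * data.std(0) * outcome.std())

# Two-sided p-values via Fisher's z-transformation.
z = np.arctanh(r) * np.sqrt(n_patients - 3)
p = 2 * stats.norm.sf(np.abs(z))

print(f"spurious 'discoveries' at p < 0.05: {(p < 0.05).sum()} of {n_variables}")
# Roughly 5% (about 25 variables) pass, despite there being no real associations.
```

    Without correction for multiple comparisons, each such spurious result can be presented as an ostensibly scientific finding, which is exactly the manipulation risk the paper warns about.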

    Social analytics for health integration, intelligence, and monitoring

    Nowadays, patient-generated social health data are abundant, and healthcare is changing from an authoritative, provider-centric model to collaborative, patient-oriented care. The aim of this dissertation is to provide a Social Health Analytics framework that utilizes social data to address the interdisciplinary research challenges of Big Data Science and Health Informatics. Specific research issues and objectives are described below.

    The first objective is semantic integration of heterogeneous health data sources, which range from structured to unstructured and include patient-generated social data as well as authoritative data. An information seeker must otherwise spend time selecting information from many websites and integrating it into a coherent mental model. An integrated health data model is designed to accommodate data features from different sources. The model utilizes semantic linked data for lightweight integration and supports a set of analytics and inferences over the data sources. A prototype analytical and reasoning tool called “Social InfoButtons,” which can be linked from existing EHR systems, is developed to allow doctors to understand and take into consideration the behaviors, patterns, and trends of patients’ healthcare practices during a patient’s care. The tool can also offer insights to help public health officials make better-informed policy decisions.

    The second objective is near-real-time monitoring of disease outbreaks using social media. Research on epidemic detection based on search query terms entered by millions of users is limited by the fact that query terms are not easily accessible to non-affiliated researchers. Publicly available Twitter data is therefore exploited to develop the Epidemics Outbreak and Spread Detection System (EOSDS). EOSDS provides four visual analytics tools for monitoring epidemics, i.e., the Instance Map, Distribution Map, Filter Map, and Sentiment Trend, to investigate public health threats in space and time.

    The third objective is to capture, analyze, and quantify public health concerns through sentiment classification of Twitter data. Traditional public health surveillance systems find it hard to detect and monitor health-related concerns and changes in public attitudes toward health issues, owing to their expense and significant time delays. A two-step sentiment classification model is built to measure the concern: in the first step, Personal tweets are distinguished from Non-Personal tweets; in the second step, Personal Negative tweets are further separated from Personal Non-Negative tweets. In the proposed classification, training data are labeled by an emotion-oriented, clue-based method, and three machine learning models are trained and tested. A Measure of Concern (MOC) is computed from the number of Personal Negative sentiment tweets (a minimal sketch of this pipeline follows the abstract). A timeline trend of the MOC is also generated to monitor public concern levels, which is important for health emergency resource allocation and policy making.

    The fourth objective is predicting medical condition incidence and progression trajectories using patients’ self-reported data on PatientsLikeMe. Some medical conditions are correlated with each other to a measurable degree (“comorbidities”). A prediction model is provided to predict comorbidities, rank future conditions by their likelihood, and predict possible progression trajectories given an observed medical condition. The novel models for trajectory prediction of medical conditions are validated against the comorbidities reported in the medical literature.
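    The two-step MOC pipeline in the third objective is concrete enough to sketch. Below is a minimal, hypothetical Python rendition: the toy training tweets, the TF-IDF/Naive Bayes classifier choice, and the per-bucket normalization of the MOC are all assumptions for illustration; the dissertation's clue-based labeling and its three trained models are not reproduced here.

```python
# Minimal sketch of a two-step sentiment pipeline and a Measure of Concern.
# Classifier, training texts, and MOC normalization are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny stand-in training sets; the real labels came from a clue-based method.
step1_texts = ["I feel sick today", "my flu is awful", "I caught a cold",
               "CDC issues flu advisory", "new vaccine study published",
               "health dept reports outbreak"]
step1_labels = ["Personal"] * 3 + ["Non-Personal"] * 3

step2_texts = ["I feel sick today", "my flu is awful",
               "finally feeling better", "glad I got my flu shot"]
step2_labels = ["Negative", "Negative", "Non-Negative", "Non-Negative"]

# Step 1: Personal vs Non-Personal tweets.
step1 = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(step1_texts, step1_labels)
# Step 2: among Personal tweets, Negative vs Non-Negative.
step2 = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(step2_texts, step2_labels)

def measure_of_concern(tweets):
    """MOC for one time bucket: count of Personal Negative tweets,
    normalized here by total volume (one plausible choice)."""
    personal = [t for t in tweets if step1.predict([t])[0] == "Personal"]
    negative = [t for t in personal if step2.predict([t])[0] == "Negative"]
    return len(negative) / max(len(tweets), 1)

print(measure_of_concern(["my flu is awful", "CDC issues flu advisory"]))
```

    The two-stage design matters because negativity is only meaningful for tweets about personal experience; filtering out news and institutional tweets first keeps the MOC tied to individual concern rather than media volume.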

    Development of computer model and expert system for pneumatic fracturing of geologic formations

    The objective of this study was the development of a new computer program called PF-Model to analyze pneumatic fracturing of geologic formations. Pneumatic fracturing is an in situ remediation process that involves injecting high-pressure gas into soil or rock matrices to enhance permeability, as well as to introduce liquid and solid amendments. PF-Model has two principal components: (1) Site Screening, which heuristically evaluates sites with regard to process applicability; and (2) System Design, which uses the numerical solution of a coupled algorithm to generate preliminary design parameters.

    Designed as an expert system, the Site Screening component is a high-performance computer program capable of simulating human expertise within a narrow domain. The reasoning process is controlled by the inference engine, which uses subjective probability theory (based on Bayes' theorem) to handle uncertainty; a minimal sketch of this style of update follows the abstract. The expert system also contains an extensive knowledge base of geotechnical data related to field performance of pneumatic fracturing. The hierarchical order of importance established for the geotechnical properties was formation type, depth, consistency/relative density, plasticity, fracture frequency, weathering, and depth of water table. The expert system was validated by a panel of five experts who rated selected sites on the applicability of the three main variants of pneumatic fracturing. Overall, PF-Model demonstrated better than 80% agreement with the expert panel.

    The System Design component was programmed with structured algorithms to accomplish two main functions: (1) to estimate fracture aperture and radius (Fracture Prediction Mode); and (2) to calibrate post-fracture Young's modulus and pneumatic conductivity (Calibration Mode). The Fracture Prediction Mode uses numerical analysis to converge on a solution by considering the three coupled physical processes that affect fracture propagation: pressure distribution, leakoff, and deflection. The Calibration Mode regresses the modulus using a modified deflection equation, and then converges on the conductivity in a method similar to the Fracture Prediction Mode. The System Design component was validated and calibrated for each of the 14 geologic formation types supported by the program. Validation was done by comparing the results of PF-Model to the original mathematical model. For the calibration process, default values for flow rate, density, Poisson's ratio, modulus, and pneumatic conductivity were established by regression until the model simulated, in general, actual site behavior.

    PF-Model was programmed in Visual Basic 5.0 and features a menu-driven GUI. Three extensive default libraries are provided: the probabilistic knowledge base, flownet shape factors, and geotechnical defaults. Users can conveniently access and modify the default libraries to reflect evolving trends and knowledge. Recommendations for future study are included in the work.
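    The abstract describes the Site Screening inference engine only at a high level (subjective probability via Bayes' theorem). The sketch below shows the generic form such an update could take; the prior, the evidence items, and all likelihood values are hypothetical placeholders, not PF-Model's actual knowledge base, and chaining the updates assumes the evidence items are conditionally independent.

```python
# Minimal sketch of a Bayesian update for a binary site-screening hypothesis,
# H = "pneumatic fracturing is applicable at this site". All values are
# hypothetical placeholders, not PF-Model's knowledge base.

def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H | E) from Bayes' theorem for a binary hypothesis H."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))

# Chain updates over observed geotechnical evidence, mirroring the abstract's
# order of importance (formation type first, then depth, consistency, ...).
p = 0.50  # neutral prior on applicability
evidence = [
    ("formation type: clay till", 0.80, 0.30),  # (description, P(E|H), P(E|~H))
    ("depth: 10-20 ft",           0.70, 0.40),
    ("consistency: stiff",        0.65, 0.45),
]
for desc, p_eh, p_enh in evidence:
    p = bayes_update(p, p_eh, p_enh)
    print(f"after '{desc}': P(applicable) = {p:.2f}")
```

    Each piece of evidence whose likelihood is higher under H than under not-H pushes the posterior upward, which is how a knowledge base of per-property likelihoods can accumulate into an overall site-applicability rating.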