71 research outputs found

    The writing on the wall: the concealed communities of the East Yorkshire horselads

    This paper examines the graffiti found within late-nineteenth and early-twentieth-century farm buildings in the Wolds of East Yorkshire. It suggests that the graffiti were created by a group of young men at the bottom of the social hierarchy – the horselads – and were one of the ways in which they constructed a distinctive sense of communal identity at a particular stage in their lives. Whilst the graffiti tell us much about changing agricultural regimes and social structures, they also inform us about experiences and attitudes often hidden from official histories and biographies. In this way, the graffiti are argued to inform our understanding, not only of a concealed community, but also about their hidden histor

    Tutorial: Multivariate Classification for Vibrational Spectroscopy in Biological Samples

    Vibrational spectroscopy techniques, such as Fourier-transform infrared (FTIR) and Raman spectroscopy, have been successful methods for studying the interaction of light with biological materials and facilitating novel cell biology analysis. Spectrochemical analysis is very attractive in disease screening and diagnosis, microbiological studies and forensic and environmental investigations because of its low cost, minimal sample preparation, non-destructive nature and substantially accurate results. However, there is now an urgent need for multivariate classification protocols allowing one to analyze biologically derived spectrochemical data to obtain accurate and reliable results. Multivariate classification comprises discriminant analysis and class-modeling techniques where multiple spectral variables are analyzed in conjunction to distinguish and assign unknown samples to pre-defined groups. The requirement for such protocols is demonstrated by the fact that applications of deep-learning algorithms to complex datasets are being increasingly recognized as critical for extracting important information and visualizing it in a readily interpretable form. Here, we provide a tutorial for multivariate classification analysis of vibrational spectroscopy data (FTIR, Raman and near-IR) highlighting a series of critical steps, such as preprocessing, data selection, feature extraction, classification and model validation. This is an essential aspect toward the construction of a practical spectrochemical analysis model for biological analysis in real-world applications, where fast, accurate and reliable classification models are fundamental.
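
The steps named in this abstract (preprocessing, classification, model validation) can be illustrated with a minimal sketch. It is not the tutorial's protocol: a nearest-centroid classifier stands in for discriminant analysis, leave-one-out cross-validation stands in for model validation, and the three-point "spectra" and the class labels `healthy` and `disease` are invented purely for illustration.

```python
import math

def vector_normalize(spectrum):
    """Preprocessing: scale a spectrum to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    return [v / norm for v in spectrum]

def centroid(spectra):
    """Mean spectrum of a group, computed variable by variable."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]

def fit(spectra, labels):
    """Classification model: one centroid per pre-defined class."""
    groups = {}
    for spectrum, label in zip(spectra, labels):
        groups.setdefault(label, []).append(spectrum)
    return {label: centroid(group) for label, group in groups.items()}

def predict(model, spectrum):
    """Assign an unknown spectrum to the class with the nearest centroid."""
    return min(model, key=lambda label: math.dist(model[label], spectrum))

def leave_one_out_accuracy(spectra, labels):
    """Validation: hold out each sample once and train on the rest."""
    hits = 0
    for i in range(len(spectra)):
        train_x = spectra[:i] + spectra[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        hits += predict(model, spectra[i]) == labels[i]
    return hits / len(spectra)

# Hypothetical toy data: four "spectra" with three variables each.
raw = [[1.0, 0.2, 0.1], [0.9, 0.3, 0.1], [0.1, 0.2, 1.0], [0.2, 0.3, 0.9]]
labels = ["healthy", "healthy", "disease", "disease"]

spectra = [vector_normalize(s) for s in raw]
accuracy = leave_one_out_accuracy(spectra, labels)
print(accuracy)
```

Real spectra have hundreds of variables, so a feature-extraction step would normally sit between preprocessing and classification; it is omitted here to keep the sketch short.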

    Estimating cardiac output from arterial blood pressure waveforms: a critical evaluation using the MIMIC II database

    Cardiac output (CO) estimation using arterial blood pressure (ABP) waveforms has been an active area of physiology research over the past century. However, the effectiveness of the estimators has not been extensively studied in a clinical setting. In this paper, we evaluate 11 well-known CO estimators using clinical radial ABP waveforms from the Multi-Parameter Intelligent Monitoring for Intensive Care II (MIMIC II) database, using thermodilution CO (TCO) as reference for comparison. We compare estimations to 988 TCO measurements in 84 patients, totaling 165 hours of ABP waveforms sampled at 125 Hz. As a necessary step for producing absolute CO estimates, we also present 3 methods of calibrating the estimators, each tailored towards a different use model. The results show that the standard deviation of error between TCO and the best CO estimators is approximately 1 L/min for absolute CO estimates. For relative estimates without calibration, the best CO estimator has 18% error at 1 standard deviation.
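
The idea of calibrating an uncalibrated estimator against thermodilution and then reporting the standard deviation of the error can be sketched as follows. This is an assumption-laden illustration, not one of the three calibration methods the abstract mentions: it fits a single scale factor `k` by least squares through the origin, and every number in it is made up.

```python
from statistics import pstdev

def calibration_constant(raw, tco):
    """Least-squares scale factor k minimising sum((k * raw - tco)^2)."""
    numerator = sum(r * t for r, t in zip(raw, tco))
    denominator = sum(r * r for r in raw)
    return numerator / denominator

def error_std(estimates, reference):
    """Population standard deviation of the estimate-minus-reference error."""
    errors = [e - r for e, r in zip(estimates, reference)]
    return pstdev(errors)

# Hypothetical values: uncalibrated estimator outputs (arbitrary units)
# paired with thermodilution CO measurements in L/min.
raw = [40.0, 50.0, 60.0]
tco = [4.0, 5.0, 6.0]

k = calibration_constant(raw, tco)
calibrated = [k * r for r in raw]
print(k, error_std(calibrated, tco))
```

Because the toy data are exactly proportional, the calibrated error here is essentially zero; on clinical data the same calculation would give a nonzero spread.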

    Automated de-identification of free-text medical records

    Background: Text-based patient medical records are a vital resource in medical research. In order to preserve patient confidentiality, however, the U.S. Health Insurance Portability and Accountability Act (HIPAA) requires that protected health information (PHI) be removed from medical records before they can be disseminated. Manual de-identification of large medical record databases is prohibitively expensive, time-consuming and prone to error, necessitating automatic methods for large-scale, automated de-identification. Methods: We describe an automated Perl-based de-identification software package that is generally usable on most free-text medical records, e.g., nursing notes, discharge summaries, X-ray reports, etc. The software uses lexical look-up tables, regular expressions, and simple heuristics to locate both HIPAA PHI, and an extended PHI set that includes doctors' names and years of dates. To develop the de-identification approach, we assembled a gold standard corpus of re-identified nursing notes with real PHI replaced by realistic surrogate information. This corpus consists of 2,434 nursing notes containing 334,000 words and a total of 1,779 instances of PHI taken from 163 randomly selected patient records. This gold standard corpus was used to refine the algorithm and measure its sensitivity. To test the algorithm on data not used in its development, we constructed a second test corpus of 1,836 nursing notes containing 296,400 words. The algorithm's false negative rate was evaluated using this test corpus. Results: Performance evaluation of the de-identification software on the development corpus yielded an overall recall of 0.967, precision value of 0.749, and fallout value of approximately 0.002. On the test corpus, a total of 90 instances of false negatives were found, or 27 per 100,000 word count, with an estimated recall of 0.943. Only one full date and one age over 89 were missed. No patient names were missed in either corpus. 
    Conclusion: We have developed a pattern-matching de-identification system based on dictionary look-ups, regular expressions, and heuristics. Evaluation based on two different sets of nursing notes collected from a U.S. hospital suggests that, in terms of recall, the software out-performs a single human de-identifier (0.81) and performs at least as well as a consensus of two human de-identifiers (0.94). The system is currently tuned to de-identify PHI in nursing notes and discharge summaries but is sufficiently generalized and can be customized to handle text files of any format. Although the accuracy of the algorithm is high, it is probably insufficient to be used to publicly disseminate medical data. The open-source de-identification software and the gold standard re-identified corpus of medical records have therefore been made available to researchers via the PhysioNet website to encourage improvements in the algorithm.
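
A rough sketch of the pattern-matching approach this abstract describes (look-up tables, regular expressions and simple heuristics), together with the recall and precision measures it reports. The original package is Perl-based; Python is used here only for illustration. The name list, the patterns, the sample note and the counts are all hypothetical and far simpler than anything a real system would need.

```python
import re

# Hypothetical look-up table of known surnames (lower-cased).
KNOWN_NAMES = {"smith", "jones"}

# Regular expression for numeric dates such as 3/14/2005.
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")

# Heuristic: a capitalised word following a title is treated as a name.
TITLE_NAME_RE = re.compile(r"\b(Dr|Mr|Mrs|Ms)\.?\s+([A-Z][a-z]+)")

def deidentify(text):
    """Replace dates and names with placeholder tokens."""
    text = DATE_RE.sub("[DATE]", text)
    text = TITLE_NAME_RE.sub(lambda m: f"{m.group(1)}. [NAME]", text)

    def mask_known(match):
        word = match.group(0)
        return "[NAME]" if word.lower() in KNOWN_NAMES else word

    return re.sub(r"\b[A-Za-z]+\b", mask_known, text)

def recall(true_pos, false_neg):
    """Fraction of real PHI instances that were found."""
    return true_pos / (true_pos + false_neg)

def precision(true_pos, false_pos):
    """Fraction of flagged items that were real PHI."""
    return true_pos / (true_pos + false_pos)

note = "Pt seen by Dr. Brown on 3/14/2005; Smith family notified."
print(deidentify(note))

# Hypothetical counts, not the figures reported in the abstract.
print(recall(90, 10), precision(90, 30))
```

Recall matters most in this setting because a missed identifier is a confidentiality breach, whereas a false positive only removes some harmless text.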