158 research outputs found

    Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

    From medical charts to national censuses, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare today generates an enormous amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, deriving accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitized patient records, it is important to remember that volume represents only one aspect of the data. In fact, because it draws on an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter highlights such challenges and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques. Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figure

    Recent trends and developments in pyrolysis-gas chromatography: review

    Pyrolysis-gas chromatography (Py-GC) has become well established as a simple, quick and reliable analytical technique for a range of applications including the analysis of polymeric materials. Recent developments in Py-GC technology and instrumentation include laser pyrolysis and non-discriminating pyrolysis. Progress has also been made in the detection of low level polymer additives with the use of novel Py-GC devices. Furthermore, it has been predicted that future advances in separation technology such as the use of comprehensive two-dimensional gas chromatography will further enhance the analytical scope of Py-GC

    The epidemiology and transmission of methicillin-resistant Staphylococcus aureus in the community in Singapore: study protocol for a longitudinal household study.

    BACKGROUND/AIM: Methicillin-resistant Staphylococcus aureus (MRSA) is one of the most common multidrug-resistant organisms in healthcare settings worldwide, but little is known about MRSA transmission outside of acute healthcare settings, especially in Asia. We describe the methods for a prospective longitudinal study of MRSA prevalence and transmission. METHODS: MRSA-colonized individuals were identified from MRSA admission screening at two tertiary hospitals and recruited together with their household contacts. Participants submitted self-collected nasal, axilla and groin (NAG) swabs by mail for MRSA culture at baseline and monthly thereafter for 6 months. A comparison group of households of MRSA-negative patients provided swab samples at one time point. In a validation sub-study, separate swabs from each site were collected from randomly selected individuals, to compare MRSA detection rates between swab sites, and between samples collected by participants versus those collected by trained research staff. Information on each participant's demographics, medical status and medical history, past healthcare facility usage and contacts, and personal interactions with others was collected using a self-administered questionnaire. DISCUSSION/CONCLUSION: Understanding the dynamics of MRSA persistence and transmission in the community is crucial to devising and evaluating successful MRSA control strategies. Close contact with MRSA-colonized patients may be important for MRSA persistence in the community; evidence from this study on the extent of community MRSA could inform the development of household- or community-based interventions to reduce MRSA colonization of close contacts and subsequent re-introduction of MRSA into healthcare settings. Analysis of longitudinal data using whole-genome sequencing will yield further information regarding MRSA transmission within households, with significant implications for MRSA infection control outside acute hospital settings

    The Impact of Oxygen on Metabolic Evolution: A Chemoinformatic Investigation

    The appearance of planetary oxygen likely transformed the chemical and biochemical makeup of life and probably triggered episodes of organismal diversification. Here we use chemoinformatic methods to explore the impact of the rise of oxygen on metabolic evolution. We undertake a comprehensive comparative analysis of structures, chemical properties and chemical reactions of anaerobic and aerobic metabolites. The results indicate that aerobic metabolism has expanded the structural and chemical space of metabolites considerably, including the appearance of 130 novel molecular scaffolds. The molecular functions of these metabolites are mainly associated with derived aspects of cellular life, such as signal transfer, defense against biotic factors, and protection of organisms from oxidation. Moreover, aerobic metabolites are more hydrophobic and rigid than anaerobic compounds, suggesting they are better fit to modulate membrane functions and to serve as transmembrane signaling factors. Since higher organisms depend largely on sophisticated membrane-enabled functions and intercellular signaling systems, the metabolic developments brought about by oxygen benefit the diversity of cellular makeup and the complexity of cellular organization as well. These findings enhance our understanding of the molecular link between oxygen and evolution. They also show the significance of chemoinformatics in addressing basic biological questions

    Conserved CDC20 Cell Cycle Functions Are Carried out by Two of the Five Isoforms in Arabidopsis thaliana

    The CDC20 and Cdh1/CCS52 proteins are substrate determinants and activators of the Anaphase Promoting Complex/Cyclosome (APC/C) E3 ubiquitin ligase and as such they control the mitotic cell cycle by targeting the degradation of various cell cycle regulators. In yeasts and animals the main CDC20 function is the destruction of securin and mitotic cyclins. Plants have multiple CDC20 gene copies whose functions have not yet been explored. In Arabidopsis thaliana there are five CDC20 isoforms, and here we aimed at defining their contribution to cell cycle regulation, substrate selectivity and plant development. By studying the gene structure and phylogeny of plant CDC20s, the expression of the five AtCDC20 gene copies and their interactions with the APC/C subunit APC10, the CCS52 proteins, components of the mitotic checkpoint complex (MCC) and mitotic cyclin substrates, we could assign conserved CDC20 functions to AtCDC20.1 and AtCDC20.2. The other three intron-less genes were silent and specific for Arabidopsis. We show that AtCDC20.1 and AtCDC20.2 are components of the MCC and interact with mitotic cyclins with unexpected specificity. AtCDC20.1 and AtCDC20.2 are expressed in meristems and organ primordia, and AtCDC20.1 is also expressed in pollen grains and developing seeds. Knocking down both genes simultaneously by RNAi resulted in severe delay in plant development and male sterility. In these lines, the meristem size was reduced while the cell size and ploidy levels were unaffected, indicating that the lower cell number and likely slowdown of the cell cycle are the cause of reduced plant growth. The intron-containing CDC20 gene copies provide conserved and redundant functions for cell cycle progression in plants and are required for meristem maintenance, plant growth and male gametophyte formation. The Arabidopsis-specific intron-less genes are possibly "retrogenes" and have hitherto undefined functions or are pseudogenes

    Integrating the Genetic and Physical Maps of Arabidopsis thaliana: Identification of Mapped Alleles of Cloned Essential (EMB) Genes

    The classical genetic map of Arabidopsis includes more than 130 genes with an embryo-defective (emb) mutant phenotype. Many of these essential genes remain to be cloned. Hundreds of additional EMB genes have been cloned and catalogued (www.seedgenes.org) but not mapped. To facilitate EMB gene identification and assess the current level of saturation, we updated the classical map, compared the physical and genetic locations of mapped loci, and performed allelism tests between mapped (but not cloned) and cloned (but not mapped) emb mutants with similar chromosome locations. Two hundred pairwise combinations of genes located on chromosomes 1 and 5 were tested and more than 1100 total crosses were screened. Sixteen of 51 mapped emb mutants examined were found to be disrupted in a known EMB gene. Alleles of a wide range of published EMB genes (YDA, GLA1, TIL1, AtASP38, AtDEK1, EMB506, DG1, OEP80) were discovered. Two EMS mutants isolated 30 years ago, T-DNA mutants with complex insertion sites, and a mutant with an atypical, embryo-specific phenotype were resolved. The frequency of allelism encountered was consistent with past estimates of 500 to 1000 EMB loci. New EMB genes identified among mapped T-DNA insertion mutants included CHC1, which is required for chromatin remodeling, and SHS1/AtBT1, which encodes a plastidial nucleotide transporter similar to the maize Brittle1 protein required for normal endosperm development. Two classical genetic markers (PY, ALB1) were identified based on similar map locations of known genes required for thiamine (THIC) and chlorophyll (PDE166) biosynthesis. The alignment of genetic and physical maps presented here should facilitate the continued analysis of essential genes in Arabidopsis and further characterization of a broad spectrum of mutant phenotypes in a model plant

    Does the early frog catch the worm? Disentangling potential drivers of a parasite age–intensity relationship in tadpoles

    The manner in which parasite intensity and aggregation vary with host age can provide insights into parasite dynamics and help identify potential means of controlling infections in humans and wildlife. A significant challenge is to distinguish among competing mechanistic hypotheses for the relationship between age and parasite intensity or aggregation. Because different mechanisms can generate similar relationships, testing among competing hypotheses can be difficult, particularly in wildlife hosts, and often requires a combination of experimental and model fitting approaches. We used field data, experiments, and model fitting to distinguish among ten plausible drivers of a curvilinear age–intensity relationship and increasing aggregation with host age for echinostome trematode infections of green frogs. We found little support for most of these proposed drivers but did find that the parsimonious explanation for the observed age–intensity relationship was seasonal exposure to echinostomes. The parsimonious explanation for the aggregated distribution of parasites in this host population was heterogeneity in exposure. A predictive model incorporating seasonal exposure indicated that tadpoles hatching early or late in the breeding season should have lower trematode burdens at metamorphosis, particularly with simulated warmer climates. Application of this multi-pronged approach (field surveys, lab experiments, and modeling) to additional parasite–host systems could lead to discovery of general patterns in the drivers of parasite age–intensity and age–distribution relationships

    NMR Studies on Structure and Dynamics of the Monomeric Derivative of BS-RNase: New Insights for 3D Domain Swapping

    Three-dimensional domain swapping is a common phenomenon in pancreatic-like ribonucleases. In the aggregated state, these proteins acquire new biological functions, including selective cytotoxicity against tumour cells. RNase A is able to dislocate both N- and C-termini, but usually this process requires denaturing conditions. In contrast, bovine seminal ribonuclease (BS-RNase), which is a homo-dimeric protein sharing 80% sequence identity with RNase A, occurs natively as a mixture of swapped and unswapped isoforms. Indeed, the presence of two disulfides bridging the subunits ensures a dimeric structure for the unswapped molecule as well. In vitro, the two BS-RNase isoforms interconvert under physiological conditions. Since the tendency to swap is often related to the instability of the monomeric proteins, in this paper we have analysed in detail the stability in solution of the monomeric derivative of BS-RNase (mBS) by a combination of NMR studies and Molecular Dynamics Simulations. The refinement of the NMR structure and the relaxation data indicate a close similarity with RNase A, without any evidence of aggregation or partial opening. The high compactness of the mBS structure is also confirmed by H/D exchange, urea denaturation, and TEMPOL mapping of the protein surface. The present extensive structural and dynamic investigation of mBS did not show any experimental evidence that could explain the known differences in swapping between BS-RNase and RNase A. Hence, we conclude that the swapping in BS-RNase must be influenced by the distinct features of the dimers, suggesting a prominent role for the interchain disulfide bridges

    Cervical lymph node metastasis in adenoid cystic carcinoma of the larynx: a collective international review

    Adenoid cystic carcinoma (AdCC) of the head and neck is a well-recognized pathologic entity that rarely occurs in the larynx. Although the 5-year locoregional control rates are high, distant metastasis has a tendency to appear more than 5 years post treatment. Because AdCC of the larynx is uncommon, it is difficult to standardize a treatment protocol. One of the controversial points is the decision whether or not to perform an elective neck dissection on these patients. Because there is contradictory information about this issue, we have critically reviewed the literature from 1912 to 2015 on all reported cases of AdCC of the larynx in order to clarify it. During the most recent period of our review (1991-2015), with a more exact diagnosis of the tumor histology, 142 cases of AdCC of the larynx were observed, of which 91 had data pertaining to lymph node status. Eleven of the 91 patients (12.1%) had nodal metastasis and, based on this low proportion of patients, routine elective neck dissection is not recommended