128 research outputs found

    Identification of a 5-Protein Biomarker Molecular Signature for Predicting Alzheimer's Disease

    Background: Alzheimer’s disease (AD) is a progressive brain disease with a huge cost to human lives. The impact of the disease is also a growing concern for the governments of developing countries, in particular due to the increasingly high number of elderly citizens at risk. Alzheimer’s is the most common form of dementia, a common term for memory loss and other cognitive impairments. There is no current cure for AD, but there are drug and non-drug based approaches for its treatment. In general the drug-treatments are directed at slowing the progression of symptoms. They have proved to be effective in a large group of patients but success is directly correlated with identifying the disease carriers at its early stages. This justifies the need for timely and accurate forms of diagnosis via molecular means. We report here a 5-protein biomarker molecular signature that achieves, on average, a 96% total accuracy in predicting clinical AD. The signature is composed of the abundances of IL-1α, IL-3, EGF, TNF-α and G-CSF. Methodology/Principal Findings: Our results are based on a recent molecular dataset that has attracted worldwide attention. Our paper illustrates that improved results can be obtained with the abundance of only five proteins. Our methodology consisted of the application of an integrative data analysis method. This four step process included: a) abundance quantization, b) feature selection, c) literature analysis, d) selection of a classifier algorithm which is independent of the feature selection process. These steps were performed without using any sample of the test datasets. For the first two steps, we used the application of Fayyad and Irani’s discretization algorithm for selection and quantization, which in turn creates an instance of the (alpha-beta)-k-Feature Set problem; a numerical solution of this problem led to the selection of only 10 proteins. 
Conclusions/Significance: The previous study has provided an extremely useful dataset for the identification of AD biomarkers. However, our subsequent analysis also revealed several important facts worth reporting: 1. A 5-protein signature (which is a subset of the 18-protein signature of Ray et al.) has the same overall performance (when using the same classifier). 2. Using more than 20 different classifiers available in the widely used Weka software package, our 5-protein signature has, on average, a smaller prediction error, indicating the independence of the classifier and the robustness of this set of biomarkers (i.e. 96% accuracy when predicting AD against non-demented control). 3. Using very simple classifiers, like Simple Logistic or Logistic Model Trees, we have achieved the following results on 92 samples: 100 percent success in predicting Alzheimer’s Disease and 92 percent in predicting Non Demented Control on the AD dataset.
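
The abstract above names Fayyad and Irani's discretization as the quantization step. As a rough illustration only, the sketch below finds the single entropy-minimizing cut point for one protein abundance and applies the MDL acceptance test from that algorithm. It is not the authors' pipeline: the full algorithm applies this step recursively, and the function names and toy abundances are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (cut, gain, accepted) for the best binary cut, or None if no cut exists.

    `accepted` applies the MDL criterion of Fayyad and Irani to that one cut.
    """
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    n = len(pairs)
    base = entropy(ys)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # a cut cannot separate identical values
        left, right = ys[:i], ys[i:]
        cond = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        gain = base - cond
        if best is None or gain > best[1]:
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, gain, left, right)
    if best is None:
        return None
    cut, gain, left, right = best
    # MDL acceptance test: the gain must exceed the cost of encoding the cut.
    k, k1, k2 = len(set(ys)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (
        k * base - k1 * entropy(left) - k2 * entropy(right)
    )
    threshold = (math.log2(n - 1) + delta) / n
    return cut, gain, gain > threshold


# Hypothetical abundances for one protein; "NDC" = non-demented control.
toy_values = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
toy_labels = ["NDC"] * 4 + ["AD"] * 4
print(best_cut(toy_values, toy_labels))
```

A feature whose best cut is rejected by the MDL test is left as a single interval and carries no class information, which is how this kind of discretization can double as a feature filter.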

    A bioinformatics knowledge discovery in text application for grid computing

    Background: A fundamental activity in biomedical research is Knowledge Discovery, which makes it possible to search through large amounts of biomedical information such as documents and data. High-performance computational infrastructures, such as Grid technologies, are emerging as a possible infrastructure to tackle the intensive use of Information and Communication resources in life science. The goal of this work was to develop a software middleware solution in order to exploit the many knowledge discovery applications on scalable and distributed computing systems to achieve intensive use of ICT resources. Methods: The development of a grid application for Knowledge Discovery in Text using a middleware solution based methodology is presented. The system must be able to perform a user application model and process the jobs with the aim of creating many parallel jobs to distribute on the computational nodes. Finally, the system must be aware of the computational resources available and their status, and must be able to monitor the execution of parallel jobs. These operative requirements led to the design of a middleware to be specialized using user application modules. It included a graphical user interface to access a node search system, a load balancing system and a transfer optimizer to reduce communication costs. Results: A middleware solution prototype and its performance evaluation in terms of the speed-up factor are shown. It was written in Java on Globus Toolkit 4 to build the grid infrastructure based on GNU/Linux computer grid nodes. A test was carried out and the results are shown for the named entity recognition search of symptoms and pathologies. The search was applied to a collection of 5,000 scientific documents taken from PubMed. Conclusion: In this paper we discuss the development of a grid application based on a middleware solution. It has been tested on a knowledge discovery in text process to extract new and useful information about symptoms and pathologies from a large collection of unstructured scientific documents. As an example, a computation of Knowledge Discovery in Database was applied on the output produced by the KDT user module to extract new knowledge about symptom and pathology bio-entities.

    Use of varenicline for smoking cessation treatment in UK primary care: an association rule mining analysis

    BACKGROUND: Varenicline is probably the most effective smoking cessation pharmacotherapy, but is less widely used than nicotine replacement therapy. We therefore set out to identify the characteristics of numerically important groups of patients who typically do, or do not, receive varenicline in the UK. METHODS: We used association rule mining to analyse data on prescribing of smoking cessation pharmacotherapy in relation to age, sex, comorbidity and other variables from 477,620 people aged 16 years and over, registered as patients throughout 2011 with one of 559 UK general practices in The Health Improvement Network (THIN) database, and recorded to be current smokers. RESULTS: 46,685 participants (9.8% of all current smokers) were prescribed any smoking cessation treatment during 2011, and 19,316 of these (4% of current smokers, 41% of those who received any therapy) were prescribed varenicline. Prescription of varenicline was most common among heavy smokers aged 31–60, and in those with a diagnosis of COPD. Varenicline was rarely used among smokers who were otherwise in good health, or were aged over 60, were lighter smokers, or had psychotic disorders or dementia. CONCLUSIONS: Varenicline is being underused in healthy smokers, or in older smokers, and in those with psychotic disorders or dementia. Since varenicline is probably the most effective available single cessation therapy, this study identifies under-treatment of substantial public health significance
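
For readers unfamiliar with association rule mining, the sketch below computes the three standard measures of a rule (support, confidence and lift) and re-derives the percentages quoted in the abstract from its own counts. The patient records, item labels and function names are hypothetical and chosen only for illustration; they are not taken from the THIN database or from the study's code.

```python
def rule_metrics(records, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    records: list of sets of item labels; antecedent, consequent: sets.
    """
    n = len(records)
    n_a = sum(1 for r in records if antecedent <= r)
    n_c = sum(1 for r in records if consequent <= r)
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


def percent(part, whole, digits=0):
    """Percentage of `whole` represented by `part`, rounded to `digits` places."""
    return round(100 * part / whole, digits)


# Hypothetical records: each set lists the attributes recorded for one smoker.
toy_records = (
    [{"COPD", "varenicline"}] * 2
    + [{"COPD"}] * 2
    + [{"varenicline"}] * 1
    + [set()] * 5
)
print(rule_metrics(toy_records, {"COPD"}, {"varenicline"}))

# Counts quoted in the abstract: current smokers, any treatment, varenicline.
smokers, treated, varenicline = 477_620, 46_685, 19_316
print(percent(treated, smokers, 1), percent(varenicline, smokers), percent(varenicline, treated))
```

A lift above 1 means the consequent is more common among records matching the antecedent than in the population as a whole, which is the sense in which a group "typically does" receive a treatment.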

    Occupancy Classification of Position Weight Matrix-Inferred Transcription Factor Binding Sites

    BACKGROUND: Computational prediction of Transcription Factor Binding Sites (TFBS) from sequence data alone is difficult and error-prone. Machine learning techniques utilizing additional environmental information about a predicted binding site (such as distances from the site to particular chromatin features) to determine its occupancy/functionality class show promise as methods to achieve more accurate prediction of true TFBS in silico. We evaluate the Bayesian Network (BN) and Support Vector Machine (SVM) machine learning techniques on four distinct TFBS data sets and analyze their performance. We describe the features that are most useful for classification and contrast and compare these feature sets between the factors. RESULTS: Our results demonstrate good performance of classifiers both on TFBS for transcription factors used for initial training and for TFBS for other factors in cross-classification experiments. We find that distances to chromatin modifications (specifically, histone modification islands) as well as distances between such modifications to be effective predictors of TFBS occupancy, though the impact of individual predictors is largely TF specific. In our experiments, Bayesian network classifiers outperform SVM classifiers. CONCLUSIONS: Our results demonstrate good performance of machine learning techniques on the problem of occupancy classification, and demonstrate that effective classification can be achieved using distances to chromatin features. We additionally demonstrate that cross-classification of TFBS is possible, suggesting the possibility of constructing a generalizable occupancy classifier capable of handling TFBS for many different transcription factors
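
The candidate sites that such occupancy classifiers start from are typically produced by scanning sequence with a position weight matrix. The sketch below shows that scanning step only, under stated assumptions: a uniform background of 0.25 per base, a pseudocount of 1, and a made-up three-base motif. None of these values, nor the function names, come from the paper, and the Bayesian network and SVM classifiers themselves are not shown.

```python
import math

BASES = "ACGT"


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log-odds position weight matrix."""
    pwm = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        pwm.append(
            {
                b: math.log2(((column[b] + pseudocount) / total) / background)
                for b in BASES
            }
        )
    return pwm


def scan(sequence, pwm):
    """Return (offset, score) for every window of the sequence the PWM fits in."""
    width = len(pwm)
    hits = []
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((offset, score))
    return hits


def best_site(sequence, pwm):
    """Highest-scoring window, i.e. the strongest predicted binding site."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


# Hypothetical counts for a motif that is always "TGA" in 8 aligned sites.
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]
toy_pwm = pwm_from_counts(toy_counts)
print(best_site("CCTGACC", toy_pwm))
```

In the setting the abstract describes, each window scoring above some cutoff would then be annotated with features such as distances to chromatin marks and passed to the classifier.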

    An Improved, Bias-Reduced Probabilistic Functional Gene Network of Baker's Yeast, Saccharomyces cerevisiae

    Background: Probabilistic functional gene networks are powerful theoretical frameworks for integrating heterogeneous functional genomics and proteomics data into objective models of cellular systems. Such networks provide syntheses of millions of discrete experimental observations, spanning DNA microarray experiments, physical protein interactions, genetic interactions, and comparative genomics; the resulting networks can then be easily applied to generate testable hypotheses regarding specific gene functions and associations. Methodology/Principal Findings: We report a significantly improved version (v. 2) of a probabilistic functional gene network [1] of the baker's yeast, Saccharomyces cerevisiae. We describe our optimization methods and illustrate their effects in three major areas: the reduction of functional bias in network training reference sets, the application of a probabilistic model for calculating confidences in pair-wise protein physical or genetic interactions, and the introduction of simple thresholds that eliminate many false positive mRNA co-expression relationships. Using the network, we predict and experimentally verify the function of the yeast RNA binding protein Puf6 in 60S ribosomal subunit biogenesis. Conclusions/Significance: YeastNet v. 2, constructed using these optimizations together with additional data, shows significant reduction in bias and improvements in precision and recall, in total covering 102,803 linkages among 5,483 yeast proteins (95% of the validated proteome). YeastNet is available from http://www.yeastnet.org. This work was supported by grants from the N.S.F. (IIS-0325116, EIA-0219061), N.I.H. (GM06779-01, GM076536-01), Welch (F-1515), and a Packard Fellowship (EMM). These agencies were not involved in the design and conduct of the study, in the collection, analysis, and interpretation of the data, or in the preparation, review, or approval of the manuscript. Cellular and Molecular Biolog

    The development and evaluation of a five-language multi-perspective standardised measure: clinical decision-making involvement and satisfaction (CDIS).

    BACKGROUND: The aim of this study was to develop and evaluate a brief quantitative five-language measure of involvement and satisfaction in clinical decision-making (CDIS) - with versions for patients (CDIS-P) and staff (CDIS-S) - for use in mental health services. METHODS: An English CDIS was developed by reviewing existing measures, focus groups, semistructured interviews and piloting. Translations into Danish, German, Hungarian and Italian followed the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Task Force principles of good practice for translation and cultural adaptation. Psychometric evaluation involved testing the measure in secondary mental health services in Aalborg, Debrecen, London, Naples, Ulm and Zurich. RESULTS: After appraising 14 measures, the Control Preference Scale and Satisfaction With Decision-making English-language scales were modified and evaluated in interviews (n = 9), focus groups (n = 22) and piloting (n = 16). Translations were validated through focus groups (n = 38) and piloting (n = 61). A total of 443 service users and 403 paired staff completed CDIS. The Satisfaction sub-scale had internal consistency of 0.89 (0.86-0.89 after item-level deletion) for staff and 0.90 (0.87-0.90) for service users. Both continuous and categorical (utility) versions were associated with symptomatology and with both staff-rated and service user-rated therapeutic alliance (showing convergent validity), and not with social disability (showing divergent validity). Satisfaction predicted staff-rated (OR 2.43, 95% CI 1.54-3.83 continuous; OR 5.77, 95% CI 1.90-17.53 utility) and service user-rated (OR 2.21, 95% CI 1.51-3.23 continuous; OR 3.13, 95% CI 1.10-8.94 utility) decision implementation two months later.
The Involvement sub-scale had appropriate distribution and no floor or ceiling effects, was associated with stage of recovery, functioning and quality of life (staff only) (showing convergent validity), and not with symptomatology or social disability (showing divergent validity), and staff-rated passive involvement by the service user predicted implementation (OR 3.55, 95% CI 1.53-8.24). Relationships remained after adjusting for clustering by staff. CONCLUSIONS: CDIS demonstrates adequate internal consistency, no evidence of item redundancy, appropriate distribution, and face, content, convergent, divergent and predictive validity. It can be recommended for research and clinical use. CDIS-P and CDIS-S in all five languages can be downloaded at http://www.cedar-net.eu/instruments. TRIAL REGISTRATION: ISRCTN75841675. The CEDAR study is funded by a grant from the Seventh Framework Programme (Research Area HEALTH-2007-3.1-4 Improving clinical decision making) of the European Union (Grant no. 223290).

    Accurate molecular classification of cancer using simple rules

    Background: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensional gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. Methods: We screened a small number of informative single genes and gene pairs on the basis of their dependency degrees, as proposed in rough set theory. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. Results: We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. Conclusion: In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.
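
To make the idea of a one-gene rule evaluated by LOOCV concrete, here is a minimal sketch. It deliberately swaps in a plain expression threshold for the rough-set decision rules used in the paper, so it illustrates the evaluation protocol and not the authors' rule induction. The expression values and function names are hypothetical.

```python
def fit_rule(values, labels):
    """Return (threshold, label_below, label_above) with the fewest training errors."""
    classes = sorted(set(labels))
    xs = sorted(set(values))
    cuts = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    best = None
    for t in cuts:
        for below in classes:
            for above in classes:
                if below == above:
                    continue
                errors = sum(
                    (below if v <= t else above) != y
                    for v, y in zip(values, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, t, below, above)
    if best is None:
        # No usable cut (one class or one distinct value): predict the majority.
        majority = max(classes, key=labels.count)
        return float("inf"), majority, majority
    return best[1:]


def loocv_accuracy(values, labels):
    """Leave-one-out cross-validation accuracy of the single-gene threshold rule."""
    hits = 0
    for i in range(len(values)):
        train_v = values[:i] + values[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        t, below, above = fit_rule(train_v, train_y)
        prediction = below if values[i] <= t else above
        hits += prediction == labels[i]
    return hits / len(values)


# Hypothetical expression levels of one gene in six samples.
toy_expr = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
toy_class = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
print(fit_rule(toy_expr, toy_class), loocv_accuracy(toy_expr, toy_class))
```

Because the rule is refitted on every fold without the held-out sample, the reported accuracy is not inflated by the sample being predicted, which matters when the rule is as small as one threshold on one gene.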

    Global report on preterm birth and stillbirth (2 of 7): discovery science

    Background: Normal and abnormal processes of pregnancy and childbirth are poorly understood. This second article in a global report explains what is known about the etiologies of preterm births and stillbirths and identifies critical gaps in knowledge. Two important concepts emerge: the continuum of pregnancy, beginning at implantation and ending with uterine involution following birth; and the multifactorial etiologies of preterm birth and stillbirth. Improved tools and data will enable discovery scientists to identify causal pathways and cost-effective interventions. Pregnancy and parturition continuum: The biological process of pregnancy and childbirth begins with implantation and, after birth, ends with the return of the uterus to its previous state. The majority of pregnancy is characterized by rapid uterine and fetal growth without contractions. Yet most research has addressed only uterine stimulation (labor), which accounts for <0.5% of pregnancy. Etiologies: The etiologies of preterm birth and stillbirth differ by gestational age, genetics, and environmental factors. Approximately 30% of all preterm births are indicated for either maternal or fetal complications, such as maternal illness or fetal growth restriction. Commonly recognized pathways leading to preterm birth occur most often during the gestational ages indicated: (1) inflammation caused by infection (22-32 weeks); (2) decidual hemorrhage caused by uteroplacental thrombosis (early or late preterm birth); (3) stress (32-36 weeks); and (4) uterine overdistention, often caused by multiple fetuses (32-36 weeks). Other contributors include cervical insufficiency, smoking, and systemic infections. Many stillbirths have similar causes and mechanisms. About two-thirds of late fetal deaths occur during the antepartum period; the other third occur during childbirth. Intrapartum asphyxia is a leading cause of stillbirths in low- and middle-income countries. Recommendations: Utilizing new systems biology tools, opportunities now exist for researchers to investigate various pathways important to normal and abnormal pregnancies. Improved access to quality data and biological specimens is critical to advancing discovery science. Phenotypes, standardized definitions, and uniform criteria for assessing preterm birth and stillbirth outcomes are other immediate research needs. Conclusion: Preterm birth and stillbirth have multifactorial etiologies. More resources must be directed toward accelerating our understanding of these complex processes, and identifying upstream and cost-effective solutions that will improve these pregnancy outcomes.

    Barrier Tissue Macrophages: Functional Adaptation to Environmental Challenges

    Macrophages are found throughout the body, where they have crucial roles in tissue development, homeostasis and remodeling, as well as being sentinels of the innate immune system that can contribute to protective immunity and inflammation. Barrier tissues, such as the intestine, lung, skin and liver, are exposed constantly to the outside world, which places special demands on resident cell populations such as macrophages. Here we review the mounting evidence that although macrophages in different barrier tissues may be derived from distinct progenitors, their highly specific properties are shaped by the local environment, which allows them to adapt precisely to the needs of their anatomical niche. We discuss the properties of macrophages in steady-state barrier tissues, outline the factors that shape their differentiation and behavior and describe how macrophages change during protective immunity and inflammation