Unsupervised Bayesian linear unmixing of gene expression microarrays
Background: This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high-dimensional assays such as gene expression microarrays. The basis for uBLU is a Bayesian model in which the data samples are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The distinguishing feature of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. It also provides estimates of the number of factors. A Gibbs sampling strategy is adopted to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Results: First, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non-negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Second, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals were inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real datasets considered here. Conclusions: The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF).
The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.
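The constrained mixing model described in this abstract can be illustrated with a small simulation. The sketch below covers only the generative side (non-negative factors, factor scores that form a probability distribution over the factors for each sample, plus additive noise); it is not the uBLU Gibbs sampler. The function name, the dimensions, the gamma and Dirichlet distributions, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_linear_mixture(n_genes=50, n_samples=20, n_factors=3,
                            noise_sd=0.05, seed=0):
    """Simulate data = factors @ scores + noise under the stated constraints.

    All distributional choices here are illustrative assumptions,
    not the priors used by uBLU.
    """
    rng = np.random.default_rng(seed)
    # Non-negative gene signatures (factors): one column per factor.
    factors = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_factors))
    # Factor scores: each column is a probability vector over the factors.
    scores = rng.dirichlet(np.ones(n_factors), size=n_samples).T
    # Additive Gaussian noise on the mixed signal.
    noise = rng.normal(0.0, noise_sd, size=(n_genes, n_samples))
    data = factors @ scores + noise
    return data, factors, scores


data, factors, scores = simulate_linear_mixture()
print(data.shape)            # genes x samples
print(scores.sum(axis=0))    # each column sums to 1
```

Data simulated this way has a known ground truth, which is the setting in which the abstract compares uBLU with PCA, NMF, BFRM, and GB-GMF.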
Development and Validation of Risk Scores for All-Cause Mortality for a Smartphone-Based "General Health Score" App: Prospective Cohort Study Using the UK Biobank
This is the final version. Available on open access from JMIR Publications via the DOI in this record. BACKGROUND: Given the established links between an individual's behaviors and lifestyle factors and potentially adverse health outcomes, univariate or simple multivariate health metrics and scores have been developed to quantify general health at a given point in time and estimate risk of negative future outcomes. However, these health metrics may be challenging for widespread use and are unlikely to be successful at capturing the broader determinants of health in the general population. Hence, there is a need for a multidimensional yet widely employable and accessible way to obtain a comprehensive health metric. OBJECTIVE: The objective of the study was to develop and validate a novel, easily interpretable, points-based health score ("C-Score") derived from metrics measurable using smartphone components and iterations thereof that utilize statistical modeling and machine learning (ML) approaches. METHODS: A literature review was conducted to identify relevant predictor variables for inclusion in the first iteration of a points-based model. This was followed by a prospective cohort study in a UK Biobank population for the purposes of validating the C-Score and developing and comparatively validating variations of the score using statistical and ML models to assess the balance between expediency and ease of interpretability and model complexity. Primary and secondary outcome measures were discrimination of a points-based score for all-cause mortality within 10 years (Harrell c-statistic) and discrimination and calibration of Cox proportional hazards models and ML models that incorporate C-Score values (or raw data inputs) and other predictors to predict the risk of all-cause mortality within 10 years. RESULTS: The study cohort comprised 420,560 individuals. During a cohort follow-up of 4,526,452 person-years, there were 16,188 deaths from any cause (3.85%).
The points-based model had good discrimination (c-statistic=0.66). There was a 31% relative reduction in risk of all-cause mortality per decile of increasing C-Score (hazard ratio of 0.69, 95% CI 0.663-0.675). A Cox model integrating age and C-Score had improved discrimination (8 percentage points; c-statistic=0.74) and good calibration. ML approaches did not offer improved discrimination over statistical modeling. CONCLUSIONS: The novel health metric ("C-Score") has good predictive capabilities for all-cause mortality within 10 years. Embedding the C-Score within a smartphone app may represent a useful tool for democratized, individualized health risk prediction. A simple Cox model using C-Score and age balances parsimony and accuracy of risk predictions and could be used to produce absolute risk estimations for app users. Chelsea Digital Ventures; Huma Therapeutic
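The summary figures quoted in this abstract can be re-derived with a few lines of arithmetic. The sketch below recomputes the crude death proportion and the relative risk reduction implied by the hazard ratio. The helper `hazard_ratio_over_deciles` is hypothetical and assumes the per-decile hazard ratio compounds multiplicatively, that is, that the C-Score decile enters the Cox model as a single log-linear term; the abstract does not state this.

```python
# Figures taken from the abstract.
deaths = 16_188
cohort_size = 420_560
person_years = 4_526_452
hr_per_decile = 0.69

# Crude proportion of the cohort that died during follow-up.
mortality_pct = 100 * deaths / cohort_size

# Mean follow-up per participant, in years.
mean_follow_up_years = person_years / cohort_size

# Relative risk reduction implied by a hazard ratio: (1 - HR) * 100.
relative_reduction_pct = 100 * (1 - hr_per_decile)


def hazard_ratio_over_deciles(hr, n_deciles):
    """Hypothetical helper: HR for a difference of n deciles,
    assuming a log-linear (multiplicative) effect per decile."""
    return hr ** n_deciles


print(round(mortality_pct, 2))           # 3.85
print(round(relative_reduction_pct))     # 31
print(round(mean_follow_up_years, 2))
print(hazard_ratio_over_deciles(hr_per_decile, 2))
```

The first two printed values match the 3.85% death proportion and the 31% relative reduction reported in the abstract.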
A Novel Semi-Supervised Methodology for Extracting Tumor Type-Specific MRS Sources in Human Brain Data
Background: The clinical investigation of human brain tumors often starts with a non-invasive imaging study, providing information about the tumor extent and location, but little insight into the biochemistry of the analyzed tissue. Magnetic Resonance Spectroscopy can complement imaging by supplying a metabolic fingerprint of the tissue. This study analyzes single-voxel magnetic resonance spectra, which represent signal information in the frequency domain. Given that a single voxel may contain a heterogeneous mix of tissues, signal source identification is a relevant challenge for the problem of tumor type classification from the spectroscopic signal. Methodology/Principal Findings: Non-negative matrix factorization techniques have recently shown their potential for the identification of meaningful sources from brain tissue spectroscopy data. In this study, we use a convex variant of these methods that is capable of handling negatively-valued data and generating sources that can be interpreted as tumor class prototypes. A novel approach to convex non-negative matrix factorization is proposed, in which prior knowledge about class information is utilized in model optimization. Class-specific information is integrated into this semi-supervised process by setting the metric of a latent variable space where the matrix factorization is carried out. The reported experimental study comprises 196 cases from different tumor types drawn from two international, multi-center databases. The results indicate that the proposed approach outperforms a purely unsupervised process by achieving near perfect correlation of the extracted sources with the mean spectra of the tumor types.
It also improves tissue type classification. Conclusions/Significance: We show that source extraction by unsupervised matrix factorization benefits from the integration of the available class information, thus operating in a semi-supervised learning manner, for discriminative source identification and brain tumor labeling from single-voxel spectroscopy data. We are confident that the proposed methodology has wider applicability for biomedical signal processing.
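The evaluation criterion mentioned in this abstract, correlation between extracted sources and the mean spectrum of each tumor type, can be sketched as follows. This is not the convex NMF method itself: the function names are hypothetical, the spectra are synthetic, and the "sources" are simply set equal to the class means so that the expected correlation is known.

```python
import numpy as np


def class_mean_spectra(spectra, labels):
    """Return the sorted class labels and the mean spectrum of each class."""
    classes = np.unique(labels)
    means = np.stack([spectra[labels == c].mean(axis=0) for c in classes])
    return classes, means


def source_class_correlation(sources, means):
    """Pearson correlation between every source (row) and every class mean (row)."""
    s = sources - sources.mean(axis=1, keepdims=True)
    m = means - means.mean(axis=1, keepdims=True)
    s = s / np.linalg.norm(s, axis=1, keepdims=True)
    m = m / np.linalg.norm(m, axis=1, keepdims=True)
    return s @ m.T


# Synthetic stand-in data: two classes, 40 spectra, 100 frequency bins.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(2, 100))
labels = np.repeat([0, 1], 20)
spectra = prototypes[labels] + 0.1 * rng.normal(size=(40, 100))

classes, means = class_mean_spectra(spectra, labels)
# Assumption for the demo: pretend the extracted sources equal the class means.
corr = source_class_correlation(means.copy(), means)
print(np.diag(corr))
```

With real data, `sources` would come from the factorization, and a diagonal close to 1 would correspond to the "near perfect correlation" described in the abstract.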
MtSNPscore: a combined evidence approach for assessing cumulative impact of mitochondrial variations in disease
Human mitochondrial DNA (mtDNA) variations have been implicated in a broad spectrum of diseases. With over 3000 mtDNA variations reported across databases, establishing the pathogenicity of variations in mtDNA is a major challenge. We have designed and developed a comprehensive weighted scoring system (MtSNPscore) for identification of mtDNA variations that can impact pathogenicity and would likely be associated with disease. The criteria for pathogenicity include information available in the literature, predictions made by various in silico tools, and the frequency of variation in normal and patient datasets. The scoring scheme also assigns scores to patients and normal individuals to estimate the cumulative impact of variations. The method has been implemented in an automated pipeline and has been tested on an Indian ataxia dataset (92 individuals), sequenced in this study, and on another publicly available mtSNP dataset comprising 576 mitochondrial genomes of Japanese individuals from six different groups, namely, patients with Parkinson's disease, patients with Alzheimer's disease, young obese males, young non-obese males, and type-2 diabetes patients with or without severe vascular involvement. For analysis, MtSNPscore can extract information from variation data or from mitochondrial DNA sequences. It has a web interface (http://bioinformatics.ccmb.res.in/cgi-bin/snpscore/Mtsnpscore.pl) that provides flexibility to update or modify the parameters for estimating pathogenicity.
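The idea of a weighted, combined-evidence score that is then accumulated per individual can be sketched in a few lines. Everything concrete below is hypothetical: the abstract does not give the weights, the evidence scales, or the aggregation rule, so the weights, the 0 to 1 evidence values, the variant names, and the simple sum are illustrative assumptions and not MtSNPscore's actual scheme.

```python
from dataclasses import dataclass

# Hypothetical weights for the three evidence types named in the abstract.
WEIGHTS = {"literature": 0.5, "in_silico": 0.3, "frequency": 0.2}


@dataclass
class Variant:
    name: str
    literature: float  # assumed scale: 0 (no evidence) to 1 (strong evidence)
    in_silico: float   # assumed scale: 0 to 1, from prediction tools
    frequency: float   # assumed scale: 0 to 1, enrichment in patients


def variant_score(variant):
    """Weighted sum of the evidence values for one variant."""
    return sum(w * getattr(variant, key) for key, w in WEIGHTS.items())


def individual_score(variants):
    """Cumulative score for one individual: sum over carried variants."""
    return sum(variant_score(v) for v in variants)


# Made-up variants for demonstration only.
carried = [
    Variant("variant_A", literature=1.0, in_silico=0.5, frequency=0.0),
    Variant("variant_B", literature=0.0, in_silico=1.0, frequency=1.0),
]
print(individual_score(carried))
```

Keeping the weights in a single dictionary mirrors the abstract's point that the parameters for estimating pathogenicity can be updated or modified.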
SaS-BCI: A New Strategy to Predict Image Memorability and use Mental Imagery as a Brain-Based Biometric Authentication
Authentication is one of the most important layers of information security. Today, human biometric techniques are the most secure methods for authentication, and they address the problems of older mechanisms such as passwords and PINs. Despite these security advantages, current biometrics still have drawbacks: because existing biometric traits are all visible and touchable, advances in technology have produced devices that can copy and forge them. A new biometric is therefore needed to address the issues of the existing types. Brainwaves are human data whose use as a new type of security authentication has engaged many researchers. Several studies and experiments have investigated and tested EEG signals to establish the uniqueness of human brainwaves. Some researchers have achieved high accuracy rates in this area by applying different signal acquisition techniques, feature extraction methods, and classifiers using a Brain–Computer Interface (BCI). An important part of any BCI process is the way brainwaves are acquired and recorded. This paper presents a new Signal Acquisition Strategy designed specifically for authorization and authentication with brain signals. The strategy predicts image memorability from the user's brain so that mental imagery can serve as a visualization pattern for security authentication. Users can therefore authenticate themselves by visualizing a specific picture in their minds. In conclusion, brainwaves can differ according to the mental task, which makes them harder to use for authentication. Many signal acquisition strategies and signal processing methods exist for brain-based authentication; with the right methods, a higher accuracy rate can be achieved, making brain signals suitable as another biometric for security authentication.
Resting-State Quantitative Electroencephalography Reveals Increased Neurophysiologic Connectivity in Depression
Symptoms of Major Depressive Disorder (MDD) are hypothesized to arise from dysfunction in brain networks linking the limbic system and cortical regions. Alterations in brain functional cortical connectivity in resting-state networks have been detected with functional imaging techniques, but neurophysiologic connectivity measures have not been systematically examined. We used weighted network analysis to examine resting state functional connectivity as measured by quantitative electroencephalographic (qEEG) coherence in 121 unmedicated subjects with MDD and 37 healthy controls. Subjects with MDD had significantly higher overall coherence as compared to controls in the delta (0.5–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), and beta (12–20 Hz) frequency bands. The frontopolar region contained the greatest number of “hub nodes” (surface recording locations) with high connectivity. MDD subjects expressed higher theta and alpha coherence primarily in longer distance connections between frontopolar and temporal or parietooccipital regions, and higher beta coherence primarily in connections within and between electrodes overlying the dorsolateral prefrontal cortical (DLPFC) or temporal regions. Nearest centroid analysis indicated that MDD subjects were best characterized by six alpha band connections primarily involving the prefrontal region. The present findings indicate a loss of selectivity in resting functional connectivity in MDD. The overall greater coherence observed in depressed subjects establishes a new context for the interpretation of previous studies showing differences in frontal alpha power and synchrony between subjects with MDD and normal controls. These results can inform the development of qEEG state and trait biomarkers for MDD
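Band-averaged coherence between two recording channels, the basic quantity behind the connectivity measures in this abstract, can be sketched with SciPy. The band limits are the ones quoted in the abstract; everything else is an assumption for illustration: the function name, the sampling rate, the segment length, the half-open band intervals, and the synthetic signals, which share a 10 Hz component to mimic alpha-band coupling. This is not the study's weighted network analysis.

```python
import numpy as np
from scipy.signal import coherence

# Frequency bands (Hz) as given in the abstract.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 12.0),
    "beta": (12.0, 20.0),
}


def band_coherence(x, y, fs, nperseg=512):
    """Mean magnitude-squared coherence of x and y within each band.

    Bands are treated as half-open intervals [low, high) so that a
    boundary frequency is counted only once (an assumption).
    """
    freqs, cxy = coherence(x, y, fs=fs, nperseg=nperseg)
    return {
        name: float(cxy[(freqs >= low) & (freqs < high)].mean())
        for name, (low, high) in BANDS.items()
    }


# Synthetic two-channel signal: shared 10 Hz rhythm plus independent noise.
fs = 256.0
rng = np.random.default_rng(0)
t = np.arange(0, 60, 1 / fs)
shared_alpha = np.sin(2 * np.pi * 10 * t)
x = shared_alpha + rng.normal(size=t.size)
y = shared_alpha + rng.normal(size=t.size)

result = band_coherence(x, y, fs)
print(result)
```

Computing this for every electrode pair would give one coherence matrix per band, which is the kind of input a network analysis of "hub nodes" operates on.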