
    Bayesian outlier detection in Capital Asset Pricing Model

    We propose a novel Bayesian optimisation procedure for outlier detection in the Capital Asset Pricing Model. We use a parametric product partition model to robustly estimate the systematic risk of an asset. We assume that the returns follow independent normal distributions, and we impose a partition structure on the parameters of interest. The partition structure imposed on the parameters induces a corresponding clustering of the returns. Via an optimisation procedure, we identify the partition that best separates standard observations from atypical ones. The methodology is illustrated with reference to a real data set, for which we also provide a microeconomic interpretation of the detected outliers.
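
    As a rough sketch of the model family described above (notation assumed here, not taken from the paper), the CAPM regression with a partition over observations could be written in LaTeX as

        r_{it} = \alpha_i + \beta_i \, r_{mt} + \varepsilon_{it}, \qquad \varepsilon_{it} \sim N\!\left(0, \sigma^2_{\rho(t)}\right)

    where r_{it} and r_{mt} are the excess returns of asset i and of the market, and the partition \rho assigns each observation to a block; the optimisation step then searches for the partition that best separates the "standard" block from the atypical ones.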

    One-Class Classification: Taxonomy of Study and Review of Techniques

    One-class classification (OCC) algorithms aim to build classification models when the negative class is either absent, poorly sampled, or not well defined. This unique situation constrains the learning of efficient classifiers by defining the class boundary using only knowledge of the positive class. The OCC problem has been considered and applied under many research themes, such as outlier/novelty detection and concept learning. In this paper we present a unified view of the general problem of OCC through a taxonomy of OCC studies based on the availability of training data, the algorithms used, and the application domains. We further delve into each category of the proposed taxonomy and present a comprehensive literature review of OCC algorithms, techniques, and methodologies, with a focus on their significance, limitations, and applications. We conclude by discussing some open research problems in the field of OCC and presenting our vision for future research.
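
    As a minimal, self-contained illustration of the OCC setting (not a method from the survey; the data and parameters below are invented), a classifier can be fitted on positive examples only, for instance with scikit-learn's OneClassSVM:

        # One-class classification sketch: learn a boundary from the positive class only.
        import numpy as np
        from sklearn.svm import OneClassSVM

        rng = np.random.default_rng(0)
        X_train = rng.normal(0.0, 1.0, size=(200, 2))             # only positive examples available
        X_test = np.vstack([rng.normal(0.0, 1.0, size=(10, 2)),   # further positives
                            rng.normal(6.0, 1.0, size=(5, 2))])   # novel / outlying points

        clf = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_train)
        print(clf.predict(X_test))  # +1 = inside the learned class boundary, -1 = outlier/novelty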

    Two meta-analyses of noncontact healing studies

    Reviews of empirical work on the efficacy of noncontact healing have found that interceding on behalf of patients through prayer, or by adopting various practices that incorporate an intention to heal, can have some positive effect upon their wellbeing. However, reviewers have also raised concerns about study quality and the diversity of healing approaches adopted, which make the findings difficult to interpret. Some of these concerns can be addressed by adopting a standardised approach based on the double-blind randomised controlled clinical trial, and a recent review restricted to such studies reported a combined effect size of .40 (p < .001). However, the studies in that review involve human participants, for whom there can be no guarantee that control patients are not beneficiaries of healing intentions from friends, family or their own religious groups. We proposed to address this by reviewing healing studies that involved biological systems other than ‘whole’ humans (i.e. including animal and plant work, but also work involving human biological matter such as blood samples or cell cultures), which are less susceptible to placebo and expectancy effects and also allow for more circumscribed outcome measures. Second, doubts have been cast on the legitimacy of some of the work included in previous reviews, so we planned to conduct an updated review that excluded that work. For phase 1, 49 non-whole-human studies from 34 papers were eligible for review. The combined effect size weighted by sample size yielded a highly significant r of .258. However, the effect sizes in the database were heterogeneous, and outcomes correlated with blind ratings of study quality. When restricted to studies that met minimum quality thresholds, the remaining 22 studies gave a reduced but still significant weighted r of .115. For phase 2, 57 whole-human studies across 56 papers were eligible for review. When combined, these studies yielded a small but significant effect size of r = .203. This database was also heterogeneous, and outcomes were correlated with methodological quality ratings. However, when restricted to studies that met threshold quality levels, the weighted effect size for the 27 surviving studies increased to r = .224. Taken together, these results suggest that subjects in the active condition exhibit a significant improvement in wellbeing relative to control subjects under circumstances that do not seem to be susceptible to placebo and expectancy effects. Findings with the whole-human database gave a smaller mean effect size, but this was still significant, suggesting that the effect does not depend on the previous inclusion of suspect studies and is robust enough to accommodate some high-profile failures to replicate. Both databases show problems with heterogeneity and with study quality, and recommendations are made concerning necessary standards for future replication attempts.
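
    A common way to combine correlation effect sizes weighted by sample size is via Fisher's z transform; the sketch below shows that generic calculation (the reviews' exact weighting scheme may differ, and the numbers are illustrative only):

        # Sample-size-weighted combination of correlation effect sizes via Fisher's z.
        import numpy as np

        def combine_effect_sizes(r_values, n_values):
            r = np.asarray(r_values, dtype=float)
            n = np.asarray(n_values, dtype=float)
            z = np.arctanh(r)                   # Fisher r-to-z transform
            z_bar = np.sum(n * z) / np.sum(n)   # weight each study by its sample size
            return np.tanh(z_bar)               # back-transform to r

        print(combine_effect_sizes([0.10, 0.30, 0.25], [40, 120, 60]))  # illustrative studies only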

    Toward a Standardized Strategy of Clinical Metabolomics for the Advancement of Precision Medicine

    Despite tremendous successes, pitfalls have been observed at every step of the clinical metabolomics workflow, impeding the internal validity of such studies. Furthermore, the demand for logistics, instrumentation, and computational resources for metabolic phenotyping studies has far exceeded our expectations. In this conceptual review, we cover the barriers encountered across a metabolomics-based clinical study and suggest potential solutions in the hope of enhancing study robustness, usability, and transferability. The importance of quality assurance and quality control procedures is discussed, followed by a practical rule containing five phases, including two additional "pre-pre-" and "post-post-" analytical steps. We also elucidate the potential involvement of machine learning and demonstrate that the need for automated data-mining algorithms to improve the quality of future research is undeniable. Consequently, we propose a comprehensive metabolomics framework, along with an appropriate checklist refined from current guidelines and our previously published assessment, in an attempt to accurately translate achievements in metabolomics into clinical and epidemiological research. Furthermore, the integration of multifaceted multi-omics approaches with metabolomics as the pillar member is urgently needed. When metabolomics is combined with other social or nutritional factors, complete omics profiles for a particular disease can be assembled. Our discussion reflects the current obstacles and potential solutions in the progressing trend of utilizing metabolomics in clinical research to create the next-generation healthcare system.

    DataGauge: A Model-Driven Framework for Systematically Assessing the Quality of Clinical Data for Secondary Use

    There is growing interest in the reuse of clinical data for research and for clinical healthcare quality improvement. However, direct analysis of clinical data sets can yield misleading results. Data cleaning is often employed to detect and fix data issues during analysis, but this approach lacks systematicity. Data quality (DQ) assessments are a more thorough way of spotting threats to the validity of analytical results stemming from data repurposing, because DQ assessments aim to evaluate ‘fitness for purpose’. However, there is currently no systematic method to assess DQ for the secondary analysis of clinical data. In this dissertation I present DataGauge, a framework to address this gap in the state of the art. I begin by introducing the problem and its general significance to the field of biomedical and clinical informatics (Chapter 1). I then present a literature review that surveys current methods for the DQ assessment of repurposed clinical data and derive the features required to advance the state of the art (Chapter 2). In Chapter 3 I present DataGauge, a model-driven framework for systematically assessing the quality of repurposed clinical data, which addresses current limitations in the state of the art. Chapter 4 describes the development of a guidance framework to ensure the systematicity of DQ assessment design. I then evaluate DataGauge's ability to flag potential DQ issues in comparison to a systematic state-of-the-art method. DataGauge identified ten times as many potential DQ issues as the state-of-the-art method; it flagged more specific issues that were a direct threat to fitness for purpose, while also providing broader coverage of the clinical data types and knowledge domains involved in secondary analyses. DataGauge sets the groundwork for systematic and purpose-specific DQ assessments that fully integrate with secondary analysis workflows. It also promotes a team-based approach and the explicit definition of DQ requirements to support communication and transparent reporting of DQ results. Overall, this work provides tools that pave the way to a deeper understanding of the limitations of repurposed clinical datasets before analysis. It is also a first step towards the automation of purpose-specific DQ assessments for the secondary use of clinical data. Future work will consist of further developing these methods and validating them with research teams making secondary use of clinical data.
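
    DataGauge itself is a model-driven framework rather than a script, but the idea of making a data-quality requirement explicit and checkable before analysis can be sketched as follows (the column name, threshold, and helper function are hypothetical, not part of DataGauge):

        # Hypothetical, purpose-specific data-quality rule: completeness of a clinical variable.
        import pandas as pd

        def check_completeness(df: pd.DataFrame, column: str, min_fraction: float) -> dict:
            """Flag a potential DQ issue if too many values in `column` are missing."""
            present = df[column].notna().mean()
            return {"rule": f"completeness({column}) >= {min_fraction}",
                    "observed": round(float(present), 3),
                    "passed": bool(present >= min_fraction)}

        df = pd.DataFrame({"systolic_bp": [120, None, 135, None, 110]})
        print(check_completeness(df, "systolic_bp", min_fraction=0.9))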

    Preprocessing and Quality Control Strategies for Illumina DASL Assay-Based Brain Gene Expression Studies with Semi-Degraded Samples

    Available statistical preprocessing and quality control tools for gene expression microarray datasets are known to greatly affect downstream data analysis, especially when degraded samples, unique tissue samples, or novel expression assays are used. It is therefore important to assess the validity and impact of the assumptions built into preprocessing schemes for a dataset. We developed and assessed a data preprocessing strategy for use with the Illumina DASL-based gene expression assay on partially degraded postmortem prefrontal cortex samples. The samples were obtained from individuals with autism as part of an investigation of the pathogenic factors contributing to autism. Using statistical analysis methods and metrics such as those associated with multivariate distance matrix regression and mean inter-array correlation, we developed a DASL-based gene expression preprocessing pipeline to accommodate and detect problems with microarray-based gene expression values obtained from degraded brain samples. Key steps in the pipeline included outlier exclusion, data transformation and normalization, and batch effect and covariate corrections. Our goal was to produce a clean dataset for subsequent downstream differential expression analysis. We ultimately settled on available transformation and normalization algorithms in the R/Bioconductor package lumi, based on an assessment of their use in various combinations. A log2-transformed, quantile-normalized, and batch- and seizure-corrected procedure was likely the most appropriate for our data. We empirically tested different components of our proposed preprocessing strategy, and our results suggest that a preprocessing strategy that effectively identifies outliers, normalizes the data, and corrects for batch effects can be applied to all studies, even those pursued with degraded samples.
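
    The authors implemented these steps with the R/Bioconductor package lumi; the Python sketch below (with invented data) only illustrates the generic steps described: log2 transformation, quantile normalization, and inter-array-correlation-based outlier screening.

        # Generic sketch of the described preprocessing steps; not the authors' lumi pipeline.
        import numpy as np

        def log2_transform(expr):
            return np.log2(expr + 1.0)                     # samples x probes intensity matrix

        def quantile_normalize(expr):
            ranks = expr.argsort(axis=1).argsort(axis=1)   # rank of each probe within its array
            mean_by_rank = np.sort(expr, axis=1).mean(axis=0)
            return mean_by_rank[ranks]                     # give every array the same distribution

        def flag_outlier_arrays(expr, min_mean_corr=0.85):
            corr = np.corrcoef(expr)                       # inter-array correlation matrix
            mean_corr = (corr.sum(axis=1) - 1.0) / (corr.shape[0] - 1)
            return mean_corr < min_mean_corr               # True = candidate outlier array

        expr = np.abs(np.random.default_rng(1).normal(8.0, 2.0, size=(10, 500)))
        normalized = quantile_normalize(log2_transform(expr))
        print(flag_outlier_arrays(normalized))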