Learning Differentially Expressed Gene Pairs in Microarray Data
To identify differentially expressed genes (DEGs) in microarray data, most existing filter methods rank genes individually. Such a paradigm can overlook genes that have trivial discriminant power individually but significant discriminant power in combination. This paper proposes an impurity metric in which the number of split intervals for each feature is treated as a parameter to be optimized for maximal discrimination. The proposed method was first evaluated on a synthesized noisy rectangular grid dataset, in which the significant feature pair forming a rectangular grid pattern was successfully recognized. Furthermore, when applied to the identification of DEGs in colon microarray data, the proposed method showed that it could serve as an alternative to Fisher's test for the prescreening of genes, which led to better performance of the SVM-RFE method.
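The abstract does not give the form of the impurity metric, so the following is only a minimal sketch of the general idea, assuming a weighted Gini impurity computed over an equal-width grid on a feature pair, with the number of intervals per feature chosen by exhaustive search. The names `grid_gini` and `best_grid`, the equal-width binning, and the `max_bins` limit are illustrative assumptions, not the paper's method.

```python
import numpy as np

def grid_gini(x, y, labels, bins_x, bins_y):
    """Weighted Gini impurity of labels over a bins_x-by-bins_y equal-width grid.

    Assumption: equal-width intervals and Gini impurity stand in for the
    paper's (unspecified) split rule and impurity metric.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    labels = np.asarray(labels)
    # Interior edges only, so a single interval yields an empty edge array.
    edges_x = np.linspace(x.min(), x.max(), bins_x + 1)[1:-1]
    edges_y = np.linspace(y.min(), y.max(), bins_y + 1)[1:-1]
    ix = np.searchsorted(edges_x, x, side="right")
    iy = np.searchsorted(edges_y, y, side="right")
    cell = ix * bins_y + iy  # unique id per grid cell
    total = 0.0
    for c in np.unique(cell):
        members = labels[cell == c]
        _, counts = np.unique(members, return_counts=True)
        p = counts / counts.sum()
        # Weight each cell's impurity by the fraction of samples it holds.
        total += len(members) / len(labels) * (1.0 - np.sum(p ** 2))
    return total

def best_grid(x, y, labels, max_bins=4):
    """Try every interval count 1..max_bins per feature.

    Returns (impurity, bins_x, bins_y); ties resolve to the coarser grid
    because tuples are compared element by element.
    """
    return min(
        (grid_gini(x, y, labels, bx, by), bx, by)
        for bx in range(1, max_bins + 1)
        for by in range(1, max_bins + 1)
    )

# A checkerboard pair: neither feature separates the classes alone.
x = [0.1, 0.9, 0.1, 0.9] * 5
y = [0.1, 0.1, 0.9, 0.9] * 5
labels = [0, 1, 1, 0] * 5
print(best_grid(x, y, labels))
```

In this toy case a split on either feature alone leaves the classes evenly mixed, while a 2-by-2 grid separates them completely, which is the kind of pairwise pattern that individual ranking misses. Finer grids can only lower the impurity measured on the training samples, so a practical version would likely need some penalty on the interval count or a held-out evaluation.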
Modeling Large Sparse Data for Feature Selection: Hospital Admission Predictions of the Dementia Patients Using Primary Care Electronic Health Records
A growing elderly population suffering from incurable, chronic conditions such as dementia presents a continual strain on medical services, because mental impairment paired with high comorbidity increases hospitalization risk. Identifying at-risk individuals allows preventative measures to alleviate that strain. Electronic health records provide an opportunity for big data analysis to address such applications. Such data, however, pose a challenging problem space for traditional statistics and machine learning because of their high dimensionality and sparse data elements. This article proposes a novel machine learning methodology: entropy regularization with ensemble deep neural networks (ECNN), which provides high predictive performance for hospitalization of patients with dementia while enabling an interpretable heuristic analysis of the model architecture that can identify individual features of importance within a large feature domain space. Experiments on health records containing 54,647 features identified 10 event indicators within a patient timeline: a collection of diagnostic events, medication prescriptions and procedural events, the highest ranked being essential hypertension. The resulting subset still provided highly competitive hospitalization prediction (Accuracy: 0.759) compared with the full feature domain (Accuracy: 0.755) or traditional feature selection techniques (Accuracy: 0.737), despite a significant reduction in feature size. The discovery and heuristic evidence of correlation support further clinical study of these medical events as potential novel indicators. There also remains great potential for adaptation of ECNN within other medical big data domains as a data mining tool for novel risk factor identification.
Concept Libraries for Repeatable and Reusable Research: Qualitative Study Exploring the Needs of Users
Background: Big data research in the field of health sciences is hindered by a lack of agreement on how to identify and define different conditions and their medications. This means that researchers and health professionals often have different phenotype definitions for the same condition. This lack of agreement makes it difficult to compare different study findings and hinders the ability to conduct repeatable and reusable research. Objective: This study aims to examine the requirements of various users, such as researchers, clinicians, machine learning experts, and managers, in the development of a data portal for phenotypes (a concept library). Methods: This was a qualitative study using interviews and focus group discussion. One-to-one interviews were conducted with researchers, clinicians, machine learning experts, and senior research managers in health data science (N=6) to explore their specific needs in the development of a concept library. In addition, a focus group discussion with researchers (N=14) working with the Secured Anonymized Information Linkage databank, a national eHealth data linkage infrastructure, was held to perform a SWOT (strengths, weaknesses, opportunities, and threats) analysis for the phenotyping system and the proposed concept library. The interviews and focus group discussion were transcribed verbatim, and 2 thematic analyses were performed. Results: Most of the participants thought that the prototype concept library would be a very helpful resource for conducting repeatable research, but they specified that many requirements are needed before its development. Although all the participants stated that they were aware of some existing concept libraries, most of them expressed negative perceptions about them. The participants mentioned several facilitators that would stimulate them to share their work and reuse the work of others, and they pointed out several barriers that could inhibit them from sharing their work and reusing the work of others. The participants suggested some developments that they would like to see to improve reproducible research output using routine data. Conclusions: The study indicated that most interviewees valued a concept library for phenotypes. However, only half of the participants felt that they would contribute by providing definitions for the concept library, and they reported many barriers regarding sharing their work on a publicly accessible platform. Analysis of interviews and the focus group discussion revealed that different stakeholders have different requirements, facilitators, barriers, and concerns about a prototype concept library.
Back-action Induced Non-equilibrium Effect in Electron Charge Counting Statistics
We report our study of the real-time charge counting statistics measured by a quantum point contact (QPC) coupled to a single quantum dot (QD) under different back-action strengths. By tuning the QD-QPC coupling or the QPC bias, we controlled the QPC back-action, which drives the QD electrons out of thermal equilibrium. The random telegraph signal (RTS) statistics showed a strong and tunable non-thermal-equilibrium saturation effect, which can be quantitatively characterized as a back-action induced tunneling-out rate. We found that the QD-QPC coupling and the QPC bias voltage played different roles in the back-action strength and cut-off energy.
Comment: 4 pages, 4 figures, 1 table
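The abstract does not describe how rates were extracted from the RTS, so the following is only a generic sketch, assuming the trace has already been thresholded into a two-level sequence sampled at a fixed interval `dt`, and using the standard estimate that the rate of leaving a state is the inverse of the mean dwell time in that state. The function names and the mapping of levels 0 and 1 to dot occupation are illustrative assumptions.

```python
import numpy as np

def dwell_times(states, dt):
    """Split a two-level trace into runs and return their durations.

    Returns (durations spent in level 0, durations spent in level 1).
    The first and last runs are dropped because the edges of the record
    cut them short.
    """
    states = np.asarray(states, dtype=int)
    change = np.flatnonzero(np.diff(states)) + 1   # index where each new run starts
    bounds = np.concatenate(([0], change, [len(states)]))
    durations = np.diff(bounds) * dt
    levels = states[bounds[:-1]]                   # level held during each run
    durations, levels = durations[1:-1], levels[1:-1]
    return durations[levels == 0], durations[levels == 1]

def tunneling_rates(states, dt):
    """Estimate the rate of leaving level 0 and the rate of leaving level 1."""
    t0, t1 = dwell_times(states, dt)
    return 1.0 / t0.mean(), 1.0 / t1.mean()

# Hypothetical thresholded trace, one sample per time unit.
trace = [0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1]
rate_out_of_0, rate_out_of_1 = tunneling_rates(trace, dt=1.0)
print(rate_out_of_0, rate_out_of_1)
```

Which of the two levels corresponds to an occupied dot depends on the sign of the detector response, so the labelling of a rate as tunneling in or tunneling out has to be fixed from the experiment rather than from the code.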
Construction of retrovirus vector taking MDR1/ACBC1 and its transfection into human placenta derived mesenchymal stem cells
In this study, we used both perfusion and density gradient centrifugation to isolate and purify mesenchymal stem cells (MSCs) from placental tissue, and constructed a retroviral vector carrying a multidrug resistance gene, with green fluorescent protein (GFP) as the reporter. 293T cells were transfected with the retroviral vector PMX-flag-MDR1-GFP together with its peripheral membrane protein gene. After the infective, replication-defective retrovirus was obtained, we transfected it into human placenta-derived mesenchymal stem cells (HPMSCs). We observed expression of the reporter gene GFP by fluorescence microscopy and detected the P-glycoprotein (P-gp) expressed by the exogenous gene MDR1 by Western blotting. These results indicate that the retroviral vector PMX-flag-MDR1-GFP was successfully transfected into HPMSCs and that the exogenous multidrug resistance (MDR)1 gene was expressed normally. The daunorubicin (DNR) pump experiment showed that the P-gp of HPMSCs transfected with PMX-flag-MDR1-GFP was biologically active. The results indicate that the MDR1 retroviral vector can transfect HPMSCs, that the exogenous gene is expressed, and that the expressed protein is biologically active. This lays a foundation for the clinical application of MDR1 gene therapy.
Keywords: Transfect, human placenta-derived mesenchymal stem cells, multidrug resistance (MDR)1 gene
- …