526 research outputs found

    A comparison of classification methods for predicting Chronic Fatigue Syndrome based on genetic data

    Background: In genomics, it is essential to select a small number of genes that are more significant than the others for association studies of disease susceptibility. In this work, our goal was to compare computational tools with and without feature selection for predicting chronic fatigue syndrome (CFS) using genetic factors such as single nucleotide polymorphisms (SNPs). Methods: We used the dataset from a previous study by the CDC Chronic Fatigue Syndrome Research Group. To uncover relationships between CFS and SNPs, we applied three classification algorithms: naive Bayes, the support vector machine algorithm, and the C4.5 decision tree algorithm. Furthermore, we used two feature selection methods to identify a subset of influential SNPs. One was a hybrid feature selection approach combining the chi-squared and information-gain methods. The other was a wrapper-based feature selection method. Results: The naive Bayes model with the wrapper-based approach performed best among the predictive models for inferring disease susceptibility given the complex relationship between CFS and SNPs. Conclusion: We demonstrated that our approach is a promising method for assessing the associations between CFS and SNPs.
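As a rough illustration of the hybrid feature selection idea mentioned in this abstract, the sketch below scores each SNP with a chi-squared statistic and with information gain, then keeps the SNPs that rank in the top k under both scores. The abstract does not say how the two scores are combined, so the intersection rule, the function names, and the toy data layout (one list of genotype strings per SNP) are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def chi_squared(feature, labels):
    """Pearson chi-squared statistic for a categorical feature against class labels."""
    n = len(labels)
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, fc in feature_counts.items():
        for l, lc in label_counts.items():
            expected = fc * lc / n
            observed = joint.get((f, l), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on the feature."""
    n = len(labels)
    conditional = 0.0
    for f, fc in Counter(feature).items():
        subset = [l for x, l in zip(feature, labels) if x == f]
        conditional += (fc / n) * entropy(subset)
    return entropy(labels) - conditional


def hybrid_select(columns, labels, k):
    """Assumed combination rule: keep features in the top k by BOTH scores."""
    by_chi = sorted(columns, key=lambda s: chi_squared(columns[s], labels), reverse=True)[:k]
    by_ig = sorted(columns, key=lambda s: information_gain(columns[s], labels), reverse=True)[:k]
    return [name for name in by_chi if name in by_ig]
```

The selected SNPs would then be passed to a classifier such as naive Bayes. A wrapper-based method differs in that it scores candidate subsets by the classifier's own accuracy rather than by per-feature statistics.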

    Fabrication and Performance of MEMS-Based Pressure Sensor Packages Using Patterned Ultra-Thick Photoresists

    A novel plastic packaging of a piezoresistive pressure sensor using a patterned ultra-thick photoresist is experimentally and theoretically investigated. Two pressure sensor packages, of the sacrifice-replacement and dam-ring types, were used in this study. The characteristics of the packaged pressure sensors were investigated using a finite-element (FE) model and experimental measurements. The results show that the packaged pressure sensor with a small sensing-channel opening or with a thin silicon membrane for the dam-ring approach had a high packaging-induced thermal stress, leading to a high temperature coefficient of span (TCO) response of −0.19% span/°C. The results also show that the packaged pressure sensors with a large sensing-channel opening for the sacrifice-replacement approach had significantly reduced packaging-induced thermal stress, and hence a low TCO response of −0.065% span/°C. Nevertheless, the packaged pressure sensors of both the sacrifice-replacement and dam-ring types still met the specification of −0.2% span/°C of the unpackaged pressure sensor. In addition, the size of the proposed packages was 4 × 4 × 1.5 mm³, about seven times smaller than the commercialized packages. With the same packaging requirement, the proposed packaging approaches may provide an adequate solution for use in other open-cavity sensors, such as gas sensors, image sensors, and humidity sensors.

    Repeated Small Perturbation Approach Reveals Transcriptomic Steady States

    The study of biological systems dynamics requires elucidation of the transitions of steady states. A “small perturbation” approach can provide important information on the “steady state” of a biological system. In our experiments, small perturbations were generated by applying a series of repeating small doses of ultraviolet radiation to a human keratinocyte cell line, HaCaT. The biological response was assessed by monitoring the gene expression profiles using cDNA microarrays. Repeated small doses (10 J/m²) of ultraviolet B (UVB) exposure modulated the expression profiles of two groups of genes in opposite directions. The genes that were up-regulated have functions mainly associated with anti-proliferation/anti-mitogenesis/apoptosis, and the genes that were down-regulated were mainly related to proliferation/mitogenesis/anti-apoptosis. For both groups of genes, repetition of the small doses of UVB caused an immediate response followed by relaxation between successive small perturbations. This cyclic pattern was suppressed when large doses (233 or 582.5 J/m²) of UVB were applied. Our method and results contribute to a foundation for computational systems biology, which implicitly uses the concept of steady state.

    Land Subsidence Caused by Groundwater Exploitation in Yunlin, Taiwan

    Source: ICHE Conference Archive - https://mdi-de.baw.de/icheArchive

    NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition

    BACKGROUND: Biomedical named entity recognition (Bio-NER) is a challenging problem because, in general, biomedical named entities of the same category (e.g., proteins and genes) do not follow one standard nomenclature. They have many irregularities and sometimes appear in ambiguous contexts. In recent years, machine-learning (ML) approaches have become increasingly common and now represent the cutting edge of Bio-NER technology. This paper addresses three problems faced by ML-based Bio-NER systems. First, most ML approaches usually employ singleton features that comprise one linguistic property (e.g., the current word is capitalized) and at least one class tag (e.g., B-protein, the beginning of a protein name). However, such features may be insufficient in cases where multiple properties must be considered. Adding conjunction features that contain multiple properties can be beneficial, but it would be infeasible to include all conjunction features in an NER model since memory resources are limited and some features are ineffective. To resolve the problem, we use a sequential forward search algorithm to select an effective set of features. Second, variations in the numerical parts of biomedical terms (e.g., "2" in the biomedical term IL2) cause data sparseness and generate many redundant features. In this case, we apply numerical normalization, which solves the problem by replacing all numerals in a term with one representative numeral to help classify named entities. Third, the assignment of NE tags does not depend solely on the target word's closest neighbors, but may depend on words outside the context window (e.g., a context window of five consists of the current word plus two preceding and two subsequent words). We use global patterns generated by the Smith-Waterman local alignment algorithm to identify such structures and modify the results of our ML-based tagger. This is called pattern-based post-processing. 
RESULTS: To develop our ML-based Bio-NER system, we employ conditional random fields, which have performed effectively in several well-known tasks, as our underlying ML model. Adding selected conjunction features, applying numerical normalization, and employing pattern-based post-processing improve the F-scores by 1.67%, 1.04%, and 0.57%, respectively. The combined increase of 3.28% yields a total score of 72.98%, which is better than the baseline system that only uses singleton features. CONCLUSION: We demonstrate the benefits of using the sequential forward search algorithm to select effective conjunction feature groups. In addition, we show that numerical normalization can effectively reduce the number of redundant and unseen features. Furthermore, the Smith-Waterman local alignment algorithm can help ML-based Bio-NER deal with difficult cases that need longer context windows
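Two of the ideas in this abstract, numerical normalization and the fixed context window, can be sketched in a few lines. This is only an illustration under stated assumptions: the representative numeral "1", the choice to collapse each run of digits (rather than each digit) into it, and the function names are not taken from the paper.

```python
import re


def normalize_numerals(term, representative="1"):
    """Replace every run of digits in a term with one representative numeral.

    Assumption: runs of digits are collapsed, so "IL2" and "IL12" both
    map to "IL1" and share the same features.
    """
    return re.sub(r"\d+", representative, term)


def context_window(tokens, index, size=5):
    """Return the tokens in a window centred on tokens[index].

    A window of five is the current word plus two preceding and two
    subsequent words; the window is truncated at sentence boundaries.
    """
    half = size // 2
    start = max(0, index - half)
    return tokens[start:index + half + 1]
```

Words that fall outside this window are invisible to the tagger's local features, which is the gap that the pattern-based post-processing described in the abstract is meant to cover.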

    Microarray meta-analysis database (M2DB): a uniformly pre-processed, quality controlled, and manually curated human clinical microarray database

    Background: Over the past decade, gene expression microarray studies have greatly expanded our knowledge of genetic mechanisms of human diseases. Meta-analysis of substantial amounts of accumulated data, by integrating valuable information from multiple studies, is becoming more important in microarray research. However, collecting data of special interest from public microarray repositories often presents major practical problems. Moreover, including low-quality data may significantly reduce meta-analysis efficiency. Results: M2DB is a human curated microarray database designed for easy querying, based on clinical information and for interactive retrieval of either raw or uniformly pre-processed data, along with a set of quality-control metrics. The database contains more than 10,000 previously published Affymetrix GeneChip arrays, performed using human clinical specimens. M2DB allows online querying according to a flexible combination of five clinical annotations describing disease state and sampling location. These annotations were manually curated by controlled vocabularies, based on information obtained from GEO, ArrayExpress, and published papers. For array-based assessment control, the online query provides sets of QC metrics, generated using three available QC algorithms. Arrays with poor data quality can easily be excluded from the query interface. The query provides values from two algorithms for gene-based filtering, and raw data and three kinds of pre-processed data for downloading. Conclusion: M2DB utilizes a user-friendly interface for QC parameters, sample clinical annotations, and data formats to help users obtain clinical metadata. This database provides a lower entry threshold and an integrated process of meta-analysis. We hope that this research will promote further evolution of microarray meta-analysis.

    Intra- and Inter-Individual Variance of Gene Expression in Clinical Studies

    BACKGROUND: Variance in microarray studies has been widely discussed as a critical topic on the identification of differentially expressed genes; however, few studies have addressed the influence of estimating variance. METHODOLOGY/PRINCIPAL FINDINGS: To break intra- and inter-individual variance in clinical studies down to three levels--technical, anatomic, and individual--we designed experiments and algorithms to investigate three forms of variances. As a case study, a group of "inter-individual variable genes" were identified to exemplify the influence of underestimated variance on the statistical and biological aspects in identification of differentially expressed genes. Our results showed that inadequate estimation of variance inevitably led to the inclusion of non-statistically significant genes into those listed as significant, thereby interfering with the correct prediction of biological functions. Applying a higher cutoff value of fold changes in the selection of significant genes reduces/eliminates the effects of underestimated variance. CONCLUSIONS/SIGNIFICANCE: Our data demonstrated that correct variance evaluation is critical in selecting significant genes. If the degree of variance is underestimated, "noisy" genes are falsely identified as differentially expressed genes. These genes are the noise associated with biological interpretation, reducing the biological significance of the gene set. Our results also indicate that applying a higher number of fold change as the selection criteria reduces/eliminates the differences between distinct estimations of variance
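The fold-change cutoff discussed in this abstract can be illustrated with a minimal sketch. It assumes positive, linear-scale (not log-transformed) expression values and a symmetric cutoff for up- and down-regulation; the function names and data layout are hypothetical and do not come from the study.

```python
from statistics import mean


def fold_change(case, control):
    """Ratio of mean expression in the case group to the control group."""
    return mean(case) / mean(control)


def select_by_fold_change(expression, cutoff):
    """Keep genes whose fold change is at least `cutoff` in either direction.

    `expression` maps a gene name to a (case_values, control_values) pair.
    """
    selected = []
    for gene, (case, control) in expression.items():
        fc = fold_change(case, control)
        if fc >= cutoff or fc <= 1 / cutoff:
            selected.append(gene)
    return selected
```

Raising `cutoff` shrinks the selected list, dropping genes with small changes first; these are the genes most likely to have been called significant only because variance was underestimated.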