41 research outputs found

    A new regularized least squares support vector regression for gene selection

    BACKGROUND: Selection of influential genes with microarray data often faces the difficulties of a large number of genes and a relatively small group of subjects. In addition to the curse of dimensionality, many gene selection methods weight the contribution from each individual subject equally. This equal-contribution assumption cannot account for the possible dependence among subjects who associate similarly to the disease, and may restrict the selection of influential genes. RESULTS: A novel approach to gene selection is proposed based on kernel similarities and kernel weights. We do not assume uniformity for subject contribution. Weights are calculated via regularized least squares support vector regression (RLS-SVR) of class levels on kernel similarities and are used to weight subject contribution. The cumulative sums of weighted expression levels are then ranked to select responsible genes. These procedures also work for multiclass classification. We demonstrate this algorithm on acute leukemia, colon cancer, small, round blue cell tumors of childhood, breast cancer, and lung cancer studies, using kernel Fisher discriminant analysis and support vector machines as classifiers. Other procedures are compared as well. CONCLUSION: This approach is easy to implement and fast in computation for both binary and multiclass problems. The gene set provided by the RLS-SVR weight-based approach contains fewer genes and achieves a higher accuracy than other procedures.
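The weighting-and-ranking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the ridge-style solve `(K + lam*I) w = y`, the use of absolute values when ranking, and the names `rbf_kernel`, `solve`, `rank_genes`, `lam`, and `gamma` are all assumptions made for illustration, and the toy data are invented.

```python
import math

def rbf_kernel(X, gamma):
    """Gaussian kernel similarities between subjects (rows of X). Kernel choice is an assumption."""
    n = len(X)
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
             for j in range(n)] for i in range(n)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rank_genes(X, y, lam=1.0, gamma=0.5):
    """Return gene indices ordered by |sum_i w_i * x_ig|, plus the scores and subject weights."""
    n, n_genes = len(X), len(X[0])
    K = rbf_kernel(X, gamma)
    # Regularized least squares on kernel similarities (assumed form of the fit).
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    w = solve(A, y)
    # Weighted sum of expression levels per gene, ranked by magnitude.
    scores = [abs(sum(w[i] * X[i][g] for i in range(n))) for g in range(n_genes)]
    order = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return order, scores, w

# Invented toy data: 4 subjects, 2 genes; only gene 0 tracks the class label.
X = [[2.0, 0.1], [1.8, -0.1], [-2.0, 0.1], [-1.9, -0.1]]
y = [1.0, 1.0, -1.0, -1.0]
order, scores, w = rank_genes(X, y)
print(order, scores)
```

In this sketch each subject's weight comes from the kernel fit rather than being fixed at 1/n, which is the point the abstract makes about not assuming equal subject contribution.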

    Repeatability and Reproducibility of Decisions by Latent Fingerprint Examiners

    The interpretation of forensic fingerprint evidence relies on the expertise of latent print examiners. We tested latent print examiners on the extent to which they reached consistent decisions. This study assessed intra-examiner repeatability by retesting 72 examiners on comparisons of latent and exemplar fingerprints, after an interval of approximately seven months; each examiner was reassigned 25 image pairs for comparison, out of a total pool of 744 image pairs. We compare these repeatability results with reproducibility (inter-examiner) results derived from our previous study. Examiners repeated 89.1% of their individualization decisions, and 90.1% of their exclusion decisions; most of the changed decisions resulted in inconclusive decisions. Repeatability of comparison decisions (individualization, exclusion, inconclusive) was 90.0% for mated pairs, and 85.9% for nonmated pairs. Repeatability and reproducibility were notably lower for comparisons assessed by the examiners as “difficult” than for “easy” or “moderate” comparisons, indicating that examiners' assessments of difficulty may be useful for quality assurance. No false positive errors were repeated (n = 4); 30% of false negative errors were repeated. One percent of latent value decisions were completely reversed (no value even for exclusion vs. of value for individualization). Most of the inter- and intra-examiner variability concerned whether the examiners considered the information available to be sufficient to reach a conclusion; this variability was concentrated on specific image pairs such that repeatability and reproducibility were very high on some comparisons and very low on others. Much of the variability appears to be due to making categorical decisions in borderline cases.

    Probabilistic Daily ILI Syndromic Surveillance with a Spatio-Temporal Bayesian Hierarchical Model

    BACKGROUND: For daily syndromic surveillance to be effective, an efficient and sensible algorithm would be expected to detect aberrations in influenza illness, and alert public health workers prior to any impending epidemic. This detection or alert surely contains uncertainty, and thus should be evaluated with a proper probabilistic measure. However, traditional monitoring mechanisms simply provide a binary alert, failing to adequately address this uncertainty. METHODS AND FINDINGS: Based on the Bayesian posterior probability of influenza-like illness (ILI) visits, the intensity of an outbreak can be directly assessed. The numbers of daily emergency room ILI visits at five community hospitals in Taipei City during 2006-2007 were collected and fitted with a Bayesian hierarchical model containing meteorological factors such as temperature and vapor pressure, spatial interaction with conditional autoregressive structure, weekend and holiday effects, seasonality factors, and previous ILI visits. The proposed algorithm recommends an alert for action if the posterior probability is larger than 70%. External data from January to February of 2008 were retained for validation. The decision rule successfully detects the peak in the validation period. When comparing the posterior probability evaluation with the modified Cusum method, results show that the proposed method is able to detect the signals 1-2 days prior to the rise of ILI visits. CONCLUSIONS: This Bayesian hierarchical model not only constitutes a dynamic surveillance system but also constructs a stochastic evaluation of the need to call for an alert. The monitoring mechanism provides earlier detection as well as a complementary tool for current surveillance programs.
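The 70% decision rule can be sketched as follows. The abstract does not say exactly which event the posterior probability refers to, so this example assumes it is the probability that predicted ILI visits exceed some baseline, estimated as the fraction of posterior draws above that baseline. The function names, the baseline values, and the draws are hypothetical.

```python
def posterior_exceedance(draws, baseline):
    """Fraction of posterior draws strictly above the baseline (a Monte Carlo probability estimate)."""
    return sum(d > baseline for d in draws) / len(draws)

def should_alert(draws, baseline, cutoff=0.70):
    """Recommend an alert when the estimated posterior probability is larger than the cutoff."""
    return posterior_exceedance(draws, baseline) > cutoff

# Invented posterior draws of one day's ILI visit count.
draws = [12, 15, 9, 14, 16, 13, 11, 17, 18, 8]
print(posterior_exceedance(draws, baseline=10), should_alert(draws, baseline=10))
print(posterior_exceedance(draws, baseline=14), should_alert(draws, baseline=14))
```

Unlike a binary alarm, the exceedance value itself can be reported alongside the alert, which is the probabilistic measure the abstract argues for.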

    An Efficient Rank Based Approach for Closest String and Closest Substring

    This paper aims to present a new genetic approach that uses rank distance for solving two known NP-hard problems, and to compare rank distance with other distance measures for strings. The two NP-hard problems we are trying to solve are closest string and closest substring. For each problem we build a genetic algorithm and we describe the genetic operations involved. Both genetic algorithms use a fitness function based on rank distance. We compare our algorithms with other genetic algorithms that use different distance measures, such as Hamming distance or Levenshtein distance, on real DNA sequences. Our experiments show that the genetic algorithms based on rank distance achieve the best results.
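The abstract does not define rank distance, so the sketch below uses the definition as commonly stated in the literature, taken here as an assumption: each character occurrence is indexed (first `a`, second `a`, and so on), shared indexed characters contribute the absolute difference of their 1-based positions, and unshared ones contribute their own position. The cost function for closest string (the maximum distance from a candidate to any input string) follows the usual problem statement; the paper's actual fitness function may differ. All function names are illustrative.

```python
from collections import defaultdict

def annotate(s):
    """Map each indexed character occurrence, e.g. ('a', 2), to its 1-based position in s."""
    seen = defaultdict(int)
    positions = {}
    for i, ch in enumerate(s, start=1):
        seen[ch] += 1
        positions[(ch, seen[ch])] = i
    return positions

def rank_distance(u, v):
    """Rank distance between two strings under the assumed definition above."""
    pu, pv = annotate(u), annotate(v)
    total = 0
    for key in pu.keys() | pv.keys():
        if key in pu and key in pv:
            total += abs(pu[key] - pv[key])          # shared occurrence: position offset
        else:
            total += pu.get(key, 0) + pv.get(key, 0)  # unshared occurrence: its own position
    return total

def closest_string_cost(candidate, strings):
    """Largest rank distance from the candidate to any input string (lower is better)."""
    return max(rank_distance(candidate, s) for s in strings)

print(rank_distance("abc", "bca"))
print(closest_string_cost("abc", ["abc", "bca", "acb"]))
```

A genetic algorithm for closest string could minimize `closest_string_cost` over candidate strings; swapping `rank_distance` for Hamming or Levenshtein distance gives the kind of comparison the abstract describes.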

    The Impact of Matching Vaccine Strains and Post-SARS Public Health Efforts on Reducing Influenza-Associated Mortality among the Elderly

    Public health administrators do not have effective models to predict excess influenza-associated mortality and monitor viral changes associated with it. This study evaluated the effect of matching/mismatching vaccine strains, type/subtype pattern changes in Taiwan's influenza viruses, and the impact of post-SARS (severe acute respiratory syndrome) public health efforts on excess influenza-associated mortalities among the elderly. A negative binomial model was developed to estimate Taiwan's monthly influenza-associated mortality among the elderly. We calculated three winter and annual excess influenza-associated mortalities [pneumonia and influenza (P&I), respiratory and circulatory, and all-cause] from the 1999–2000 through the 2006–2007 influenza seasons. Obtaining influenza virus sequences from the months/years in which death from P&I was excessive, we investigated molecular variation in vaccine-mismatched influenza viruses by comparing hemagglutinin 1 (HA1) of the circulating and vaccine strains. We found that the higher the isolation rate of A (H3N2) and vaccine-mismatched influenza viruses, the greater the monthly P&I mortality. However, this significant positive association became negative for higher matching of A (H3N2) and public health efforts with post-SARS effect. Mean excess P&I mortality for winters was significantly higher before 2003 than after that year [mean ± S.D.: 1.44±1.35 vs. 0.35±1.13, p = 0.04]. Further analysis revealed that vaccine-matched circulating influenza A viruses were significantly associated with lower excess P&I mortality during post-SARS winters (i.e., 2005–2007) than during pre-SARS winters [0.03±0.06 vs. 1.57±1.27, p = 0.01]. Stratification of these vaccine-matching and post-SARS effects showed substantial trends toward lower elderly excess P&I mortalities in winters with either mismatching vaccines during the post-SARS period or matching vaccines during the pre-SARS period. Importantly, all three excess mortalities were at their highest in May 2003, when inter-hospital nosocomial infections were peaking. Furthermore, vaccine-mismatched H3N2 viruses circulating in the years with high excess P&I mortality exhibited both a lower amino acid identity percentage of HA1 between vaccine and circulating strains and a higher number of variations at epitope B. Our model can help future decision makers to estimate excess P&I mortality effectively, select and test virus strains for antigenic variation, and evaluate public health strategy effectiveness.

    Use of ecstasy and other psychoactive substances among school-attending adolescents in Taiwan: national surveys 2004–2006

    BACKGROUND: With the backdrop of a global ecstasy epidemic, this study sought to examine the trend, correlates, and onset sequence of ecstasy use among adolescents in Taiwan, where a well-established gateway drug such as marijuana is much less popular. METHODS: A multistage probability survey of school-attending adolescents in grades 7, 9, 10, and 12, aged 11–19 years, was conducted in 2004, 2005, and 2006. A self-administered anonymous questionnaire elicited response rates ranging from 94.3% to 96.6%. The sample sizes were 18232 respondents in 2004, 17986 in 2005, and 17864 in 2006. RESULTS: In terms of lifetime prevalence and incidence, ecstasy and ketamine by and large appeared as the first and second most commonly used illegal drugs, respectively, among middle (grades 7 and 9) and high school students (grades 10 and 12) during the 3-year survey period; however, this order was reversed in the middle school-aged students starting in 2006. Having sexual experience, tobacco use, and betel nut use were factors consistently associated with the onset of ecstasy use across years. The majority of ecstasy users had been involved in polydrug use, such as the use of ketamine (41.4%–53.5%), marijuana (12.7%–18.7%), and methamphetamine (4.2%–9.5%). CONCLUSION: From 2004 to 2006, a decline was noted in the prevalence and incidence rate of ecstasy, a leading illegal drug used by school-attending adolescents in Taiwan since the early 2000s. The emerging ketamine use trend may warrant more attention in the future.

    Identification of Prognostic Genes for Recurrent Risk Prediction in Triple Negative Breast Cancer Patients in Taiwan

    Discrepancies in the prognosis of triple negative breast cancer exist between Caucasian and Asian populations. Yet, a gene signature of triple negative breast cancer specifically for Asians is not yet available. Therefore, the purpose of this study is to construct a prediction model for recurrence of triple negative breast cancer in Taiwanese patients. Whole genome expression profiling of breast cancers from 185 patients in Taiwan from 1995 to 2008 was performed, and the results were compared to the previously published literature to detect differences between Asian and Western patients. Pathway analysis and Cox proportional hazard models were applied to construct a prediction model for the recurrence of triple negative breast cancer. Hierarchical cluster analysis showed that triple negative breast cancers from different races were in separate sub-clusters but grouped in a bigger cluster. Two pathways, cAMP-mediated signaling and ephrin receptor signaling, were significantly associated with the recurrence of triple negative breast cancer. After using stepwise model selection from the combination of the initial filtered genes, we developed a prediction model based on the genes SLC22A23, PRKAG3, DPEP3, MORC2, GRB7, and FAM43A. The model had 91.7% accuracy, 81.8% sensitivity, and 94.6% specificity under leave-one-out support vector regression. In this study, we identified pathways related to triple negative breast cancer and developed a model to predict its recurrence. These results could be used for assisting with clinical prognosis and warrant further investigation into the possibility of targeted therapy of triple negative breast cancer in Taiwanese patients.

    Common Variants in MAGI2 Gene Are Associated with Increased Risk for Cognitive Impairment in Schizophrenic Patients

    Schizophrenia is a complex psychiatric disorder characterized by positive symptoms, negative symptoms, and cognitive impairment. MAGI2, a relatively large gene (∼1.5 Mbps) that maps to chromosome 7q21, is involved in recruitment of neurotransmitter receptors such as AMPA- and NMDA-type glutamate receptors. A genetic association study designed to evaluate the association between MAGI2 and cognitive performance or schizophrenia has not been conducted. In this case-control study, we examined the relationship between single nucleotide polymorphism (SNP) variations in MAGI2 and risk for schizophrenia in a large Japanese sample and explored the potential relationships between variations in MAGI2 and aspects of human cognitive function related to glutamate activity. Based on the results of the first schizophrenia genome-wide association study in a Japanese population (JGWAS), we selected four independent SNPs and performed an association study using a large independent Japanese sample set (cases 1624, controls 1621). The Wisconsin Card Sorting Test (WCST) was used to evaluate executive function in 114 cases and 91 controls. We found suggestive evidence for genetic association between common SNPs within the MAGI2 locus and schizophrenia in the Japanese population. Furthermore, in terms of association between MAGI2 and cognitive performance, we observed that the genotype effect of rs2190665 on WCST score was significant (p = 0.034) and that of rs4729938 trended toward significance (p = 0.08). In conclusion, although we could not detect strong genetic evidence for association between common variants in MAGI2 and increased schizophrenia risk in a Japanese population, these SNPs may increase risk of cognitive impairment in schizophrenic patients.

    Application of Multi-SNP Approaches Bayesian LASSO and AUC-RF to Detect Main Effects of Inflammatory-Gene Variants Associated with Bladder Cancer Risk

    The relationship between inflammation and cancer is well established in several tumor types, including bladder cancer. We performed an association study between 886 inflammatory-gene variants and bladder cancer risk in 1,047 cases and 988 controls from the Spanish Bladder Cancer (SBC)/EPICURO Study. A preliminary exploration with the widely used univariate logistic regression approach did not identify any significant SNP after correcting for multiple testing. We further applied two more comprehensive methods to capture the complexity of bladder cancer genetic susceptibility: Bayesian Threshold LASSO (BTL), a regularized regression method, and AUC-Random Forest (AUC-RF), a machine-learning algorithm. Both approaches explore the joint effect of markers. BTL analysis identified a signature of 37 SNPs in 34 genes showing an association with bladder cancer. AUC-RF detected an optimal predictive subset of 56 SNPs. Thirteen SNPs were identified by both methods in the total population. Using resources from the Texas Bladder Cancer study we were able to replicate 30% of the SNPs assessed. The associations between inflammatory SNPs and bladder cancer were reexamined among non-smokers to eliminate the effect of tobacco, one of the strongest and most prevalent environmental risk factors for this tumor. A 9-SNP signature was detected by BTL. Here we report, for the first time, a set of SNPs in inflammatory genes jointly associated with bladder cancer risk. These results highlight the importance of the complex structure of genetic susceptibility associated with cancer risk. The work was partially supported by the Fondo de Investigacion Sanitaria, Instituto de Salud Carlos III (G03/174, 00/0745, PI051436, PI061614, PI09-02102, G03/174 and Sara Borrell fellowship to ELM) and Ministry of Science and Innovation (MTM2008-06747-C02-02 and FPU fellowship award to VU), Spain; AGAUR-Generalitat de Catalunya (Grant 2009SGR-581); Fundació La Marató de TV3; Red Tematica de Investigacion Cooperativa en Cancer (RTICC); Asociacion Espanola Contra el Cancer (AECC); EU-FP7-201663; and RO1-CA089715 and CA34627; the Spanish National Institute for Bioinformatics (www.inab.org); and by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute, USA. MD Anderson support for this project included U01 CA 127615 (XW); R01 CA 74880 (XW); P50 CA 91846 (XW, CPD); Betty B. Marcus Chair fund in Cancer Prevention (XW); UT Research Trust fund (XW) and R01 CA 131335 (JG).