556 research outputs found

    Exploiting likely-positive and unlabeled data to improve the identification of protein-protein interaction articles

    Background: Experimentally verified protein-protein interactions (PPI) cannot be easily retrieved by researchers unless they are stored in PPI databases. The curation of such databases can be made faster by ranking newly-published articles' relevance to PPI, a task which we approach here by designing a machine-learning-based PPI classifier. All classifiers require labeled data, and the more labeled data available, the more reliable they become. Although many PPI databases with large numbers of labeled articles are available, incorporating these databases into the base training data may actually reduce classification performance since the supplementary databases may not annotate exactly the same PPI types as the base training data. Our first goal in this paper is to find a method of selecting likely positive data from such supplementary databases. Only extracting likely positive data, however, will bias the classification model unless sufficient negative data is also added. Unfortunately, negative data is very hard to obtain because there are no resources that compile such information. Therefore, our second aim is to select such negative data from unlabeled PubMed data. Thirdly, we explore how to exploit these likely positive and negative data. And lastly, we look at the somewhat unrelated question of which term-weighting scheme is most effective for identifying PPI-related articles. Results: To evaluate the performance of our PPI text classifier, we conducted experiments based on the BioCreAtIvE-II IAS dataset. Our results show that adding likely-labeled data generally increases AUC by 3–6%, indicating better ranking ability. Our experiments also show that our newly-proposed term-weighting scheme has the highest AUC among all common weighting schemes. Our final model achieves an F-measure and AUC 2.9% and 5.0% higher than those of the top-ranking system in the IAS challenge. Conclusion: Our experiments demonstrate the effectiveness of integrating unlabeled and likely labeled data to augment a PPI text classification system. Our mixed model is suitable for ranking purposes whereas our hierarchical model is better for filtering. In addition, our results indicate that supervised weighting schemes outperform unsupervised ones. Our newly-proposed weighting scheme, TFBRF, which considers documents that do not contain the target word, avoids some of the biases found in traditional weighting schemes. Our experimental results show TFBRF to be the most effective among several other top weighting schemes.

    Pretreatment carcinoembryonic antigen level is a risk factor for para-aortic lymph node recurrence in addition to squamous cell carcinoma antigen following definitive concurrent chemoradiotherapy for squamous cell carcinoma of the uterine cervix

    Background: To identify pretreatment carcinoembryonic antigen (CEA) levels as a risk factor for para-aortic lymph node (PALN) recurrence following concurrent chemoradiotherapy (CCRT) for cervical cancer. Methods: From March 1995 to January 2008, 188 patients with squamous cell carcinoma (SCC) of the uterine cervix were analyzed retrospectively. No patient received PALN irradiation as the initial treatment. CEA and squamous cell carcinoma antigen (SCC-Ag) were measured before and after radiotherapy. PALN recurrence was detected by computed tomography (CT) scans. We analyzed the actuarial rates of PALN recurrence by using Kaplan-Meier curves. Multivariate analyses were carried out with Cox regression models. We stratified the risk groups based on the hazard ratios (HR). Results: Pretreatment CEA levels ≥ 10 ng/mL combined with SCC-Ag levels < 10 ng/mL (p < 0.001, HR = 8.838), SCC-Ag levels ≥ 40 ng/mL (p < 0.001, HR = 12.551), and SCC-Ag levels of 10-40 ng/mL (p < 0.001, HR = 4.2464) were significant factors for PALN recurrence. The corresponding 5-year PALN recurrence rates were 51.5%, 84.8%, and 27.5%, respectively. The 5-year PALN recurrence rate for patients with both low (< 10 ng/mL) SCC-Ag and CEA was only 9.6%. CEA levels ≥ 10 ng/mL or SCC-Ag levels ≥ 10 ng/mL at PALN recurrence were associated with overall survival after an isolated PALN recurrence. Pretreatment CEA levels ≥ 10 ng/mL were also associated with survival after an isolated PALN recurrence. Conclusions: Pretreatment CEA ≥ 10 ng/mL is an additional risk factor for PALN relapse following definitive CCRT for SCC of the uterine cervix in patients with pretreatment SCC-Ag levels < 10 ng/mL. More comprehensive examinations before CCRT and intensive follow-up schedules are suggested for early detection and salvage in patients with SCC-Ag or CEA levels ≥ 10 ng/mL.

    A comparison of ARMS and direct sequencing for EGFR mutation analysis and Tyrosine Kinase Inhibitors treatment prediction in body fluid samples of Non-Small-Cell Lung Cancer patients

    Background: Epidermal growth factor receptor (EGFR) mutation is strongly associated with the therapeutic effect of tyrosine kinase inhibitors (TKIs) in patients with non-small-cell lung cancer (NSCLC). Nevertheless, the tumor tissue needed for mutation analysis is frequently unavailable. Body fluid has been considered a feasible substitute for the analysis, but problems arising in clinical practice, such as a relatively lower mutation rate and poor clinical correlation, are not yet fully resolved. Method: In this study, 50 patients (32 pleural fluid and 18 plasma samples) who had received TKI therapy and had direct sequencing results were selected from 220 patients for further analysis. The EGFR mutation status was re-evaluated by the Amplification Refractory Mutation System (ARMS), and the clinical outcomes of TKI therapy were analyzed retrospectively. Results: As compared with direct sequencing, 16 positive and 23 negative patients were confirmed by ARMS, and the other 11 formerly negative patients (6 pleural fluids and 5 plasmas) were redefined as positive, with a fairly good clinical outcome (7 PR, 3 SD, and 1 PD). The objective response rate (ORR) of positive patients was significant: 81.3% (direct sequencing) and 72.7% (ARMS) for pleural fluids, and 80% (ARMS) for plasma. Notably, even after reclassification by ARMS, the ORR for negative patients was still relatively high: 60% for pleural fluids and 46.2% for plasma. Conclusions: When body fluids are used for EGFR mutation analysis, a positive result is consistently a good indicator for TKI therapy, and its predictive effect was no less than that of tumor tissue, no matter what method was employed. However, even after reclassification by ARMS, the correlation between negative results and the clinical outcome of TKI therapy was still unsatisfactory. The results indicated that false-negative results still occurred, which may be resolved by using a method sensitive to single DNA molecules or by optimizing the extraction procedure with RNA or CTC to ensure an adequate amount of tumor-derived nucleic acid for the test.

    Wolfberry genomes and the evolution of Lycium (Solanaceae)

    Wolfberry Lycium, an economically important genus of the Solanaceae family, contains approximately 80 species and shows a fragmented distribution pattern across the Northern and Southern Hemispheres. Although several herbaceous species of Solanaceae have been subjected to genome sequencing, thus far, no genome sequences of woody representatives have been available. Here, we sequenced the genomes of 13 perennial woody species of Lycium, with a focus on Lycium barbarum. Integration with other genomes provides clear evidence supporting a whole-genome triplication (WGT) event shared by all hitherto sequenced solanaceous plants, which occurred shortly after the divergence of Solanaceae and Convolvulaceae. We identified new gene families and gene family expansions and contractions that first appeared in Solanaceae. Based on the identification of self-incompatibility-related gene families, we inferred that hybridization hotspots are enriched for genes that might be functioning in gametophytic self-incompatibility pathways in wolfberry. Extremely low expression of LOCULE NUMBER (LC) and COLORLESS NON-RIPENING (CNR) orthologous genes during Lycium fruit development and ripening processes suggests functional diversification of these two genes between Lycium and tomato. The existence of additional flowering locus C-like MADS-box genes might correlate with the perennial flowering cycle of Lycium. Differential gene expression involved in the lignin biosynthetic pathway between Lycium and tomato likely illustrates woody and herbaceous differentiation. We also provide evidence that Lycium migrated from Africa into Asia, and subsequently from Asia into North America. Our results provide functional insights into Solanaceae origins, evolution and diversification.

    Centrally Administered Pertussis Toxin Inhibits Microglia Migration to the Spinal Cord and Prevents Dissemination of Disease in an EAE Mouse Model

    Background: Experimental autoimmune encephalomyelitis (EAE) models are important vehicles for studying the effect of infectious elements such as Pertussis toxin (PTx) on disease processes related to acute demyelinating encephalomyelitis (ADEM) or multiple sclerosis (MS). PTx has pleiotropic effects on the immune system. This study was designed to investigate the effects of PTx administered intracerebroventricularly (icv) in preventing downstream immune cell infiltration and demyelination of the spinal cord. Methods and Findings: EAE was induced in C57BL/6 mice with MOG35–55. PTx icv at seven days post MOG immunization resulted in mitigation of clinical motor symptoms, minimal T cell infiltration, and the marked absence of axonal loss and demyelination of the spinal cord. Integrity of the blood brain barrier (BBB) was compromised in the brain whereas spinal cord BBB integrity remained intact. PTx icv markedly increased microglia numbers in the brain, preventing their migration to the spinal cord. An in vitro transwell study demonstrated that PTx inhibited migration of microglia. Conclusion: Centrally administered PTx abrogated migration of microglia in EAE mice, limiting the inflammatory cytokine milieu to the brain and preventing dissemination of demyelination. The effects of PTx icv warrant further investigation and provide an attractive template for further study regarding the pleiotropic effects of infectious elements such as PTx in th

    The Forward Physics Facility at the High-Luminosity LHC


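    Optimasi Portofolio Resiko Menggunakan Model Markowitz MVO Dikaitkan dengan Keterbatasan Manusia dalam Memprediksi Masa Depan dalam Perspektif Al-Qur`an [Risk Portfolio Optimization Using the Markowitz MVO Model in Relation to Human Limitations in Predicting the Future from the Perspective of the Al-Qur`an]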

    The risk portfolio in modern finance has become increasingly technical, requiring the use of sophisticated mathematical tools in both research and practice. Since companies cannot insure themselves completely against risk, given the human inability to predict the future precisely as written in Al-Quran surah Luqman verse 34, they have to manage it to yield an optimal portfolio. The objective here is to minimize the variance among all portfolios, or alternatively, to maximize expected return among all portfolios that have at least a certain expected return. Furthermore, this study focuses on optimizing the risk portfolio with the so-called Markowitz MVO (Mean-Variance Optimization) model. The theoretical frameworks used for the analysis are the arithmetic mean, geometric mean, variance, covariance, linear programming, and quadratic programming. Moreover, finding a minimum-variance portfolio produces a convex quadratic program, that is, minimizing a quadratic objective function in x subject to a linear inequality (≥) constraint on x and a linear equality constraint on Ax. The outcome of this research is an optimal risk portfolio solution for several investments, computed with MATLAB R2007b together with its graphical analysis.
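
    As a rough illustration of the minimum-variance problem mentioned in this abstract, the sketch below computes global minimum-variance weights in plain Python. It is not the authors' MATLAB procedure: it assumes a single full-investment constraint (weights sum to 1), allows short positions, and omits the expected-return inequality constraint. The function names and the covariance values are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def min_variance_weights(cov):
    """Weights minimizing w^T cov w subject to sum(w) == 1.

    Assumption: only the budget constraint is imposed, so the solution is
    proportional to cov^{-1} * 1 and short positions are allowed.
    """
    y = solve_linear(cov, [1.0] * len(cov))
    total = sum(y)
    return [v / total for v in y]


def portfolio_variance(cov, w):
    """Return the quadratic form w^T cov w."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical covariance matrix for two assets (illustrative numbers only).
cov = [[0.04, 0.01],
       [0.01, 0.09]]
weights = min_variance_weights(cov)
variance = portfolio_variance(cov, weights)
```

    With an expected-return constraint added, the problem is no longer solved by a single linear system in general and would need a quadratic programming solver, which is the setting the abstract describes.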

    Search for heavy resonances decaying to two Higgs bosons in final states containing four b quarks

    A search is presented for narrow heavy resonances X decaying into pairs of Higgs bosons (H) in proton-proton collisions collected by the CMS experiment at the LHC at $\sqrt{s} = 8$ TeV. The data correspond to an integrated luminosity of 19.7 fb$^{-1}$. The search considers HH resonances with masses between 1 and 3 TeV, having final states of two b quark pairs. Each Higgs boson is produced with large momentum, and the hadronization products of the pair of b quarks can usually be reconstructed as single large jets. The background from multijet and $t\bar{t}$ events is significantly reduced by applying requirements related to the flavor of the jet, its mass, and its substructure. The signal would be identified as a peak on top of the dijet invariant mass spectrum of the remaining background events. No evidence is observed for such a signal. Upper limits obtained at 95% confidence level for the product of the production cross section and branching fraction $\sigma(gg \to X)\, B(X \to HH \to b\bar{b}b\bar{b})$ range from 10 to 1.5 fb for the mass of X from 1.15 to 2.0 TeV, significantly extending previous searches. For a warped extra dimension theory with a mass scale $\Lambda_R = 1$ TeV, the data exclude radion scalar masses between 1.15 and 1.55 TeV.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.