14 research outputs found

    Additional file 1 of Imbalance learning for the prediction of N6-Methylation sites in mRNAs

    Data set of human mature mRNA N6-methylation. The training and testing data used in this paper are available in this file. For each sample, the transcript ID, site position, transcript length, and a 26-nt flanking sequence are given. (XLSX 15155 kb)

    Additional file 2 of Imbalance learning for the prediction of N6-Methylation sites in mRNAs

    Supplementary Tables, Algorithm and Figure. Table S1: Results of Fisher’s exact test on the training data. The SNP variant states of positive and negative samples are counted separately at every position in the window sequence. The P-value is computed with the Fisher’s exact test function from the Python SciPy package. Table S2: Complete SNP specificity ranking for all positions. Table S3: Feature distribution in the HMpre feature space. Algorithm S1: SNP Specificity Identification Algorithm. Figure S1: Distribution of feature importance scores in the XGBoost classifier learning stage. (PDF 323 kb)
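As a rough illustration of the per-position test described for Table S1, the sketch below computes a two-sided Fisher’s exact P-value for one 2x2 table of variant and non-variant counts in positive and negative samples. It is a standard-library stand-in for the SciPy function the authors mention, not their code; the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout for one window position:
        a = positive samples with a variant, b = positive samples without
        c = negative samples with a variant, d = negative samples without
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Made-up counts, for illustration only.
p = fisher_exact_two_sided(3, 1, 1, 3)
```

Running this once per window position and sorting positions by P-value would give a ranking in the spirit of Table S2, although the authors’ actual ranking procedure is defined in Algorithm S1 and may differ.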

    The Incidence Characteristics of Second Primary Malignancy after Diagnosis of Primary Colon and Rectal Cancer: A Population Based Study

    Background: With the expanding population of colorectal cancer (CRC) survivors in the United States, one concern is these survivors' risk of developing second primary malignancies (SPMs). The present study attempts to identify the incidence characteristics of SPMs after diagnosis of first primary colon cancer (CC) and rectal cancer (RC). Methods: 189,890 CC and 83,802 RC cases were identified from the Surveillance, Epidemiology and End Results Program (SEER) database. We performed rate analysis on the incidence trend of SPMs in both CC and RC. Expected incidence rates were stratified by age, race and stage, calendar year of first CRC diagnosis, and latency period since first CRC diagnosis. Standardized incidence ratios (SIRs), a measure for estimating the risk of SPMs, were calculated for CC and RC separately. Results: The incidence of SPMs in both CC and RC decreased from 1992 to 2012. Both CC and RC survivors had a higher risk of developing SPMs (SIR for CC = 1.13; SIR for RC = 1.05). For CC patients, the highest risks of SPMs were for cancers of the small intestine (SIR = 4.03), colon (SIR = 1.87) and rectum (SIR = 1.80). For RC patients, the highest risks of SPMs were for cancers of the rectum (SIR = 2.88), small intestine (SIR = 2.16) and thyroid (SIR = 1.46). In stratified analyses, we also identified incidence characteristics that contributed to a higher risk of developing SPMs, including age between 20 and 40, American Indian/Alaska Native race, localized stage, diagnosis in calendar years 2002 to 2012, and latency between 12 and 59 months. Conclusions: Both CC and RC survivors remain at higher risk of developing SPMs. Identifying the incidence characteristics of SPMs is essential for continuous cancer surveillance among CRC survivors.
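For readers unfamiliar with the measure, an SIR is conventionally the number of observed cases divided by the number expected if the cohort experienced reference-population rates. The sketch below is a minimal illustration of that ratio with stratum-specific expected counts; the function names, the strata, and every number in it are hypothetical and are not taken from the study.

```python
from typing import Iterable, Tuple


def expected_cases(strata: Iterable[Tuple[float, float]]) -> float:
    """Sum person-years * reference incidence rate over strata.

    Each stratum is (person_years_at_risk, reference_rate_per_person_year).
    """
    return sum(person_years * rate for person_years, rate in strata)


def standardized_incidence_ratio(observed: int, expected: float) -> float:
    """SIR = observed cases / expected cases; values above 1 suggest excess risk."""
    if expected <= 0:
        raise ValueError("expected case count must be positive")
    return observed / expected


# Made-up strata (for example, two age groups), for illustration only.
hypothetical_strata = [(1000.0, 0.01), (2000.0, 0.02)]
expected = expected_cases(hypothetical_strata)
sir = standardized_incidence_ratio(60, expected)
```

An SIR above 1, such as the reported 1.13 for CC, indicates more second primary malignancies than the reference rates would predict.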