233 research outputs found

    Primary tumor site specificity is preserved in patient-derived tumor xenograft models

    Patient-derived tumor xenograft (PDX) mouse models are widely used for drug screening. The underlying assumption is that PDX tissue closely resembles the original patient tissue and responds to drug treatment in the same way. To investigate whether primary tumor site information is well preserved in PDX, we analyzed the gene expression profiles of PDX mouse models originating from different tissues, including breast, kidney, large intestine, lung, ovary, pancreas, skin, and soft tissues. The popular Monte Carlo feature selection method was employed to analyze the expression profiles, yielding a feature list. From this list, incremental feature selection and a support vector machine (SVM) were adopted to extract genes distinctively expressed in PDXs from different primary tumor sites and to build an optimal SVM classifier. In addition, we set up a group of quantitative rules to identify primary tumor sites. A total of 755 genes were extracted by the feature selection procedures, on which the SVM classifier achieved a high performance, with an MCC of 0.986, in classifying primary tumor sites originating from different tissues. Furthermore, we obtained 16 classification rules, which gave a lower accuracy but clear classification procedures. These results validated that primary tumor site specificity was well preserved in PDX, as the PDXs from different primary tumor sites were still very different and these PDX differences were similar to the differences observed in patients with tumors. For example, VIM and ABHD17C were highly expressed in PDXs from breast tissue and were also highly expressed in breast cancer patients.
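
The incremental feature selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `incremental_feature_selection`, the `evaluate` callback, and the toy scoring function are all hypothetical. In the study, the score would be the MCC of an SVM trained on each candidate subset; here a stand-in score is used so the example runs with the standard library only.

```python
from typing import Callable, List, Sequence, Tuple


def incremental_feature_selection(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
    step: int = 1,
) -> Tuple[List[str], float]:
    """Return the top-k prefix of a ranked feature list with the best score."""
    best_k, best_score = 0, float("-inf")
    # Grow the subset from the top of the ranking, `step` features at a time.
    for k in range(step, len(ranked) + 1, step):
        score = evaluate(ranked[:k])
        if score > best_score:
            best_k, best_score = k, score
    return list(ranked[:best_k]), best_score


# Hypothetical stand-in for "train a classifier and report its MCC":
# reward informative genes, penalize subset size slightly.
INFORMATIVE = {"g1", "g2"}


def toy_score(subset: Sequence[str]) -> float:
    hits = sum(1 for gene in subset if gene in INFORMATIVE)
    return hits - 0.1 * len(subset)


best_features, best_score = incremental_feature_selection(
    ["g1", "g2", "g3", "g4"], toy_score
)
print(best_features, round(best_score, 2))
```

Passing the scoring step in as a callback keeps the loop independent of the classifier, so an SVM, a random forest, or any other model could be swapped in without changing the selection logic.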

    Identifying patients with atrioventricular septal defect in Down syndrome populations by using self-normalizing neural networks and feature selection

    Atrioventricular septal defect (AVSD) is a clinically significant subtype of congenital heart disease (CHD) that severely influences the health of babies during birth and is associated with Down syndrome (DS). Thus, exploring the differences in functional genes in DS samples with and without AVSD is a critical way to investigate the complex association between AVSD and DS. In this study, we present a computational method to distinguish DS patients with AVSD from those without AVSD using the newly proposed self-normalizing neural network (SNN). First, each patient was encoded by using the copy number of probes on chromosome 21. The encoded features were ranked by the reliable Monte Carlo feature selection (MCFS) method to obtain a ranked feature list. Based on this feature list, we used a two-stage incremental feature selection to construct two series of feature subsets and applied SNNs to build classifiers to identify optimal features. Results show that 2737 optimal features were obtained, and the corresponding optimal SNN classifier constructed on these features yielded a Matthews correlation coefficient (MCC) value of 0.748. For comparison, random forest was also used to build classifiers and uncover optimal features. This method achieved an optimal MCC value of 0.582 when the top 132 features were utilized. Finally, we analyzed some key features among the optimal SNN features that are supported by the literature to further reveal their essential roles.
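
For readers unfamiliar with the metric reported here, the Matthews correlation coefficient for a two-class problem (such as DS with versus without AVSD) can be computed from the confusion matrix. The sketch below is illustrative only; the helper name `binary_mcc` and the example labels are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence


def binary_mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient for labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical labels: 1 = AVSD, 0 = no AVSD.
print(binary_mcc([1, 1, 0, 0], [1, 0, 0, 0]))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it more informative than accuracy when the two classes are unbalanced.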

    Classification of Widely and Rarely Expressed Genes with Recurrent Neural Network

    Tissue-specific gene expression shapes the formation of tissues, while gene expression changes reflect the immune response of the human body to environmental stimulations or pressure, particularly in disease conditions, such as cancers. A few genes are commonly expressed across tissues or various cancers, while others are not. To investigate the functional differences between widely and rarely expressed genes, we defined the genes that were expressed in 32 normal tissues/cancers as widely expressed genes (FPKM >1 in all samples) and those that were not detected as rarely expressed genes (FPKM <1 in all samples), based on the large gene expression data set provided by Uhlen et al. Each gene was encoded using the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment scores. Minimum redundancy maximum relevance (mRMR) was used to measure these features and rank them in the mRMR feature list. Thereafter, we applied the incremental feature selection method with a supervised classifier, the recurrent neural network (RNN), to select the discriminative features for distinguishing widely expressed genes from rarely expressed genes and to construct an optimum RNN classifier. The Youden's indexes generated by the optimum RNN classifier and evaluated using 10-fold cross-validation were 0.739 for normal tissues and 0.639 for cancers. Furthermore, the underlying mechanisms of the key discriminative GO and KEGG features were analyzed. The results can facilitate the identification of the expression landscape of genes and the elucidation of how gene expression shapes tissues and the microenvironment of cancers.
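
Youden's index, the performance measure quoted above, is sensitivity plus specificity minus one. The following sketch shows the arithmetic; the function name `youden_index` and the confusion-matrix counts are hypothetical and chosen only for illustration.

```python
def youden_index(tp: int, fn: int, tn: int, fp: int) -> float:
    """Youden's J = sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)  # fraction of positives correctly identified
    specificity = tn / (tn + fp)  # fraction of negatives correctly identified
    return sensitivity + specificity - 1.0


# Hypothetical counts: 8 of 10 positives and 7 of 10 negatives are correct.
print(youden_index(tp=8, fn=2, tn=7, fp=3))
```

A value of 0 corresponds to a classifier that does no better than chance, and 1 to a classifier with no false positives or false negatives.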

    Tissue Expression Difference between mRNAs and lncRNAs

    Messenger RNA (mRNA) and long noncoding RNA (lncRNA) are two main subgroups of RNAs participating in transcription regulation. With the development of next generation sequencing, an increasing number of lncRNAs have been identified, and many hidden functions of lncRNAs have been revealed. However, the differences between lncRNAs and mRNAs are still unclear. For example, we need to determine whether lncRNAs have stronger tissue specificity than mRNAs and which tissues have more lncRNAs expressed. To investigate such tissue expression differences between mRNAs and lncRNAs, we encoded 9339 lncRNAs and 14,294 mRNAs with 71 expression features, including 69 maximum expression features for 69 types of cells, one feature for the maximum expression in all cells, and one expression specificity feature that was measured as Chao-Shen-corrected Shannon's entropy. With advanced feature selection methods, such as maximum relevance minimum redundancy and incremental feature selection, together with the random forest algorithm, 13 features were found to capture the dissimilarity between lncRNAs and mRNAs. The 11 cell subtype features indicated the cell types in which lncRNAs and mRNAs had the largest expression difference. Such cell subtypes may be potential cell models for lncRNA identification and function investigation. The expression specificity feature suggested that the cell types expressing mRNAs and lncRNAs were different. The maximum expression feature suggested that the maximum expression levels of mRNAs and lncRNAs were different. In addition, a rule learning algorithm, repeated incremental pruning to produce error reduction, was employed to produce effective classification rules for classifying lncRNAs and mRNAs; it gave results competitive with random forest and a clearer picture of the different expression patterns of lncRNAs and mRNAs. The results not only revealed the heterogeneous expression patterns of lncRNA and mRNA, but also gave rise to the development of a new tool to identify the potential biological functions of such RNA subgroups.
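
The expression specificity feature mentioned above is based on Shannon's entropy of a transcript's expression across cell types. The sketch below computes the plain (uncorrected) entropy only; the Chao-Shen correction used in the study is not implemented here, and the function name `expression_entropy` and the example values are hypothetical.

```python
import math
from typing import Sequence


def expression_entropy(values: Sequence[float]) -> float:
    """Shannon entropy (in bits) of expression levels across cell types."""
    total = sum(values)
    if total <= 0:
        raise ValueError("at least one expression value must be positive")
    # Convert expression levels to proportions, skipping zero entries.
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical profiles over four cell types.
print(expression_entropy([5.0, 5.0, 5.0, 5.0]))  # evenly expressed
print(expression_entropy([20.0, 0.0, 0.0, 0.0]))  # restricted to one cell type
```

Low entropy indicates expression concentrated in a few cell types (high specificity), while high entropy indicates broad expression.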

    Lithography-free Fabrication of High Quality Substrate-supported and Freestanding Graphene devices

    We present a lithography-free technique for the fabrication of clean, high-quality graphene devices. This technique is based on evaporation through hard Si shadow masks and eliminates contaminants introduced by lithographic processes. We demonstrate that devices fabricated by this technique have significantly higher mobility values than those fabricated by standard electron beam lithography. To obtain ultra-high mobility devices, we extend this technique to fabricate suspended graphene samples with mobility as high as 120,000 cm^2/Vs.

    Generation of polyclonal antibody with high avidity to rosuvastatin and its use in development of highly sensitive ELISA for determination of rosuvastatin in plasma

    In this study, a polyclonal antibody with high avidity and specificity to the potent hypocholesterolaemic agent rosuvastatin (ROS) has been prepared and used in the development of a highly sensitive enzyme-linked immunosorbent assay (ELISA) for the determination of ROS in plasma. ROS was coupled to keyhole limpet hemocyanin (KLH) and bovine serum albumin (BSA) using a carbodiimide reagent. The ROS-KLH conjugate was used for immunization of 8-week-old female New Zealand white rabbits. The immune response of the rabbits was monitored by direct ELISA using ROS-BSA immobilized onto microwell plates as a solid phase. The rabbit that showed the highest antibody titer and avidity to ROS was sacrificed and its sera were collected. The IgG fraction was isolated and purified by affinity chromatography on a protein A column. The purified antibody showed high avidity to ROS; IC50 = 0.4 ng/ml. The specificity of the antibody for ROS was evaluated by indirect ELISA using various competitors from the ROS structural analogues and the therapeutic agents used with ROS in combination therapy. The proposed ELISA involved a competitive binding reaction between ROS in the plasma sample and the immobilized ROS-BSA for the binding sites on a limited amount of the anti-ROS antibody. The bound anti-ROS antibody was quantified with horseradish peroxidase-labeled secondary anti-rabbit IgG antibody (HRP-IgG) and 3,3',5,5'-tetramethylbenzidine (TMB) as a substrate for the peroxidase enzyme. The concentration of ROS in the sample was quantified by its ability to inhibit the binding of the anti-ROS antibody to the immobilized ROS-BSA and, consequently, to reduce the color intensity in the assay wells. The assay enabled the determination of ROS in plasma at concentrations as low as 40 pg/ml.

    DNA Barcode Goes Two-Dimensions: DNA QR Code Web Server

    DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, “DNA barcode” actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers, ITS2, rbcL, matK, psbA-trnH, and CO1, was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and a relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interest, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.
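
To give a sense of why compression matters when fitting a sequence into a barcode symbol, the sketch below packs each nucleotide into two bits, so four bases occupy one byte. This is an assumed, generic encoding shown for illustration; the abstract does not say which compression scheme the web server uses, and the names `pack_dna` and `unpack_dna` are hypothetical. Only unambiguous A, C, G, and T bases are handled.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack_dna(seq: str) -> bytes:
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so decoding reads from the high bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack_dna(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


sequence = "ACGTAC"  # hypothetical fragment
packed = pack_dna(sequence)
print(len(sequence), "bases ->", len(packed), "bytes")
print(unpack_dna(packed, len(sequence)))
```

The sequence length has to be stored alongside the packed bytes, because padding in the final byte is otherwise indistinguishable from trailing A bases.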

    Palmitoleate Induces Hepatic Steatosis but Suppresses Liver Inflammatory Response in Mice

    The interaction between fat deposition and inflammation during obesity contributes to the development of non-alcoholic fatty liver disease (NAFLD). The present study examined the effects of palmitoleate, a monounsaturated fatty acid (16∶1n7), on liver metabolic and inflammatory responses, and investigated the mechanisms by which palmitoleate increases hepatocyte fatty acid synthase (FAS) expression. Male wild-type C57BL/6J mice were supplemented with palmitoleate and subjected to the assays to analyze hepatic steatosis and liver inflammatory response. Additionally, mouse primary hepatocytes were treated with palmitoleate and used to analyze fat deposition, the inflammatory response, and sterol regulatory element-binding protein 1c (SREBP1c) activation. Compared with controls, palmitoleate supplementation increased the circulating levels of palmitoleate and improved systemic insulin sensitivity. Locally, hepatic fat deposition and SREBP1c and FAS expression were significantly increased in palmitoleate-supplemented mice. These pro-lipogenic events were accompanied by improvement of liver insulin signaling. In addition, palmitoleate supplementation reduced the numbers of macrophages/Kupffer cells in livers of the treated mice. Consistently, supplementation of palmitoleate decreased the phosphorylation of nuclear factor kappa B (NF-κB, p65) and the expression of proinflammatory cytokines. These results were recapitulated in primary mouse hepatocytes. In terms of regulating FAS expression, treatment of palmitoleate increased the transcription activity of SREBP1c and enhanced the binding of SREBP1c to FAS promoter. Palmitoleate also decreased the phosphorylation of NF-κB p65 and the expression of proinflammatory cytokines in cultured macrophages. Together, these results suggest that palmitoleate acts through dissociating liver inflammatory response from hepatic steatosis to play a unique role in NAFLD

    Applications of Nature-Inspired Algorithms for Dimension Reduction: Enabling Efficient Data Analytics

    In [1], we explored the theoretical aspects of feature selection and evolutionary algorithms. In this chapter, we focus on optimization algorithms for enhancing the data analytic process, i.e., we propose to explore applications of nature-inspired algorithms in data science. Feature selection optimization is a hybrid approach that leverages feature selection techniques and evolutionary algorithms to optimize the selected features. Prior works solve this problem iteratively to converge to an optimal feature subset. Feature selection optimization is a domain-independent approach. Data scientists mainly attempt to find an advanced way to analyze data with high computational efficiency and low time complexity, leading to efficient data analytics. As the data generated, measured, and sensed from various sources increase, the analysis, manipulation, and illustration of data grow exponentially. Due to large-scale data sets, the curse of dimensionality (CoD) is one of the NP-hard problems in data science. Hence, several efforts have focused on leveraging evolutionary algorithms (EAs) to address the complex issues in large-scale data analytics problems. Dimension reduction, together with EAs, lends itself to solving CoD and to solving complex problems efficiently in terms of time complexity. In this chapter, we first provide a brief overview of previous studies that focused on solving CoD using the feature extraction optimization process. We then discuss practical examples of research studies that successfully tackled some application domains, such as image processing, sentiment analysis, network traffic and anomaly analysis, credit score analysis, and other benchmark function and data set analyses.

    Transcriptome and Network Changes in Climbers at Extreme Altitudes

    Extreme altitude can induce a range of cellular and systemic responses. Although it is known that hypoxia underlies the major changes and that the physiological responses include hemodynamic changes and erythropoiesis, the molecular mechanisms and signaling pathways mediating such changes are largely unknown. To obtain a more complete picture of the transcriptional regulatory landscape and networks involved in extreme altitude response, we followed four climbers on an expedition up Mount Xixiabangma (8,012 m), and collected blood samples at four stages during the climb for mRNA and miRNA expression assays. By analyzing dynamic changes of gene networks in response to extreme altitudes, we uncovered a highly modular network with 7 modules of various functions that changed in response to extreme altitudes. The erythrocyte differentiation module is the most prominently up-regulated, reflecting increased erythrocyte differentiation from hematopoietic stem cells, probably at the expense of differentiation into other cell lineages. These changes are accompanied by coordinated down-regulation of general translation. Network topology and flow analyses also uncovered regulators known to modulate hypoxia responses and erythrocyte development, as well as unknown regulators, such as the OCT4 gene, an important regulator in stem cells and assumed to only function in stem cells. We predicted computationally and validated experimentally that increased OCT4 expression at extreme altitude can directly elevate the expression of hemoglobin genes. Our approach established a new framework for analyzing the transcriptional regulatory network from a very limited number of samples