50 research outputs found

    Effective selection of informative SNPs and classification on the HapMap genotype data

    Background: Since single nucleotide polymorphisms (SNPs) are genetic variations that determine the differences between any two unrelated individuals, SNPs can be used to identify the source population of an individual. For efficient population identification with the HapMap genotype data, as few informative SNPs as possible should be selected from the original 4 million SNPs. Recently, Park et al. (2006) adopted the nearest shrunken centroid method to classify three populations, i.e., Utah residents with ancestry from Northern and Western Europe (CEU), Yoruba in Ibadan, Nigeria in West Africa (YRI), and Han Chinese in Beijing together with Japanese in Tokyo (CHB+JPT); 100,736 SNPs were obtained, and the top 82 SNPs could completely classify the three populations. Results: In this paper, we propose to first rank each feature (SNP) using a ranking measure, i.e., a modified t-test or F-statistic. Then, from the ranked list, we form different feature subsets by sequentially choosing different numbers of top-ranked features (e.g., 1, 2, 3, ..., 100), and train and test each subset with a classifier, e.g., the support vector machine (SVM), thereby finding the subset with the highest classification accuracy. Compared to the classification method of Park et al., we obtain a better result, i.e., good classification of the 3 populations using on average 64 SNPs. Conclusion: Experimental results show that both the modified t-test and the F-statistic methods are very effective in ranking SNPs by their classification capability. Combined with the SVM classifier, a desirable feature subset (of minimum size and maximum informativeness) can be found quickly in a greedy manner after ranking all SNPs. Our method is able to identify a very small number of important SNPs that can determine the populations of individuals.
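The rank-then-select procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it ranks features with a plain one-way ANOVA F-statistic (not the modified t-test), swaps the SVM for a simple nearest-centroid classifier so that the example needs only the standard library, and uses a tiny made-up genotype table coded 0/1/2. All function names and data values are hypothetical.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F-statistic of one feature across the class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    if within == 0:  # no within-class variance: perfectly separating or constant
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(X, y):
    """Return feature indices sorted by decreasing F-statistic."""
    n_features = len(X[0])
    scores = [f_statistic([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(X_train, y_train, X_test, y_test, features):
    """Test accuracy of a nearest-centroid classifier (stand-in for the SVM)."""
    by_class = {}
    for row, label in zip(X_train, y_train):
        by_class.setdefault(label, []).append([row[j] for j in features])
    centroids = {c: [mean(col) for col in zip(*rows)] for c, rows in by_class.items()}
    correct = 0
    for row, label in zip(X_test, y_test):
        point = [row[j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        correct += predicted == label
    return correct / len(y_test)


def select_top_k(X_train, y_train, X_test, y_test, max_k):
    """Try the top 1..max_k ranked features; keep the smallest best subset."""
    ranking = rank_features(X_train, y_train)
    best_k, best_acc = 0, -1.0
    for k in range(1, max_k + 1):
        acc = centroid_accuracy(X_train, y_train, X_test, y_test, ranking[:k])
        if acc > best_acc:  # strict '>' keeps the smaller subset on ties
            best_k, best_acc = k, acc
    return ranking[:best_k], best_acc


# Made-up toy data: rows are individuals, columns are three SNPs coded 0/1/2.
# SNP 0 is uninformative; SNPs 1 and 2 each separate the classes only partly.
X_train = [
    [0, 0, 0], [1, 0, 0], [2, 1, 0],   # population A
    [0, 2, 0], [1, 2, 1], [2, 2, 0],   # population B
    [0, 2, 2], [1, 2, 2], [2, 2, 2],   # population C
]
y_train = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
X_test = [[1, 0, 0], [0, 2, 0], [2, 2, 2], [2, 1, 0], [1, 2, 1], [0, 2, 2]]
y_test = ["A", "B", "C", "A", "B", "C"]

selected, accuracy = select_top_k(X_train, y_train, X_test, y_test, max_k=3)
print("selected SNP indices:", selected, "accuracy:", accuracy)
```

The loop only accepts a larger subset when it strictly improves accuracy, which mirrors the stated preference for the smallest subset that classifies well. A real analysis would use cross-validation, not a single train/test split.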

    De novo assembly of a transcriptome from the eggs and early embryos of Astropecten aranciacus

    Starfish have been instrumental in many fields of biological and ecological research. Oocytes of Astropecten aranciacus, a common species native to the Mediterranean Sea and the East Atlantic, have long been used as an experimental model to study meiotic maturation, fertilization, intracellular Ca2+ signaling, and cell cycle control. However, investigation of the underlying molecular mechanisms has often been hampered by the overall lack of DNA or protein sequences for the species. In this study, we have assembled a transcriptome for this species from the oocytes, eggs, zygotes, and early embryos, which are known to have the highest RNA sequence complexity. Annotation of the transcriptome identified over 32,000 transcripts, including those that encode 13 distinct cyclins and as many cyclin-dependent kinases (CDKs), as well as the expected components of the intracellular Ca2+ signaling toolkit. Although the mRNAs of the cyclin and CDK families did not undergo significant changes in abundance through the stages from oocyte to early embryo, as judged by real-time PCR, the transcript encoding Mos, a negative regulator of the mitotic cell cycle, was drastically reduced during the period of rapid cleavages. Molecular phylogenetic analysis using the homologous amino acid sequences of cytochrome oxidase subunit I from A. aranciacus and 30 other starfish species indicated that Paxillosida, to which A. aranciacus belongs, is not likely to be the most basal order in Asteroidea. Taken together, the first transcriptome assembled for this species is expected to enable comparative studies and the design of gene-specific molecular tools with which to tackle long-standing biological questions.

    Cyclic Steady State Space Refinement


    Synergistic Clinical Trials with CAD Systems for the Early Detection of Lung Cancer


    A deep learning line to assess patient’s lung cancer stages

    Our goal is to develop and maintain a comprehensive and integrated computer model that helps physicians plan the most appropriate treatment and anticipate a patient’s prognosis given the extent of the cancer. For example, cancer can be treated at an early stage by surgery or radiation, while chemotherapy may be the treatment for more advanced stages. Indeed, early detection of this type of cancer facilitates its treatment and may raise the patient’s prospects of survival. Thus, we present a formal view of an intelligent system for cancer feature extraction and analysis, intended to establish the basis on which physicians can plan treatment and predict a patient’s prognosis. It is based on the Logic Programming Language, draws a line between Deep Learning and Knowledge Representation and Reasoning, and is supported by a Case-Based approach to computing. Although each patient’s condition is different, the treatment of cancers at the same stage is often similar. This work has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2013.