346 research outputs found

    Ground truth deficiencies in software engineering: when codifying the past can be counterproductive

    Many software engineering tools build and evaluate their models based on historical data to support development and process decisions. These models help us answer numerous interesting questions, but have their own caveats. In a real-life setting, the objective function of human decision-makers for a given task might be influenced by a whole host of factors that stem from their cognitive biases, subverting the ideal objective function required for an optimally functioning system. Relying on this data as ground truth may give rise to systems that end up automating software engineering decisions by mimicking past sub-optimal behaviour. We illustrate this phenomenon and suggest mitigation strategies to raise awareness.

    Mitochondrial carrier homolog 1 (Mtch1) antibodies in neuro-Behçet's disease

    Efforts to identify diagnostic autoantibodies for neuro-Behçet's disease (NBD) have failed. Screening of NBD patients' sera with a protein macroarray identified mitochondrial carrier homolog 1 (Mtch1), an apoptosis-related protein, as a potential autoantigen. ELISA studies showed serum Mtch1 antibodies in 68 of 144 BD patients with or without neurological involvement and in 4 of 168 controls, corresponding to a sensitivity of 47.2% and a specificity of 97.6%. Mtch1 antibody-positive NBD patients had more attacks, increased disability and lower serum nucleosome levels. Mtch1 antibody might be involved in the pathogenic mechanisms of NBD rather than being a coincidental byproduct of autoinflammation. © 2013 Elsevier B.V.
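
The sensitivity and specificity quoted in this abstract follow directly from the reported counts. A minimal sketch of the arithmetic is given here; the function name and argument names are illustrative and do not come from the paper.

```python
def sensitivity_specificity(pos_cases, n_cases, pos_controls, n_controls):
    """Return (sensitivity, specificity) as fractions.

    sensitivity = antibody-positive cases / all cases
    specificity = antibody-negative controls / all controls
    """
    sensitivity = pos_cases / n_cases
    specificity = (n_controls - pos_controls) / n_controls
    return sensitivity, specificity


# Counts reported in the abstract: 68 of 144 patients and 4 of 168 controls
# tested positive for Mtch1 antibodies.
sens, spec = sensitivity_specificity(68, 144, 4, 168)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With the abstract's counts this reproduces the stated 47.2% and 97.6%.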

    Dissipation and fluctuations in nanoelectromechanical systems based on carbon nanotubes

    Tribological characteristics of nanotube-based nanoelectromechanical systems (NEMS), exemplified by a gigahertz oscillator, are studied. Various factors that influence the tribological properties of nanotube-based NEMS are quantitatively analyzed using molecular dynamics calculations of the quality factor (Q-factor) of the gigahertz oscillator. We demonstrate that commensurability of the nanotube walls can increase the dissipation rate, while the structure of the wall ends and the nanotube length do not influence the Q-factor. It is shown that the dissipation rate depends on the interwall distance and on how the outer wall is fixed, and that it is significant when nanotubes with a large interwall distance are poorly fixed. Defects are found to strongly decrease the Q-factor due to the excitation of low-frequency vibrational modes. No universal correlation between the static friction forces and the energy dissipation rate is established. We propose an explanation of the obtained results on the basis of the classical theory of vibrational-translational relaxation. Significant thermodynamic fluctuations are revealed in the gigahertz oscillator by molecular dynamics simulations and analyzed in the framework of the fluctuation-dissipation theorem. The possibility of designing NEMS with a desired Q-factor, and their applications, are discussed on the basis of the above results. Comment: 32 pages, 7 figures

    ESTIMATING GENOME-WIDE COPY NUMBER USING ALLELE SPECIFIC MIXTURE MODELS

    Genomic changes such as copy number alterations are thought to be one of the major underlying causes of human phenotypic variation among normal and disease subjects [23,11,25,26,5,4,7,18]. These include chromosomal regions with so-called copy number alterations: instead of the expected two copies, a section of the chromosome for a particular individual may have zero copies (homozygous deletion), one copy (hemizygous deletion), or more than two copies (amplification). The canonical example is Down syndrome, which is caused by an extra copy of chromosome 21. Identification of such abnormalities in smaller regions has been of great interest, because it is believed to be an underlying cause of cancer. More than a decade ago, comparative genomic hybridization (CGH) technology was developed to detect copy number changes in a high-throughput fashion. However, this technology only provides a 10 Mb resolution, which limits the ability to detect copy number alterations spanning small regions. It is widely believed that a copy number alteration as small as one base can have significant downstream effects; thus, microarray manufacturers have developed technologies that provide much higher resolution. Unfortunately, strong probe effects and variation introduced by sample preparation procedures have made single-point copy number estimates too imprecise to be useful. CGH arrays use a two-color hybridization, usually comparing a sample of interest to a reference sample, which to some degree removes the probe effect. However, the resolution is not nearly high enough to provide single-point copy number estimates. Various groups have proposed statistical procedures that pool data from neighboring locations to improve precision. However, these procedures need to average across relatively large regions to work effectively, thus greatly reducing the resolution.
    Recently, regression-type models that account for the probe effect have been proposed and appear to improve accuracy as well as precision. In this paper, we propose a mixture model solution specifically designed for single-point estimation that provides various advantages over the existing methodology. We use a 314-sample database, constructed from public datasets, to motivate and fit models for the conditional distribution of the observed intensities given allele-specific copy numbers. With the estimated models in place, we can compute posterior probabilities that provide a useful prediction rule as well as a confidence measure for each call. Software to implement this procedure will be available in the Bioconductor oligo package (http://www.bioconductor.org).
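
The idea of turning fitted conditional distributions into posterior probabilities can be sketched with Bayes' rule. The example below is a deliberately simplified illustration, not the authors' model: it assumes a one-dimensional Gaussian component per total copy number rather than allele-specific distributions, and every mean, standard deviation and prior in it is a made-up placeholder.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def copy_number_posterior(intensity, components, priors):
    """Posterior P(copy number | intensity) via Bayes' rule.

    components: {copy_number: (mean, sd)} for the conditional
                distribution of the intensity given the copy number
    priors:     {copy_number: prior probability}
    """
    weighted = {
        cn: priors[cn] * normal_pdf(intensity, mean, sd)
        for cn, (mean, sd) in components.items()
    }
    total = sum(weighted.values())
    return {cn: w / total for cn, w in weighted.items()}


# Hypothetical parameters on a log-ratio scale (assumptions for illustration).
COMPONENTS = {0: (-2.0, 0.5), 1: (-0.6, 0.3), 2: (0.0, 0.25), 3: (0.4, 0.3)}
PRIORS = {0: 0.01, 1: 0.04, 2: 0.90, 3: 0.05}

posterior = copy_number_posterior(0.0, COMPONENTS, PRIORS)
call = max(posterior, key=posterior.get)  # prediction rule: most probable state
confidence = posterior[call]              # confidence measure for the call
print(call, round(confidence, 3))
```

The most probable state serves as the call and its posterior probability as the confidence measure, which mirrors the two uses of the posterior named in the abstract.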

    A fast and accurate method to detect allelic genomic imbalances underlying mosaic rearrangements using SNP array data

    Background: Mosaicism for copy number and copy neutral chromosomal rearrangements has been recently identified as a relatively common source of genetic variation in the normal population. However, its prevalence is poorly defined, since it has been studied systematically in only one large-scale study and with non-optimal ad hoc SNP array data analysis tools, uncovering rather large alterations (> 1 Mb) that affect a high proportion of cells. Here we propose a novel methodology, Mosaic Alteration Detection (MAD), and provide a software tool that is effective for capturing previously described alterations as well as new variants that are smaller in size and/or affect a low percentage of cells. Results: The developed method identified all previously known mosaic abnormalities reported in SNP array data obtained from controls, bladder cancer and HapMap individuals. In addition, the MAD tool was able to detect new mosaic variants not reported before that were smaller in size and affected a lower percentage of cells. The performance of the tool was analysed by studying simulated data for different scenarios. Our method showed high sensitivity and specificity for all assessed scenarios. Conclusions: The tool presented here has the ability to identify mosaic abnormalities with high sensitivity and specificity. Our results confirm the lack of sensitivity of former methods by identifying new mosaic variants not reported in previously utilised datasets. Our work suggests that the prevalence of mosaic alterations could be higher than initially thought. The use of appropriate SNP array data analysis methods would help in defining the human genome mosaic map.

    Atypical presentation of colon adenocarcinoma: a case report

    Introduction: Adenocarcinoma of the colon is the most common histopathological type of colorectal cancer. In Western Europe and the United States, it is the third most common type and accounts for 98% of cancers of the large intestine. In Uganda, as elsewhere in Africa, the majority of patients are elderly (at least 60 years old). However, more recently, it has been observed that younger patients (less than 40 years of age) are presenting with the disease. There is also an increase in its incidence and most patients present late, possibly because of the lack of a comprehensive national screening and preventive health-care program. We describe the clinicopathological features of colorectal carcinoma in the case of a young man in Kampala, Uganda. Case presentation: A 27-year-old man from Kampala, Uganda, presented with gross abdominal distension, progressive loss of weight, and fever. He was initially screened for tuberculosis, hepatitis, lymphoma, and human immunodeficiency virus/acquired immunodeficiency syndrome infection. After a battery of tests, a diagnosis of colorectal carcinoma was finally established with hematoxylin and eosin staining of a cell block made from the sediment of a liter of cytospun ascitic fluid, which showed atypical glands floating in abundant extracellular mucin, suggestive of adenocarcinoma. Ancillary tests with alcian blue/periodic acid Schiff and mucicarmine staining revealed that it was a mucinous adenocarcinoma. Immunohistochemistry showed strong positivity with CDX2, confirming that the origin of the tumor was the colon. Conclusions: Colorectal carcinoma has been noted to occur with increasing frequency in young adults in Africa. Most patients have mucinous adenocarcinoma, present late, and have rapid disease progression and poor outcome. Therefore, colorectal malignancy should no longer be excluded from consideration only on the basis of a patient's age.
    A high index of suspicion is important in the diagnosis of colorectal malignancy in young African patients.

    Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model

    Background: Copy number variants (CNVs) have been demonstrated to occur at a high frequency and are now widely believed to make a significant contribution to the phenotypic variation in human populations. Array-based comparative genomic hybridization (array-CGH) and the newly developed read-depth approach through ultrahigh throughput genomic sequencing both provide rapid, robust, and comprehensive methods to identify CNVs on a whole-genome scale. Results: We developed a Bayesian statistical analysis algorithm for the detection of CNVs from both types of genomic data. The algorithm can analyze such data obtained from PCR-based bacterial artificial chromosome arrays, high-density oligonucleotide arrays, and more recently developed high-throughput DNA sequencing. Treating the parameters that define the underlying data generating process (e.g., the number of CNVs, the position of each CNV, and the data noise level) as random variables, our approach derives the posterior distribution of the genomic CNV structure given the observed data. Sampling from the posterior distribution using a Markov chain Monte Carlo method, we get not only best estimates for these unknown parameters but also Bayesian credible intervals for the estimates. We illustrate the characteristics of our algorithm by applying it to both synthetic and experimental data sets in comparison to other segmentation algorithms. Conclusions: In particular, the synthetic data comparison shows that our method is more sensitive than other approaches at low false positive rates. Furthermore, given its Bayesian origin, our method can also be seen as a technique to refine CNVs identified by fast point-estimate methods and also as a framework to integrate array-CGH and sequencing data with other CNV-related biological knowledge, all through informative priors.

    Revealing the missing expressed genes beyond the human reference genome by RNA-Seq

    Background: A complete and accurate human reference genome is important for functional genomics research. Therefore, the incomplete reference genome and individual-specific sequences have significant effects on various studies. Results: We used two RNA-Seq datasets from human brain tissues and 10 mixed cell lines to investigate the completeness of the human reference genome. First, we demonstrated that in previously identified ~5 Mb Asian and ~5 Mb African novel sequences that are absent from the human reference genome of NCBI build 36, ~211 kb and ~201 kb of them could be transcribed, respectively. Our results suggest that many of those transcribed regions are not specific to Asian and African populations but are also present in Caucasian populations. Then, we found that the expression levels of 104 RefSeq genes that are unalignable to NCBI build 37 are higher than 0.1 RPKM in brain and cell lines. 55 of them are conserved across human, chimpanzee and macaque, suggesting that there are still a significant number of functional human genes absent from the human reference genome. Moreover, we identified hundreds of novel transcript contigs that cannot be aligned to NCBI build 37, RefSeq genes and EST sequences. Some of those novel transcript contigs are also conserved among human, chimpanzee and macaque. By positioning those contigs onto the human genome, we identified several large deletions in the reference genome. Several conserved novel transcript contigs were further validated by RT-PCR. Conclusion: Our findings demonstrate that a significant number of genes are still absent from the incomplete human reference genome, highlighting the importance of further refining the human reference genome and curating those missing genes. Our study also shows the importance of de novo transcriptome assembly.
    The comparative approach between the reference genome and other related human genomes based on the transcriptome provides an alternative way to refine the human reference genome.
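
The 0.1 RPKM expression threshold in this abstract uses the standard definition of RPKM (reads per kilobase of transcript per million mapped reads). A minimal sketch of that calculation follows; the read counts, gene length and library size are invented for illustration and are not taken from the study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads / (gene length in kb * mapped reads in millions)
         = reads * 1e9 / (gene length in bp * total mapped reads)
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical gene: 5 reads on a 2,000 bp transcript in a library of
# 20 million mapped reads.
value = rpkm(read_count=5, gene_length_bp=2_000, total_mapped_reads=20_000_000)
is_expressed = value > 0.1  # threshold quoted in the abstract
print(value, is_expressed)
```

The normalisation by length and library size is what makes a fixed cut-off such as 0.1 comparable across genes and samples.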

    Quantitative Analysis of Single Nucleotide Polymorphisms within Copy Number Variation

    BACKGROUND: Single nucleotide polymorphisms (SNPs) have been used extensively in genetics and epidemiology studies. Traditionally, SNPs that did not pass the Hardy-Weinberg equilibrium (HWE) test were excluded from these analyses. Many investigators have addressed possible causes for departure from HWE, including genotyping errors, population admixture and segmental duplication. Recent large-scale surveys have revealed abundant structural variations in the human genome, including copy number variations (CNVs). This suggests that a significant number of SNPs must be within these regions, which may cause deviation from HWE. RESULTS: We performed a Bayesian analysis of the potential effect of copy number variation, segmental duplication and genotyping errors on the behavior of SNPs. Our results suggest that copy number variation is a major factor in HWE violation for SNPs with a small minor allele frequency when the sample size is large and the genotyping error rate is between 0 and 1%. CONCLUSIONS: Our study provides the posterior probability that a SNP falls in a CNV or a segmental duplication, given the observed allele frequency of the SNP, the sample size and the significance level of HWE testing.
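
For readers unfamiliar with the HWE test mentioned in this abstract, the sketch below shows the common one-degree-of-freedom chi-square version for a biallelic SNP. It illustrates the filtering step only and is not the Bayesian analysis the authors performed; the genotype counts are invented.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Returns (chi_square, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts with a deficit of heterozygotes.
chi2, p_value = hwe_chi_square(30, 40, 30)
print(round(chi2, 3), round(p_value, 4))
```

A SNP would traditionally be excluded when the p-value falls below the chosen significance level, which is the practice the abstract re-examines.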