
    Scatteract: Automated extraction of data from scatter plots

    Charts are an excellent way to convey patterns and trends in data, but they do not facilitate further modeling of the data or close inspection of individual data points. We present a fully automated system for extracting the numerical values of data points from images of scatter plots. We use deep learning techniques to identify the key components of the chart, and optical character recognition together with robust regression to map from pixels to the coordinate system of the chart. We focus on scatter plots with linear scales, which already pose several interesting challenges. Previous work has done fully automatic extraction for other types of charts, but to our knowledge this is the first approach that is fully automatic for scatter plots. Our method performs well, achieving successful data extraction on 89% of the plots in our test set. Comment: Submitted to ECML PKDD 2017 proceedings, 16 pages.

    Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty

    Background: Accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, a comparative study of pairwise statistical significance with database statistical significance was conducted. In this paper, we extend the earlier work on pairwise statistical significance by incorporating the use of multiple parameter sets. Results: Results for a knowledge discovery application of homology detection reveal that using multiple parameter sets for pairwise statistical significance estimates gives better coverage than using a single parameter set, at least at some error levels. Further, the results of pairwise statistical significance using multiple parameter sets are shown to be significantly better than database statistical significance estimates reported by BLAST and PSI-BLAST, and comparable to, and at times significantly better than, those of SSEARCH. Using non-zero parameter set change penalty values gives better performance than using zero penalty. Conclusion: The fact that the homology detection performance does not degrade when using multiple parameter sets is strong evidence for the validity of the assumption that the alignment score distribution follows an extreme value distribution even when using multiple parameter sets. Parameter set change penalty is a useful parameter for alignment using multiple parameter sets. Pairwise statistical significance using multiple parameter sets can be effectively used to determine the relatedness of one pair (or a few pairs) of sequences without performing a time-consuming database search.

    Zebrafish Reproduction: Revisiting In Vitro Fertilization to Increase Sperm Cryopreservation Success

    Although conventional cryopreservation is a proven method for long-term, safe storage of genetic material, protocols used by the zebrafish community are not standardized and yield inconsistent results, thereby putting the security of many genotypes in individual laboratories and stock centers at risk. An important challenge for a successful zebrafish sperm cryopreservation program is the large variability in the post-thaw in vitro fertilization success (0 to 80%). But how much of this variability was due to the reproductive traits of the in vitro fertilization process, and not due to the cryopreservation process? These experiments only assessed the in vitro process with fresh sperm, but they also yielded the basic metrics needed for successful in vitro fertilization using cryopreserved sperm. We analyzed the reproductive traits for zebrafish males with a strict body condition range. Body condition did not correlate with sperm volume or motility (P>0.05), but it did correlate with sperm concentration. Younger males produced more concentrated sperm (P<0.05). To minimize the wastage of sperm during the in vitro fertilization process, 10^6 cells/ml was the minimum sperm concentration needed to achieve an in vitro fertilization success of ≥ 70%. During the in vitro process, pooling sperm did not reduce fertilization success (P>0.05), but pooling eggs reduced it by approximately 30 to 50% (P<0.05). This reduction in fertilization success was due not to the pooling of the females' eggs, but to the type of tools used to handle the eggs. Recommendations to enhance the in vitro process for zebrafish include: 1) using males of a body condition closer to 1.5 for maximal sperm concentration; 2) minimizing sperm wastage by using a working sperm concentration of 10^6 motile cells/ml for in vitro fertilization; and 3) never using metal or sharp-edged tools to handle eggs prior to fertilization.

    Unsupervised Bayesian linear unmixing of gene expression microarrays

    Background: This paper introduces a new constrained model and the corresponding algorithm, called unsupervised Bayesian linear unmixing (uBLU), to identify biological signatures from high dimensional assays like gene expression microarrays. The basis for uBLU is a Bayesian model for the data samples which are represented as an additive mixture of random positive gene signatures, called factors, with random positive mixing coefficients, called factor scores, that specify the relative contribution of each signature to a specific sample. The particularity of the proposed method is that uBLU constrains the factor loadings to be non-negative and the factor scores to be probability distributions over the factors. Furthermore, it also provides estimates of the number of factors. A Gibbs sampling strategy is adopted here to generate random samples according to the posterior distribution of the factors, factor scores, and number of factors. These samples are then used to estimate all the unknown parameters. Results: Firstly, the proposed uBLU method is applied to several simulated datasets with known ground truth and compared with previous factor decomposition methods, such as principal component analysis (PCA), non-negative matrix factorization (NMF), Bayesian factor regression modeling (BFRM), and the gradient-based algorithm for general matrix factorization (GB-GMF). Secondly, we illustrate the application of uBLU on a real time-evolving gene expression dataset from a recent viral challenge study in which individuals have been inoculated with influenza A/H3N2/Wisconsin. We show that the uBLU method significantly outperforms the other methods on the simulated and real data sets considered here. Conclusions: The results obtained on synthetic and real data illustrate the accuracy of the proposed uBLU method when compared to other factor decomposition methods from the literature (PCA, NMF, BFRM, and GB-GMF). The uBLU method identifies an inflammatory component closely associated with clinical symptom scores collected during the study. Using a constrained model allows recovery of all the inflammatory genes in a single factor.

    TRPV3 and TRPV4 ion channels are not major contributors to mouse heat sensation

    Background: The discovery of heat-sensitive Transient Receptor Potential Vanilloid (TRPV) ion channels provided a potential molecular explanation for the perception of innocuous and noxious heat stimuli. TRPV1 has a significant role in acute heat nociception and inflammatory heat hyperalgesia. Yet, substantial innocuous and noxious heat sensitivity remains in TRPV1 knockout animals. Here we investigated the role of two related channels, TRPV3 and TRPV4, in these capacities. We studied TRPV3 knockout animals on both C57BL6 and 129S6 backgrounds, as well as animals deficient in both TRPV3 and TRPV4 on a C57BL6 background. Additionally, we assessed the contributions of TRPV3 and TRPV4 to acute heat nociception and inflammatory heat hyperalgesia during inhibition of TRPV1. Results: TRPV3 knockout mice on the C57BL6 background exhibited no obvious alterations in thermal preference behavior. On the 129S6 background, absence of TRPV3 resulted in a more restrictive range of occupancy centered around cooler floor temperatures. TRPV3 knockout mice showed no deficits in acute heat nociception on either background. Mice deficient in both TRPV3 and TRPV4 on a C57BL6 background showed thermal preference behavior similar to wild-type controls on the thermal gradient, and little or no change in acute heat nociception or inflammatory heat hyperalgesia. Masking of TRPV1 by the TRPV1 antagonist JNJ-17203212 did not reveal differences between C57BL6 animals deficient in TRPV3 and TRPV4, compared to their wild-type counterparts. Conclusions: Our results support the notion that TRPV3 and TRPV4 likely make limited and strain-dependent contributions to innocuous warm temperature perception or noxious heat sensation, even when TRPV1 is masked. These findings imply the existence of other significant mechanisms for heat perception.

    EGFR Gene Overexpression Retained in an Invasive Xenograft Model by Solid Orthotopic Transplantation of Human Glioblastoma Multiforme Into Nude Mice

    Orthotopic xenograft animal models derived from human glioblastoma multiforme (GBM) cell lines often do not recapitulate two extremely important aspects of human GBM: invasive growth and epidermal growth factor receptor (EGFR) gene overexpression. We developed an orthotopic xenograft model by solid transplantation of human GBM into the brains of nude mice. The orthotopic xenografts shared the same histopathological features as their original human GBMs, were highly invasive, and retained the overexpression of the EGFR gene. The murine orthotopic GBM models constitute a valuable in vivo system for preclinical studies to test novel therapies for human GBM.

    Computational modelling of cancerous mutations in the EGFR/ERK signalling pathway

    This article has been made available through the Brunel Open Access Publishing Fund - Copyright @ 2009 Orton et al. BACKGROUND: The Epidermal Growth Factor Receptor (EGFR) activated Extracellular-signal Regulated Kinase (ERK) pathway is a critical cell signalling pathway that relays the signal for a cell to proliferate from the plasma membrane to the nucleus. Deregulation of the EGFR/ERK pathway due to alterations affecting the expression or function of a number of pathway components has long been associated with numerous forms of cancer. Under normal conditions, Epidermal Growth Factor (EGF) stimulates a rapid but transient activation of ERK as the signal is rapidly shut down. Under cancerous mutation conditions, by contrast, the ERK signal cannot be shut down and is sustained, resulting in the constitutive activation of ERK and continual cell proliferation. In this study, we have used computational modelling techniques to investigate what effects various cancerous alterations have on the signalling flow through the ERK pathway. RESULTS: We have generated a new model of the EGFR activated ERK pathway, which was verified by our own experimental data. We then altered our model to represent various cancerous situations such as Ras, B-Raf and EGFR mutations, as well as EGFR overexpression. Analysis of the models showed that different cancerous situations resulted in different signalling patterns through the ERK pathway, especially when compared to the normal EGF signal pattern. Our model predicts that cancerous EGFR mutation and overexpression signals almost exclusively via the Rap1 pathway, predicting that this pathway is the best target for drugs. Furthermore, our model also highlights the importance of receptor degradation in normal and cancerous EGFR signalling, and suggests that receptor degradation is a key difference between the signalling from the EGF and Nerve Growth Factor (NGF) receptors. CONCLUSION: Our results suggest that different routes to ERK activation are being utilised in different cancerous situations, which therefore has interesting implications for drug selection strategies. We also conducted a comparison of the critical differences between signalling from different growth factor receptors (namely EGFR, mutated EGFR, NGF, and Insulin), with our results suggesting that the differences between the systems are large scale and can be attributed to the presence/absence of entire pathways rather than subtle differences in individual rate constants between the systems. This work was funded by the Department of Trade and Industry (DTI), under their Bioscience Beacon project programme. AG was funded by an industrial PhD studentship from Scottish Enterprise and Cyclacel.

    Bayesian DNA copy number analysis

    Background: Some diseases, like tumors, can be related to chromosomal aberrations, leading to changes of DNA copy number. The copy number of an aberrant genome can be represented as a piecewise constant function, since it can exhibit regions of deletions or gains. In a healthy cell, by contrast, the copy number is two because we inherit one copy of each chromosome from each of our parents. Bayesian Piecewise Constant Regression (BPCR) is a Bayesian regression method for data that are noisy observations of a piecewise constant function. The method estimates the unknown segment number, the endpoints of the segments and the value of the segment levels of the underlying piecewise constant function. The Bayesian Regression Curve (BRC) estimates the same data with a smoothing curve. However, in the original formulation, some estimators failed to properly determine the corresponding parameters. For example, the boundary estimator did not take into account the dependency among the boundaries and could estimate more than one breakpoint at the same position, thereby losing segments. Results: We derived an improved version of the BPCR (called mBPCR) and BRC, changing the segment number estimator and the boundary estimator to enhance the fitting procedure. We also proposed an alternative estimator of the variance of the segment levels, which is useful in case of data with high noise. Using artificial data, we compared the original and the modified version of BPCR and BRC with other regression methods, showing that our improved version of BPCR generally outperformed all the others. Similar results were also observed on real data. Conclusion: We propose an improved method for DNA copy number estimation, mBPCR, which performed very well compared to previously published algorithms. In particular, mBPCR was more powerful in the detection of the true position of the breakpoints and of small aberrations in very noisy data. Hence, from a biological point of view, our method can be very useful, for example, to find targets of genomic aberrations in clinical cancer samples.

    Roy-Steiner equations for pion-nucleon scattering

    Starting from hyperbolic dispersion relations, we derive a closed system of Roy-Steiner equations for pion-nucleon scattering that respects analyticity, unitarity, and crossing symmetry. We work out analytically all kernel functions and unitarity relations required for the lowest partial waves. In order to suppress the dependence on the high-energy regime we also consider once- and twice-subtracted versions of the equations, where we identify the subtraction constants with subthreshold parameters. Assuming Mandelstam analyticity we determine the maximal range of validity of these equations. As a first step towards the solution of the full system we cast the equations for the $\pi\pi\to\bar NN$ partial waves into the form of a Muskhelishvili-Omnès problem with finite matching point, which we solve numerically in the single-channel approximation. We investigate in detail the role of individual contributions to our solutions and discuss some consequences for the spectral functions of the nucleon electromagnetic form factors. Comment: 106 pages, 18 figures; version published in JHEP.

    IL-4-secreting CD4+ T cells are crucial to the development of CD8+ T-cell responses against malaria liver stages.

    CD4+ T cells are crucial to the development of CD8+ T cell responses against hepatocytes infected with malaria parasites. In the absence of CD4+ T cells, CD8+ T cells initiate a seemingly normal differentiation and proliferation during the first few days after immunization. However, this response fails to develop further and is reduced by more than 90%, compared to that observed in the presence of CD4+ T cells. We report here that interleukin-4 (IL-4) secreted by CD4+ T cells is essential to the full development of this CD8+ T cell response. This is the first demonstration that IL-4 is a mediator of CD4/CD8 cross-talk leading to the development of immunity against an infectious pathogen.