3,370 research outputs found

    BATCH-GE : batch analysis of next-generation sequencing data for genome editing assessment

    Get PDF
    Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in-vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from every organism with a sequenced genome

    Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm

    Full text link
    Twitter is a popular social network platform where users can interact and post texts of up to 280 characters called tweets. Hashtags, hyperlinked words in tweets, have increasingly become crucial for tweet retrieval and search. Using hashtags for tweet topic classification is a challenging problem because of context dependent among words, slangs, abbreviation and emoticons in a short tweet along with evolving use of hashtags. Since Twitter generates millions of tweets daily, tweet analytics is a fundamental problem of Big data stream that often requires a real-time Distributed processing. This paper proposes a distributed online approach to tweet topic classification with hashtags. Being implemented on Apache Storm, a distributed real time framework, our approach incrementally identifies and updates a set of strong predictors in the Na\"ive Bayes model for classifying each incoming tweet instance. Preliminary experiments show promising results with up to 97% accuracy and 37% increase in throughput on eight processors.Comment: IEEE International Conference on Big Data 201

    FindFoci: a focus detection algorithm with automated parameter training that closely matches human assignments, reduces human inconsistencies and increases speed of analysis

    Get PDF
    Accurate and reproducible quantification of the accumulation of proteins into foci in cells is essential for data interpretation and for biological inferences. To improve reproducibility, much emphasis has been placed on the preparation of samples, but less attention has been given to reporting and standardizing the quantification of foci. The current standard to quantitate foci in open-source software is to manually determine a range of parameters based on the outcome of one or a few representative images and then apply the parameter combination to the analysis of a larger dataset. Here, we demonstrate the power and utility of using machine learning to train a new algorithm (FindFoci) to determine optimal parameters. FindFoci closely matches human assignments and allows rapid automated exploration of parameter space. Thus, individuals can train the algorithm to mirror their own assignments and then automate focus counting using the same parameters across a large number of images. Using the training algorithm to match human assignments of foci, we demonstrate that applying an optimal parameter combination from a single image is not broadly applicable to analysis of other images scored by the same experimenter or by other experimenters. Our analysis thus reveals wide variation in human assignment of foci and their quantification. To overcome this, we developed training on multiple images, which reduces the inconsistency of using a single or a few images to set parameters for focus detection. FindFoci is provided as an open-source plugin for ImageJ

    Comparison of panel codes for aerodynamic analysis of airfoils

    Get PDF
    Cieľom tejto práce bolo vytvorenie prehľadu v súčasnosti používaných implementácií panelových metód pre aerodynamické výpočty charakteristík 2D profilov. Základný popis princípu panelovej metódy, porovnanie jednotlivých implementácií a zhodnotenie ich možností (presnosť, aplikovateľnosť) na typické úlohy. V práci boli použité tri rôzne panelové programy: Xfoil, JavaFoil a XFLR5. Práca bola obohatená o meranie v aerodynamickom tuneli.The purpose of this study is to create an overview of currently the most used panel codes for computation of aerodynamic characteristics of 2D airfoils. Description of the basic principles of panel code, comparison of various implementation and evaluation (accuracy, applicability) for typical tasks. In this thesis there were used three different panel codes: Xfoil, JavaFoil and XFLR5. Thesis was enriched by measurement in wind tunnel.

    Incremental Principal Component Analysis Exact implementation and continuity corrections

    Full text link
    This paper describes some applications of an incremental implementation of the principal component analysis (PCA). The algorithm updates the transformation coefficients matrix on-line for each new sample, without the need to keep all the samples in memory. The algorithm is formally equivalent to the usual batch version, in the sense that given a sample set the transformation coefficients at the end of the process are the same. The implications of applying the PCA in real time are discussed with the help of data analysis examples. In particular we focus on the problem of the continuity of the PCs during an on-line analysis.Comment: accepted at http://www.icinco.org

    Job Interactivity Using a Steering Service in an Interactive Grid Analysis Environment

    Get PDF
    Grid computing has been dominated by the execution of batch jobs. Interactive data analysis is a new domain in the area of grid job execution. The Grid-Enabled Analysis Environment (GAE) attempts to address this in HEP grids by the use of a Steering Service. This service will provide physicists with the continuous feedback of their jobs and will provide them with the ability to control and steer the execution of their submitted jobs. It will enable them to move their jobs to different grid nodes when desired. The Steering Service will also act autonomously to make steering decisions on behalf of the user, attempting to optimize the execution of the job. This service will also ensure the optimal consumption of the Grid user's resource quota. The Steering Service will provide a web service interface defined by standard WSDL. In this paper we have discussed how the Steering Service will facilitate interactive remote analysis of data generated in Interactive Grid Analysis Environment
    • …
    corecore