49,201 research outputs found

    Peak Alignment of Gas Chromatography-Mass Spectrometry Data with Deep Learning

    Full text link
    We present ChromAlignNet, a deep learning model for alignment of peaks in Gas Chromatography-Mass Spectrometry (GC-MS) data. In GC-MS data, a compound's retention time (RT) may not stay fixed across multiple chromatograms. To use GC-MS data for biomarker discovery requires alignment of identical analyte's RT from different samples. Current methods of alignment are all based on a set of formal, mathematical rules. We present a solution to GC-MS alignment using deep learning neural networks, which are more adept at complex, fuzzy data sets. We tested our model on several GC-MS data sets of various complexities and analysed the alignment results quantitatively. We show the model has very good performance (AUC 1\sim 1 for simple data sets and AUC 0.85\sim 0.85 for very complex data sets). Further, our model easily outperforms existing algorithms on complex data sets. Compared with existing methods, ChromAlignNet is very easy to use as it requires no user input of reference chromatograms and parameters. This method can easily be adapted to other similar data such as those from liquid chromatography. The source code is written in Python and available online

    Proceedings of the 2nd Computer Science Student Workshop: Microsoft Istanbul, Turkey, April 9, 2011

    Get PDF

    Learning Reputation in an Authorship Network

    Full text link
    The problem of searching for experts in a given academic field is hugely important in both industry and academia. We study exactly this issue with respect to a database of authors and their publications. The idea is to use Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) to perform topic modelling in order to find authors who have worked in a query field. We then construct a coauthorship graph and motivate the use of influence maximisation and a variety of graph centrality measures to obtain a ranked list of experts. The ranked lists are further improved using a Markov Chain-based rank aggregation approach. The complete method is readily scalable to large datasets. To demonstrate the efficacy of the approach we report on an extensive set of computational simulations using the Arnetminer dataset. An improvement in mean average precision is demonstrated over the baseline case of simply using the order of authors found by the topic models

    Overview of Polkadot and its Design Considerations

    Get PDF
    In this paper we describe the design components of the heterogenous multi-chain protocol Polkadot and explain how these components help Polkadot address some of the existing shortcomings of blockchain technologies. At present, a vast number of blockchain projects have been introduced and employed with various features that are not necessarily designed to work with each other. This makes it difficult for users to utilise a large number of applications on different blockchain projects. Moreover, with the increase in number of projects the security that each one is providing individually becomes weaker. Polkadot aims to provide a scalable and interoperable framework for multiple chains with pooled security that is achieved by the collection of components described in this paper

    Scaling Distributed Ledgers and Privacy-Preserving Applications

    Get PDF
    This thesis proposes techniques aiming to make blockchain technologies and smart contract platforms practical by improving their scalability, latency, and privacy. This thesis starts by presenting the design and implementation of Chainspace, a distributed ledger that supports user defined smart contracts and execute user-supplied transactions on their objects. The correct execution of smart contract transactions is publicly verifiable. Chainspace is scalable by sharding state; it is secure against subsets of nodes trying to compromise its integrity or availability properties through Byzantine Fault Tolerance (BFT). This thesis also introduces a family of replay attacks against sharded distributed ledgers targeting cross-shard consensus protocols; they allow an attacker, with network access only, to double-spend resources with minimal efforts. We then build Byzcuit, a new cross-shard consensus protocol that is immune to those attacks and that is tailored to run at the heart of Chainspace. Next, we propose FastPay, a high-integrity settlement system for pre-funded payments that can be used as a financial side-infrastructure for Chainspace to support low-latency retail payments. This settlement system is based on Byzantine Consistent Broadcast as its core primitive, foregoing the expenses of full atomic commit channels (consensus). The resulting system has extremely low-latency for both confirmation and payment finality. Finally, this thesis proposes Coconut, a selective disclosure credential scheme supporting distributed threshold issuance, public and private attributes, re-randomization, and multiple unlinkable selective attribute revelations. It ensures authenticity and availability even when a subset of credential issuing authorities are malicious or offline, and natively integrates with Chainspace to enable a number of scalable privacy-preserving applications
    corecore