49,201 research outputs found
Peak Alignment of Gas Chromatography-Mass Spectrometry Data with Deep Learning
We present ChromAlignNet, a deep learning model for alignment of peaks in Gas
Chromatography-Mass Spectrometry (GC-MS) data. In GC-MS data, a compound's
retention time (RT) may not stay fixed across multiple chromatograms. To use
GC-MS data for biomarker discovery requires alignment of identical analyte's RT
from different samples. Current methods of alignment are all based on a set of
formal, mathematical rules. We present a solution to GC-MS alignment using deep
learning neural networks, which are more adept at complex, fuzzy data sets. We
tested our model on several GC-MS data sets of various complexities and
analysed the alignment results quantitatively. We show the model has very good
performance (AUC for simple data sets and AUC for very
complex data sets). Further, our model easily outperforms existing algorithms
on complex data sets. Compared with existing methods, ChromAlignNet is very
easy to use as it requires no user input of reference chromatograms and
parameters. This method can easily be adapted to other similar data such as
those from liquid chromatography. The source code is written in Python and
available online
Learning Reputation in an Authorship Network
The problem of searching for experts in a given academic field is hugely
important in both industry and academia. We study exactly this issue with
respect to a database of authors and their publications. The idea is to use
Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA) to perform
topic modelling in order to find authors who have worked in a query field. We
then construct a coauthorship graph and motivate the use of influence
maximisation and a variety of graph centrality measures to obtain a ranked list
of experts. The ranked lists are further improved using a Markov Chain-based
rank aggregation approach. The complete method is readily scalable to large
datasets. To demonstrate the efficacy of the approach we report on an extensive
set of computational simulations using the Arnetminer dataset. An improvement
in mean average precision is demonstrated over the baseline case of simply
using the order of authors found by the topic models
Overview of Polkadot and its Design Considerations
In this paper we describe the design components of the heterogenous
multi-chain protocol Polkadot and explain how these components help Polkadot
address some of the existing shortcomings of blockchain technologies. At
present, a vast number of blockchain projects have been introduced and employed
with various features that are not necessarily designed to work with each
other. This makes it difficult for users to utilise a large number of
applications on different blockchain projects. Moreover, with the increase in
number of projects the security that each one is providing individually becomes
weaker. Polkadot aims to provide a scalable and interoperable framework for
multiple chains with pooled security that is achieved by the collection of
components described in this paper
Scaling Distributed Ledgers and Privacy-Preserving Applications
This thesis proposes techniques aiming to make blockchain technologies and smart contract platforms practical by improving their scalability, latency, and privacy. This thesis starts by presenting the design and implementation of Chainspace, a distributed ledger that supports user defined smart contracts and execute user-supplied transactions on their objects. The correct execution of smart contract transactions is publicly verifiable. Chainspace is scalable by sharding state; it is secure against subsets of nodes trying to compromise its integrity or availability properties through Byzantine Fault Tolerance (BFT). This thesis also introduces a family of replay attacks against sharded distributed ledgers targeting cross-shard consensus protocols; they allow an attacker, with network access only, to double-spend resources with minimal efforts. We then build Byzcuit, a new cross-shard consensus protocol that is immune to those attacks and that is tailored to run at the heart of Chainspace. Next, we propose FastPay, a high-integrity settlement system for pre-funded payments that can be used as a financial side-infrastructure for Chainspace to support low-latency retail payments. This settlement system is based on Byzantine Consistent Broadcast as its core primitive, foregoing the expenses of full atomic commit channels (consensus). The resulting system has extremely low-latency for both confirmation and payment finality. Finally, this thesis proposes Coconut, a selective disclosure credential scheme supporting distributed threshold issuance, public and private attributes, re-randomization, and multiple unlinkable selective attribute revelations. It ensures authenticity and availability even when a subset of credential issuing authorities are malicious or offline, and natively integrates with Chainspace to enable a number of scalable privacy-preserving applications
- …