A Novel and Fast Approach for Population Structure Inference Using Kernel-PCA and Optimization (PSIKO)
Population structure is a confounding factor in Genome Wide Association Studies, increasing the rate of false positive associations. In order to correct for it, several model-based algorithms such as ADMIXTURE and STRUCTURE have been proposed. These tend to carry a considerable computational burden, limiting their applicability to large datasets, such as those produced by Next Generation Sequencing (NGS) techniques. To address this, non-model based approaches such as SNMF and EIGENSTRAT have been proposed, which scale better with larger data. Here we present a novel non-model based approach, PSIKO, which is based on a unique combination of linear kernel-PCA and least-squares optimization and allows for the inference of admixture coefficients, principal components, and number of founder populations of a dataset. PSIKO has been compared against existing leading methods on a variety of simulation scenarios, as well as on real biological data. We found that in addition to producing results of the same quality as other tested methods, PSIKO scales extremely well with dataset size, being considerably (up to 30 times) faster for longer sequences than even state-of-the-art methods such as SNMF. PSIKO and accompanying manual are freely available at https://www.uea.ac.uk/computing/psiko
Playing with Derivation Modes and Halting Conditions
In the area of P systems, besides the standard maximally parallel derivation
mode, many other derivation modes have been investigated, too. In this paper, many
variants of hierarchical P systems and tissue P systems using different derivation modes
are considered and the effects of using different derivation modes, especially the maximally
parallel derivation modes and the maximally parallel set derivation modes, on the
generative and accepting power are illustrated. Moreover, an overview on some control
mechanisms used for (tissue) P systems is given.
Furthermore, besides the standard total halting mode, we also consider different halting
conditions such as unconditional halting and partial halting and explain how the use
of different halting modes may considerably change the computing power of P systems
and tissue P systems
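To make the terms above concrete, here is a minimal sketch of one maximally parallel derivation step and a total-halting check for a single-membrane system with multiset rewriting rules. It is an illustration only, not the paper's formalism: the names `max_parallel_step` and `is_halted` are hypothetical, rules are assumed to have non-empty left-hand sides, and the nondeterministic choice among rule multisets is replaced by a greedy choice in list order, which still yields a non-extendable multiset of rules.

```python
from collections import Counter


def applicable(lhs, available):
    """True if the multiset `available` contains the multiset `lhs`."""
    return all(available[sym] >= n for sym, n in lhs.items())


def max_parallel_step(config, rules):
    """Apply one maximally parallel step (illustrative, greedy choice).

    `rules` is a list of (lhs, rhs) Counter pairs with non-empty lhs.
    Rules are tried in list order until none fits the remaining objects,
    so the applied multiset of rules cannot be extended further.
    """
    available = Counter(config)
    produced = Counter()
    applied = Counter()
    for idx, (lhs, rhs) in enumerate(rules):
        while applicable(lhs, available):
            available -= lhs      # consume the left-hand side
            produced += rhs       # products only become usable next step
            applied[idx] += 1
    return available + produced, applied


def is_halted(config, rules):
    """Total halting: no rule is applicable to the configuration."""
    return not any(applicable(lhs, config) for lhs, _ in rules)


rules = [(Counter("ab"), Counter("c")), (Counter("a"), Counter("bb"))]
config, applied = max_parallel_step(Counter({"a": 3, "b": 1}), rules)
print(config, applied, is_halted(config, rules))
```

Starting from three copies of `a` and one `b`, the first rule fires once and the second twice, leaving four copies of `b` and one `c`; no rule applies to that configuration, so the system has halted in the total-halting sense.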
The DBSCAN Clustering Algorithm on P Systems
We show how to implement the DBSCAN clustering algorithm (Density
Based Spatial Clustering of Applications with Noise) on membrane systems using evolution
rules with promoters and priorities
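For reference, the conventional sequential form of DBSCAN can be sketched as follows. This is not the membrane-system implementation the abstract describes; it is a plain-Python illustration of the algorithm being encoded, and the function names, the `NOISE` label, and the parameter names `eps` and `min_pts` are choices made here.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may later be claimed as a border point
            continue
        cluster += 1                   # points[i] is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)   # only core points expand the cluster
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # -> [0, 0, 0, 1, 1, 1, -1]
```

The two dense groups become clusters 0 and 1, and the isolated point is labelled as noise.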
Prediction of secondary structures for large RNA molecules
The prediction of correct secondary structures of large RNAs is one of the unsolved challenges of computational molecular biology. Among the major obstacles is the fact that accurate calculations scale as O(n⁴), so the computational requirements become prohibitive as the length increases. We present a new parallel multicore and scalable program called GTfold, which is one to two orders of magnitude faster than the de facto standard programs mfold and RNAfold for folding large RNA viral sequences and achieves comparable accuracy of prediction. We analyze the algorithm's concurrency and describe the parallelism for a shared memory environment such as a symmetric multiprocessor or multicore chip. We are seeing a paradigm shift to multicore chips and parallelism must be explicitly addressed to continue gaining performance with each new generation of systems.
We provide a rigorous proof of correctness of an optimized algorithm for internal loop calculations called the internal loop speedup algorithm (ILSA), which reduces the time complexity of internal loop computations from O(n⁴) to O(n³), and show that exact algorithms such as ILSA are executed with our method in an affordable amount of time. The proof gives insight into solving these kinds of combinatorial problems. We have documented detailed pseudocode of the algorithm for predicting minimum free energy secondary structures, which provides a base to implement future algorithmic improvements and an improved thermodynamic model in GTfold. GTfold is written in C/C++ and freely available as open source from our website. M.S. Committee Chair: Bader, David; Committee Co-Chair: Heitsch, Christine; Committee Member: Harvey, Stephen; Committee Member: Vuduc, Richar
Tree-formed Verification Data for Trusted Platforms
The establishment of trust relationships to a computing platform relies on
validation processes. Validation allows an external entity to build trust in
the expected behaviour of the platform based on provided evidence of the
platform's configuration. In a process like remote attestation, the 'trusted'
platform submits verification data created during a start up process. These
data consist of hardware-protected values of platform configuration registers,
containing nested measurement values, e.g., hash values, of loaded or started
components. Commonly, the register values are created in linear order by a
hardware-secured operation. Fine-grained diagnosis of components, based on the
linear order of verification data and associated measurement logs, is not
optimal. We propose a method to use tree-formed verification data to validate a
platform. Component measurement values represent leaves, and protected
registers represent roots of a hash tree. We describe the basic mechanism of
validating a platform using tree-formed measurement logs and root registers and
show a logarithmic speed-up for the search of faults. Secure creation of a
tree is possible using a limited number of hardware-protected registers and a
single protected operation. In this way, the security of tree-formed
verification data is maintained. Comment: 15 pages, 11 figures, v3: Reference added, v4: Revised, accepted for
publication in Computers and Securit
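The logarithmic fault search can be illustrated with an ordinary binary hash tree. The sketch below is an assumption-laden illustration, not the paper's mechanism: it ignores hardware-protected registers entirely, uses SHA-256, assumes the number of leaves is a power of two, and the names `build_tree` and `find_fault` are hypothetical. A validator holding a reference tree compares it with the reported tree from the root downwards, checking one node per level.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves):
    """Return the tree as a list of levels: levels[0] holds the leaf hashes,
    levels[-1] holds the single root. Assumes len(leaves) is a power of two."""
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def find_fault(reference, reported):
    """Return the index of the leftmost differing leaf, or None if the roots
    match. One comparison per level, so O(log n) comparisons for n leaves."""
    top = len(reference) - 1
    if reference[top][0] == reported[top][0]:
        return None
    index = 0
    for depth in range(top - 1, -1, -1):
        left = 2 * index
        # If the left child differs, the fault is below it; otherwise go right.
        index = left if reference[depth][left] != reported[depth][left] else left + 1
    return index


components = [b"bios", b"loader", b"kernel", b"initrd",
              b"app0", b"app1", b"app2", b"app3"]
tampered = list(components)
tampered[5] = b"app1-modified"

print(find_fault(build_tree(components), build_tree(tampered)))  # -> 5
```

With eight components the search needs one root comparison plus three node comparisons, instead of checking every entry of a linear measurement log.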
A hybrid keyboard-guitar interface using capacitive touch sensing and physical modeling
This paper was presented at the 9th Sound and Music Computing Conference, Copenhagen, Denmark. This paper presents a hybrid interface based on a touch-sensing keyboard which gives detailed expressive control over a physically-modeled guitar. Physical modeling allows realistic guitar synthesis incorporating many expressive dimensions commonly employed by guitarists, including pluck strength and location, plectrum type, hand damping and string bending. Often, when a physical model is used in performance, most control dimensions go unused when the interface fails to provide a way to intuitively control them. Techniques as foundational as strumming lack a natural analog on the MIDI keyboard, and few digital controllers provide the independent control of pitch, volume and timbre that even novice guitarists achieve. Our interface combines gestural aspects of keyboard and guitar playing. Most dimensions of guitar technique are controllable polyphonically, some of them continuously within each note. Mappings are evaluated in a user study of keyboardists and guitarists, and the results demonstrate its playability by performers of both instruments
Membrane Systems with Priority, Dissolution, Promoters and Inhibitors and Time Petri Nets
We continue the investigations on exploring the connection between membrane
systems and time Petri nets already commenced in [4] by extending membrane
systems with promoters/inhibitors, membrane dissolution and priority for rules compared
to the simple symbol-object membrane system. By constructing the simulating
Petri net, we retain one of the main characteristics of the Petri net model, namely, the
firings of the transitions can take place in any order: we do not impose any additional
stipulation on the transition sequences in order to obtain a Petri net model equivalent to
the general Turing machine. Instead, we substantially exploit the gain in computational
strength obtained by the introduction of the timing feature for Petri nets
(Tissue) P Systems with Anti-Membranes
The concept of a matter object being annihilated when meeting its corresponding
anti-matter object is taken over for membranes as objects and anti-membranes
as the corresponding annihilation counterpart in P systems. Natural numbers can be
represented by the corresponding number of membranes with a specific label. Computational
completeness in this setting can then be obtained using only elementary
membrane division rules, without using objects. A similar result can be obtained for tissue
P systems with cell division rules and cell / anti-cell annihilation rules. In both cases,
as derivation modes we may take the standard maximally parallel derivation modes as
well as any of the maximally parallel set derivation modes (non-extendable (multi)sets of
rules, (multi)sets with maximal number of rules, (multi)sets of rules affecting the maximal
number of objects)