Quantitative transcription factor binding kinetics at the single-molecule level
We have investigated the binding interaction between the bacteriophage lambda
repressor CI and its target DNA using total internal reflection fluorescence
microscopy. Large, step-wise changes in the intensity of the red fluorescent
protein fused to CI were observed as it associated with and dissociated from
individually labeled single-molecule DNA targets. The stochastic association
and dissociation were characterized by Poisson statistics. Dark and bright
intervals were measured for thousands of individual events. The exponential
distribution of the intervals allowed direct determination of the association
and dissociation rate constants, ka and kd respectively. We resolved in detail
how ka and kd varied as a function of three control parameters: the DNA length L,
the CI dimer concentration, and the binding affinity. Our results show that
although interactions with non-operator DNA sequences are observable, CI binding
to the operator site does not depend on the length of flanking non-operator
DNA. Comment: 34 pages, 10 figures, accepted by Biophysical Journal
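The step from exponentially distributed intervals to rate constants can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. It assumes simple pseudo-first-order kinetics, in which bright (bound) intervals have rate kd and dark (unbound) intervals have rate ka times the CI dimer concentration, and it uses simulated intervals. The numerical values and the helper names `estimate_rate` and `simulate_intervals` are hypothetical.

```python
import random


def estimate_rate(intervals):
    """Maximum-likelihood rate of an exponential distribution: 1 / mean interval."""
    if not intervals:
        raise ValueError("need at least one interval")
    return len(intervals) / sum(intervals)


def simulate_intervals(rate, n, seed=0):
    """Draw n exponentially distributed waiting times (stand-in for measured data)."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n)]


# Hypothetical ground-truth values, chosen only for illustration.
K_D_TRUE = 0.5      # dissociation rate constant, 1/s
K_A_TRUE = 2.0e7    # association rate constant, 1/(M s)
CONC = 1.0e-8       # CI dimer concentration, M

# Bright intervals (protein bound) end by dissociation: rate = kd.
bright = simulate_intervals(K_D_TRUE, 5000, seed=1)
# Dark intervals (site empty) end by association: rate = ka * concentration (assumed).
dark = simulate_intervals(K_A_TRUE * CONC, 5000, seed=2)

k_d = estimate_rate(bright)
k_a = estimate_rate(dark) / CONC

print(f"estimated kd = {k_d:.3f} 1/s")
print(f"estimated ka = {k_a:.3e} 1/(M s)")
```

With real data, the simulated lists would be replaced by the measured bright and dark interval durations.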
Unconventional machine learning of genome-wide human cancer data
Recent advances in high-throughput genomic technologies coupled with
exponential increases in computer processing and memory have allowed us to
interrogate the complex aberrant molecular underpinnings of human disease from
a genome-wide perspective. While the deluge of genomic information is expected
to increase, a bottleneck in conventional high-performance computing is rapidly
approaching. Inspired in part by recent advances in physical quantum
processors, we evaluated several unconventional machine learning (ML)
strategies on actual human tumor data. Here we show for the first time the
efficacy of multiple annealing-based ML algorithms for classification of
high-dimensional, multi-omics human cancer data from the Cancer Genome Atlas.
To assess algorithm performance, we compared these classifiers to a variety of
standard ML methods. Our results indicate the feasibility of using
annealing-based ML to provide competitive classification of human cancer types
and associated molecular subtypes and superior performance with smaller
training datasets, thus providing compelling empirical evidence for the
potential future application of unconventional computing architectures in the
biomedical sciences.
From the Quantum Approximate Optimization Algorithm to a Quantum Alternating Operator Ansatz
The next few years will be exciting as prototype universal quantum processors
emerge, enabling implementation of a wider variety of algorithms. Of particular
interest are quantum heuristics, which require experimentation on quantum
hardware for their evaluation, and which have the potential to significantly
expand the breadth of quantum computing applications. A leading candidate is
Farhi et al.'s Quantum Approximate Optimization Algorithm, which alternates
between applying a cost-function-based Hamiltonian and a mixing Hamiltonian.
Here, we extend this framework to allow alternation between more general
families of operators. The essence of this extension, the Quantum Alternating
Operator Ansatz, is the consideration of general parametrized families of
unitaries rather than only those corresponding to the time-evolution under a
fixed local Hamiltonian for a time specified by the parameter. This ansatz
supports the representation of a larger, and potentially more useful, set of
states than the original formulation, with potential long-term impact on a
broad array of application areas. For cases that call for mixing only within a
desired subspace, refocusing on unitaries rather than Hamiltonians enables more
efficiently implementable mixers than was possible in the original framework.
Such mixers are particularly useful for optimization problems with hard
constraints that must always be satisfied, defining a feasible subspace, and
soft constraints whose violation we wish to minimize. More efficient
implementation enables earlier experimental exploration of an alternating
operator approach to a wide variety of approximate optimization, exact
optimization, and sampling problems. Here, we introduce the Quantum Alternating
Operator Ansatz, lay out design criteria for mixing operators, detail mappings
for eight problems, and provide brief descriptions of mappings for diverse
problems. Comment: 51 pages, 2 figures. Revised to match journal paper
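The alternation described in this abstract can be sketched as a small state-vector simulation in Python with NumPy. This is an illustrative sketch, not code from the paper. It uses MaxCut as the cost function and the standard transverse-field mixer exp(-i beta sum_j X_j) of the original algorithm; the `mixer` argument is a hypothetical hook showing where a more general parametrized unitary could be substituted. All function names are assumptions.

```python
import numpy as np


def cut_values(n, edges):
    """Cost of every n-bit string: number of edges whose endpoints differ."""
    z = np.arange(2 ** n)
    bits = (z[:, None] >> np.arange(n)) & 1  # bits[z, i] = bit i of z
    return sum((bits[:, i] != bits[:, j]).astype(float) for i, j in edges)


def apply_x_mixer(state, n, beta):
    """Apply exp(-i * beta * X) to every qubit (the same gate on each axis)."""
    rx = np.array([[np.cos(beta), -1j * np.sin(beta)],
                   [-1j * np.sin(beta), np.cos(beta)]])
    psi = state.reshape([2] * n)
    for q in range(n):
        psi = np.tensordot(rx, psi, axes=([1], [q]))
        psi = np.moveaxis(psi, 0, q)  # restore the original axis order
    return psi.reshape(-1)


def qaoa_expectation(n, edges, gammas, betas, mixer=apply_x_mixer):
    """Expected cut value after alternating phase-separation and mixing layers."""
    cost = cut_values(n, edges)
    state = np.full(2 ** n, 1 / np.sqrt(2 ** n), dtype=complex)  # uniform start
    for gamma, beta in zip(gammas, betas):
        state = np.exp(-1j * gamma * cost) * state  # cost-based phase (diagonal)
        state = mixer(state, n, beta)               # mixing unitary
    return float(np.real(np.vdot(state, cost * state)))


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    # Coarse grid search over one layer of parameters.
    best = max(
        qaoa_expectation(3, triangle, [g], [b])
        for g in np.linspace(0, 2 * np.pi, 33)
        for b in np.linspace(0, np.pi, 33)
    )
    print("best single-layer expected cut on a triangle:", round(best, 4))
```

Passing a different function as `mixer`, for example one that only moves amplitude within a feasible subspace, is the kind of substitution the generalized ansatz allows; no such constrained mixer is implemented here.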
K-Nearest-Neighbors Induced Topological PCA for scRNA Sequence Data Analysis
Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity
in cells, which has given us insights into cell-cell communication, cell
differentiation, and differential gene expression. However, analyzing scRNA-seq
data is a challenge due to sparsity and the large number of genes involved.
Therefore, dimensionality reduction and feature selection are important for
removing spurious signals and enhancing downstream analysis. Traditional PCA, a
main workhorse in dimensionality reduction, lacks the ability to capture
geometrical structure information embedded in the data, and previous graph
Laplacian regularizations are limited by the analysis of only a single scale.
We propose a topological Principal Components Analysis (tPCA) method by the
combination of persistent Laplacian (PL) technique and L_{2,1} norm
regularization to address multiscale and multiclass heterogeneity issues in
data. We further introduce a k-Nearest-Neighbor (kNN) persistent Laplacian
technique to improve the robustness of our persistent Laplacian method. The
proposed kNN-PL is a new algebraic topology technique which addresses the many
limitations of the traditional persistent homology. Rather than inducing
filtration by varying a distance threshold, we introduce kNN-tPCA,
where filtrations are achieved by varying the number of neighbors in a kNN
network at each step, and find that this framework has significant implications
for hyper-parameter tuning. We validate the efficacy of our proposed tPCA and
kNN-tPCA methods on 11 diverse benchmark scRNA-seq datasets, and show that
our methods outperform other unsupervised PCA enhancements from the literature,
as well as popular Uniform Manifold Approximation and Projection (UMAP),
t-Distributed Stochastic Neighbor Embedding (tSNE), and Projection Non-Negative
Matrix Factorization (NMF) by significant margins. Comment: 28 pages, 11 figures
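The idea of a filtration indexed by the number of neighbors, rather than by a distance threshold, can be sketched in Python with NumPy. This is a simplified illustration, not the authors' kNN-PL or tPCA method: it builds only the ordinary graph Laplacian of a symmetrized kNN graph for each k, and counts connected components from its zero eigenvalues. The function names and the toy point cloud are assumptions.

```python
import numpy as np


def knn_laplacian(points, k):
    """Graph Laplacian L = D - A of the symmetrized k-nearest-neighbor graph."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nbrs = np.argsort(dist, axis=1, kind="stable")[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)  # keep an edge if either endpoint selects it
    return np.diag(adj.sum(axis=1)) - adj


def knn_filtration(points, max_k):
    """Nested sequence of Laplacians for k = 1, ..., max_k."""
    return [knn_laplacian(points, k) for k in range(1, max_k + 1)]


def count_components(laplacian, tol=1e-8):
    """Number of connected components = multiplicity of eigenvalue zero."""
    return int(np.sum(np.linalg.eigvalsh(laplacian) < tol))


if __name__ == "__main__":
    # Toy data: two well-separated clusters of three points each.
    pts = np.array([[0, 0], [0, 1], [1, 0],
                    [10, 10], [10, 11], [11, 10]], dtype=float)
    for k, lap in enumerate(knn_filtration(pts, 4), start=1):
        print(f"k={k}: components = {count_components(lap)}")
```

Because the k nearest neighbors of a point are always among its k+1 nearest neighbors, the edge sets grow monotonically with k, which is what makes the sequence a filtration.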
Understanding Algorithm Performance on an Oversubscribed Scheduling Application
The best performing algorithms for a particular oversubscribed scheduling
application, Air Force Satellite Control Network (AFSCN) scheduling, appear to
have little in common. Yet, through careful experimentation and modeling of
performance in real problem instances, we can relate characteristics of the
best algorithms to characteristics of the application. In particular, we find
that plateaus dominate the search spaces (thus favoring algorithms that make
larger changes to solutions) and that some randomization in exploration is
critical to good performance (due to the lack of gradient information on the
plateaus). Based on our explanations of algorithm performance, we develop a new
algorithm that combines characteristics of the best performers; the new
algorithm's performance is better than the previous best. We show how
hypothesis-driven experimentation and search modeling can both explain algorithm
performance and motivate the design of a new algorithm.