Higher-order chromatin domains link eQTLs with the expression of far-away genes.
Distal expression quantitative trait loci (distal eQTLs) are genetic variants that affect the expression of genes located far away in the genome. However, the mechanisms by which a distal eQTL modulates gene expression are not yet clear. Recent high-resolution chromosome conformation capture experiments, along with a growing database of eQTLs, provide an opportunity to understand the spatial mechanisms influencing distal eQTL associations on a genome-wide scale. We test the hypothesis that spatial proximity contributes to eQTL-gene regulation in the context of the higher-order domain structure of chromatin as determined from recent Hi-C chromosome conformation experiments. This analysis suggests that the large-scale topology of chromatin is coupled with eQTL associations, by providing evidence that eQTLs are in general spatially close to their target genes, often occur around topological domain boundaries, and preferentially associate with genes across domains. We also find that within-domain eQTLs that overlap with regulatory elements such as promoters and enhancers are spatially closer than the overall set of within-domain eQTLs, suggesting that spatial proximity derived from the domain structure of chromatin plays an important role in the regulation of gene expression.
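The within-domain versus cross-domain distinction used in this analysis can be sketched with toy coordinates (a minimal illustration only; the domain boundaries, positions, and function names below are invented, not data or code from the study):

```python
# Hypothetical sketch: classify eQTL-gene pairs as within-domain or
# cross-domain given topological domain (TAD) start coordinates from Hi-C.
# All coordinates below are illustrative, not real annotations.
import bisect

def domain_index(boundaries, pos):
    """Index of the domain containing pos; boundaries is a sorted
    list of domain start coordinates on one chromosome."""
    return bisect.bisect_right(boundaries, pos) - 1

def classify_pairs(boundaries, pairs):
    """pairs: list of (eqtl_pos, gene_pos) on the same chromosome."""
    labels = []
    for eqtl, gene in pairs:
        same = domain_index(boundaries, eqtl) == domain_index(boundaries, gene)
        labels.append("within-domain" if same else "cross-domain")
    return labels

boundaries = [0, 500_000, 1_200_000, 2_000_000]   # toy TAD starts (bp)
pairs = [(100_000, 450_000), (400_000, 900_000)]  # toy eQTL-gene pairs
print(classify_pairs(boundaries, pairs))
# ['within-domain', 'cross-domain']
```

The first pair falls inside a single domain; the second spans a domain boundary and so would count toward the cross-domain associations discussed above.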
Isoform-level Ribosome Occupancy Estimation Guided by Transcript Abundance with Ribomap
Ribosome profiling is a recently developed high-throughput sequencing technique that captures approximately 30 bp long ribosome-protected mRNA fragments during translation. Because of alternative splicing and repetitive sequences, a ribosome-protected read may map to many places in the transcriptome, leading to discarded or arbitrary mappings when standard approaches are used. We present a technique and software that addresses this problem by assigning reads to potential origins proportional to estimated transcript abundance. This yields a more accurate estimate of ribosome profiles compared with a naïve mapping. Ribomap is available as open source at http://www.cs.cmu.edu/∼ckingsf/software/ribomap.
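The proportional-assignment idea can be sketched as follows (a minimal illustration of the general approach, not Ribomap's actual algorithm or code; all transcript names and abundance values are hypothetical):

```python
# Illustrative sketch of abundance-guided read allocation: a multi-mapped
# ribosome-protected read is split across its candidate transcripts in
# proportion to their estimated abundances, instead of being discarded
# or assigned arbitrarily.
def allocate_read(candidates, abundance):
    """candidates: transcript ids the read maps to;
    abundance: dict of transcript id -> estimated abundance (e.g. TPM)."""
    total = sum(abundance[t] for t in candidates)
    if total == 0:
        # no abundance information: fall back to a uniform split
        return {t: 1.0 / len(candidates) for t in candidates}
    return {t: abundance[t] / total for t in candidates}

abundance = {"tx1": 30.0, "tx2": 10.0, "tx3": 0.0}  # toy estimates
print(allocate_read(["tx1", "tx2"], abundance))
# {'tx1': 0.75, 'tx2': 0.25}
```

Summing these fractional assignments over all reads yields a per-transcript ribosome occupancy profile rather than a profile distorted by arbitrary tie-breaking.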
Creating Vulnerability Signatures Using Weakest Preconditions
Signature-based tools such as network intrusion detection systems are widely used to protect critical systems. Automatic signature generation techniques are needed to enable these tools because of the speed at which new vulnerabilities are discovered. In particular, we need automatic techniques that generate sound signatures, i.e., signatures that will not mistakenly block legitimate traffic or raise false alarms. In addition, we need signatures with few false negatives, so that they catch many different exploit variants. We investigate new techniques, based on program binary analysis, for automatically generating sound vulnerability signatures with fewer false negatives than previous work. The key to reducing false negatives is to consider as many of the different program paths an exploit may take as possible. Previous work considered each possible program path separately, thus generating signatures that are exponential in the number of branches considered. In the same setting, we show how to reduce the overall signature size and the generation time from exponential to polynomial, without requiring any additional assumptions or relaxing any properties. This efficiency gain allows us to consider many more program paths, which reduces the false negatives of the generated signatures. We achieve these results by creating algorithms for generating vulnerability signatures that are based on computing weakest preconditions (WP). The weakest precondition for a program path to a vulnerability is a formula that matches all exploits that may trigger the vulnerability along that path.
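The path-wise WP computation can be illustrated on a toy straight-line program (a sketch of the standard WP rule for assignments and sequencing, not the paper's binary-level system; the program, variable names, and vulnerability condition are invented):

```python
# Minimal weakest-precondition (WP) sketch: states are dicts, predicates
# are functions on states. WP(var := expr, Q) holds in state s exactly
# when Q holds after executing the assignment in s.
def wp_assign(var, expr, post):
    """WP of a single assignment."""
    return lambda s: post({**s, var: expr(s)})

def wp_seq(stmts, post):
    """WP of a straight-line path: fold the assignments right-to-left."""
    for var, expr in reversed(stmts):
        post = wp_assign(var, expr, post)
    return post

# Path:  y := x + 1;  z := y * 2   with vulnerability condition z > 10
path = [("y", lambda s: s["x"] + 1),
        ("z", lambda s: s["y"] * 2)]
vuln = lambda s: s["z"] > 10        # postcondition: vulnerability triggered
pre = wp_seq(path, vuln)            # pre(s) iff (x + 1) * 2 > 10, i.e. x > 4

print(pre({"x": 5}), pre({"x": 4}))
# True False
```

Here `pre` plays the role of a path signature: it matches exactly those inputs (values of `x`) that drive this path into the vulnerability condition.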
We have implemented our techniques and generated signatures for several binary programs. Our results demonstrate that our WP-based algorithm generates more succinct signatures than previous approaches, which were based on forward symbolic execution.
Towards Automatic Generation of Vulnerability-Based Signatures
In this paper we explore the problem of creating vulnerability signatures. A vulnerability signature matches all exploits of a given vulnerability, even polymorphic or metamorphic variants. Our work departs from previous approaches by focusing on the semantics of the program and the vulnerability exercised by a sample exploit, instead of the semantics or syntax of the exploit itself. We show that the semantics of a vulnerability define a language that contains all and only those inputs that exploit the vulnerability. A vulnerability signature is a representation (e.g., a regular expression) of the vulnerability language. Unlike exploit-based signatures, whose error rate can only be empirically measured for known test cases, the quality of a vulnerability signature can be formally quantified for all possible inputs.
We provide a formal definition of a vulnerability signature and investigate the computational complexity of creating and matching vulnerability signatures. We also systematically explore the design space of vulnerability signatures. We identify three central issues in vulnerability-signature creation: how a vulnerability signature represents the set of inputs that may exercise a vulnerability, the vulnerability coverage (i.e., the number of vulnerable program paths) that is subject to our analysis during signature creation, and how a vulnerability signature is then created for a given representation and coverage.
We propose new data-flow analysis and novel adoption of existing techniques, such as constraint solving, for automatically generating vulnerability signatures. We have built a prototype system to test our techniques. Our experiments show that, from a single exploit, we can automatically generate a vulnerability signature of much higher quality than previous exploit-based signatures. In addition, our techniques have several other security applications, and thus may be of independent interest.
Resolving spatial inconsistencies in chromosome conformation measurements.
<p>BACKGROUND: Chromosome structure is closely related to chromosome function, and Chromosome Conformation Capture (3C) is a widely used technique for exploring the spatial properties of chromosomes. 3C interaction frequencies are usually associated with spatial distances. However, the raw data from 3C experiments aggregate interactions from many cells, and the spatial distance underlying any given interaction is uncertain.</p>
<p>RESULTS: We introduce a new method for filtering 3C interactions that selects subsets of interactions that obey metric constraints of various strictness. We demonstrate that, although the problem is computationally hard, near-optimal results are often attainable in practice using well-designed heuristics and approximation algorithms. Further, we show that, compared with a standard technique, this metric filtering approach leads to (a) subgraphs with higher statistical significance, (b) lower embedding error, (c) lower sensitivity to initial conditions of the embedding algorithm, and (d) structures with better agreement with light microscopy measurements. Our filtering scheme is applicable for a strict frequency-to-distance mapping and a more relaxed mapping from frequency to a range of distances.</p>
<p>CONCLUSIONS: Our filtering method for 3C data considers both metric consistency and statistical confidence simultaneously, resulting in lower-error embeddings that are biologically more plausible.</p>
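The metric-consistency idea can be sketched with a brute-force toy check (the paper tackles a computationally hard general problem with heuristics and approximation algorithms; the inverse frequency-to-distance mapping and the data here are purely illustrative):

```python
# Toy metric filter for 3C-style data: map interaction frequencies to
# distances via d = 1/f and flag any interaction whose implied distance
# violates the triangle inequality within some triangle of loci.
from itertools import combinations

def violating_edges(freqs):
    """freqs: dict mapping ordered locus pairs (i, j) to frequency."""
    dist = {e: 1.0 / f for e, f in freqs.items()}
    nodes = sorted({n for e in freqs for n in e})
    bad = set()
    for a, b, c in combinations(nodes, 3):
        edges = [(a, b), (a, c), (b, c)]
        if not all(e in dist for e in edges):
            continue  # triangle not fully measured
        for e in edges:
            others = [x for x in edges if x != e]
            if dist[e] > dist[others[0]] + dist[others[1]] + 1e-9:
                bad.add(e)  # edge too long to be metrically consistent
    return bad

freqs = {(1, 2): 1.0, (2, 3): 1.0, (1, 3): 0.2}  # (1,3) implies distance 5
print(violating_edges(freqs))
# {(1, 3)}
```

Dropping (or down-weighting) such inconsistent interactions before embedding is the intuition behind the filtering step; the strict mapping used here generalizes to the paper's relaxed frequency-to-distance-range setting.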