Higher-order chromatin domains link eQTLs with the expression of far-away genes.
Distal expression quantitative trait loci (distal eQTLs) are genetic variants that affect the expression of genes located far away in the genome. However, the mechanisms by which a distal eQTL modulates gene expression are not yet clear. Recent high-resolution chromosome conformation capture experiments, along with a growing database of eQTLs, provide an opportunity to understand the spatial mechanisms influencing distal eQTL associations on a genome-wide scale. We test the hypothesis that spatial proximity contributes to eQTL-gene regulation in the context of the higher-order domain structure of chromatin as determined from recent Hi-C chromosome conformation experiments. This analysis suggests that the large-scale topology of chromatin is coupled with eQTL associations, by providing evidence that eQTLs are in general spatially close to their target genes, often occur around topological domain boundaries, and preferentially associate with genes across domains. We also find that within-domain eQTLs that overlap with regulatory elements such as promoters and enhancers are spatially closer than the overall set of within-domain eQTLs, suggesting that spatial proximity derived from the domain structure of chromatin plays an important role in the regulation of gene expression.
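The within-domain versus cross-domain distinction used in this analysis can be sketched with toy coordinates (a minimal illustration only; the domain boundaries, positions, and function names below are invented, not data or code from the study):

```python
# Hypothetical sketch: classify eQTL-gene pairs as within-domain or
# cross-domain given topological domain (TAD) start coordinates from Hi-C.
# All coordinates below are illustrative, not real annotations.
import bisect

def domain_index(boundaries, pos):
    """Index of the domain containing pos; boundaries is a sorted
    list of domain start coordinates on one chromosome."""
    return bisect.bisect_right(boundaries, pos) - 1

def classify_pairs(boundaries, pairs):
    """pairs: list of (eqtl_pos, gene_pos) on the same chromosome."""
    labels = []
    for eqtl, gene in pairs:
        same = domain_index(boundaries, eqtl) == domain_index(boundaries, gene)
        labels.append("within-domain" if same else "cross-domain")
    return labels

boundaries = [0, 500_000, 1_200_000, 2_000_000]   # toy TAD starts (bp)
pairs = [(100_000, 450_000), (400_000, 900_000)]  # toy eQTL-gene pairs
print(classify_pairs(boundaries, pairs))
# ['within-domain', 'cross-domain']
```

The first pair falls inside a single domain; the second spans a domain boundary and so would count toward the cross-domain associations discussed above.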
Isoform-level Ribosome Occupancy Estimation Guided by Transcript Abundance with Ribomap
Ribosome profiling is a recently developed high-throughput sequencing technique that captures approximately 30 bp long ribosome-protected mRNA fragments during translation. Because of alternative splicing and repetitive sequences, a ribosome-protected read may map to many places in the transcriptome, leading to discarded or arbitrary mappings when standard approaches are used. We present a technique and software that addresses this problem by assigning reads to potential origins proportional to estimated transcript abundance. This yields a more accurate estimate of ribosome profiles compared with a naïve mapping. Ribomap is available as open source at http://www.cs.cmu.edu/∼ckingsf/software/ribomap.
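The proportional-assignment idea can be sketched as follows (a minimal illustration of the general approach, not Ribomap's actual algorithm or code; all transcript names and abundance values are hypothetical):

```python
# Illustrative sketch of abundance-guided read allocation: a multi-mapped
# ribosome-protected read is split across its candidate transcripts in
# proportion to their estimated abundances, instead of being discarded
# or assigned arbitrarily.
def allocate_read(candidates, abundance):
    """candidates: transcript ids the read maps to;
    abundance: dict of transcript id -> estimated abundance (e.g. TPM)."""
    total = sum(abundance[t] for t in candidates)
    if total == 0:
        # no abundance information: fall back to a uniform split
        return {t: 1.0 / len(candidates) for t in candidates}
    return {t: abundance[t] / total for t in candidates}

abundance = {"tx1": 30.0, "tx2": 10.0, "tx3": 0.0}  # toy estimates
print(allocate_read(["tx1", "tx2"], abundance))
# {'tx1': 0.75, 'tx2': 0.25}
```

Summing these fractional assignments over all reads yields a per-transcript ribosome occupancy profile rather than a profile distorted by arbitrary tie-breaking.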
Creating Vulnerability Signatures Using Weakest Preconditions
Signature-based tools such as network intrusion detection systems are widely used to protect critical systems. Automatic signature generation techniques are needed to enable these tools because of the speed at which new vulnerabilities are discovered. In particular, we need automatic techniques that generate sound signatures, i.e., signatures that will not mistakenly block legitimate traffic or raise false alarms. In addition, we need signatures with few false negatives, so that they catch many different exploit variants. We investigate new techniques, based on program binary analysis, for automatically generating sound vulnerability signatures with fewer false negatives than previous work. The key to reducing false negatives is to consider as many of the different program paths an exploit may take as possible. Previous work considered each possible program path separately, thus generating signatures that are exponential in the number of branches considered. In the same setting, we show how to reduce the overall signature size and the generation time from exponential to polynomial, without requiring any additional assumptions or relaxing any properties. This efficiency gain allows us to consider many more program paths, which reduces the false negatives of the generated signatures. We achieve these results by creating algorithms for generating vulnerability signatures that are based on computing weakest preconditions (WP). The weakest precondition for a program path to a vulnerability is a formula that matches all exploits that may trigger the vulnerability along that path.
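The path-wise WP computation can be illustrated on a toy straight-line program (a sketch of the standard WP rule for assignments and sequencing, not the paper's binary-level system; the program, variable names, and vulnerability condition are invented):

```python
# Minimal weakest-precondition (WP) sketch: states are dicts, predicates
# are functions on states. WP(var := expr, Q) holds in state s exactly
# when Q holds after executing the assignment in s.
def wp_assign(var, expr, post):
    """WP of a single assignment."""
    return lambda s: post({**s, var: expr(s)})

def wp_seq(stmts, post):
    """WP of a straight-line path: fold the assignments right-to-left."""
    for var, expr in reversed(stmts):
        post = wp_assign(var, expr, post)
    return post

# Path:  y := x + 1;  z := y * 2   with vulnerability condition z > 10
path = [("y", lambda s: s["x"] + 1),
        ("z", lambda s: s["y"] * 2)]
vuln = lambda s: s["z"] > 10        # postcondition: vulnerability triggered
pre = wp_seq(path, vuln)            # pre(s) iff (x + 1) * 2 > 10, i.e. x > 4

print(pre({"x": 5}), pre({"x": 4}))
# True False
```

Here `pre` plays the role of a path signature: it matches exactly those inputs (values of `x`) that drive this path into the vulnerability condition.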
We have implemented our techniques and generated signatures for several binary programs. Our results demonstrate that our WP-based algorithm generates more succinct signatures than previous approaches, which were based on forward symbolic execution.
Towards Automatic Generation of Vulnerability-Based Signatures
In this paper we explore the problem of creating vulnerability signatures. A vulnerability signature matches all exploits of a given vulnerability, even polymorphic or metamorphic variants. Our work departs from previous approaches by focusing on the semantics of the program and the vulnerability exercised by a sample exploit, instead of the semantics or syntax of the exploit itself. We show that the semantics of a vulnerability define a language that contains all and only those inputs that exploit the vulnerability. A vulnerability signature is a representation (e.g., a regular expression) of the vulnerability language. Unlike exploit-based signatures, whose error rate can only be empirically measured for known test cases, the quality of a vulnerability signature can be formally quantified for all possible inputs.
We provide a formal definition of a vulnerability signature and investigate the computational complexity of creating and matching vulnerability signatures. We also systematically explore the design space of vulnerability signatures. We identify three central issues in vulnerability-signature creation: how a vulnerability signature represents the set of inputs that may exercise a vulnerability, the vulnerability coverage (i.e., the number of vulnerable program paths) that is subject to our analysis during signature creation, and how a vulnerability signature is then created for a given representation and coverage.
We propose new data-flow analysis and novel adoption of existing techniques, such as constraint solving, for automatically generating vulnerability signatures. We have built a prototype system to test our techniques. Our experiments show that, from a single exploit, we can automatically generate a vulnerability signature of much higher quality than previous exploit-based signatures. In addition, our techniques have several other security applications, and thus may be of independent interest.
Resolving spatial inconsistencies in chromosome conformation measurements.
<p>BACKGROUND: Chromosome structure is closely related to chromosome function, and Chromosome Conformation Capture (3C) is a widely used technique for exploring the spatial properties of chromosomes. 3C interaction frequencies are usually associated with spatial distances. However, the raw data from 3C experiments aggregate interactions from many cells, and the spatial distance underlying any given interaction is uncertain.</p>
<p>RESULTS: We introduce a new method for filtering 3C interactions that selects subsets of interactions that obey metric constraints of various strictness. We demonstrate that, although the problem is computationally hard, near-optimal results are often attainable in practice using well-designed heuristics and approximation algorithms. Further, we show that, compared with a standard technique, this metric filtering approach leads to (a) subgraphs with higher statistical significance, (b) lower embedding error, (c) lower sensitivity to initial conditions of the embedding algorithm, and (d) structures with better agreement with light microscopy measurements. Our filtering scheme is applicable for a strict frequency-to-distance mapping and a more relaxed mapping from frequency to a range of distances.</p>
<p>CONCLUSIONS: Our filtering method for 3C data considers both metric consistency and statistical confidence simultaneously, resulting in lower-error embeddings that are biologically more plausible.</p>
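The metric-consistency idea can be sketched with a brute-force toy check (the paper tackles a computationally hard general problem with heuristics and approximation algorithms; the inverse frequency-to-distance mapping and the data here are purely illustrative):

```python
# Toy metric filter for 3C-style data: map interaction frequencies to
# distances via d = 1/f and flag any interaction whose implied distance
# violates the triangle inequality within some triangle of loci.
from itertools import combinations

def violating_edges(freqs):
    """freqs: dict mapping ordered locus pairs (i, j) to frequency."""
    dist = {e: 1.0 / f for e, f in freqs.items()}
    nodes = sorted({n for e in freqs for n in e})
    bad = set()
    for a, b, c in combinations(nodes, 3):
        edges = [(a, b), (a, c), (b, c)]
        if not all(e in dist for e in edges):
            continue  # triangle not fully measured
        for e in edges:
            others = [x for x in edges if x != e]
            if dist[e] > dist[others[0]] + dist[others[1]] + 1e-9:
                bad.add(e)  # edge too long to be metrically consistent
    return bad

freqs = {(1, 2): 1.0, (2, 3): 1.0, (1, 3): 0.2}  # (1,3) implies distance 5
print(violating_edges(freqs))
# {(1, 3)}
```

Dropping (or down-weighting) such inconsistent interactions before embedding is the intuition behind the filtering step; the strict mapping used here generalizes to the paper's relaxed frequency-to-distance-range setting.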