Central Limit Theorems and Approximation Theory: Part I
Central limit theorems (CLTs) have a long history in probability and
statistics. They play a fundamental role in constructing valid statistical
inference procedures. Over the last century, various techniques have been
developed in probability and statistics to prove CLTs under a variety of
assumptions on random variables. Quantitative versions of CLTs (e.g.,
Berry--Esseen bounds) have also been developed in parallel. In this article, we
propose to use approximation theory from functional analysis to derive explicit
bounds on the difference between expectations of functions. Comment: 25 pages
Nested conformal prediction and quantile out-of-bag ensemble methods
Conformal prediction is a popular tool for providing valid prediction sets
for classification and regression problems, without relying on any
distributional assumptions on the data. While the traditional description of
conformal prediction starts with a nonconformity score, we provide an alternate
(but equivalent) view that starts with a sequence of nested sets and calibrates
them to find a valid prediction set. The nested framework subsumes all
nonconformity scores, including recent proposals based on quantile regression
and density estimation. While these ideas were originally derived based on
sample splitting, our framework seamlessly extends them to other aggregation
schemes like cross-conformal, jackknife+ and out-of-bag methods. We use the
framework to derive a new algorithm (QOOB, pronounced cube) that combines four
ideas: quantile regression, cross-conformalization, ensemble methods and
out-of-bag predictions. We develop a computationally efficient implementation
of cross-conformal that is also used by QOOB. In a detailed numerical
investigation, QOOB performs either the best or close to the best on all
simulated and real datasets. Comment: 38 pages, 5 figures, 8 tables
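The nested-set view described in this abstract can be illustrated with a minimal sketch for the simplest case, sample splitting. This is not the QOOB algorithm or the authors' implementation. It assumes a hypothetical nested family F_t(x) = [mu(x) - t, mu(x) + t] built around a least-squares line, and the function names (fit_mean, calibrate_nested, prediction_interval) are made up for illustration. Calibration finds, for each held-out point, the smallest t whose set contains the observed response, then takes an upper quantile of those values.

```python
import math
import numpy as np


def fit_mean(x_train, y_train):
    """Fit a least-squares line; returns a function x -> predicted mean."""
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    return lambda x: slope * np.asarray(x, dtype=float) + intercept


def calibrate_nested(predict, x_cal, y_cal, alpha):
    """Pick the index t of the nested family F_t(x) = [mu(x) - t, mu(x) + t].

    For each calibration point, r_i is the smallest t such that y_i lies
    in F_t(x_i); for this family r_i = |y_i - mu(x_i)|.
    """
    r = np.abs(np.asarray(y_cal, dtype=float) - predict(x_cal))
    n = len(r)
    # Rank of the conformal quantile: ceil((1 - alpha) * (n + 1)).
    k = math.ceil((1 - alpha) * (n + 1))
    if k > n:
        # Too few calibration points: only the largest set is valid.
        return float("inf")
    return float(np.sort(r)[k - 1])


def prediction_interval(predict, t_hat, x):
    """Return the calibrated set F_{t_hat}(x) as (lower, upper) arrays."""
    center = predict(x)
    return center - t_hat, center + t_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=1500)
    y = 2.0 * x + rng.normal(scale=0.3, size=x.size)
    # Split the data: fit, calibrate, and evaluate on disjoint parts.
    predict = fit_mean(x[:500], y[:500])
    t_hat = calibrate_nested(predict, x[500:1000], y[500:1000], alpha=0.1)
    lower, upper = prediction_interval(predict, t_hat, x[1000:])
    covered = np.mean((y[1000:] >= lower) & (y[1000:] <= upper))
    print(f"t_hat = {t_hat:.3f}, empirical coverage = {covered:.3f}")
```

Under exchangeability, the calibrated set covers a new response with probability at least 1 - alpha. Swapping in a different nested family (for example, one built from quantile-regression estimates) changes only how r_i is computed.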
Chemical and biochemical studies of ubiquitin conjugation machinery
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Biology, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references. The post-translational modification of proteins is a major mechanism employed in eukaryotic cells to expand the functional diversity of the proteome. Covalent modification of amino acid side chains confers new or altered functionality to the modified protein by creating new recognition surfaces on the protein for the interaction of nucleic acids or other proteins, modulating enzymatic activity, or altering cellular localization or half-life. The post-translational modification of proteins with ubiquitin (Ub) is an important mechanism of regulating protein function. Ub is a 76-residue protein that is primarily attached to lysine residues in target proteins through an enzymatic cascade catalyzed by E1, E2, and E3 enzymes. Ub conjugation is important for fundamental cellular processes, including transcription, DNA repair, endocytosis, apoptosis, and signal transduction. Ub conjugation is reversible. Proteases termed deubiquitinating enzymes (DUBs) function to remove Ub from target proteins. Genome sequencing efforts have uncovered the existence of many predicted enzymes with unknown function. Many enzymes have been assigned function based on sequence homology to proteins with known function without confirmation of enzymatic activity. A powerful chemical approach to determine enzyme function from a complex mixture of proteins is activity-based protein profiling. This method makes use of chemical probes that are active site-directed for the assignment of function to proteins. We describe here the design and generation of an expanded set of Ub-based chemical probes with which we identified and recovered E1, E2, and E3 Ub ligases from cell lysates.
Furthermore, we describe the biochemical and structural characterization of the catalytic domain of one E3 Ub ligase we recovered, HUWE1, and the identification of a structural element within the catalytic domain of HUWE1 that modulates its activity. Finally, we discuss a protein engineering method that we are applying to the HUWE1 catalytic domain to understand how the conformational flexibility of this domain is important to its function. By Renuka K. Pandya. Ph.D.
TMC-1C: an accreting starless core
We have mapped the starless core TMC-1C in a variety of molecular lines with
the IRAM 30m telescope. High density tracers show clear signs of
self-absorption and sub-sonic infall asymmetries are present in N2H+ (1-0) and
DCO+ (2-1) lines. The inward velocity profile in N2H+ (1-0) is extended over a
region of about 7,000 AU in radius around the dust continuum peak, which is the
most extended "infalling" region observed in a starless core with this
tracer. The kinetic temperature (~12 K) measured from C17O and C18O suggests
that their emission comes from a shell outside the colder interior traced by
the mm continuum dust. The C18O (2-1) excitation temperature drops from 12 K to
~10 K away from the center. This is consistent with a volume density drop of
the gas traced by the C18O lines, from ~4x10^4 cm^-3 towards the dust peak to
~6x10^3 cm^-3 at a projected distance from the dust peak of 80" (or 11,000 AU).
The column densities implied by the gas and dust show similar N2H+ and CO
depletion factors (f_D < 6). This can be explained with a simple scenario in
which: (i) the TMC-1C core is embedded in a relatively dense environment (H2
~10^4 cm^-3), where CO is mostly in the gas phase and the N2H+ abundance had
time to reach equilibrium values; (ii) the surrounding material (rich in CO and
N2H+) is accreting onto the dense core nucleus; (iii) TMC-1C is older than
3x10^5 yr, to account for the observed abundance of N2H+ across the core
(~10^-10 w.r.t. H2); and (iv) the core nucleus is either much younger (~10^4
yr) or "undepleted" material from the surrounding envelope has fallen towards
it in the past 10,000 yr. Comment: 29 pages, including 5 tables and 15 figures