65 research outputs found
Nonconventional Large Deviations Theorems
We obtain large deviations theorems for nonconventional sums whose underlying
process is a Markov process satisfying the Doeblin condition or a dynamical
system such as a subshift of finite type or a hyperbolic or expanding
transformation.
Detecting Outliers in Data with Correlated Measures
Advances in sensor technology have enabled the collection of large-scale
datasets. Such datasets can be extremely noisy and often contain a significant
amount of outliers that result from sensor malfunction or human operation
faults. In order to utilize such data for real-world applications, it is
critical to detect outliers so that models built from these datasets will not
be skewed by outliers.
In this paper, we propose a new outlier detection method that utilizes the
correlations in the data (e.g., taxi trip distance vs. trip time). Different
from existing outlier detection methods, we build a robust regression model
that explicitly models the outliers and detects outliers simultaneously with
the model fitting.
We validate our approach on real-world datasets against methods specifically
designed for each dataset as well as state-of-the-art outlier detectors.
Our outlier detection method achieves better performance, demonstrating the
robustness and generality of our method. Last, we report interesting case
studies on some outliers that result from atypical events.
Comment: 10 pages
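The idea of fitting a regression while explicitly modelling outliers can be illustrated with a minimal sketch. The abstract does not give the formulation, so everything here is an assumption: a linear model with one sparse outlier term per observation, fitted by alternating least squares and soft-thresholding. The function name `fit_with_outlier_terms`, the threshold `lam`, and the synthetic distance/time data are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_with_outlier_terms(x, y, lam, iters=50):
    """Fit y ~ slope * x + intercept + o, where o is a sparse outlier vector.

    Assumed formulation (not from the paper): alternate between a
    least-squares fit on the outlier-corrected targets and a
    soft-threshold update of the per-point outlier terms.
    """
    X = np.column_stack([x, np.ones_like(x)])
    o = np.zeros_like(y)
    for _ in range(iters):
        # Regression step: ordinary least squares on y with outliers removed.
        beta, *_ = np.linalg.lstsq(X, y - o, rcond=None)
        # Outlier step: residuals larger than lam are absorbed into o.
        r = y - X @ beta
        o = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)
    return beta, o

# Synthetic example: trip time grows linearly with distance, one corrupted trip.
x = np.linspace(0.0, 10.0, 50)   # hypothetical trip distances
y = 2.0 * x + 1.0                # hypothetical trip times
y[10] += 30.0                    # a single gross outlier
beta, o = fit_with_outlier_terms(x, y, lam=1.0)
flagged = np.flatnonzero(o)      # indices with a nonzero outlier term
```

Points with a nonzero entry in `o` are the ones flagged as outliers, so detection happens during the fit rather than in a separate pass.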
A Simple Baseline for Travel Time Estimation using Large-Scale Trip Data
The increased availability of large-scale trajectory data around the world
provides rich information for the study of urban dynamics. For example, the New
York City Taxi and Limousine Commission regularly releases source-destination
information about trips in the taxis it regulates. Taxi data provide
information about traffic patterns, and thus enable the study of urban flow --
what will traffic between two locations look like at a certain date and time in
the future? Existing big data methods try to outdo each other in terms of
complexity and algorithmic sophistication. In the spirit of "big data beats
algorithms", we present a very simple baseline which outperforms
state-of-the-art approaches, including Bing Maps and Baidu Maps (whose APIs
permit large scale experimentation). Such a travel time estimation baseline has
several important uses, such as navigation (fast travel time estimates can
serve as approximate heuristics for A* search variants for path finding) and
trip planning (which uses operating hours for popular destinations along with
travel time estimates to create an itinerary).
Comment: 12 pages
Ruelle-Perron-Frobenius spectrum for Anosov maps
We extend a number of results from one-dimensional dynamics based on spectral
properties of the Ruelle-Perron-Frobenius transfer operator to Anosov
diffeomorphisms on compact manifolds. This allows us to develop a direct operator
approach to studying the ergodic properties of these maps. In particular, we show that
it is possible to define Banach spaces on which the transfer operator is
quasicompact. (Information on the existence of an SRB measure, its smoothness
properties and statistical properties readily follows from such a result.) In
dimension d=2 we show that the transfer operator associated to smooth random
perturbations of the map is close, in a proper sense, to the unperturbed
transfer operator. This easily yields very strong spectral stability
results, which in turn imply spectral stability results for smooth
deterministic perturbations as well. Finally, we are able to implement an
Ulam-type finite rank approximation scheme, thus reducing the study of the spectral
properties of the transfer operator to a finite dimensional problem.
Comment: 58 pages, LaTeX
Stochastic stability versus localization in chaotic dynamical systems
We prove stochastic stability of chaotic maps for a general class of Markov
random perturbations (including singular ones) satisfying some kind of mixing
conditions. One of the consequences of this statement is the proof of Ulam's
conjecture about the approximation of the dynamics of a chaotic system by a
finite state Markov chain. Conditions under which the localization phenomenon
(i.e. stabilization of singular invariant measures) takes place are also
considered. Our main tools are the so-called bounded variation approach
combined with the ergodic theorem of Ionescu-Tulcea and Marinescu, and a random
walk argument that we apply to prove the absence of "traps" under the action
of random perturbations.
Comment: 27 pages, LaTeX
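The approximation of a chaotic system by a finite-state Markov chain, mentioned in the abstract above, can be illustrated with a minimal sketch of Ulam's construction. It assumes a one-dimensional stand-in, the doubling map on [0, 1), rather than the general class of maps treated in the paper. The function names `ulam_matrix` and `stationary_vector`, the cell count, and the sampling scheme are illustrative choices, not taken from the source.

```python
import numpy as np

def ulam_matrix(f, n_cells, samples_per_cell=200):
    """Row-stochastic Ulam matrix for a map f on [0, 1).

    Entry (i, j) estimates the fraction of cell i that f sends into cell j,
    using evenly spaced sample points inside each cell.
    """
    P = np.zeros((n_cells, n_cells))
    offsets = (np.arange(samples_per_cell) + 0.5) / samples_per_cell
    for i in range(n_cells):
        x = (i + offsets) / n_cells              # sample points in cell i
        y = f(x) % 1.0                           # images wrapped to [0, 1)
        j = np.minimum((y * n_cells).astype(int), n_cells - 1)
        np.add.at(P[i], j, 1.0)                  # count hits per target cell
    return P / samples_per_cell

def stationary_vector(P, iters=500):
    """Left fixed vector of P, computed by power iteration."""
    pi = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        pi = pi @ P
        pi /= pi.sum()
    return pi

def doubling(x):
    """Doubling map x -> 2x (the wrap to [0, 1) is applied by the caller)."""
    return 2.0 * x

P = ulam_matrix(doubling, n_cells=16)
pi = stationary_vector(P)
```

For the doubling map the invariant density is uniform, so `pi` should assign roughly equal mass to every cell. That makes this a convenient sanity check before trying less regular maps.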
Convergence to equilibrium for many particle systems
The goal of this paper is to give a short review of recent results of the
authors concerning classical Hamiltonian many particle systems. We hope that
these results support a possible new formulation of Boltzmann's ergodicity
hypothesis, which reads as follows. For almost all potentials, minimal
contact with the external world, through only one particle of N, is sufficient
for ergodicity, but only if this contact has no memory. New results for
the quantum case are also presented.
Dissipation time and decay of correlations
We consider the effect of noise on the dynamics generated by
volume-preserving maps on a d-dimensional torus. The quantity we use to measure
the irreversibility of the dynamics is the dissipation time. We focus on the
asymptotic behaviour of this time in the limit of small noise. We derive
universal lower and upper bounds for the dissipation time in terms of various
properties of the map and its associated propagators: spectral properties,
local expansivity, and global mixing properties. We show that the dissipation
is slow for a general class of non-weakly-mixing maps; by contrast, it is
fast for a large class of exponentially mixing systems which include uniformly
expanding maps and Anosov diffeomorphisms.
Comment: 26 pages, LaTeX. Submitted to Nonlinearity
Predictive metabolites for incident myocardial infarction: a two-step meta-analysis of individual patient data from six cohorts comprising 7,897 individuals from the COnsortium of METabolomic Studies
Aims: Myocardial infarction (MI) is a major cause of death and disability worldwide. Most metabolomics studies investigating metabolites predicting MI are limited by the participant number and/or the demographic diversity. We sought to identify biomarkers of incident MI in the COnsortium of METabolomics Studies.
Methods and results: We included 7897 individuals aged on average 66 years from six intercontinental cohorts with blood metabolomic profiling (n = 1428 metabolites, of which 168 were present in at least three cohorts with over 80% prevalence) and MI information (1373 cases). We performed a two-stage individual patient data meta-analysis. We first assessed the associations between circulating metabolites and incident MI for each cohort adjusting for traditional risk factors and then performed a fixed effect inverse variance meta-analysis to pull the results together. Finally, we conducted a pathway enrichment analysis to identify potential pathways linked to MI. On meta-analysis, 56 metabolites including 21 lipids and 17 amino acids were associated with incident MI after adjusting for multiple testing (false discovery rate < 0.05), and 10 were novel. The largest increased risk was observed for the carbohydrate mannitol/sorbitol {hazard ratio [HR] [95% confidence interval (CI)] = 1.40 [1.26-1.56], P < 0.001}, whereas the largest decrease in risk was found for glutamine [HR (95% CI) = 0.74 (0.67-0.82), P < 0.001]. Moreover, the identified metabolites were significantly enriched (corrected P < 0.05) in pathways previously linked with cardiovascular diseases, including aminoacyl-tRNA biosynthesis.
Conclusions: In the most comprehensive metabolomic study of incident MI to date, 10 novel metabolites were associated with MI. Metabolite profiles might help to identify high-risk individuals before disease onset.
Further research is needed to fully understand the mechanisms of action and elaborate pathway findings.
This research was funded in whole, or in part, by the Wellcome Trust (WT212904/Z/18/Z) and by the UKRI Medical Research Council (MRC)/British Heart Foundation Ancestry and Biological Informative Markers for Stratification of Hypertension (AIM-HY; MR/M016560/1). For the purpose of open access, the authors have applied a CC BY public copyright to any author-accepted manuscript version arising from this submission. TwinsUK receives funding from the Wellcome Trust, the European Commission H2020 grants SYSCID (contract #733100), the National Institute for Health Research (NIHR) Clinical Research Facility and the Biomedical Research Centre based at Guy's and St Thomas’ NHS Foundation Trust in partnership with King's College London, the Chronic Disease Research Foundation, the UKRI Medical Research Council (MRC)/British Heart Foundation Ancestry and Biological Informative Markers for Stratification of Hypertension (AIM-HY; MR/M016560/1), and Zoe Limited. C.M. and A.N. are funded by the Chronic Disease Research Foundation. C.M. is also funded by the MRC AIM-HY grant. The Atherosclerosis Risk in Communities (ARIC) study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. (75N92022D00001, 75N92022D00002, 75N92022D00003, 75N92022D00004, and 75N92022D00005). The authors thank the staff and participants of the ARIC study for their important contributions. B.Y. was in part supported by R01HL168683. Metabolomics measurements were sponsored by the National Human Genome Research Institute (3U01HG004402-02S1). The ET2DS was funded by the Medical Research Council (UK) (Project Grant G0500877) and the Chief Scientist Office of Scotland (Program Support Grant CZQ/1/38). C.B. was funded by the grant FIS-FEDER-ISCIII PI16/00620 (Ext 2021) and the Strategic Plan for Research and Innovation in Health, CatSalut, PERIS STL008 (2019–2021), and RICORS RD21/0005, to develop clinical and epidemiological studies mainly focused on diabetes and its associations with new biomarkers. HABC was supported in part by the Intramural Research Program of the National Institutes of Health, National Institute on Aging (NIA); contracts: N01-AG-6-2101, N01-AG-6-2103, and N01-AG-6-2106; NIA grant: R01-AG028050, and NINR grant R01-NR012459; and the National Center for Advancing Translational Sciences of the National Institutes of Health under Award Number UL1TR000454. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Dr Murphy is supported by the Michael Smith Foundation for Health Research (grant #17644). The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through 75N92021D00001, 75N92021D00002, 75N92021D00003, 75N92021D00004, and 75N92021D00005. The authors thank the WHI investigators and staff for their dedication and the study participants for making the program possible. A full listing of WHI investigators can be found at https://www-whi-org.s3.us-west-2.amazonaws.com/wp-content/uploads/WHI-Investigator-Long-List.pdf
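The second stage of the two-stage analysis described in the abstract, a fixed effect inverse variance meta-analysis, can be sketched as follows. This is the textbook pooling formula applied to per-cohort log hazard ratios, not the authors' code. The function names `pool_fixed_effect` and `hr_with_ci` and the per-cohort numbers are hypothetical and purely illustrative; they are not results from the study.

```python
import math

def pool_fixed_effect(log_hrs, std_errs):
    """Pool per-cohort log hazard ratios with inverse-variance weights.

    Each cohort gets weight 1 / SE^2; the pooled estimate is the weighted
    mean and its standard error is sqrt(1 / sum of weights).
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

def hr_with_ci(log_hr, se, z=1.959964):
    """Convert a log hazard ratio and its SE to (HR, lower, upper) at ~95%."""
    return (math.exp(log_hr),
            math.exp(log_hr - z * se),
            math.exp(log_hr + z * se))

# Hypothetical per-cohort estimates for one metabolite (illustrative only).
cohort_log_hrs = [0.20, 0.40]
cohort_ses = [0.10, 0.10]
pooled, pooled_se = pool_fixed_effect(cohort_log_hrs, cohort_ses)
hr, lower, upper = hr_with_ci(pooled, pooled_se)
```

With equal standard errors the pooled estimate reduces to the plain mean of the cohort estimates, which is an easy way to check the weighting before using real per-cohort output.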