Numerical Simulations of the Dark Universe: State of the Art and the Next Decade
We present a review of the current state of the art of cosmological dark
matter simulations, with particular emphasis on the implications for dark
matter detection efforts and studies of dark energy. This review is intended
both for particle physicists, who may find the cosmological simulation
literature opaque or confusing, and for astrophysicists, who may not be
familiar with the role of simulations for observational and experimental probes
of dark matter and dark energy. Our work is complementary to the contribution
by M. Baldi in this issue, which focuses on the treatment of dark energy and
cosmic acceleration in dedicated N-body simulations. Truly massive dark
matter-only simulations are being conducted at national supercomputing centers,
employing from several billion to over half a trillion particles to simulate
the formation and evolution of cosmologically representative volumes (cosmic
scale) or to zoom in on individual halos (cluster and galactic scale). These
simulations cost millions of core-hours, require tens to hundreds of terabytes
of memory, and use up to petabytes of disk storage. The field is quite
internationally diverse, with top simulations having been run in China, France,
Germany, Korea, Spain, and the USA. Predictions from such simulations touch on
almost every aspect of dark matter and dark energy studies, and we give a
comprehensive overview of this connection. We also discuss the limitations of
the cold and collisionless DM-only approach, and describe in some detail
efforts to include different particle physics as well as baryonic physics in
cosmological galaxy formation simulations, including a discussion of recent
results highlighting how the distribution of dark matter in halos may be
altered. We end with an outlook for the next decade, presenting our view of how
the field can be expected to progress. (abridged)
Comment: 54 pages, 4 figures, 3 tables; invited contribution to the special
issue "The next decade in Dark Matter and Dark Energy" of the new Open Access
journal "Physics of the Dark Universe". Replaced with accepted version.
Doctor of Philosophy
dissertation
Kernel smoothing provides a simple way of finding structure in data sets without imposing a parametric model, for example through nonparametric regression and density estimates. However, in many data-intensive applications the data set is large, so evaluating a kernel density estimate or kernel regression directly over the full data set can be prohibitively expensive. This dissertation studies how to efficiently find a smaller data set that approximates the original with a theoretical guarantee in the kernel smoothing setting, and how to extend this approach to more general smooth range spaces. For kernel density estimates, we propose randomized and deterministic algorithms with quality guarantees that are orders of magnitude more efficient than previous algorithms. These algorithms do not require knowledge of the kernel or its bandwidth parameter and are easily parallelizable. Our algorithms are applicable to any large-scale data processing framework. We then investigate how to measure the error between two kernel density estimates, which is usually measured in either L1 or L2 error. In this dissertation, we investigate the challenges of using a stronger error, the L∞ (or worst-case) error. We present efficient solutions for estimating the L∞ error and for choosing the bandwidth parameter for a kernel density estimate built on a subsample of a large data set. We next extend smoothed versions of geometric range spaces from kernel range spaces to more general types of ranges, so that an element of the ground set can be contained in a range with a non-binary value in [0,1]. We investigate the approximation of these range spaces through ϵ-nets and ϵ-samples. Finally, we study coreset algorithms for kernel regression. The sizes of the coresets are independent of the size of the data set; they depend only on the error guarantee and, in some cases, the size of the domain and the amount of smoothing.
We evaluate our methods on very large time series and spatial data, demonstrate that they can be constructed extremely efficiently, and show that they allow for great computational gains.
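The idea of approximating a kernel density estimate with a smaller data set, and of measuring the approximation in L∞ error, can be illustrated with a minimal sketch. This is not the dissertation's algorithm: it assumes a 1-D Gaussian kernel, uses plain uniform random sampling as a stand-in for a coreset construction, and approximates the L∞ error by taking the maximum over a finite grid, which can only underestimate the true worst-case error. The function names, the synthetic data, and the bandwidth value are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kde(points, queries, bandwidth):
    """Evaluate a 1-D Gaussian kernel density estimate at each query point."""
    # Pairwise scaled differences, shape (n_queries, n_points).
    diffs = (queries[:, None] - points[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    # Average the kernel contributions of all points.
    return kernel.mean(axis=1)


def linf_error_on_grid(full, subset, bandwidth, grid):
    """Largest absolute difference between two KDEs over the grid.

    This is a grid-based lower bound on the true L-infinity error.
    """
    kde_full = gaussian_kde(full, grid, bandwidth)
    kde_subset = gaussian_kde(subset, grid, bandwidth)
    return float(np.max(np.abs(kde_full - kde_subset)))


# Synthetic bimodal data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
data = np.concatenate([
    rng.normal(-2.0, 0.5, 5000),
    rng.normal(1.0, 1.0, 5000),
])

# Uniform random subsample standing in for a coreset.
sample = rng.choice(data, size=500, replace=False)

bandwidth = 0.3  # assumed value, not tuned
grid = np.linspace(data.min(), data.max(), 400)

err = linf_error_on_grid(data, sample, bandwidth, grid)
print(f"points: {data.size} -> {sample.size}, grid L-inf error: {err:.4f}")
```

Evaluating the estimate on the subsample costs a factor of 20 fewer kernel evaluations per query in this setup. The algorithms summarized in the abstract aim for guaranteed error bounds, which uniform sampling on its own provides only probabilistically.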