
    A Comparative Review of Dimension Reduction Methods in Approximate Bayesian Computation

    Approximate Bayesian computation (ABC) methods make use of comparisons between simulated and observed summary statistics to overcome the problem of computationally intractable likelihood functions. As the practical implementation of ABC requires computations based on vectors of summary statistics, rather than full data sets, a central question is how to derive low-dimensional summary statistics from the observed data with minimal loss of information. In this article we provide a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature. The methods are split into three non-mutually-exclusive classes consisting of best subset selection methods, projection techniques, and regularization. In addition, we introduce two new methods of dimension reduction. The first is a best subset selection method based on Akaike and Bayesian information criteria, and the second uses ridge regression as a regularization procedure. We illustrate the performance of these dimension reduction techniques through the analysis of three challenging models and data sets. Published in Statistical Science (http://www.imstat.org/sts/, DOI: http://dx.doi.org/10.1214/12-STS406) by the Institute of Mathematical Statistics.
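    To illustrate the setting, the following toy sketch (an illustration of plain rejection ABC, not one of the reviewed dimension reduction methods) infers the mean of a Gaussian by comparing a one-dimensional summary statistic, the sample mean, instead of the raw data:

```python
import random

# Toy problem (hypothetical example): infer the mean mu of a Normal(mu, 1)
# from data, using the sample mean as a one-dimensional summary statistic.
def abc_rejection(observed, n_samples=200, eps=0.1, seed=0):
    rng = random.Random(seed)
    n = len(observed)
    s_obs = sum(observed) / n  # low-dimensional summary of the observed data
    accepted = []
    while len(accepted) < n_samples:
        mu = rng.uniform(-5, 5)                        # draw from a flat prior
        sim = [rng.gauss(mu, 1.0) for _ in range(n)]   # simulate a data set
        s_sim = sum(sim) / n                           # summarise it
        if abs(s_sim - s_obs) < eps:                   # compare summaries, not raw data
            accepted.append(mu)
    return accepted

gen = random.Random(1)
data = [gen.gauss(2.0, 1.0) for _ in range(50)]  # true mean is 2.0
post = abc_rejection(data)
est = sum(post) / len(post)  # posterior mean, close to the true value
```

With an informative, low-dimensional summary the acceptance test stays cheap; the review's question is how to choose or construct such summaries when many candidate statistics are available.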

    Morphological Study of Voids in Ultra-Large Models of Amorphous Silicon

    The microstructure of voids in pure and hydrogen-rich amorphous silicon (a-Si) networks was studied on the nanometer length scale in ultra-large models, using classical and quantum-mechanical simulations. The nanostructure, particularly the voids, of device-grade ultra-large a-Si models was studied: the observed three-dimensional realistic voids were extended using a geometrical approach within the experimental limits of void-volume fractions. In the device-grade simulated models, the effect of void morphology (size, shape, number density, and distribution) on simulated scattering intensities in the small-angle region was investigated. The evolution of voids on annealing below the crystallization temperature (≤ 800 K) was examined, and the extent of void reconstruction was reported by rendering the voids with high-quality three-dimensional software and calculating their average size and volume. Additionally, the role of bonded and non-bonded hydrogen in the vicinity of the void walls in the a-Si network was examined. Our simulation results suggest that, in the extended void structures, X-ray scattering intensities in the small-angle region are sensitive to the number density, size, shape, and distribution of the voids, with unequal strength. Void reconstruction was observed in both the classical and the local ab initio molecular-dynamics models of a-Si, but the effect was greater in the latter, both with and without hydrogen present. The distribution and dynamics of bonded and non-bonded hydrogen in heavily hydrogenated (≥ 14 at.%) ultra-large a-Si models suggest that the void walls are decorated predominantly with silicon dihydride (SiH2) bonds, and that 9–13% of the total hydrogen is realized as molecular hydrogen (H2) over annealing temperatures of 300–800 K.
    This work suggests that an a-Si sample with ≥ 14 at.% H and ≤ 0.2% void-volume fraction may be appropriate for the hydrogenated-amorphous-silicon/crystalline-silicon (a-Si:H/c-Si) interface material used in silicon heterojunction solar cells, yielding a better-passivated surface due to the presence of mobile non-bonded hydrogen.

    Isogeometric Analysis of Acoustic Scattering with Perfectly Matched Layers (IGAPML)

    The perfectly matched layer (PML) formulation is a prominent way of handling radiation problems in unbounded domains and has gained interest due to its simple implementation in finite element codes. However, its simplicity can be advanced further using the isogeometric framework. This work presents a spline-based PML formulation which avoids an additional coordinate transformation, as the formulation is based on the same spline space in which the numerical solution is sought. The procedure can be automated for any convex artificial boundary. This removes restrictions on the domain construction when using PML and can therefore reduce computational cost and improve mesh quality. The use of spline basis functions with higher continuity also improves the accuracy of the numerical solution.
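    The absorbing mechanism behind any PML can be seen in one dimension. The sketch below (a generic complex-coordinate-stretch illustration with made-up parameters, not the paper's spline-based formulation) shows that an outgoing wave exp(ikx) keeps unit amplitude up to the layer and decays inside it:

```python
import numpy as np

k = 10.0          # wavenumber of the outgoing wave
a, d = 1.0, 0.5   # layer starts at x = a and has thickness d
sigma0 = 40.0     # absorption strength (illustrative value)

def stretched(x):
    """Complex-stretched coordinate: x -> x + (i/k) * integral of sigma."""
    if x <= a:
        return x + 0j                       # physical region: no stretching
    s = x - a
    # quadratic profile sigma(s) = sigma0 * (s/d)^2, integrated from 0 to s
    return x + 1j / k * sigma0 * s**3 / (3 * d**2)

xs = np.linspace(0.0, a + d, 7)
amps = [abs(np.exp(1j * k * stretched(x))) for x in xs]
# amplitude is exactly 1 before the layer and decays monotonically inside it
```

The imaginary part of the stretched coordinate turns the oscillatory factor into exponential damping, so the truncated boundary sees (ideally) no reflected energy; the paper's contribution is building this stretch directly in the spline space of the discretization.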

    07291 Abstracts Collection -- Scientific Visualization

    From 15.07.2007 to 20.07.2007, the Dagstuhl Seminar 07291 "Scientific Visualization" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar, as well as abstracts of seminar results and ideas, are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided where available.

    Modelling and interpreting spectral energy distributions of galaxies with BEAGLE

    We present a new-generation tool to model and interpret spectral energy distributions (SEDs) of galaxies, which incorporates in a consistent way the production of radiation and its transfer through the interstellar and intergalactic media. This flexible tool, named BEAGLE (for BayEsian Analysis of GaLaxy sEds), allows one to build mock galaxy catalogues as well as to interpret any combination of photometric and spectroscopic galaxy observations in terms of physical parameters. The current version of the tool includes versatile modeling of the emission from stars and photoionized gas, attenuation by dust, and accounting for different instrumental effects, such as spectroscopic flux calibration and line spread function. We show a first application of the BEAGLE tool to the interpretation of broadband SEDs of a published sample of ~10^4 galaxies at redshifts 0.1 ≲ z ≲ 8. We find that the constraints derived on photometric redshifts using this multi-purpose tool are comparable to those obtained using public, dedicated photometric-redshift codes, and quantify this result in a rigorous statistical way. We also show how the post-processing of BEAGLE output data with the Python extension PYP-BEAGLE allows the characterization of systematic deviations between models and observations, in particular through posterior predictive checks. The modular design of the BEAGLE tool allows easy extensions to incorporate, for example, absorption by neutral galactic and circumgalactic gas, and emission from an active galactic nucleus, dust, and shock-ionized gas. Information about public releases of the BEAGLE tool will be maintained on http://www.jacopochevallard.org/beagle. (An erratum adding a missing term in equation 4.1 has been submitted to MNRAS.)
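    The idea of a posterior predictive check, used above to characterize model-data deviations, can be sketched on a toy conjugate model (a hypothetical example with synthetic data, unrelated to BEAGLE's actual likelihood): draw parameters from the posterior, simulate replicated data sets, and compare a discrepancy statistic against its observed value.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 40 measurements with unit Gaussian noise around an
# unknown level (true value 5.0).
y = rng.normal(5.0, 1.0, size=40)

# Conjugate posterior for the mean (known unit noise, flat prior):
# mu | y ~ Normal(ybar, 1/sqrt(n))
n, ybar = len(y), y.mean()
mu_post = rng.normal(ybar, 1.0 / np.sqrt(n), size=1000)

# Posterior predictive check: simulate a replicated data set for each
# posterior draw and compare a discrepancy statistic (here the sample
# maximum) with the observed one.
t_obs = y.max()
t_rep = np.array([rng.normal(m, 1.0, size=n).max() for m in mu_post])
p_value = (t_rep >= t_obs).mean()  # extreme values (~0 or ~1) flag misfit
```

A well-specified model yields a moderate posterior predictive p-value; systematic deviations between model and data push it toward 0 or 1.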

    Data depth and floating body

    Little-known relations of the renowned concept of halfspace depth for multivariate data with notions from convex and affine geometry are discussed. Halfspace depth may be regarded as a measure of symmetry for random vectors. As such, the depth stands as a generalization of a measure of symmetry for convex sets that is well studied in geometry. Under a mild assumption, the upper level sets of the halfspace depth coincide with the convex floating bodies used in the definition of the affine surface area for convex bodies in Euclidean spaces. These connections enable us to partially resolve some persistent open problems regarding theoretical properties of the depth.
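    For concreteness, the halfspace (Tukey) depth of a point is the smallest fraction of sample points contained in any closed halfspace through that point. A brute-force 2D approximation over a grid of directions (an illustrative sketch, not a method from the paper):

```python
import math
import random

def halfspace_depth(point, data, n_dirs=360):
    """Approximate Tukey halfspace depth of `point` w.r.t. 2D sample `data`:
    minimise, over a grid of directions u, the fraction of sample points in
    the closed halfspace {y : <y - point, u> >= 0}."""
    n = len(data)
    depth = 1.0
    for j in range(n_dirs):
        t = 2 * math.pi * j / n_dirs
        ux, uy = math.cos(t), math.sin(t)
        frac = sum(1 for (px, py) in data
                   if (px - point[0]) * ux + (py - point[1]) * uy >= 0) / n
        depth = min(depth, frac)
    return depth

rng = random.Random(0)
cloud = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
d_center = halfspace_depth((0.0, 0.0), cloud)    # central point: high depth
d_outlier = halfspace_depth((10.0, 10.0), cloud) # far point: depth near 0
```

The upper level sets {x : depth(x) ≥ t} of this function are convex; the paper's observation is that, under a mild assumption, they are exactly the convex floating bodies of the distribution.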

    Clustering in the Big Data Era: methods for efficient approximation, distribution, and parallelization

    Data clustering is an unsupervised machine learning task whose objective is to group together similar items. As a versatile data mining tool, data clustering has numerous applications, such as object detection and localization using data from 3D laser-based sensors, finding popular routes using geolocation data, and finding similar patterns of electricity consumption using smart meters.

    The datasets in modern IoT-based applications are getting more and more challenging for conventional clustering schemes. Big Data is a term used to loosely describe hard-to-manage datasets. In particular, large numbers of data points, high rates of data production, large numbers of dimensions, high skewness, and distributed data sources are aspects that challenge classical data processing schemes, including clustering methods. This thesis contributes to efficient big data clustering for distributed and parallel computing architectures, representative of the processing environments in the edge-cloud computing continuum. The thesis also proposes approximation techniques to cope with certain challenging aspects of big data.

    Regarding distributed clustering, the thesis proposes MAD-C, abbreviating Multi-stage Approximate Distributed Cluster-Combining. MAD-C leverages an approximation-based data synopsis that drastically lowers the required communication bandwidth among the distributed nodes and achieves multiplicative savings in computation time, compared to a baseline that centrally gathers and clusters the data. The thesis shows MAD-C can be used to detect and localize objects using data from distributed 3D laser-based sensors with high accuracy. Furthermore, the work in the thesis shows how to utilize MAD-C to efficiently detect the objects within a restricted area for geofencing purposes.

    Regarding parallel clustering, the thesis proposes a family of algorithms called PARMA-CC, abbreviating Parallel Multistage Approximate Cluster Combining. Using approximation-based data synopses, PARMA-CC algorithms achieve scalability on multi-core systems by facilitating parallel execution of threads with limited dependencies, which are resolved using fine-grained synchronization techniques. To further enhance efficiency, PARMA-CC algorithms can be configured with respect to different data properties. Analytical and empirical evaluations show that PARMA-CC algorithms achieve significantly higher scalability than the state-of-the-art methods while preserving high accuracy.

    On parallel high-dimensional clustering, the thesis proposes IP.LSH.DBSCAN, abbreviating Integrated Parallel Density-Based Clustering through Locality-Sensitive Hashing (LSH). IP.LSH.DBSCAN fuses the process of creating an LSH index into the process of data clustering, and it takes advantage of data parallelization and fine-grained synchronization. Analytical and empirical evaluations show that IP.LSH.DBSCAN facilitates parallel density-based clustering of massive datasets using desired distance measures, resulting in several orders of magnitude lower latency than the state of the art for high-dimensional data.

    In essence, the thesis proposes methods and algorithmic implementations targeting the problem of big data clustering and applications using distributed and parallel processing. The proposed methods (available as open source software) are extensible and can be used in combination with other methods.
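    The synopsis-based distributed scheme can be sketched in miniature (a simplified 1D illustration of the general idea, not the MAD-C algorithm itself): each node reduces its local data to a few (centroid, count) pairs, and a combiner merges synopses whose centroids fall within a merge radius, so only the small synopses cross the network.

```python
import random

def local_synopsis(points, k=2, iters=10, seed=0):
    """Reduce a node's 1D data to k (centroid, count) pairs via naive k-means."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters + 1):
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda j: (p - centers[j]) ** 2)
            groups[i].append(p)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return [(c, len(g)) for c, g in zip(centers, groups)]

def combine(synopses, radius=1.0):
    """Merge synopses from all nodes: weighted-average centroids within radius."""
    merged = []  # list of (weighted centroid, total count)
    for c, n in (pair for syn in synopses for pair in syn):
        for i, (mc, mn) in enumerate(merged):
            if abs(c - mc) <= radius:
                total = mn + n
                merged[i] = ((mc * mn + c * n) / total, total)
                break
        else:
            merged.append((c, n))
    return merged

# Two nodes, each observing the same two well-separated clusters.
rng = random.Random(1)
node_a = [rng.gauss(0, 0.3) for _ in range(50)] + [rng.gauss(10, 0.3) for _ in range(50)]
node_b = [rng.gauss(0, 0.3) for _ in range(50)] + [rng.gauss(10, 0.3) for _ in range(50)]
global_clusters = combine([local_synopsis(node_a), local_synopsis(node_b)])
# two global clusters, counts aggregated across nodes
```

Only k pairs per node are communicated instead of all raw points, which is the source of the bandwidth and computation savings the thesis quantifies.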

    Spherical Harmonics Models and their Application to non-Spherical Shape Particles

    The dissertation investigates the spherical harmonics method for describing a particle shape. The main object of research is non-spherical particles. The purpose of this dissertation is to create a spherical harmonics model for a non-spherical particle. The dissertation also focuses on determining the suitability of low-resolution spherical harmonics for describing various non-spherical particles. The work addresses two tasks: testing the suitability of a spherical harmonics model for simple symmetric particles, and applying it to particles of complex shape. The first task is formulated with the aim of testing the modelling concept and strategy on simple shapes. The second task is related to practical applications, where complex-shape particles are considered. The dissertation consists of an introduction, 4 chapters, general conclusions, references, a list of publications by the author on the topic of the dissertation, a summary in Lithuanian, and 5 annexes. The introduction reveals the investigated problem, the importance of the thesis and the object of research, and describes the purpose and tasks of the thesis, the research methodology, the scientific novelty, the practical significance of the results, and the defended statements. The introduction ends by presenting the author's publications on the topic of the dissertation, the presentations made at conferences, and the structure of the dissertation. Chapter 1 reviews the literature: particulate systems and their processes, particle shapes and methods for describing them, and shape indicators. At the end of the chapter, conclusions are drawn and the tasks of the dissertation are reconsidered. Chapter 2 presents the modelling approach and strategies for the points of the particle surface, spherical harmonics, the calculation of the expansion coefficients, integral parameters and curvature, and also the conclusions.
    Chapters 3 and 4 analyse the modelling results for the simple and complex particles. At the end of both chapters conclusions are drawn. Five articles focusing on the topic of the dissertation have been published: two articles in the Thomson ISI register; one article in conference material and scientific papers in the Thomson ISI Proceedings database; one article in a journal quoted by other international databases; and one article in material reviewed during an international conference. Eight presentations on the subject of the dissertation have been given at conferences at national and international levels.
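    The core expansion can be shown in a minimal form (an illustrative sketch with a made-up shape, not the dissertation's model): for an axisymmetric star-shaped particle, the radius R(θ) is expanded in Legendre polynomials, the m = 0 spherical harmonics, and the low-degree coefficients already capture the overall shape.

```python
import numpy as np

def legendre_coeffs(radius_fn, lmax, n_quad=64):
    """Expansion coefficients c_l of R(x), x = cos(theta), in Legendre
    polynomials: c_l = (2l+1)/2 * integral of R(x) P_l(x) over [-1, 1],
    computed with Gauss-Legendre quadrature."""
    x, w = np.polynomial.legendre.leggauss(n_quad)
    r = radius_fn(x)
    coeffs = []
    for l in range(lmax + 1):
        P_l = np.polynomial.legendre.Legendre.basis(l)(x)
        coeffs.append((2 * l + 1) / 2 * np.sum(w * r * P_l))
    return coeffs

# A slightly prolate particle: unit sphere plus a degree-2 perturbation.
P2 = lambda x: 0.5 * (3 * x**2 - 1)
radius = lambda x: 1.0 + 0.3 * P2(x)

c = legendre_coeffs(radius, lmax=4)
# only c[0] (mean radius) and c[2] (the perturbation) are non-negligible
```

Truncating the expansion at a low degree lmax gives exactly the kind of low-resolution description whose adequacy for various non-spherical particles the dissertation examines.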