
    Clustering files of chemical structures using the Székely–Rizzo generalization of Ward's method

    Ward's method is extensively used for clustering chemical structures represented by 2D fingerprints. This paper compares Ward clusterings of 14 datasets (containing between 278 and 4332 molecules) with those obtained using the Székely–Rizzo clustering method, a generalization of Ward's method. The clusterings produced by the two methods were evaluated by how well they grouped active molecules together, using a novel criterion of clustering effectiveness. Analysis of a total of 1400 classifications (Ward and Székely–Rizzo clustering methods, 14 different datasets, 5 different fingerprints and 10 different distance coefficients) demonstrated the general superiority of the Székely–Rizzo method. The distance coefficient first described by Soergel performed extremely well in these experiments, and also when it was used in simulated virtual screening experiments.
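As an illustration of the Soergel distance coefficient mentioned in the abstract above, here is a minimal Python sketch. It assumes the usual definition (sum of absolute differences divided by the sum of element-wise maxima), which for binary fingerprints equals one minus the Tanimoto similarity. The function name and the two example fingerprints are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def soergel_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Soergel distance: sum|x_i - y_i| / sum max(x_i, y_i).

    Assumes non-negative components (e.g. 0/1 fingerprint bits).
    Two all-zero vectors are treated as identical (distance 0.0).
    """
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(max(a, b) for a, b in zip(x, y))
    return numerator / denominator if denominator else 0.0


# Hypothetical 8-bit fingerprints, for illustration only.
fp_a = [1, 1, 0, 1, 0, 0, 1, 0]
fp_b = [1, 0, 0, 1, 1, 0, 1, 0]

# The fingerprints share 3 set bits out of 5 set in either one,
# so Tanimoto similarity is 3/5 and the Soergel distance is 2/5.
print(soergel_distance(fp_a, fp_b))
```

A matrix of such pairwise distances could then be passed to a hierarchical clustering routine. Which linkage is appropriate for a non-Euclidean coefficient is a separate question that this sketch does not address.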

    Development of climate-based thermal comfort ranges from existing data: Analysis of the Smart Controls and thermal comfort (SCATS) database

    Despite the multifaceted nature of the notion of thermal comfort, designers have embraced a very strict definition of it: tightly controlled, static environments where transitions and stimuli are not admitted, with very narrow ranges of microclimatic parameters applied equally to all subjects. This neglects the potential implications of differences among users. Yet the long-term history of subjects and their climatic background play a pivotal role in their thermal sensations and preferences. In this work, to address this diversity, the authors analysed the existing database of the Smart Controls and Thermal Comfort (SCATS) project, which was built from monitoring and survey campaigns conducted in the late 1990s in five European countries. The data were studied with statistical techniques to identify and define the potential combined influence of climatic location, seasonal variation, subjective variables and ventilation mode on occupants' thermal feeling and preference. Different scenarios recommended by standard EN 16798 were tested to address the differences in the thermal feelings of users living in different European countries. Finally, country-based operative temperatures that optimize users' thermal feeling and preference were determined. The results show that users in different countries evaluate indoor thermal parameters differently, in terms of both thermal feeling and thermal preference. This leads to differences among countries in the acceptability levels associated with standardised indoor conditions. Furthermore, the results highlight the importance of air movement for improving acceptability at higher indoor temperatures in all the countries.

    The R Package metaLik for Likelihood Inference in Meta-Analysis

    Meta-analysis is a statistical method for combining information from different studies about the same issue of interest. It is widely used in medical research and has more recently attracted growing interest in the social sciences. Typical applications involve a small number of studies, which makes ordinary inferential methods based on first-order asymptotics unreliable. More accurate results can be obtained by exploiting the theory of higher-order asymptotics. This paper describes the metaLik package, which provides an R implementation of higher-order likelihood methods in meta-analysis. The extension to meta-regression is included. Two real data examples are used to illustrate the capabilities of the package.

    Gaussian Copula Marginal Regression

    This paper identifies and develops the class of Gaussian copula models for marginal regression analysis of non-normal dependent observations. The class provides a natural extension of traditional linear regression models with normal correlated errors. Continuous, discrete and categorical responses are all allowed. Dependence is conveniently modelled in terms of multivariate normal errors. Inference is performed through a likelihood approach. While the likelihood function is available in closed form for continuous responses, numerical approximations are used in the non-continuous setting. Residual analysis and a specification test are suggested for validating the adequacy of the assumed multivariate model. The methodology is implemented in an R package called gcmr. Illustrations include simulations and real data applications regarding time series, cross-design data, longitudinal studies, survival analysis and spatial regression.
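To make the idea of modelling dependence through multivariate normal errors more concrete, the following Python sketch simulates dependent Poisson counts. It rests on assumptions not stated in the abstract: each response is taken to be the marginal quantile function applied to the normal CDF of a latent error, the errors follow an AR(1) correlation structure, and the marginal mean is log-linear in a single covariate. All function and parameter names are hypothetical. This is not the gcmr package's API and it performs simulation only, not likelihood inference.

```python
import math
import random
from statistics import NormalDist


def poisson_quantile(u: float, mean: float) -> int:
    """Smallest k with P(Y <= k) >= u for Y ~ Poisson(mean)."""
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    # The cap on k guards against u values numerically equal to 1.
    while cdf < u and k < 10_000:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k


def simulate_copula_poisson(x, beta0, beta1, rho, seed=0):
    """Simulate counts with Poisson margins and AR(1) Gaussian copula errors.

    Assumed marginal model: E[Y_i] = exp(beta0 + beta1 * x_i).
    Assumed dependence: latent standard normal errors with corr rho^|i-j|.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    eps = rng.gauss(0.0, 1.0)
    counts = []
    for i, xi in enumerate(x):
        if i > 0:
            # AR(1) recursion that keeps each error marginally N(0, 1).
            eps = rho * eps + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        u = std_normal.cdf(eps)          # uniform score from the latent error
        mean = math.exp(beta0 + beta1 * xi)
        counts.append(poisson_quantile(u, mean))
    return counts


covariate = [t / 10 for t in range(20)]
sample = simulate_copula_poisson(covariate, beta0=0.5, beta1=0.8, rho=0.7, seed=1)
print(sample)
```

The regression coefficients describe the marginal mean regardless of the value of `rho`, which is the sense in which the regression is "marginal". The correlation parameter affects only how neighbouring counts move together.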

    Hybrid copula mixed models for combining case-control and cohort studies in meta-analysis of diagnostic tests

    Copula mixed models for trivariate (or bivariate) meta-analysis of diagnostic test accuracy studies accounting (or not) for disease prevalence have been proposed in the biostatistics literature to synthesize information. However, many systematic reviews include both case-control and cohort studies, so one must choose between the bivariate meta-analysis of the case-control studies and the trivariate meta-analysis of the cohort studies, as only the latter contain information on disease prevalence. To avoid wasting data in this way, we propose a hybrid copula mixed model that combines the bivariate copula mixed model for the data from the case-control studies with the trivariate copula mixed model for the data from the cohort studies. This hybrid model can therefore account for study design and, owing to its generality, can also deal with dependence in the joint tails. We apply the proposed hybrid copula mixed model to a review of the performance of contemporary diagnostic imaging modalities for detecting metastases in patients with melanoma.

    Plausibility functions and exact frequentist inference

    In the frequentist program, inferential methods with exact control on error rates are a primary focus. The standard approach, however, is to rely on asymptotic approximations, which may not be suitable. This paper presents a general framework for the construction of exact frequentist procedures based on plausibility functions. It is shown that the plausibility function-based tests and confidence regions have the desired frequentist properties in finite samples, with no large-sample justification needed. An extension of the proposed method is also given for problems involving nuisance parameters. Examples demonstrate that the plausibility function-based method is both exact and efficient in a wide variety of problems. Comment: 21 pages, 5 figures, 3 tables

    Halogen species record Antarctic sea ice extent over glacial–interglacial periods

    Sea ice is an integral part of the earth's climate system because it affects planetary albedo, sea-surface salinity, and the atmosphere–ocean exchange of reactive gases and aerosols. Bromine and iodine chemistry is active at polar sea ice margins, with the occurrence of bromine explosions and the biological production of organoiodine from sea ice algae. Satellite measurements demonstrate that concentrations of bromine oxide (BrO) and iodine oxide (IO) decrease over sea ice toward the Antarctic interior. Here we present speciation measurements of bromine and iodine in the TALDICE (TALos Dome Ice CorE) ice core (159°11' E, 72°49' S; 2315 m a.s.l.) spanning the last 215 ky. The Talos Dome ice core site is located 250 km inland and is sensitive to marine air masses intruding onto the Antarctic Plateau. Talos Dome bromide (Br−) is positively correlated with temperature and negatively correlated with sodium (Na). Relative to the Br−/Na seawater ratio, bromide is depleted in the ice during glacial periods and enriched during interglacial periods. Total iodine, consisting of iodide (I−) and iodate (IO3−), peaks during glacials, with lower values during interglacial periods. Although IO3− is considered the most stable iodine species in the atmosphere, it was only observed in the TALDICE record during glacial maxima. Sea ice dynamics are arguably the primary driver of halogen fluxes over glacial–interglacial timescales, by altering the distance between the sea ice edge and the Antarctic Plateau and by altering the surface area of sea ice available to algal colonization. Based on our results, we propose the use of both halogens for examining past variability in Antarctic sea ice extent.