8,206 research outputs found

    Maximum likelihood estimation using composite likelihoods for closed exponential families

    In certain multivariate problems the full probability density has an awkward normalizing constant, but the conditional and/or marginal distributions may be much more tractable. In this paper we investigate the use of composite likelihoods instead of the full likelihood. For closed exponential families, both are shown to be maximized by the same parameter values for any number of observations. Examples include log-linear models and multivariate normal models. In other cases the parameter estimate obtained by maximizing a composite likelihood can be viewed as an approximation to the full maximum likelihood estimate. An application is given to an example in directional data based on a bivariate von Mises distribution.
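
To make the contrast in this abstract concrete, the block below writes out the full log-likelihood next to two common composite versions (conditional and marginal). The notation is an assumption introduced here for illustration, not taken from the paper: $x_i = (x_{i1}, \dots, x_{ip})$ is the $i$-th of $n$ observations, $x_{i,-j}$ is $x_i$ with its $j$-th component removed, and $f$ denotes the relevant density.

```latex
% Full log-likelihood: needs the joint density, including its normalizing constant
\ell(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta)

% Conditional composite log-likelihood: uses only full conditional densities
\ell_C(\theta) = \sum_{i=1}^{n} \sum_{j=1}^{p} \log f(x_{ij} \mid x_{i,-j}; \theta)

% Marginal composite log-likelihood: uses only univariate marginal densities
\ell_M(\theta) = \sum_{i=1}^{n} \sum_{j=1}^{p} \log f(x_{ij}; \theta)
```

The normalizing constant of the joint density does not appear in $\ell_C$ or $\ell_M$ directly, which is why they can be easier to maximize when the conditionals or marginals are tractable.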

    Isolation of environmental lignin-degrading bacteria and identification of extracellular enzymes

    A novel screening method for detecting lignin-degradation activity on agar plates was developed using nitrated lignin. Using this method, ten lignin-degrading bacteria have been isolated from environmental sources, including seven mesophilic soil bacteria and three thermotolerant strains from composted wheat straw. All of the isolates have demonstrated activity towards lignin degradation in the assays, the most active strain being a thermotolerant Sphingobacterium strain from the Bacteroidetes family. The ability of each strain to degrade a variety of aromatic carbon sources and size-fractionated Kraft lignin has been examined by laboratory-scale growth experiments and gel filtration chromatography respectively, and the bioconversion of different lignin-containing feedstocks by three of the most active strains has been examined in a series of laboratory-scale fermentation experiments. Purification of extracellular lignin-degrading enzymes from the culture supernatant of Sphingobacterium sp. has highlighted several different enzyme activities and possible lignin-degrading enzymes.

    Forests of Stumps

    Many numerical studies (Hansen and Salamon (1990), Schapire (1990)) indicate that bagged decision stumps perform more accurately than a single stump. In this work, we investigate two approaches to creating a forest of stumps for classification. The first method is bagging with stumps, that is, growing each stump on a different bootstrap sample drawn from the training dataset. The second method is Gini-sampled stumps, where we sample split points with probability proportional to the Gini index. These two methods are combined with two aggregation methods: majority vote and weighted vote. We use simulation studies to compare the accuracy and computing time of the two methods. Generating split points by Gini-sampled stumps takes less than half the computing time needed to generate split points from bootstrap samples. Also, weighted vote aggregation is more accurate than majority vote aggregation.
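
The first method in this abstract (bagging with stumps, aggregated by majority vote) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single numeric feature, picks each split by minimising weighted Gini impurity, and all function names (`gini`, `fit_stump`, `bagged_stumps`, `predict_majority`) are hypothetical. The Gini-sampled variant and the weighted vote are left out because the abstract does not specify their details.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def fit_stump(xs, ys):
    """Fit a one-split stump on one feature by minimising weighted Gini impurity.

    Returns (threshold, left_label, right_label); threshold is None when the
    sample has a single distinct x value and no split is possible.
    """
    values = sorted(set(xs))
    majority = Counter(ys).most_common(1)[0][0]
    best = (float("inf"), None, majority, majority)
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0  # candidate split: midpoint of adjacent values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[0]:
            best = (
                score,
                t,
                Counter(left).most_common(1)[0][0],
                Counter(right).most_common(1)[0][0],
            )
    return best[1:]


def predict_stump(stump, x):
    """Predict the class label of x with a single fitted stump."""
    t, left_label, right_label = stump
    if t is None:
        return left_label
    return left_label if x <= t else right_label


def bagged_stumps(xs, ys, n_stumps=25, seed=0):
    """Grow each stump on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_stumps):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_majority(forest, x):
    """Aggregate the stump predictions by unweighted majority vote."""
    votes = Counter(predict_stump(s, x) for s in forest)
    return votes.most_common(1)[0][0]
```

A call such as `predict_majority(bagged_stumps(xs, ys), x_new)` returns the most frequent label among the stumps. A weighted vote would replace the plain count with per-stump weights, for example based on training accuracy, which is one possible choice rather than the paper's.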

    Statistical analysis of particulate matter data in Doha, Qatar

    Pollution in Doha is measured using passive, active and automatic sampling. In this paper we consider automatically sampled data, in which various pollutants were continually collected and analysed every hour. At each station the sample is analysed on-line and in real time, and the data is stored within the analyser or a separate logger, so it can be downloaded remotely by a modem. The accuracy produced enables pollution episodes to be analysed in detail and related to traffic flows, meteorology and other variables. Data has been collected hourly over more than 6 years at 3 different locations, with measurements available for various pollutants – for example, ozone, nitrogen oxides, sulphur dioxide, carbon monoxide, THC, methane and particulate matter (PM1.0, PM2.5 and PM10), as well as meteorological data such as humidity, temperature, and wind speed and direction. Despite much care in the data collection process, the resultant data has long stretches of missing values from periods when the equipment malfunctioned – often as a result of more extreme conditions. Our analysis is twofold. Firstly, we consider ways to “clean” the data by imputing missing values, including identified outliers. The second aspect specifically considers prediction of each particulate (PM1.0, PM2.5 and PM10) 24 hours ahead, using current (and previous) pollution and meteorological data. In this case, we use vector autoregressive models, compare them with decision trees, and propose variable selection criteria which explicitly adapt to missing data. Our results show that the regression tree models, with no variable transformations, perform the best, and that attempts to impute missing values are hampered by non-random missingness.
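
As a rough illustration of the 24-hours-ahead setup described in this abstract, the sketch below turns one hourly series with gaps into (lagged predictors, target) pairs that a regression model could be trained on. It is a sketch under stated assumptions only: the function name `make_supervised`, the use of `None` for a missing reading, and the complete-case rule (dropping any pair that touches a missing value) are choices made here for illustration and are not the paper's variable selection criteria.

```python
def make_supervised(series, horizon=24, n_lags=2):
    """Build (lags, target) pairs for forecasting `horizon` steps ahead.

    series  : list of hourly readings, with None marking a missing value
    horizon : how many hours ahead the target lies
    n_lags  : number of current-and-previous readings used as predictors

    Pairs that involve any missing value are skipped (complete-case handling).
    """
    rows = []
    for t in range(n_lags - 1, len(series) - horizon):
        lags = [series[t - k] for k in range(n_lags)]  # current value first
        target = series[t + horizon]
        if target is None or any(v is None for v in lags):
            continue
        rows.append((lags, target))
    return rows
```

With long stretches of missing data, a rule like this can discard a large share of the training pairs, which is one reason criteria that adapt to missingness are attractive.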

    Local polynomial regression for circular predictors

    We consider local smoothing of datasets where the design space is the d-dimensional (d >= 1) torus and the response variable is real-valued. Our purpose is to extend least squares local polynomial fitting to this situation. We give both theoretical and empirical results
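
For the one-dimensional case (d = 1), a local linear fit on the circle can be sketched as below. This is a minimal illustration, assuming von Mises kernel weights with concentration `kappa` and using sin(theta_i - theta0) as the local linear term so that the fit respects periodicity. These choices, and the function name `circular_local_linear`, are assumptions made here and not a statement of the paper's exact estimator.

```python
import math


def circular_local_linear(theta0, thetas, ys, kappa=5.0):
    """Estimate the regression function at angle theta0 (radians).

    Solves the weighted least squares problem
        min over (b0, b1) of  sum_i w_i * (y_i - b0 - b1 * sin(theta_i - theta0))**2
    with von Mises kernel weights w_i, and returns the intercept b0.
    """
    # Unnormalised von Mises weights; the normalising constant cancels in the fit.
    w = [math.exp(kappa * (math.cos(t - theta0) - 1.0)) for t in thetas]
    s = [math.sin(t - theta0) for t in thetas]

    sw = sum(w)
    sws = sum(wi * si for wi, si in zip(w, s))
    swss = sum(wi * si * si for wi, si in zip(w, s))
    swy = sum(wi * yi for wi, yi in zip(w, ys))
    swsy = sum(wi * si * yi for wi, si, yi in zip(w, s, ys))

    det = sw * swss - sws ** 2
    if abs(det) < 1e-12:
        # Degenerate design: fall back to the local constant (weighted mean) fit.
        return swy / sw
    # Intercept of the 2x2 normal equations.
    return (swss * swy - sws * swsy) / det
```

Larger values of `kappa` concentrate the weights near `theta0` and play the role of a smaller bandwidth; how to choose it is not addressed in this sketch.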

    Nonparametric regression for spherical data

    Circular local likelihood

    The package: nonparametric regression using local rotation matrices in R

    The package implements nonparametric (smooth) regression for spherical data in R, and is freely available from the Comprehensive R Archive Network (CRAN), licensed under the MIT License. It can be use..