15 research outputs found

    Quantile regression with interval-censored data in questionnaire-based studies

    Interval-censored data can arise in questionnaire-based studies when the respondent gives an answer in the form of an interval without having pre-specified ranges. Such data are called self-selected interval data. In this case, the assumption of independent censoring is not fulfilled, and therefore the ordinary methods for interval-censored data are not suitable. This paper explores a quantile regression model for self-selected interval data and suggests an estimator based on estimating equations. The consistency of the estimator is shown. Bootstrap procedures for constructing confidence intervals are considered. A simulation study indicates satisfactory performance of the proposed methods. An application to data concerning price estimates is presented

    Identification of novel esterase-active enzymes from hot environments by use of the host bacterium Thermus thermophilus

    Functional metagenomic screening strategies, which are independent of known sequence information, can lead to the identification of truly novel genes and enzymes. Since E. coli has been used exhaustively for this purpose as a host, it is important to establish alternative expression hosts and to use them for functional metagenomic screening for new enzymes. In this study we show that Thermus thermophilus HB27 is an excellent screening host and can be used as an alternative provider of truly novel biocatalysts. In a previous study we constructed the mutant strain BL03 that was no longer able to grow on defined minimal medium supplemented with tributyrin as the sole carbon source and could be used as a host to screen for metagenomic DNA fragments that could complement growth on tributyrin. Several thousand single fosmid clones from thermophilic metagenomic libraries from heated compost and hot spring water samples were subjected to a comparative screening for esterase activity in both T. thermophilus strain BL03 and E. coli EPI300. We scored a greater number of active clones in the thermophilic bacterium than in the mesophilic E. coli. From all clones functionally screened in E. coli, only two thermostable α/β-fold hydrolase enzymes with high amino acid sequence similarity to already characterized enzymes were identifiable. In contrast, five further fosmids were found that conferred lipolytic activities in T. thermophilus. Four open reading frames (ORFs) were found which did not share significant similarity to known esterase enzymes. Two of the genes were expressed in both hosts and the novel thermophilic esterases, which based on their primary structures could not be assigned to known esterase or lipase families, were purified and preliminarily characterized. Our work underscores the benefit of using additional screening hosts other than E. coli for the identification of novel biocatalysts with industrial relevance

    The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance

    INTRODUCTION Investment in Africa over the past year with regard to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing has led to a massive increase in the number of sequences, which, to date, exceeds 100,000 sequences generated to track the pandemic on the continent. These sequences have profoundly affected how public health officials in Africa have navigated the COVID-19 pandemic. RATIONALE We demonstrate how the first 100,000 SARS-CoV-2 sequences from Africa have helped monitor the epidemic on the continent, how genomic surveillance expanded over the course of the pandemic, and how we adapted our sequencing methods to deal with an evolving virus. Finally, we also examine how viral lineages have spread across the continent in a phylogeographic framework to gain insights into the underlying temporal and spatial transmission dynamics for several variants of concern (VOCs). RESULTS Our results indicate that the number of countries in Africa that can sequence the virus within their own borders is growing and that this is coupled with a shorter turnaround time from the time of sampling to sequence submission. Ongoing evolution necessitated the continual updating of primer sets, and, as a result, eight primer sets were designed in tandem with viral evolution and used to ensure effective sequencing of the virus. The pandemic unfolded through multiple waves of infection that were each driven by distinct genetic lineages, with B.1-like ancestral strains associated with the first pandemic wave of infections in 2020. Successive waves on the continent were fueled by different VOCs, with Alpha and Beta cocirculating in distinct spatial patterns during the second wave and Delta and Omicron affecting the whole continent during the third and fourth waves, respectively. 
Phylogeographic reconstruction points toward distinct differences in viral importation and exportation patterns associated with the Alpha, Beta, Delta, and Omicron variants and subvariants, when considering both Africa versus the rest of the world and viral dissemination within the continent. Our epidemiological and phylogenetic inferences therefore underscore the heterogeneous nature of the pandemic on the continent and highlight key insights and challenges, for instance, recognizing the limitations of low testing proportions. We also highlight the early warning capacity that genomic surveillance in Africa has had for the rest of the world with the detection of new lineages and variants, the most recent being the characterization of various Omicron subvariants. CONCLUSION Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve. This is important not only to help combat SARS-CoV-2 on the continent but also because it can be used as a platform to help address the many emerging and reemerging infectious disease threats in Africa. In particular, capacity building for local sequencing within countries or within the continent should be prioritized because this is generally associated with shorter turnaround times, providing the most benefit to local public health authorities tasked with pandemic response and mitigation and allowing for the fastest reaction to localized outbreaks. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century

    Methods for interval-censored data and tests of stochastic dominance (Metoder för intervallcensurerade data och test av stokastisk dominans)

    This thesis includes four papers: the first three of them are concerned with methods for interval-censored data, while the fourth paper is devoted to testing for stochastic dominance. In many studies, the variable of interest is observed to lie within an interval instead of being observed exactly, i.e., each observation is an interval and not a single value. This type of data is known as interval-censored. It may arise in questionnaire-based studies when the respondent gives an answer in the form of an interval without having pre-specified ranges. Such data are called self-selected interval data. In this context, the assumption of noninformative censoring is not fulfilled, and therefore the existing methods for interval-censored data are not necessarily applicable. A problem of interest is to estimate the underlying distribution function. There are two main approaches to this problem: (i) parametric estimation, which assumes a particular functional form of the distribution, and (ii) nonparametric estimation, which does not rely on any distributional assumptions. In Paper A, a nonparametric maximum likelihood estimator for self-selected interval data is proposed and its consistency is shown. Paper B suggests a parametric maximum likelihood estimator. The consistency and asymptotic normality of the estimator are proven. Another interesting problem is to infer whether two samples arise from identical distributions. In Paper C, nonparametric two-sample tests suitable for self-selected interval data are suggested and their properties are investigated through simulations. Paper D concerns testing for stochastic dominance with uncensored data. The paper explores a testing problem which involves four hypotheses, that is, based on observations of two random variables X and Y, one wants to discriminate between four possibilities: identical survival functions, stochastic dominance of X over Y, stochastic dominance of Y over X, or crossing survival functions. 
Permutation-based tests suitable for two independent samples and for paired samples are proposed. The tests are applied to data from an experiment concerning the individual's willingness to pay for a given environmental improvement

    Tests of stochastic dominance with repeated measurements data

    The paper explores a testing problem which involves four hypotheses, that is, based on observations of two random variables X and Y, we wish to discriminate between four possibilities: identical survival functions, stochastic dominance of X over Y, stochastic dominance of Y over X, or crossing survival functions. Four-decision testing procedures for repeated measurements data are proposed. The tests are based on a permutation approach and do not rely on distributional assumptions. One-sided versions of the Cramér–von Mises, Anderson–Darling, and Kolmogorov–Smirnov statistics are utilized. The consistency of the tests is proven. A simulation study shows good power properties and control of false-detection errors. The suggested tests are applied to data from a psychophysical experiment

    Maximum likelihood estimation for survey data with informative interval censoring

    Interval-censored data may arise in questionnaire surveys when, instead of being asked to provide an exact value, respondents are free to answer with any interval without having pre-specified ranges. In this context, the assumption of noninformative censoring is violated, and thus, the standard methods for interval-censored data are not appropriate. This paper explores two schemes for data collection and deals with the problem of estimation of the underlying distribution function, assuming that it belongs to a parametric family. The consistency and asymptotic normality of a proposed maximum likelihood estimator are proven. A bootstrap procedure that can be used for constructing confidence intervals is considered, and its asymptotic validity is shown. A simulation study investigates the performance of the suggested methods

    Nonparametric estimation for self-selected interval data collected through a two-stage approach

    Self-selected interval data arise in questionnaire surveys when respondents are free to answer with any interval without having pre-specified ranges. This type of data is a special case of interval-censored data in which the assumption of noninformative censoring is violated, and thus the standard methods for interval-censored data (e.g. Turnbull's estimator) are not appropriate because they can produce biased results. Based on a certain sampling scheme, this paper suggests a nonparametric maximum likelihood estimator of the underlying distribution function. The consistency of the estimator is proven under general assumptions, and an iterative procedure for finding the estimate is proposed. The performance of the method is investigated in a simulation study

    Four-decision tests for stochastic dominance, with an application to environmental psychophysics

    If the survival function of a random variable X lies to the right of the survival function of a random variable Y, then X is said to stochastically dominate Y. Inferring stochastic dominance is particularly complicated because comparing survival functions raises four possible hypotheses: identical survival functions, dominance of X over Y, dominance of Y over X, or crossing survival functions. In this paper, we suggest four-decision tests for stochastic dominance suitable for paired samples. The tests are permutation-based and do not rely on distributional assumptions. One-sided Cramér–von Mises and Kolmogorov–Smirnov statistics are employed but the general idea may be utilized with other test statistics. The power to detect dominance and the different types of wrong decisions are investigated in an extensive simulation study. The proposed tests are applied to data from an experiment concerning the individual’s willingness to pay for a given environmental improvement
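
The four-decision idea described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names (`one_sided_ks`, `four_decision_paired`), the within-pair swapping scheme, the significance level, and the simple rule that combines two one-sided p-values are all assumptions made for illustration. It only shows how two one-sided Kolmogorov–Smirnov statistics and a permutation reference distribution could lead to one of four outcomes for paired samples.

```python
import numpy as np


def one_sided_ks(x, y):
    """Return max_t (F_y(t) - F_x(t)), which is large when x tends to exceed y."""
    grid = np.sort(np.concatenate([x, y]))
    f_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    f_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(f_y - f_x))


def four_decision_paired(x, y, alpha=0.05, n_perm=2000, seed=0):
    """Illustrative four-decision rule for paired samples (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)

    obs_xy = one_sided_ks(x, y)  # evidence that X dominates Y
    obs_yx = one_sided_ks(y, x)  # evidence that Y dominates X

    count_xy = 0
    count_yx = 0
    for _ in range(n_perm):
        # Under identical distributions, labels within a pair are exchangeable,
        # so each pair is swapped independently with probability 1/2.
        swap = rng.random(len(x)) < 0.5
        xp = np.where(swap, y, x)
        yp = np.where(swap, x, y)
        count_xy += int(one_sided_ks(xp, yp) >= obs_xy)
        count_yx += int(one_sided_ks(yp, xp) >= obs_yx)

    # Permutation p-values with the usual +1 correction.
    p_xy = (count_xy + 1) / (n_perm + 1)
    p_yx = (count_yx + 1) / (n_perm + 1)

    sig_xy = p_xy < alpha
    sig_yx = p_yx < alpha
    if sig_xy and sig_yx:
        return "crossing survival functions"
    if sig_xy:
        return "X dominates Y"
    if sig_yx:
        return "Y dominates X"
    return "no difference detected"


if __name__ == "__main__":
    y = np.arange(30, dtype=float)
    print(four_decision_paired(y + 100.0, y))
```

The combination rule here (both one-sided tests significant means crossing, exactly one significant means dominance) is the simplest possible choice; how the papers listed above control the different types of wrong decisions is not stated in the abstracts and is not reproduced by this sketch.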