104 research outputs found

    lp-Recovery of the Most Significant Subspace among Multiple Subspaces with Outliers

    We assume data sampled from a mixture of d-dimensional linear subspaces with spherically symmetric distributions within each subspace and an additional outlier component with spherically symmetric distribution within the ambient space (for simplicity we may assume that all distributions are uniform on their corresponding unit spheres). We also assume mixture weights for the different components. We say that one of the underlying subspaces of the model is most significant if its mixture weight is higher than the sum of the mixture weights of all other subspaces. We study the recovery of the most significant subspace by minimizing the lp-averaged distances of data points from d-dimensional subspaces, where p>0. Unlike other lp minimization problems, this minimization is non-convex for all p>0 and thus requires different methods for its analysis. We show that if 0<p<=1, then for any fraction of outliers the most significant subspace can be recovered by lp minimization with overwhelming probability (which depends on the generating distribution and its parameters). We show that when adding small noise around the underlying subspaces the most significant subspace can be nearly recovered by lp minimization for any 0<p<=1 with an error proportional to the noise level. On the other hand, if p>1 and there is more than one underlying subspace, then with overwhelming probability the most significant subspace cannot be recovered or nearly recovered. This last result does not require spherically symmetric outliers. Comment: This is a revised version of the part of 1002.1994 that deals with single subspace recovery. V3: Improved estimates (in particular for Lemma 3.1 and for estimates relying on it), asymptotic dependence of probabilities and constants on D and d and further clarifications; for simplicity it assumes uniform distributions on spheres. V4: minor revision for the published version.
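
    The contrast between p<=1 and p>1 can be illustrated with a toy sketch. It is not the paper's analysis: it assumes two lines (d=1) in the plane with no outliers, and it minimizes the lp-averaged distance by a hypothetical grid search over whole-degree angles. The helper names (`dist_to_line`, `lp_energy`, `best_angle_deg`, `on_line`) and the data are illustrative assumptions.

```python
import math

def dist_to_line(x, theta):
    """Distance from the 2-D point x to the line through the origin at angle theta."""
    v = (math.cos(theta), math.sin(theta))
    dot = x[0] * v[0] + x[1] * v[1]
    # Clamp at zero to absorb tiny negative values from rounding.
    return math.sqrt(max(0.0, x[0] ** 2 + x[1] ** 2 - dot ** 2))

def lp_energy(points, theta, p):
    """lp-averaged distance of the points from the line at angle theta."""
    return sum(dist_to_line(x, theta) ** p for x in points) / len(points)

def best_angle_deg(points, p):
    """Grid search over whole degrees in [0, 180); an assumption, not the paper's method."""
    return min(range(180), key=lambda deg: lp_energy(points, math.radians(deg), p))

def on_line(deg, sign):
    """A unit-norm point on the line at angle deg (sign picks the half-line)."""
    t = math.radians(deg)
    return (sign * math.cos(t), sign * math.sin(t))

# Five points on the line at 0 degrees (the most significant subspace here)
# and four points on the line at 60 degrees.
points = [on_line(0, s) for s in (1, 1, 1, -1, -1)] + \
         [on_line(60, s) for s in (1, 1, -1, -1)]

print("p = 1 minimizer (degrees):", best_angle_deg(points, 1))
print("p = 2 minimizer (degrees):", best_angle_deg(points, 2))
```

    On this toy data the p = 1 minimizer is the dominant line at 0 degrees, while the p = 2 minimizer falls strictly between the two lines and so recovers neither.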

    Distributionally robust L1-estimation in multiple linear regression

    Linear regression is one of the most important and widely used techniques in data analysis, for which a key step is the estimation of the unknown parameters. However, it is often carried out under the assumption that the full information of the error distribution is available. This is clearly unrealistic in practice. In this paper, we propose a distributionally robust formulation of the L1-estimation (or least absolute value estimation) problem, where the only knowledge on the error distribution is that it belongs to a well-defined ambiguity set. We then reformulate the estimation problem as a computationally tractable conic optimization problem by using duality theory. Finally, a numerical example is solved as a conic optimization problem to demonstrate the effectiveness of the proposed approach.
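
    For orientation, plain (non-robust) L1-estimation can be sketched in a few lines. This is not the distributionally robust conic formulation the abstract proposes: it assumes a single regressor with an intercept, and it relies on the standard fact that some least-absolute-deviations line passes through at least two data points. The function names and the data are hypothetical.

```python
from itertools import combinations

def lad_line(xs, ys):
    """Least-absolute-deviations fit of y ~ a + b*x by brute force over point pairs."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line is not a valid regression line
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        cost = sum(abs(y - (a + b * x)) for x, y in zip(xs, ys))
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best[1], best[2]

def ols_line(xs, ys):
    """Ordinary least-squares fit of y ~ a + b*x, for comparison."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# Illustrative data: y = 1 + 2x, with the last observation replaced by an outlier.
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 3, 5, 7, 9, 100]

print("L1 fit (a, b): ", lad_line(xs, ys))
print("OLS fit (a, b):", ols_line(xs, ys))
```

    The L1 fit returns the line through the five clean points, whereas the least-squares slope is pulled far upward by the single outlier. The brute-force search costs O(n^3) and is only meant for small illustrations.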

    Array algorithms for H^2 and H^∞ estimation

    Currently, the preferred method for implementing H^2 estimation algorithms is what is called the array form, which includes two main families: square-root array algorithms, which are typically more stable than conventional ones, and fast array algorithms, which, when the system is time-invariant, typically offer an order of magnitude reduction in the computational effort. Using our recent observation that H^∞ filtering coincides with Kalman filtering in Krein space, in this chapter we develop array algorithms for H^∞ filtering. These can be regarded as natural generalizations of their H^2 counterparts, and involve propagating the indefinite square roots of the quantities of interest. The H^∞ square-root and fast array algorithms both have the interesting feature that one does not need to explicitly check for the positivity conditions required for the existence of H^∞ filters. These conditions are built into the algorithms themselves so that an H^∞ estimator of the desired level exists if, and only if, the algorithms can be executed. However, since H^∞ square-root algorithms predominantly use J-unitary transformations, rather than the unitary transformations required in the H^2 case, further investigation is needed to determine the numerical behavior of such algorithms.

    Ensembles of nested dichotomies with multiple subset evaluation

    A system of nested dichotomies (NDs) is a method of decomposing a multiclass problem into a collection of binary problems. Such a system recursively applies binary splits to divide the set of classes into two subsets, and trains a binary classifier for each split. Many methods have been proposed to perform this split, each with various advantages and disadvantages. In this paper, we present a simple, general method for improving the predictive performance of NDs produced by any subset selection technique that employs randomness to construct the subsets. We provide a theoretical expectation for performance improvements, as well as empirical results showing that our method improves the root mean squared error of NDs, regardless of whether they are employed as an individual model or in an ensemble setting.
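
    The recursive class-splitting structure described above can be sketched as follows. This only builds the tree of splits with one simple random scheme (shuffle, then cut at a random position); it does not train binary classifiers and does not implement the paper's multiple subset evaluation. The tuple-based tree representation and all function names are assumptions made for illustration, and class labels are assumed not to be tuples themselves.

```python
import random

def random_nested_dichotomy(classes, rng):
    """Recursively split a set of class labels into two non-empty subsets.

    Returns a nested tuple (left, right); a leaf is a single class label.
    """
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)  # both sides stay non-empty
    return (random_nested_dichotomy(classes[:cut], rng),
            random_nested_dichotomy(classes[cut:], rng))

def leaves(tree):
    """All class labels under a node."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

def count_splits(tree):
    """Number of internal nodes, i.e. binary problems to train."""
    if isinstance(tree, tuple):
        return 1 + count_splits(tree[0]) + count_splits(tree[1])
    return 0

labels = ["a", "b", "c", "d", "e"]
tree = random_nested_dichotomy(labels, random.Random(0))
print(tree)
print("binary problems:", count_splits(tree))
```

    Whatever the random draws, a system built this way over k classes contains k - 1 binary problems, one per internal node, each separating the classes in its left subtree from those in its right subtree.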

    Psychometric Properties of the Parent and Teacher Versions of the Strengths and Difficulties Questionnaire for 4- to 12-Year-Olds: A Review

    Since its development, the Strengths and Difficulties Questionnaire (SDQ) has been widely used in both research and practice. The SDQ screens for positive and negative psychological attributes. This review aims to provide an overview of the psychometric properties of the SDQ for 4- to 12-year-olds. Results from 48 studies (N = 131,223) on reliability and validity of the parent and teacher SDQ are summarized quantitatively and descriptively. Internal consistency, test–retest reliability, and inter-rater agreement are satisfactory for the parent and teacher versions. At subscale level, the reliability of the teacher version seemed stronger compared to that of the parent version. Concerning validity, 15 out of 18 studies confirmed the five-factor structure. Correlations with other measures of psychopathology as well as the screening ability of the SDQ are sufficient. This review shows that the psychometric properties of the SDQ are strong, particularly for the teacher version. For practice, this implies that the use of the SDQ as a screening instrument should be continued. Longitudinal research studies should investigate predictive validity. For both practice and research, we emphasize the use of a multi-informant approach

    Molecular mechanisms of cell death: recommendations of the Nomenclature Committee on Cell Death 2018.

    Over the past decade, the Nomenclature Committee on Cell Death (NCCD) has formulated guidelines for the definition and interpretation of cell death from morphological, biochemical, and functional perspectives. Since the field continues to expand and novel mechanisms that orchestrate multiple cell death pathways are unveiled, we propose an updated classification of cell death subroutines focusing on mechanistic and essential (as opposed to correlative and dispensable) aspects of the process. As we provide molecularly oriented definitions of terms including intrinsic apoptosis, extrinsic apoptosis, mitochondrial permeability transition (MPT)-driven necrosis, necroptosis, ferroptosis, pyroptosis, parthanatos, entotic cell death, NETotic cell death, lysosome-dependent cell death, autophagy-dependent cell death, immunogenic cell death, cellular senescence, and mitotic catastrophe, we discuss the utility of neologisms that refer to highly specialized instances of these processes. The mission of the NCCD is to provide a widely accepted nomenclature on cell death in support of the continued development of the field

    DNA methylation-based classification of central nervous system tumours.

    Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.

    Assessments Related to the Physical, Affective and Cognitive Domains of Physical Literacy Amongst Children Aged 7–11.9 Years: A Systematic Review

    Background: Over the past decade, there has been increased interest amongst researchers, practitioners and policymakers in physical literacy for children and young people and the assessment of the concept within physical education (PE). This systematic review aimed to identify tools to assess physical literacy and its physical, cognitive and affective domains within children aged 7–11.9 years, and to examine the measurement properties, feasibility and elements of physical literacy assessed within each tool. Methods: Six databases (EBSCO host platform, MEDLINE, PsycINFO, Scopus, Education Research Complete, SPORTDiscus) were searched up to 10th September 2020. Studies were included if they sampled children aged between 7 and 11.9 years, employed field-based assessments of physical literacy and/or related affective, physical or cognitive domains, reported measurement properties (quantitative) or theoretical development (qualitative), and were published in English in peer-reviewed journals. The methodological quality and measurement properties of studies and assessment tools were appraised using the COnsensus-based Standards for the selection of health Measurement INstruments risk of bias checklist. The feasibility of each assessment was considered using a utility matrix, and elements of physical literacy were recorded using a descriptive checklist. Results: The search strategy resulted in a total of 11467 initial results. After full text screening, 11 studies (3 assessments) related to explicit physical literacy assessments. Forty-four studies (32 assessments) were relevant to the affective domain, 31 studies (15 assessments) were relevant to the physical domain and 2 studies (2 assessments) were included within the cognitive domain. Methodological quality and reporting of measurement properties within the included studies were mixed. The Canadian Assessment of Physical Literacy-2 and the Passport For Life had evidence of acceptable measurement properties from studies of very good methodological quality and assessed a wide range of physical literacy elements. Feasibility results indicated that many tools would be suitable for a primary PE setting, though some require a level of expertise to administer and score that would require training. Conclusions: This review has identified a number of existing assessments that could be useful in a physical literacy assessment approach within PE and provides further information to empower researchers and practitioners to make informed decisions when selecting the most appropriate assessment for their needs, purpose and context. The review indicates that researchers and tool developers should aim to improve the methodological quality and reporting of measurement properties of assessments to better inform the field. Trial registration PROSPERO: CRD4201706221

    Peripheral quantitative computed tomography (pQCT) for the assessment of bone strength in most of bone affecting conditions in developmental age: a review
