104 research outputs found
lp-Recovery of the Most Significant Subspace among Multiple Subspaces with Outliers
We assume data sampled from a mixture of d-dimensional linear subspaces with
spherically symmetric distributions within each subspace and an additional
outlier component with spherically symmetric distribution within the ambient
space (for simplicity we may assume that all distributions are uniform on their
corresponding unit spheres). We also assume mixture weights for the different
components. We say that one of the underlying subspaces of the model is most
significant if its mixture weight is higher than the sum of the mixture weights
of all other subspaces. We study the recovery of the most significant subspace
by minimizing the lp-averaged distances of data points from d-dimensional
subspaces, where p>0. Unlike other lp minimization problems, this minimization
is non-convex for all p>0 and thus requires different methods for its analysis.
We show that if 0<p<=1, then for any fraction of outliers the most significant
subspace can be recovered by lp minimization with overwhelming probability
(which depends on the generating distribution and its parameters). We show that
when adding small noise around the underlying subspaces the most significant
subspace can be nearly recovered by lp minimization for any 0<p<=1 with an
error proportional to the noise level. On the other hand, if p>1 and there is
more than one underlying subspace, then with overwhelming probability the most
significant subspace cannot be recovered or nearly recovered. This last result
does not require spherically symmetric outliers.
Comment: This is a revised version of the part of 1002.1994 that deals with
single subspace recovery. V3: Improved estimates (in particular for Lemma 3.1
and for estimates relying on it), asymptotic dependence of probabilities and
constants on D and d and further clarifications; for simplicity it assumes
uniform distributions on spheres. V4: minor revision for the published
version.
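The lp-averaged distance objective described in this abstract can be illustrated with a toy sketch. It is not the authors' analysis or algorithm. It assumes the smallest possible setting (lines through the origin in the plane, so d = 1 and D = 2), no outliers and no noise, and a brute-force search over a 1-degree grid of candidate lines. The names `lp_cost`, `unit`, and `best_angle` and the sample points are hypothetical.

```python
import math

def lp_cost(theta_deg, points, p):
    """Average of dist(x, L)^p, where L is the line through the origin at angle theta."""
    t = math.radians(theta_deg)
    ux, uy = math.cos(t), math.sin(t)
    total = 0.0
    for x, y in points:
        # Distance from (x, y) to the line spanned by the unit vector (ux, uy).
        dist = abs(x * uy - y * ux)
        total += dist ** p
    return total / len(points)

def unit(angle_deg):
    """Point on the unit circle at the given angle."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))

def best_angle(points, p):
    """Grid search over candidate lines at 0, 1, ..., 179 degrees."""
    return min(range(180), key=lambda th: lp_cost(th, points, p))

# Assumed toy data: 5 points on the line at 0 degrees (the most significant
# subspace) and 4 points on a second line at 60 degrees.
points = [unit(0)] * 5 + [unit(60)] * 4

angle_p1 = best_angle(points, 1)  # minimizer of the l1-averaged distance
angle_p2 = best_angle(points, 2)  # minimizer of the l2-averaged distance
print(angle_p1, angle_p2)
```

On this toy data the p = 1 minimizer coincides with the line at 0 degrees, while the p = 2 minimizer falls between the two lines, which matches the qualitative contrast between 0<p<=1 and p>1 stated in the abstract.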
Distributionally robust L1-estimation in multiple linear regression
Linear regression is one of the most important and widely used techniques in data analysis, for which a key step is the estimation of the unknown parameters. However, it is often carried out under the assumption that complete information about the error distribution is available. This is clearly unrealistic in practice. In this paper, we propose a distributionally robust formulation of the L1-estimation (or least absolute value estimation) problem, where the only knowledge about the error distribution is that it belongs to a well-defined ambiguity set. We then reformulate the estimation problem as a computationally tractable conic optimization problem by using duality theory. Finally, a numerical example is solved as a conic optimization problem to demonstrate the effectiveness of the proposed approach.
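For orientation, plain (non-robust) L1-estimation for a single regressor can be sketched as follows. This is not the distributionally robust conic formulation proposed in the paper. It relies on the standard fact that some optimal least-absolute-value line passes through at least two data points, so for small samples all pairs can be enumerated. The function name `lad_line` and the data are hypothetical.

```python
from itertools import combinations

def lad_line(xs, ys):
    """Return (cost, slope, intercept) minimizing the sum of absolute residuals.

    Brute force: try every line through two data points with distinct x.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical line is not a valid regression line
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        cost = sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    return best

# Assumed toy data: y = 2x + 1 with one gross outlier at x = 4.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 30]
cost, slope, intercept = lad_line(xs, ys)
print(cost, slope, intercept)
```

On this data the L1 fit follows the four collinear points and leaves the whole error on the outlier, which is the robustness property that motivates least absolute value estimation.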
Array algorithms for H^2 and H^∞ estimation
Currently, the preferred method for implementing H^2 estimation algorithms is the so-called array form, which includes two main families: square-root array algorithms, which are typically more stable than conventional ones, and fast array algorithms, which, when the system is time-invariant, typically offer an order-of-magnitude reduction in computational effort. Using our recent observation that H^∞ filtering coincides with Kalman filtering in Krein space, in this chapter we develop array algorithms for H^∞ filtering. These can be regarded as natural generalizations of their H^2 counterparts, and involve propagating the indefinite square roots of the quantities of interest. The H^∞ square-root and fast array algorithms both have the interesting feature that one does not need to explicitly check for the positivity conditions required for the existence of H^∞ filters. These conditions are built into the algorithms themselves so that an H^∞ estimator of the desired level exists if, and only if, the algorithms can be executed. However, since H^∞ square-root algorithms predominantly use J-unitary transformations, rather than the unitary transformations required in the H^2 case, further investigation is needed to determine the numerical behavior of such algorithms.
Ensembles of nested dichotomies with multiple subset evaluation
A system of nested dichotomies (NDs) is a method of decomposing a multiclass problem into a collection of binary problems. Such a system recursively applies binary splits to divide the set of classes into two subsets, and trains a binary classifier for each split. Many methods have been proposed to perform this split, each with various advantages and disadvantages. In this paper, we present a simple, general method for improving the predictive performance of NDs produced by any subset selection technique that employs randomness to construct the subsets. We provide a theoretical expectation for performance improvements, as well as empirical results showing that our method improves the root mean squared error of NDs, regardless of whether they are employed as an individual model or in an ensemble setting.
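The recursive decomposition described in this abstract can be sketched as below. The sketch only builds a random nested dichotomy and lists the binary problems it induces. It does not train classifiers and does not implement the paper's improvement method. The random split used here is one simple choice among many, and the names `build_nd`, `leaves`, and `binary_problems` are hypothetical.

```python
import random

def build_nd(classes, rng):
    """Build a random nested dichotomy as a nested tuple tree.

    A leaf is a single class label (a string); an internal node is a
    (left_subtree, right_subtree) tuple.
    """
    classes = list(classes)
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)  # keeps both subsets non-empty
    return (build_nd(classes[:cut], rng), build_nd(classes[cut:], rng))

def leaves(tree):
    """Class labels below a node."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

def binary_problems(tree):
    """One (left classes, right classes) pair per internal node."""
    if not isinstance(tree, tuple):
        return []
    here = (sorted(leaves(tree[0])), sorted(leaves(tree[1])))
    return [here] + binary_problems(tree[0]) + binary_problems(tree[1])

rng = random.Random(0)  # fixed seed so the sketch is reproducible
tree = build_nd(["a", "b", "c", "d", "e"], rng)
problems = binary_problems(tree)
for left, right in problems:
    print(left, "vs", right)
```

A dichotomy over k classes always yields k - 1 binary problems, one per internal node, so the five classes above produce four splits regardless of the seed.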
Psychometric Properties of the Parent and Teacher Versions of the Strengths and Difficulties Questionnaire for 4- to 12-Year-Olds: A Review
Since its development, the Strengths and Difficulties Questionnaire (SDQ) has been widely used in both research and practice. The SDQ screens for positive and negative psychological attributes. This review aims to provide an overview of the psychometric properties of the SDQ for 4- to 12-year-olds. Results from 48 studies (N = 131,223) on reliability and validity of the parent and teacher SDQ are summarized quantitatively and descriptively. Internal consistency, test–retest reliability, and inter-rater agreement are satisfactory for the parent and teacher versions. At subscale level, the reliability of the teacher version appeared stronger than that of the parent version. Concerning validity, 15 out of 18 studies confirmed the five-factor structure. Correlations with other measures of psychopathology as well as the screening ability of the SDQ are sufficient. This review shows that the psychometric properties of the SDQ are strong, particularly for the teacher version. For practice, this implies that the use of the SDQ as a screening instrument should be continued. Longitudinal research studies should investigate predictive validity. For both practice and research, we emphasize the use of a multi-informant approach.
Molecular mechanisms of cell death: recommendations of the Nomenclature Committee on Cell Death 2018.
Over the past decade, the Nomenclature Committee on Cell Death (NCCD) has formulated guidelines for the definition and interpretation of cell death from morphological, biochemical, and functional perspectives. Since the field continues to expand and novel mechanisms that orchestrate multiple cell death pathways are unveiled, we propose an updated classification of cell death subroutines focusing on mechanistic and essential (as opposed to correlative and dispensable) aspects of the process. As we provide molecularly oriented definitions of terms including intrinsic apoptosis, extrinsic apoptosis, mitochondrial permeability transition (MPT)-driven necrosis, necroptosis, ferroptosis, pyroptosis, parthanatos, entotic cell death, NETotic cell death, lysosome-dependent cell death, autophagy-dependent cell death, immunogenic cell death, cellular senescence, and mitotic catastrophe, we discuss the utility of neologisms that refer to highly specialized instances of these processes. The mission of the NCCD is to provide a widely accepted nomenclature on cell death in support of the continued development of the field.
DNA methylation-based classification of central nervous system tumours.
Accurate pathological diagnosis is crucial for optimal management of patients with cancer. For the approximately 100 known tumour types of the central nervous system, standardization of the diagnostic process has been shown to be particularly challenging, with substantial inter-observer variability in the histopathological diagnosis of many tumour types. Here we present a comprehensive approach for the DNA methylation-based classification of central nervous system tumours across all entities and age groups, and demonstrate its application in a routine diagnostic setting. We show that the availability of this method may have a substantial impact on diagnostic precision compared to standard methods, resulting in a change of diagnosis in up to 12% of prospective cases. For broader accessibility, we have designed a free online classifier tool, the use of which does not require any additional onsite data processing. Our results provide a blueprint for the generation of machine-learning-based tumour classifiers across other cancer entities, with the potential to fundamentally transform tumour pathology.
Assessments Related to the Physical, Affective and Cognitive Domains of Physical Literacy Amongst Children Aged 7–11.9 Years: A Systematic Review
Background
Over the past decade, there has been increased interest amongst researchers, practitioners and policymakers in physical literacy for children and young people and the assessment of the concept within physical education (PE). This systematic review aimed to identify tools to assess physical literacy and its physical, cognitive and affective domains within children aged 7–11.9 years, and to examine the measurement properties, feasibility and elements of physical literacy assessed within each tool.
Methods
Six databases (EBSCO host platform, MEDLINE, PsycINFO, Scopus, Education Research Complete, SPORTDiscus) were searched up to 10th September 2020. Studies were included if they sampled children aged between 7 and 11.9 years, employed field-based assessments of physical literacy and/or related affective, physical or cognitive domains, reported measurement properties (quantitative) or theoretical development (qualitative), and were published in English in peer-reviewed journals. The methodological quality and measurement properties of studies and assessment tools were appraised using the COnsensus-based Standards for the selection of health Measurement INstruments risk of bias checklist. The feasibility of each assessment was considered using a utility matrix, and the elements of physical literacy assessed were recorded using a descriptive checklist.
Results
The search strategy resulted in a total of 11,467 initial results. After full-text screening, 11 studies (3 assessments) related to explicit physical literacy assessments. Forty-four studies (32 assessments) were relevant to the affective domain, 31 studies (15 assessments) were relevant to the physical domain and 2 studies (2 assessments) were included within the cognitive domain. Methodological quality and reporting of measurement properties within the included studies were mixed. The Canadian Assessment of Physical Literacy-2 and the Passport For Life had evidence of acceptable measurement properties from studies of very good methodological quality and assessed a wide range of physical literacy elements. Feasibility results indicated that many tools would be suitable for a primary PE setting, though some demand a level of expertise to administer and score that would require training.
Conclusions
This review has identified a number of existing assessments that could be useful in a physical literacy assessment approach within PE and provides further information to empower researchers and practitioners to make informed decisions when selecting the most appropriate assessment for their needs, purpose and context. The review indicates that researchers and tool developers should aim to improve the methodological quality and reporting of measurement properties of assessments to better inform the field.
Trial registration
PROSPERO: CRD4201706221
- …