
    Correcting for Selection Bias and Missing Response in Regression using Privileged Information

    When estimating a regression model, we might have data where some labels are missing, or our data might be biased by a selection mechanism. When the response or selection mechanism is ignorable (i.e., independent of the response variable given the features) one can use off-the-shelf regression methods; in the nonignorable case one typically has to adjust for bias. We observe that privileged data (i.e., data that is only available during training) might render a nonignorable selection mechanism ignorable, and we refer to this scenario as Privilegedly Missing at Random (PMAR). We propose a novel imputation-based regression method, named repeated regression, that is suitable for PMAR. We also consider an importance weighted regression method, and a doubly robust combination of the two. The proposed methods are easy to implement with most popular out-of-the-box regression algorithms. We empirically assess the performance of the proposed methods with extensive simulated experiments and on a synthetically augmented real-world dataset. We conclude that repeated regression can appropriately correct for bias, and can have a considerable advantage over weighted regression, especially when extrapolating to regions of the feature space where the response is never observed.
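
The abstract describes repeated regression only as an imputation-based method, so the following is a minimal sketch of one plausible reading, not the authors' implementation: first regress the response on the features together with the privileged data using labelled rows only, then impute the response for every training row, and finally regress the imputed response on the features alone. The helper names (`fit_linear`, `predict_linear`, `repeated_regression`), the linear models, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_linear(features, target):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict_linear(coef, features):
    """Evaluate a model fitted by fit_linear on new feature rows."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef


def repeated_regression(x, z, y, observed):
    """Hypothetical two-stage sketch: y ~ (x, z) on labelled rows, then imputed y ~ x."""
    xz = np.column_stack([x, z])
    # Stage 1: use only rows where the response was observed.
    stage_one = fit_linear(xz[observed], y[observed])
    # Impute the response for every training row (z is available at training time).
    y_imputed = predict_linear(stage_one, xz)
    # Stage 2: regress the imputed response on the regular features only.
    return fit_linear(x, y_imputed)


# Synthetic data (assumed): selection depends on the privileged variable z,
# so it is nonignorable given x alone but ignorable given (x, z).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
z = x + rng.normal(size=n)
y = z + 0.1 * rng.normal(size=n)
observed = z > 0  # y is treated as missing wherever this is False

naive = fit_linear(x[observed], y[observed])
corrected = repeated_regression(x, z, y, observed)
print("naive     [intercept, slope]:", naive)
print("corrected [intercept, slope]:", corrected)
```

In this toy setup the true conditional mean of y given x is x itself, so a well-corrected fit should have an intercept near 0 and a slope near 1, while the naive fit on labelled rows is shifted upwards by the selection.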

    5-Approximation for $\mathcal{H}$-Treewidth Essentially as Fast as $\mathcal{H}$-Deletion Parameterized by Solution Size

    The notion of $\mathcal{H}$-treewidth, where $\mathcal{H}$ is a hereditary graph class, was recently introduced as a generalization of the treewidth of an undirected graph. Roughly speaking, a graph of $\mathcal{H}$-treewidth at most $k$ can be decomposed into (arbitrarily large) $\mathcal{H}$-subgraphs which interact only through vertex sets of size $O(k)$ which can be organized in a tree-like fashion. $\mathcal{H}$-treewidth can be used as a hybrid parameterization to develop fixed-parameter tractable algorithms for $\mathcal{H}$-deletion problems, which ask to find a minimum vertex set whose removal from a given graph $G$ turns it into a member of $\mathcal{H}$. The bottleneck in the current parameterized algorithms lies in the computation of suitable tree $\mathcal{H}$-decompositions. We present FPT approximation algorithms to compute tree $\mathcal{H}$-decompositions for hereditary and union-closed graph classes $\mathcal{H}$. Given a graph of $\mathcal{H}$-treewidth $k$, we can compute a 5-approximate tree $\mathcal{H}$-decomposition in time $f(O(k)) \cdot n^{O(1)}$ whenever $\mathcal{H}$-deletion parameterized by solution size can be solved in time $f(k) \cdot n^{O(1)}$ for some function $f(k) \geq 2^k$. The current-best algorithms either achieve an approximation factor of $k^{O(1)}$ or construct optimal decompositions while suffering from non-uniformity with unknown parameter dependence. Using these decompositions, we obtain algorithms solving Odd Cycle Transversal in time $2^{O(k)} \cdot n^{O(1)}$ parameterized by $\mathsf{bipartite}$-treewidth and Vertex Planarization in time $2^{O(k \log k)} \cdot n^{O(1)}$ parameterized by $\mathsf{planar}$-treewidth, showing that these can be as fast as the solution-size parameterizations and giving the first ETH-tight algorithms for parameterizations by hybrid width measures. Comment: Conference version to appear at the European Symposium on Algorithms (ESA 2023).

    Single-Exponential FPT Algorithms for Enumerating Secluded $\mathcal{F}$-Free Subgraphs and Deleting to Scattered Graph Classes

    The celebrated notion of important separators bounds the number of small $(S,T)$-separators in a graph which are 'farthest from $S$' in a technical sense. In this paper, we introduce a generalization of this powerful algorithmic primitive that is phrased in terms of $k$-secluded vertex sets: sets with an open neighborhood of size at most $k$. In this terminology, the bound on important separators says that there are at most $4^k$ maximal $k$-secluded connected vertex sets $C$ containing $S$ but disjoint from $T$. We generalize this statement significantly: even when we demand that $G[C]$ avoids a finite set $\mathcal{F}$ of forbidden induced subgraphs, the number of such maximal subgraphs is $2^{O(k)}$ and they can be enumerated efficiently. This allows us to make significant improvements for two problems from the literature. Our first application concerns the 'Connected $k$-Secluded $\mathcal{F}$-free subgraph' problem, where $\mathcal{F}$ is a finite set of forbidden induced subgraphs. Given a graph in which each vertex has a positive integer weight, the problem asks to find a maximum-weight connected $k$-secluded vertex set $C \subseteq V(G)$ such that $G[C]$ does not contain an induced subgraph isomorphic to any $F \in \mathcal{F}$. The parameterization by $k$ is known to be solvable in triple-exponential time via the technique of recursive understanding, which we improve to single-exponential. Our second application concerns the deletion problem to scattered graph classes. Here, the task is to find a vertex set of size at most $k$ whose removal yields a graph whose each connected component belongs to one of the prescribed graph classes $\Pi_1, \ldots, \Pi_d$. We obtain a single-exponential algorithm whenever each class $\Pi_i$ is characterized by a finite number of forbidden induced subgraphs. This generalizes and improves upon earlier results in the literature. Comment: To appear at ISAAC'2
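
The definition of a $k$-secluded vertex set (open neighborhood of size at most $k$) can be checked directly. The sketch below only illustrates that definition and the connectivity requirement; it does not implement the enumeration algorithm from the paper. The adjacency-dictionary representation, the function names, and the example path graph are assumptions made for illustration.

```python
def open_neighborhood(adj, vertex_set):
    """Vertices outside vertex_set that have at least one neighbor inside it."""
    members = set(vertex_set)
    return {u for v in members for u in adj[v]} - members


def is_k_secluded(adj, vertex_set, k):
    """True if the open neighborhood of vertex_set has size at most k."""
    return len(open_neighborhood(adj, vertex_set)) <= k


def is_connected(adj, vertex_set):
    """True if vertex_set induces a connected subgraph (the empty set counts as connected)."""
    members = set(vertex_set)
    if not members:
        return True
    start = next(iter(members))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for u in adj[v]:
            if u in members and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == members


# Hypothetical example: the path a - b - c - d - e as an adjacency dictionary.
path = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

candidate = {"b", "c"}
print(open_neighborhood(path, candidate))
print(is_k_secluded(path, candidate, 2), is_k_secluded(path, candidate, 1))
print(is_connected(path, candidate), is_connected(path, {"a", "c"}))
```

On this path, the set {b, c} is connected and has open neighborhood {a, d}, so it is 2-secluded but not 1-secluded.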

    Pediatric metabolic syndrome definitions impact prevalence and socioeconomic gradients

    The choice of pediatric metabolic syndrome (MetS) definition influences prevalence estimates, but further implications, especially on the association with socioeconomic status (SES), are not well-known. This hampers a synthesis of the evidence to help guide the relevant stakeholders. For this reason, we aim to assess the impact of alternative definitions on the prevalence of MetS, the children that are identified, and the association between SES and MetS. Data were used from the Lifelines Cohort Study, a prospective multigenerational cohort in the Netherlands. At baseline 9,754 children participated, of whom 5,085 (52.1%) were included in the longitudinal analyses. We computed the prevalence of MetS according to five published definitions and measured the observed positive agreement between pairs of definitions, indicating the proportion of agreement across the average number of MetS cases. Logistic regression was used to assess the association between SES and MetS. All models were adjusted for age and sex; the longitudinal models were also adjusted for baseline MetS status. The prevalence rates of MetS varied between definitions (0.7-3.0%), but positive agreement between MetS definitions was generally fair to good, ranging from 0.34 (95% confidence interval (CI) 0.28; 0.41) to 0.66 (95% CI 0.58; 0.75) at baseline. At both assessments, we found an inverse association between baseline SES and MetS, which ranged from 0.81 (95% CI 0.70; 0.93) to 0.92 (95% CI 0.86; 0.98) per definition in the longitudinal analyses with a mean follow-up (SD) of 3.0 (0.75) years. Alternative definitions of MetS lead to differing prevalence estimates, and they agreed on 50% [...]. Regardless of which definition was used, we concluded that low SES was a risk factor for developing MetS. Evidence regarding different definitions of metabolic syndrome in children can be combined because the agreement among definitions is generally fair to good. As low socioeconomic status is a consistent risk factor for developing metabolic syndrome, preventive interventions should preferentially target children from low socioeconomic backgrounds.
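
The abstract describes observed positive agreement as the proportion of agreement across the average number of MetS cases. A minimal sketch of that calculation follows, assuming the usual form: the number of children classified as cases by both definitions, divided by the mean of the two case counts. The function name and the child identifiers are hypothetical and are not taken from the Lifelines data.

```python
def positive_agreement(cases_a, cases_b):
    """Cases identified by both definitions divided by the average number of cases."""
    cases_a, cases_b = set(cases_a), set(cases_b)
    average_cases = (len(cases_a) + len(cases_b)) / 2
    if average_cases == 0:
        raise ValueError("positive agreement is undefined when neither definition finds a case")
    return len(cases_a & cases_b) / average_cases


# Hypothetical child identifiers flagged as MetS cases by two definitions.
definition_a = {1, 2, 3, 4}
definition_b = {3, 4, 5, 6, 7, 8}

pa = positive_agreement(definition_a, definition_b)
print(f"positive agreement: {pa:.2f}")
```

Here two children are cases under both definitions and the definitions identify five cases on average, so the positive agreement is 2 / 5.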

    Search-Space Reduction via Essential Vertices

    We investigate preprocessing for vertex-subset problems on graphs. While the notion of kernelization, originating in parameterized complexity theory, is a formalization of provably effective preprocessing aimed at reducing the total instance size, our focus is on finding a non-empty vertex set that belongs to an optimal solution. This decreases the size of the remaining part of the solution which still has to be found, and therefore shrinks the search space of fixed-parameter tractable algorithms for parameterizations based on the solution size. We introduce the notion of a c-essential vertex as one that is contained in all c-approximate solutions. For several classic combinatorial problems such as Odd Cycle Transversal and Directed Feedback Vertex Set, we show that under mild conditions a polynomial-time preprocessing algorithm can find a subset of an optimal solution that contains all 2-essential vertices, by exploiting packing/covering duality. This leads to FPT algorithms to solve these problems where the exponential term in the running time depends only on the number of non-essential vertices in the solution.
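
To make the definition of a c-essential vertex concrete, the sketch below computes the c-essential vertices of a tiny instance by brute force. It uses Vertex Cover as a stand-in problem because it is short to state; that choice is an assumption made here, and the exponential enumeration is not the polynomial-time preprocessing algorithm from the paper. A solution is treated as c-approximate when it is feasible and has size at most c times the optimum.

```python
from itertools import combinations


def is_vertex_cover(edges, chosen):
    """True if every edge has at least one endpoint in chosen."""
    return all(u in chosen or v in chosen for u, v in edges)


def essential_vertices(vertices, edges, c):
    """Vertices contained in every vertex cover of size at most c * OPT (brute force)."""
    vertices = list(vertices)
    covers = [
        set(subset)
        for size in range(len(vertices) + 1)
        for subset in combinations(vertices, size)
        if is_vertex_cover(edges, set(subset))
    ]
    optimum = min(len(cover) for cover in covers)
    approximate = [cover for cover in covers if len(cover) <= c * optimum]
    # A vertex is c-essential exactly when no c-approximate solution avoids it.
    return set.intersection(*approximate)


# Hypothetical instances: a star with center 0 and three leaves, and a single edge.
star_vertices = [0, 1, 2, 3]
star_edges = [(0, 1), (0, 2), (0, 3)]
single_edge = [("a", "b")]

print(essential_vertices(star_vertices, star_edges, 2))
print(essential_vertices(star_vertices, star_edges, 3))
print(essential_vertices(["a", "b"], single_edge, 2))
```

In the star, the optimum is the center alone, and any cover avoiding the center needs all three leaves, so the center is 2-essential but no longer 3-essential. In the single-edge graph either endpoint is an optimal cover, so no vertex is essential.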

    Adolescents' mental health problems increase after parental divorce, not before, and persist until adulthood: a longitudinal TRAILS study

    Parental divorce is one of the most stressful life events for youth and is often associated with (long-lasting) emotional and behavioral problems (EBP). However, not much is known about the timing of the emergence of these EBP in adolescents relative to the moment of parental divorce, and its longitudinal effects. We therefore assessed this timing of EBP in adolescents of divorce and its longitudinal effects. We used the first four waves of the TRacking Adolescent's Individual Lives Survey (TRAILS) cohort, which included 2230 10-12-year-olds at baseline. EBP were measured through the Youth Self-Report (YSR), as internalizing and externalizing problems. We applied multilevel analysis to assess the effect of divorce on EBP. The levels of both internalizing and externalizing problems were significantly higher in the period after parental divorce (beta = 0.03 and 0.03, respectively; p < 0.05), but not in the period before divorce, with a persistent and increasing effect over the follow-up periods compared to adolescents not experiencing divorce. Adolescents tend to develop more EBP in the period after parental divorce, not before. These effects are long-lasting and underline the need for better care for children with divorcing parents.

    Effectiveness and cost-effectiveness of a nurse-delivered intervention to improve adherence to treatment for HIV: a pragmatic, multicentre, open-label, randomised clinical trial

    This trial was funded from public money by the Netherlands Organisation for Health Research and Development (ZonMW; grant number 171002208). Aardex provided support on the development of the study website. We thank all the HIV nurses and physicians from the seven HIV clinics involved in the AIMS study for their input and collaboration (Academic Medical Centre, Slotervaart hospital, and St. Lucas-Andreas hospital, all in Amsterdam; the Leiden University Medical Centre, Leiden; HAGA hospital, The Hague; Erasmus Medical Centre, Rotterdam; and Isala clinic, Zwolle), the study participants, and the Stichting HIV Monitoring (SHM) for their support in accessing the SHM database for identifying patient inclusion criteria and developing the Markov model. Finally, we thank and remember Herman Schaalma (deceased) for his contribution to the study design and grant application. Peer reviewed. Postprint.