405 research outputs found

    Training Complex Models with Multi-Task Weak Supervision

    As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlated labels, and may label different tasks or apply at different levels of granularity. We propose a framework for integrating and modeling such weak supervision sources by viewing them as labeling different related sub-tasks of a problem, which we refer to as the multi-task weak supervision setting. We show that by solving a matrix completion-style problem, we can recover the accuracies of these multi-task sources given their dependency structure, but without any labeled data, leading to higher-quality supervision for training an end model. Theoretically, we show that the generalization error of models trained with this approach improves with the number of unlabeled data points, and characterize the scaling with respect to the task and dependency structures. On three fine-grained classification problems, we show that our approach leads to average gains of 20.2 points in accuracy over a traditional supervised approach, 6.8 points over a majority vote baseline, and 4.1 points over a previously proposed weak supervision method that models tasks separately.
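For orientation, the majority vote baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `ABSTAIN = -1` convention, the tie-handling rule, and the toy `label_matrix` are all assumptions made for the example.

```python
from collections import Counter

ABSTAIN = -1  # assumed convention: a source that does not vote returns -1


def majority_vote(votes, abstain=ABSTAIN):
    """Return the most common non-abstain label; abstain on no votes or a tie."""
    counts = Counter(v for v in votes if v != abstain)
    if not counts:
        return abstain
    ranked = counts.most_common(2)
    # A tie between the two most frequent labels is treated as "no decision".
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return abstain
    return ranked[0][0]


# Toy data (hypothetical): rows are data points, columns are weak sources.
label_matrix = [
    [1, 1, 0],
    [0, -1, 0],
    [-1, -1, -1],
    [1, 0, -1],
]
labels = [majority_vote(row) for row in label_matrix]
print(labels)
```

Majority vote weights every source equally, which is the limitation the abstract points to: it ignores the differing accuracies and correlations of the sources.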

    Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision

    Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently. The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources abstracted as labeling functions (LFs). Existing statistical label models typically rely only on the outputs of LFs, ignoring the instance features when modeling the underlying generative process. In this paper, we attempt to incorporate the instance features into a statistical label model via the proposed FABLE. In particular, it is built on a mixture of Bayesian label models, each corresponding to a global pattern of correlation, and the coefficients of the mixture components are predicted by a Gaussian Process classifier based on instance features. We adopt an auxiliary variable-based variational inference algorithm to tackle the non-conjugate issue between the Gaussian Process and Bayesian label models. Extensive empirical comparison on eleven benchmark datasets sees FABLE achieving the highest averaged performance across nine baselines. Comment: 16 pages

    Nonlinear Dynamics and Nucleation Kinetics in Near-Critical Liquids

    The objective of our study is to model the nonlinear behavior of a near-critical liquid following a rapid change of the temperature and/or other thermodynamic parameters (pressure, external electric or gravitational field). The thermodynamic critical point is manifested by large, strongly correlated fluctuations of the order parameter (particle density in liquid-gas systems, concentration in binary solutions) in the critical range of scales. The largest critical length scale is the correlation radius $r_c$. According to the scaling theory, $r_c$ increases as $r_c = r_0 \epsilon^{-\alpha}$ when the nondimensional distance $\epsilon = (T - T_c)/T_c$ to the critical point decreases. The normal gravity alters the nature of correlated long-range fluctuations when one reaches $\epsilon \approx 10^{-5}$, and correspondingly the relaxation time, $\tau(r_c)$, is approximately equal to $10^{-3}$ seconds; this time is short when compared to the typical experimental time. Close to the critical point, a rapid, relatively small temperature change may perturb the thermodynamic equilibrium on many scales. The critical fluctuations have a hierarchical structure, and the relaxation involves many length and time scales. Above the critical point, in the one-phase region, we consider the relaxation of the liquid following a sudden temperature change that simultaneously violates the equilibrium on many scales. Below $T_c$, a non-equilibrium state may include a distribution of small scale phase droplets; we consider the relaxation of such a droplet following a temperature change that has made the phase of the matrix stable.
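The power-law growth of the correlation radius described in this abstract can be illustrated numerically. The sketch below is only a worked example under stated assumptions: the abstract gives neither the amplitude nor the exponent, so `R0 = 1.0` (arbitrary units) and `ALPHA = 0.63` (a commonly quoted correlation-length exponent for simple fluids) are placeholder values, and the function names are hypothetical.

```python
def reduced_temperature(T, T_c):
    """Nondimensional distance to the critical point, (T - T_c) / T_c."""
    return (T - T_c) / T_c


def correlation_radius(eps, r0, alpha):
    """Power-law scaling r_c = r0 * eps**(-alpha), defined here for eps > 0."""
    if eps <= 0:
        raise ValueError("eps must be positive (one-phase region above T_c)")
    return r0 * eps ** (-alpha)


R0 = 1.0      # assumed amplitude, arbitrary length units (not from the abstract)
ALPHA = 0.63  # assumed exponent value (not from the abstract)

# The correlation radius grows as the critical point is approached.
for eps in (1e-3, 1e-4, 1e-5):
    print(f"eps = {eps:.0e}  r_c / r0 = {correlation_radius(eps, R0, ALPHA):.1f}")

# Each tenfold reduction in eps multiplies r_c by 10**ALPHA.
ratio = correlation_radius(1e-5, R0, ALPHA) / correlation_radius(1e-4, R0, ALPHA)
```

With these placeholder values, moving one decade closer to the critical point enlarges the correlation radius by a factor of `10**ALPHA`, independent of the amplitude chosen.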