
    Learning Determinantal Point Processes

    Determinantal point processes (DPPs), which arise in random matrix theory and quantum physics, are natural models for subset selection problems where diversity is preferred. Among many remarkable properties, DPPs offer tractable algorithms for exact inference, including computing marginal probabilities and sampling; however, an important open question has been how to learn a DPP from labeled training data. In this paper we propose a natural feature-based parameterization of conditional DPPs, and show how it leads to a convex and efficient learning formulation. We analyze the relationship between our model and binary Markov random fields with repulsive potentials, which are qualitatively similar but computationally intractable. Finally, we apply our approach to the task of extractive summarization, where the goal is to choose a small subset of sentences conveying the most important information from a set of documents. In this task there is a fundamental tradeoff between sentences that are highly relevant to the collection as a whole, and sentences that are diverse and not repetitive. Our parameterization allows us to naturally balance these two characteristics. We evaluate our system on data from the DUC 2003/04 multi-document summarization task, achieving state-of-the-art results.
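    The tractable exact inference this abstract mentions can be sketched in a few lines. The example below is a generic, hypothetical illustration (not the paper's conditional, feature-based parameterization): a discrete DPP defined by an L-ensemble kernel L has marginal kernel K = L(L + I)^{-1}, and the probability that a subset A is contained in the sampled set Y is the principal minor det(K_A).

```python
import numpy as np

# Hypothetical illustration of exact DPP marginal inference.
# L is a positive semidefinite N x N L-ensemble kernel; the marginal
# kernel is K = L (L + I)^{-1}, and P(A ⊆ Y) = det(K_A).

def marginal_kernel(L):
    """Convert an L-ensemble kernel to its marginal kernel K."""
    N = L.shape[0]
    return L @ np.linalg.inv(L + np.eye(N))

def inclusion_probability(K, subset):
    """P(subset ⊆ Y) for a DPP with marginal kernel K."""
    idx = np.ix_(subset, subset)
    return np.linalg.det(K[idx])

# Toy similarity kernel over 3 items: items 0 and 1 are highly similar.
L = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
K = marginal_kernel(L)
p1 = inclusion_probability(K, [0])      # singleton marginal
p2 = inclusion_probability(K, [1])
p12 = inclusion_probability(K, [0, 1])  # two similar items together
```

    Because off-diagonal entries of K subtract from the determinant, similar items are unlikely to co-occur in the sampled subset, which is exactly the diversity property a summarizer can exploit.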

    Approximate Inference in Continuous Determinantal Point Processes

    Determinantal point processes (DPPs) are random point processes well-suited for modeling repulsion. In machine learning, the focus of DPP-based models has been on diverse subset selection from a discrete and finite base set. This discrete setting admits an efficient sampling algorithm based on the eigendecomposition of the defining kernel matrix. Recently, there has been growing interest in using DPPs defined on continuous spaces. While the discrete-DPP sampler extends formally to the continuous case, computationally, the steps required are not tractable in general. In this paper, we present two efficient DPP sampling schemes that apply to a wide range of kernel functions: one based on low-rank approximations via Nyström and random Fourier feature techniques and another based on Gibbs sampling. We demonstrate the utility of continuous DPPs in repulsive mixture modeling and synthesizing human poses spanning activity spaces.
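    The low-rank route mentioned above can be sketched with standard random Fourier features: for a Gaussian kernel, random cosine features give an explicit finite-dimensional map whose inner products approximate the kernel. This is the generic Rahimi–Recht construction under assumed settings (Gaussian kernel, illustrative names), not the paper's exact scheme.

```python
import numpy as np

# Minimal random-Fourier-feature sketch for an RBF kernel
# k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)). Settings are illustrative.

def rff_map(X, D, sigma, rng):
    """Map inputs X (n x d) to D random Fourier features so that
    z(x) @ z(y) approximates k(x, y)."""
    n, d = X.shape
    W = rng.normal(scale=1.0 / sigma, size=(d, D))  # spectral samples
    b = rng.uniform(0, 2 * np.pi, size=D)           # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Z = rff_map(X, D=5000, sigma=1.0, rng=rng)
K_approx = Z @ Z.T

# Exact Gram matrix for comparison (sigma = 1).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-sq / 2.0)
err = np.abs(K_approx - K_exact).max()  # shrinks as D grows
```

    With the kernel replaced by an explicit low-rank feature map, the eigendecomposition-based discrete sampler becomes applicable again, which is the idea behind the Nyström/Fourier scheme.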

    Formulation Development, Preclinical Testing, and Primary Packaging Optimization for Cannabinoids and Other Therapeutics

    Through the process of drug development, a molecule goes through discovery screening; lead selection and optimization; ADME testing; toxicity profiling; dosage form determination; and preclinical testing in in vitro and in vivo setups, followed by clinical research, FDA review, and approval, until eventually it is manufactured in the determined dosage form and reaches the patient. At every point in this process, scientists actively work towards a smoother transition and a quick, safe advancement of the molecule to the next step. The chapters in this research cover various phases of drug development, from the discovery stage to fill-finish and primary container compatibility.

    Structured Prediction Cascades

    Structured prediction tasks pose a fundamental trade-off between the need for model complexity to increase predictive power and the limited computational resources for inference in the exponentially sized output spaces such models require. We formulate and develop structured prediction cascades: a sequence of increasingly complex models that progressively filter the space of possible outputs. We represent an exponentially large set of filtered outputs using max-marginals and propose a novel convex loss function that balances filtering error with filtering efficiency. We provide generalization bounds for these loss functions and evaluate our approach on handwriting recognition and part-of-speech tagging. We find that the learned cascades are capable of reducing the complexity of inference by up to five orders of magnitude, enabling the use of models which incorporate higher-order features and yield higher accuracy.
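    The max-marginal filtering step can be made concrete on a linear-chain model. The sketch below is a hypothetical illustration (names and the particular threshold are assumptions, not the paper's learned formulation): the max-marginal of a state at a position is the best total score of any sequence passing through it, computed by forward/backward max recursions, and states whose max-marginal falls below a threshold are pruned.

```python
import numpy as np

# Illustrative max-marginal filtering on a linear chain.
# unary: (T, S) node scores; pairwise: (S, S) transition scores.

def max_marginals(unary, pairwise):
    """mm[t, s] = best total score of any state sequence constrained
    to pass through state s at position t."""
    T, S = unary.shape
    fwd = np.zeros((T, S))  # best prefix score ending in (t, s)
    bwd = np.zeros((T, S))  # best suffix score after (t, s)
    fwd[0] = unary[0]
    for t in range(1, T):
        fwd[t] = unary[t] + (fwd[t - 1][:, None] + pairwise).max(axis=0)
    for t in range(T - 2, -1, -1):
        bwd[t] = (unary[t + 1] + bwd[t + 1] + pairwise).max(axis=1)
    return fwd + bwd

def cascade_filter(unary, pairwise, alpha=0.5):
    """Keep (position, state) pairs whose max-marginal clears a convex
    combination of the best score and the mean max-marginal (one simple
    threshold choice)."""
    mm = max_marginals(unary, pairwise)
    thresh = alpha * mm.max() + (1 - alpha) * mm.mean()
    return mm >= thresh

rng = np.random.default_rng(1)
unary = rng.normal(size=(4, 3))
pairwise = rng.normal(size=(3, 3))
mm = max_marginals(unary, pairwise)
keep = cascade_filter(unary, pairwise)  # boolean mask of surviving states
```

    Pruned states are excluded from inference at the next cascade level, where a richer (and more expensive) model runs only over the surviving outputs.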

    Exploiting bacterial isolates for diesel degrading potential under in vitro conditions

    Hydrocarbon-contaminated oil-spill areas and oil products have caused serious harm, drawing increasing attention to the development and implementation of methods for removing these contaminants. Bacterial succession in petroleum-hydrocarbon-contaminated environments can offer an answer to the problem. Such lands are severely affected, remaining totally barren or supporting only sparse vegetation. Bacteria can thereby be exploited to mitigate hydrocarbons and enhance nutrient availability for vegetation. The present study involves collection of soil samples heavily contaminated with hydrocarbons from Bagru (Rajasthan). Samples were analysed by a solid-liquid extraction method followed by FTIR (Fourier transform infrared) and HPLC (high-performance liquid chromatography) analysis. During microbiological analysis, hydrocarbon-degrading bacteria were screened. FTIR spectral analysis indicated the presence of functional groups of alkanes and aromatic ring compounds; HPLC analysis recorded hydrocarbon contents of 43% to 69% across the soil samples. From the soil samples, six gram-positive and four gram-negative bacterial isolates possessing hydrocarbon-degrading capacities were obtained, in the ranges 47.04-87.31% and 10.12-95.24% respectively. Growth kinetic studies revealed degradation of up to 1000 ppm diesel in 3 days under in vitro conditions. These bacteria can further be exploited for diesel degradation and, after scaling up, could offer a possible solution to biodegradation under ex situ conditions.

    Multi-task feature selection

    We address joint feature selection across a group of classification or regression tasks. In many multi-task learning scenarios, different but related tasks share a large proportion of relevant features. We propose a novel type of joint regularization for the parameters of support vector machines in order to couple feature selection across tasks. Intuitively, we extend the ℓ1 regularization for single-task estimation to the multi-task setting. By penalizing the sum of ℓ2-norms of the blocks of coefficients associated with each feature across different tasks, we encourage multiple predictors to have similar parameter sparsity patterns. This approach yields convex, nondifferentiable optimization problems that can be solved efficiently using a simple and scalable extragradient algorithm. We show empirically that our approach outperforms independent ℓ1-based feature selection on several datasets.
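    The key effect of the ℓ1/ℓ2 block penalty can be shown in a few lines. The sketch below is an illustration of the penalty and its proximal operator (group soft-thresholding), not the paper's extragradient solver: with W holding one row per feature and one column per task, shrinking whole rows to zero removes a feature from every task simultaneously.

```python
import numpy as np

# Illustrative l1/l2 block regularization for multi-task learning.
# W is (features x tasks); the penalty is the sum of per-row l2 norms.

def l1_l2_penalty(W):
    """Sum of l2 norms of each feature's coefficient block across tasks."""
    return np.linalg.norm(W, axis=1).sum()

def group_soft_threshold(W, tau):
    """Proximal operator of tau * l1_l2_penalty: shrinks each row toward
    zero and zeroes rows whose norm is below tau."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W

W = np.array([[3.0, 4.0],    # strong feature: row norm 5.0
              [0.3, 0.4]])   # weak feature: row norm 0.5
W_new = group_soft_threshold(W, tau=1.0)
# The weak feature is eliminated for BOTH tasks at once.
```

    This is how the penalty couples the tasks: an ℓ1 penalty applied independently per task could zero a feature in one task but keep it in another, whereas the row-wise ℓ2 norm forces a shared sparsity pattern.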

    Sidestepping Intractable Inference with Structured Ensemble Cascades

    For many structured prediction problems, complex models often require adopting approximate inference techniques such as variational methods or sampling, which generally provide no satisfactory accuracy guarantees. In this work, we propose sidestepping intractable inference altogether by learning ensembles of tractable sub-models as part of a structured prediction cascade. We focus in particular on problems with high treewidth and large state spaces, which occur in many computer vision tasks. Unlike other variational methods, our ensembles do not enforce agreement between sub-models, but filter the space of possible outputs by simply adding and thresholding the max-marginals of each constituent model. Our framework jointly estimates parameters for all models in the ensemble for each level of the cascade by minimizing a novel, convex loss function, yet requires only a linear increase in computation over learning or inference in a single tractable sub-model. We provide a generalization bound on the filtering loss of the ensemble as a theoretical justification of our approach, and we evaluate our method on both synthetic data and the task of estimating articulated human pose from challenging videos. We find that our approach significantly outperforms loopy belief propagation on the synthetic data and a state-of-the-art model on the pose estimation/tracking problem.
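    The "add and threshold" step can be sketched directly. In the hypothetical illustration below, chain sub-models stand in for the paper's tree decompositions of a high-treewidth model (names and the threshold choice are assumptions): each tractable sub-model computes its own max-marginals independently, and the ensemble simply sums them per (position, state) and prunes below a threshold, with no agreement constraint between sub-models.

```python
import numpy as np

# Illustrative ensemble filtering: sum per-model max-marginals, threshold.

def chain_max_marginals(unary, pairwise):
    """Max-marginals of a linear chain with node scores unary (T, S)
    and transition scores pairwise (S, S)."""
    T, S = unary.shape
    fwd = unary.copy()
    bwd = np.zeros((T, S))
    for t in range(1, T):
        fwd[t] = unary[t] + (fwd[t - 1][:, None] + pairwise).max(axis=0)
    for t in range(T - 2, -1, -1):
        bwd[t] = (unary[t + 1] + bwd[t + 1] + pairwise).max(axis=1)
    return fwd + bwd

rng = np.random.default_rng(2)
T, S = 5, 4
# Three tractable sub-models over the same positions and states.
models = [(rng.normal(size=(T, S)), rng.normal(size=(S, S)))
          for _ in range(3)]

# Sum max-marginals across sub-models; no agreement is enforced.
ensemble = sum(chain_max_marginals(u, p) for u, p in models)
thresh = 0.5 * ensemble.max() + 0.5 * ensemble.mean()
keep = ensemble >= thresh  # states surviving this cascade level
```

    Because each sub-model is a tractable chain, the whole filtering step costs only a sum of cheap dynamic programs, a linear increase over a single sub-model, while the intractable joint model is never run.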