
    Revisiting the limits of MAP inference by MWSS on perfect graphs

    This is the author accepted manuscript. The final version is available from MIT Press via http://jmlr.org/proceedings/papers/v38/weller15.pdf. A recent, promising approach to identifying a configuration of a discrete graphical model with highest probability (termed MAP inference) is to reduce the problem to finding a maximum weight stable set (MWSS) in a derived weighted graph, which, if perfect, allows a solution to be found in polynomial time. Weller and Jebara (2013) investigated the class of binary pairwise models where this method may be applied. However, their analysis made a seemingly innocuous assumption which simplifies analysis but led to only a subset of possible reparameterizations being considered. Here we introduce novel techniques and consider all cases, demonstrating that this greatly expands the set of tractable models. We provide a simple, exact characterization of the new, enlarged set and show how such models may be efficiently identified, thus settling the power of the approach on this class.
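    For context, MAP inference asks for the highest-scoring joint configuration of the model. A minimal brute-force sketch of that objective on a toy binary pairwise model (the potential values are invented for illustration; the MWSS reduction is what makes the problem tractable in polynomial time for the perfect-graph cases characterised above, brute force here only shows what is being optimised):

```python
import itertools

# Toy binary pairwise model: unary potentials theta_i(x_i) and
# pairwise potentials theta_ij(x_i, x_j) over x in {0,1}^3.
# All values are hypothetical, chosen only to illustrate the objective.
unary = {0: [0.0, 1.2], 1: [0.5, 0.0], 2: [0.0, 0.3]}
pairwise = {(0, 1): [[0.0, -1.0], [-1.0, 0.7]],
            (1, 2): [[0.4, 0.0], [0.0, 0.4]]}

def score(x):
    """Total (log-)potential of configuration x."""
    s = sum(unary[i][x[i]] for i in unary)
    s += sum(pairwise[(i, j)][x[i]][x[j]] for (i, j) in pairwise)
    return s

def map_brute_force(n):
    """Exhaustive MAP: exponential in n, unlike the MWSS route."""
    return max(itertools.product([0, 1], repeat=n), key=score)

x_star = map_brute_force(3)   # the MAP configuration of the toy model
```

The MWSS approach recovers the same maximiser by building a derived weighted graph whose stable sets correspond to configurations, then checking perfection.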

    Tightness of LP relaxations for almost balanced models

    This is the author accepted manuscript. The final version is available from Microtome Publishing via http://www.jmlr.org/proceedings/papers/v51/weller16b.html. Linear programming (LP) relaxations are widely used to attempt to identify a most likely configuration of a discrete graphical model. In some cases, the LP relaxation attains an optimum vertex at an integral location and thus guarantees an exact solution to the original optimization problem. When this occurs, we say that the LP relaxation is tight. Here we consider binary pairwise models and derive sufficient conditions for guaranteed tightness of (i) the standard LP relaxation on the local polytope LP+LOC, and (ii) the LP relaxation on the triplet-consistent polytope LP+TRI (the next level in the Sherali-Adams hierarchy). We provide simple new proofs of earlier results and derive significant novel results, including that LP+TRI is tight for any model where each block is balanced or almost balanced, and a decomposition theorem that may be used to break apart complex models into smaller pieces. An almost balanced (sub-)model is one that contains no frustrated cycles except through one privileged variable. MR acknowledges support by the UK Engineering and Physical Sciences Research Council (EPSRC) grant EP/L016516/1 for the University of Cambridge Centre for Doctoral Training, the Cambridge Centre for Analysis. DS was supported by NSF CAREER award #1350965.
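    The notion of balance used above can be made concrete: a binary pairwise model is balanced when its signed graph (attractive vs. repulsive edges) contains no frustrated cycle, i.e. no cycle with an odd number of repulsive edges. A minimal sketch of a balance check by BFS 2-colouring, assuming edges carry signs in {+1, -1} (the function and the example signs are illustrative, not code from the paper):

```python
from collections import deque

def is_balanced(n, signed_edges):
    """True iff the signed graph on n nodes has no frustrated cycle.

    Attractive (+1) edges must keep the colour, repulsive (-1) edges
    must flip it; a colouring conflict witnesses a frustrated cycle.
    """
    adj = {i: [] for i in range(n)}
    for u, v, sign in signed_edges:
        adj[u].append((v, sign))
        adj[v].append((u, sign))
    colour = [None] * n
    for start in range(n):
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adj[u]:
                want = colour[u] if sign > 0 else 1 - colour[u]
                if colour[v] is None:
                    colour[v] = want
                    queue.append(v)
                elif colour[v] != want:
                    return False   # frustrated cycle found
    return True
```

An almost balanced model, in the paper's sense, becomes balanced once one privileged variable (and its incident edges) is removed.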

    Structure in Machine Learning: Graphical Models and Monte Carlo Methods

    This thesis is concerned with two main areas: approximate inference in discrete graphical models, and random embeddings for dimensionality reduction and approximate inference in kernel methods. Approximate inference is a fundamental problem in machine learning and statistics, with strong connections to other domains such as theoretical computer science. At the same time, there has often been a gap between the success of many algorithms in this area in practice, and what can be explained by theory; thus, an important research effort is to bridge this gap. Random embeddings for dimensionality reduction and approximate inference have led to great improvements in scalability of a wide variety of methods in machine learning. In recent years, there has been much work on how the stochasticity introduced by these approaches can be better controlled, and what further computational improvements can be made. In the first part of this thesis, we study approximate inference algorithms for discrete graphical models. Firstly, we consider linear programming methods for approximate MAP inference, and develop our understanding of conditions for exactness of these approximations. Such guarantees of exactness are typically based on either structural restrictions on the underlying graph corresponding to the model (such as low treewidth), or restrictions on the types of potential functions that may be present in the model (such as log-supermodularity). We contribute two new classes of exactness guarantees: the first of these takes the form of particular hybrid restrictions on a combination of graph structure and potential types, whilst the second is given by excluding particular substructures from the underlying graph, via graph minor theory. We also study a particular family of transformation methods of graphical models, uprooting and rerooting, and their effect on approximate MAP and marginal inference methods. 
We prove new theoretical results on the behaviour of particular approximate inference methods under these transformations, in particular showing that the triplet relaxation of the marginal polytope is unique in being universally rooted. We also introduce a heuristic which quickly picks a rerooting, and demonstrate benefits empirically on models over several graph topologies. In the second part of this thesis, we study Monte Carlo methods for both linear dimensionality reduction and approximate inference in kernel machines. We prove the statistical benefit of coupling Monte Carlo samples to be almost-surely orthogonal in a variety of contexts, and study fast approximate methods of inducing this coupling. A surprising result is that these approximate methods can simultaneously offer improved statistical benefits, time complexity, and space complexity over i.i.d. Monte Carlo samples. We evaluate our methods on a variety of datasets, directly studying their effects on approximate kernel evaluation, as well as on downstream tasks such as Gaussian process regression. EPSRC.
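    One simple instance of the orthogonal coupling discussed above is random Fourier features for the Gaussian kernel with QR-orthogonalised direction vectors: the Gaussian directions are coupled to be exactly orthogonal while their chi-distributed lengths are preserved. The sketch below is illustrative only (dimensions, data, and the helper names are assumptions, not the thesis's code):

```python
import numpy as np

def feature_map(X, W):
    """Random Fourier features for k(x, y) = exp(-||x - y||^2 / 2)."""
    proj = X @ W.T                        # (n, D) projections
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(W.shape[0])

def gaussian_directions(D, d, orthogonal, rng):
    G = rng.standard_normal((D, d))       # i.i.d. Gaussian directions
    if not orthogonal:
        return G
    # Couple the directions to be exactly orthogonal: orthonormalise,
    # then restore the chi-distributed row norms of the Gaussian matrix.
    Q, _ = np.linalg.qr(G.T)              # assumes D <= d
    return Q.T * np.linalg.norm(G, axis=1)[:, None]

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 16))
# Exact Gaussian kernel matrix, for comparison.
K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))

errs = {}
for orth in (False, True):
    W = gaussian_directions(16, 16, orth, rng)
    Phi = feature_map(X, W)
    K_hat = Phi @ Phi.T                   # approximate kernel matrix
    errs[orth] = np.abs(K_hat - K).mean() # mean approximation error
```

The thesis's fast approximate coupling methods avoid the cubic cost of an explicit QR while retaining (and sometimes improving on) the statistical benefit.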

    Collocational processing in typologically different languages, English and Turkish: Evidence from corpora and psycholinguistic experimentation

    Unlike the traditional words-and-rules approach to language processing (Pinker, 1999), usage-based models of language have emphasised the role of multi-word sequences (Christiansen & Chater, 2016b; Ellis, 2002). Various psycholinguistic experiments have demonstrated that multi-word sequences (MWS) are processed quantitatively faster than novel phrases by both L1 and L2 speakers (e.g. Arnon & Snider, 2010; Wolter & Yamashita, 2018). Collocations, a specific type of MWS, hold a prominent position in psycholinguistics, corpus linguistics and language pedagogy research (Gablasova, Brezina, & McEnery, 2017a). In this dissertation, I explored the processing of adjective-noun collocations in Turkish and English by L1 speakers of these languages through a corpus-based study and psycholinguistic experiments. Turkish is an agglutinating language with rich morphology; it is therefore natural to ask whether the agglutinating structure of Turkish affects collocational processing in L1 Turkish, and whether the same factors affect the processing of collocations in English and Turkish. In addition, this study looked at L1 and L2 processing of collocations in English. This thesis first investigated the frequency counts and association statistics of English and Turkish adjective-noun collocations through a corpus-based analysis of general reference corpora of English and Turkish. The corpus study showed that unlemmatised collocations, which do not take into account the inflected forms of the collocations, have similar mean frequency and association counts in the two languages. This suggests that the base (uninflected) forms of the collocations in English and Turkish do not have notably different frequency and association counts. To test the effect of the agglutinating structure of Turkish on the collocability of adjectives and nouns, the collocations in both languages were then lemmatised. Lemmatisation brings the benefit of including the frequency counts of both the base and inflected forms of the collocations. The findings indicated that the vast majority (75%) of the lemmatised Turkish adjective-noun combinations occur at a higher frequency than their English equivalents. In addition, the agglutinating structure of Turkish appears to increase the association scores of adjective-noun collocations in both frequency bands, since the vast majority of Turkish collocations reach higher collocational-strength scores than their unlemmatised forms. 
    After the corpus study, I designed psycholinguistic experiments to explore the sensitivity of speakers of these languages to the frequency of adjectives, nouns and whole collocations in acceptability judgment tasks in English and Turkish. Mixed-effects regression modelling revealed that collocations which have similar collocational frequency and association scores are processed at comparable speeds in English and Turkish by L1 speakers of these languages. That is to say, both Turkish and English speakers are sensitive to collocation frequency counts. This finding is in line with many previous empirical studies showing that language users process MWS quantitatively faster than control phrases (e.g. Arnon & Snider, 2010; McDonald & Shillcock, 2003; Vilkaite, 2016). However, lemmatised collocation frequency counts affected the processing of Turkish and English collocations differently, and Turkish speakers appeared to attend to word-level frequency counts of collocations to a lesser extent than English speakers. These findings suggest that different mechanisms underlie L1 processing of English and Turkish collocations. The present study also looked at the sensitivity of L1 and advanced L2 speakers to the frequency of adjectives, nouns and whole collocations in English. Mixed-effects regression modelling revealed that advanced L2 speakers are sensitive to collocation frequency counts like L1 English speakers: as collocation frequency counts increased, L1-Turkish L2 speakers of English responded to the collocations more quickly, as L1 English speakers did. The results indicated that both groups showed sensitivity to noun frequency counts, and advanced L2 English speakers did not appear to rely on noun frequency scores more heavily than the L1 English group while processing adjective-noun collocations. These findings conflict with claims that L2 speakers process MWS differently from L1 speakers (Wray, 2002).
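    The association statistics discussed above can be illustrated with pointwise mutual information (PMI), one common measure of collocational strength. The counts below are invented for illustration; in the study they would come from the reference corpora, with lemmatised counts folding inflected forms into the base form:

```python
import math

def pmi(pair_count, adj_count, noun_count, corpus_size):
    """PMI = log2( P(adj, noun) / (P(adj) * P(noun)) )."""
    p_pair = pair_count / corpus_size
    p_adj = adj_count / corpus_size
    p_noun = noun_count / corpus_size
    return math.log2(p_pair / (p_adj * p_noun))

# Hypothetical counts: the pair occurs 120 times in a 10M-token corpus.
score = pmi(pair_count=120, adj_count=3000, noun_count=5000,
            corpus_size=10_000_000)   # higher PMI = stronger association
```

A pair occurring far more often than its parts' frequencies predict scores high; lemmatising raises pair counts for morphologically rich languages like Turkish, which is one way the agglutinating structure can inflate association scores.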