156 research outputs found

    On Measuring the sources of Changes in Poverty using the Shapley method. An Application to Europe

    For poverty measures that are decomposable into incidence, intensity and inequality among the poor, poverty changes between two periods can be expressed in terms of the three poverty components in the two periods. However, most poverty decompositions cannot be written as a linear form of these terms. We apply the Shapley decomposition approach to decompose the overall poverty change as the sum of the contributions of the changes in the three poverty components. We provide a method to compute the contributions for any decomposable poverty index and, specifically, show the contribution formulas for the Sen index and the Foster, Greer and Thorbecke index for α=2. Using EU-SILC data for 2008 and 2015 for 28 European countries, we analyze the change over time in the FGT2 poverty index and the value of the marginal contributions of the three components. ©2019 Elsevier B.V. All rights reserved. O. Aristondo gratefully acknowledges the funding support of Departamento de Educación, Política Lingüistica y Cultura del Gobierno Vasco under the project IT568-13, and she also acknowledges the funding support of the Spanish Ministerio de Educación y Ciencia under the project ECO2015-67519, cofunded by FEDER and UPV/EHU, UFI 11/46 BETS. E. Onaindia gratefully acknowledges the funding support of UPV/EHU under the project Ayudas a la innovación para la sostenibilidad de la UPV/EHU 2016.
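
    As a rough illustration of the Shapley idea described in this abstract, the sketch below averages each component's marginal effect over every order in which the three components can be switched from their first-period to their second-period values. The function names, the input numbers, and the particular FGT2 form used (incidence times a combination of intensity and the squared coefficient of variation among the poor) are assumptions made for the example, not the paper's own formulas or data.

```python
from itertools import permutations


def shapley_decomposition(f, start, end):
    """Split f(*end) - f(*start) into one Shapley contribution per argument."""
    n = len(start)
    contrib = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(start)
        for k in order:
            before = f(*current)
            current[k] = end[k]  # switch component k to its end-period value
            contrib[k] += f(*current) - before
    # Average the marginal effects over all orderings.
    return [c / len(orders) for c in contrib]


def fgt2(h, i, cv2):
    # Assumed illustrative form: P2 = H * (I^2 + (1 - I)^2 * CV^2).
    return h * (i ** 2 + (1 - i) ** 2 * cv2)


# Hypothetical component values (incidence, intensity, inequality), not real data.
start = (0.20, 0.30, 0.10)
end = (0.18, 0.33, 0.12)

parts = shapley_decomposition(fgt2, start, end)
total = fgt2(*end) - fgt2(*start)
print(parts, total)
```

    By construction the three contributions add up to the total change, which is the property that makes the decomposition usable when the index is not linear in its components.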

    Exploiting word embeddings for modeling bilexical relations

    There has been an exponential surge of text data in recent years. As a consequence, unsupervised methods that make use of this data have been steadily growing in the field of natural language processing (NLP). Word embeddings are low-dimensional vectors obtained by applying unsupervised techniques to large unlabelled corpora, so that each word in the vocabulary is mapped to a vector of real numbers. Word embeddings aim to capture syntactic and semantic properties of words. In NLP, many tasks involve computing the compatibility between lexical items under some linguistic relation. We call this type of relation a bilexical relation. Our thesis defines statistical models for bilexical relations that centrally make use of word embeddings. Our principal aim is for the word embeddings to favor generalization to words not seen during the training of the model. The thesis is structured in four parts. In the first part, we present a bilinear model over word embeddings that leverages a small supervised dataset for a binary linguistic relation. Our learning algorithm exploits low-rank bilinear forms and induces a low-dimensional embedding tailored to a target linguistic relation. This results in compressed task-specific embeddings. In the second part, we extend our bilinear model to a ternary setting and propose a framework for resolving prepositional phrase attachment ambiguity using word embeddings. Our models perform competitively with state-of-the-art models. In addition, our method obtains significant improvements on out-of-domain tests by simply using word embeddings induced from source and target domains. In the third part, we further extend the bilinear models to expand vocabulary in the context of statistical phrase-based machine translation. Given a word in the source language, our model obtains a probabilistic list of possible translations in the target language. We do this by projecting pre-trained embeddings into a common subspace using a log-bilinear model. Empirically, we observe a significant improvement on an out-of-domain test set. In the final part, we propose a non-linear model that maps initial word embeddings to task-tuned word embeddings, in the context of a neural network dependency parser. We demonstrate its use for improved dependency parsing, especially for sentences with unseen words. We also show downstream improvements on a sentiment analysis task.
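
    The low-rank bilinear form mentioned in this abstract can be sketched as follows. A bilinear score x^T W y with W factored as U V^T equals the dot product of the two projected vectors U^T x and V^T y, and those projections are what act as compressed, relation-specific embeddings. All names and numbers here are hypothetical toy values; the thesis's actual learning algorithm is not reproduced.

```python
def project(m, x):
    """Return m^T x, where m is a list of d rows with k columns each."""
    return [sum(m[r][c] * x[r] for r in range(len(m))) for c in range(len(m[0]))]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def low_rank_score(u, v, x, y):
    # Score the pair through the k-dimensional projections only.
    return dot(project(u, x), project(v, y))


def full_score(u, v, x, y):
    # Same score computed the long way, through the d x d matrix W = U V^T.
    d, k = len(u), len(u[0])
    w = [[sum(u[i][c] * v[j][c] for c in range(k)) for j in range(d)] for i in range(d)]
    return sum(x[i] * w[i][j] * y[j] for i in range(d) for j in range(d))


# Toy factors with d = 3 input dimensions and rank k = 2 (made-up numbers).
U = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
V = [[1.0, 0.5], [-0.5, 1.0], [0.25, -1.5]]
x = [1.0, 2.0, -1.0]  # hypothetical embedding of the first word
y = [0.5, -1.0, 3.0]  # hypothetical embedding of the second word

print(low_rank_score(U, V, x, y), full_score(U, V, x, y))
```

    The factored form never builds the d x d matrix, so its cost grows with the rank k rather than with the square of the embedding dimension.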

    A comparison of the CAR and DAGAR spatial random effects models with an application to diabetics rate estimation in Belgium

    When hierarchically modelling an epidemiological phenomenon on a finite collection of sites in space, one must always take a latent spatial effect into account in order to capture the correlation structure that links the phenomenon to the territory. In this work, we compare two autoregressive spatial models that can be used for this purpose: the classical CAR model and the more recent DAGAR model. Unlike the former, the latter has a desirable property: its ρ parameter can be naturally interpreted as the average neighbor pair correlation, and this parameter can be directly estimated when the effect is modelled using a DAGAR rather than a CAR structure. As an application, we model the diabetics rate in Belgium in 2014 and show the adequacy of these models in predicting the response variable when no covariates are available.

    A Statistical Approach to the Alignment of fMRI Data

    Multi-subject functional Magnetic Resonance Imaging (fMRI) studies are critical. Because anatomical and functional structure varies across subjects, image alignment is necessary. We define a probabilistic model to describe functional alignment. By imposing a prior distribution, such as the matrix von Mises-Fisher distribution, on the orthogonal transformation parameter, anatomical information is embedded in the estimation of the parameters, i.e., by penalizing the combination of spatially distant voxels. Real applications show an improvement in classification and in the interpretability of the results compared with various functional alignment methods.

    Social Mobility in Developing Countries

    Social mobility is the hope of economic development and the mantra of a good society. There are disagreements about what constitutes social mobility, but there is broad agreement that people should have roughly equal chances of success regardless of their economic status at birth. Concerns about rising inequality have engendered a renewed interest in social mobility, especially in the developing world. However, efforts to construct the databases and meet the standards required for conventional analyses of social mobility are at a preliminary stage and need to be complemented by innovative conceptual and methodological advances. If forms of mobility have slowed in the West, then we might be entering an age of rigid stratification with defined boundaries between the always-haves and the never-haves, which does not augur well for social stability. Social mobility research is ongoing, with substantive findings in different disciplines, typically with researchers working in isolation from each other. A key contribution of this book is the pulling together of the emerging streams of knowledge. Generating policy-relevant knowledge is a principal concern. Three basic questions frame the study of diverse aspects of social mobility in the book. How can the extent of social mobility be assessed in a given development context when the datasets required by conventional measurement techniques are unavailable? How can drivers and inhibitors of social mobility be identified in particular developing country contexts? How can the knowledge required to design interventions to raise social mobility be acquired, either by increasing upward mobility or by lowering downward mobility?

    Copula-based methods and their application to multidimensional poverty analysis

    In this thesis, we propose using copula-based methodology to analyze the dependence between the dimensions of poverty. This approach, which has recently been introduced in welfare economics, focuses on the positions of individuals in the dimensions rather than on the specific values those dimensions take for those individuals. It is particularly useful when measuring dependence in multivariate, possibly non-Gaussian and possibly non-linear contexts, such as those usually found in multidimensional analyses of poverty or well-being. In particular, we consider several copula-based concepts of multivariate dependence that are especially suited to the study of multidimensional poverty, namely multivariate concordance, orthant dependence and multivariate tail dependence. Departamento de Economía Aplicada. Doctorado en Economí
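
    To make the emphasis on positions rather than values concrete, the sketch below converts two dimensions into normalized ranks (often called pseudo-observations) and evaluates an empirical copula, i.e. the share of individuals who are jointly low in both dimensions. The variable names and data are hypothetical, ties are assumed away, and this is only a two-dimensional illustration, not the multivariate measures studied in the thesis.

```python
import math


def pseudo_observations(values):
    """Map each value to rank / (n + 1); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return [r / (len(values) + 1) for r in ranks]


def empirical_copula(u, v, a, b):
    """Share of individuals whose positions are at most a and b respectively."""
    return sum(1 for ui, vi in zip(u, v) if ui <= a and vi <= b) / len(u)


# Hypothetical data for five individuals in two dimensions.
income = [5.0, 12.0, 7.5, 30.0, 9.0]
health = [0.4, 0.7, 0.3, 0.9, 0.6]

u = pseudo_observations(income)
v = pseudo_observations(health)
low_both = empirical_copula(u, v, 0.5, 0.5)

# Positions are unchanged by any increasing transformation of a dimension.
u_log = pseudo_observations([math.log(x) for x in income])
print(low_both, u == u_log)
```

    Because only ranks enter the calculation, rescaling or taking logarithms of a dimension leaves the measured dependence untouched.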