
    Multidimensional Scaling on Multiple Input Distance Matrices

    Multidimensional Scaling (MDS) is a classic technique that seeks vectorial representations for data points, given the pairwise distances between them. In recent years, however, data are often collected from diverse sources or have multiple heterogeneous representations. To the best of our knowledge, how to perform multidimensional scaling on multiple input distance matrices remains an open problem. In this paper, we first define this new task formally. We then propose a new algorithm, Multi-View Multidimensional Scaling (MVMDS), which treats each input distance matrix as one view. Our algorithm learns the weights of the views (i.e., distance matrices) automatically by exploiting the consensus information and complementary nature of the views. Experimental results on synthetic as well as real datasets demonstrate the effectiveness of MVMDS. We hope that our work encourages wider consideration of this setting in the many domains where MDS is needed.
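    The view-weighting idea can be illustrated with a minimal sketch (this is not the authors' MVMDS algorithm; the inverse-stress weighting rule, the gradient step size, and all function names are illustrative assumptions). A single embedding is fit against every input distance matrix at once, and each view is reweighted by how well the current embedding reproduces it:

        import numpy as np

        def pairwise_dist(X, eps=1e-9):
            # Euclidean distance matrix of the embedding X (n x d).
            diff = X[:, None, :] - X[None, :, :]
            return np.sqrt((diff ** 2).sum(-1) + eps)

        def multi_view_mds(D_list, d=2, n_iter=500, lr=1e-3, seed=0):
            # Fit one embedding to several distance matrices, reweighting views
            # by how well they are currently reproduced (illustrative rule).
            rng = np.random.default_rng(seed)
            n = D_list[0].shape[0]
            X = rng.normal(scale=0.1, size=(n, d))
            w = np.ones(len(D_list)) / len(D_list)
            for _ in range(n_iter):
                E = pairwise_dist(X)
                # Per-view stress: squared mismatch between input and embedded distances.
                stress = np.array([((D - E) ** 2).sum() for D in D_list])
                # Heavier weight on views the current embedding fits better.
                w = 1.0 / (stress + 1e-12)
                w /= w.sum()
                # Gradient of the weighted stress with respect to X.
                grad = np.zeros_like(X)
                for wv, D in zip(w, D_list):
                    coef = wv * (E - D) / E
                    np.fill_diagonal(coef, 0.0)
                    grad += 2 * (coef.sum(1)[:, None] * X - coef @ X)
                X -= lr * grad
            return X, w

    Calling multi_view_mds([D1, D2]) on two n x n distance matrices would return a shared n x d embedding together with the learned view weights; the published method learns the weights by optimizing a single objective rather than by this heuristic reweighting.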

    Lock-in Effects of EU R&D Spending on Regional Growth: A Non-parametric and Semi-parametric Conditional Quantile Regressions Approach

    The purpose of this paper is twofold. First, we study the allocation of European Union (EU) expenditure on Research and Development (R&D) across European regions. Second, we focus on the effects of this variable on regional per capita GDP levels and on regional growth rates. Using non-parametric and semi-parametric conditional quantile regressions, we find empirical evidence of different effects of R&D expenditure across conditional quantiles of the per capita income distribution and of the growth-rate distribution. Moreover, we find a "lock-in effect" of R&D spending: a positive relation between growth rates and this component of EU expenditure is estimated for regions with higher growth rates, with these regions tending toward a higher, common growth rate as R&D expenditure increases, while slow-growth regions seem to approach a common but lower growth rate. The estimates of the relationship between per capita regional GDP and R&D spending confirm these findings. Keywords: EU Budget, R&D Expenditure, Growth Rates, Conditional Quantiles.
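    The workhorse here is conditional quantile regression. As a rough, generic illustration (not the authors' non-parametric or semi-parametric estimators), a linear quantile regression of regional growth on R&D spending can be fit at several quantiles with statsmodels; the variable names and the synthetic data are placeholders:

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm

        # Placeholder data: per-region growth rate and EU R&D expenditure (synthetic).
        rng = np.random.default_rng(0)
        df = pd.DataFrame({"rd_spending": rng.uniform(0, 5, 300)})
        df["growth"] = 0.5 + 0.3 * df["rd_spending"] + rng.normal(0, 1, 300)

        X = sm.add_constant(df[["rd_spending"]])
        # Fit the conditional quantile regression at several quantiles of the
        # growth-rate distribution; slopes that differ across q correspond to the
        # kind of heterogeneous R&D effect discussed in the abstract.
        for q in (0.1, 0.5, 0.9):
            res = sm.QuantReg(df["growth"], X).fit(q=q)
            print(f"q={q}: slope on R&D = {res.params['rd_spending']:.3f}")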

    Monotone deep Boltzmann machines

    Deep Boltzmann machines (DBMs), one of the first "deep" learning methods ever studied, are multi-layered probabilistic models governed by a pairwise energy function that describes the likelihood of all variables/nodes in the network. In practice, DBMs are often constrained, i.e., via the restricted Boltzmann machine (RBM) architecture (which does not permit intra-layer connections), in order to allow for more efficient inference. In this work, we revisit the generic DBM approach and ask: are there other possible restrictions to their design that would enable efficient (approximate) inference? In particular, we develop a new class of restricted models, the monotone DBM, which allows for arbitrary self-connection in each layer but restricts the weights in a manner that guarantees the existence and global uniqueness of a mean-field fixed point. To do this, we leverage tools from the recently proposed monotone Deep Equilibrium model and show that a particular choice of activation results in a fixed-point iteration that gives a variational mean-field solution. While this approach is still largely conceptual, it is the first architecture that allows for efficient approximate inference in fully general weight structures for DBMs. We apply this approach to simple deep convolutional Boltzmann architectures and demonstrate that it allows for tasks such as the joint completion and classification of images within a single deep probabilistic setting, while avoiding the pitfalls of mean-field inference in traditional RBMs.
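    For background on the fixed-point view of mean-field inference, a bare-bones damped mean-field iteration for a generic pairwise binary model is sketched below. This is deliberately naive and is not the monotone parameterization of the paper, which restricts the weights precisely so that a related iteration has a unique fixed point; all names here are illustrative.

        import numpy as np

        def sigmoid(z):
            return 1.0 / (1.0 + np.exp(-z))

        def mean_field(W, b, n_iter=200, damping=0.5, tol=1e-6, seed=0):
            # Damped mean-field iteration mu <- sigmoid(W mu + b) for a binary
            # pairwise model with energy -0.5 x^T W x - b^T x (W symmetric, zero
            # diagonal). With unrestricted W there is no convergence or uniqueness
            # guarantee, which is the gap the monotone DBM is designed to close.
            rng = np.random.default_rng(seed)
            mu = rng.uniform(0.25, 0.75, size=b.shape)
            for _ in range(n_iter):
                new = sigmoid(W @ mu + b)
                new = damping * mu + (1 - damping) * new
                if np.max(np.abs(new - mu)) < tol:
                    return new
                mu = new
            return mu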

    RIFLE: Robust Inference from Low Order Marginals

    The ubiquity of missing values in real-world datasets poses a challenge for statistical inference and can prevent similar datasets from being analyzed in the same study, precluding many existing datasets from being used for new analyses. While an extensive collection of packages and algorithms has been developed for data imputation, the overwhelming majority perform poorly when there are many missing values and the sample size is low, which are unfortunately common characteristics of empirical data. Such low-accuracy estimates adversely affect the performance of downstream statistical models. We develop a statistical inference framework for predicting the target variable without imputing missing values. Our framework, RIFLE (Robust InFerence via Low-order moment Estimations), estimates low-order moments with corresponding confidence intervals to learn a distributionally robust model. We specialize our framework to linear regression and normal discriminant analysis, and we provide convergence and performance guarantees. The framework can also be adapted to impute missing data. In numerical experiments, we compare RIFLE with state-of-the-art approaches, including MICE, Amelia, MissForest, KNN-imputer, MIDA, and Mean Imputer. Our experiments demonstrate that RIFLE outperforms these benchmark algorithms when the percentage of missing values is high and/or when the number of data points is relatively small. RIFLE is publicly available.
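    The moment-based idea, minus the distributional robustness, can be sketched in a few lines: estimate means and a pairwise-complete covariance matrix directly from the incomplete data and plug them into the normal equations for linear regression. This toy version (the function name and the ridge term are assumptions) omits the confidence intervals around the moment estimates and the worst-case optimization that define RIFLE itself:

        import numpy as np
        import pandas as pd

        def moment_based_regression(df, target, ridge=1e-3):
            # Linear regression without imputation: estimate the first and second
            # moments from the observed entries and solve the normal equations.
            # The small ridge term guards against the pairwise covariance estimate
            # not being positive semidefinite.
            cols = [c for c in df.columns if c != target]
            z = df[cols + [target]]
            cov = z.cov().to_numpy()    # pandas uses pairwise-complete observations
            mean = z.mean().to_numpy()  # per-column means, NaNs skipped
            p = len(cols)
            Sxx, Sxy = cov[:p, :p], cov[:p, p]
            beta = np.linalg.solve(Sxx + ridge * np.eye(p), Sxy)
            intercept = mean[p] - mean[:p] @ beta
            return intercept, pd.Series(beta, index=cols)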

    Multiple imputation with compatibility for high-dimensional data

    Multiple imputation (MI) is challenging in high-dimensional settings. An imputation model built on a selected subset of predictors can be incompatible with the analysis model, leading to inconsistent and biased estimates. Although full compatibility may not be achievable in such cases, one can still obtain consistent and unbiased estimates using a semi-compatible imputation model. We propose relaxing the lasso penalty so that a large set of variables (at most n) is selected for the imputation model. A substantive model that applies some formal variable selection procedure in the high-dimensional setting is then expected to be nested in this imputation model, so the resulting imputation model is semi-compatible with high probability. Because the likelihood estimates can become unstable and face convergence issues as the number of variables approaches the sample size, we further propose using a ridge penalty when obtaining the posterior distribution of the parameters given the observed data. The proposed technique is compared with standard MI software and with MI techniques available for high-dimensional data in simulation studies and on a real-life dataset. Our results show the superiority of the proposed approach over existing MI approaches in addressing the compatibility issue.
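    The two ingredients described above can be sketched with scikit-learn in place of the authors' implementation: a weakly penalized lasso keeps a large predictor set for the imputation model of an incomplete variable, and a Bayesian ridge fit supplies an approximate posterior predictive distribution from which imputations are drawn. The single-variable setting, the penalty levels, and the use of BayesianRidge are all illustrative assumptions:

        import numpy as np
        from sklearn.linear_model import Lasso, BayesianRidge

        def impute_one_variable(X_obs, y, m=5, lasso_alpha=1e-3, seed=0):
            # Impute the missing entries of y given fully observed covariates X_obs.
            # Step 1: a weakly penalized lasso retains a large predictor set, so a
            # later, more aggressive selection in the analysis model is likely to be
            # nested in it (the "semi-compatible" idea).
            # Step 2: Bayesian ridge regression on the retained predictors gives a
            # predictive distribution from which m stochastic imputations are drawn.
            rng = np.random.default_rng(seed)
            miss = np.isnan(y)
            sel = Lasso(alpha=lasso_alpha, max_iter=10000).fit(X_obs[~miss], y[~miss])
            keep = np.flatnonzero(sel.coef_ != 0)
            if keep.size == 0:
                keep = np.arange(X_obs.shape[1])
            model = BayesianRidge().fit(X_obs[~miss][:, keep], y[~miss])
            mean, std = model.predict(X_obs[miss][:, keep], return_std=True)
            return [mean + std * rng.standard_normal(mean.shape) for _ in range(m)]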

    On some approximately balanced combinatorial cooperative games

    A model of taxation for cooperative n-person games is introduced in which proper coalitions are taxed proportionally to their value. Games with a non-empty core under taxation at rate ε are called ε-balanced. Sharp bounds on ε for matching games on (not necessarily bipartite) graphs are established. Upper and lower bounds on the smallest ε in bin packing games are derived, and Euclidean random TSP games are seen to be, with high probability, ε-balanced for ε ≈ 0.06.
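    For concreteness, one plausible formalization of the taxation model (an assumption, since the abstract does not spell out the definitions) is the following: for a tax rate ε, the taxed game v_ε discounts the value of every proper coalition, and ε-balancedness asks for a core allocation of the taxed game.

        \[
          v_\varepsilon(S) = (1-\varepsilon)\,v(S) \quad \text{for } \emptyset \neq S \subsetneq N,
          \qquad v_\varepsilon(N) = v(N),
        \]
        \[
          v \text{ is } \varepsilon\text{-balanced} \iff \exists\, x \in \mathbb{R}^N :\;
          \sum_{i \in N} x_i = v(N) \ \text{ and } \ \sum_{i \in S} x_i \ge (1-\varepsilon)\,v(S) \ \ \forall\, S \subsetneq N.
        \]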